
Add 16MB SP1024 Value Residual + PPM mixture submission (ppm_mix_bpb 0.829467) #2103

Open
lkk688 wants to merge 3 commits into openai:main from lkk688:add-sp1024-ppm-record-submission

lkk688 commented May 1, 2026

This PR adds a 16MB submission under:

/records/track_10min_16mb/2026-04-30_SP1024_ValueResid_PPMMix/

Summary:

  • SentencePiece 1024 tokenizer
  • 9-layer 512d Transformer with MLP mult 2
  • Value Residual enabled in the last 2 layers
  • Byte-level PPM mixture at evaluation time
  • Artifact fits under 16MB
  • Best reproduced ppm_mix_bpb: 0.829467
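For readers unfamiliar with the metric, here is a minimal sketch of how a bits-per-byte score for a two-predictor mixture could be computed. It is not the code in this PR: the function name `mixture_bpb`, the fixed `weight`, and the use of plain linear interpolation are all assumptions made for illustration, and the step that converts token-level model probabilities into per-byte probabilities is left out entirely.

```python
import math


def mixture_bpb(p_model, p_ppm, weight=0.5):
    """Bits per byte of a linear mixture of two per-byte predictors.

    p_model[i] and p_ppm[i] are the probabilities each predictor assigned
    to the byte actually observed at position i. `weight` is the share
    given to the PPM predictor (hypothetical fixed value; a real mixer
    may adapt it per context).
    """
    if not p_model or len(p_model) != len(p_ppm):
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    total_bits = 0.0
    for pm, pp in zip(p_model, p_ppm):
        # Linear interpolation of the two probabilities for the true byte.
        p_mix = (1.0 - weight) * pm + weight * pp
        # Code length in bits for that byte under the mixed distribution.
        total_bits += -math.log2(p_mix)
    return total_bits / len(p_model)


# Example with made-up probabilities, not results from this submission.
example_bpb = mixture_bpb([0.25, 0.5], [0.75, 0.5], weight=0.5)
```

A lower value means the mixture assigns more probability to the bytes that actually occur. The mixture helps when the two predictors are wrong in different places, which is the usual motivation for combining a neural model with PPM.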

This is a record submission, but the included result was produced on a single H100 (600s wallclock limit) rather than an 8xH100 / 10-minute run.
