Record: Full GPTQ + LeakyReLU² + Parallel Muon (3-seed mean 1.1180) #626

Closed
kshitizz36 wants to merge 1 commit into openai:main from kshitizz36:submission/full-gptq-leakyrelu2-parallel-muon-3seed-1.1180

Conversation

@kshitizz36

Results

  • 3-seed mean val_bpb (sliding, stride=64): 1.11800697
  • std (sample): 0.00102882
  • mean size: 15,931,864 bytes
  • hardware: 8xH100 SXM, 600s train budget
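
| Seed | step_avg | steps | Pre-quant bpb | Post-GPTQ sliding bpb | Total size (bytes) |
|------|----------|-------|---------------|-----------------------|--------------------|
| 42   | 84.98ms  | 7062  | 1.1384        | 1.11752350            | 15,890,732         |
| 1337 | 86.65ms  | 6925  | 1.1402        | 1.11918848            | 15,891,780         |
| 2024 | 85.20ms  | 7043  | 1.1383        | 1.11730894            | 16,013,080         |
| Mean | 85.61ms  | 7010  | 1.1390        | 1.11800697            | 15,931,864         |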
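
The summary figures above can be re-derived from the per-seed values. The following is a minimal sketch, not part of the submission's code: the variable names are illustrative, and it assumes the reported std is the sample standard deviation (n-1 denominator), as the "std (sample)" label suggests.

```python
from statistics import mean, stdev

# Per-seed values copied from the results table above.
# Dictionary names are hypothetical; they do not come from the submission.
post_gptq_bpb = {42: 1.11752350, 1337: 1.11918848, 2024: 1.11730894}
total_bytes = {42: 15_890_732, 1337: 15_891_780, 2024: 16_013_080}

# statistics.stdev uses the n-1 (sample) denominator.
bpb_mean = mean(post_gptq_bpb.values())
bpb_std = stdev(post_gptq_bpb.values())
size_mean = mean(total_bytes.values())

print(f"mean val_bpb: {bpb_mean:.8f}")
print(f"sample std:   {bpb_std:.8f}")
print(f"mean size:    {size_mean:,} bytes")
```

Run as-is, the printed values should agree with the reported mean, std, and mean size to the precision shown in the Results list.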

Notes

  • No TTT used.
  • Full logs for all 3 seeds are included under this record folder.
  • Independent 3-seed run of the same Full GPTQ + LeakyReLU² + Parallel Muon direction.

@valerio-oai
Contributor

This submission runs GPTQ at test time, which means it uses training data at eval time, and that is disallowed. Closing for now.
