Release v2.0.1 · allenai/OLMo-core

What's new

Added 🎉

Added information about the official 32B training run.
Added automatic support for LL128 when running on Augusta.

Fixed ✅

Fixed unrealistic batch size settings in the official config for the 32B.
Ignore group_overrides for frozen parameters instead of throwing an error.
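
The group_overrides change above can be illustrated with a minimal sketch. It is not the OLMo-core implementation: the `Param` class, the `build_param_groups` helper, the glob-style patterns, and the choice to still raise when a pattern matches no parameter at all are assumptions made for illustration. It only shows the idea of skipping an override whose matches are all frozen rather than raising.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class Param:
    """Hypothetical stand-in for a named model parameter."""
    name: str
    requires_grad: bool = True


def build_param_groups(params, group_overrides, defaults):
    """Build optimizer param groups from (glob_pattern, options) overrides.

    Assumed behavior: an override that matches only frozen parameters is
    skipped silently; an override that matches nothing at all still raises.
    """
    groups = []
    assigned = set()
    for pattern, opts in group_overrides:
        matches = [p for p in params if fnmatch(p.name, pattern)]
        if not matches:
            raise ValueError(f"override pattern {pattern!r} matches no parameters")
        # Keep only trainable parameters not already claimed by an earlier override.
        names = [p.name for p in matches if p.requires_grad and p.name not in assigned]
        if not names:
            continue  # everything matched was frozen (or already assigned): ignore
        assigned.update(names)
        groups.append({"params": names, **defaults, **opts})
    # Remaining trainable parameters fall into a default group.
    rest = [p.name for p in params if p.requires_grad and p.name not in assigned]
    if rest:
        groups.append({"params": rest, **defaults})
    return groups


params = [
    Param("embeddings.weight", requires_grad=False),  # frozen
    Param("blocks.0.attention.w_q.weight"),
    Param("blocks.0.norm.weight"),
]
overrides = [
    ("embeddings.*", {"weight_decay": 0.0}),   # matches only a frozen parameter
    ("*.norm.weight", {"weight_decay": 0.0}),
]
groups = build_param_groups(params, overrides, {"lr": 1e-3, "weight_decay": 0.1})
```

In this sketch the override on the frozen embeddings produces no group, and the frozen parameter never reaches the optimizer.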

Removed 👋

Removed the "fused" cross-entropy loss variant. It had a bug and consistently under-performed the native PyTorch version when compiled. See "Post Incident Report: bug with fused CE loss" for more information.

Commits

27b1ae8 (chore) prepare for release v2.0.1
79ebc7f Add hybrid MoE transformer architecture (#223)
bce2b5b authenticate with Docker Hub to avoid rate limits
b1e0bbd Remove fused CE loss, reorganize MoE kernels/ops (#221)
56e06ee Ignore group_overrides for frozen params (#219)
9d80e8d Update logo for README header. (#218)
974e555 fix some typos, consistent naming
45fe007 Updated documentation (#217)
51aedcf More working config (#216)
47b2ad5 add release PR comments back in
