v1.6.0
What's new
Added 🎉
- Added option to compile the trainer's loss function (`Trainer.compile_loss`).
- Added `SourceMixtureDataset` for composing a training mixture based on ratios of source datasets.
- Added `NumpyFSLDatasetMixture` for constructing a `NumpyDatasetBase` from a `SourceMixtureDataset`. Note this is only supported for FSL datasets.
- Added tests for `SourceMixture*` and `NumpyFSLDatasetMixture`.
- Added `DownstreamEvaluatorCallbackConfig` class for running in-loop downstream eval via OLMo-in-loop-evals.
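
As a rough illustration of what composing a mixture from source ratios involves, the sketch below splits a token budget across named sources in proportion to their ratios. It is plain Python and does not use the olmo_core API; the function `mixture_token_counts`, its parameters, and the source names are hypothetical and chosen only for this example.

```python
from typing import Dict


def mixture_token_counts(ratios: Dict[str, float], max_tokens: int) -> Dict[str, int]:
    """Split a token budget across sources in proportion to their ratios.

    Hypothetical helper for illustration; not part of olmo_core.
    """
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    if any(r < 0 for r in ratios.values()):
        raise ValueError("ratios must be non-negative")
    total = sum(ratios.values())
    if total <= 0:
        raise ValueError("at least one ratio must be positive")

    # Floor each share so the sum never exceeds the budget.
    counts = {name: int(max_tokens * r / total) for name, r in ratios.items()}

    # Hand any tokens lost to rounding to the source with the largest ratio.
    remainder = max_tokens - sum(counts.values())
    if remainder:
        largest = max(ratios, key=ratios.get)
        counts[largest] += remainder
    return counts


counts = mixture_token_counts({"web": 0.75, "code": 0.25}, max_tokens=1000)
print(counts)
```

Normalizing by the sum of the ratios means the inputs do not have to add up to 1, and assigning the rounding remainder to one source keeps the total equal to the requested budget.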
Changed ⚠️
- Moved some types into `olmo_core.data.types` to avoid some circular dependencies.
Fixed ✅
- Made GCS client more robust by automatically retrying timeout errors for most operations.
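
The general pattern of automatically retrying timeout errors can be sketched as follows. This is not the OLMo-core implementation: `retry_on_timeout` and its parameters are hypothetical, and Python's built-in `TimeoutError` stands in for whatever timeout exceptions the real GCS client raises.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_timeout(fn: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call fn, retrying with exponential backoff when it raises TimeoutError.

    Hypothetical helper for illustration; not part of olmo_core.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TimeoutError:
            # Out of attempts: let the caller see the original error.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


calls = {"n": 0}


def flaky_download() -> str:
    # Simulated operation that times out twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = retry_on_timeout(flaky_download, max_attempts=5, base_delay=0.0)
print(result, calls["n"])
```

Only timeout errors are retried here; any other exception propagates immediately, which keeps genuine failures such as permission errors from being masked by the retry loop.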
Commits
29e1276 (chore) prepare for release v1.6.0
da39e97 Add note about optional dependencies
81b1249 Missed _bust_index_cache in one spot (#78)
00d34f6 Add option to compile loss function, move logits FP32 casting into loss function (#77)
4928f82 Adds mixing loader for FSL datasets (#70)
ecb0686 Allow stopping the experiment on keyboard int
41400c4 Add Llama 8B config (#76)
282c120 Update Docker build (#75)
55d261e Make GCS client more robust (#74)
3fe59b6 Add a callback for downstream evals, update Docker builds (#73)
ecd523e include release chore commit in release notes