ArcticTraining v0.7.0
What's Changed
- Speculator for gpt-oss by @sfc-gh-jaelee in #260
- Bump v0.6.1.dev0 by @sfc-gh-mwyatt in #268
- check arctic_training_run is in the PATH env var's paths by @sfc-gh-sbekman in #275
- fix 404 link by @sfc-gh-sbekman in #276
- liger-kernel: cleanly report when a model isn't supported by @sfc-gh-sbekman in #269
- change deepspeed defaults by @sfc-gh-sbekman in #271
- deal with torch_dtype deprecation by @sfc-gh-sbekman in #279
- another torch_dtype => dtype update by @sfc-gh-sbekman in #280
- fix SP size by @sfc-gh-sbekman in #282
- pynvml is deprecated by @sfc-gh-sbekman in #285
- Allow masking empty think tokens preventing the loss of thinking ability by @sfc-gh-lborchmann in #286
- Expose scheduler-specific kwargs, such as min_lr_rate by @sfc-gh-lborchmann in #284
- Fix: prevent conversion of bool to float in deepspeed config by @sfc-gh-mwyatt in #287
- add python profiler by @sfc-gh-sbekman in #288
- Optimization: Better data filter and packing performance by @sfc-gh-mwyatt in #292
- tiled mlp: auto-monkeypatch by @sfc-gh-sbekman in #290
- model-specific flop counters by @sfc-gh-sbekman in #289
- Finish porting testing_utils.py by @sfc-gh-sbekman in #291
- move wandb log dir out of repo's root by @sfc-gh-sbekman in #294
- make autoformat run only on branch's modified files by @sfc-gh-sbekman in #295
- do not log the first train iter to wandb by @sfc-gh-sbekman in #293
- [CI] modal gpus workflow by @sfc-gh-sbekman in #299
- new feature: CausalTrainer by @sfc-gh-sbekman in #210
- ALST/UlyssesSP: API wrt variable seqlen by @sfc-gh-sbekman in #298
- allow hf model config overrides by @sfc-gh-sbekman in #302
- modal ci: fix by @sfc-gh-sbekman in #303
- rename fusedadam => fused_adam by @sfc-gh-sbekman in #306
- allow hf model config overrides: take 2 by @sfc-gh-sbekman in #304
- Bump version from 0.6.1.dev0 to 0.7.0 by @sfc-gh-mwyatt in #307
Full Changelog: v0.6.0...v0.7.0