v0.1.4
Pre-release
Highlights
- CI now covers both GPU and NPU, with some tests enabled on each. We welcome more test contributions if you are interested!
- VeOmni's model, dataset, dataloader, checkpointer, chat_template, and preprocess components are now registry-based, which makes it easier to add new ones and customize existing ones.
- Updated pyproject.toml for uv-based env management. We now recommend installing veomni only through uv.
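For readers unfamiliar with the registry-based design mentioned above, the general pattern can be sketched as follows. This is a minimal, generic illustration and not VeOmni's actual API: the names `Registry`, `DATASET_REGISTRY`, `build_toy_range`, and the `"toy_range"` key are all hypothetical. See #230 and #258 for the real implementation.

```python
from typing import Any, Callable, Dict


class Registry:
    """Maps string names to factory callables.

    Hypothetical helper for illustration only; VeOmni's own registry
    classes and function names may differ.
    """

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._entries: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Return a decorator that records a factory under `name`."""

        def decorator(factory: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._entries:
                raise KeyError(f"{self.kind} '{name}' is already registered")
            self._entries[name] = factory
            return factory

        return decorator

    def build(self, name: str, **kwargs: Any) -> Any:
        """Look up the factory registered under `name` and call it."""
        if name not in self._entries:
            known = ", ".join(sorted(self._entries)) or "<none>"
            raise KeyError(f"unknown {self.kind} '{name}'; registered: {known}")
        return self._entries[name](**kwargs)


# One registry per component kind (assumed layout, for illustration).
DATASET_REGISTRY = Registry("dataset")


@DATASET_REGISTRY.register("toy_range")
def build_toy_range(size: int = 4) -> list:
    """A stand-in dataset factory: returns the integers 0..size-1."""
    return list(range(size))


# A config file can now select the component by name.
dataset = DATASET_REGISTRY.build("toy_range", size=3)
```

With this kind of design, adding a custom component usually means writing one decorated factory and referring to it by name in the config, without touching the training loop.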
What's Changed
- update README.md by @Fazziekey in #188
- [dist] fix: refactor fsdp2 grad norm clipping by @Luosuu in #185
- [misc] fix: reset hf init flag for random init by @Luosuu in #176
- Fix Qwen3-Moe MFU by @zhihaofang1017 in #125
- [model] fix:avoid cpu-device sync for qwenvl on npu by @wey-code in #190
- [task] fix: replace DataArguments with MyDataArguments and remove duplicated step2token saving by @MuyaoLi-jimo in #189
- [ci] test: CI env test by @FoolPlayer in #201
- [ckpt][fix]release cuda mem after dcp sync save by @EricOlivier in #207
- [misc] feat: update uv support for aarch platform for Ascend+Kunpeng … by @pjgao in #148
- [data] fix: fix exception raised when fetching current_device on NPU by @ji-huazhong in #211
- [ci] test: fix data_ci by @Coach257 in #222
- [ci] test: add npu ci env by @FoolPlayer in #219
- title: [data] feat: Implement extensible data preprocessor registry by @TimYangst in #203
- [ci]Add NPU support to data and model test by @Crystal-jiang in #224
- [ci]Add Ascend NPU native support to the unit test code by @Crystal-jiang in #208
- [ci] chore: add gemini config & test by @FoolPlayer in #229
- [test] ci: add device api check for tests by @onehaitao in #213
- [core] feat: registry for dataset & dataloader & checkpointer & ckpt_to_state_dict & chat_template & preprocess by @Coach257 in #230
- [dist] fix: make OptimizerState EP-dim aware to fix its dcp saving by @Luosuu in #228
- Automatically add the "ascend" label by @Crystal-jiang in #234
- [helper]:fix npu profiling by @Feng0w0 in #214
- helper: degrade veomni_patch functions to warnings/no-op by @iqiancheng in #197
- [ci] fix: dataloader in e2e ckpt test by @Luosuu in #233
- [feat] nccl_timeout by @brook-cpp in #217
- Automatically apply "ascend" label to issues and PRs by @Crystal-jiang in #239
- chore: Upgrade PyTorch dependencies to 2.8.0 and flash-attention to 2.8.3 by @TimYangst in #242
- [version] update transformers version to 4.57.0 by @phdddd in #243
- feat: distributed checkpointer support customized backend by @Ziyi-Wang in #182
- [ckpt] refactor: remove unused output_dir parameter from ckpt_to_state_dict by @TimYangst in #248
- [config, omni, dis] fix: quick fix for sft of Wan2.1-I2V-14B-480P by @zbian99 in #240
- [model] fix: Update @check_model_inputs decorator for transformers 4.57+ compatibility by @TimYangst in #252
- [core] fix is_x_backend by @brook-cpp in #251
- [data] fix: quick fix for exception raised when building dit dataloader on NPU by @zbian99 in #246
- upgrade: Upgrade transformers from v4.57.0 to v4.57.3 by @yiwzhao in #249
- [core] feat: model registry by @Coach257 in #258
- [dist] feat: unified veomni grad norm clipping by @Luosuu in #205
- [task]fix: fix train.sh NPROC_PER_NODE calculation logic on the NPU by @Crystal-jiang in #227
- [chore]: cache ep group by @heidongxianhua in #231
New Contributors
- @zhihaofang1017 made their first contribution in #125
- @MuyaoLi-jimo made their first contribution in #189
- @FoolPlayer made their first contribution in #201
- @EricOlivier made their first contribution in #207
- @pjgao made their first contribution in #148
- @ji-huazhong made their first contribution in #211
- @TimYangst made their first contribution in #203
- @Crystal-jiang made their first contribution in #224
- @onehaitao made their first contribution in #213
- @Feng0w0 made their first contribution in #214
- @iqiancheng made their first contribution in #197
- @brook-cpp made their first contribution in #217
- @phdddd made their first contribution in #243
- @zbian99 made their first contribution in #240
- @yiwzhao made their first contribution in #249
Full Changelog: v0.1.3...v0.1.4