chore(deps): update dependency accelerate to v1.8.1 #551
This PR contains the following updates:
accelerate: ==1.7.0 -> ==1.8.1

Release Notes
huggingface/accelerate (accelerate)
v1.8.1: Patch fix
Full Changelog: huggingface/accelerate@v1.8.0...v1.8.1
v1.8.0: FSDPv2 + FP8, Regional Compilation for DeepSpeed, Faster Distributed Training on Intel CPUs, ipex.optimize deprecation
FSDPv2 refactor + FP8 support
We've simplified how to prepare FSDPv2 models, as there were too many ways to compose FSDP2 with other features (e.g., FP8, torch.compile, activation checkpointing, etc.). Although the setup is now more restrictive, it leads to fewer errors and a more performant user experience. We’ve also added support for FP8. You can read about the results here. Thanks to @S1ro1 for this contribution!
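As an illustration of the simplified flow (a sketch, not code from the release notes; the fsdp_version=2 plugin argument is an assumption about the 1.8.0 API and should be checked against the accelerate docs):

```python
# Hypothetical sketch of preparing a model under FSDPv2 with Accelerate.
# The fsdp_version=2 keyword is an assumption about the 1.8.0 plugin API.
import torch
from accelerate import Accelerator
from accelerate.utils import FullyShardedDataParallelPlugin

fsdp_plugin = FullyShardedDataParallelPlugin(fsdp_version=2)  # assumed kwarg
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)

model = torch.nn.Linear(1024, 1024)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# prepare() applies the configured FSDP2 wrapping to the model and optimizer.
model, optimizer = accelerator.prepare(model, optimizer)
```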
Faster Distributed Training on Intel CPUs
We updated the CCL_WORKER_COUNT variable and added KMP parameters for Intel CPU users. This significantly improves distributed training performance (e.g., Tensor Parallelism), with up to a 40% speed-up on Intel 4th Gen Xeon when training transformer TP models.
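For illustration, the snippet below shows where such settings live; CCL_WORKER_COUNT is named in the release notes, while the specific KMP_* variables and values are common Intel OpenMP tuning knobs and are assumptions rather than values taken from accelerate:

```python
# Illustrative environment tuning for distributed CPU training.
# CCL_WORKER_COUNT comes from the release notes; the KMP_* settings below
# are assumed examples of the kind of parameters being referred to.
import os

os.environ.setdefault("CCL_WORKER_COUNT", "1")                          # oneCCL communication workers
os.environ.setdefault("KMP_AFFINITY", "granularity=fine,compact,1,0")   # pin OpenMP threads to cores
os.environ.setdefault("KMP_BLOCKTIME", "1")                             # ms a thread waits after finishing work
```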
Regional Compilation for DeepSpeed
We added support for regional compilation with the DeepSpeed engine. DeepSpeed's .compile() modifies models in place using torch.nn.Module.compile(...), rather than the out-of-place torch.compile(...), so we had to account for that. Thanks @IlyasMoutawwakil for this feature!
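The in-place vs. out-of-place distinction the note refers to can be seen in plain PyTorch (a sketch, not accelerate or DeepSpeed internals):

```python
# torch.compile(...) is out-of-place: it returns a wrapped OptimizedModule.
# nn.Module.compile() is in-place: the module itself is modified to run compiled.
import torch

model = torch.nn.Linear(8, 8)

compiled = torch.compile(model)   # out-of-place wrapper; `model` is untouched
model.compile()                   # in-place; `model` now runs through torch.compile

x = torch.randn(2, 8)
y_wrapped = compiled(x)
y_inplace = model(x)
```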
ipex.optimize deprecation
ipex.optimize is being deprecated. Most optimizations have been upstreamed to PyTorch, and future improvements will land there directly. For users without PyTorch 2.8, we'll continue to rely on IPEX for now.
Better XPU Support
We've greatly expanded and stabilized support for Intel XPUs:
Trackers
We've added support for SwanLab as an experiment tracking backend. Huge thanks to @ShaohonChen for this contribution! We also deferred all tracker initializations to prevent premature setup of distributed environments.
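As a sketch of how a tracking backend is typically selected in accelerate (the "swanlab" identifier passed to log_with is an assumption based on the backend's name; check the tracker documentation for the exact string and any required configuration):

```python
# Illustrative tracker usage; "swanlab" as the log_with value is assumed
# from the backend name, not confirmed by the release notes.
from accelerate import Accelerator

accelerator = Accelerator(log_with="swanlab")
accelerator.init_trackers(project_name="my-experiment", config={"lr": 1e-4})

for step in range(10):
    loss = 0.1 * step  # placeholder metric
    accelerator.log({"train/loss": loss}, step=step)

accelerator.end_training()
```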
What's Changed
accelerator().load_state() by @luiz0992 in https://github.com/huggingface/accelerate/pull/3540
dtype_byte_size by @SunMarc in https://github.com/huggingface/accelerate/pull/3625
New Contributors
Full Changelog: huggingface/accelerate@v1.7.0...v1.8.0
Configuration
📅 Schedule: Branch creation - "on friday" (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.