@pkooij pkooij commented Oct 9, 2025

This PR adds Accelerate integration and docs to LeRobot. We keep it basic and do not yet add all Accelerate options (DeepSpeed, etc.).

  • We use Accelerate even on a single GPU, which keeps lerobot_train.py cleaner and lets us rely on its helpers for device discovery, etc.
  • Added docs
  • Only the main process handles logging, uploading, and dataset downloading
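
A minimal sketch of the "main process only" pattern from the last bullet, not the code in this PR. The helper names `init_process_logging` and `run_main_only` and the logger name are hypothetical; the only thing assumed about the accelerator object is a boolean `is_main_process` attribute, which `accelerate.Accelerator` exposes. Two `SimpleNamespace` objects stand in for two ranks so the snippet runs without any GPU or extra dependency.

```python
import logging
from types import SimpleNamespace


def init_process_logging(accelerator=None) -> logging.Logger:
    """Return a logger that emits INFO records only on the main process.

    `accelerator` is any object with a boolean `is_main_process` attribute;
    None means plain single-process training.
    """
    is_main = accelerator is None or accelerator.is_main_process
    logger = logging.getLogger("train_sketch")  # hypothetical logger name
    # Non-main ranks still surface errors but are otherwise quiet.
    logger.setLevel(logging.INFO if is_main else logging.ERROR)
    return logger


def run_main_only(accelerator, fn, *args, **kwargs):
    """Call fn (e.g. an upload or download) on the main process only.

    Other ranks skip the call and receive None.
    """
    if accelerator is None or accelerator.is_main_process:
        return fn(*args, **kwargs)
    return None


# Stand-ins for two ranks; a real run would pass an Accelerator instance.
main_rank = SimpleNamespace(is_main_process=True)
worker_rank = SimpleNamespace(is_main_process=False)

print(run_main_only(main_rank, lambda: "uploaded"))    # uploaded
print(run_main_only(worker_rank, lambda: "uploaded"))  # None
```

In a real multi-process run, the other ranks would typically also synchronize (for example with `accelerator.wait_for_everyone()`) before reading anything the main process downloaded.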

Tested

  • Single-GPU training on Mac
  • Single-GPU training on H100
  • 4x multi-GPU training on H100

AdilZouitine and others added 3 commits October 2, 2025 18:11
- Added support for multi-GPU training by introducing an `accelerator` parameter in training functions.
- Updated `update_policy` to handle gradient updates based on the presence of an accelerator.
- Modified logging to prevent duplicate messages in non-main processes.
- Enhanced `set_seed` and `get_safe_torch_device` functions to accommodate accelerator usage.
- Updated `MetricsTracker` to account for the number of processes when calculating metrics.
- Introduced a new feature in `pyproject.toml` for the `accelerate` library dependency.
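
To illustrate why a metrics tracker needs the process count, here is a rough sketch, assuming every process consumes its own batch at each step. `SampleCounter` and its fields are hypothetical and are not the `MetricsTracker` API; in a real run `num_processes` could come from `accelerator.num_processes`.

```python
from dataclasses import dataclass


@dataclass
class SampleCounter:
    """Hypothetical stand-in for the sample bookkeeping in a metrics tracker."""

    batch_size: int         # per-process batch size
    num_frames: int         # number of frames in the dataset
    num_processes: int = 1  # e.g. accelerator.num_processes
    steps: int = 0
    samples: int = 0

    def step(self) -> None:
        """Record one optimizer step taken in lockstep by every process."""
        self.steps += 1
        # Each process consumes its own batch, so the global count scales.
        self.samples += self.batch_size * self.num_processes

    @property
    def epochs(self) -> float:
        """Fraction of the dataset seen so far, counted across all processes."""
        return self.samples / self.num_frames


single = SampleCounter(batch_size=8, num_frames=3200)
multi = SampleCounter(batch_size=8, num_frames=3200, num_processes=4)
for _ in range(100):
    single.step()
    multi.step()

print(single.samples, single.epochs)  # 800 0.25
print(multi.samples, multi.epochs)    # 3200 1.0
```

Without the `num_processes` factor, the 4-process run would report the same sample and epoch counts as the single-process one despite seeing four times as much data.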
…esses

- Added `init_logging` calls to ensure proper logging setup when using the accelerator and in standard training mode.
- This change enhances the clarity and consistency of logging during training sessions.
@pkooij pkooij added the enhancement (Suggestions for new features or improvements) and policies (Items related to robot policies) labels Oct 9, 2025
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@pkooij pkooij marked this pull request as ready for review October 14, 2025 15:43

@michel-aractingi michel-aractingi left a comment

On a first look at the PR, it's done very well, great job!
I just have two comments. I'll give it a deeper dive tomorrow and test it.

@michel-aractingi michel-aractingi left a comment

LGTM from my side; waiting for @imstevenpmwork's approval.

@imstevenpmwork imstevenpmwork changed the title from "Add Accelerate -> melt gpus" to "feat(train): add accelerate for multi gpu training" Oct 16, 2025
@imstevenpmwork imstevenpmwork merged commit e82e7a0 into main Oct 16, 2025
17 checks passed
@imstevenpmwork imstevenpmwork deleted the feat/accelerate-melt-gpus branch October 16, 2025 15:41
@imstevenpmwork imstevenpmwork mentioned this pull request Oct 17, 2025
Labels

enhancement (Suggestions for new features or improvements), policies (Items related to robot policies)

6 participants