Difference between huggingface deepspeed integration and accelerate deepspeed-plugin #214

jerryli1981 · 2021-12-19T05:13:57Z

Hugging Face Transformers supports DeepSpeed (https://huggingface.co/docs/transformers/main_classes/deepspeed).

Accelerate also supports DeepSpeed via its DeepSpeed plugin, so what is the design rationale for having both, besides the flexibility of not using a Trainer?

jerryli1981 · 2021-12-19T13:51:04Z

Another quick question: is "BatchSamplerShard" in data_loader.py functionally equivalent to torch.utils.data.distributed.DistributedSampler?

sgugger · 2021-12-20T13:39:13Z

Please use the forums to ask questions, as we keep the issues for bugs and feature requests only. The Trainer supports DeepSpeed, but Accelerate is designed for people who don't want to use a Trainer. It also supports DeepSpeed for people who want to use that library and retain full control over their training loop.
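
On the second question, which the reply above does not address: a rough, stdlib-only sketch of the two sharding schemes may help. It is not the code of either library. The helper names index_shard and batch_shard are hypothetical, and the sketch assumes no shuffling, a dataset length divisible by both the number of processes and the batch size, and BatchSamplerShard used in its default mode where whole batches are dealt out to processes in turn.

```python
def index_shard(n, num_replicas, rank):
    """Index-level sharding in the style of DistributedSampler (shuffle off).

    Hypothetical helper: each rank takes every num_replicas-th sample index,
    starting at its own rank. Padding for uneven lengths is left out.
    """
    assert n % num_replicas == 0, "sketch assumes an evenly divisible dataset"
    return list(range(n))[rank:n:num_replicas]


def batch_shard(n, batch_size, num_processes, process_index):
    """Batch-level sharding in the style of BatchSamplerShard.

    Hypothetical helper: batches are built once over the whole dataset, then
    batch i goes to process i % num_processes. Uneven tails are left out.
    """
    assert n % batch_size == 0, "sketch assumes full batches only"
    batches = [list(range(i, i + batch_size)) for i in range(0, n, batch_size)]
    return [b for i, b in enumerate(batches) if i % num_processes == process_index]


if __name__ == "__main__":
    for rank in range(2):
        print("rank", rank, "index-level:", index_shard(8, 2, rank))
        print("rank", rank, "batch-level:", batch_shard(8, 2, 2, rank))
```

Under these assumptions both schemes give each process a disjoint share of the data, but the grouping differs: with 8 samples, 2 processes and a batch size of 2, process 0 sees indices 0, 2, 4, 6 under index-level sharding and the batches [0, 1] and [4, 5] under batch-level sharding. So the two are similar in purpose rather than interchangeable line for line.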

github-actions · 2022-05-24T15:53:47Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot closed this as completed Jun 2, 2022
