Hi, thanks for your attention.
While reading the transformers source code, I cannot understand the implementation of _get_train_sampler in trainer.py. Why is the default data sampler a RandomSampler rather than a DistributedSampler? How does the Trainer handle the sampler for data-parallel training?
reference code: https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py#L975
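For comparison, this is the pattern I would normally expect for data-parallel training in plain PyTorch DDP (a minimal sketch of my own, not taken from the transformers source; the dataset and helper names are just placeholders):

```python
import os
import torch
from torch.utils.data import DataLoader, Dataset, DistributedSampler


class ToyDataset(Dataset):
    """Placeholder dataset used only to make the sketch self-contained."""

    def __len__(self):
        return 1000

    def __getitem__(self, idx):
        return torch.tensor([idx], dtype=torch.float32)


def build_ddp_dataloader(dataset, batch_size=8):
    # Read rank/world size from the environment so the sketch also runs
    # standalone (defaults to a single process).
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    rank = int(os.environ.get("RANK", 0))

    # DistributedSampler shards the dataset across processes so each rank
    # sees a disjoint subset of the data every epoch.
    sampler = DistributedSampler(
        dataset, num_replicas=world_size, rank=rank, shuffle=True
    )
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler), sampler


if __name__ == "__main__":
    loader, sampler = build_ddp_dataloader(ToyDataset())
    for epoch in range(2):
        # set_epoch changes the shuffling seed so the per-rank shards
        # are reshuffled differently each epoch.
        sampler.set_epoch(epoch)
        for batch in loader:
            pass
```

Given this pattern, I don't see where the per-rank sharding happens when _get_train_sampler only returns a RandomSampler.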