[Configs] Add sequence_parallel_size and SequenceParallelSampler to configs #538

Merged: 3 commits into InternLM:main on Apr 2, 2024

Conversation

@HIT-cwh (Collaborator) commented on Apr 2, 2024

Modified configs: deepseek, llama2, internlm2, yi and zephyr.

  1. Add sequence_parallel_size to the configs.
  2. Set accumulative_counts = accumulative_counts * sequence_parallel_size. Suppose we train with a batch size per device of 1 and a maximum length of max_length on N GPUs. Once the sequence parallel size is set to SP, each sequence of length max_length is split into SP segments, and each segment is dispatched to one of SP GPUs for parallelized training, so only N / SP distinct sequences are processed per step. Multiplying accumulative_counts by SP restores the original effective batch size and keeps training equivalent.
  3. If sequence_parallel_size is greater than 1, use SequenceParallelSampler; otherwise use DefaultSampler (a config sketch follows this list).
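For reference, the pattern applied across the modified configs can be sketched roughly as follows. This is a minimal sketch in the mmengine-style config format xtuner uses; the import path for SequenceParallelSampler, the concrete numbers, and the dataloader fields (train_dataset, default_collate_fn) are illustrative assumptions rather than an excerpt from this PR.

```python
from mmengine.dataset import DefaultSampler

# Assumed import path for the sequence-parallel sampler.
from xtuner.parallel.sequence import SequenceParallelSampler

sequence_parallel_size = 2      # SP; 1 disables sequence parallelism

# Scale gradient accumulation by SP so the effective global batch size
# matches the SP = 1 setup (see point 2 above).
accumulative_counts = 16 * sequence_parallel_size

# Point 3: pick the sampler based on whether sequence parallelism is enabled.
sampler = SequenceParallelSampler \
    if sequence_parallel_size > 1 else DefaultSampler

train_dataloader = dict(
    batch_size=1,                # batch size per device
    num_workers=0,
    dataset=train_dataset,       # assumed to be defined earlier in the config
    sampler=dict(type=sampler, shuffle=True),
    collate_fn=dict(type=default_collate_fn))  # assumed collate fn placeholder
```

With sequence_parallel_size = 2 in this sketch, each max_length sequence is split across 2 GPUs, so accumulative_counts is doubled to recover the same effective batch size as the non-parallel configuration.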

@HIT-cwh merged commit ea33f46 into InternLM:main on Apr 2, 2024
1 of 3 checks passed
@HIT-cwh deleted the fix_sp_configs branch on Apr 2, 2024 at 05:20