[Configs] Add sequence_parallel_size and SequenceParallelSampler to configs #538
Modified configs: deepseek, llama2, internlm2, yi and zephyr.
Add `sequence_parallel_size` to configs.

Suppose we train with sequences of `max_length` tokens on N GPUs. Upon setting the sequence parallelism dimension to `SP`, the accumulative counts have to be adjusted to `SP` times their original value. This adjustment is essential to preserve training equivalence: each sequence of `max_length` tokens is segmented into `SP` parts, with each part assigned to one of the `SP` GPUs in its group for parallelized training, so the effective data-parallel size drops from N to N/`SP`, which the larger accumulation factor compensates for.
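Concretely, in a config this amounts to scaling the accumulation factor by the parallel size. A minimal sketch with illustrative values (variable names follow XTuner's config conventions; the numbers are made up):

```python
# Illustrative values; real configs set these per model/dataset.
max_length = 2048
batch_size = 1            # per-device micro-batch
accumulative_counts = 16  # gradient accumulation without sequence parallel

# With sequence parallelism, each max_length sequence is split across
# `sequence_parallel_size` GPUs, shrinking the effective data-parallel
# world size by the same factor. Scaling the accumulation count restores
# the original global batch size, keeping training equivalent.
sequence_parallel_size = 2
accumulative_counts *= sequence_parallel_size  # 16 -> 32
```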
If `sequence_parallel_size` is greater than 1, use `SequenceParallelSampler`; otherwise use `DefaultSampler`.
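In a config, this sampler switch boils down to a one-line conditional. The sketch below assumes XTuner's `SequenceParallelSampler` (from `xtuner.parallel.sequence`) and mmengine's `DefaultSampler`, with a placeholder dataset standing in for whatever the real config builds:

```python
from mmengine.dataset import DefaultSampler
from xtuner.parallel.sequence import SequenceParallelSampler

sequence_parallel_size = 2

# Use the sequence-parallel-aware sampler only when sequences are
# actually split across GPUs; otherwise keep mmengine's DefaultSampler.
sampler = SequenceParallelSampler \
    if sequence_parallel_size > 1 else DefaultSampler

train_dataloader = dict(
    batch_size=1,
    num_workers=0,
    dataset=dict(type='DummyDataset'),  # placeholder; real configs build the dataset here
    sampler=dict(type=sampler, shuffle=True),
)
```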