Fix `block_size` picking in `megatron_lm_gpt_pretraining` example. #2342

nilq · 2024-01-16T10:57:57Z

What does this PR do?

If no `block_size` is explicitly passed and the tokenizer happens to have a `model_max_length` of less than 1024, the current example will produce a bad `block_size`. With this change, we only cap `block_size` to 1024 if `tokenizer.model_max_length` is actually greater than 1024.
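For reviewers who want the intended behavior spelled out, here is a minimal sketch of the selection rule described above. It is not the example script's actual code: `pick_block_size`, its parameters, and the handling of an explicitly passed value are hypothetical and only illustrate the rule. The one thing taken from this PR is that the cap of 1024 applies only when the tokenizer's `model_max_length` exceeds it.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# The cap named in the PR description.
DEFAULT_CAP = 1024


def pick_block_size(
    requested: Optional[int],
    model_max_length: int,
    cap: int = DEFAULT_CAP,
) -> int:
    """Hypothetical helper illustrating the intended rule (not the script's code)."""
    if requested is not None:
        # Assumption for this sketch: an explicitly passed value is used as given.
        return requested

    # No explicit value: start from what the tokenizer supports.
    block_size = model_max_length
    if block_size > cap:
        logger.warning(
            "Tokenizer model_max_length (%d) is larger than %d; using %d instead.",
            model_max_length,
            cap,
            cap,
        )
        # The assignment lives inside the branch, so a smaller
        # model_max_length is never overwritten by the cap.
        block_size = cap
    return block_size
```

With this rule, a tokenizer whose `model_max_length` is 512 yields a `block_size` of 512, while one that reports 2048 is capped to 1024.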

Who can review?

Tagging @pacman100 / @sgugger as the authors of the surrounding code.

Only cap `block_size` to 1024 if `tokenizer.model_max_length` is actually greater than 1024.

pacman100

Thank you @nilq for the fix, LGTM!

HuggingFaceDocBuilderDev · 2024-01-18T05:46:51Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

muellerzr

Thanks!

Fix block_size picking in megatron_lm_gpt_pretraining.py

7ae10f3

muellerzr requested a review from pacman100 January 17, 2024 16:37

pacman100 approved these changes Jan 18, 2024

muellerzr approved these changes Jan 18, 2024

muellerzr merged commit 14d7c3f into huggingface:main Jan 18, 2024
23 checks passed

statelesshz pushed a commit to statelesshz/accelerate that referenced this pull request Jan 22, 2024

Fix block_size picking in megatron_lm_gpt_pretraining.py (huggingfa…

ed55fc5

…ce#2342) Only cap `block_size` to 1024 if `tokenizer.model_max_length` is actually greater than 1024.
