support truncation_strategy split & cached_dataset (qwen3.5)#9211

Merged
Jintao-Huang merged 5 commits into modelscope:main from Jintao-Huang:support_split_cached_dataset
Apr 25, 2026
Conversation

@Jintao-Huang
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enables the use of the split truncation strategy in conjunction with cached datasets during pre-training. The changes include documentation updates in both Chinese and English, an updated example script demonstrating the correct workflow, and a modification in sft.py to relax restrictions on multimodal models for this strategy. Feedback was provided to ensure the Chinese documentation is consistent with the English version by adding a warning that the split strategy must be explicitly set during both the dataset export and training phases when using cached datasets.
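For readers unfamiliar with the idea, a split-style truncation strategy can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the function name `split_into_chunks` and the chunking rule (consecutive, non-overlapping windows of at most `max_length` tokens) are assumptions made for the example.

```python
from typing import Iterator, List


def split_into_chunks(input_ids: List[int], max_length: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most max_length tokens.

    Hypothetical helper: unlike a strategy that drops or trims
    over-length samples, no tokens are discarded here.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    for start in range(0, len(input_ids), max_length):
        yield input_ids[start:start + max_length]


# A 10-token sample with max_length=4 becomes three training samples.
sample = list(range(10))
chunks = list(split_into_chunks(sample, max_length=4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because a single source sample can turn into several chunks, the number of rows after splitting depends on the strategy, which is presumably why it has to be set at export time as well as at training time.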

Comment thread docs/source/Instruction/Command-line-parameters.md Outdated
@Jintao-Huang
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enables compatibility between the split truncation strategy and cached_dataset for pre-training. It updates the documentation to specify that when using this combination, input_ids are stored and consistent parameters must be used across export and training. The changes also include updates to the validation logic in sft.py and the example pre-training script. I have no feedback to provide.
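The consistency requirement mentioned above could be checked with something like the sketch below. It is a hedged illustration, not part of the PR: `find_mismatches` is a hypothetical helper, and the dictionaries stand in for whatever arguments were actually passed to the export and training steps. The key names and values are chosen for illustration.

```python
from typing import Any, Dict, Iterable, List


def find_mismatches(
    export_cfg: Dict[str, Any],
    train_cfg: Dict[str, Any],
    keys: Iterable[str],
) -> List[str]:
    """Return the keys whose values differ between export and training.

    A key that is missing on one side counts as a mismatch, which mirrors
    the warning that the strategy must be set explicitly in both phases.
    """
    return [k for k in keys if export_cfg.get(k) != train_cfg.get(k)]


# Illustrative settings: the training side forgot to set the strategy.
export_cfg = {"truncation_strategy": "split", "max_length": 8192}
train_cfg = {"max_length": 8192}

mismatches = find_mismatches(
    export_cfg, train_cfg, keys=["truncation_strategy", "max_length"]
)
print(mismatches)  # ['truncation_strategy']
```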

@Jintao-Huang Jintao-Huang merged commit 68eae8c into modelscope:main Apr 25, 2026
1 of 3 checks passed

3 participants