Support dpo/grpo/gkd/sft padding_free by tastelikefeet · Pull Request #181 · modelscope/twinkle

tastelikefeet · 2026-04-23T08:54:24Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Support padding-free
Support a no-argument constructor for Dataset
Support printing logger information on non-master (slave) processes
Change the default value of variable_seq_lengths to True to support padding-free
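
As a rough illustration of the logger item above, the sketch below shows one way rank-aware logging could look using only the standard `logging` module. It is not the twinkle implementation: `RankFilter`, `get_rank_logger`, and the `all_ranks` flag are hypothetical names, and reading the rank from the `RANK` environment variable assumes a launcher such as torchrun has set it.

```python
import logging
import os
from typing import Optional


class RankFilter(logging.Filter):
    """Keep records on rank 0; on other ranks keep only opted-in records."""

    def __init__(self, rank: int):
        super().__init__()
        self.rank = rank

    def filter(self, record: logging.LogRecord) -> bool:
        # `all_ranks` is a hypothetical per-record flag passed via `extra=`.
        return bool(self.rank == 0 or getattr(record, "all_ranks", False))


def get_rank_logger(name: str = "demo", rank: Optional[int] = None) -> logging.Logger:
    """Build a logger that prefixes messages with the process rank."""
    if rank is None:
        # Assumption: the launcher exports RANK; default to 0 otherwise.
        rank = int(os.environ.get("RANK", "0"))
    logger = logging.getLogger(f"{name}.rank{rank}")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter(f"[rank {rank}] %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.addFilter(RankFilter(rank))
    return logger
```

With this sketch, `get_rank_logger(rank=1).info("step done")` prints nothing, while `get_rank_logger(rank=1).info("step done", extra={"all_ranks": True})` does print on the non-master process.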

Experiment results

Tested padding-free on megatron/transformers grpo/dpo/sft/gkd
Tested dataset constructor on grpo

gemini-code-assist

Code Review

This pull request centralizes the logic for unpacking packed sequences (padding-free mode) into the InputProcessor class, moving it out of specific loss implementations like GRPO. It introduces a canonical method to detect packing and unpack tensors such as log-probabilities and labels into a per-sequence batch format. These changes are integrated into both Megatron and Transformers sequence parallel strategies. The review feedback identifies several improvement opportunities: ensuring the boundary detection logic explicitly includes the first sequence, relaxing the packing detection heuristic to support sequences of length one, and optimizing the unpacking process for better performance and consistency.
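
The unpacking step described in this review can be sketched as follows. This is a minimal illustration, not the code in `src/twinkle/processor/base.py`: the function names are hypothetical, plain Python lists stand in for tensors, and it assumes each packed sequence restarts its position ids at 0. It reflects two of the review points: index 0 is always treated as a boundary so the first sequence is included, and a sequence of length one (a lone 0) is still detected.

```python
from typing import List, Sequence


def find_sequence_starts(position_ids: Sequence[int]) -> List[int]:
    """Return the start index of every sequence in one packed row."""
    if len(position_ids) == 0:
        return []  # guard: nothing to unpack
    starts = [0]  # the first sequence always starts at index 0
    for i in range(1, len(position_ids)):
        if position_ids[i] == 0:  # position ids restart at each new sequence
            starts.append(i)
    return starts


def unpack_packed(values: Sequence[float],
                  position_ids: Sequence[int],
                  pad_value: float = 0.0) -> List[List[float]]:
    """Split a packed row into per-sequence rows, right-padded to equal length."""
    starts = find_sequence_starts(position_ids)
    if not starts:
        return []
    ends = starts[1:] + [len(position_ids)]
    rows = [list(values[s:e]) for s, e in zip(starts, ends)]
    max_len = max(len(r) for r in rows)
    return [r + [pad_value] * (max_len - len(r)) for r in rows]


# Three packed sequences of lengths 3, 1, and 2.
position_ids = [0, 1, 2, 0, 0, 1]
logps = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
batch = unpack_packed(logps, position_ids)
```

Here `batch` has one row per original sequence, so a loss written for padded `[batch, seq]` inputs could consume it without knowing about packing.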

tastelikefeet · 2026-04-27T03:42:52Z

/gemini review

gemini-code-assist

Code Review

This pull request refactors the handling of packed sequences by moving unpacking logic into the InputProcessor and introduces a require_logits attribute to loss classes to optimize memory usage. It also updates Megatron and Transformers models to support variable sequence lengths by default, implements rank-aware logging, and includes FSDP2 compatibility fixes for LoRA dtypes. Feedback points out an inconsistent error message in the sampling configuration and a potential crash in the sequence unpacking utility when processing empty lists.
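
One plausible reading of the `require_logits` attribute is sketched below. Only the attribute name comes from this review; the class names, `build_outputs`, and the way the flag is consumed are assumptions for illustration. The idea shown is that a loss declares whether it needs full logits, and the caller keeps only per-token log-probabilities otherwise, which is where the memory saving would come from.

```python
import math
from typing import Dict, List


class BaseLoss:
    # Assumption: most losses only need per-token log-probabilities.
    require_logits = False


class FullDistributionLoss(BaseLoss):
    # Hypothetical loss that compares whole distributions, so it needs logits.
    require_logits = True


def token_logps(logits: List[List[float]], labels: List[int]) -> List[float]:
    """Log-softmax each row and pick the log-probability of its label."""
    out = []
    for row, label in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        out.append(row[label] - log_z)
    return out


def build_outputs(loss: BaseLoss,
                  logits: List[List[float]],
                  labels: List[int]) -> Dict[str, object]:
    """Return log-probs always, and logits only when the loss asks for them."""
    outputs: Dict[str, object] = {"logps": token_logps(logits, labels)}
    if getattr(loss, "require_logits", False):
        outputs["logits"] = logits
    return outputs
```

For example, `build_outputs(BaseLoss(), logits, labels)` returns only `logps`, while passing `FullDistributionLoss()` also keeps `logits`.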

…ding_free_fix

…eet/twinkle into feat/padding_free_fix

fix

f875119

gemini-code-assist Bot reviewed Apr 23, 2026

Comment thread src/twinkle/processor/base.py Outdated

Comment thread src/twinkle/processor/base.py

Comment thread src/twinkle/processor/base.py

tastelikefeet added 11 commits April 24, 2026 19:23

fix

04411ee

fix

3bafa63

wip

d6ccd46

fix

e8e37b5

fix

03a8c0b

fix

13cef5c

fix

46136d6

wip

7ea0729

fix

30cc8e1

fix

1f93f73

fix

cf0aa03

tastelikefeet changed the title ~~[WIP]Support dpo/grpo simple padding_free~~ Support dpo/grpo simple padding_free Apr 26, 2026

tastelikefeet added 2 commits April 27, 2026 11:27

fix

2db614e

fix

33a66a8

fix

c168462

gemini-code-assist Bot reviewed Apr 27, 2026

Comment thread src/twinkle/data_format/sampling.py

Comment thread src/twinkle/processor/base.py

Merge commit 'ed00083b59be143a8550a166ea30019eaad37e0d' into feat/pad…

7203e77

…ding_free_fix

tastelikefeet changed the title ~~Support dpo/grpo simple padding_free~~ Support dpo/grpo/gkd padding_free Apr 27, 2026

tastelikefeet changed the title ~~Support dpo/grpo/gkd padding_free~~ Support dpo/grpo/gkd/sft padding_free Apr 27, 2026

Jintao-Huang approved these changes Apr 27, 2026

hjh0119 approved these changes Apr 27, 2026

Comment thread cookbook/rl/dpo_lora.py Outdated

tastelikefeet added 4 commits April 27, 2026 14:38

fix

2e7147d

Merge branch 'feat/padding_free_fix' of https://github.com/tastelikef…

0cd526c

…eet/twinkle into feat/padding_free_fix

fix

d3e9ab4

fix

af66a64

tastelikefeet merged commit 55d377a into modelscope:main Apr 27, 2026
1 of 3 checks passed

tastelikefeet merged 20 commits into modelscope:main from tastelikefeet:feat/padding_free_fix


3 participants
