Bug fix: vlm model with seq parallel #517
2. Fix sequence parallelism bug with Qwen3.5
/gemini review
Pull request overview
Fixes sequence-parallel shape mismatches when using veRL optimized (torch/triton) backends, including a targeted patch for Qwen3.5.
Changes:
- Adds backend-specific monkey patches to slice `input_ids`/`labels` to match sequence-parallel hidden-state shapes.
- Introduces a custom autograd slicing op for Qwen3.5 sequence-parallel output handling.
- Updates `train_batch_size` divisibility validation to account for Ulysses sequence parallel size.
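The shape alignment described in the first bullet can be illustrated with a minimal pure-Python sketch. It assumes each sequence-parallel rank owns one contiguous, equally sized shard of the padded sequence; the helper names `pad_to_multiple` and `slice_for_rank` are hypothetical and are not taken from the patch itself.

```python
def pad_to_multiple(tokens, sp_size, pad_id=0):
    """Right-pad `tokens` so its length divides evenly across `sp_size` ranks."""
    remainder = len(tokens) % sp_size
    if remainder:
        tokens = tokens + [pad_id] * (sp_size - remainder)
    return tokens


def slice_for_rank(tokens, sp_rank, sp_size, pad_id=0):
    """Return the contiguous shard of `tokens` owned by `sp_rank` (hypothetical helper)."""
    padded = pad_to_multiple(list(tokens), sp_size, pad_id)
    shard_len = len(padded) // sp_size
    start = sp_rank * shard_len
    return padded[start:start + shard_len]


input_ids = [11, 12, 13, 14, 15, 16, 17]
labels = [12, 13, 14, 15, 16, 17, -100]

# Each rank sees hidden states for its shard only, so input_ids and labels
# must be cut to the same shard before the loss is computed.
shards = [slice_for_rank(input_ids, rank, 4) for rank in range(4)]
last_label_shard = slice_for_rank(labels, 3, 4, pad_id=-100)
```

With four ranks, every shard has length two, which is the per-rank length the hidden states would also have under this assumption.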
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| trinity/trainer/verl/monkey_patch.py | Adds runtime monkey patches for torch/triton optimized backends to correct seq-parallel shape alignment. |
| trinity/common/patch/qwen3_5.py | Adds a custom autograd-based slice and applies it in the seq-parallel decorator for Qwen3.5. |
| trinity/common/config_validator.py | Adjusts batch-size divisibility validation to use the effective data-parallel GPU count under sequence parallelism. |
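The divisibility rule in the last row can be sketched as follows. This is an illustration under the assumption that the effective data-parallel size is the GPU count divided by the Ulysses sequence parallel size; the function names and error messages are hypothetical, not the validator's actual code.

```python
def effective_dp_size(world_size, sp_size):
    """GPUs that hold distinct data once sp_size GPUs share each sequence."""
    if sp_size < 1:
        raise ValueError("sequence parallel size must be >= 1")
    if world_size % sp_size != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by sp_size={sp_size}"
        )
    return world_size // sp_size


def check_train_batch_size(train_batch_size, world_size, sp_size=1):
    """Raise if the batch cannot be split evenly across data-parallel groups."""
    dp_size = effective_dp_size(world_size, sp_size)
    if train_batch_size % dp_size != 0:
        raise ValueError(
            f"train_batch_size={train_batch_size} must be divisible by "
            f"the effective data-parallel size {dp_size}"
        )
    return dp_size


# 8 GPUs with sequence parallel size 2 leave 4 data-parallel groups,
# so a batch of 12 is accepted here even though 12 % 8 != 0.
dp_size = check_train_batch_size(12, world_size=8, sp_size=2)
```

Checking against the raw GPU count would reject this configuration, which is the kind of false failure the adjusted validation appears intended to avoid.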
2. Upgrade vLLM support to 0.18.0
/unittest-all
Pull request overview
This PR addresses multiple sequence-parallelism correctness issues in the veRL-based trainer integration (including Qwen3.5 and fused-kernel backends), updates the supported vLLM version range to include 0.18.0, and introduces a local replacement for veRL’s rearrange_micro_batches.
Changes:
- Add stricter sequence-parallel/FSDP validation (including batch-size divisibility under SP).
- Patch fused-kernel paths to fix VLM SP shape mismatches, and add a Qwen3.5 SP slicing autograd helper.
- Expand vLLM patch support up to v0.18.0 and monkey-patch veRL seqlen balancing behavior in FSDP workers.
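For readers unfamiliar with sequence-length balancing, the idea behind a function like `rearrange_micro_batches` can be sketched with a simple greedy heuristic. This is not veRL's implementation nor the local replacement added in this PR; the function name `balance_micro_batches` and the choice of a longest-first greedy assignment are assumptions made purely for illustration.

```python
import heapq
import math


def balance_micro_batches(seq_lens, max_token_len):
    """Group sequence indices into micro-batches with similar token totals.

    Greedy longest-first assignment: each sequence goes to the micro-batch
    that currently holds the fewest tokens. The heuristic balances totals
    but does not strictly guarantee every bin stays under max_token_len.
    """
    if not seq_lens:
        return []
    if any(n > max_token_len for n in seq_lens):
        raise ValueError("a single sequence exceeds max_token_len")

    num_bins = max(1, math.ceil(sum(seq_lens) / max_token_len))
    bins = [[] for _ in range(num_bins)]
    heap = [(0, b) for b in range(num_bins)]  # (token_total, bin_index)
    heapq.heapify(heap)

    order = sorted(range(len(seq_lens)), key=lambda i: -seq_lens[i])
    for idx in order:
        total, b = heapq.heappop(heap)
        bins[b].append(idx)
        heapq.heappush(heap, (total + seq_lens[idx], b))
    return bins


seq_lens = [7, 3, 5, 2, 6, 1]
micro_batches = balance_micro_batches(seq_lens, max_token_len=12)
totals = [sum(seq_lens[i] for i in mb) for mb in micro_batches]
```

In this example the 24 tokens are split into two micro-batches of 12 tokens each, so neither worker step is dominated by one long sequence.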
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| trinity/trainer/verl/verl_config.py | Adds SP/FSDP validation including new batch-size divisibility check. |
| trinity/trainer/verl/utils.py | Introduces a local rearrange_micro_batches implementation for seqlen balancing. |
| trinity/trainer/verl/monkey_patch.py | Adds patch_fused_kernels and calls it when applying fused-kernel backends. |
| trinity/trainer/verl/fsdp_workers.py | Monkey-patches veRL’s seqlen_balancing.rearrange_micro_batches in worker init. |
| trinity/common/patch/qwen3_5.py | Adds a custom autograd Slice for Qwen3.5 sequence-parallel slicing behavior. |
| trinity/common/models/vllm_patch/worker_patch.py | Updates version gating to allow vLLM 0.18.0. |
| trinity/common/config_validator.py | Normalizes invalid ulysses_sequence_parallel_size to 1 for veRL trainer. |
| pyproject.toml | Expands optional vLLM dependency upper bound to <=0.18.0. |
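The version gating mentioned for worker_patch.py can be pictured with a small standard-library sketch. The helpers `parse_version` and `is_supported` are hypothetical; the real patch may well rely on `packaging.version` instead. Only the upper bound is checked here because the overview does not state the lower bound.

```python
import re


def parse_version(version):
    """Turn '0.18.0' (or '0.18.0+cu128') into a comparable (major, minor, patch) tuple.

    Simplification: pre-release and local suffixes are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version, max_version="0.18.0"):
    """Return True when `version` does not exceed the patched upper bound."""
    return parse_version(version) <= parse_version(max_version)


supported = is_supported("0.18.0")
too_new = is_supported("0.18.1")
```

Comparing integer tuples avoids the string-comparison pitfall where "0.9.0" would sort after "0.18.0".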
Description
rearrange_micro_batches
Checklist
Please check the following items before code is ready to be reviewed.