Adding multi_specializations_frames #909

Merged: quic-rishinr merged 10 commits into quic:release/v1.21.6 from mohiso22:qwen_multi on Apr 10, 2026

@mohiso22 mohiso22 commented Apr 7, 2026

Adds multi-specialization support for Qwen2.5-VL and Qwen3-VL, enabling the vision encoder to be compiled with multiple resolution configurations (height/width/num_frames as lists) in one shot.
Introduces a dedicated _generate_multi_frame_specialization inference path that selects the right specialization at runtime, along with example scripts for both model families.
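
As a rough illustration of the idea described above, here is a minimal sketch of selecting one of several compiled configurations at runtime. It assumes the height/width/num_frames lists are paired element-wise and that selection is an exact match; neither is confirmed by this PR. The names `Specialization`, `build_specializations`, and `select_specialization` are hypothetical and are not part of QEfficient.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Specialization:
    """Hypothetical record of one compiled vision-encoder configuration."""

    height: int
    width: int
    num_frames: int


def build_specializations(
    heights: Sequence[int], widths: Sequence[int], num_frames: Sequence[int]
) -> List[Specialization]:
    # Assumption: the three lists are paired element-wise, one entry per configuration.
    if not (len(heights) == len(widths) == len(num_frames)):
        raise ValueError("height, width and num_frames lists must have equal length")
    return [Specialization(h, w, f) for h, w, f in zip(heights, widths, num_frames)]


def select_specialization(
    specs: Sequence[Specialization], height: int, width: int, num_frames: int
) -> Specialization:
    # Assumption: only an exact match is acceptable; no padding or fallback is attempted.
    for spec in specs:
        if (spec.height, spec.width, spec.num_frames) == (height, width, num_frames):
            return spec
    raise ValueError(
        f"No compiled specialization for height={height}, width={width}, "
        f"num_frames={num_frames}"
    )
```

Failing loudly on an unmatched input keeps a request from silently running against a configuration it was not compiled for.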

@mohiso22 mohiso22 requested review from quic-rishinr and vbaddi April 7, 2026 04:20

vbaddi commented Apr 7, 2026

@mohiso22 can you fix the quickcheck pls?

Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread examples/image_text_to_text/models/qwen_vl/multi_specialization_inference.py Outdated
Comment thread QEfficient/generation/vlm_generation.py Outdated

def _generate_multi_frame_specialization(
    self,
    inputs: List[str] = None,

nit: update the type annotation for inputs to inputs: Dict[str, torch.Tensor]

Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread QEfficient/generation/vlm_generation.py Outdated
Comment thread QEfficient/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py Outdated
Comment thread QEfficient/transformers/models/qwen3_vl/modeling_qwen3_vl.py Outdated
@mohiso22 mohiso22 marked this pull request as ready for review April 9, 2026 14:27
}
]
else:
assert vision_size * f < user_vision_size, (

nit: better to use exception instead of assert.
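
One reason for this suggestion: `assert` statements are skipped when Python runs with `-O`, so the validation would silently disappear. A minimal sketch of the same condition as an explicit exception follows. The helper name `check_vision_size`, the choice of `ValueError`, and the error message are assumptions for illustration; the original assert message is not shown in this thread.

```python
def check_vision_size(vision_size: int, num_frames: int, user_vision_size: int) -> None:
    """Hypothetical helper mirroring `assert vision_size * f < user_vision_size`."""
    required = vision_size * num_frames
    # Same condition as the assert, but it still runs under `python -O`.
    if required >= user_vision_size:
        raise ValueError(
            f"Required vision size {required} (vision_size={vision_size} x "
            f"num_frames={num_frames}) must be smaller than the user-provided "
            f"vision size {user_vision_size}"
        )
```

The same pattern would apply to the `max_vision_size < ctx_len` check.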

grid_height = grid_height * time * batch_size
if not user_vision_size:
max_vision_size = max(max_vision_size, vision_size * f)
assert max_vision_size < ctx_len, (

nit: better to use exception instead of assert.

return self._generate_regular_batching(vision_prompts, generation_len, stream, **kwargs)

def run_prefill_multi_frame_specialization(
    self, inputs: Optional[torch.Tensor], num_frames: Optional[int] = 1, generation_len: int = None

please add a doc string for the method

self._session.deactivate()
self._vision_session.activate()

if not num_frames:

nit: better to make the check explicit, e.g. if num_frames == 0: num_frames = 1, and add a warning if that is what the check is for. Since the default value is 1, this condition should not be needed unless the value is being passed as None somewhere.
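
A minimal sketch of what this suggestion could look like, assuming both `0` and `None` should fall back to a single frame. The helper name `normalize_num_frames` and the warning text are hypothetical and not taken from the PR.

```python
import warnings
from typing import Optional


def normalize_num_frames(num_frames: Optional[int] = 1) -> int:
    """Hypothetical helper: make the fallback to one frame explicit and visible."""
    # Spell out the cases instead of relying on `if not num_frames:`.
    if num_frames is None or num_frames == 0:
        warnings.warn(
            f"num_frames={num_frames!r} is not usable; falling back to num_frames=1",
            stacklevel=2,
        )
        return 1
    return num_frames
```

Naming the two cases separately also makes it easier to later drop the `None` branch if no caller turns out to pass it.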

@quic-rishinr quic-rishinr changed the base branch from main to release/v1.21.6 April 10, 2026 11:13
Mohit Soni added 10 commits April 10, 2026 16:44
Signed-off-by: Mohit Soni <mohisoni@qti.qualcomm.com>
@quic-rishinr

Merging the PR on release/v1.21.6. @mohiso22 please raise a new PR on mainline with the couple of changes requested.

@quic-rishinr quic-rishinr merged commit adc4c18 into quic:release/v1.21.6 Apr 10, 2026
4 checks passed
quic-dhirajku added a commit to quic-dhirajku/efficient-transformers that referenced this pull request Jun 2, 2026
…models as per reference from quic#909.

Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>

3 participants