Updating Subfn, Prefill only logic in Disagg mode #820
Merged
ochougul merged 5 commits into quic:qwen3_vl_mainline on Mar 5, 2026
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Force-pushed from c3343ed to f9a3d17
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
ochougul requested changes on Mar 4, 2026
             Path to the generated ONNX graph file for the language decoder.
         """
-        if prefill_only:
+        if prefill_only is not None:
Can we rewrite this as:

    if prefill_only:
        assert prefill_seq_len > 1
        if not enable_chunking and self.continuous_batching:
            raise NotImplementedError(
                "Looks like you are trying to run prefix-caching without chunking, this feature is not available yet!"
            )
        self.hash_params["prefill_only"] = True
        self.prefill(enable=True, enable_chunking=enable_chunking)
    else:
        self.hash_params["prefill_only"] = False
        self.prefill(False, retain_full_kv=kwargs.get("retain_full_kv", False))
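The difference between the two conditions matters once `prefill_only` defaults to `None`: `if prefill_only is not None:` also takes the first branch for an explicit `False`, whereas `if prefill_only:` does not. A minimal, self-contained sketch of that behavior follows; both helper functions and the branch labels are hypothetical and are not part of the PR.

```python
from typing import Optional


def branch_with_is_not_none(prefill_only: Optional[bool] = None) -> str:
    # Hypothetical helper mirroring `if prefill_only is not None:`.
    if prefill_only is not None:
        return "prefill_branch"
    return "else_branch"


def branch_with_truthiness(prefill_only: Optional[bool] = None) -> str:
    # Hypothetical helper mirroring the suggested `if prefill_only:`.
    if prefill_only:
        return "prefill_branch"
    return "else_branch"


if __name__ == "__main__":
    # Compare the two checks for every value the parameter can take.
    for value in (None, False, True):
        print(value, branch_with_is_not_none(value), branch_with_truthiness(value))
```

The two checks only disagree for `prefill_only=False`, which is the case the suggested rewrite routes to the `else` branch.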
Comment on lines +1498 to +1499
+            or prefill_only
+            or prefill_seq_len == 1  # to export for prefill and decode
Remove both lines and try to use:

    if (
        vision_onnx_path is None
        or lang_onnx_path is None
    ):
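As a rough illustration of the suggested guard, the condition depends only on whether either ONNX path is missing, not on `prefill_only` or `prefill_seq_len`. The helper name `needs_export`, the example paths, and the reading that a true result triggers an export are assumptions made for this sketch, not code from the PR.

```python
from typing import Optional


def needs_export(vision_onnx_path: Optional[str], lang_onnx_path: Optional[str]) -> bool:
    # Hypothetical helper: true when at least one ONNX graph path is missing.
    return vision_onnx_path is None or lang_onnx_path is None


if __name__ == "__main__":
    # The file names below are placeholders for illustration only.
    print(needs_export(None, None))
    print(needs_export("vision.onnx", None))
    print(needs_export("vision.onnx", "lang.onnx"))
```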
             **compiler_options,
         )
-        if skip_vision and prefill_only:  # for disagg serving
+        if skip_vision and (prefill_only or prefill_seq_len == 1):  # for disagg serving
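A small sketch of how the widened condition behaves: with the change, a compile with `prefill_seq_len == 1` also satisfies the check when `skip_vision` is set, even if `prefill_only` is left unset. The function name `takes_disagg_branch` is hypothetical, and what the real code does inside that branch is not shown in this hunk.

```python
from typing import Optional


def takes_disagg_branch(skip_vision: bool, prefill_only: Optional[bool], prefill_seq_len: int) -> bool:
    # Hypothetical helper mirroring the updated condition from the diff.
    return bool(skip_vision and (prefill_only or prefill_seq_len == 1))


if __name__ == "__main__":
    # prefill_only unset, sequence length 1: the new condition holds.
    print(takes_disagg_branch(True, None, 1))
    # prefill_only unset, longer prefill length: the condition does not hold.
    print(takes_disagg_branch(True, None, 128))
    # Vision not skipped: the condition never holds.
    print(takes_disagg_branch(False, True, 128))
```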
         self,
         export_dir: Optional[str] = None,
-        prefill_only: Optional[bool] = False,
+        prefill_only: Optional[bool] = None,
Force-pushed from 06bc8dd to 23a9844
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Force-pushed from 23a9844 to 0a940aa
ochougul approved these changes on Mar 5, 2026
qcdipankar pushed a commit that referenced this pull request on Mar 10, 2026
Added Support for Subfn for Qwen 3 VL dense, MOE.
Updated prefill only logic for disagg mode
---------
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
qcdipankar pushed a commit that referenced this pull request on Mar 11, 2026
Added Support for Subfn for Qwen 3 VL dense, MOE.
Updated prefill only logic for disagg mode
---------
Signed-off-by: vtirumal <vtirumal@qti.qualcomm.com>
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Added Support for Subfn for Qwen 3 VL dense, MOE.
Updated prefill only logic for disagg mode