Enable arch tests for Qwen3VL and Cohere2 in OpenVINO backend #180
Merged
zhaixuejun1993 merged 2 commits into ravi9:dev_backend_openvino on May 25, 2026 (commit da48690)
4 of 14 checks passed

This pull request introduces improvements to the handling of dynamic sequence lengths in the ROPE (Rotary Positional Embedding) implementation within the OpenVINO integration. The changes ensure that sequence slicing is handled correctly per ROPE node, prevent unsafe sharing of positional encodings, and allow passing dynamic sequence lengths through the computation graph.
Dynamic sequence handling and ROPE improvements:

- The `make_sin_cos` function now accepts an optional `token_len_per_seq` parameter, allowing dynamic slicing of input positions based on the active sequence length. The function applies slicing when `token_len_per_seq` is provided and the mode is not `imrope`. [1] [2] [3]
- The `translate_rope` function is updated to detect the `token_len_per_seq` input and pass it to `make_sin_cos` if available, enabling dynamic sequence length support in ROPE computations.
- `token_len_per_seq` in `GgmlOvDecoder::compute_llm_params` is corrected to derive the value from the input sequence shape, ensuring accurate sequence length propagation.

Safety and correctness enhancements:

- The `add_rope_sin_cos` function now avoids reusing shared ROPE sine/cosine values across the graph when dynamic active-sequence slicing is in use, preventing potential mismatches in positional encoding.
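The slicing and reuse rules described above can be illustrated with a minimal, self-contained C++17 sketch. It is not the code from this pull request: the actual backend builds OpenVINO graph nodes, whereas this sketch works on plain vectors. The names `make_sin_cos_sketch`, `can_share_sin_cos`, `RopeMode`, and `SinCos`, the `theta_base` parameter, and the frequency formula are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the ROPE modes mentioned in the description.
enum class RopeMode { Normal, Imrope };

// Flattened [tokens * half_dim] tables of sine and cosine values.
struct SinCos {
    std::vector<float> sin_vals;
    std::vector<float> cos_vals;
};

// Sketch of a make_sin_cos-like helper. Positions are sliced to the active
// token count only when token_len_per_seq is provided and the mode is not
// imrope; otherwise every position is used.
SinCos make_sin_cos_sketch(const std::vector<int>& positions,
                           std::size_t half_dim,
                           float theta_base,
                           RopeMode mode,
                           std::optional<std::size_t> token_len_per_seq = std::nullopt) {
    std::size_t n_tokens = positions.size();
    if (token_len_per_seq.has_value() && mode != RopeMode::Imrope) {
        n_tokens = std::min(n_tokens, *token_len_per_seq);
    }

    SinCos out;
    out.sin_vals.reserve(n_tokens * half_dim);
    out.cos_vals.reserve(n_tokens * half_dim);
    for (std::size_t t = 0; t < n_tokens; ++t) {
        for (std::size_t i = 0; i < half_dim; ++i) {
            // Assumed standard rotary frequency: theta_base^(-i / half_dim).
            const float freq = std::pow(
                theta_base, -static_cast<float>(i) / static_cast<float>(half_dim));
            const float angle = static_cast<float>(positions[t]) * freq;
            out.sin_vals.push_back(std::sin(angle));
            out.cos_vals.push_back(std::cos(angle));
        }
    }
    return out;
}

// Sketch of the reuse rule: a shared sin/cos table is only safe when no
// dynamic active-sequence slicing is in effect for the node.
bool can_share_sin_cos(bool has_token_len_per_seq) {
    return !has_token_len_per_seq;
}
```

With four input positions and `token_len_per_seq` set to 2, the sketch produces tables for two tokens only, while the same call in `RopeMode::Imrope` keeps all four. Because the sliced result depends on a per-node input, the reuse check declines to share a table whenever that input is present.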