Add is_model_splitted interface #71
Merged
zhaixuejun1993 merged 5 commits into ravi9:dev_backend_openvino from Mar 19, 2026
wine99 approved these changes Mar 17, 2026
cavusmustafa requested changes Mar 17, 2026
cavusmustafa approved these changes Mar 18, 2026
Merged commit be67f32 into ravi9:dev_backend_openvino
94 of 117 checks passed
Collaborator comment:
@zhaixuejun1993 The split check seems to be expensive when running with quantized models

This pull request adds support for detecting and handling split-model computation graphs ("splitted" models) in the OpenVINO GGML decoder integration. The changes introduce a heuristic function to determine if a computation graph is a split subgraph and propagate this information through the decoder's API and logic. This enables more robust and accurate model handling, especially for cases where models are partitioned or composed of multiple subgraphs.
Key changes include:

Split-model detection and handling:
- Introduced the `is_model_splitted` function in `utils.cpp` and its declaration in `utils.h`, which heuristically checks if a `ggml_cgraph` is a split-model fragment. This function is now used in the dynamic compute path to determine whether to use naive computation or not.
- Updated the `GgmlOvDecoder` class and its constructor to accept and store a `model_is_splitted` flag, and added the `is_splited_model()` virtual method to the decoder interface. This allows downstream components to query if the current model is split.

Integration with compute logic:
- Changes to `utils.cpp` utilize the new split-model detection logic and correctly construct `GgmlOvDecoder` instances with the appropriate `model_is_splitted` flag.
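
For readers skimming the description, the flow it outlines (run a heuristic over the graph once, pass the result into the decoder constructor, and expose it through a virtual getter) might look roughly like the sketch below. This is only an illustration under stated assumptions: `ToyNode`, `ToyGraph`, `ToyDecoderInterface`, `ToyGgmlOvDecoder`, `toy_is_model_splitted`, and `make_decoder` are hypothetical stand-ins, not the real ggml or OpenVINO types, and the heuristic shown (a node reads an intermediate tensor that this graph does not produce) is an assumed example, not the check implemented in this PR.

```cpp
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a graph tensor/node; NOT the real ggml_tensor.
struct ToyNode {
    std::string name;
    std::vector<const ToyNode*> inputs;  // tensors this node reads
    bool is_weight_or_input = false;     // true for weights and model inputs
};

// Hypothetical stand-in for ggml_cgraph.
struct ToyGraph {
    std::vector<const ToyNode*> nodes;   // nodes computed by this graph
};

// Assumed heuristic (not the PR's actual logic): report a split fragment when
// some node reads an intermediate tensor that is neither a weight/input nor
// produced by a node inside this graph.
bool toy_is_model_splitted(const ToyGraph& g) {
    std::unordered_set<const ToyNode*> produced(g.nodes.begin(), g.nodes.end());
    for (const ToyNode* n : g.nodes) {
        for (const ToyNode* in : n->inputs) {
            if (!in->is_weight_or_input && produced.count(in) == 0) {
                return true;
            }
        }
    }
    return false;
}

// Decoder interface exposing the query named in the PR description.
class ToyDecoderInterface {
public:
    virtual ~ToyDecoderInterface() = default;
    virtual bool is_splited_model() const = 0;
};

// Decoder that accepts and stores the flag at construction time.
class ToyGgmlOvDecoder : public ToyDecoderInterface {
public:
    explicit ToyGgmlOvDecoder(bool model_is_splitted)
        : m_model_is_splitted(model_is_splitted) {}
    bool is_splited_model() const override { return m_model_is_splitted; }

private:
    bool m_model_is_splitted;
};

// Compute-path helper: evaluate the heuristic once and hand the result over.
std::unique_ptr<ToyDecoderInterface> make_decoder(const ToyGraph& g) {
    return std::make_unique<ToyGgmlOvDecoder>(toy_is_model_splitted(g));
}
```

Storing the result as a constructor argument means downstream code calls `is_splited_model()` without re-walking the graph. In this sketch the heuristic itself visits every input of every node, so its cost grows with graph size.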