imporve supported runtime sort logic#183
Conversation
There was a problem hiding this comment.
Summary of Changes
Hello @pallasathena92, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly improves the logic for selecting the most suitable serving runtime for a given model. It introduces a sophisticated scoring system that considers the importance (weight) of model formats and frameworks, along with their defined priorities. This new approach, combined with refined secondary sorting based on model size proximity, ensures more accurate and efficient runtime assignment.
Highlights
- Introduction of 'weight' field: A new
weightfield (int64, default 1) has been added toModelFormatandModelFrameworkSpecwithin the Custom Resource Definitions (CRDs) and their corresponding Go structs. This enhancement allows for explicit prioritization of specific model formats and frameworks during runtime selection. - Enhanced Runtime Sorting Logic: The core runtime selection and sorting logic in
pkg/controller/v1beta1/inferenceservice/utils/runtime.gohas been significantly refactored. The primary sorting mechanism now leverages a new weighted scoring algorithm, replacing the previous priority-based sorting. - New Scoring Algorithm: A dedicated
scorefunction has been introduced to calculate a compatibility score for each supported runtime. This score is derived from(formatWeight × priority) + (frameworkWeight × priority), enabling a more nuanced and flexible prioritization of runtimes based on their model format and framework compatibility. - Improved Secondary Sorting: When runtimes have identical scores, the sorting now employs a distance-based comparison for the model's size range. This ensures that runtimes whose supported size ranges are numerically closer to the model's actual parameter size are preferred.
- Refined Compatibility Checks: New helper functions,
compareModelFormatandcompareModelFramework, centralize and streamline the version comparison logic. These functions correctly handle various operators (Equal, GreaterThan, GreaterThanOrEqual) and the presence of unofficial version parts. Runtimes that do not achieve a positive score (indicating no suitable format/framework match) are now efficiently filtered out early in the selection process. - Dependency Updates: The
go.sumfile has undergone extensive updates, reflecting various Go module dependency changes and upgrades across the project.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
0053b84 to
c6f01a2
Compare
There was a problem hiding this comment.
Code Review
This pull request introduces a sophisticated weighted scoring mechanism for selecting serving runtimes, which is a significant enhancement. The logic is mostly sound and the changes are well-contained. I've identified a few issues, including a critical bug in the sorting logic that could lead to non-deterministic runtime selection, and a logic error in the new scoring function. Additionally, there are opportunities to improve code maintainability by reducing duplication and enhancing logging and error messages. Once these points are addressed, this will be a solid contribution.
| if (modelFormatMatches || (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil)) && | ||
| (modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) { |
There was a problem hiding this comment.
There seems to be a logic issue in this condition. &baseModel.ModelFormat will never be nil because baseModel.ModelFormat is a struct field, not a pointer. This will cause the second part of the OR condition (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil) to always be false.
The intent is likely to check if the baseModel has a model format specified. You could check if baseModel.ModelFormat.Name is empty instead.
| if (modelFormatMatches || (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil)) && | |
| (modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) { | |
| if (modelFormatMatches || (supportedFormat.ModelFormat == nil && baseModel.ModelFormat.Name == "")) && | |
| (modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) { |
| if !r1HasSizeRange && r2HasSizeRange { | ||
| return false | ||
| } | ||
| return true |
There was a problem hiding this comment.
The final return true in this sort function can lead to unstable and unpredictable sorting behavior when scores and size ranges are equal. A stable sort should be ensured by adding a deterministic tie-breaker. For example, you could sort by the runtime name. Returning false would also be better as it would preserve the existing relative order of elements that are considered equal.
| return true | |
| return runtimes[i].Name < runtimes[j].Name |
| ModelName: "", // Will be filled by caller if available | ||
| ModelFormat: baseModel.ModelFormat.Name, | ||
| Reason: fmt.Sprintf("model format '%s' not in supported formats %v", getModelFormatLabel(baseModel), srSpec.SupportedModelFormats), | ||
| Reason: fmt.Sprintf("model format '%s' not in supported formats", getModelFormatLabel(baseModel)), |
There was a problem hiding this comment.
This error message is less informative than it could be. The previous version of the code included the list of supported formats, which is very helpful for debugging. Please consider adding that information back to the error message. You could create a helper function to extract and join the format names for a cleaner output.
| // compareModelFormat compares two model formats based on their versions and operators. | ||
| func compareModelFormat(supportedModelFormat *v1beta1.ModelFormat, basemodelModeFormat *v1beta1.ModelFormat, modelFormatMatches bool) bool { | ||
| hasUnofficialFormatVersion := false | ||
| // Parse versions | ||
| baseModelFormatVersion, err := modelVer.Parse(*basemodelModeFormat.Version) | ||
| if err != nil { | ||
| fmt.Println("Error parsing basModel modelFormat version:", err) | ||
| return false | ||
| } | ||
|
|
||
| supportedFormatVersion, err := modelVer.Parse(*supportedModelFormat.Version) | ||
| if err != nil { | ||
| fmt.Println("Error parsing supportedFormat modelFormat version:", err) | ||
| return false | ||
| } | ||
|
|
||
| // Check if versions have unofficial parts (requirement #1) | ||
| hasUnofficialFormatVersion = modelVer.ContainsUnofficialVersion(baseModelFormatVersion) || | ||
| modelVer.ContainsUnofficialVersion(supportedFormatVersion) | ||
|
|
||
| // Get operator from modelFormat in supportedFormat | ||
| operator := getRuntimeSelectorOperator(supportedModelFormat.Operator) | ||
|
|
||
| // Compare versions based on operator and whether unofficial versions exist (requirements #1, #2, #3) | ||
| if hasUnofficialFormatVersion || operator == "Equal" { | ||
| modelFormatMatches = modelVer.Equal(supportedFormatVersion, baseModelFormatVersion) | ||
| } else if operator == "GreaterThan" { | ||
| modelFormatMatches = modelVer.GreaterThan(supportedFormatVersion, baseModelFormatVersion) | ||
| } else if operator == "GreaterThanOrEqual" { | ||
| modelFormatMatches = modelVer.GreaterThanOrEqual(supportedFormatVersion, baseModelFormatVersion) | ||
| } else { | ||
| // Default to Equal for unknown operators | ||
| modelFormatMatches = modelVer.Equal(supportedFormatVersion, baseModelFormatVersion) | ||
| } | ||
| return modelFormatMatches | ||
| } | ||
|
|
||
| // compareModelFramework compares two modelFrameworks based on their versions and operators. | ||
| func compareModelFramework(supportedModelFramework *v1beta1.ModelFrameworkSpec, baseModelFramework *v1beta1.ModelFrameworkSpec, modelFrameworkMatches bool) bool { | ||
| hasUnofficialFrameworkVersion := false | ||
| // Parse framework versions | ||
| baseFrameworkVersion, err := modelVer.Parse(*baseModelFramework.Version) | ||
| if err != nil { | ||
| fmt.Println("Error parsing baseModel modelFramework version:", err) | ||
| return false | ||
| } | ||
|
|
||
| supportedFrameworkVersion, err := modelVer.Parse(*supportedModelFramework.Version) | ||
| if err != nil { | ||
| fmt.Println("Error parsing supportedFormat modelFramework version:", err) | ||
| return false | ||
| } | ||
|
|
||
| // Check if versions have unofficial parts (requirement #1) | ||
| hasUnofficialFrameworkVersion = modelVer.ContainsUnofficialVersion(baseFrameworkVersion) || | ||
| modelVer.ContainsUnofficialVersion(supportedFrameworkVersion) | ||
|
|
||
| // Get operator from modelFramework in supportedFormat | ||
| operator := getRuntimeSelectorOperator(supportedModelFramework.Operator) | ||
|
|
||
| // If there are unofficial versions or operator is Equal, use Equal comparison (requirements #1, #2) | ||
| if hasUnofficialFrameworkVersion || operator == "Equal" { | ||
| modelFrameworkMatches = modelVer.Equal(supportedFrameworkVersion, baseFrameworkVersion) | ||
| } else if operator == "GreaterThan" { | ||
| modelFrameworkMatches = modelVer.GreaterThan(supportedFrameworkVersion, baseFrameworkVersion) | ||
| } else if operator == "GreaterThanOrEqual" { | ||
| modelFrameworkMatches = modelVer.GreaterThanOrEqual(supportedFrameworkVersion, baseFrameworkVersion) | ||
| } else { | ||
| // Default to Equal for unknown operators | ||
| modelFrameworkMatches = modelVer.Equal(supportedFrameworkVersion, baseFrameworkVersion) | ||
| } | ||
| return modelFrameworkMatches | ||
| } |
There was a problem hiding this comment.
The functions compareModelFormat and compareModelFramework are nearly identical. This code duplication can be avoided by refactoring them into a single function to improve maintainability.
You could define an interface with GetVersion() and GetOperator() methods that both v1beta1.ModelFormat and v1beta1.ModelFrameworkSpec can implement.
Example interface:
type versionedSpec interface {
GetVersion() *string
GetOperator() *v1beta1.RuntimeSelectorOperator
}Then you could have a single compareVersionedSpec function.
| // Parse versions | ||
| baseModelFormatVersion, err := modelVer.Parse(*basemodelModeFormat.Version) | ||
| if err != nil { | ||
| fmt.Println("Error parsing basModel modelFormat version:", err) |
There was a problem hiding this comment.
Using fmt.Println for logging errors is not ideal in library code. It writes to standard output and lacks context. It would be better to use a structured logger (like the one used in the reconciler) and pass it down. Since this function is called from a sort callback, returning an error isn't feasible, but logging should be improved for better debuggability.
What type of PR is this?
/kind bug
/kind feature
What this PR does / why we need it:
Key Implementation Details
Scoring Algorithm: score = (formatWeight × priority) + (frameworkWeight × priority)
Framework Validation: Framework mismatch results in 0 score (early exit logic)
Edge Case Handling: Properly handles nil frameworks, missing priorities, and multiple format scenarios
Secondary Sorting: Uses distance-based comparison for model size range when scores are tied
Special notes for your reviewer:
Does this PR introduce a user-facing change?