Skip to content

imporve supported runtime sort logic#183

Merged
slin1237 merged 1 commit into
mainfrom
yifeliu/runtime-2
Jul 15, 2025
Merged

imporve supported runtime sort logic#183
slin1237 merged 1 commit into
mainfrom
yifeliu/runtime-2

Conversation

@pallasathena92
Copy link
Copy Markdown
Collaborator

What type of PR is this?

/kind bug
/kind feature

What this PR does / why we need it:

Key Implementation Details
Scoring Algorithm: score = (formatWeight × priority) + (frameworkWeight × priority)
Framework Validation: Framework mismatch results in 0 score (early exit logic)
Edge Case Handling: Properly handles nil frameworks, missing priorities, and multiple format scenarios
Secondary Sorting: Uses distance-based comparison for model size range when scores are tied

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @pallasathena92, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the logic for selecting the most suitable serving runtime for a given model. It introduces a sophisticated scoring system that considers the importance (weight) of model formats and frameworks, along with their defined priorities. This new approach, combined with refined secondary sorting based on model size proximity, ensures more accurate and efficient runtime assignment.

Highlights

  • Introduction of 'weight' field: A new weight field (int64, default 1) has been added to ModelFormat and ModelFrameworkSpec within the Custom Resource Definitions (CRDs) and their corresponding Go structs. This enhancement allows for explicit prioritization of specific model formats and frameworks during runtime selection.
  • Enhanced Runtime Sorting Logic: The core runtime selection and sorting logic in pkg/controller/v1beta1/inferenceservice/utils/runtime.go has been significantly refactored. The primary sorting mechanism now leverages a new weighted scoring algorithm, replacing the previous priority-based sorting.
  • New Scoring Algorithm: A dedicated score function has been introduced to calculate a compatibility score for each supported runtime. This score is derived from (formatWeight × priority) + (frameworkWeight × priority), enabling a more nuanced and flexible prioritization of runtimes based on their model format and framework compatibility.
  • Improved Secondary Sorting: When runtimes have identical scores, the sorting now employs a distance-based comparison for the model's size range. This ensures that runtimes whose supported size ranges are numerically closer to the model's actual parameter size are preferred.
  • Refined Compatibility Checks: New helper functions, compareModelFormat and compareModelFramework, centralize and streamline the version comparison logic. These functions correctly handle various operators (Equal, GreaterThan, GreaterThanOrEqual) and the presence of unofficial version parts. Runtimes that do not achieve a positive score (indicating no suitable format/framework match) are now efficiently filtered out early in the selection process.
  • Dependency Updates: The go.sum file has undergone extensive updates, reflecting various Go module dependency changes and upgrades across the project.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sophisticated weighted scoring mechanism for selecting serving runtimes, which is a significant enhancement. The logic is mostly sound and the changes are well-contained. I've identified a few issues, including a critical bug in the sorting logic that could lead to non-deterministic runtime selection, and a logic error in the new scoring function. Additionally, there are opportunities to improve code maintainability by reducing duplication and enhancing logging and error messages. Once these points are addressed, this will be a solid contribution.

Comment on lines +223 to +224
if (modelFormatMatches || (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil)) &&
(modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There seems to be a logic issue in this condition. &baseModel.ModelFormat will never be nil because baseModel.ModelFormat is a struct field, not a pointer. This will cause the second part of the OR condition (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil) to always be false.

The intent is likely to check if the baseModel has a model format specified. You could check if baseModel.ModelFormat.Name is empty instead.

Suggested change
if (modelFormatMatches || (supportedFormat.ModelFormat == nil && &baseModel.ModelFormat == nil)) &&
(modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) {
if (modelFormatMatches || (supportedFormat.ModelFormat == nil && baseModel.ModelFormat.Name == "")) &&
(modelFrameworkMatches || (supportedFormat.ModelFramework == nil && baseModel.ModelFramework == nil)) {

if !r1HasSizeRange && r2HasSizeRange {
return false
}
return true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The final return true in this sort function can lead to unstable and unpredictable sorting behavior when scores and size ranges are equal. A stable sort should be ensured by adding a deterministic tie-breaker. For example, you could sort by the runtime name. Returning false would also be better as it would preserve the existing relative order of elements that are considered equal.

Suggested change
return true
return runtimes[i].Name < runtimes[j].Name

ModelName: "", // Will be filled by caller if available
ModelFormat: baseModel.ModelFormat.Name,
Reason: fmt.Sprintf("model format '%s' not in supported formats %v", getModelFormatLabel(baseModel), srSpec.SupportedModelFormats),
Reason: fmt.Sprintf("model format '%s' not in supported formats", getModelFormatLabel(baseModel)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This error message is less informative than it could be. The previous version of the code included the list of supported formats, which is very helpful for debugging. Please consider adding that information back to the error message. You could create a helper function to extract and join the format names for a cleaner output.

Comment on lines +403 to +475
// compareModelFormat compares two model formats based on their versions and operators.
func compareModelFormat(supportedModelFormat *v1beta1.ModelFormat, basemodelModeFormat *v1beta1.ModelFormat, modelFormatMatches bool) bool {
hasUnofficialFormatVersion := false
// Parse versions
baseModelFormatVersion, err := modelVer.Parse(*basemodelModeFormat.Version)
if err != nil {
fmt.Println("Error parsing basModel modelFormat version:", err)
return false
}

supportedFormatVersion, err := modelVer.Parse(*supportedModelFormat.Version)
if err != nil {
fmt.Println("Error parsing supportedFormat modelFormat version:", err)
return false
}

// Check if versions have unofficial parts (requirement #1)
hasUnofficialFormatVersion = modelVer.ContainsUnofficialVersion(baseModelFormatVersion) ||
modelVer.ContainsUnofficialVersion(supportedFormatVersion)

// Get operator from modelFormat in supportedFormat
operator := getRuntimeSelectorOperator(supportedModelFormat.Operator)

// Compare versions based on operator and whether unofficial versions exist (requirements #1, #2, #3)
if hasUnofficialFormatVersion || operator == "Equal" {
modelFormatMatches = modelVer.Equal(supportedFormatVersion, baseModelFormatVersion)
} else if operator == "GreaterThan" {
modelFormatMatches = modelVer.GreaterThan(supportedFormatVersion, baseModelFormatVersion)
} else if operator == "GreaterThanOrEqual" {
modelFormatMatches = modelVer.GreaterThanOrEqual(supportedFormatVersion, baseModelFormatVersion)
} else {
// Default to Equal for unknown operators
modelFormatMatches = modelVer.Equal(supportedFormatVersion, baseModelFormatVersion)
}
return modelFormatMatches
}

// compareModelFramework compares two modelFrameworks based on their versions and operators.
func compareModelFramework(supportedModelFramework *v1beta1.ModelFrameworkSpec, baseModelFramework *v1beta1.ModelFrameworkSpec, modelFrameworkMatches bool) bool {
hasUnofficialFrameworkVersion := false
// Parse framework versions
baseFrameworkVersion, err := modelVer.Parse(*baseModelFramework.Version)
if err != nil {
fmt.Println("Error parsing baseModel modelFramework version:", err)
return false
}

supportedFrameworkVersion, err := modelVer.Parse(*supportedModelFramework.Version)
if err != nil {
fmt.Println("Error parsing supportedFormat modelFramework version:", err)
return false
}

// Check if versions have unofficial parts (requirement #1)
hasUnofficialFrameworkVersion = modelVer.ContainsUnofficialVersion(baseFrameworkVersion) ||
modelVer.ContainsUnofficialVersion(supportedFrameworkVersion)

// Get operator from modelFramework in supportedFormat
operator := getRuntimeSelectorOperator(supportedModelFramework.Operator)

// If there are unofficial versions or operator is Equal, use Equal comparison (requirements #1, #2)
if hasUnofficialFrameworkVersion || operator == "Equal" {
modelFrameworkMatches = modelVer.Equal(supportedFrameworkVersion, baseFrameworkVersion)
} else if operator == "GreaterThan" {
modelFrameworkMatches = modelVer.GreaterThan(supportedFrameworkVersion, baseFrameworkVersion)
} else if operator == "GreaterThanOrEqual" {
modelFrameworkMatches = modelVer.GreaterThanOrEqual(supportedFrameworkVersion, baseFrameworkVersion)
} else {
// Default to Equal for unknown operators
modelFrameworkMatches = modelVer.Equal(supportedFrameworkVersion, baseFrameworkVersion)
}
return modelFrameworkMatches
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The functions compareModelFormat and compareModelFramework are nearly identical. This code duplication can be avoided by refactoring them into a single function to improve maintainability.

You could define an interface with GetVersion() and GetOperator() methods that both v1beta1.ModelFormat and v1beta1.ModelFrameworkSpec can implement.

Example interface:

type versionedSpec interface {
    GetVersion() *string
    GetOperator() *v1beta1.RuntimeSelectorOperator
}

Then you could have a single compareVersionedSpec function.

// Parse versions
baseModelFormatVersion, err := modelVer.Parse(*basemodelModeFormat.Version)
if err != nil {
fmt.Println("Error parsing basModel modelFormat version:", err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using fmt.Println for logging errors is not ideal in library code. It writes to standard output and lacks context. It would be better to use a structured logger (like the one used in the reconciler) and pass it down. Since this function is called from a sort callback, returning an error isn't feasible, but logging should be improved for better debuggability.

@slin1237 slin1237 merged commit 3722047 into main Jul 15, 2025
22 checks passed
@slin1237 slin1237 deleted the yifeliu/runtime-2 branch July 15, 2025 03:30
slin1237 pushed a commit that referenced this pull request Dec 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants