[Core] add select by policy for acceleratorclass #518

Merged: pallasathena92 merged 1 commit into main from yifeliu/ac-policy-1 on Mar 26, 2026
@pallasathena92 (Collaborator) commented Feb 2, 2026

What this PR does

  • Implement policy-based accelerator class selection with four policies: BestFit, Cheapest, MostCapable, and FirstAvailable
  • Add constraint-based filtering for accelerator candidates (memory, features, architecture, availability)
  • Add weighted scoring algorithms for intelligent accelerator selection

Why we need it

There are two ways to specify AcceleratorClass resources: by name or by selection policy. This PR implements selection by policy.

Implementation Details

Policy Implementations

BestFit Policy (70% memory + 30% compute)

  • Penalizes over-provisioning (2x memory = 0.5 score)
  • Iterative precision fallback with degradation penalty (first precision = 1.0, second = 0.5, third = 0.25)
  • MinComputePerformanceTFLOPS treated as a soft constraint (proportional scoring)
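
The BestFit weighting above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the `required / available` memory ratio, and the way the precision penalty is multiplied into the compute score are all assumptions chosen to reproduce the numbers quoted in the bullets (2x memory gives 0.5; fallback penalties of 1.0, 0.5, 0.25).

```go
package main

import (
	"fmt"
	"math"
)

// memoryFitScore is a hypothetical helper: it returns required/available,
// so an accelerator with exactly the required memory scores 1.0 and one
// with 2x the required memory scores 0.5. Insufficient memory scores 0.
func memoryFitScore(requiredGB, availableGB float64) float64 {
	if availableGB <= 0 || availableGB < requiredGB {
		return 0
	}
	return requiredGB / availableGB
}

// precisionPenalty halves the score for each step down the precision
// preference list: index 0 -> 1.0, index 1 -> 0.5, index 2 -> 0.25.
func precisionPenalty(fallbackIndex int) float64 {
	if fallbackIndex < 0 {
		return 0
	}
	return math.Pow(0.5, float64(fallbackIndex))
}

// computeScore treats the minimum TFLOPS as a soft constraint: falling
// short lowers the score proportionally instead of rejecting the candidate.
// Combining it with the precision penalty by multiplication is an assumption.
func computeScore(minTFLOPS, availableTFLOPS float64, fallbackIndex int) float64 {
	ratio := 1.0
	if minTFLOPS > 0 {
		ratio = math.Min(1.0, availableTFLOPS/minTFLOPS)
	}
	return ratio * precisionPenalty(fallbackIndex)
}

// bestFitScore applies the 70% memory + 30% compute weighting.
func bestFitScore(requiredGB, availableGB, minTFLOPS, availableTFLOPS float64, fallbackIndex int) float64 {
	return 0.7*memoryFitScore(requiredGB, availableGB) +
		0.3*computeScore(minTFLOPS, availableTFLOPS, fallbackIndex)
}

func main() {
	// 2x over-provisioned memory, compute requirement met at first precision:
	// 0.7*0.5 + 0.3*1.0
	fmt.Printf("%.3f\n", bestFitScore(40, 80, 100, 200, 0))
	// Exact memory fit, half the requested TFLOPS at the second precision:
	// 0.7*1.0 + 0.3*(0.5*0.5)
	fmt.Printf("%.3f\n", bestFitScore(40, 40, 100, 50, 1))
}
```

Under these assumptions a tight memory fit can outweigh a compute shortfall, which is consistent with the 70/30 split.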

Cheapest Policy

  • Selects the lowest-cost accelerator meeting constraints
  • Priority: spot pricing > on-demand > per-million-tokens
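
One possible reading of the pricing priority is that each candidate is costed by its spot price when one is set, otherwise by its on-demand price, otherwise by its per-million-tokens price. The sketch below follows that reading; the `pricing` and `candidate` types and their field names are hypothetical and not taken from this PR. It also compares hourly and per-token figures directly, which is a simplification.

```go
package main

import "fmt"

// pricing is a hypothetical cost record; a nil field means "not offered".
type pricing struct {
	SpotPerHour      *float64
	OnDemandPerHour  *float64
	PerMillionTokens *float64
}

type candidate struct {
	Name string
	Cost pricing
}

// effectiveCost picks one price per candidate using the priority
// spot > on-demand > per-million-tokens. ok is false if no price is set.
func effectiveCost(p pricing) (cost float64, ok bool) {
	switch {
	case p.SpotPerHour != nil:
		return *p.SpotPerHour, true
	case p.OnDemandPerHour != nil:
		return *p.OnDemandPerHour, true
	case p.PerMillionTokens != nil:
		return *p.PerMillionTokens, true
	}
	return 0, false
}

// cheapest returns the name of the candidate with the lowest effective
// cost. Candidates without any pricing are skipped.
func cheapest(cands []candidate) (name string, found bool) {
	bestCost := 0.0
	for _, c := range cands {
		cost, ok := effectiveCost(c.Cost)
		if !ok {
			continue
		}
		if !found || cost < bestCost {
			name, bestCost, found = c.Name, cost, true
		}
	}
	return name, found
}

// price is a small helper for building optional price fields.
func price(v float64) *float64 { return &v }

func main() {
	cands := []candidate{
		{Name: "gpu-a", Cost: pricing{SpotPerHour: price(1.0), OnDemandPerHour: price(3.0)}},
		{Name: "gpu-b", Cost: pricing{OnDemandPerHour: price(2.0)}},
		{Name: "gpu-c"}, // no pricing data, skipped
	}
	name, ok := cheapest(cands)
	fmt.Println(name, ok)
}
```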

MostCapable Policy (50% memory + 30% bandwidth + 20% TFLOPS)

  • Normalized scoring across different metric ranges
  • Prioritizes memory and bandwidth over raw compute for inference workloads
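
The PR description does not say how the metrics are normalized. A minimal sketch, assuming each metric is divided by the maximum value seen among the candidates so that all three land in [0, 1] before weighting, might look like this. The `accel` type, its fields, and the sample numbers are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// accel is a hypothetical view of an accelerator's capability metrics.
type accel struct {
	Name          string
	MemoryGB      float64
	BandwidthGBps float64
	TFLOPS        float64
}

// normalize scales v into [0, 1] relative to the largest candidate value.
func normalize(v, max float64) float64 {
	if max <= 0 {
		return 0
	}
	return v / max
}

// mostCapable scores each candidate as
// 0.5*memory + 0.3*bandwidth + 0.2*TFLOPS (all normalized)
// and returns the highest-scoring one.
func mostCapable(cands []accel) (best accel, found bool) {
	var maxMem, maxBW, maxTF float64
	for _, c := range cands {
		maxMem = math.Max(maxMem, c.MemoryGB)
		maxBW = math.Max(maxBW, c.BandwidthGBps)
		maxTF = math.Max(maxTF, c.TFLOPS)
	}
	bestScore := -1.0
	for _, c := range cands {
		score := 0.5*normalize(c.MemoryGB, maxMem) +
			0.3*normalize(c.BandwidthGBps, maxBW) +
			0.2*normalize(c.TFLOPS, maxTF)
		if score > bestScore {
			best, bestScore, found = c, score, true
		}
	}
	return best, found
}

func main() {
	cands := []accel{
		{Name: "mem-heavy", MemoryGB: 80, BandwidthGBps: 1000, TFLOPS: 100},
		{Name: "compute-heavy", MemoryGB: 40, BandwidthGBps: 1000, TFLOPS: 1000},
	}
	best, _ := mostCapable(cands)
	// mem-heavy scores 0.5 + 0.3 + 0.02; compute-heavy scores 0.25 + 0.3 + 0.2.
	fmt.Println(best.Name)
}
```

With these weights the candidate with twice the memory wins even though the other has ten times the TFLOPS, which matches the stated preference for memory and bandwidth in inference workloads.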

FirstAvailable Policy

  • Returns the first accelerator from the runtime's candidate list

Constraint Filtering

  • MinMemory / MaxMemory bounds
  • Required features (e.g., NVLink)
  • Architecture family matching
  • Excluded accelerator classes
  • Optional availability check (configurable)
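
The constraint checks listed above could be combined into a single predicate that runs before any policy scores the candidates. The sketch below is an illustration under assumptions: the `accelerator` and `constraints` types, their field names, and the convention that a zero or empty field means "no constraint" are all hypothetical rather than the types defined in this PR.

```go
package main

import "fmt"

// accelerator is a hypothetical candidate description.
type accelerator struct {
	Name      string
	MemoryGB  int
	Features  []string
	Family    string
	Available bool
}

// constraints is a hypothetical constraint set; zero values mean "unset".
type constraints struct {
	MinMemoryGB        int
	MaxMemoryGB        int
	RequiredFeatures   []string
	ArchitectureFamily string
	Excluded           []string
	CheckAvailability  bool
}

// contains reports whether s is present in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// matches reports whether a satisfies every constraint that is set.
func matches(a accelerator, c constraints) bool {
	if c.MinMemoryGB > 0 && a.MemoryGB < c.MinMemoryGB {
		return false
	}
	if c.MaxMemoryGB > 0 && a.MemoryGB > c.MaxMemoryGB {
		return false
	}
	for _, f := range c.RequiredFeatures {
		if !contains(a.Features, f) {
			return false
		}
	}
	if c.ArchitectureFamily != "" && a.Family != c.ArchitectureFamily {
		return false
	}
	if contains(c.Excluded, a.Name) {
		return false
	}
	if c.CheckAvailability && !a.Available {
		return false
	}
	return true
}

// filter keeps the candidates that satisfy c, preserving input order.
func filter(cands []accelerator, c constraints) []accelerator {
	var out []accelerator
	for _, a := range cands {
		if matches(a, c) {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	cands := []accelerator{
		{Name: "gpu-a", MemoryGB: 40, Features: []string{"NVLink"}, Family: "ampere", Available: true},
		{Name: "gpu-b", MemoryGB: 80, Features: []string{"NVLink"}, Family: "hopper", Available: true},
		{Name: "gpu-c", MemoryGB: 80, Family: "hopper", Available: true},
		{Name: "gpu-d", MemoryGB: 80, Features: []string{"NVLink"}, Family: "hopper", Available: false},
	}
	c := constraints{
		MinMemoryGB:        64,
		RequiredFeatures:   []string{"NVLink"},
		ArchitectureFamily: "hopper",
		CheckAvailability:  true,
	}
	for _, a := range filter(cands, c) {
		fmt.Println(a.Name)
	}
}
```

Because `filter` preserves order, a policy such as FirstAvailable could take the first element of its result and still respect the constraints.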

For #517

How to test

Checklist

  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • make test passes locally

@github-actions (bot) added the labels web-console (Web console changes), backend (Backend API changes), accelerator (Accelerator class changes), and tests (Test changes) on Feb 2, 2026
@gemini-code-assist (Contributor) commented

Summary of Changes

Hello @pallasathena92, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the AcceleratorClass selection mechanism by introducing a flexible, policy-driven approach. Instead of relying solely on explicit naming, users can now define selection policies (e.g., BestFit, Cheapest) along with detailed constraints (memory, features, architecture). This allows for more intelligent and automated resource allocation, optimizing for cost, performance, or specific workload requirements, ultimately improving the efficiency and adaptability of accelerator resource management.

Highlights

  • Policy-Based Accelerator Selection: Introduced a new mechanism for selecting AcceleratorClass resources based on defined policies rather than explicit naming.
  • Four New Selection Policies: Implemented four distinct policies: BestFit (optimizes for resource utilization), Cheapest (minimizes cost), MostCapable (prioritizes performance), and FirstAvailable (selects the first suitable option).
  • Constraint-Based Filtering: Added comprehensive filtering capabilities for accelerator candidates based on criteria such as minimum/maximum memory, required features (e.g., NVLink), architecture family, minimum architecture version, and optional availability checks.
  • Weighted Scoring Algorithms: Developed sophisticated weighted scoring algorithms for the BestFit and MostCapable policies to intelligently rank accelerators based on multiple performance and resource metrics.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces a sophisticated policy-based accelerator selection mechanism, which is a great enhancement. The implementation of different policies like BestFit, Cheapest, and MostCapable with detailed scoring logic is well-thought-out. The code is well-structured with helper functions and comprehensive tests. However, I've found a critical issue in the FirstAvailablePolicy implementation that bypasses constraint filtering. I've also identified a few areas for improvement regarding hardcoded values, code duplication, and performance, which I've detailed in the comments. Addressing these points will make the implementation more robust and maintainable.

Comment thread pkg/acceleratorclassselector/selector.go
Comment thread pkg/acceleratorclassselector/policy_helpers.go Outdated
Comment thread pkg/acceleratorclassselector/policy_helpers.go Outdated
Comment thread pkg/acceleratorclassselector/selector.go Outdated
Comment thread pkg/acceleratorclassselector/policy_helpers.go Outdated
Comment thread pkg/acceleratorclassselector/selector.go Outdated
Comment thread pkg/acceleratorclassselector/types.go Outdated
cloud.google.com/go/compute/metadata v0.3.0/go.mod h1:zFmK7XCadkQkj6TtorcaGlCW1hT1fIilQDwofLpJ20k=
github.com/NYTimes/gziphandler v0.0.0-20170623195520-56545f4a5d46/go.mod h1:3wb06e3pkSAbeQ52E9H9iFoQsEEwGN64994WTCIhntQ=
github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs=
github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY=
Collaborator

Are these changes required?

@github-actions (bot) added the labels inferenceservice (InferenceService controller changes), controller (Controller changes), and dependencies (Dependency updates) on Mar 13, 2026
@pallasathena92 pallasathena92 merged commit 9b65e19 into main Mar 26, 2026
33 checks passed
@pallasathena92 pallasathena92 deleted the yifeliu/ac-policy-1 branch March 26, 2026 20:00
Labels

accelerator (Accelerator class changes), backend (Backend API changes), controller (Controller changes), dependencies (Dependency updates), inferenceservice (InferenceService controller changes), tests (Test changes), web-console (Web console changes)

2 participants