Skip to content

[core] skip model agent task for pvc storage#171

Merged
slin1237 merged 1 commit into
mainfrom
pvc-storage-support-02
Jul 12, 2025
Merged

[core] skip model agent task for pvc storage#171
slin1237 merged 1 commit into
mainfrom
pvc-storage-support-02

Conversation

@slin1237
Copy link
Copy Markdown
Collaborator

PR Type

/kind feature

What this PR does / why we need it

This PR implements the model agent changes needed for PVC storage support as outlined in OEP-0004. Specifically, it updates the model agent
(gopher.go and scout.go) to skip processing of PVC storage types entirely.

Changes made:

  • Added detection logic for PVC storage type using storage.GetStorageType()
  • Model agent now returns immediately when encountering PVC storage URIs (format: pvc://{pvc-name}/{sub-path})
  • Added informational logging when PVC models are skipped
  • Ensured no ConfigMap updates or node labels are created for PVC storage
  • Added unit tests to verify PVC skip behavior

Why this is needed:
PVC storage requires a fundamentally different approach than downloadable storage types. Since DaemonSet pods cannot effectively mount PVCs
(especially RWO volumes), and PVC models don't require downloading or node-specific verification, the model agent must skip these entirely. This
allows the BaseModel controller to handle all PVC operations directly, avoiding complex DaemonSet-PVC interactions.

Which issue(s) this PR fixes

Related to OEP-0004: PVC Storage Support implementation
#162 #160

Special notes for your reviewer

  • This PR only implements the model agent changes (task 02 of OEP-0004)
  • The actual PVC handling logic will be implemented in the BaseModel controller in subsequent PRs
  • All existing storage types (OCI, HuggingFace, Vendor) continue to work unchanged
  • Tests have been added to ensure PVC detection and skip behavior work correctly

User-facing change

NONE - This is an internal implementation change that prepares for future PVC storage support. Users will not see any behavioral changes with this
PR alone.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @slin1237, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces foundational changes to the model agent components to enable future Persistent Volume Claim (PVC) storage support. It modifies the agent's behavior to explicitly ignore and skip any models that specify PVC as their storage type, delegating their management entirely to the BaseModel controller. This is a critical step in the broader implementation of OEP-0004, ensuring a clear separation of concerns for different storage mechanisms.

Highlights

  • Model Agent Task Skipping: The Gopher component of the model agent (pkg/modelagent/gopher.go) has been updated to explicitly detect PVC storage types and immediately skip any further processing for models using them. This ensures that the model agent does not attempt to handle PVC-backed models, which are intended to be managed by the BaseModel controller.
  • Scout Component Download Logic Update: The Scout component (pkg/modelagent/scout.go) now includes logic within its shouldDownloadModel function to identify PVC storage URIs. If a model uses PVC storage, Scout will return false, indicating that the model should not be downloaded by the agent, further reinforcing the delegation of PVC handling to the BaseModel controller.
  • Unit Test Coverage: New unit tests have been added in pkg/modelagent/gopher_test.go to verify the correct skipping behavior for PVC storage types in both the Gopher's task processing and the Scout's download decision logic. These tests also confirm that existing storage types (OCI, HuggingFace, Vendor) are still processed as expected.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@slin1237 slin1237 merged commit 631f7d7 into main Jul 12, 2025
16 checks passed
@slin1237 slin1237 deleted the pvc-storage-support-02 branch July 12, 2025 02:04
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR correctly implements the skipping of PVC storage types in the model agent's gopher and scout components, aligning with the design to handle PVCs in the BaseModel controller. The changes are logical and well-contained. My review focuses on improving error handling robustness in the scout, strengthening the new unit tests for better coverage and correctness, and fixing a compilation issue in the new test file.

"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The test code uses a helper function stringPtr which is not defined in this file or imported from another package. This will cause a compilation error.

Please add the helper function to this file, for example, after the import block:

func stringPtr(s string) *string {
	return &s
}

Comment thread pkg/modelagent/scout.go
Comment on lines +458 to +462
storageType, err := storage.GetStorageType(*storageSpec.StorageUri)
if err == nil && storageType == storage.StorageTypePVC {
w.logger.Debugf("Skipping PVC storage type in shouldDownloadModel (handled by BaseModel controller)")
return false
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The error returned by storage.GetStorageType is currently being ignored. If an invalid storage URI is provided (e.g., invalid://...), GetStorageType will return an error, but the function will proceed to evaluate node selectors and affinity. This could lead to unexpected behavior where the scout incorrectly decides a model should be downloaded, only for the gopher to fail later.

It would be more robust to handle the error here, log it, and prevent the download from being scheduled.

Suggested change
storageType, err := storage.GetStorageType(*storageSpec.StorageUri)
if err == nil && storageType == storage.StorageTypePVC {
w.logger.Debugf("Skipping PVC storage type in shouldDownloadModel (handled by BaseModel controller)")
return false
}
storageType, err := storage.GetStorageType(*storageSpec.StorageUri)
if err != nil {
w.logger.Warnf("Invalid storage URI '%s', cannot determine storage type: %v. Model will not be downloaded.", *storageSpec.StorageUri, err)
return false
}
if storageType == storage.StorageTypePVC {
w.logger.Debugf("Skipping PVC storage type in shouldDownloadModel (handled by BaseModel controller)")
return false
}

)

// TestHandleTaskPVCSkip tests that PVC storage types are properly skipped
func TestHandleTaskPVCSkip(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test, TestHandleTaskPVCSkip, doesn't seem to test the gopher's processTask function as its name implies. Instead, it primarily tests the behavior of storage.GetStorageType and validates a boolean flag expectSkip from the test case table. The core logic inside gopher.go's switch statement for StorageTypePVC is not actually executed or verified.

Additionally, the assertion logic is incomplete. The check if storageType == storage.StorageTypePVC { assert.True(t, tc.expectSkip, ...) } only validates the true case for expectSkip. It doesn't verify that expectSkip is correctly false for other storage types.

To improve this, consider renaming the test to something like TestGetStorageTypeForGopherTasks to more accurately reflect what's being tested, or refactoring this into a true unit test for gopher.processTask.

Comment on lines +157 to +159
if storageType == storage.StorageTypePVC {
assert.True(t, tc.expectSkip, "PVC storage type should be skipped")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assertion logic here is incomplete. It only validates the true case for expectSkip when the storage type is PVC. It doesn't verify that expectSkip is correctly false for other storage types.

A more robust assertion would check the expectation for all cases.

// Verify that PVC storage type would be skipped
shouldSkip := (storageType == storage.StorageTypePVC)
assert.Equal(t, tc.expectSkip, shouldSkip)

Comment on lines +190 to +195
testCases := []struct {
name string
storageSpec *v1beta1.StorageSpec
expectedResult bool
description string
}{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test suite for shouldDownloadModel is good, but it would be more complete if it included a test case for an invalid storage URI. This would verify that the function correctly handles malformed URIs and returns false, preventing download tasks from being scheduled. This is especially relevant given the potential for unhandled errors from storage.GetStorageType.

{ 
  name: "Invalid storage URI should be skipped",
  storageSpec: &v1beta1.StorageSpec{
  StorageUri: stringPtr("invalid://some/uri"),
  },
  expectedResult: false,
  description: "Invalid storage URI should return false (skip)",
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant