
[API] introduce logic of handling model CR update with DownloadPolicy #511

Merged. pallasathena92 merged 1 commit into main from private-artifact-update on Jan 22, 2026.


@truddy0 truddy0 commented Jan 21, 2026

What this PR does

This is the third PR for the model artifact reuse feature (#409).
It introduces the logic for handling a model CR update when the downloadPolicy value changes.

When a model CR is updated because its downloadPolicy value changed, there are 2 cases:

  1. downloadPolicy changes from AlwaysDownload to ReuseIfExists: the model artifact is deleted and a symbolic link is created from the child path to the parent path.
  2. downloadPolicy changes from ReuseIfExists to AlwaysDownload: the symbolic link is deleted and the model artifact is downloaded to the child path.
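
The two transitions above can be sketched at the filesystem level as follows. This is a minimal illustration, not the code in this PR: the function names `switchToReuse`, `switchToAlwaysDownload`, and `demoRoundTrip`, the `download` callback, and the `weights.bin` file name are all hypothetical, and the ConfigMap bookkeeping is left out.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// switchToReuse sketches AlwaysDownload -> ReuseIfExists (hypothetical helper):
// delete the child's own artifact copy, then link childPath to parentPath.
func switchToReuse(childPath, parentPath string) error {
	if _, err := os.Stat(parentPath); err != nil {
		return fmt.Errorf("parent artifact not usable: %w", err)
	}
	if err := os.RemoveAll(childPath); err != nil {
		return fmt.Errorf("remove child artifact: %w", err)
	}
	return os.Symlink(parentPath, childPath)
}

// switchToAlwaysDownload sketches ReuseIfExists -> AlwaysDownload (hypothetical
// helper): remove only the symbolic link, then download into childPath.
func switchToAlwaysDownload(childPath string, download func(dst string) error) error {
	info, err := os.Lstat(childPath)
	if err != nil {
		return err
	}
	if info.Mode()&os.ModeSymlink == 0 {
		return errors.New("child path is not a symbolic link")
	}
	// os.Remove unlinks the symlink itself; the parent data is untouched.
	if err := os.Remove(childPath); err != nil {
		return err
	}
	return download(childPath)
}

// demoRoundTrip exercises both transitions in a temporary directory.
func demoRoundTrip() error {
	root, err := os.MkdirTemp("", "artifact-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(root)

	parent := filepath.Join(root, "parent")
	child := filepath.Join(root, "child")
	for _, dir := range []string{parent, child} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return err
		}
		if err := os.WriteFile(filepath.Join(dir, "weights.bin"), []byte("w"), 0o644); err != nil {
			return err
		}
	}

	if err := switchToReuse(child, parent); err != nil {
		return err
	}
	if info, err := os.Lstat(child); err != nil || info.Mode()&os.ModeSymlink == 0 {
		return errors.New("expected child to be a symlink")
	}

	// Stand-in for a real download: writes a fresh copy into dst.
	fakeDownload := func(dst string) error {
		if err := os.MkdirAll(dst, 0o755); err != nil {
			return err
		}
		return os.WriteFile(filepath.Join(dst, "weights.bin"), []byte("w"), 0o644)
	}
	if err := switchToAlwaysDownload(child, fakeDownload); err != nil {
		return err
	}
	if info, err := os.Lstat(child); err != nil || !info.IsDir() {
		return errors.New("expected child to be a real directory")
	}
	// The parent artifact must survive both transitions.
	_, err = os.Stat(filepath.Join(parent, "weights.bin"))
	return err
}
```

Checking with `os.Lstat` before removal is a deliberate choice in this sketch: it keeps the revert path from ever deleting a real artifact directory by mistake.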

Why we need it

If a customer needs to migrate to the new download policy, or later revert to the default download policy, the request will be honored.

How to test

| Test ID | Context and Test Action | Expected Result | Test Evidence |
|---|---|---|---|
| T-01 | Model is Child. Download policy changes from ReuseIfExists → AlwaysDownload. | Parent model (-hello) ConfigMap is removed with the child. Child model (-switch) is downloaded with artifact. Child model ConfigMap parent field is updated to reference itself. | Test 1 (see test evidence file) |
| T-02 | Model is Child. Download policy changes from AlwaysDownload → ReuseIfExists. | Parent model (-parent) ConfigMap is updated to include the child. Child model (-switch) updates its parent reference. Existing artifact is deleted and replaced with a symbolic link. | Test 2 (see test evidence file) |
| T-03 | Model is Parent. Download policy changes from ReuseIfExists → AlwaysDownload. | No operation. Both parent and child models remain unchanged. | Test 3 (see test evidence file) |
| T-04 | Model is Parent. Download policy changes from AlwaysDownload → ReuseIfExists. | If no other model exists with the same artifact SHA, no operation is performed. Otherwise, the parent model transitions to a child model. | Test 4 (see test evidence file) |
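
The four test cases can be read as a small decision table. The sketch below encodes that reading in Go. The type names, constant names, and the `decide` function are hypothetical and are not taken from the PR; the action names only summarize the "Expected Result" column.

```go
package main

// Policy, Role, and Action are illustrative types, not the project's API.
type Policy string
type Role string
type Action string

const (
	AlwaysDownload Policy = "AlwaysDownload"
	ReuseIfExists  Policy = "ReuseIfExists"

	Parent Role = "Parent"
	Child  Role = "Child"

	NoOp            Action = "NoOp"            // T-03, and T-04 without a matching SHA
	DownloadOwnCopy Action = "DownloadOwnCopy" // T-01: drop the link, download the artifact
	LinkToParent    Action = "LinkToParent"    // T-02: delete the artifact, create the link
	BecomeChild     Action = "BecomeChild"     // T-04 with a matching SHA
)

// decide maps a downloadPolicy change to the action described in the test
// matrix. otherModelHasSameSHA is only consulted for the T-04 case.
func decide(role Role, from, to Policy, otherModelHasSameSHA bool) Action {
	if from == to {
		return NoOp // no policy change, nothing to do
	}
	switch {
	case role == Child && to == AlwaysDownload:
		return DownloadOwnCopy
	case role == Child && to == ReuseIfExists:
		return LinkToParent
	case role == Parent && to == ReuseIfExists && otherModelHasSameSHA:
		return BecomeChild
	default:
		return NoOp
	}
}
```

For example, `decide(Parent, AlwaysDownload, ReuseIfExists, true)` returns `BecomeChild`, which corresponds to the second branch of T-04.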

test_evidence_update.txt

Checklist

  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • make test passes locally
➜  ome git:(private-artifact-update) ✗ make test
-[ 2026-01-20 17:00:55 Start ]-
🔧 Fixing tools go.mod file...
✅ go.mod fixed
📦 Installing goimports...
cd /Users/huiding/src/github.com/ome/hack/internal/tools && GOBIN=/Users/huiding/src/github.com/ome/bin GO111MODULE=on go install golang.org/x/tools/cmd/goimports@latest
✅ Installation complete
🧹 Formatting Go code...
🧹 Organizing imports in Go files...
✅ Formatting complete
🔍 Checking code with go vet...
✅ Vet checks passed
🎮 Installing controller-gen...
cd /Users/huiding/src/github.com/ome/hack/internal/tools && GOBIN=/Users/huiding/src/github.com/ome/bin GO111MODULE=on go install sigs.k8s.io/controller-tools/cmd/controller-gen@v0.18.0
✅ Installation complete
🔧 Installing yq...
cd /Users/huiding/src/github.com/ome/hack/internal/tools && GOBIN=/Users/huiding/src/github.com/ome/bin GO111MODULE=on go install github.com/mikefarah/yq/v4@v4.44.3
✅ Installation complete

📦 Kubernetes Manifest Generation Starting...

🔧 Step 1: Generating CRD manifests...
✅ CRD manifests generated

🔑 Step 2: Generating RBAC manifests...
✅ RBAC manifests generated

📝 Step 3: Generating object boilerplate...
✅ Object boilerplate generated

🔄 Step 4: Applying CRD fixes and modifications...
  • Fixing stored versions...
  • Fixing conditions...
  • Updating type definitions...
  • Updating framework properties...
  • Optimizing CRD size...
  • Updating probe configurations...
  • Setting protocol defaults...
✅ CRD modifications complete

📋 Step 5: Generating minimal CRDs...
Creating minimal CRD file: config/crd/minimal/ome.io_basemodels.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_finetunedweights.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_inferenceservices.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_clusterbasemodels.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_servingruntimes.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_benchmarkjobs.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_acceleratorclasses.yaml
Creating minimal CRD file: config/crd/minimal/ome.io_clusterservingruntimes.yaml
✅ Minimal CRDs generated

📁 Step 6: Copying manifests to Helm charts...
✅ Manifests copied to Helm charts

🎉 Manifest generation completed successfully!

🧪 Installing envtest...
cd /Users/huiding/src/github.com/ome/hack/internal/tools && GOBIN=/Users/huiding/src/github.com/ome/bin GO111MODULE=on go install sigs.k8s.io/controller-runtime/tools/setup-envtest@v0.0.0-20241105200929-48ec3b71211f
✅ Installation complete
🔧 Building XET library...
Building Rust library...
cargo build --release 
    Finished `release` profile [optimized] target(s) in 0.43s
Copying static library...
Building Go bindings...
go build -ldflags="-extldflags '-L./'" .
✅ XET library built

🧪 Running comprehensive test suite (Toolchain: go1.25.0+auto )...
🧪 Running cmd tests ...
ok      github.com/sgl-project/ome/cmd/crd-gen  0.536s  coverage: 82.6% of statements
ok      github.com/sgl-project/ome/cmd/manager  2.324s  coverage: 16.2% of statements
ok      github.com/sgl-project/ome/cmd/model-agent      1.905s  coverage: 36.4% of statements
ok      github.com/sgl-project/ome/cmd/multinode-prober 2.650s  coverage: 57.8% of statements
ok      github.com/sgl-project/ome/cmd/ome-agent        1.444s  coverage: 45.6% of statements
ok      github.com/sgl-project/ome/cmd/qpext    2.297s  coverage: 69.4% of statements
        github.com/sgl-project/ome/cmd/spec-gen         coverage: 0.0% of statements
✅ cmd tests passed 
🧪 Running pkg tests ...
ok      github.com/sgl-project/ome/pkg/acceleratorclassselector 0.840s  coverage: 22.0% of statements
ok      github.com/sgl-project/ome/pkg/afero    1.186s  coverage: 21.4% of statements
        github.com/sgl-project/ome/pkg/apis             coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/apis/ome/v1beta1 1.462s  coverage: 0.7% of statements
ok      github.com/sgl-project/ome/pkg/auth     1.904s  coverage: 81.2% of statements
ok      github.com/sgl-project/ome/pkg/auth/aws 2.732s  coverage: 90.0% of statements
ok      github.com/sgl-project/ome/pkg/auth/azure       2.176s  coverage: 73.7% of statements
ok      github.com/sgl-project/ome/pkg/auth/gcp 2.577s  coverage: 87.0% of statements
ok      github.com/sgl-project/ome/pkg/auth/oci 94.668s coverage: 91.0% of statements
ok      github.com/sgl-project/ome/pkg/configutils      3.297s  coverage: 56.5% of statements
ok      github.com/sgl-project/ome/pkg/constants        3.732s  coverage: 1.4% of statements [no tests to run]
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/acceleratorclass      4.111s  coverage: 51.4% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/basemodel     2.547s  coverage: 79.5% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/benchmark     1.425s  coverage: 67.9% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/benchmark/reconcilers/job     0.884s  coverage: 78.3% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/benchmark/utils       2.706s  coverage: 66.7% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/controllerconfig      2.172s  coverage: 64.6% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice      6.458s  coverage: 50.4% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/components   1.048s  coverage: 58.3% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/autoscaler       2.423s  coverage: 31.1% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/common   1.415s  coverage: 43.8% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/deployment       2.838s  coverage: 98.7% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/external_service 3.367s  coverage: 84.6% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/hpa      3.780s  coverage: 41.9% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress  4.106s  coverage: 93.5% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/builders 3.703s  coverage: 77.8% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/factory  4.192s  coverage: 100.0% of statements
?       github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/interfaces       [no test files]
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/services 3.516s  coverage: 91.5% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/ingress/strategies       3.790s  coverage: 72.7% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/istiosidecar     3.284s  coverage: 95.8% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/keda     3.707s  coverage: 40.8% of statements
        github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/knative          coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/lws      3.298s  coverage: 98.6% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/modelconfig      3.238s  coverage: 72.9% of statements
        github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/multinode                coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/multinodevllm    3.153s  coverage: 54.6% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/pdb      3.509s  coverage: 92.0% of statements
        github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/raw              coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/rbac     3.039s  coverage: 82.5% of statements
        github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/reconcilers/service          coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/status       3.007s  coverage: 90.3% of statements
ok      github.com/sgl-project/ome/pkg/controller/v1beta1/inferenceservice/utils        3.302s  coverage: 54.9% of statements
ok      github.com/sgl-project/ome/pkg/hfutil/hub       3.830s  coverage: 68.9% of statements
        github.com/sgl-project/ome/pkg/hfutil/hub/samples/basic_download                coverage: 0.0% of statements
        github.com/sgl-project/ome/pkg/hfutil/hub/samples/enhanced_client               coverage: 0.0% of statements
        github.com/sgl-project/ome/pkg/hfutil/hub/samples/llama_download                coverage: 0.0% of statements
        github.com/sgl-project/ome/pkg/hfutil/hub/samples/progress_logging              coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/hfutil/modelconfig       3.091s  coverage: 66.3% of statements
        github.com/sgl-project/ome/pkg/hfutil/modelconfig/examples              coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/imds     3.433s  coverage: 81.1% of statements
ok      github.com/sgl-project/ome/pkg/logging  3.473s  coverage: 20.1% of statements
ok      github.com/sgl-project/ome/pkg/logging/ginlog   2.689s  coverage: 31.1% of statements
ok      github.com/sgl-project/ome/pkg/modelagent       4.893s  coverage: 47.1% of statements
ok      github.com/sgl-project/ome/pkg/modelver 1.755s  coverage: 80.5% of statements
ok      github.com/sgl-project/ome/pkg/ociobjectstore   3.423s  coverage: 32.3% of statements
        github.com/sgl-project/ome/pkg/principals               coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/runtimeselector  3.303s  coverage: 79.0% of statements
ok      github.com/sgl-project/ome/pkg/storage  2.641s  coverage: 64.0% of statements
ok      github.com/sgl-project/ome/pkg/storage/providers/gcs    2.842s  coverage: 6.2% of statements
ok      github.com/sgl-project/ome/pkg/storage/providers/oci    2.909s  coverage: 12.2% of statements
        github.com/sgl-project/ome/pkg/storage/providers/s3             coverage: 0.0% of statements
ok      github.com/sgl-project/ome/pkg/utils    1.847s  coverage: 60.8% of statements
ok      github.com/sgl-project/ome/pkg/utils/storage    2.518s  coverage: 89.2% of statements
ok      github.com/sgl-project/ome/pkg/vault    2.360s  coverage: 90.6% of statements
ok      github.com/sgl-project/ome/pkg/vault/kmscrypto  3.008s  coverage: 30.9% of statements
ok      github.com/sgl-project/ome/pkg/vault/kmsmgm     2.412s  coverage: 23.3% of statements
ok      github.com/sgl-project/ome/pkg/vault/kmsvault   2.505s  coverage: 25.4% of statements
ok      github.com/sgl-project/ome/pkg/vault/secret     2.594s  coverage: 36.2% of statements
ok      github.com/sgl-project/ome/pkg/vault/secret_in_vault    2.953s  coverage: 78.6% of statements
ok      github.com/sgl-project/ome/pkg/vault/secret_retrieval   3.188s  coverage: 72.8% of statements
ok      github.com/sgl-project/ome/pkg/vault/vault      3.400s  coverage: 36.2% of statements
?       github.com/sgl-project/ome/pkg/version  [no test files]
ok      github.com/sgl-project/ome/pkg/webhook/admission/benchmark      3.263s  coverage: 79.3% of statements
ok      github.com/sgl-project/ome/pkg/webhook/admission/isvc   3.111s  coverage: 93.5% of statements
ok      github.com/sgl-project/ome/pkg/webhook/admission/pod    7.155s  coverage: 56.9% of statements
ok      github.com/sgl-project/ome/pkg/webhook/admission/servingruntime 3.789s  coverage: 64.2% of statements
ok      github.com/sgl-project/ome/pkg/zipper   2.693s  coverage: 83.8% of statements
✅ pkg tests passed 
🧪 Running internal tests ...
ok      github.com/sgl-project/ome/internal/ome-agent/enigma    2.315s  coverage: 37.2% of statements
        github.com/sgl-project/ome/internal/ome-agent/fine-tuned-adapter                coverage: 0.0% of statements
ok      github.com/sgl-project/ome/internal/ome-agent/model-metadata    2.742s  coverage: 39.8% of statements
ok      github.com/sgl-project/ome/internal/ome-agent/replica   3.190s  coverage: 69.5% of statements
ok      github.com/sgl-project/ome/internal/ome-agent/replica/common    3.700s  coverage: 100.0% of statements
ok      github.com/sgl-project/ome/internal/ome-agent/replica/replicator        4.601s  coverage: 46.2% of statements
ok      github.com/sgl-project/ome/internal/ome-agent/serving-agent     5.177s  coverage: 59.1% of statements
✅ internal tests passed 

🎉 All tests completed successfully!

github-actions bot added the model-agent and tests labels on Jan 21, 2026.
pallasathena92 merged commit d9611c8 into main on Jan 22, 2026 (29 checks passed).
pallasathena92 deleted the private-artifact-update branch on January 22, 2026, 01:13.