feat(inference): add spec.modelCache.claimName for user-owned cache PVCs #960
Add an optional per-InferenceService modelCache.claimName field that mounts a pre-existing, user-owned PVC as the writable model cache volume instead of the operator's shared/perService cache PVC. The existing model-cache-prep + model-downloader init containers run against the claim unchanged, weights land under the usual <cacheKey>/ subdirectory, and the serving container mounts it read-only.

The operator never creates, adopts, or garbage-collects the user's claim: ensureModelCachePVC only verifies it exists, and a missing claim surfaces a Degraded condition plus a ModelCachePVCNotFound warning event instead of silently falling back to the shared cache. For pre-staged pvc:// model sources (read-only, no download) the claimName is ignored and a ModelCacheClaimIgnored warning is emitted.

When claimName is unset, behavior is unchanged: the operator-global shared/perService cache mode applies as before.

AI assistance: authored by Claude (Opus 4.8) operating under the maintainer's direction; reviewed before submitting.

Fixes defilantech#928

Signed-off-by: Jory Irving <jory.irving@stackadapt.com>
Review
One fix, then good to go. The riskiest concern is handled correctly: a user-owned Silent drop when caching is disabled or the model has no cache key (should fix): Nice-to-haves: the missing-claim degraded condition uses the generic
Defilan
left a comment
Review
Approve. The riskiest part is handled correctly and tested: a user-owned claimName PVC gets only a Get and return nil in ensureModelCachePVC (never Create, no owner-ref, no operator labels), so it survives InferenceService deletion instead of being garbage-collected. Precedence (claimName wins before the shared/perService branch), RBAC (existing persistentvolumeclaims get suffices), and CRD/deepcopy/chart sync all check out.
Merging now. One non-blocking follow-up, filed as #965: the symmetric silent-drop when caching is off (claimName set but useCache false routes to buildEmptyDirStorageConfig, which ignores claimName) wants the same ModelCacheClaimIgnored warning you already added for the pvc:// case. Purely observability, no data-loss risk.
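One way the follow-up could look, as a rough sketch only: a single decision function that reports a warning reason whenever claimName is set but unused, covering both the pvc:// case and the caching-off case. The function chooseStorage, the storageKind values, and the parameter names are hypothetical; only the ModelCacheClaimIgnored reason comes from this PR.

```go
package main

import "fmt"

// storageKind names the volume strategy picked for a workload (hypothetical).
type storageKind string

const (
	storagePreStaged storageKind = "preStagedPVC"
	storageEmptyDir  storageKind = "emptyDir"
	storageUserClaim storageKind = "userClaim"
	storageOperator  storageKind = "operatorCache"
)

// ignoredIf returns the warning reason when a claimName was set but unused.
func ignoredIf(claimName string) string {
	if claimName != "" {
		return "ModelCacheClaimIgnored"
	}
	return ""
}

// chooseStorage returns the storage kind plus a warning reason that is
// non-empty only when claimName is set but cannot be honored.
func chooseStorage(preStaged, useCache bool, claimName string) (storageKind, string) {
	switch {
	case preStaged:
		// pvc:// source: read-only, no download path, so the claim is unused.
		return storagePreStaged, ignoredIf(claimName)
	case !useCache:
		// Caching off: emptyDir storage ignores the claim as well.
		return storageEmptyDir, ignoredIf(claimName)
	case claimName != "":
		return storageUserClaim, ""
	default:
		return storageOperator, ""
	}
}

func main() {
	kind, warning := chooseStorage(false, false, "team-a-cache")
	fmt.Println(kind, warning) // prints "emptyDir ModelCacheClaimIgnored"
}
```

Returning the reason alongside the storage kind keeps event emission in the caller, so both ignored-claim paths share one code path for the warning.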
Thanks for this, the safe-by-default handling of user PVCs is exactly right.
What
Adds optional InferenceService.spec.modelCache.claimName: when set, the named pre-existing, user-owned PVC is mounted as the writable model cache for that workload instead of the operator's shared/perService cache PVC.
Why
The cache backend is currently operator-global, so there is no way to give one workload (e.g. a large model pinned to a node) node-local cache storage while everything else rides the shared cache.
Fixes #928
How
- modelCachePVCName returns the user claim when set, so all cache volume references (single-file, multi-file, invalid-fileset branches) and the same prep + downloader init containers use it; weights still land under <cacheKey>/ and the serving container mounts read-only.
- ensureModelCachePVC never creates/adopts/GCs the user claim; it only verifies existence. A missing claim yields a Degraded condition plus a ModelCachePVCNotFound warning event (no silent fallback to the shared cache).
- claimName + pre-staged pvc:// source: claimName is ignored (no download path) and a ModelCacheClaimIgnored warning event is emitted.
- go test ./internal/controller/... ./api/... -count=1 and golangci-lint run (0 issues).

AI assistance: authored by Claude (Opus 4.8) operating under the maintainer's direction; reviewed before submitting.
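The name-resolution precedence under How can be sketched as follows. This is a minimal illustration, not the code in this PR: the function signature, the cacheMode type, the "model-cache" and "-model-cache" names, and the /models mount path are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"path"
)

// cacheMode mirrors the operator-global shared/perService setting (hypothetical type).
type cacheMode string

const (
	modeShared     cacheMode = "shared"
	modePerService cacheMode = "perService"
)

// modelCachePVCName resolves which PVC backs the cache volume. A non-empty
// claimName wins before the shared/perService branch is considered.
func modelCachePVCName(claimName string, mode cacheMode, serviceName string) string {
	if claimName != "" {
		return claimName
	}
	if mode == modePerService {
		return serviceName + "-model-cache" // assumed naming scheme
	}
	return "model-cache" // assumed shared PVC name
}

// weightsDir returns the <cacheKey>/ subdirectory under the cache mount,
// which stays the same regardless of which PVC backs the volume.
func weightsDir(mountPath, cacheKey string) string {
	return path.Join(mountPath, cacheKey)
}

func main() {
	fmt.Println(modelCachePVCName("team-a-cache", modeShared, "llama")) // prints "team-a-cache"
	fmt.Println(weightsDir("/models", "abc123"))                        // prints "/models/abc123"
}
```

Resolving the name in one place means every volume reference picks up the user claim without each branch checking claimName itself.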
Checklist
- make test passes locally
- make lint passes locally
- Commits signed off (git commit -s) per DCO