[Core] Add owner reference to RBAC resources for automatic cleanup #492

Merged: slin1237 merged 1 commit into ome-projects:main from bcfre:fix-rbac on Jan 8, 2026

@bcfre bcfre commented Jan 8, 2026

What this PR does

This PR adds proper owner reference management to RBAC resources (ServiceAccount, Role, RoleBinding) to enable automatic garbage collection and prevent resource leaks.

Why we need it

Previously, the RBAC resources created by the InferenceService controller had no owner references set, which caused:

  • Resource leaks: RBAC resources were not cleaned up automatically when the InferenceService was deleted
  • Cleanup blocking issues: manual cleanup in the reconciler caused blocking problems during the resource cleanup phase
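To illustrate the idea, here is a minimal, stdlib-only sketch of attaching a controller owner reference to the three RBAC resources. The `OwnerReference` struct only mirrors the field names of the Kubernetes `metav1.OwnerReference` type, and `Object` and `SetControllerOwner` are hypothetical names invented for this example; they are not taken from this PR. A real controller would more likely call `controllerutil.SetControllerReference` from controller-runtime, though whether this PR does so is not shown here.

```go
package main

import "fmt"

// OwnerReference is a local stand-in that mirrors the field names of the
// Kubernetes metav1.OwnerReference type. It is NOT the real type.
type OwnerReference struct {
	APIVersion         string
	Kind               string
	Name               string
	UID                string
	Controller         *bool
	BlockOwnerDeletion *bool
}

// Object is a hypothetical, minimal view of a namespaced resource.
type Object struct {
	Kind            string
	Namespace       string
	Name            string
	UID             string
	OwnerReferences []OwnerReference
}

// SetControllerOwner (hypothetical helper) marks owner as the controlling
// owner of obj. It is idempotent, and it rejects cross-namespace ownership,
// which Kubernetes does not allow for namespaced owners.
func SetControllerOwner(owner, obj *Object, apiVersion string) error {
	if owner.Namespace != obj.Namespace {
		return fmt.Errorf("owner %s/%s and %s %s/%s are in different namespaces",
			owner.Namespace, owner.Name, obj.Kind, obj.Namespace, obj.Name)
	}
	for _, ref := range obj.OwnerReferences {
		if ref.UID == owner.UID {
			return nil // already owned by this owner, nothing to do
		}
	}
	yes := true
	obj.OwnerReferences = append(obj.OwnerReferences, OwnerReference{
		APIVersion:         apiVersion,
		Kind:               owner.Kind,
		Name:               owner.Name,
		UID:                owner.UID,
		Controller:         &yes,
		BlockOwnerDeletion: &yes,
	})
	return nil
}

func main() {
	// The UID below is made up for the example.
	isvc := &Object{Kind: "InferenceService", Namespace: "default", Name: "qwen3-0-6b", UID: "isvc-uid-1"}
	rbac := []*Object{
		{Kind: "ServiceAccount", Namespace: "default", Name: "qwen3-0-6b-router"},
		{Kind: "Role", Namespace: "default", Name: "qwen3-0-6b-router"},
		{Kind: "RoleBinding", Namespace: "default", Name: "qwen3-0-6b-router"},
	}
	for _, r := range rbac {
		if err := SetControllerOwner(isvc, r, "ome.io/v1beta1"); err != nil {
			panic(err)
		}
		fmt.Printf("%s/%s is owned by %s/%s\n", r.Kind, r.Name, r.OwnerReferences[0].Kind, r.OwnerReferences[0].Name)
	}
}
```

With a reference like this in place, deletion of the owner is enough for the garbage collector to remove the dependents, so the reconciler no longer needs its own cleanup path for them.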

Fixes #

How to test

go test -v ./pkg/controller/v1beta1/inferenceservice/reconcilers/rbac/

example

apiVersion: ome.io/v1beta1
kind: InferenceService
metadata:
  name: qwen3-0-6b
spec:
  model:
    name: qwen3-0-6b
  runtime:
    name: srt-qwen3-0-6b
  router:
    minReplicas: 1
    maxReplicas: 1
  engine:
    minReplicas: 1
    maxReplicas: 1

before

NAMESPACE  NAME                                             READY  REASON              AGE
default    InferenceService/qwen3-0-6b                      False  Initializing        40s
default    ├─ConfigMap/modelconfig-qwen3-0-6b             -                          40s
default    ├─Deployment/qwen3-0-6b-engine                 -                          40s
default    │ └─ReplicaSet/qwen3-0-6b-engine-5bf986dbc5   -                          40s
default    │   ├─Pod/qwen3-0-6b-engine-5bf986dbc5-hk5wr  False  ContainersNotReady  25s
default    │   └─Pod/qwen3-0-6b-engine-5bf986dbc5-stpl5  False  ContainersNotReady  40s
default    ├─Deployment/qwen3-0-6b-router                 -                          40s
default    │ └─ReplicaSet/qwen3-0-6b-router-65f8988b7c   -                          40s
default    │   └─Pod/qwen3-0-6b-router-65f8988b7c-7nhxn  False  ContainersNotReady  40s
default    ├─HorizontalPodAutoscaler/qwen3-0-6b-engine    -                          40s
default    ├─HorizontalPodAutoscaler/qwen3-0-6b-router    -                          40s
default    ├─PodDisruptionBudget/qwen3-0-6b-engine        -                          40s
default    ├─PodDisruptionBudget/qwen3-0-6b-router        -                          40s
default    ├─Service/qwen3-0-6b                           -                          40s
default    │ └─EndpointSlice/qwen3-0-6b-w2kkk            -                          40s
default    ├─Service/qwen3-0-6b-engine                    -                          40s
default    │ └─EndpointSlice/qwen3-0-6b-engine-zgvgz     -                          40s
default    └─Service/qwen3-0-6b-router                    -                          40s
default      └─EndpointSlice/qwen3-0-6b-router-m5n8m      -                          40s

after: Role/qwen3-0-6b-router, RoleBinding/qwen3-0-6b-router, and ServiceAccount/qwen3-0-6b-router now appear as children of the InferenceService

NAMESPACE  NAME                                             READY  REASON              AGE
default    InferenceService/qwen3-0-6b                      True                       6m43s
default    ├─ConfigMap/modelconfig-qwen3-0-6b             -                          6m43s
default    ├─Deployment/qwen3-0-6b-engine                 -                          6m43s
default    │ └─ReplicaSet/qwen3-0-6b-engine-5bf986dbc5   -                          6m43s
default    │   └─Pod/qwen3-0-6b-engine-5bf986dbc5-stpl5  True                       6m43s
default    ├─Deployment/qwen3-0-6b-router                 -                          6m43s
default    │ └─ReplicaSet/qwen3-0-6b-router-65f8988b7c   -                          6m43s
default    │   └─Pod/qwen3-0-6b-router-65f8988b7c-7nhxn  False  ContainersNotReady  6m43s
default    ├─HorizontalPodAutoscaler/qwen3-0-6b-engine    -                          6m43s
default    ├─HorizontalPodAutoscaler/qwen3-0-6b-router    -                          6m43s
default    ├─PodDisruptionBudget/qwen3-0-6b-engine        -                          6m43s
default    ├─PodDisruptionBudget/qwen3-0-6b-router        -                          6m43s
default    ├─Role/qwen3-0-6b-router                       -                          6m43s
default    ├─RoleBinding/qwen3-0-6b-router                -                          6m43s
default    ├─Service/qwen3-0-6b                           -                          6m43s
default    │ └─EndpointSlice/qwen3-0-6b-w2kkk            -                          6m43s
default    ├─Service/qwen3-0-6b-engine                    -                          6m43s
default    │ └─EndpointSlice/qwen3-0-6b-engine-zgvgz     -                          6m43s
default    ├─Service/qwen3-0-6b-router                    -                          6m43s
default    │ └─EndpointSlice/qwen3-0-6b-router-m5n8m     -                          6m43s
default    └─ServiceAccount/qwen3-0-6b-router             -                          6m43s
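The tree above lists the Role, RoleBinding, and ServiceAccount as children because each now carries an owner reference to the InferenceService. As a rough model of what that buys on deletion, the sketch below computes which resources become garbage once an owner is gone. `Resource` and `Collect` are hypothetical names for this example, the UIDs are made up, and the logic is a simplification of the Kubernetes garbage collector (a dependent is collected once all of its owners are gone), not its actual implementation.

```go
package main

import "fmt"

// Resource is a hypothetical, minimal record of an object and its owners.
// UIDs are assumed to be unique and non-empty.
type Resource struct {
	Kind      string
	Name      string
	UID       string
	OwnerUIDs []string
}

// Collect returns the resources that become garbage once the object with
// deletedUID is removed: every resource whose owners are all gone,
// followed transitively. Resources without owners are never collected.
func Collect(all []Resource, deletedUID string) []Resource {
	gone := map[string]bool{deletedUID: true}
	var collected []Resource
	for changed := true; changed; {
		changed = false
		for _, r := range all {
			if gone[r.UID] || len(r.OwnerUIDs) == 0 {
				continue
			}
			orphaned := true
			for _, o := range r.OwnerUIDs {
				if !gone[o] {
					orphaned = false
					break
				}
			}
			if orphaned {
				gone[r.UID] = true
				collected = append(collected, r)
				changed = true
			}
		}
	}
	return collected
}

func main() {
	all := []Resource{
		{Kind: "Role", Name: "qwen3-0-6b-router", UID: "role-1", OwnerUIDs: []string{"isvc-1"}},
		{Kind: "RoleBinding", Name: "qwen3-0-6b-router", UID: "rb-1", OwnerUIDs: []string{"isvc-1"}},
		{Kind: "ServiceAccount", Name: "qwen3-0-6b-router", UID: "sa-1", OwnerUIDs: []string{"isvc-1"}},
		// An unowned resource, like the RBAC objects before this change.
		{Kind: "ServiceAccount", Name: "leaked", UID: "sa-2"},
	}
	for _, r := range Collect(all, "isvc-1") {
		fmt.Printf("collected %s/%s\n", r.Kind, r.Name)
	}
}
```

The unowned entry models the pre-change behavior: without an owner reference, nothing ties the resource to the InferenceService, so it survives the deletion and leaks.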

Checklist

  • Tests added/updated (if applicable)
  • Docs updated (if applicable)
  • make test passes locally

Signed-off-by: bcfre <guo0xiong1feng@gmail.com>

@github-actions (bot) added the inferenceservice, controller, and tests labels Jan 8, 2026
@slin1237 slin1237 merged commit ae7b0f9 into ome-projects:main Jan 8, 2026
24 checks passed
@bcfre bcfre deleted the fix-rbac branch January 9, 2026 01:49