[CP 1210] GPUOP-588 filter out workflow related pods from drain step #479

Merged
sajmera-pensando merged 1 commit into ROCm:main from ci-penbot-01:CP.O2O.pensando.gpu-operator.1210.rocm.gpu-operator.main on Mar 24, 2026

ci-penbot-01 commented on Mar 23, 2026

Cherry-pick of pensando/gpu-operator#1210.

This PR improves remediation workflow behavior by (1) excluding remediation workflow pods from the drain step, (2) adding validation for user-supplied remediation workflow ConfigMaps, and (3) adjusting recovery-policy window handling when syncing internal state from the status CR.

Changes:
* Add validateUserConfigMap and call it before creating/loading remediation default objects.
* Pass workflow mappings into syncInternalMapFromStatusCR to apply per-condition recovery-policy window filtering.
* Update drain.sh jq selection to ignore pods belonging to the remediation workflow controller instance.
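
The drain-step filter can be pictured with a minimal Go sketch. This is not the code from the PR, which changes a jq expression in drain.sh; it only shows the selection idea under stated assumptions. The Pod type, the podsToDrain helper, the label key example.com/workflow-controller-instance, and the instance ID are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// Pod is a pared-down, hypothetical stand-in for a Kubernetes pod.
// Only the fields the filter needs are modelled.
type Pod struct {
	Name   string
	Labels map[string]string
}

// instanceLabelKey is an assumed label key used for illustration only;
// the label actually checked by drain.sh is not stated in the PR description.
const instanceLabelKey = "example.com/workflow-controller-instance"

// podsToDrain returns the pods that should still be evicted: every pod
// except those labelled as belonging to the given workflow controller instance.
func podsToDrain(pods []Pod, instanceID string) []Pod {
	kept := make([]Pod, 0, len(pods))
	for _, p := range pods {
		// Skip only when the label is present and matches the instance,
		// so unlabelled pods are never treated as workflow pods.
		if v, ok := p.Labels[instanceLabelKey]; ok && v == instanceID {
			continue
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	pods := []Pod{
		{Name: "training-job", Labels: map[string]string{"app": "train"}},
		{Name: "remediation-step", Labels: map[string]string{instanceLabelKey: "remediation"}},
		{Name: "unlabelled"},
	}
	for _, p := range podsToDrain(pods, "remediation") {
		fmt.Println(p.Name)
	}
}
```

Skipping these pods keeps the drain step from evicting the workflow that is performing the drain. Pods from a different controller instance, and pods without the label, are still drained.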

* GPUOP-588 ignore workflow pods during drain

* more validations for configmap

(cherry picked from commit 554ddd5)
sajmera-pensando merged commit 1950fbe into ROCm:main on Mar 24, 2026
3 checks passed