fix: prevent CJI with runOnNotReady from using stale job image #1742

Merged
bsquizz merged 2 commits into RedHatInsights:master from rodrigonull:fix-cji-stale-image-race on Mar 9, 2026
@rodrigonull
Member

Add generation tracking to ClowdAppStatus so the CJI controller can verify the ClowdApp has been reconciled with its current spec before creating jobs. Without this, a CJI created concurrently with a ClowdApp update could read a stale job spec from the informer cache and launch a job with the old image.

This mirrors the existing pattern used by ClowdEnvironment, where Status.Generation is compared against metadata.Generation to detect unreconciled changes.

The runOnNotReady flag continues to bypass the deployment readiness check, preserving the init container / migration workflow.
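
The gating described above can be sketched in Go as follows. This is a minimal illustration, not the code from the PR: `ClowdAppLike`, `specReconciled`, and `canCreateJob` are hypothetical names, the struct is a trimmed-down stand-in for the real ClowdApp type, and the strict equality comparison is an assumption about how the two generations are compared.

```go
package main

import "fmt"

// ClowdAppLike is a hypothetical, trimmed-down stand-in for a ClowdApp.
// The real type carries these values in metadata and status sub-objects.
type ClowdAppLike struct {
	Generation       int64 // metadata.generation: bumped by the API server on spec changes
	StatusGeneration int64 // status.generation: recorded by the controller after reconciling
	DeploymentsReady bool  // whether the app's deployments are ready
}

// specReconciled reports whether the controller has reconciled the current spec.
// Assumption: the check is a plain equality between the two generations.
func specReconciled(app ClowdAppLike) bool {
	return app.StatusGeneration == app.Generation
}

// canCreateJob models the decision a CJI controller could make before creating a job.
// runOnNotReady skips only the deployment readiness check, never the generation check.
func canCreateJob(app ClowdAppLike, runOnNotReady bool) bool {
	if !specReconciled(app) {
		return false // spec may be stale; a real controller would requeue here
	}
	return runOnNotReady || app.DeploymentsReady
}

func main() {
	stale := ClowdAppLike{Generation: 2, StatusGeneration: 1}
	fresh := ClowdAppLike{Generation: 2, StatusGeneration: 2}

	fmt.Println(canCreateJob(stale, true))  // false: ClowdApp not yet reconciled
	fmt.Println(canCreateJob(fresh, true))  // true: reconciled, readiness bypassed
	fmt.Println(canCreateJob(fresh, false)) // false: reconciled, deployments not ready
}
```

Ordering the generation check first means a job is never built from a spec the controller has not yet processed, while `runOnNotReady` still lets migration-style jobs run before deployments come up.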
@bsquizz
Contributor

bsquizz commented Mar 9, 2026

This looks like a good approach to me. Would it be possible to enhance one of the kuttl tests dealing with CJIs to test this?

Enhance test-runonnotready-cji to verify that a CJI with runOnNotReady
picks up an updated job image after a ClowdApp spec change. The test
updates the job image from busybox to busybox:1.37 and creates a new
CJI, asserting the resulting Job uses the new image.
rodrigonull marked this pull request as ready for review on March 9, 2026, 18:54
@bsquizz
Contributor

bsquizz commented Mar 9, 2026

/test-e2e

bsquizz merged commit c23a820 into RedHatInsights:master on Mar 9, 2026
9 checks passed
rodrigonull added a commit to rodrigonull/clowder that referenced this pull request Mar 26, 2026
Add the missing generation field to the ClowdApp status schema in
deploy.yml and deploy-mutate.yml. Without this, the API server
silently prunes the field during status updates, making the generation
tracking from PR RedHatInsights#1742 ineffective.
bsquizz pushed a commit that referenced this pull request Mar 27, 2026
Add the missing generation field to the ClowdApp status schema in
deploy.yml and deploy-mutate.yml. Without this, the API server
silently prunes the field during status updates, making the generation
tracking from PR #1742 ineffective.
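
The pruning behaviour mentioned in that commit message can be modelled with a small Go sketch. It is a deliberately simplified illustration: `pruneStatus` is a hypothetical helper, the schema is reduced to a flat set of allowed keys, and the `ready` field is used only as an example of a field already present in the schema. The real API server prunes against the CRD's full structural OpenAPI schema.

```go
package main

import "fmt"

// pruneStatus keeps only the keys declared in schema. It loosely models how
// fields missing from a CRD's structural schema are dropped on write.
func pruneStatus(status map[string]any, schema map[string]bool) map[string]any {
	kept := make(map[string]any, len(status))
	for key, value := range status {
		if schema[key] {
			kept[key] = value
		}
	}
	return kept
}

func main() {
	// Status as the controller would try to write it.
	status := map[string]any{"ready": true, "generation": int64(2)}

	// Hypothetical schemas before and after adding the generation field.
	oldSchema := map[string]bool{"ready": true}
	newSchema := map[string]bool{"ready": true, "generation": true}

	_, kept := pruneStatus(status, oldSchema)["generation"]
	fmt.Println("generation kept with old schema:", kept)

	_, kept = pruneStatus(status, newSchema)["generation"]
	fmt.Println("generation kept with new schema:", kept)
}
```

With the old schema the write succeeds but the field never reaches storage, which is why the omission goes unnoticed until something reads the value back.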