Skip to content

Restore status fields when gitjob creation fails#4949

Merged
0xavi0 merged 1 commit intorancher:mainfrom
0xavi0:missing-commit-on-gitjob-error
Apr 15, 2026
Merged

Restore status fields when gitjob creation fails#4949
0xavi0 merged 1 commit intorancher:mainfrom
0xavi0:missing-commit-on-gitjob-error

Conversation

@0xavi0
Copy link
Copy Markdown
Contributor

@0xavi0 0xavi0 commented Apr 9, 2026

When manageGitJob fails, updateErrorStatus was persisting mutated in-memory status fields to Kubernetes even though no gitjob had been created. On the next reconcile all shouldCreateJob conditions evaluated to false, permanently losing the trigger.

Three fields were affected:

  • status.Commit — promoted by getNextCommit before manageGitJob is called
  • status.UpdateGeneration — consumed by updateGenerationValuesIfNeeded before job creation
  • status.ObservedGeneration — consumed by the same call

The fix saves the original values of all three fields before manageGitJob is called and restores them in the error path, so the next reconcile still sees the pending trigger and retries job creation.

Tests are included that fail on the unfixed code and pass after the fix.

Refers to: #4948

Additional Information

Checklist

- [ ] I have updated the documentation via a pull request in the fleet-product-docs repository.

When manageGitJob fails, updateErrorStatus was persisting mutated in-memory status fields to Kubernetes even though no gitjob had been created. On the next reconcile all shouldCreateJob conditions evaluated to false, permanently losing the trigger.

Three fields were affected:
  - status.Commit — promoted by getNextCommit before manageGitJob is called
  - status.UpdateGeneration — consumed by updateGenerationValuesIfNeeded before job creation
  - status.ObservedGeneration — consumed by the same call

The fix saves the original values of all three fields before manageGitJob is called and restores them in the error path, so the next reconcile still sees the pending trigger and retries job creation.

Tests are included that fail on the unfixed code and pass after the fix.

Refers to: rancher#4948

Signed-off-by: Xavi Garcia <xavi.garcia@suse.com>
@0xavi0 0xavi0 added this to the v2.15.0 milestone Apr 9, 2026
@0xavi0 0xavi0 self-assigned this Apr 9, 2026
@0xavi0 0xavi0 added the kind/bug label Apr 9, 2026
@0xavi0 0xavi0 added this to Fleet Apr 9, 2026
@0xavi0 0xavi0 marked this pull request as ready for review April 9, 2026 10:31
@0xavi0 0xavi0 requested a review from a team as a code owner April 9, 2026 10:31
Copilot AI review requested due to automatic review settings April 9, 2026 10:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a GitRepo reconciliation edge case where in-memory status fields were being persisted to Kubernetes even when GitJob creation failed, causing subsequent reconciles to miss triggers and never retry job creation (issue #4948).

Changes:

  • Restore Status.Commit, Status.UpdateGeneration, and Status.ObservedGeneration before calling updateErrorStatus when manageGitJob returns an error/requeue.
  • Add regression tests that assert these status fields are not persisted when GitJob creation fails due to missing secrets.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
internal/cmd/controller/gitops/reconciler/gitjob_controller.go Restores mutated status fields on the manageGitJob error/requeue path to preserve retry triggers.
internal/cmd/controller/gitops/reconciler/gitjob_test.go Adds tests covering commit promotion, force-sync generation, and observed generation behavior on GitJob creation failure.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/cmd/controller/gitops/reconciler/gitjob_controller.go
Comment thread internal/cmd/controller/gitops/reconciler/gitjob_test.go
Comment thread internal/cmd/controller/gitops/reconciler/gitjob_test.go
Comment thread internal/cmd/controller/gitops/reconciler/gitjob_test.go
@thardeck thardeck moved this to 👀 In review in Fleet Apr 14, 2026
@0xavi0 0xavi0 merged commit 034ecda into rancher:main Apr 15, 2026
39 of 40 checks passed
@github-project-automation github-project-automation bot moved this from 👀 In review to ✅ Done in Fleet Apr 15, 2026
0xavi0 added a commit to 0xavi0/fleet that referenced this pull request Apr 15, 2026
When manageGitJob fails, updateErrorStatus was persisting mutated in-memory status fields to Kubernetes even though no gitjob had been created. On the next reconcile all shouldCreateJob conditions evaluated to false, permanently losing the trigger.

Three fields were affected:
  - status.Commit — promoted by getNextCommit before manageGitJob is called
  - status.UpdateGeneration — consumed by updateGenerationValuesIfNeeded before job creation
  - status.ObservedGeneration — consumed by the same call

The fix saves the original values of all three fields before manageGitJob is called and restores them in the error path, so the next reconcile still sees the pending trigger and retries job creation.

Tests are included that fail on the unfixed code and pass after the fix.

Refers to: rancher#4948

Signed-off-by: Xavi Garcia <xavi.garcia@suse.com>
thardeck pushed a commit that referenced this pull request Apr 16, 2026
When manageGitJob fails, updateErrorStatus was persisting mutated in-memory status fields to Kubernetes even though no gitjob had been created. On the next reconcile all shouldCreateJob conditions evaluated to false, permanently losing the trigger.

Three fields were affected:
  - status.Commit — promoted by getNextCommit before manageGitJob is called
  - status.UpdateGeneration — consumed by updateGenerationValuesIfNeeded before job creation
  - status.ObservedGeneration — consumed by the same call

The fix saves the original values of all three fields before manageGitJob is called and restores them in the error path, so the next reconcile still sees the pending trigger and retries job creation.

Tests are included that fail on the unfixed code and pass after the fix.

Refers to: #4948

Signed-off-by: Xavi Garcia <xavi.garcia@suse.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

4 participants