Avoids race condition on first gitjob creation with polling enabled #4986
Merged
0xavi0 merged 1 commit into rancher:main on Apr 16, 2026
When polling is enabled, Fleet creates the gitjob that calls fleet apply twice. The first time is because of the creation of the GitRepo itself; the second is because we receive the polling commit asynchronously from the quartz scheduler. This produced concurrent executions of fleet apply, which ended up in Bundle creation conflicts.

Also, when deleting a previous job, Fleet was not deleting the child pods. The PR now does this as well, and requeues after deleting a previous job so k8s has time to actually delete the fleet apply pods and avoid race conditions.

Refers to: rancher#4984

Signed-off-by: Xavi Garcia <xavi.garcia@suse.com>
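The delete-then-requeue flow described above can be sketched with an in-memory stand-in for the cluster. This is only an illustration under stated assumptions: `cluster`, `result`, `reconcile` and `requeueDelay` are hypothetical names, not Fleet's code, and the delay value is made up. In a controller-runtime reconciler the same idea would usually be expressed by passing `client.PropagationPolicy(metav1.DeletePropagationBackground)` to `Delete` and returning `ctrl.Result{RequeueAfter: ...}`; whether the PR uses exactly those options is not stated here.

```go
package main

import "time"

// cluster is a hypothetical in-memory stand-in for the API server.
type cluster struct {
	jobs map[string]string   // job name -> commit the job was created for
	pods map[string][]string // job name -> names of its child pods
}

// deleteJob removes a job. With cascade set, its child pods are removed too,
// mimicking a background propagation policy; without it the pods are orphaned.
func (c *cluster) deleteJob(name string, cascade bool) {
	delete(c.jobs, name)
	if cascade {
		delete(c.pods, name)
	}
}

// result mirrors the shape of a reconcile result that carries a requeue delay.
type result struct{ RequeueAfter time.Duration }

// requeueDelay is an illustrative value, not taken from the PR.
const requeueDelay = 2 * time.Second

// reconcile makes sure a job exists for the given commit. If a job for a
// different commit (including an empty one) is found, it is deleted together
// with its pods and the function returns early with a requeue, so the new job
// is only created on a later pass.
func reconcile(c *cluster, name, commit string) result {
	if old, ok := c.jobs[name]; ok {
		if old == commit {
			return result{} // job already matches the commit, nothing to do
		}
		c.deleteJob(name, true)
		return result{RequeueAfter: requeueDelay}
	}
	c.jobs[name] = commit
	c.pods[name] = []string{name + "-pod"}
	return result{}
}
```

Returning early after the delete, instead of creating the next job in the same pass, is what keeps the old and new fleet apply pods from overlapping in this sketch.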
Pull request overview
This PR addresses a race in Fleet’s gitjob controller when polling is enabled, where multiple fleet-apply Jobs can be created concurrently during initial GitRepo creation / first polling commit, leading to Bundle creation conflicts. It also adjusts Job deletion behavior to better avoid concurrent fleet-apply pod execution.
Changes:
- Prevent gitjob creation when polling is enabled but `Status.Commit` is still empty.
- Delete the previous gitjob when the commit changes and requeue to allow time for termination before creating the next Job.
- Add unit tests covering the “empty old commit” deletion case and `shouldCreateJob` behavior with/without polling.
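The first change in the list can be pictured as a small guard. This is a minimal sketch, assuming a trimmed-down `gitRepo` type with a `DisablePolling` flag and a `StatusCommit` field standing in for `Status.Commit`; the real `shouldCreateJob` in Fleet takes different arguments and checks more conditions than shown here.

```go
package main

// gitRepo is a hypothetical, trimmed-down stand-in for Fleet's GitRepo.
type gitRepo struct {
	DisablePolling bool   // assumed spec flag; polling is active when false
	StatusCommit   string // stands in for Status.Commit, empty until the first poll lands
}

// pollingEnabled reports whether the repo is watched by the poller.
func pollingEnabled(r gitRepo) bool { return !r.DisablePolling }

// shouldCreateJob sketches only the new guard: with polling on, wait until
// the poller has published a commit, so the GitRepo creation event and the
// first polling result do not each trigger their own job.
func shouldCreateJob(r gitRepo) bool {
	if pollingEnabled(r) && r.StatusCommit == "" {
		return false
	}
	return true
}
```

With this guard, a freshly created GitRepo with polling enabled produces no job until a commit is recorded, while a repo with polling disabled is unaffected.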
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `internal/cmd/controller/gitops/reconciler/gitjob_controller.go` | Adds requeue-after-delete flow, updates previous-job deletion semantics, and blocks job creation on empty commit when polling is enabled. |
| `internal/cmd/controller/gitops/reconciler/gitjob_test.go` | Adds tests for deleting an empty-commit previous job and for `shouldCreateJob` behavior with polling enabled/disabled. |
0xavi0 added a commit that referenced this pull request on Apr 16, 2026: …4986) (#4996)
Additional Information
Checklist
- [ ] I have updated the documentation via a pull request in the fleet-product-docs repository.