Ensure deployment await logic uses the latest deployment object #2943
Does the PR have any schema changes?
Looking good! No breaking changes found.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2943      +/-   ##
==========================================
+ Coverage   27.63%   29.16%   +1.52%
==========================================
  Files          54       61       +7
  Lines        7862     8194     +332
==========================================
+ Hits         2173     2390     +217
- Misses       5504     5600      +96
- Partials      185      204      +19
provider/pkg/await/deployment.go
Outdated
// We run the event handlers in goroutines to avoid blocking the main event loop.
// We also need to ensure that only one of each event processes can run at a time.
var depMutex, rsMutex, podMutex, pvcMutex sync.Mutex
I'm a little worried this introduces room for more race conditions on dia, because we can still be handling multiple events at once. Re: not blocking the main loop, do you have an idea of where we're getting bottlenecked?
I wonder if there's an alternative where we simply ignore replicaset events until we see a deployment event with a generation change. Something like:
// deploying is declared outside the event loop so it persists across events.
select {
case event := <-deploymentEvents:
	deploying = dia.processDeploymentEvent(event)
case event := <-replicaSetEvents:
	if deploying {
		dia.processReplicaSetEvent(event)
	}
}
Yes, the initial approach in this PR may be risky. We also shouldn't discard replicaset events, since we need them to inform us of status updates, so I'm hesitant to take the approach you suggested.
I've reworked this PR so that we first poll only for deployment events and wait until the deployment controller has told us whether or not a rollout was triggered before we move on to processing other events. That way, by the time we check the replicaset, we already know whether a rollout has been triggered.
173f462 to 89affa7
@@ -1537,7 +1540,7 @@ func deploymentUpdated(namespace, name, revision string) *unstructured.Unstructu
 }
 },
 "status": {
- "observedGeneration": 1,
+ "observedGeneration": 2,
note: the unit test had the incorrect observedGeneration for this object.
observedGeneration=generation before processing
89affa7 to 2ef5470
@@ -664,12 +665,12 @@ func Test_Apps_Deployment_Without_PersistentVolumeClaims(t *testing.T) {
// The Deployment specifically does not reference the PVC in
// its spec. Therefore, the Deployment should succeed no
// matter what phase the PVC is in as it does not have to wait on it.
deployments <- watchAddedEvent(
The event is reordered because we're using unbuffered channels, where a send blocks until a receiver is ready to take the value.
LGTM
continue
}

if deployment.GetGeneration() == observedGeneration {
Would generation <= observedGeneration be technically safer here? I don't know if that's actually observable in practice.
We need the observedGeneration to match the generation of the deployment object exactly, so that the status corresponds to the right manifest spec. generation != observedGeneration is currently why this bug exists, though it would be a bug in k8s if generation < observedGeneration.
I too stared at the comparison for a few minutes. Rollout code will sometimes have a 'planned' generation that they got from the call to Apply, and await an observed generation that is greater-or-equal to the planned generation. This code is not doing that; it is simply comparing generations within a given object and has no planned generation.
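To make the distinction between the two checks concrete, here is a minimal sketch. The `deployment` struct and both function names are hypothetical and exist only for this illustration; they are not part of the provider or of client-go.

```go
package main

import "fmt"

// deployment is a hypothetical, simplified view of the two fields discussed.
type deployment struct {
	Generation         int64 // stands in for .metadata.generation
	ObservedGeneration int64 // stands in for .status.observedGeneration
}

// statusIsCurrent is the within-object check: the status describes the spec
// carried by this same object only when the two generations match exactly.
func statusIsCurrent(d deployment) bool {
	return d.Generation == d.ObservedGeneration
}

// reachedPlanned is the planned-generation style: the caller remembers the
// generation returned by its own write and waits for the controller to
// observe at least that generation.
func reachedPlanned(d deployment, planned int64) bool {
	return d.ObservedGeneration >= planned
}

func main() {
	d := deployment{Generation: 3, ObservedGeneration: 2}
	fmt.Println(statusIsCurrent(d), reachedPlanned(d, 2))
}
```

With a planned generation of 2, the second check passes even though the object has since moved to generation 3, whereas the exact match says the status is stale for the spec the object currently carries.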
// Before we start processing any ReplicaSet, PVC or Pod events, we need to wait until the Deployment controller
// has seen and updated the status of the Deployment object.
if err := dia.waitUntilDeploymentControllerReconciles(deploymentEvents, timeout); err != nil {
	return err
}
👍
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [@pulumi/kubernetes](https://pulumi.com) ([source](https://togithub.com/pulumi/pulumi-kubernetes)) | dependencies | minor | [`4.9.1` -> `4.10.0`](https://renovatebot.com/diffs/npm/@pulumi%2fkubernetes/4.9.1/4.10.0) |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>pulumi/pulumi-kubernetes (@pulumi/kubernetes)</summary>

### [`v4.10.0`](https://togithub.com/pulumi/pulumi-kubernetes/blob/HEAD/CHANGELOG.md#4100-April-11-2024)

[Compare Source](https://togithub.com/pulumi/pulumi-kubernetes/compare/v4.9.1...v4.10.0)

- ConfigGroup V2 (pulumi/pulumi-kubernetes#2844)
- ConfigFile V2 (pulumi/pulumi-kubernetes#2862)
- Bugfix for ambiguous kinds (pulumi/pulumi-kubernetes#2889)
- \[yaml/v2] Support for resource ordering (pulumi/pulumi-kubernetes#2894)
- Bugfix for deployment await logic not referencing the correct deployment status (pulumi/pulumi-kubernetes#2943)

##### New Features

A new MLC-based implementation of `ConfigGroup` and of `ConfigFile` is now available in the "yaml/v2" package. These resources are usable in all Pulumi languages, including Pulumi YAML and in the Java Pulumi SDK.

Note that transformations aren't supported in this release (see pulumi/pulumi#12996).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).

Co-authored-by: lumiere-bot[bot] <98047013+lumiere-bot[bot]@users.noreply.github.com>
Proposed changes
This PR ensures that we compare deployment revisions based on the old live object, and the most current deployment object.
This PR does the following:
.status.observedGeneration
Related issues (optional)
Fixes: #2941