engine/engine_impl.go: Add a mutex around pod image updates for step progress. #26
When a pipeline fans out a DAG, concurrent calls are made to (*Kubernetes).Run
to start each step in parallel. But starting a step requires a
Read/Modify/Write loop against the Kubernetes API, so concurrent updates are
highly likely to fail with a concurrent modification error. A user encountered
this behavior here[1], and it is easy to reproduce by creating a pipeline with
high fan-out.
This commit adds a mutex around the Read/Modify/Write loop in
(*Kubernetes).start. We know that concurrent modifications from a single
process will cause failures, so there is no reason to dispatch them
concurrently.
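
To illustrate the idea, here is a minimal, self-contained sketch of serializing a Read/Modify/Write sequence with a mutex. It is not the code from this commit: `fakeAPI`, `pod`, `runner`, `updateMu`, and `startAll` are hypothetical names, and the fake API only imitates the version check that produces a concurrent modification error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errConflict stands in for the API's concurrent modification error.
var errConflict = errors.New("conflict: object has been modified")

// pod is a hypothetical, simplified pod: one image per step container.
type pod struct {
	version int
	images  map[string]string
}

// fakeAPI imitates optimistic concurrency: an update is rejected when it
// was computed from a stale version of the object.
type fakeAPI struct {
	mu  sync.Mutex
	pod pod
}

// get returns a private copy of the current pod.
func (a *fakeAPI) get() pod {
	a.mu.Lock()
	defer a.mu.Unlock()
	images := make(map[string]string, len(a.pod.images))
	for k, v := range a.pod.images {
		images[k] = v
	}
	return pod{version: a.pod.version, images: images}
}

// update stores p only if it was derived from the latest version.
func (a *fakeAPI) update(p pod) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if p.version != a.pod.version {
		return errConflict
	}
	p.version++
	a.pod = p
	return nil
}

// runner serializes the whole Read/Modify/Write sequence with updateMu.
type runner struct {
	api      *fakeAPI
	updateMu sync.Mutex
}

func (r *runner) start(step, image string) error {
	r.updateMu.Lock()
	defer r.updateMu.Unlock()
	p := r.api.get()        // read
	p.images[step] = image  // modify
	return r.api.update(p)  // write
}

// startAll starts n steps concurrently and reports how many updates hit a
// conflict and how many step images ended up on the pod.
func startAll(n int) (conflicts int, updated int) {
	api := &fakeAPI{pod: pod{images: map[string]string{}}}
	r := &runner{api: api}
	var wg sync.WaitGroup
	var countMu sync.Mutex
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := r.start(fmt.Sprintf("step-%d", i), "alpine"); err != nil {
				countMu.Lock()
				conflicts++
				countMu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return conflicts, len(api.get().images)
}

func main() {
	conflicts, updated := startAll(50)
	fmt.Println("conflicts:", conflicts, "updated:", updated)
}
```

In this sketch the runner is the only writer, so holding `updateMu` across the read and the write rules out conflicts entirely. Against a real cluster other writers can still modify the pod, which is why a retry is still needed.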
This commit also changes the current backoff parameters (5 max tries, 0.1
jitter factor) to be a little more robust. In local testing, concurrent
modification errors were much rarer with the mutex but were still possible.
[1]: https://discourse.drone.io/t/kube-runner-limitations/6853