Please sign in to comment.
capacity controller: listen to creation events on deployments (#111)
This is related to #77. When running shipper with resyncs disabled, we noticed that some releases would hang, waiting for contender to achieve capacity. After investigating, we found that this was probably due to replication lag on the application clusters. When operating over a freshly created Deployment, we'd get a stream of the following errors: error syncing CapacityTarget "foo/bar" (will retry: true): expected exactly 1 deployment on cluster baz, namespace foo, with label "..." but 0 deployments exist Usually, this resolves itself with a couple of retries, but when the replication lag is severe enough, or different errors occur in sequence, we'd drop the capacity target from the queue and never retry again. By watching creation of Deployments, the CapacityTarget will re-join the queue, and releases should no longer get stuck.
- Loading branch information...