Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Reorder boot and verify informer cache sync #2103

Merged
merged 1 commit into from
May 29, 2019
Merged

Conversation

hiddeco
Copy link
Member

@hiddeco hiddeco commented May 28, 2019

Before this change the operator would hang at random on "waiting for
informer caches to sync", without anything else happening or it timing
out.

After an extensive search in recently merged PRs, this seems to be
related to an change of order in #1906, where the informer would be
initialized before the operator had registered its event handlers.

In addition to reverting this change and ensuring the informer cache is
synced before it is used, reorder everything so that the operator is
started before anything else that may depend on it, so it can start
processing changes from the queue as soon as possible.

Addresses #2147.

@hiddeco hiddeco added the helm label May 28, 2019
@hiddeco hiddeco force-pushed the helm/informer-sync branch 2 times, most recently from 6735611 to dc07e9b Compare May 29, 2019 15:44
@hiddeco hiddeco changed the title Verify informer caches have synced before booting Reorder boot and verify informer cache sync May 29, 2019
@hiddeco hiddeco requested review from squaremo and 2opremio May 29, 2019 15:45
@hiddeco hiddeco force-pushed the helm/informer-sync branch 2 times, most recently from 84b5c99 to e4b68e0 Compare May 29, 2019 15:49
@hiddeco
Copy link
Member Author

hiddeco commented May 29, 2019

@squaremo @2opremio added 30 builds to be 100% sure this is a permanent fix, but my previous tests on CircleCI and on my own machine (about 100 cycles all together) did not run into any issues. https://circleci.com/gh/weaveworks/workflows/flux/tree/helm%2Finformer-sync

@2opremio
Copy link
Contributor

LGTM, but again, I am not super familiar with the operator.

Before this change the operator would hang at random on "waiting for
informer caches to sync", without anything else happening or it timing
out.

After an extensive search in recently merged PRs, this seems to be
related to an change of order in #1906, where the informer would be
initialized before the operator had registered its event handlers.

In addition to reverting this change and ensuring the informer cache is
synced before it is used, reorder everything so that the operator is
started before anything else that may depend on it, so it can start
processing changes from the queue as soon as possible.
Copy link
Member

@squaremo squaremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks entirely plausible to me :-)

@hiddeco hiddeco changed the title Reorder boot and verify informer cache sync Reorder boot and verify informer cache sync May 29, 2019
@hiddeco hiddeco merged commit b45b6dd into master May 29, 2019
@hiddeco hiddeco deleted the helm/informer-sync branch May 29, 2019 16:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants