Skip to content
This repository has been archived by the owner on Nov 9, 2022. It is now read-only.

Time limit for cache sync on startup and controller reconciliations #102

Merged
merged 2 commits into from
Jan 8, 2021

Conversation

rfranzke
Copy link
Contributor

@rfranzke rfranzke commented Jan 8, 2021

How to categorize this PR?

/area scalability
/kind bug
/priority normal

What this PR does / why we need it:
In order to prevent stuck worker routines as described in #99, @timebertt suggested to introduce time limits for the reconciliations of the controllers, similarly to how it's already done in Gardener's shoot care controller (see).

Also, the cache sync on startup is now limited to 5m.

Which issue(s) this PR fixes:
Fixes #99

Release note:

The controller reconciliations are now limited to `1m`.

@rfranzke rfranzke requested a review from a team as a code owner January 8, 2021 07:43
@gardener-robot gardener-robot added area/scalability Scalability related kind/bug Bug priority/normal needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 8, 2021
@rfranzke
Copy link
Contributor Author

rfranzke commented Jan 8, 2021

/invite @timebertt
/assign @timebertt

@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 8, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 8, 2021
@gardener-robot gardener-robot added size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) and removed size/s Size of pull request is small (see gardener-robot robot/bots/size.py) labels Jan 8, 2021
@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 8, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 8, 2021
@timebertt
Copy link
Contributor

>> make generate
make generate needs to be run:
 M api/resources/v1alpha1/zz_generated.deepcopy.go

@rfranzke
Copy link
Contributor Author

rfranzke commented Jan 8, 2021

/squash

@gardener-robot gardener-robot added the merge/squash Should be merged via 'Squash and merge' label Jan 8, 2021
@gardener-robot-ci-1 gardener-robot-ci-1 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 8, 2021
timebertt
timebertt previously approved these changes Jan 8, 2021
Copy link
Contributor

@timebertt timebertt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thank you!
This was exactly how I imagined it 🚀
/lgtm

@@ -113,7 +114,10 @@ func NewResourceManagerCommand() *cobra.Command {
}
}()

if !targetClientOpts.Completed().WaitForCacheSync(ctx.Done()) {
ctxWaitForCache, cancelWaitForCache := context.WithTimeout(ctx, 5*time.Minute)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side-note, doesn't have to be tackled in this PR: we should further improve health checks, by separating liveness and readiness endpoint and report an unready state as long as the caches haven't synced initially.
Though, this is also a general problem in our components.

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/review Needs review labels Jan 8, 2021
@gardener-robot gardener-robot removed the needs/review Needs review label Jan 8, 2021
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 8, 2021
@rfranzke
Copy link
Contributor Author

rfranzke commented Jan 8, 2021

Due to the go mod magic I needed to run make revendor after the make generate. :( :)

@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jan 8, 2021
@timebertt timebertt merged commit cc3fdc8 into gardener-attic:master Jan 8, 2021
@rfranzke rfranzke deleted the enh/reconcile-timeout branch January 8, 2021 08:43
@gardener-robot gardener-robot added priority/3 Priority (lower number equals higher priority) and removed priority/3 Priority (lower number equals higher priority) labels Mar 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/scalability Scalability related kind/bug Bug merge/squash Should be merged via 'Squash and merge' needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging size/m Size of pull request is medium (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Worker routines get stuck
6 participants