Ignore unavailable deployments and handle kubeconfig secret rotation #41

ashwani2k · 2022-03-07T15:36:10Z

What this PR does / why we need it:
This PR skips the scaling operation for deployments that are not available in the cluster.
The logs are also refined to avoid noise and clutter.
It also fixes the bug in reloading the shoot kubeconfig, as reported in #36.
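
For readers who want a concrete picture of the skip behaviour, here is a minimal sketch, not the code from this PR. The `cluster` map, `ErrNotFound`, and `scaleAll` are hypothetical names invented for illustration; in a real client the "not found" condition would come from the Kubernetes API rather than a map lookup.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel standing in for a "not found" API error.
var ErrNotFound = errors.New("deployment not found")

// cluster is a hypothetical in-memory stand-in: deployment name -> replica count.
type cluster map[string]int

// getReplicas returns the current replica count, or ErrNotFound if the deployment is absent.
func (c cluster) getReplicas(name string) (int, error) {
	r, ok := c[name]
	if !ok {
		return 0, fmt.Errorf("get %q: %w", name, ErrNotFound)
	}
	return r, nil
}

// scaleAll sets every named dependent to the target replica count.
// Deployments missing from the cluster are skipped instead of being treated
// as failures; their names are returned so the caller can log them once.
func scaleAll(c cluster, dependents []string, replicas int) ([]string, error) {
	var skipped []string
	for _, name := range dependents {
		if _, err := c.getReplicas(name); err != nil {
			if errors.Is(err, ErrNotFound) {
				skipped = append(skipped, name)
				continue
			}
			return skipped, err
		}
		c[name] = replicas
	}
	return skipped, nil
}

func main() {
	c := cluster{"dep-a": 1, "dep-b": 1}
	skipped, err := scaleAll(c, []string{"dep-a", "dep-missing", "dep-b"}, 0)
	fmt.Println("skipped:", skipped, "err:", err)
}
```

The point of the sketch is only that a missing dependent is recorded and skipped, while any other error still aborts the operation.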

Which issue(s) this PR fixes:
Fixes #40 #36

Special notes for your reviewer:

The PR upgrades the Go dependency to 1.17.
A delay is also introduced before checking dependents for desired availability. It is needed because checking for availability immediately after the scaling operation always failed, as it takes some time for the deployment to become ready.
Currently this is a hacky fixed delay of 2 seconds, based on what was observed in the tests.
Without this delay the scaling operation still happens, but it is honored only in the next reconciliation cycle. This creates an overall scaling delay of 30s per deployment requiring scaling.
To handle the kubeconfig rotation race condition, where in some edge cases the probe fails to access the latest kubeconfig and continues to fail until restarted, a retry mechanism is introduced which shall update the secret. The retry only acts if the error api errors of
Refactors the error handling for the probe.
Updates the alpine image to 3.15.

Release note:

Fixed a bug where dependency-watchdog did not skip scaling operations on deployments that are not enabled/deployed in a given cluster.
Fixed a bug with reloading of rotated dependency-watchdog-probe secrets by refreshing the clients with the updated secrets.

ashwani2k · 2022-03-19T15:15:51Z

@unmarshall -- As suggested by you, I've introduced a retry mechanism to handle the kubeconfig rotation scenario. I've also refined the logs overall and made them leaner for the happy path, and moved some of the logs to Level 5.
Kindly have a look at commit 38437b4.

unmarshall

Added comments

pkg/scaler/prober.go

pkg/scaler/scaler.go

ashwani2k · 2022-03-23T11:08:53Z

Thanks Madhav, I agree we can make a total separation of concerns.
The changes are made in commit 15f476a and were also tested on a local seed/shoot setup.

ashwani2k · 2022-03-23T12:51:56Z

Fixed test errors caused by missing arguments: https://github.com/gardener/dependency-watchdog/compare/15f476af3a05b03b7922b3866cf2a46df1687c51..1d5b24b617cefdf54bef6493eddbec22aecf9108

unmarshall · 2022-03-23T15:22:30Z

/lgtm


ashwani2k added 2 commits March 5, 2022 00:24

Vendored Go v1.17

ad113a2

Handled skipping scaling of dependents when not available in a cluster

8110398

ashwani2k requested a review from a team as a code owner March 7, 2022 15:36

gardener-robot added needs/review and size/S labels Mar 7, 2022

gardener-robot-ci-2 added the reviewed/ok-to-test label Mar 7, 2022

ashwani2k mentioned this pull request Mar 7, 2022

Integrate enhanced meltdown handling for dependency-watchdog probe gardener/gardener#5497

Merged

gardener-robot-ci-3 added needs/ok-to-test and removed reviewed/ok-to-test labels Mar 7, 2022

gardener-robot added size/M and removed size/S labels Mar 19, 2022

gardener-robot-ci-1 added reviewed/ok-to-test and removed reviewed/ok-to-test labels Mar 19, 2022

Enabled retry for failed probes due to secret rotation

38437b4

ashwani2k force-pushed the scale-autoscaler branch from f768dcc to 38437b4 Compare March 19, 2022 14:48

gardener-robot-ci-1 added the reviewed/ok-to-test label Mar 19, 2022

gardener-robot-ci-2 removed the reviewed/ok-to-test label Mar 19, 2022

ashwani2k changed the title ~~Handled behaviour for deployments when not available in the cluster~~ Ignore unavailable deployments and handle kubeconfig secret rotation Mar 19, 2022

unmarshall reviewed Mar 21, 2022

pkg/scaler/prober.go (outdated, resolved)

pkg/scaler/prober.go (resolved)

pkg/scaler/scaler.go (resolved)

Retry of probes for refreshed clients moved to next reconciliation

a1f1997

gardener-robot-ci-3 added reviewed/ok-to-test and removed reviewed/ok-to-test labels Mar 21, 2022

gardener-robot-ci-1 added the reviewed/ok-to-test label Mar 23, 2022

gardener-robot-ci-3 removed the reviewed/ok-to-test label Mar 23, 2022

Refactored error handling for probe

1d5b24b

ashwani2k force-pushed the scale-autoscaler branch from 15f476a to 1d5b24b Compare March 23, 2022 12:49

gardener-robot-ci-1 added reviewed/ok-to-test and removed reviewed/ok-to-test labels Mar 23, 2022

unmarshall approved these changes Mar 23, 2022


gardener-robot added reviewed/lgtm and removed needs/review labels Mar 23, 2022

ashwani2k merged commit 8965732 into gardener:master Mar 23, 2022

ashwani2k deleted the scale-autoscaler branch March 23, 2022 16:34

acumino mentioned this pull request Mar 24, 2022

[rel-0.7.0] Automated cherry pick of #41: Ignore unavailable deployments and handle kubeconfig secret rotation #46

Merged

rfranzke mentioned this pull request Mar 28, 2022

[BUG] Shoot kubeconfig reloading seems to not work in all cases #36

Closed

unmarshall added a commit that referenced this pull request Mar 28, 2022

Merge pull request #46 from acumino/automated-cherry-pick-of-#41-orig…

a6fc828

…in-rel-0.7.0 [rel-0.7.0] Automated cherry pick of #41: Ignore unavailable deployments and handle kubeconfig secret rotation
