Update healthcheck Description on BackendConfig removal #2181

Merged

Conversation

DamianSawicki
Contributor

In #2008 and #2068, we guarantee that the health check Description is updated when a BackendConfig is added to the Service (behind the flag --enable-update-hc-description). When a BackendConfig is removed, typically a health check update is triggered, which includes resetting the Description. However, under some circumstances the health check parameters update won't happen on BackendConfig removal, but it is still desirable to reset the Description to reflect the lack of BackendConfig. This reset is added in the present PR.

The changes are behind the above-mentioned flag --enable-update-hc-description. This is a temporary fix, and the corresponding code will be removed with the subsequent planned changes.
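
The condition described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `healthcheckDesc`, its JSON field name, and the `"BackendConfig"` value are hypothetical stand-ins for `healthcheck.HealthcheckDesc` and `healthcheck.BackendConfigHC`, and `needsDescriptionReset` is an invented helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthcheckDesc is a hypothetical stand-in for healthcheck.HealthcheckDesc;
// the real struct and its JSON field names may differ.
type healthcheckDesc struct {
	Config string `json:"config"`
}

// backendConfigHC is an assumed marker value for a health check that was
// configured through a BackendConfig.
const backendConfigHC = "BackendConfig"

// needsDescriptionReset reports whether the stored Description still claims a
// BackendConfig although the Service no longer has one.
func needsDescriptionReset(description string, hasBackendConfig bool) bool {
	var desc healthcheckDesc
	if err := json.Unmarshal([]byte(description), &desc); err != nil {
		// Plain-text (non-JSON) descriptions are left alone.
		return false
	}
	return desc.Config == backendConfigHC && !hasBackendConfig
}

func main() {
	stale := `{"config":"BackendConfig"}`
	fmt.Println(needsDescriptionReset(stale, false))        // BackendConfig removed
	fmt.Println(needsDescriptionReset(stale, true))         // BackendConfig still present
	fmt.Println(needsDescriptionReset("plain text", false)) // not JSON
}
```

Only the first call reports that a reset is needed; a Description that is not JSON is never touched in this sketch.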

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 29, 2023
@k8s-ci-robot
Contributor

Hi @DamianSawicki. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jun 29, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 29, 2023
@DamianSawicki DamianSawicki marked this pull request as ready for review June 29, 2023 11:33
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2023
@k8s-ci-robot k8s-ci-robot requested a review from aojea June 29, 2023 11:34
@DamianSawicki
Contributor Author

/assign aojea
/cc swetharepakula

@DamianSawicki DamianSawicki force-pushed the fix-backendconfig-removal branch 2 times, most recently from a567521 to 08777ff Compare June 29, 2023 12:33
@DamianSawicki DamianSawicki requested a review from aojea June 29, 2023 12:36
@swetharepakula
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 29, 2023
@DamianSawicki
Contributor Author

/cc mag-kol
/assign swetharepakula

@k8s-ci-robot
Contributor

@DamianSawicki: GitHub didn't allow me to request PR reviews from the following users: mag-kol.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc mag-kol
/assign swetharepakula

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -1743,6 +1792,25 @@ func TestSyncServicePort(t *testing.T) {
		t.Errorf("hcs.SyncServicePort(tc.sp, tc.probe) = %q, _; want = %q", gotSelfLink, tc.wantSelfLink)
	}
	verify()

	if tc.expectedEvent > 0 {
Contributor

It's better to create a new test func than to extend this one with new cases and special assertions just for your cases.

Contributor Author

Done.

if flags.F.EnableUpdateCustomHealthCheckDescription {
	desc := &healthcheck.HealthcheckDesc{}
	err := json.Unmarshal([]byte(existingHC.Description), desc)
	if err == nil && desc.Config == healthcheck.BackendConfigHC && bchcc == nil {
Contributor

bchcc?
Please name this properly;
it was declared 50+ lines earlier.

Maybe healthCheckConfig is a good name.

Contributor Author

BC stands for BackendConfig
HC must stand for Health Check
C is probably Config

Renaming this is a topic for a refactor, not a quick bug fix.

Contributor Author

Done.

if err != nil {
	klog.Errorf("Health check %q update error: %v", existingHC.Name, err)
}
h.notifyAboutTHC(hc, thcConf.THCEvents)
Contributor

notifyAboutTHC -> emitTHCEvents

An event is a well-defined concept.

Contributor

How many times can the notifyAboutTHC func be called in one execution? It may happen a few times, from what I see.

Contributor Author

Just once, it's always shortly before a return statement.

// The contents of the following 'if' are temporary, see https://github.com/kubernetes/ingress-gce/pull/2181 for details.
if flags.F.EnableUpdateCustomHealthCheckDescription {
	desc := &healthcheck.HealthcheckDesc{}
	err := json.Unmarshal([]byte(existingHC.Description), desc)
Contributor

You have to do something if this error happens.
Maybe logging the error is enough (with warning severity?).

Contributor Author

An error here means that the Description is not JSON. We are in the process of switching from plain strings to JSON, depending on health check types and the flags enabled, so seeing a non-JSON Description is not really an error.

Member

For debugging purposes, it would be good to have a log that says that this was not a JSON description.

Contributor Author

Done.

@@ -262,6 +263,28 @@ func (h *HealthChecks) sync(hc *translator.HealthCheck, bchcc *backendconfigv1.H
		return existingHC.SelfLink, err
	}

	// The contents of the following 'if' are temporary, see https://github.com/kubernetes/ingress-gce/pull/2181 for details.
Contributor

Please add more info here.

Contributor Author

@DamianSawicki DamianSawicki Jun 30, 2023

Discussed this with Antonio already, there are Google-internal resources with details.

Member

I second Cezary's comment. Since we cannot link the internal issue, it would be good to have some description in the code.

Member

It would be a good idea to describe what this is trying to do, i.e. that we are only updating the HC description and nothing else.

Contributor Author

Done.

Member

@swetharepakula swetharepakula left a comment

The changes look good to me. I do agree with the existing feedback that we should have a more detailed comment on what this is fixing. Though this will be changed in the future, we will still be supporting this version of the code for a long time and may need the comment to understand past context.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 3, 2023
desc := &healthcheck.HealthcheckDesc{}
err := json.Unmarshal([]byte(existingHC.Description), desc)
if err != nil {
	klog.V(3).Infof("Description for healthcheck %s is not a JSON (probably a plain-text description): %s.", existingHC.Name, existingHC.Description)
Contributor

Can you do early return here and remove else. Also you can mention in this log that the update is omitted because the description is not a JSON.

Contributor Author

> Also you can mention in this log that the update is omitted because the description is not a JSON.

Done.

> Can you do early return here and remove else.

I would prefer not to, in order to avoid copy-pasting the following lines.

	klog.V(2).Infof("Health check %q already exists and needs no update", hc.Name)
	return existingHC.SelfLink, nil

// even if changes.hasDiff() above is false. No other health check field is modified. The purpose is for the Description to accurately
// reflect the existence of a backendconfigv1.HealthCheckConfig for the service. This is temporary, see
// https://github.com/kubernetes/ingress-gce/pull/2181 for details.
if flags.F.EnableUpdateCustomHealthCheckDescription {
Contributor

I don't like the idea of postponing the update if it did not happen earlier. I think it would be better to move all the logic for checking against the BackendConfig description into a function, e.g. isDescriptionUpdateNeeded, and add the condition at line 247:

if changes.hasDiff() || isDescriptionUpdateNeeded() {}

WDYT?

Contributor Author

@DamianSawicki DamianSawicki Jul 4, 2023

Discussed outside GitHub, done. @kl52752 PTAL.
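
The shape of the suggestion in this thread, a single update path guarded by one combined condition, might look roughly like the sketch below. Everything here is an assumption made for illustration: `fieldDiffs` stands in for the result of comparing the existing and desired health checks, the signature of `isDescriptionUpdateNeeded` is invented, and `needsUpdate` is a hypothetical wrapper, not the merged code.

```go
package main

import "fmt"

// fieldDiffs is a hypothetical stand-in for the result of comparing the
// existing health check with the desired one.
type fieldDiffs struct {
	changed []string
}

func (d fieldDiffs) hasDiff() bool { return len(d.changed) > 0 }

// isDescriptionUpdateNeeded uses an assumed signature: it is true when the
// stored Description still claims a BackendConfig that no longer exists.
func isDescriptionUpdateNeeded(descClaimsBackendConfig, hasBackendConfig bool) bool {
	return descClaimsBackendConfig && !hasBackendConfig
}

// needsUpdate folds both triggers into one condition, so that the update is
// not postponed to a second, later code path.
func needsUpdate(changes fieldDiffs, descClaimsBackendConfig, hasBackendConfig bool) bool {
	return changes.hasDiff() || isDescriptionUpdateNeeded(descClaimsBackendConfig, hasBackendConfig)
}

func main() {
	noDiff := fieldDiffs{}
	fmt.Println(needsUpdate(noDiff, true, false))                                 // stale Description only
	fmt.Println(needsUpdate(noDiff, true, true))                                  // nothing to do
	fmt.Println(needsUpdate(fieldDiffs{changed: []string{"Port"}}, false, false)) // ordinary field diff
}
```

With a single condition, the "already exists and needs no update" branch stays in one place, which also addresses the earlier concern about duplicating the log-and-return lines.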

shallowCopy := func(x *compute.HealthCheck) *compute.HealthCheck { y := *x; return &y }

// Don't recalculate health check on BackendConfig removal (legacy behaviour, to be changed), but update Description.
chc := fixture.hc()
Contributor

Instead of using fixture.hc() and overriding almost everything, can you create a new function healthCheckFromBackendConfig and reuse it below? (I don't see any difference here.)

Contributor Author

Good idea, done.

	t.Fatalf("Got %d healthchecks, want 1\n%s", len(computeHCs), pretty.Sprint(computeHCs))
}

// Filter out SelfLink because it is hard to deal with in the mock and
Contributor

And the Description, if the flag is disabled.

Contributor Author

Done.

This is a temporary solution under the flag
--enable-update-hc-description.
@kl52752
Contributor

kl52752 commented Jul 5, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DamianSawicki, kl52752

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 5, 2023
@k8s-ci-robot k8s-ci-robot merged commit 01fcfee into kubernetes:master Jul 5, 2023
5 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

6 participants