Replace klog with context logging for NEG Controller #2297

Merged: 2 commits, Nov 8, 2023

Contributor

@sawsa307 sawsa307 commented Oct 24, 2023

This PR is the continuation of #1746 (contextual logging for components in the NEG controller).

  • Functions accept a logger object from their caller, so the log prefix is determined by the calling object.
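
A minimal sketch of the pattern described in the bullet above: a helper that owns no logger and logs through whatever its caller passes in. To keep the example runnable without external modules, the standard library's `log/slog` stands in for `klog.Logger`, and `With` stands in for the caller-specific prefix. The function name `deleteNEG` and the component names are hypothetical and do not come from the PR.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// newLogger builds a text logger that writes to buf. The timestamp is dropped
// so the output is stable. slog is only a stand-in for klog.Logger here.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// deleteNEG is a hypothetical helper. It has no logger of its own: it logs
// through the logger handed in by the caller.
func deleteNEG(logger *slog.Logger, negName string) {
	logger.Info("Deleting NEG", "negName", negName)
}

// logFromCallers calls the same helper from two hypothetical components and
// returns the combined log output.
func logFromCallers() string {
	var buf bytes.Buffer
	base := newLogger(&buf)
	syncerLogger := base.With("component", "syncer")
	gcLogger := base.With("component", "gc")

	deleteNEG(syncerLogger, "neg-a")
	deleteNEG(gcLogger, "neg-b")
	return buf.String()
}

func main() {
	fmt.Print(logFromCallers())
}
```

The helper's log line carries the caller's context without the helper knowing who called it, which is the property the bullet describes.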

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 24, 2023
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 24, 2023
@k8s-ci-robot
Contributor

Hi @sawsa307. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 24, 2023
for endpoint, namespacedName := range endpointMap {
pod, exists, err := getPodFromStore(podLister, namespacedName.Namespace, namespacedName.Name)
if err != nil {
klog.Warningf("Failed to retrieve pod %q from store: %v", namespacedName.String(), err)
Contributor Author

@sawsa307 sawsa307 Oct 24, 2023

Based on discussion in #1746, I think we can use Info or Error. In this case, Error makes more sense to me.

@sawsa307 sawsa307 force-pushed the add-neg-context-logging branch 2 times, most recently from f5be8ec to aa061ae Compare October 24, 2023 21:22
@sawsa307
Contributor Author

/assign @swetharepakula

@sawsa307 sawsa307 marked this pull request as draft October 24, 2023 23:07
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 24, 2023
@sawsa307 sawsa307 marked this pull request as ready for review October 24, 2023 23:57
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 24, 2023
return negRef, err
}
-klog.V(4).Infof("Neg %q in zone %q was not found: %s", negName, zone, err)
+logger.V(4).Info("Neg in zone was not found", "negName", negName, "zone", zone, "err", err)
Member

nit: I don't think we need to do V(4)

Contributor Author

Updated!
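
For readers unfamiliar with verbosity levels, the practical effect of dropping `V(4)` is that the line is emitted at the default verbosity instead of only when verbosity is raised. The sketch below illustrates that difference with the standard library's `log/slog` as a stand-in for `klog.Logger`: `Debug` plays the role of a `V(4)`-gated call and `Info` the role of an ungated one. This mapping is an assumption made for illustration, and the messages are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// newLogger builds a text logger with the given minimum level. The timestamp
// is dropped so the output is stable. slog stands in for klog.Logger.
func newLogger(buf *bytes.Buffer, threshold slog.Level) *slog.Logger {
	opts := &slog.HandlerOptions{
		Level: threshold,
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// logNotFound writes one verbosity-gated line and one ungated line, then
// returns whatever actually reached the output.
func logNotFound(threshold slog.Level) string {
	var buf bytes.Buffer
	logger := newLogger(&buf, threshold)
	// Stand-in for logger.V(4).Info(...): dropped unless verbosity is raised.
	logger.Debug("NEG lookup detail", "negName", "neg-1")
	// Stand-in for logger.Info(...): emitted at the default threshold.
	logger.Info("NEG was not found", "negName", "neg-1")
	return buf.String()
}

// defaultOutput and verboseOutput fix the threshold for the two cases.
func defaultOutput() string { return logNotFound(slog.LevelInfo) }
func verboseOutput() string { return logNotFound(slog.LevelDebug) }

func main() {
	fmt.Print("default threshold:\n", defaultOutput())
	fmt.Print("verbose threshold:\n", verboseOutput())
}
```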

recorder.Eventf(svc, apiv1.EventTypeNormal, "Delete", "Deleted NEG %q for %s in %q.", negName, negServicePortName, zone)
}
}
}
}

if needToCreate {
-klog.V(2).Infof("Creating NEG %q for %s in %q.", negName, negServicePortName, zone)
+logger.V(2).Info("Creating NEG", "negName", negName, "negServicePortName", negServicePortName, "zone", zone)
Member

nit: remove V(2)

Contributor Author

Updated!

var negRef negv1beta1.NegObjectReference
neg, err := cloud.GetNetworkEndpointGroup(negName, zone, version)
if err != nil {
if !utils.IsNotFoundError(err) {
-klog.Errorf("Failed to get Neg %q in zone %q: %s", negName, zone, err)
+logger.Error(err, "Failed to get Neg in zone", "negName", negName, "zone", zone)
Member

nit: "Failed to get NEG"

Contributor Author

@sawsa307 sawsa307 Oct 28, 2023

One optimization we can make: since this function processes a single NEG, and every log message in it needs that NEG's name and zone, we can first create a logger with the additional key/value pairs:

negLogger := logger.WithValues("negName", negName, "zone", zone)

We then use this as the logger for the function, which should improve readability and shorten the log calls.

The same applies to toZoneNetworkEndpointMap and toZoneNetworkEndpointMapDegradedMode. Please take a look!
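
A minimal sketch of the suggestion above, assuming the standard library's `log/slog` as a stand-in for `klog.Logger` so the example runs without external modules; `With` plays the role of `WithValues`. The function `ensureNEG` is a hypothetical, heavily reduced stand-in and is not the PR's `ensureNetworkEndpointGroup`.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// newLogger builds a text logger that writes to buf. The timestamp is dropped
// so the output is stable. slog is only a stand-in for klog.Logger here.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// ensureNEG is a hypothetical function that processes exactly one NEG.
func ensureNEG(logger *slog.Logger, negName, zone string, found bool) {
	// Attach the NEG identity once; every later line carries it.
	negLogger := logger.With("negName", negName, "zone", zone)
	if !found {
		negLogger.Info("NEG was not found")
	}
	negLogger.Info("Creating NEG")
}

// renderEnsureNEG runs ensureNEG and returns the log output as a string.
func renderEnsureNEG(negName, zone string, found bool) string {
	var buf bytes.Buffer
	ensureNEG(newLogger(&buf), negName, zone, found)
	return buf.String()
}

func main() {
	fmt.Print(renderEnsureNEG("neg-1", "us-central1-a", false))
}
```

Each call site now passes only the message, while the name and zone still appear on every line.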

@bowei
Member

bowei commented Oct 30, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 30, 2023
Member

@swetharepakula swetharepakula left a comment

A few small comments, otherwise generally looks good.

Are we able to pass the logger into the API calls, specifically the GCE ones?

return negRef, err
}
-klog.V(4).Infof("Neg %q in zone %q was not found: %s", negName, zone, err)
+logger.Info("Neg in zone was not found", "negName", negName, "zone", zone, "err", err)
Member

nit: "Neg was not found"

Contributor Author

Updated.

@@ -161,21 +161,21 @@ func ensureNetworkEndpointGroup(svcNamespace, svcName, negName, zone, negService
!utils.EqualResourceIDs(neg.Subnetwork, networkInfo.SubnetworkURL)) {

needToCreate = true
-klog.V(2).Infof("NEG %q in %q does not match network and subnetwork of the cluster. Deleting NEG.", negName, zone)
+logger.V(2).Info("NEG does not match network and subnetwork of the cluster. Deleting NEG", "negName", negName, "zone", zone)
Member

nit: remove V(2)

Contributor Author

Updated.

@@ -215,7 +215,7 @@ func ensureNetworkEndpointGroup(svcNamespace, svcName, negName, zone, negService
var err error
neg, err = cloud.GetNetworkEndpointGroup(negName, zone, version)
if err != nil {
-klog.Errorf("Error while retrieving %q in zone %q: %v after initialization", negName, zone, err)
+logger.Error(err, "Error while retrieving in zone after initialization", "negName", negName, "zone", zone)
Member

"Error while retrieving NEG after initialization"

Contributor Author

Updated.

@@ -303,6 +309,13 @@ func toZoneNetworkEndpointMap(eds []negtypes.EndpointsData, zoneGetter negtypes.
// accidental diffs resulting from different formats.
networkEndpoint.IPv6 = parseIPAddress(podIPs.IPv6)
}
neLogger := logger.WithValues(
Member

Should we do epLogger.WithValues? Then you don't need to add the EPS-specific information.

Contributor Author

Updated. Thanks!
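
A minimal sketch of the idea in this thread: derive the per-endpoint logger from the per-endpoint-slice logger, so the endpoint logger only adds endpoint-specific keys. The standard library's `log/slog` stands in for `klog.Logger` (with `With` in place of `WithValues`), and the `endpointSlice` type, key names, and `processSlices` function are hypothetical simplifications, not the PR's code.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// endpointSlice is a hypothetical, trimmed-down stand-in for the real type.
type endpointSlice struct {
	Namespace string
	Name      string
	Addresses []string
}

// newLogger builds a text logger that writes to buf, without timestamps.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// processSlices logs one line per endpoint address.
func processSlices(logger *slog.Logger, slices []endpointSlice) {
	for _, eps := range slices {
		// Per-slice logger: carries the endpoint slice identity.
		epLogger := logger.With("namespace", eps.Namespace, "endpointSlice", eps.Name)
		for _, ip := range eps.Addresses {
			// Derived from epLogger, so only the endpoint key is added here.
			neLogger := epLogger.With("ip", ip)
			neLogger.Info("Processing endpoint")
		}
	}
}

// renderProcessSlices runs processSlices on fixed sample data.
func renderProcessSlices() string {
	var buf bytes.Buffer
	processSlices(newLogger(&buf), []endpointSlice{
		{Namespace: "default", Name: "svc-abc", Addresses: []string{"10.0.0.1", "10.0.0.2"}},
	})
	return buf.String()
}

func main() {
	fmt.Print(renderProcessSlices())
}
```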

@@ -313,7 +326,7 @@ func toZoneNetworkEndpointMap(eds []negtypes.EndpointsData, zoneGetter negtypes.
if existingPod, contains := networkEndpointPodMap[networkEndpoint]; contains {
localEPCount[negtypes.Duplicate] += 1
if existingPod.Name < endpointAddress.TargetRef.Name {
-klog.Infof("Found duplicate endpoints [%v, %v] when processing endpoint slice %s/%s, save the pod information from the alphabetically higher pod", networkEndpoint.IP, networkEndpoint.IPv6, ed.Meta.Namespace, ed.Meta.Name)
+neLogger.Info("Found duplicate endpoints when processing endpoint slice, save the pod information from the alphabetically higher pod")
Member

can you add the pod information to the log line too?

Contributor Author

Added.
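
A rough sketch of what a duplicate-endpoint log line with pod information could look like. It mirrors the comparison visible in the diff (the existing pod name is compared with the new one, and the alphabetically higher pod is kept), but the `endpoint` type, the key names `existingPod` and `newPod`, and the behavior when the existing pod is already the higher one are assumptions. The standard library's `log/slog` stands in for `klog.Logger`.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// endpoint is a hypothetical pairing of an endpoint IP with its pod name.
type endpoint struct {
	IP      string
	PodName string
}

// newLogger builds a text logger that writes to buf, without timestamps.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	opts := &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	}
	return slog.New(slog.NewTextHandler(buf, opts))
}

// dedupe maps each IP to one pod. On a duplicate IP it keeps the
// alphabetically higher pod name and logs both pods involved.
func dedupe(logger *slog.Logger, eps []endpoint) map[string]string {
	podByIP := make(map[string]string)
	for _, ep := range eps {
		existing, dup := podByIP[ep.IP]
		if !dup {
			podByIP[ep.IP] = ep.PodName
			continue
		}
		if existing < ep.PodName {
			logger.Info("Found duplicate endpoint, keeping the alphabetically higher pod",
				"ip", ep.IP, "existingPod", existing, "newPod", ep.PodName)
			podByIP[ep.IP] = ep.PodName
		}
		// Assumption: otherwise the existing (higher) pod is kept silently.
	}
	return podByIP
}

// runDedupe runs dedupe on fixed sample data and returns result and logs.
func runDedupe() (map[string]string, string) {
	var buf bytes.Buffer
	pods := dedupe(newLogger(&buf), []endpoint{
		{IP: "10.0.0.1", PodName: "pod-a"},
		{IP: "10.0.0.1", PodName: "pod-b"},
		{IP: "10.0.0.2", PodName: "pod-z"},
		{IP: "10.0.0.2", PodName: "pod-c"},
	})
	return pods, buf.String()
}

func main() {
	pods, logs := runDedupe()
	fmt.Print(logs)
	fmt.Println(pods["10.0.0.1"], pods["10.0.0.2"])
}
```

With both pod names in the log line, a reader can tell which pod was replaced without cross-referencing the endpoint slice.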

@@ -458,17 +476,24 @@ func toZoneNetworkEndpointMapDegradedMode(eds []negtypes.EndpointsData, zoneGett
// accidental diffs resulting from different formats.
networkEndpoint.IPv6 = parseIPAddress(podIPs.IPv6)
}
neLogger := logger.WithValues(
Member

@swetharepakula swetharepakula Nov 1, 2023

Same comment as in the other function.

I think the difference is the EPS addresses field.

Contributor Author

Updated.

@sawsa307
Contributor Author

sawsa307 commented Nov 2, 2023

> A few small comments, otherwise generally looks good.
>
> Are we able to pass the logger into the API calls, specifically the GCE ones?

Yes, that should be possible. I'll update those as well.

@sawsa307
Contributor Author

sawsa307 commented Nov 2, 2023

Added contextual logging to all API call function signatures in #2320, and created another commit that replaces all placeholders in NEG-related API calls with the contextual logger.

@sawsa307
Contributor Author

sawsa307 commented Nov 2, 2023

/retest

@sawsa307 sawsa307 force-pushed the add-neg-context-logging branch 2 times, most recently from 9dd254a to 18280d7 Compare November 2, 2023 22:11
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 2, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 3, 2023
@@ -243,7 +244,7 @@ func (f *FakeNetworkEndpointGroupCloud) DetachNetworkEndpoints(name, zone string
return nil
}

-func (f *FakeNetworkEndpointGroupCloud) ListNetworkEndpoints(name, zone string, showHealthStatus bool, version meta.Version) ([]*composite.NetworkEndpointWithHealthStatus, error) {
+func (f *FakeNetworkEndpointGroupCloud) ListNetworkEndpoints(name, zone string, showHealthStatus bool, version meta.Version, logger klog.Logger) ([]*composite.NetworkEndpointWithHealthStatus, error) {
Member

nit, should we ignore this parameter like the other functions in this fake?

Contributor Author

Thanks for catching this! Updated all instances of logger.
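
A minimal sketch of the nit above: when an interface method gains a logger parameter, a fake that never logs can leave the parameter unnamed with `_` and still satisfy the interface. The `negCloud` interface, `fakeNEGCloud` type, and `countEndpoints` caller are hypothetical, trimmed-down illustrations, and `*slog.Logger` from the standard library stands in for `klog.Logger`.

```go
package main

import (
	"fmt"
	"io"
	"log/slog"
)

// negCloud is a hypothetical, reduced interface. The logger is the last
// parameter, mirroring the signature change in the diff.
type negCloud interface {
	ListNetworkEndpoints(name, zone string, logger *slog.Logger) ([]string, error)
}

// fakeNEGCloud returns canned data keyed by "zone/name".
type fakeNEGCloud struct {
	endpoints map[string][]string
}

// Compile-time check that the fake satisfies the interface.
var _ negCloud = (*fakeNEGCloud)(nil)

// The fake never logs, so the logger parameter is left unnamed.
func (f *fakeNEGCloud) ListNetworkEndpoints(name, zone string, _ *slog.Logger) ([]string, error) {
	eps, ok := f.endpoints[zone+"/"+name]
	if !ok {
		return nil, fmt.Errorf("NEG %q in zone %q not found", name, zone)
	}
	return eps, nil
}

// countEndpoints is a hypothetical caller that passes its logger through.
func countEndpoints(c negCloud, name, zone string, logger *slog.Logger) (int, error) {
	eps, err := c.ListNetworkEndpoints(name, zone, logger)
	if err != nil {
		logger.Error("Failed to list network endpoints", "negName", name, "zone", zone, "err", err)
		return 0, err
	}
	return len(eps), nil
}

// newFakeCloud builds a fake holding one NEG with two endpoints.
func newFakeCloud() *fakeNEGCloud {
	return &fakeNEGCloud{endpoints: map[string][]string{
		"us-central1-a/neg-1": {"10.0.0.1", "10.0.0.2"},
	}}
}

// discardLogger returns a logger whose output is thrown away.
func discardLogger() *slog.Logger {
	return slog.New(slog.NewTextHandler(io.Discard, nil))
}

func main() {
	n, err := countEndpoints(newFakeCloud(), "neg-1", "us-central1-a", discardLogger())
	fmt.Println(n, err)
}
```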

Member

@swetharepakula swetharepakula left a comment

Small nits. This LG and we can merge after the preceding PR goes in

This PR is the continuation of kubernetes#1746 (contextual logging for components
in the NEG controller).
* Functions will accept a logger object from its caller, so the prefix
  will be determined based on the caller objects.
* Pass the logger to API call adapter and update the function interface.
* Updates in the pkg/composite will be created in a separate PR when all
  components are done with contextual logging migration.
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 7, 2023
Member

@swetharepakula swetharepakula left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 7, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sawsa307, swetharepakula

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 7, 2023
@k8s-ci-robot k8s-ci-robot merged commit 96aa269 into kubernetes:master Nov 8, 2023
5 checks passed
@sawsa307 sawsa307 deleted the add-neg-context-logging branch January 30, 2024 20:44
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
4 participants