Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip iptables sync if no endpoint changes #41223

Merged
merged 1 commit into from
Feb 10, 2017

Conversation

freehan
Copy link
Contributor

@freehan freehan commented Feb 10, 2017

Alternative to #41173
fixes: #26637
No need to checksum. Just compare endpoint maps.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 10, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Feb 10, 2017
@freehan
Copy link
Contributor Author

freehan commented Feb 10, 2017

I think this one is cleaner than the checksum one.
I did my best to sort out the original logic and build on top of it. It probably need a major refactor at some point. But since this PR is aiming for cherrypick, so only changing the minimum

Copy link
Contributor

@bprashanth bprashanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

activeEndpoints[svcPort] = true
}
}
// Remove endpoints missing from the update.
// Gather stale connections to endpoints missing from the update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: suggest rewording to "Check stale connections against endpoints missing from the update"

removedEndpoints := getRemovedEndpoints(curEndpointIPs, newEndpoints)
for _, ep := range removedEndpoints {
staleConnections[endpointServicePair{endpoint: ep, servicePortName: svcPort}] = true
}
glog.V(3).Infof("Setting endpoints for %q to %+v", svcPort, newEndpoints)
// Once the set operations using the list of ips are complete, build the list of endpoint infos
proxier.endpointsMap[svcPort] = proxier.buildEndpointInfoList(portsToEndpoints[portname], newEndpoints)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment: The only significant diff is that we were only doing the work to buildEndpointInfoList lazily, but that function only seems as expensive as flattenEndpointsInfo anyway (O(endpoints)), so I don't think we're burning any more cpu.

// record endpoints of unactive service to stale connections
for _, ep := range proxier.endpointsMap[svcPort] {
staleConnections[endpointServicePair{endpoint: ep.ip, servicePortName: svcPort}] = true
}

glog.V(2).Infof("Removing endpoints for %q", svcPort)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment: this version seems cleaner since we're not deleting from the map we're iterating, and there's no residual state as we start with a fresh map everytime.

@bprashanth
Copy link
Contributor

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 10, 2017
@freehan
Copy link
Contributor Author

freehan commented Feb 10, 2017

Should we get this in and let it soak for some time before cherry-pick into 1.5?

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: bprashanth, freehan

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@freehan
Copy link
Contributor Author

freehan commented Feb 10, 2017

adding label based on #41223 (review)

@freehan freehan added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 10, 2017
@k8s-github-robot k8s-github-robot added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Feb 10, 2017
@freehan freehan added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. release-note-label-needed labels Feb 10, 2017
@freehan freehan added this to the v1.5 milestone Feb 10, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 41223, 40892, 41220, 41207, 41242)

@dcbw
Copy link
Member

dcbw commented Feb 11, 2017

@freehan I've had #39484 sitting around since forever that @thockin hasn't gotten to. That adds a boatload of testcases.

And one of those testcases catches an error in this PR, which is that healthcheck updates are not correctly added anymore because proxier.updateHealthCheckEntries() is no longer called for every svcPort in the new endpoints map too.

I'm happy to fix that up in #39484 if you like, provided I can actually get somebody (@thockin ? @bprashanth ?) to review it.

@freehan
Copy link
Contributor Author

freehan commented Feb 12, 2017

@dcbw Thanks for noticing!

Please go ahead with the fix in #39484.

@dcbw
Copy link
Member

dcbw commented Feb 13, 2017

@freehan I've rebased and repushed #39484, would you mind taking a look at it too? Thanks!

@thockin
Copy link
Member

thockin commented Feb 13, 2017

Thanks guys,

I have been out for about 2 weeks (combo of vaca, work, sick) and am now trying to get all these PRs accounted for. Minhan can own the review.

I have another set of PRs between me, @timothysc, and @danwinship which make sync be called at a max rate - all together these should be a nice improvement.

Should we do this same transform over Service updates?

@dcbw
Copy link
Member

dcbw commented Feb 13, 2017

@thockin yeah, we probably should do the same for Service updates, but it's not as big of an issue since Service updates are much less frequent than Endpoints.

@freehan
Copy link
Contributor Author

freehan commented Feb 13, 2017

@dcbw
I understand the bug you are referring in #41223 (comment)

I think that should not cause any problem at this point. Here is the theory:

  1. create new service svc1
  2. OnEndpointsUpdate gets called with new service endpoint.
  3. Since updateHealthCheckEntries is only called for previous known service ports in proxier.endpointsMap, healthcheck for the newly added service endpoint will not be included.
  4. OnEndpointsUpdate gets called again with the same of endpoints.
  5. This time, proxier.endpointsMap includes svc1 and healthcheck is updated.

Since OnEndpointsUpdate is called at least twice a second (Due to master election logic in kube-controller-manager and kube-scheduler), having a half second delay to update health check should not cause any problem, right?

I intended to cherry-pick this into 1.5 to help some customers. I think #39484 is too big for cherry-pick. I can add a surgical fix to combine svcPorts from newEndpointMap and proxier.endpointsMap, and trigger updateHealthCheckEntries. For #39484, just need to rebase and delete the surgical fix.

What do you think?

freehan added a commit to freehan/kubernetes that referenced this pull request Feb 13, 2017
k8s-github-robot pushed a commit that referenced this pull request Feb 14, 2017
Automatic merge from submit-queue

fix healthcheck update problem introduced by #41223

ref: #41223

surgical fix for #41223 (comment)
freehan added a commit to freehan/kubernetes that referenced this pull request Feb 16, 2017
ahakanbaba pushed a commit to ahakanbaba/kubernetes that referenced this pull request Feb 17, 2017
k8s-github-robot pushed a commit that referenced this pull request Feb 23, 2017
…#41357-upstream-release-1.5

Automatic merge from submit-queue

Automated cherry pick of #41223 #41357

Cherry pick of #41223 #41357 on release-1.5.

#41223: skip iptables sync if no endpoint changes
#41357: fix healthcheck update problem introduced by #41223
databus23 pushed a commit to sapcc/kubernetes that referenced this pull request Mar 28, 2017
databus23 pushed a commit to sapcc/kubernetes that referenced this pull request May 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kube-proxy should ignore "spam" updates to endpoints
7 participants