
ipvs support graceful termination #66012

Merged · 2 commits · Sep 28, 2018

Conversation

@Lion-Wei
Contributor

@Lion-Wei Lion-Wei commented Jul 10, 2018

What this PR does / why we need it:
Add a timed queue to handle ipvs graceful delete.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #57841

Special notes for your reviewer:

Release note:

IPVS proxier mode now supports connection-based graceful termination.
@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Jul 10, 2018

/cc @jsravn @jhorwit2 @rramkumar1
/assign @m1093782566
Please take a look, thanks.

@k8s-ci-robot
Contributor

@k8s-ci-robot k8s-ci-robot commented Jul 10, 2018

@Lion-Wei: GitHub didn't allow me to request PR reviews from the following users: jsravn.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @jsravn @jhorwit2 @rramkumar1
/assign @m1093782566
Please take a look, thanks.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested a review from rramkumar1 Jul 10, 2018
@Lion-Wei Lion-Wei changed the title ipvs support graceful termination [wip]ipvs support graceful termination Jul 10, 2018
@jsravn
Contributor

@jsravn jsravn commented Jul 10, 2018

@Lion-Wei Wouldn't it be better to check for active connections, rather than using an arbitrary timeout? This should be available via the netfilter interface as active connections. When it reaches 0, it should be safe to remove the real server.

@jsravn
Contributor

@jsravn jsravn commented Jul 10, 2018

Previously discussed at #64947 (comment).

continue
}
portNum, err := strconv.Atoi(port)
glog.V(5).Infof("new ep %q is in graceful delete list", uniqueRS)
err := proxier.gracefuldeleteManager.DeleteRsImmediately(uniqueRS)


@jsravn

jsravn Jul 10, 2018
Contributor

Won't this drop existing connections if an endpoint is flapping? For example, if a pod is under load and goes NotReady temporarily (indicating it's rejecting new connections, but still processing existing ones). I don't think we want to drop connections in this case. It should just remove it from the queue rather than deleting the real server first. And update the weight to 1.


@Lion-Wei

Lion-Wei Jul 10, 2018
Author Contributor

That's a good point, thanks. Updating to weight=1 would be better.
But there is another situation: both the service and pod have been deleted, and the user then creates another service and pod that match the original vs/rs still in the delete list.
I'm a little worried about this case, though it seems like it may not matter.

// Delete old endpoints
for _, ep := range curEndpoints.Difference(newEndpoints).UnsortedList() {
// if curEndpoint is in gracefulDelete, skip
uniqueRS := vs.String() + "/" + ep


@jsravn

jsravn Jul 10, 2018
Contributor

I think this identifier should either be created in graceful_delete.go (pass in vs and rs instead of the string), or defined higher up so it's not duplicated with line 1503.
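The suggestion to centralize the identifier could look like the sketch below (the stand-in types and the exact string format are illustrative assumptions, not the PR's real `utilipvs` types): one helper builds the `"<vs>/<rs>"` key so callers never duplicate the concatenation.

```go
package main

import "fmt"

// Minimal stand-ins for the utilipvs types; the real ones carry more fields.
type virtualServer struct {
	Address  string
	Port     uint16
	Protocol string
}

type realServer struct {
	Address string
	Port    uint16
}

func (vs *virtualServer) String() string {
	// Format is illustrative only.
	return fmt.Sprintf("%s://%s:%d", vs.Protocol, vs.Address, vs.Port)
}

func (rs *realServer) String() string {
	return fmt.Sprintf("%s:%d", rs.Address, rs.Port)
}

// uniqueRSKey builds the "<vs>/<rs>" identifier in one place, as
// suggested above, so the concatenation is not duplicated at call sites.
func uniqueRSKey(vs *virtualServer, rs *realServer) string {
	return vs.String() + "/" + rs.String()
}

func main() {
	vs := &virtualServer{Address: "10.0.0.1", Port: 80, Protocol: "TCP"}
	rs := &realServer{Address: "10.244.1.5", Port: 8080}
	fmt.Println(uniqueRSKey(vs, rs))
}
```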

@m1093782566
Member

@m1093782566 m1093782566 commented Jul 10, 2018

@Lion-Wei Wouldn't it be better to check for active connections, rather than using an arbitrary timeout?

It would be ideal if there is a way to detect if the connection is active.

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Jul 10, 2018

@jsravn @m1093782566 I think checking active connections would be better, but I'm worried about reliability. Is having no active connections enough to prove that the rs can be safely removed? And do other connection states matter, e.g. FIN_WAIT/TIME_WAIT?

And we would need a new IPVS interface in pkg/util/ipvs to get the real server's active connection count.

@jsravn
Contributor

@jsravn jsravn commented Jul 10, 2018

@Lion-Wei For TCP connections, active count is the correct thing to use. If it is in FIN_WAIT/TIME_WAIT then it's been closed by one side so I think it's fine to close the connection at this point (after which, for FIN_WAIT at least, the one side will get a reset if they try to send packets).

UDP is trickier. I think UDP only increments "InActCount", I would need to test this to be sure. So for UDP you would have to wait for InActCount to drop to 0.

You can see here how ipvsadm retrieves those values, I guess it might require modifying the ipvs util package if it doesn't have it yet: https://git.kernel.org/pub/scm/utils/kernel/ipvsadm/ipvsadm.git/tree/libipvs/libipvs.c#n893.
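The TCP/UDP rule described above can be sketched as a small predicate. This is an interpretation of the comment, not the PR's code: the struct mirrors the per-real-server counters IPVS exposes (ActiveConn / InActConn), and the field and function names are assumptions.

```go
package main

import "fmt"

// connStats mirrors the per-real-server counters IPVS reports.
type connStats struct {
	ActiveConn uint32
	InactConn  uint32
}

// safeToDelete applies the rule discussed above: for TCP, wait until
// ActiveConn reaches 0 (FIN_WAIT/TIME_WAIT connections show up as
// inactive and may safely be dropped); for UDP, which reportedly only
// bumps the inactive counter, wait for InactConn to reach 0 as well.
func safeToDelete(protocol string, s connStats) bool {
	switch protocol {
	case "UDP":
		return s.ActiveConn == 0 && s.InactConn == 0
	default: // TCP
		return s.ActiveConn == 0
	}
}

func main() {
	fmt.Println(safeToDelete("TCP", connStats{ActiveConn: 0, InactConn: 3}))
	fmt.Println(safeToDelete("UDP", connStats{ActiveConn: 0, InactConn: 3}))
}
```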

Member

@rramkumar1 rramkumar1 left a comment

First round of comments.

One observation: I see that the iptables implementation (#60074) takes in a command line arg which dictates the termination delay. How come we are not doing that here?

Also, that implementation spawns a goroutine for each endpoint, whereas you have the priority queue implementation. What was the reasoning for the queue?

@@ -0,0 +1,250 @@
/*


@rramkumar1

rramkumar1 Jul 10, 2018
Member

I feel like a more appropriate name for the file would be graceful_termination.go. Any objections to that?

}

// GracefulDeleteRSQueueQueue is a priority heap where the lowest ProcessAt is at the front of the queue
type GracefulDeleteRSQueueQueue []*GracefulDeleteRS


@rramkumar1

rramkumar1 Jul 10, 2018
Member

Can you call this gracefulTerminationQueue and make it package private? The queue is an implementation detail of the manager.


// GracefulDeleteRS stores real server information and the process time.
// If nothing special happened, real server will be delete after process time.
type GracefulDeleteRS struct {


@rramkumar1

rramkumar1 Jul 10, 2018
Member

Can you call this queueItem or something like that? GracefulDeleteRS is a really confusing name.

}

// GracefulDeleteRSQueueQueue is a priority heap where the lowest ProcessAt is at the front of the queue
type GracefulDeleteRSQueueQueue []*GracefulDeleteRS


@rramkumar1

rramkumar1 Jul 10, 2018
Member

Just an idea: why not make the internal type of the heap a map, with the key being the queue item and the value being the index? That way you can use the native properties of the map to preserve uniqueness, and you should still be able to implement the sort and heap interfaces. Then you won't need the UniqueQueue. Maybe it doesn't make sense, but I just wanted to throw the idea out there.

return nil
}

func (m *GracefulDeleteManager) TryDeleteRs() {


@rramkumar1

rramkumar1 Jul 10, 2018
Member

make this package private

}
}

func (m *GracefulDeleteManager) RSInGracefulDelete(uniqueRS string) bool {


@rramkumar1

rramkumar1 Jul 10, 2018
Member

Can we call this PendingGracefulTermination?


@Lion-Wei

Lion-Wei Jul 19, 2018
Author Contributor

This function checks whether the rs is in the graceful termination list. I think InTerminationList would be better.

return exist
}

func (m *GracefulDeleteManager) GracefulDeleteRS(vs *utilipvs.VirtualServer, rs *utilipvs.RealServer) error {


@rramkumar1

rramkumar1 Jul 10, 2018
Member

can we call this GracefullyTerminate?

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Jul 19, 2018

@jsravn, I think using active connections to determine whether an rs can be deleted is a terrific idea, but that work is going to take some time, since the vendored repo docker/libnetwork may need to discuss it and decide whether it is necessary.
Anyway, I raised an issue in docker/libnetwork: moby/libnetwork#2237

And I think using a grace period is another solution, though not the best one. Maybe we can merge this first to solve the problem, and once the docker/libnetwork side work is finished, switch to the better approach.

@jhorwit2 @m1093782566 @rramkumar1 Thoughts?

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Jul 19, 2018

@rramkumar1 Recently I've been busy with other work, sorry for the delayed response.
Firstly, I don't think we need a parameter for the termination delay time, since users shouldn't have to worry about it.
And I don't think one goroutine per rs is a good idea, since that could spawn a lot of goroutines in a large cluster with a lot of pod churn.

Also, I'm a little hesitant about the structure: if we use a FIFO queue, we don't have to traverse all the gracefully terminating endpoints to decide which should be deleted, and in a large cluster we might have a lot of rs in graceful termination.

@jsravn
Contributor

@jsravn jsravn commented Jul 19, 2018

@Lion-Wei I think that's okay. I agree it should be a configurable timeout, some people might want to disable it or change the timeout (thinking of the bug that led to UDP connections being aggressively flushed in the iptables proxier).

@m1093782566
Member

@m1093782566 m1093782566 commented Jul 20, 2018

@Lion-Wei I think that's okay. I agree it should be a configurable timeout, some people might want to disable it or change the timeout (thinking of the bug that led to UDP connections being aggressively flushed in the iptables proxier).

We should be a bit careful about creating new APIs/flags, especially since the new timeout flag would be deprecated very soon when/if we add connection detection support. I am not sure how urgent it is, but a configurable timeout and connection detection are different behaviors, and we need to understand their effects on users.

@Lion-Wei Lion-Wei changed the title [wip]ipvs support graceful termination ipvs support graceful termination Jul 20, 2018
@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Jul 23, 2018

Quick update: it turns out docker/libnetwork has the same problem, and they are solving it the same way.

Once a service is disabled we are down weighting it to 0 so that new connection are not going to be LB on that, and we are removing the ipvs rule once the container exit. The time between the weight to 0 and the removal of the ipvs rule is managed by the graceful shutdown period.
moby/libnetwork#2237 (comment)
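The libnetwork pattern quoted above amounts to a two-step dance, sketched below with a fake in-memory stand-in for the kernel IPVS table (all names here are illustrative assumptions): first drop the real server's weight to 0 so the scheduler stops load balancing new connections to it, then remove the rule once the graceful shutdown period has passed.

```go
package main

import "fmt"

// fakeIPVS stands in for the kernel IPVS table.
type fakeIPVS struct {
	weights map[string]int // real server -> scheduling weight
}

// beginGracefulTermination stops new connections from being scheduled
// to rs by setting its weight to 0; existing connections keep flowing.
func (f *fakeIPVS) beginGracefulTermination(rs string) {
	f.weights[rs] = 0
}

// finishGracefulTermination removes the real server once the graceful
// shutdown period has elapsed (or its connections have drained).
func (f *fakeIPVS) finishGracefulTermination(rs string) {
	delete(f.weights, rs)
}

func main() {
	f := &fakeIPVS{weights: map[string]int{"10.244.1.5:8080": 1}}
	f.beginGracefulTermination("10.244.1.5:8080")
	fmt.Println(f.weights["10.244.1.5:8080"]) // weight now 0
	f.finishGracefulTermination("10.244.1.5:8080")
	_, ok := f.weights["10.244.1.5:8080"]
	fmt.Println(ok) // rule removed
}
```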

@akumria

@akumria akumria commented Aug 30, 2018

Is anything left here, except for a rebase so the conflict is resolved?

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Sep 5, 2018

@akumria @jsravn @lbernail @m1093782566 , Connection based graceful termination is finished, please take another look.

@lbernail
Contributor

@lbernail lbernail commented Sep 6, 2018

@Lion-Wei : this looks good. I think we'll test it as soon as it makes it to an alpha release
Do you think this can make it to the 1.11 branch?

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Sep 20, 2018

@lbernail I think this should be cherry-picked to 1.11; that's necessary.
But first we need to get this merged. :(

@m1093782566
Member

@m1093782566 m1093782566 commented Sep 27, 2018

/approve

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Sep 28, 2018

Thanks, @m1093782566 .
@jsravn @lbernail @rramkumar1 Can anyone help me add a lgtm, then we can get this merged. 😃

@m1093782566
Member

@m1093782566 m1093782566 commented Sep 28, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Sep 28, 2018
@k8s-ci-robot
Contributor

@k8s-ci-robot k8s-ci-robot commented Sep 28, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Lion-Wei, m1093782566

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Sep 28, 2018

/test pull-kubernetes-integration

@k8s-ci-robot k8s-ci-robot merged commit 8ea6b2c into kubernetes:master Sep 28, 2018
18 checks passed
cla/linuxfoundation: Lion-Wei authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gke: Skipped
pull-kubernetes-e2e-kops-aws: Job succeeded.
pull-kubernetes-e2e-kubeadm-gce: Skipped
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
tide: In merge pool.
@zachaller
Contributor

@zachaller zachaller commented Oct 8, 2018

Will this make it into a 1.12.x release? Along with #69267

@Lion-Wei
Contributor Author

@Lion-Wei Lion-Wei commented Oct 8, 2018

@zachaller Yeah, I think so.

k8s-ci-robot added a commit that referenced this pull request Oct 24, 2018
#66012-#69268-upstream-release-1.12

Automated cherry pick of #68874: update vendor #66012: ipvs support graceful termination #69268: fix bugs introduced by ipvs graceful termination
k8s-ci-robot added a commit that referenced this pull request Nov 8, 2018
#66012-#69268-upstream-release-1.11

Automated cherry pick of #68874: update vendor #66012: ipvs support graceful termination #69268: fix bugs introduced by ipvs graceful termination
9 participants