ipvs proxier doesn't respect graceful termination #57841
Comments
/cc @m1093782566 |
@jsravn: GitHub didn't allow me to assign the following users: sig-node. Note that only kubernetes members can be assigned. |
/sig network |
@jsravn You can assign any IPVS related issue to me directly :) /assign /area ipvs |
I have a question:
Why? Currently there is no way to pass the weight to IPVS proxier, so we set weight to 1 by default. What's the advantage of setting weight to 0? Will it resolve this issue? |
@m1093782566 Setting weight to 0 puts the real server into "maintenance mode" so to speak. The scheduler will stop sending new connections to it, but existing connections remain active. It's the correct way to gracefully remove a server from IPVS as far as I know. So syncEndpoint on endpoint delete should:
This simulates the same behavior in iptables where TCP connections time out after some period of time rather than being forcefully removed by the proxier. |
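The "maintenance mode" behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `realServer`, `virtualServer`, `schedule`, and `drain` are hypothetical names invented for this example, not the kube-proxy or IPVS API, and the round-robin loop merely stands in for a real IPVS scheduler. It shows that a real server with weight 0 receives no new connections while the connections it already holds are left untouched.

```go
package main

import "fmt"

// realServer is a hypothetical stand-in for an IPVS destination.
type realServer struct {
	addr        string
	weight      int
	activeConns int
}

// virtualServer is a hypothetical stand-in for an IPVS service
// that uses simple round-robin scheduling.
type virtualServer struct {
	servers []*realServer
	next    int
}

// schedule picks a real server for a NEW connection, skipping any
// server whose weight is 0. It returns nil if no server is eligible.
func (v *virtualServer) schedule() *realServer {
	for i := 0; i < len(v.servers); i++ {
		rs := v.servers[(v.next+i)%len(v.servers)]
		if rs.weight > 0 {
			v.next = (v.next + i + 1) % len(v.servers)
			rs.activeConns++
			return rs
		}
	}
	return nil
}

// drain sets the weight of the named server to 0 instead of deleting it,
// so its existing connections are not affected.
func (v *virtualServer) drain(addr string) {
	for _, rs := range v.servers {
		if rs.addr == addr {
			rs.weight = 0
		}
	}
}

func main() {
	vs := &virtualServer{servers: []*realServer{
		{addr: "10.0.0.1:80", weight: 1},
		{addr: "10.0.0.2:80", weight: 1},
	}}
	vs.schedule() // first connection lands on 10.0.0.1:80
	vs.schedule() // second connection lands on 10.0.0.2:80

	vs.drain("10.0.0.1:80")
	for i := 0; i < 3; i++ {
		fmt.Println("new connection ->", vs.schedule().addr)
	}
	fmt.Println("connections still open on drained server:", vs.servers[0].activeConns)
}
```

In this model, deleting the entry outright would instead discard the `activeConns` state, which corresponds to the dropped connections reported in this issue.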
Great comment! I'll try it and fix this if it works. Thanks |
Hopefully you can fix this for UDP connections too, since iptables proxier suffers from a bug where it drops the udp connection immediately (causing errors when kube-dns is restarted for instance...). It'd be nice if the ipvs proxier could handle that better. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle stale |
@m1093782566 Are you actively working on this? This is something that definitely needs to be fixed before IPVS GA. |
I think the iptables fix is at #60074. For IPVS we might need to implement a similar change. |
I slightly disagree... I don't think this issue is a big enough deal to block IPVS GA (it only affects UDP connections), and the iptables proxy has the same issue. I am looking at the issue, though. |
I believe this affects TCP connections as well, since it immediately removes the IPVS entry as I described. iptables doesn't have this problem (except for UDP). |
@m1093782566 @rramkumar1 is this a blocker for GA? This seems like a serious gap in functionality from iptables mode especially with IPVS becoming the new default. Specifically referring to dropping TCP connections. |
I don't think this issue should be a blocker for IPVS GA. Per discussion, the graduation criteria are: a) CIs are green; b) the necessary documents are available. Please note that iptables is still the default mode even though IPVS has become GA. Anyway, my team will take a look at this issue. |
@jhorwit2 at this point, I don’t think this should block GA. However, we should prioritize releasing a fix for this in a patch release.
|
Problem is we don't know whether this is what we needed. I mean, some people like this graceful termination, some people don't. |
I think it can be an interesting topic for the next sig-network call? |
@rramkumar1 I don't see how not supporting graceful termination of applications with ipvs gives users what they expect from the kubernetes platform in terms of production-ready solutions. This bug alone will block us from using the feature, since it would start causing noticeable impact immediately. IMO, if it isn't fixed prior to GA, then this lack of functionality should be made more visible in docs and release notes. |
@jhorwit2 I am not opposed to adding this caveat to release notes/docs. However, given the current state, I'm not sure it's worth blocking GA for one more whole cycle just for this when we can push a fix in a patch release. @m1093782566 if possible, can your team prioritize coming up with a fix that maybe we can squeeze into 1.11.0 rather than wait for the first patch? |
Sure, my team is looking at this issue. Will let you all know if we make any progress. |
I think it will break a lot of people's expectations if IPVS doesn't support graceful termination with at least TCP - this is the current behavior with iptables and userspace mode. |
Thank you for the release note. It saved me a bunch of time testing something that was not ready yet. |
@jsravn @m1093782566 @rramkumar1 @jhorwit2 Hi, guys. Recently I have been testing this issue and got some results. Here are my test steps:
So, according to my test, I think the ipvs proxier should have graceful termination for long-lived connections. If you have any questions or suggestions about the test process, please let me know. I'd also like people to suggest other test processes. |
I discussed this with @Lion-Wei on slack, and I tested it myself to confirm. I'll repeat my notes here: Even though Start http server exposed as a service (replace nodeSelector as appropriate):
Start a client shell, on a separate node:
Then:
If you set a weight of 0 instead of removing the real server, in step (4) the request will succeed. It also works in iptables mode. I hope this helps. |
@jsravn what you described sounds like what should happen when https://www.kernel.org/doc/Documentation/networking/ipvs-sysctl.txt |
When you set |
@Lion-Wei do you need any help w/ the dev side of this? I can probably find some time for it. Although, it's still not clear to me what the proposed solution is. Either
|
@jsravn Hi, I already sent a PR to fix this issue. |
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
Upon removing an endpoint, the ipvs proxier immediately deletes the ipvs real server, causing all connections to get dropped.
What you expected to happen:
It should allow the terminating pod to gracefully close connections, just like the iptables proxier.
How to reproduce it (as minimally and precisely as possible):
while :; do echo -e "GET / HTTP/1.1\nhost: $host\n\n"; sleep 5; echo; done | telnet $serviceip 80
Anything else we need to know?:
The ipvs proxier should instead be setting weight to 0, then reaping the stale real servers after some time period (that should be greater than any pod's graceful termination time). This may also fix the existing bug around UDP connections getting dropped prematurely (#45976).
Environment:
Kubernetes version (kubectl version): tested on 1.8, but same issue in 1.9 afaict
Kernel (uname -a): 4.4.0