kube-proxy iptables random routing is inefficient #64580

Closed
bfleming-ciena opened this issue May 31, 2018 · 8 comments
Labels
sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@bfleming-ciena

bfleming-ciena commented May 31, 2018

I've learned recently that userspace routing was replaced with iptables routing in k8s, and iptables uses random pod selection for load balancing.

This means that most of the time we have idle pods. If it were round-robin, requests would be guaranteed to distribute evenly.

Is this random selection the only option we have available right now?
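For context on the random selection mentioned above: as far as I understand it, the iptables mode builds a chain of rules using the `statistic` match in random mode, where each rule is tried in order with a probability chosen so that every backend ends up equally likely. The sketch below illustrates that arithmetic only; the function names are hypothetical and this is not kube-proxy's actual code.

```python
import random
from collections import Counter
from fractions import Fraction


def rule_probabilities(n):
    """Per-rule match probability for a chain of n backends.

    Rule i (0-based) is only evaluated if rules 0..i-1 did not match,
    so it matches with probability 1/(n - i). The last rule always matches.
    """
    return [Fraction(1, n - i) for i in range(n)]


def overall_probabilities(n):
    """Overall chance of each backend being chosen, including fall-through."""
    result = []
    reach = Fraction(1)  # chance that evaluation reaches the current rule
    for p in rule_probabilities(n):
        result.append(reach * p)
        reach *= 1 - p
    return result


def pick_backend(n, rng):
    """Walk the rule chain once and return the index of the chosen backend."""
    for i, p in enumerate(rule_probabilities(n)):
        if rng.random() < p:
            return i
    return n - 1  # not reached in practice: the last rule has probability 1


if __name__ == "__main__":
    print(rule_probabilities(4))
    print(overall_probabilities(4))
    rng = random.Random(0)
    print(Counter(pick_backend(4, rng) for _ in range(40_000)))
```

The per-rule probabilities differ (1/4, 1/3, 1/2, 1 for four backends), but the overall probability of each backend works out to the same value, which is why the selection is uniform per new connection.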

Thanks

/sig network

@k8s-ci-robot added the needs-sig and sig/network labels and removed the needs-sig label May 31, 2018
@islinwb
Contributor

islinwb commented Jun 1, 2018

@stonefury Maybe you should try ipvs.

@bfleming-ciena
Author

Oh wow, didn't know that was in the pipeline. It's beta I see, but still, is this to replace iptables mode, or just an alternate mode? Either way, thanks for that @islinwb

@dims
Member

dims commented Jun 2, 2018

@stonefury it will be GA in 1.11 - #58442

@islinwb
Contributor

islinwb commented Jun 2, 2018

@stonefury Currently IPVS is an alternate mode, but I guess someday most people will use it. It uses round-robin as the default load-balancing algorithm, and you can choose others. (See the ipvs docs 1, 2)

@thockin
Member

thockin commented Jun 6, 2018

A) Random means "equal probability of hitting any backend". If you have idle pods it suggests you either don't have a statistically significant number of connections or you're doing client affinity or something to defeat the randomizer.

B) Round-robin becomes a distributed decision - each node chooses independently of each other node. So to your backend service it's basically random anyway.
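Point B can be illustrated with a small simulation. This is a sketch under stated assumptions: each node keeps its own round-robin cursor, and each request lands on an arbitrary node. The function name, node counts, and request counts are hypothetical.

```python
import random
from collections import Counter


def simulate_per_node_round_robin(num_nodes, num_backends, num_requests, seed=0):
    """Return (hits per backend, sequence of backends as seen by the service)."""
    rng = random.Random(seed)
    # Every node has an independent cursor that starts at an arbitrary offset.
    cursors = [rng.randrange(num_backends) for _ in range(num_nodes)]
    hits = Counter()
    sequence = []
    for _ in range(num_requests):
        node = rng.randrange(num_nodes)  # the request arrives at some node
        backend = cursors[node]
        cursors[node] = (backend + 1) % num_backends  # advance that node only
        hits[backend] += 1
        sequence.append(backend)
    return hits, sequence


if __name__ == "__main__":
    hits, sequence = simulate_per_node_round_robin(
        num_nodes=5, num_backends=3, num_requests=20
    )
    print(sequence)  # usually not a clean 0, 1, 2, 0, 1, 2 rotation
    print(hits)
```

With a single node the sequence is a strict rotation. With several nodes, each rotating independently, the interleaved sequence the backends observe no longer follows a fixed order, which is the sense in which it looks random to the service.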

Anyway ipvs mode is going GA in 1.11, so please feel free to try it out :)

@thockin thockin closed this as completed Jun 6, 2018
@bfleming-ciena
Author

@thockin - Yes, I later learned that the dev team uses persistent connections to the microservice, so they only make a small number of initial connections, and statistically those were not evenly distributed. I had them increase the front end so it would make more connections. So this ultimately was an implementation issue on the dev side, though having round-robin would've made this a non-issue for them.
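The effect of a small number of persistent connections can be estimated with a short calculation. If each of c connections independently picks one of n backends uniformly, a given backend is missed by all of them with probability (1 - 1/n)^c, so the expected number of idle backends is n(1 - 1/n)^c. The sketch below computes that and cross-checks it with a simulation; the backend and connection counts are hypothetical, not figures from this thread.

```python
import random


def expected_idle_backends(num_backends, num_connections):
    """Expected number of backends that receive no connection at all."""
    p_missed = (1 - 1 / num_backends) ** num_connections
    return num_backends * p_missed


def simulated_idle_backends(num_backends, num_connections, trials=10_000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    total_idle = 0
    for _ in range(trials):
        used = {rng.randrange(num_backends) for _ in range(num_connections)}
        total_idle += num_backends - len(used)
    return total_idle / trials


if __name__ == "__main__":
    for connections in (5, 20, 100):
        print(
            connections,
            round(expected_idle_backends(10, connections), 4),
            round(simulated_idle_backends(10, connections), 4),
        )
```

Under these assumptions, 5 long-lived connections across 10 backends leave about 6 backends idle on average, while 100 connections leave almost none idle, which matches the observation that making more connections evened out the load.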

Thanks all

@Jeffwan
Contributor

Jeffwan commented Jul 14, 2018

@thockin

I use a socket to connect to a cluster IP that has many backend pods listening on the same port. I notice that random distribution is not guaranteed at all. Sticky sessions are disabled at the Service level (the default), but I am wondering whether the iptables proxy mode has something built in that reuses the address of a backend pod.

I send ~1000 requests and they all go to the same pod.
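One possible explanation, assuming the client sends all of its requests over a single TCP connection (the thread does not confirm this): NAT rules are evaluated for the first packet of a connection, and connection tracking then sends every later packet of that connection to the same destination. The selection is therefore per connection, not per request. The class below is a hypothetical model of that behaviour, not kube-proxy or kernel code.

```python
import random
from collections import Counter


class ConnectionTrackingProxy:
    """Toy model: pick a backend once per connection, then reuse it."""

    def __init__(self, backends, seed=0):
        self._backends = list(backends)
        self._rng = random.Random(seed)
        self._flows = {}  # connection id -> backend chosen for that connection

    def route(self, connection_id):
        # Only the first packet of a connection triggers a random choice.
        if connection_id not in self._flows:
            self._flows[connection_id] = self._rng.choice(self._backends)
        return self._flows[connection_id]


if __name__ == "__main__":
    proxy = ConnectionTrackingProxy(["pod-a", "pod-b", "pod-c"])

    # 1000 requests over one reused connection: a single pod gets everything.
    print(Counter(proxy.route("conn-0") for _ in range(1000)))

    # 1000 requests, each on a fresh connection: spread across pods.
    print(Counter(proxy.route(f"fresh-{i}") for i in range(1000)))
```

If this is what is happening, opening a new connection per request (or using several connections) would spread the load, at the cost of extra connection setup.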

@zghnr1993

I'm wondering too: why do many requests go to the same pod?


7 participants