
[RFE] Certify/document enabling ipvs in k8s 1.11 for rancher 1.6 #15718

Closed
aemneina opened this issue Sep 19, 2018 · 12 comments
Labels
internal kind/enhancement Issues that improve or augment existing functionality version/1.6

Comments

aemneina commented Sep 19, 2018

Rancher versions:
rancher/server or rancher/rancher:v1.6.22

Feature:
This should be straightforward to configure: update kube-proxy with the flags that enable ipvs, set the ipvs scheduler, and clean up ipvs state. I'm looking for this feature to be validated or certified for Rancher. Points for making it configurable via the k8s catalog template. The main problem: in DNS-query-intensive environments, the combination of DNS + iptables + conntrack means some fraction of DNS queries end up in DNS purgatory.
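As a sketch, the kube-proxy flags involved (flag names are from the upstream kube-proxy CLI of that era; the values here are illustrative, not a validated Rancher configuration):

```shell
# Illustrative kube-proxy flags for ipvs mode (example values, not validated):
#   --proxy-mode=ipvs      switch from the default iptables mode to ipvs
#   --ipvs-scheduler=rr    ipvs scheduling algorithm (rr = round robin)
#   --cleanup-ipvs=true    also flush ipvs rules when kube-proxy cleans up
kube-proxy --proxy-mode=ipvs --ipvs-scheduler=rr --cleanup-ipvs=true
```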

Background
https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts

@aemneina aemneina added kind/enhancement Issues that improve or augment existing functionality version/1.6 labels Sep 19, 2018
@deniseschannon deniseschannon added this to the v1.6 - Next milestone Sep 21, 2018
@alena1108 alena1108 assigned loganhz and unassigned moelsayed Oct 26, 2018

aemneina commented Nov 8, 2018

I think we can drop this feature if this related ticket is addressed: #16018
@alena1108 what do you think?

@alena1108

@aemneina based on the initial research, I don't think tuning DNS would be trivial. We can try the ipvs solution first, as it seems more straightforward.

@aemneina

@alena1108 cool, can we check this out for both 1.6 and 2.x? Let me know if you want another ticket to track 2.x.


gitlawr commented Nov 20, 2018

For 1.6:
To switch to ipvs mode, we need to:

  1. Add ipset to our kubernetes image, which is addressed by kubernetes-package#169 (add ipset utility).
  2. Make it configurable via the k8s catalog template. We've added extra-args support for all kubernetes components, but volume mounts are not configurable. This is addressed by rancher-catalog#1191 (v1.11.5-rancher1-2: support ipvs proxy mode).

With the above changes, to use ipvs proxy mode, users need to:

  1. ensure the relevant kernel modules are compiled and loaded on the hosts (see reference);
  2. activate the ipvs option in the k8s catalog template.
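A quick way to check step 1 on a host (the module list follows the upstream ipvs proxier prerequisites; this script is my own sketch, not from the thread):

```shell
# Check whether the kernel modules needed by kube-proxy's ipvs mode are loaded.
# nf_conntrack_ipv4 applies to kernels before 4.19; newer kernels use nf_conntrack.
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do
  if grep -qw "^${mod}" /proc/modules 2>/dev/null; then
    echo "loaded: ${mod}"
  else
    echo "missing: ${mod} (load with: sudo modprobe ${mod})"
  fi
done
```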

For 2.0:
ipset is included in hyperkube (see) and both extra-args and extra-mounts are configurable via RKE, so users can simply edit the cluster YAML and add the following configuration before provisioning:

services:  
  kubeproxy:
    extra_args:
      proxy-mode: ipvs
    extra_binds:
      - "/lib/modules:/lib/modules"


gitlawr commented Nov 20, 2018

@aemneina Do you have steps to reproduce the DNS packet-dropping issue so I can verify it? I've tried some tools to mock a loaded cluster but had no luck.


aemneina commented Nov 20, 2018 via email


gitlawr commented Nov 27, 2018

To reproduce the upstream DNS issue:

  1. Set up a cluster with 1 master + 2 workers.
  2. Run workloads that send sufficient DNS queries (scale the replicas if needed):
     kubectl run hammer --image=aemneina/dns-hammer:v0.0.7 --replicas=10
  3. Do the check:
     kubectl run probe --image=lawr/dns-probe:v0.0.1
     kubectl logs -f $(kubectl get --no-headers=true pods -l run=probe -o custom-columns=:metadata.name)
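The probe's output below suggests it simply times each lookup and flags anything hitting the resolver's 5-second retry boundary. A minimal stand-in for that loop (a sketch only; `probe_dns`, the host name, and the threshold are my own choices, not taken from the `lawr/dns-probe` image):

```python
import socket
import time

def probe_dns(host="localhost", attempts=5, slow_threshold=5.0):
    """Resolve `host` repeatedly, flagging lookups at or above `slow_threshold` seconds."""
    slow = []
    for i in range(1, attempts + 1):
        start = time.monotonic()
        socket.getaddrinfo(host, 80)  # exercises the system resolver, like the probe pod
        elapsed = time.monotonic() - start
        if elapsed >= slow_threshold:
            slow.append((i, elapsed))
            print(f"#{i} slow: {elapsed:.6f}")
        else:
            print(f"trying #{i}")
    return slow
```

On a healthy resolver every attempt prints `trying #N`; in a cluster hitting the conntrack race, some attempts stall for a full retry interval and get flagged, matching the 5s entries in the output below.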

Example outputs:

...
trying #1385
trying #1386
#1387 slow: 5.073300 09:24:39
trying #1387
trying #1388
#1389 slow: 5.072093 09:24:45
trying #1389
trying #1390
trying #1391
#1392 slow: 5.073432 09:24:51

Enabling IPVS mode does not solve the upstream DNS issue; we still see 5s timeouts in clusters using IPVS mode.


aemneina commented Dec 4, 2018

@superseb @alena1108, @Oats87 pointed me to Node Local DNS, which should be a solid solution. It forces DNS lookups over TCP. I believe it's alpha in 1.13.x.
kubernetes/kubernetes#70555
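For reference, one existing way to push DNS onto TCP without Node Local DNS is the glibc resolver's `use-vc` option, injectable per-pod via `dnsConfig`. This is a generic Kubernetes sketch, not something validated in this thread; the pod name and image are placeholders, and `use-vc` only affects glibc-based images (not musl/Alpine):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tcp-dns-example   # hypothetical name
spec:
  containers:
    - name: app
      image: nginx        # placeholder image
  dnsConfig:
    options:
      - name: use-vc      # glibc resolver option: use TCP for DNS lookups
```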


alena1108 commented Dec 4, 2018

@aemneina that would be the best solution, given it's officially supported by k8s. And k8s 1.13 was released today, 12/03/2018 :)

@loganhz loganhz removed their assignment Dec 11, 2018
@alena1108

No need to validate for now as ipvs support was added to address #16018

@deniseschannon

With the release of Rancher 2.0, development on v1.6 is limited to critical bug fixes and security patches.

@deniseschannon

As 1.6 is EOL as of June 30, 2020, we are closing 1.6 issues.
