
Increasing weave-net pod resource request and limit #8113

Closed
ysaakpr opened this issue Dec 15, 2019 · 12 comments

@ysaakpr

ysaakpr commented Dec 15, 2019

We are using Weave Net as the CNI plugin, and at times we have found the weave pods getting restarted. The most likely reason is that the current resource request and limit values are 200MB.

I can see that the weave-net pod limits memory to 200MB, while the weave container's maximum memory usage is reported as 136MB and the weave-npc container's maximum as 98MB, which sum to 234MB, more than the current limit. It would be better to raise the request and limit to a higher value such as 300MB, and if that is not viable, to at least raise the limit.

Also, why not make the network addon resource configuration editable through the kops CLI and persist it to the config store (S3)?
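
In the meantime, a possible stop-gap is to raise the limit directly on the running DaemonSet. This is only a sketch: it assumes the DaemonSet is named weave-net in kube-system, and kops' addon manager may revert manual edits on the next cluster update.

    # bump memory request and limit for the weave container only;
    # weave-npc keeps its existing values
    kubectl set resources daemonset weave-net -n kube-system \
      -c weave --requests=memory=300Mi --limits=memory=300Mi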

@mmerrill3

mmerrill3 commented Dec 16, 2019

What version of Weave? Version 2.6.0 has tremendous performance improvements and critical bug fixes for IP management conflicts when nodes are deleted.

@ysaakpr
Copy link
Author

ysaakpr commented Dec 16, 2019

I am using the version that ships by default with kops 1.14: 2.5.2.
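
For reference, one way to confirm which Weave image a cluster is actually running (assuming the default weave-net DaemonSet in kube-system):

    # print the container images used by the weave-net DaemonSet
    kubectl -n kube-system get ds weave-net \
      -o jsonpath='{.spec.template.spec.containers[*].image}'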

@mmerrill3

Are all of the connections in your cluster using fastdp and not sleeve? You can check whether it's all fastdp by running:

for i in $(kubectl get pods -n kube-system | grep weave | awk '{ print $1}'); do
  kubectl get pods $i -o wide -n kube-system
  kubectl exec -n kube-system $i -c weave -- /home/weave/weave --local status connections
done

@ysaakpr
Author

ysaakpr commented Dec 16, 2019

@mmerrill3 After running the above command, I see all connections using fastdp (though I am not sure what that is), and connections are established to every node except the node itself, which is expected.

@mmerrill3

mmerrill3 commented Dec 16, 2019

fastdp is handled in kernel space, whereas sleeve is userspace. sleeve is the fallback implementation used if there is anything wrong with the IPAM ring, and it is much more CPU and memory intensive. If you see fastdp, you don't have that issue.

I set my weave DaemonSet to use the following resource limits. Note that I installed Weave manually after setting up the cluster, rather than through kops.

kubectl describe ds weave-net -n kube-system
Name:           weave-net
Selector:       name=weave-net
Node-Selector:  <none>
Labels:         name=weave-net
Annotations:    deprecated.daemonset.template.generation: 11
                kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"name":"weave-net"},"name":"weave-net","namespace":"kube...
Desired Number of Nodes Scheduled: 27
Current Number of Nodes Scheduled: 27
Number of Nodes Scheduled with Up-to-date Pods: 27
Number of Nodes Scheduled with Available Pods: 27
Number of Nodes Misscheduled: 0
Pods Status:  27 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           name=weave-net
  Service Account:  weave-net
  Containers:
   weave:
    Image:      weaveworks/weave-kube:2.6.0
    Port:       <none>
    Host Port:  <none>
    Command:
      /home/weave/launch.sh
    Limits:
      cpu:     200m
      memory:  256Mi
    Requests:
      cpu:     50m
      memory:  128Mi
    Liveness:  http-get http://127.0.0.1:6784/status delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:
      HOSTNAME:       (v1:spec.nodeName)
      IPALLOC_RANGE:  100.96.0.0/11
    Mounts:
      /host/etc from cni-conf (rw)
      /host/home from cni-bin2 (rw)
      /host/opt from cni-bin (rw)
      /host/var/lib/dbus from dbus (rw)
      /lib/modules from lib-modules (rw)
      /run/xtables.lock from xtables-lock (rw)
      /weavedb from weavedb (rw)
   weave-npc:
    Image:      weaveworks/weave-npc:2.6.0
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:     100m
      memory:  128Mi
    Requests:
      cpu:     10m
      memory:  64Mi
    Environment:
      HOSTNAME:  (v1:spec.nodeName)
    Mounts:
      /run/xtables.lock from xtables-lock (rw)
  Volumes:
   weavedb:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/weave
    HostPathType:
   cni-bin:
    Type:          HostPath (bare host directory volume)
    Path:          /opt
    HostPathType:
   cni-bin2:
    Type:          HostPath (bare host directory volume)
    Path:          /home
    HostPathType:
   cni-conf:
    Type:          HostPath (bare host directory volume)
    Path:          /etc
    HostPathType:
   dbus:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/dbus
    HostPathType:
   lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
   xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
Priority Class Name:  system-node-critical
Events:               <none>
The pods I am running are all consuming about 100MB of memory, though I'm running version 2.6.0. There is a PR (#7898) to upgrade to 2.6.0 that needs to be updated so kops uses 2.6.0 out of the box, with the performance improvements. I think with that version you won't have to worry about the resources, since it performs so much better. We saw roughly a 50% decrease in memory as a result; see the screenshot below:
[Screenshot: Screen Shot 2019-12-04 at 10 34 26 AM]

I can see how adding CNI-specific overrides would be pretty verbose in the kops cluster configuration.
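
For anyone editing the manifest by hand, the relevant stanza of that DaemonSet spec corresponds to roughly the following (an excerpt reconstructed from the describe output above, not the complete manifest):

    # resource settings for the two containers in the weave-net pod
    containers:
      - name: weave
        image: weaveworks/weave-kube:2.6.0
        resources:
          requests:
            cpu: 50m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
      - name: weave-npc
        image: weaveworks/weave-npc:2.6.0
        resources:
          requests:
            cpu: 10m
            memory: 64Mi
          limits:
            cpu: 100m
            memory: 128Mi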

@bboreham
Contributor

Whilst we did work hard on reducing memory usage in Weave Net 2.6, it should still be expected to grow with the number of nodes, so I sympathise with the suggestion that it should be configurable.

@mmerrill3

I can look into adding these overrides to this definition file and generating the Go bindings for them.

https://github.com/kubernetes/kops/blob/c36470f227ed8c11b30f1072510f8adc96727579/k8s/crds/kops.k8s.io_clusters.yaml

mmerrill3 pushed a commit to mmerrill3/kops that referenced this issue Dec 28, 2019
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
mmerrill3 pushed a commit to mmerrill3/kops that referenced this issue Dec 28, 2019
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
k8s-ci-robot added a commit that referenced this issue Jan 3, 2020

Adding ability to configure resources for weave (#8113)
hakman pushed a commit to hakman/kops that referenced this issue Jan 4, 2020
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
@mmerrill3

Merged, this can be closed.

matt0x6F pushed a commit to matt0x6F/kops that referenced this issue Feb 3, 2020
Signed-off-by: mmerrill3 <michael.merrill@vonage.com>
@mmerrill3

@ysaakpr the functionality is released in kops 1.18 alpha
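
With that release, the overrides go in the cluster spec under networking.weave, edited via kops edit cluster and applied with kops update cluster followed by a rolling update. A sketch of the shape of the config (field names taken from the kops 1.18 docs; verify against the version in use):

      networking:
        weave:
          # requests/limits for the weave container
          memoryRequest: 300Mi
          memoryLimit: 300Mi
          cpuRequest: 50m
          # requests/limits for the weave-npc container
          npcMemoryRequest: 64Mi
          npcMemoryLimit: 128Mi
          npcCPURequest: 50m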

@johngmyers
Member

/close

@k8s-ci-robot
Contributor

@johngmyers: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jim-barber-he
Contributor

The above improvement was merged into Kops 1.18 alpha.
Will it be backported to the current stable 1.15.x series?
I saw no mention of it in the 1.15 release notes.
