Unable to change cluster CIDR #93

Closed
sahlex opened this Issue Feb 28, 2019 · 10 comments

@sahlex

sahlex commented Feb 28, 2019

When using k3s server, it uses 10.42.0.0/16 as the cluster CIDR by default. This clashes with our network setup, so I tried to change it by passing a --cluster-cidr 10.10.0.0/16 parameter (as indicated by the k3s help). Still, the startup output tells me that k3s is ignoring the parameter:

INFO[2019-02-28T16:19:44.756164411+01:00] Running kube-controller-manager --kubeconfig /var/lib/rancher/k3s/server/cred/kubeconfig-system.yaml --service-account-private-key-file /var/lib/rancher/k3s/server/tls/service.key --allocate-node-cidrs --cluster-cidr 10.42.0.0/16 --root-ca-file /var/lib/rancher/k3s/server/tls/token-ca.crt --port 0 --secure-port 0 --leader-elect=false

To Reproduce

  1. Run k3s server --cluster-cidr 10.10.0.0/16
  2. Observe that 10.42.0.0/16 still appears in the output

Expected behavior
The value given to k3s should be forwarded to the --cluster-cidr parameter when kube-controller-manager is called.
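
For quick verification, one hedged way to confirm which range actually reaches kube-controller-manager is to grep the startup output for the "Running kube-controller-manager" line quoted above (journalctl applies to a systemd install; for a foreground run, grep stdout instead):

$ journalctl -u k3s | grep 'Running kube-controller-manager'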

@ibuildthecloud ibuildthecloud added this to Backlog in K3S Development Feb 28, 2019

@saphoooo

saphoooo commented Mar 3, 2019

+1
Also, better documentation on how to run a different network addon would help, because I still don't know whether I have to use k3s server --no-deploy flannel --disable-agent and run each agent with the --no-flannel flag, or not.

@ibuildthecloud

Member

ibuildthecloud commented Mar 4, 2019

I'll improve the docs. To disable flannel you just need to pass --no-flannel to the agent. When running k3s server, it will automatically run the agent, and due to #73 there is no way to disable flannel there. So you have to run the agent separately (pass --disable-agent to the server command), and each agent needs --no-flannel set.
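
A minimal sketch of that split (the server address and NODE_TOKEN below are placeholders; the agent join syntax follows the hint k3s prints at startup):

# on the server host: control plane only, without the embedded agent
$ k3s server --disable-agent

# on each agent host: join without flannel so another CNI can be used
$ k3s agent --no-flannel -s https://<server-ip>:6443 -t ${NODE_TOKEN}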

@ibuildthecloud

Member

ibuildthecloud commented Mar 4, 2019

@saphoooo @sahlex This should work now in v0.2.0-rc3. Could you please test it out? Thanks.

@ibuildthecloud ibuildthecloud moved this from Backlog to Needs review in K3S Development Mar 4, 2019

@ibuildthecloud ibuildthecloud moved this from Needs review to Reviewer approved in K3S Development Mar 4, 2019

@ibuildthecloud ibuildthecloud moved this from Reviewer approved to Testing in K3S Development Mar 4, 2019

@ibuildthecloud ibuildthecloud moved this from Testing to Reviewer approved in K3S Development Mar 4, 2019

@saphoooo

saphoooo commented Mar 4, 2019

Works fine with canal:

$ k3s kubectl describe po nginx-7cdbd8cdc9-ghtm8
Name:               nginx-7cdbd8cdc9-ghtm8
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               oidc2/192.168.99.132
Start Time:         Mon, 04 Mar 2019 16:33:37 -0500
Labels:             pod-template-hash=7cdbd8cdc9
                    run=nginx
Annotations:        <none>
Status:             Running
IP:                 10.244.1.4

I don't know if it's related, but I get an error message on the agents:

I0304 16:39:16.540478     941 log.go:172] suppressing panic for copyResponse error in test; copy error: context canceled

@saphoooo

saphoooo commented Mar 4, 2019

Apparently I have an error with traefik too:

E0304 17:22:17.259487    5375 pod_workers.go:190] Error syncing pod d8233647-3ecb-11e9-a887-7e9185699db4 ("svclb-traefik-785fb97589-dglld_kube-system(d8233647-3ecb-11e9-a887-7e9185699db4)"), skipping: [failed to "StartContainer" for "http" with CrashLoopBackOff: "Back-off 20s restarting failed container=http pod=svclb-traefik-785fb97589-dglld_kube-system(d8233647-3ecb-11e9-a887-7e9185699db4)"
, failed to "StartContainer" for "https" with CrashLoopBackOff: "Back-off 20s restarting failed container=https pod=svclb-traefik-785fb97589-dglld_kube-system(d8233647-3ecb-11e9-a887-7e9185699db4)"

@bjd

bjd commented Mar 4, 2019

@ibuildthecloud Not the original reporter, but I happened to run into the same issue with a conflict on the 10.42.0.0 network.
I've installed the rc3 release and added --cluster-cidr=10.99.0.0/16 to the k3s server command line.
It now crashes on startup with the following:

-- Logs begin at ma 2019-03-04 23:14:34 CET, end at ma 2019-03-04 23:17:54 CET. --
mrt 04 23:14:46 master systemd[1]: Starting Lightweight Kubernetes...
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.248561882+01:00" level=info msg="Starting k3s v0.2.0-rc3 (6de915d)"
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.259088385+01:00" level=info msg="Running kube-apiserver --watch-cache=false --cert-dir /var/lib/rancher/k3s/server/tls/temporary-certs --allow-privileged=true --authorization-mode Node,RBAC --service-account-signing-key-file /var/lib/rancher/k3s/server/tls/service.key --service-cluster-ip-range 10.43.0.0/16 --advertise-port 6445 --advertise-address 127.0.0.1 --insecure-port 0 --secure-port 6444 --bind-address 127.0.0.1 --tls-cert-file /var/lib/rancher/k3s/server/tls/localhost.crt --tls-private-key-file /var/lib/rancher/k3s/server/tls/localhost.key --service-account-key-file /var/lib/rancher/k3s/server/tls/service.key --service-account-issuer k3s --api-audiences unknown --basic-auth-file /var/lib/rancher/k3s/server/cred/passwd --kubelet-client-certificate /var/lib/rancher/k3s/server/tls/token-node.crt --kubelet-client-key /var/lib/rancher/k3s/server/tls/token-node.key"
mrt 04 23:14:49 master systemd[1]: Started Lightweight Kubernetes.
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.501520055+01:00" level=info msg="Running kube-controller-manager --kubeconfig /var/lib/rancher/k3s/server/cred/kubeconfig-system.yaml --service-account-private-key-file /var/lib/rancher/k3s/server/tls/service.key --allocate-node-cidrs --cluster-cidr 10.99.0.0/16 --root-ca-file /var/lib/rancher/k3s/server/tls/token-ca.crt --port 10252 --address 127.0.0.1 --secure-port 0 --leader-elect=false"
mrt 04 23:14:49 master k3s[4972]: Flag --address has been deprecated, see --bind-address instead.
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.520724757+01:00" level=info msg="Running kube-scheduler --kubeconfig /var/lib/rancher/k3s/server/cred/kubeconfig-system.yaml --port 10251 --address 127.0.0.1 --secure-port 0 --leader-elect=false"
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.801230868+01:00" level=info msg="Listening on :6443"
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.803913104+01:00" level=info msg="Writing manifest: /var/lib/rancher/k3s/server/manifests/coredns.yaml"
mrt 04 23:14:49 master k3s[4972]: time="2019-03-04T23:14:49.806556460+01:00" level=info msg="Writing manifest: /var/lib/rancher/k3s/server/manifests/traefik.yaml"
mrt 04 23:14:50 master k3s[4972]: time="2019-03-04T23:14:50.010316963+01:00" level=info msg="Node token is available at /var/lib/rancher/k3s/server/node-token"
mrt 04 23:14:50 master k3s[4972]: time="2019-03-04T23:14:50.010354933+01:00" level=info msg="To join node to cluster: k3s agent -s https://10.42.0.10:6443 -t ${NODE_TOKEN}"
mrt 04 23:14:50 master k3s[4972]: time="2019-03-04T23:14:50.089738860+01:00" level=info msg="Wrote kubeconfig /etc/rancher/k3s/k3s.yaml"
mrt 04 23:14:50 master k3s[4972]: time="2019-03-04T23:14:50.089769922+01:00" level=info msg="Run: k3s kubectl"
mrt 04 23:14:50 master k3s[4972]: time="2019-03-04T23:14:50.089780387+01:00" level=info msg="k3s is up and running"
mrt 04 23:15:00 master k3s[4972]: F0304 23:15:00.809441    4972 controllermanager.go:203] error starting controllers: failed to mark cidr as occupied: cidr 10.42.0.0/24 is out the range of cluster cidr 10.99.0.0/16
mrt 04 23:15:00 master k3s[4972]: goroutine 229 [running]:
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/klog.stacks(0xc0001a4900, 0xc0004c68c0, 0xb5, 0x13a)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/klog.go:828 +0xd4
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/klog.(*loggingT).output(0x57e6f00, 0xc000000003, 0xc000604a50, 0x53471e5, 0x14, 0xcb, 0x0)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/klog.go:779 +0x306
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/klog.(*loggingT).printf(0x57e6f00, 0x3, 0x308ee17, 0x1e, 0xc00453f1b8, 0x1, 0x1)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/klog.go:678 +0x14b
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/klog.Fatalf(0x308ee17, 0x1e, 0xc00453f1b8, 0x1, 0x1)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/klog/klog.go:1207 +0x67
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app.Run.func1(0x35d2640, 0xc0000b6018)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:203 +0x590
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app.Run(0xc00187b3d8, 0xc00009c360, 0xc00189299c, 0xc001892950)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:213 +0x90a
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app.NewControllerManagerCommand.func1(0xc0019be280, 0xc0019eda00, 0x0, 0x10)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:99 +0x1e5
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).execute(0xc0019be280, 0xc0019d8000, 0x10, 0x1e, 0xc0019be280, 0xc0019d8000)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:760 +0x2cc
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc0019be280, 0xc000e8ef88, 0x1, 0x1)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:846 +0x2fd
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/vendor/github.com/spf13/cobra.(*Command).Execute(0xc0019be280, 0x22, 0xc000e8ef88)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/vendor/github.com/spf13/cobra/command.go:794 +0x2b
mrt 04 23:15:00 master k3s[4972]: github.com/rancher/k3s/pkg/daemons/control.controllerManager.func1(0xc0019d8000, 0x10, 0x1e, 0xc0019be280)
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/pkg/daemons/control/server.go:117 +0xbe
mrt 04 23:15:00 master k3s[4972]: created by github.com/rancher/k3s/pkg/daemons/control.controllerManager
mrt 04 23:15:00 master k3s[4972]: /go/src/github.com/rancher/k3s/pkg/daemons/control/server.go:115 +0x23e
mrt 04 23:15:00 master systemd[1]: k3s.service: main process exited, code=exited, status=255/n/a
mrt 04 23:15:01 master systemd[1]: Unit k3s.service entered failed state.
mrt 04 23:15:01 master systemd[1]: k3s.service failed.

The k3s server host is 10.42.0.10 (with two agents on 10.42.0.11 and 10.42.0.12, but they're not active at the moment).

@epicfilemcnulty

Contributor

epicfilemcnulty commented Mar 5, 2019

@ibuildthecloud I was able to use cilium as the CNI with v0.2.0-rc5, running the server with /usr/local/bin/k3s server --cluster-cidr=192.168.0.0/16 --no-flannel --no-deploy=traefik. Is there an option to also customize the --service-cluster-ip-range 10.43.0.0/16 that is passed to kube-apiserver?

@epicfilemcnulty

Contributor

epicfilemcnulty commented Mar 6, 2019

Well, it turns out there was an option, but it was not initialized; same story as with cluster-cidr:
#171
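
For reference, a hedged sketch of setting both ranges in one invocation; the --service-cidr flag name here is an assumption based on later k3s releases, so confirm it against k3s server --help for your version:

$ k3s server --cluster-cidr=192.168.0.0/16 --service-cidr=10.45.0.0/16 --no-flannel --no-deploy=traefik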

@bjd

bjd commented Mar 6, 2019

As far as my crash is concerned, it appears that k3s does not like the cluster-cidr being changed once k3s server has been run at least once. It does work when set on a fresh, never-run config.
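
For anyone hitting the same crash, a hedged workaround sketch, assuming a systemd install and that the previously allocated node CIDRs live under the server data directory seen in the logs above. Note this wipes all server state (certificates, node token, datastore), so back up anything you need first:

$ systemctl stop k3s
$ rm -rf /var/lib/rancher/k3s/server   # assumption: default data dir; this destroys existing cluster state
$ systemctl start k3s                  # starts fresh with --cluster-cidr already set in the service unit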

@cjellick cjellick added this to the v0.2.0 milestone Mar 8, 2019

@cjellick cjellick moved this from Reviewer approved to Testing in K3S Development Mar 8, 2019

@cjellick cjellick added the kind/bug label Mar 8, 2019

K3S Development automation moved this from Testing to Done Mar 9, 2019

@sahlex

Author

sahlex commented Mar 12, 2019

Works for me now! Although I was not able to add an agent node yet (but this may be due to another issue, like firewalling).
