weave-net CrashLoopBackOff for the second node #66
Comments
From @avkonst on October 5, 2016 14:14 Please, let me know if it needs to be reported on weave-net project. |
From @Bach1 on October 5, 2016 16:12 I reported a similar issue here: |
From @avkonst on October 5, 2016 19:44 I wonder if there are any working step-by-step instructions on how to get a Kubernetes cluster up and running on a set of virtual machines? I am happy to downgrade the version of Kubernetes if that is an option. I am totally confused by all of these options: kube-deploy, kube-up, and others I found from 3rd parties. |
From @andreagardiman on October 6, 2016 13:24 I have the same issue. I'm using VirtualBox to run 2 VMs based on a minimal CentOS 7 image. All VMs are attached to 2 interfaces, a NAT and a host-only network. I also tried the instructions for Calico and Canal, and I cannot make them work either. |
From @avkonst on October 6, 2016 21:45 It seems Calico has a similar issue:
What could I try to progress this issue further? |
From @tedstirm on October 12, 2016 0:20 Okay I ran into this exact same issue and here is how I fixed it. This problem seems to be due to kube-proxy looking at the wrong network interface. If you look at the kube-proxy logs on a worker node you will most likely see something like:
This is the wrong network interface. The kube-proxy should be looking at the master node's IP address not the NAT IP address. As far as I know the kube-proxy gets this value from the Kube API Server when starting up. If you look at the Kube API Server's documentation it states that if You will need to update your So for example: My master node's IP address is
I am not too sure if this is a valid long-term solution, because if the default kube-apiserver.json changes then those changes wouldn't be reflected by doing what I am doing. Ideally, I think the user would want some way to set these flags via kubeadm, or maybe the user should be responsible for parsing the JSON themselves. Thoughts? However, it still may be a good idea to update Step 2 of Installing Kubernetes on Linux with kubeadm to at least mention to users that they can update the kube component flags by modifying their JSON found at: |
From @errordeveloper on October 12, 2016 9:56 I am looking at this right now. We have a way of reproducing it with https://github.com/davidkbainbridge/k8s-playground, where we have already applied a fix that makes kubelet pick up the desired IP for pods in host network namespaces, however somehow |
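As a rough illustration of the "parse the JSON yourself" idea above, here is a minimal Python sketch that sets a flag in a pod manifest's container command. It assumes the manifest follows the usual Pod layout (spec.containers[n].command). The trimmed example manifest, the function name add_flag, and the address 192.0.2.10 are hypothetical placeholders, not values from this thread.

```python
import json


def add_flag(manifest: dict, flag: str, container_index: int = 0) -> dict:
    """Return a copy of a Pod manifest with `flag` set in the container command.

    An existing flag with the same name (the part before "=") is replaced,
    so running this twice does not duplicate the flag.
    """
    updated = json.loads(json.dumps(manifest))  # cheap deep copy of JSON data
    command = updated["spec"]["containers"][container_index]["command"]
    name = flag.split("=", 1)[0]
    command[:] = [arg for arg in command if arg.split("=", 1)[0] != name]
    command.append(flag)
    return updated


# Hypothetical, heavily trimmed manifest; a real kube-apiserver.json has more fields.
example = {
    "spec": {
        "containers": [
            {"name": "kube-apiserver", "command": ["kube-apiserver", "--secure-port=443"]}
        ]
    }
}

# 192.0.2.10 is a documentation-only placeholder for the master's reachable IP.
patched = add_flag(example, "--advertise-address=192.0.2.10")
print(json.dumps(patched, indent=2))
```

In practice the dict would be loaded from and written back to the manifest file with json.load and json.dump. The caveat from the comment above still applies: a hand-edited manifest will not pick up later changes to the defaults.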
From @chenzhiwei on October 12, 2016 10:11 I encountered this issue too. Adding --advertise-address when starting kube-apiserver solved this issue. |
From @errordeveloper on October 12, 2016 10:32 @chenzhiwei I've already started diving into the code, and you saved me time to realise how this part worked exactly, thank you so much! This looks like an easy fix for kubeadm, PR is coming shortly. |
From @errordeveloper on October 12, 2016 11:14 Ok, so it turns out that this flag is not enough, we still have an issue reaching |
From @avkonst on October 12, 2016 12:37 I am starting kubeadm with --advertise-address option: |
From @avkonst on October 12, 2016 12:39 "Ok, so it turns out that this flag is not enough, we still have an issue reaching kubernetes service IP. The simplest solution to this is to run kube-proxy with --proxy-mode=userspace. To enable this, you can use kubectl -n kube-system edit ds kube-proxy-amd64 && kubectl -n kube-system delete pods -l name=kube-proxy-amd64." @errordeveloper Could you, please, explain what flag are you setting, to what command in the getting started tutorial? |
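For readers who prefer not to edit the DaemonSet interactively, the quoted workaround could in principle be expressed as a JSON Patch (RFC 6902) document. The sketch below only builds that document. It assumes the kube-proxy flags live in the first container's command list of the DaemonSet pod template, which this thread does not confirm; the function name is hypothetical.

```python
import json


def userspace_proxy_patch(container_index: int = 0) -> str:
    """Build a JSON Patch that appends --proxy-mode=userspace to a container command.

    The trailing "-" in the path is the JSON Pointer convention for
    "append to the end of this array".
    """
    operations = [
        {
            "op": "add",
            # Assumption: flags are in `command`, not `args`, of this container.
            "path": f"/spec/template/spec/containers/{container_index}/command/-",
            "value": "--proxy-mode=userspace",
        }
    ]
    return json.dumps(operations)


patch = userspace_proxy_patch()
print(patch)
```

A document like this is the kind of input that kubectl patch accepts with --type=json. The kube-proxy pods would still need to be deleted afterwards, as in the quoted command, so that they restart with the new flag.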
From @errordeveloper on October 12, 2016 12:56 @avkonst see kubernetes/kubernetes#34607. Also, you can do this for now:
|
From @errordeveloper on October 12, 2016 12:57
We are going to fix this shortly. If you can hop on Slack, I'd be happy to help. |
From @errordeveloper on October 12, 2016 14:49
As this thread is getting quite noisy, here is a recap. First, find out what IP address you want to use on the master, it's probably the one on the second network interface. For this example I'll use Next, run Now, you want to append
And finally, you need to update flags in
|
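The first step of the recap, choosing the master IP on the second network interface, can be sketched in Python as follows. This is only an illustration under stated assumptions: the interface list is hypothetical, and 10.0.2.0/24 is assumed to be the VirtualBox NAT network (its usual default), which may differ on other setups.

```python
import ipaddress


def pick_advertise_address(interfaces, nat_network="10.0.2.0/24"):
    """Return the first (name, address) pair that is neither loopback nor on the NAT network.

    `interfaces` is an ordered list of (interface name, IPv4 address) pairs.
    Returns None if every address is loopback or NAT.
    """
    nat = ipaddress.ip_network(nat_network)
    for name, address in interfaces:
        ip = ipaddress.ip_address(address)
        if ip.is_loopback or ip in nat:
            continue  # other nodes cannot reach the master on these addresses
        return name, address
    return None


# Hypothetical addresses for a VM with a NAT and a host-only interface.
interfaces = [("lo", "127.0.0.1"), ("eth0", "10.0.2.15"), ("eth1", "192.168.56.10")]
choice = pick_advertise_address(interfaces)
print(choice)
```

With the hypothetical list above, the host-only address on eth1 is selected, matching the "second network interface" guess in the recap.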
From @errordeveloper on October 12, 2016 14:51 I'm working on docs for this. Also, @davidkbainbridge has provided a Vagrant+Ansible implementation, and latest fixes are in my fork. |
From @miry on October 12, 2016 19:47 @tedstirm @errordeveloper can you help me with a similar problem? I have multiple eth* interfaces. And I am using I could not access the instances from the cluster (for example, a redis that is hosted on a separate instance). I found that weave uses |
From @tedstirm on October 12, 2016 19:54 @errordeveloper I didn't have to update the |
From @errordeveloper on October 12, 2016 20:14
No, it uses So for Weave we have
How many VMs are you running on? If it's a single-node setup, you won't notice anything. If it's not, I'm quite curious and would like to take a closer look, could you share your Vagrantfile? |
From @geotro on October 13, 2016 0:6 @errordeveloper, after applying your changes I was able to get kube-dns and weave-net working! However, kube-scheduler-kubernetes won't run:
|
From @errordeveloper on October 13, 2016 0:42 @geotro sorry, that was a copy/paste bug, I've updated the comments now, it should have been |
From @geotro on October 13, 2016 0:50 Yeah I figured that out shortly after my comment. I have everything working as expected now. Thank you to everyone who participated in fixing this! :) |
From @errordeveloper on October 13, 2016 0:51 @geotro you are welcome! We should eliminate the need for these work-arounds soon, at least as far as we can get without total hacks. |
From @Bach1 on October 13, 2016 9:18 Thank you for the feedback. I'm getting closer to a working solution. With my basic Vagrant 2-node cluster (Centos) I now have weave up and running. The kube-dns service is however still not working.
kube-dns is still stuck in ContainerCreating state:
I experimented with both bridged network settings and private network settings but the problem still persists. |
From @errordeveloper on October 13, 2016 9:21 @Bach1 could you provide the output of |
From @miry on October 13, 2016 9:54
I thought it should be After updating the API server to use a specific IP address, weave could not connect to 100.64.0.1:443. Digital Ocean also has an Anchor IP on eth0 with mask 10.10.0.0/16:
Updated: So I have theory that |
From @Bach1 on October 13, 2016 10:3 @errordeveloper sure, see below. I assume the last cni-related errors propagate before weave is installed. The local NAT-interface address is still visible in
|
From @errordeveloper on October 13, 2016 11:2
You are right, it's that in the current release, but it changed in master.
Yes, there is an issue with that in some DO regions... you need to pass If you are online now, it'd be easy if you could find me on Slack. Thanks. |
From @errordeveloper on October 13, 2016 11:5 @Bach1 just as a sanity check, could you kill the DNS pod, i.e. |
From @Bach1 on October 13, 2016 13:22 @errordeveloper Hmm no luck. After deletion:
|
From @miry on October 14, 2016 8:58 resolved my issue via adding a routing rule to use eth1 for node machines to kubernetes service range ip. Example:
|
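The original routing command did not survive in this thread, so the following Python sketch is only a guess at the shape of such a rule, not @miry's actual command. It assumes the service range is 100.64.0.0/12 (chosen because it contains the service IP 100.64.0.1 mentioned earlier) and that eth1 is the interface the nodes should use.

```python
import ipaddress


def service_route_command(service_cidr: str, device: str) -> list:
    """Build an `ip route add` argument list that sends the service range via `device`."""
    # ip_network raises ValueError if the CIDR has host bits set, catching typos early.
    network = ipaddress.ip_network(service_cidr)
    return ["ip", "route", "add", str(network), "dev", device]


# Assumption: the cluster's service range; check your own cluster's configuration.
SERVICE_CIDR = "100.64.0.0/12"

command = service_route_command(SERVICE_CIDR, "eth1")

# Sanity check: the kubernetes service IP from this thread falls inside the range.
covers_service_ip = ipaddress.ip_address("100.64.0.1") in ipaddress.ip_network(SERVICE_CIDR)

print(" ".join(command))
```

The command is returned as a list so it could be passed to subprocess.run on each node. A route added this way does not persist across reboots unless it is also added to the distribution's network configuration.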
From @petergardfjall on October 18, 2016 6:24 Despite following the workaround steps ((1) specify That is, after installing the weave pod network,
@Bach1: did you ever resolve your problem? |
From @miry on October 18, 2016 13:50 @petergardfjall but weave-net was up 15m ago and dns last event was 25m ago. try to restart dns pod. |
From @errordeveloper on October 18, 2016 14:11 It would be easier if folks could hop on #kubeadm channel in Slack and we On Tue, 18 Oct 2016, 14:50 Michael Nikitochkin, notifications@github.com
|
From @petergardfjall on October 19, 2016 5:43 @miry I did restart the pod with no luck (by the way, when you say restart a pod, I assume you mean deleting the pod (and have the replica set replace it), right)? It is stuck in
But I guess I'll follow @errordeveloper's advice and take it on the slack channel. |
@errordeveloper @lukemarsden What's the status of this issue? |
Closing this one. Re-open if the problem persists. |
I have this error
Any idea? |
Is that run from a node? Then you need to use the |
I don't understand what you mean by 'use the /etc/kubernetes/kubelet.conf' This is my playbook from master node
I used Vagrant to create 1 master (kubeadm init) & 1 worker (kubeadm join). |
Was this fixed? I still get the CrashLoopBackOff on non-master nodes until I manually add the route (clean / up-to-date install of kubeadm etc...). |
@johnharris85 Please open a new issue in that case with more details... |
From @avkonst on October 5, 2016 14:12
Is this a request for help?
I think it is an issue either with software or documentation, but I am not quite sure. I have started with a question on stackoverflow: http://stackoverflow.com/questions/39872332/how-to-fix-weave-net-crashloopbackoff-for-the-second-node
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
I think it is a bug or request to improve documentation
Kubernetes version (use kubectl version): 1.4.0
Environment (uname -a):
What happened:
I have got 2 VM nodes. Both see each other either by hostname (through /etc/hosts) or by IP address. One has been provisioned with kubeadm as a master, the other as a worker node. Following the instructions (http://kubernetes.io/docs/getting-started-guides/kubeadm/) I have added weave-net. The list of pods looks like the following:
CrashLoopBackOff appears for each worker node connected. I have spent several hours playing with network interfaces, but it seems the network is fine. I have found a similar question on stackoverflow, where the answer advised looking into the logs, with no follow-up. So, here are the logs:
What you expected to happen:
I would expect the weave-net to be in Running state
How to reproduce it (as minimally and precisely as possible):
I have not done anything special, just followed the documentation on Getting Started. If it is essential, I can share the Vagrant project which I used to provision everything. Please let me know if you need it.
Copied from original issue: kubernetes/kubernetes#34101