Flannel always overwrites public-ip node annotation in Kubernetes on startup #712

Closed
maxx opened this issue May 5, 2017 · 22 comments · Fixed by #840

Comments

@maxx

maxx commented May 5, 2017

This is an issue when running flannel inside Kubernetes as a daemonset on VMs with differing internal and external IP addresses (OpenStack). There is no way to pass a different --public-ip argument to each host within a daemonset. The only other way to vary public-ip per node is to use the node annotation (flannel.alpha.coreos.com/public-ip).

The problem is that flannel overwrites this annotation with the internal IP of the host every time it starts up. This, combined with the inability to vary --public-ip within a daemonset, means there is no way to set a different public-ip for flannel in Kubernetes.

I have a fix in mind (see below).

Expected Behavior

flannel.alpha.coreos.com/public-ip can be set manually and is not overwritten.

Current Behavior

flannel.alpha.coreos.com/public-ip is overwritten every time flannel starts, regardless of its previous state.

Possible Solution

Simply commenting out this line fixes the problem on a node where flannel has been run before.
https://github.com/coreos/flannel/blob/master/subnet/kube/kube.go#L218

This allows us to set a custom public-ip and flannel will read it on startup.

Public-ip still needs to be set if it does not exist, so I propose we check for a public-ip-override and use the assumed IP only if it's not there.
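The proposal above could be sketched roughly as follows. This is only an illustration of the decision logic, not flannel's actual code: the function name choosePublicIP and the annotation key flannel.alpha.coreos.com/public-ip-override are hypothetical, taken from the "public-ip-override" wording in this issue, and the addresses are example values.

```go
package main

import "fmt"

const (
	// publicIPKey is the annotation named in this issue.
	publicIPKey = "flannel.alpha.coreos.com/public-ip"
	// overrideKey is a hypothetical key based on the "public-ip-override" proposal.
	overrideKey = "flannel.alpha.coreos.com/public-ip-override"
)

// choosePublicIP returns the address to publish in the public-ip annotation.
// A non-empty override wins; otherwise the detected interface address is used,
// so a changed internal IP is still picked up after a reboot.
func choosePublicIP(annotations map[string]string, detected string) string {
	if override, ok := annotations[overrideKey]; ok && override != "" {
		return override
	}
	return detected
}

func main() {
	// Example node annotations with an operator-supplied override.
	node := map[string]string{overrideKey: "203.0.113.10"}
	node[publicIPKey] = choosePublicIP(node, "10.0.0.5")
	fmt.Println(node[publicIPKey])
}
```

Keeping the override in its own key means flannel can keep rewriting public-ip on every start without destroying what the operator set.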

Steps to Reproduce (for bugs)

  • Start flannel in a kubernetes daemon set
  • kubectl annotate node <node name> flannel.alpha.coreos.com/public-ip=<some ip> --overwrite
  • Kill flannel container or restart it some other way
  • Verify your node annotation flannel.alpha.coreos.com/public-ip is now set to the assumed (incorrect) IP.

Context

See above description.

Your Environment

  • Flannel version: v0.7.0-96 (applicable on master branch)
  • Backend used (e.g. vxlan or udp): vxlan
  • Etcd version: 3.0.17
  • Kubernetes version (if used): v1.6.2
  • Operating System and version: Ubuntu 16.04
@maxx
Author

maxx commented May 5, 2017

I'm working on a fix for this which will check to see if the public-ip node annotation already exists before overwriting it.

@maxx
Author

maxx commented May 5, 2017

My fix could potentially cause a problem if the internal IP of the VM changes on reboot. Perhaps a second annotation to indicate that the IP is overridden would be good.

@maxx maxx changed the title from "Flannel clobbers public-ip node annotation in Kubernetes on startup" to "Flannel always overwrites public-ip node annotation in Kubernetes on startup" May 5, 2017
@maxx
Author

maxx commented May 5, 2017

Or perhaps a kube node label would be more appropriate for this sort of configuration data.

@tomdee
Contributor

tomdee commented May 17, 2017

@maxx what about having better support for selecting the public IP for flannel to use? e.g. selecting the interface or external IP by passing in a regex to flanneld (i.e. it could be put in the daemonset)

@maxx
Author

maxx commented May 17, 2017

@tomdee In my situation, the external IP is stored in OpenStack metadata. It's not something that could be expressed in a regex, and it's not assigned to any interface within the node. Flannel has to read this data from somewhere on startup. A node annotation is a good place (it obviously stores it there already). We just need another annotation, an override, which flannel won't overwrite with any presumed IP.

@tomdee
Contributor

tomdee commented May 17, 2017

@maxx that sounds reasonable - would you be able to submit a PR for this feature? Or if you can't write code, write the documentation for it?

@tz-lom

tz-lom commented Jun 25, 2017

Any luck with that problem? I like the "public-ip-override" solution. It is usable not only with OpenStack: my hosting provides eth0 with a local IP, and the public IP can't easily be acquired on the machine.

@martynd

martynd commented Aug 15, 2017

If you are using --hostname-override on the kubelet, have you tried making sure that its value has an entry in the host machine's /etc/hosts file, or that it can be resolved correctly by your primary resolver?

When it isn't defined, I've experienced the inconsistent behavior described; once it is, I've not experienced any issues, even with the daemonset method.

If you have your kubelet sandboxed using the kubelet-wrapper and you're still having issues, try making sure the rkt args include --hosts-entry=host to have it use the node's hosts file to test it.

Some providers use "public" to mean the world-accessible IP rather than the private IP, which is what flannel wants; that seems to be where this comes from.

alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 16, 2017
alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 21, 2017
alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 28, 2017
This may be useful if a node's public IP cannot be determined, e.g.
because it is behind a NAT. Fixes flannel-io#712
alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 28, 2017
This may be useful if a node's public IP cannot be determined, e.g.
because it is behind a NAT. Fixes flannel-io#712
alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 29, 2017
This may be useful if a node's public IP cannot be determined, e.g.
because it is behind a NAT. Fixes flannel-io#712
alvaroaleman added a commit to alvaroaleman/flannel that referenced this issue Oct 29, 2017
This may be useful if a node's public IP cannot be determined, e.g.
because it is behind a NAT. Fixes flannel-io#712
tomdee added a commit that referenced this issue Oct 31, 2017
Fix #712, allow overwriting the public IP of a Kubernetes node
@tommyknows

Hi,
I tried to apply your patch: I took the standard daemonset, replaced the Docker image with one from quay.io/coreos/flannel-git, and set the node annotations with
kubectl annotate node $name flannel.alpha.coreos.com/public-ip-overwrite=$IP

Your fix works - it sets flannel.alpha.coreos.com/public-ip to the IP I provided.
However, when checking the flannel logs:

{"log":"I1110 13:57:46.705442       1 main.go:469] Determining IP address of default interface\n","stream":"stderr","time":"2017-11-10T13:57:46.706125283Z"}
{"log":"I1110 13:57:46.706949       1 main.go:482] Using interface with name eth0 and address 10.94.41.81\n","stream":"stderr","time":"2017-11-10T13:57:46.707759148Z"}
{"log":"I1110 13:57:46.706969       1 main.go:499] Defaulting external address to interface address (10.94.41.81)\n","stream":"stderr","time":"2017-11-10T13:57:46.707780337Z"}

Flannel still defaults to the standard interface and IP. Do I have to change any of the parameters in the daemonset or configmap?

@alvaroaleman
Contributor

Uhm. I guess only the logging is wrong there. Did you check whether you can reach pods that are placed on the node you annotated?

@tommyknows

You're right - it's just the log. It didn't work right after that because I had to add some iptables rules to the host system. Thank you!

@tomdee
Contributor

tomdee commented Nov 10, 2017

Reopening until we have a separate issue to track the logging bug

@jaytaylor

This is still causing me problems in environments where the public IP is different from the private IP.

Is there any way to prevent flannel from always overwriting the flannel.alpha.coreos.com/public-ip key when it already exists? I've had to implement a rather ugly hack that checks every few minutes for mismatches against alpha.kubernetes.io/provided-node-ip and, when one is found, sets flannel.alpha.coreos.com/public-ip accordingly.

Really wishing there were a --use-ip=w.x.y.z flag..
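
The periodic check described in this comment might look something like the sketch below. It is an assumption about the shape of such a workaround, not the commenter's actual script: the reconcile function and the sample node data are hypothetical, and the step that applies the change (for example with kubectl annotate ... --overwrite) is left out.

```go
package main

import "fmt"

const (
	// Both annotation keys are the ones named in the comment above.
	providedIPKey = "alpha.kubernetes.io/provided-node-ip"
	flannelIPKey  = "flannel.alpha.coreos.com/public-ip"
)

// reconcile compares the two annotations on one node. It returns the value
// flannel's public-ip annotation should have and whether an update is needed.
// Nodes without a provided-node-ip annotation are left alone.
func reconcile(annotations map[string]string) (string, bool) {
	want := annotations[providedIPKey]
	if want == "" {
		return "", false
	}
	return want, annotations[flannelIPKey] != want
}

func main() {
	// Hypothetical snapshot of node annotations; a real workaround would
	// fetch these from the API server on a timer.
	nodes := map[string]map[string]string{
		"node-a": {providedIPKey: "203.0.113.10", flannelIPKey: "10.0.0.5"},
		"node-b": {providedIPKey: "203.0.113.11", flannelIPKey: "203.0.113.11"},
	}
	for name, annotations := range nodes {
		if want, needsUpdate := reconcile(annotations); needsUpdate {
			fmt.Printf("%s: set %s=%s\n", name, flannelIPKey, want)
		}
	}
}
```

A loop like this only repairs the annotation after flannel has already overwritten it, so there is a window after each flannel restart in which the wrong address is published.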

@wakawaka54

wakawaka54 commented Mar 2, 2018

I'm working on the same issue right now. It looks like there is an override flag you can set as an annotation on the node: https://coreos.com/flannel/docs/latest/kubernetes.html

The annotation is flannel.alpha.coreos.com/public-ip-overwrite. I tried adding this annotation to my master node, but upon restarting flannel, it looks like it resets the public-ip annotation to the incorrect IP address again.

Update: I bet this is because I am using Flannel v0.9.1 which is in the kubeadm docs but the latest is v0.10.0 which seems to add the flag. I will check on this and then follow up.

@jaytaylor

Fantastic information, thank you @wakawaka54!

@timorjim

Looks like we are seeing the same behaviour in v0.10.0. Set the IP with:

kubectl annotate node <node name> flannel.alpha.coreos.com/public-ip=<some ip> --overwrite

and it is reset to the default on reboot.

@vasu-dasari

Yes. I am also seeing this issue.

# Set annotation
kubectl annotate node seeweed-vm-06 flannel.alpha.coreos.com/public-ip-overwrite=10.200.1.6 --overwrite

# Restart flannel
kubectl get pod kube-flannel-ds-49shf -o yaml -n kube-system  | kubectl replace --force -f -

Now I see that the packet is using the right interface, eth1.31, but the source IP address used in the outer IP header is not right. It should have been 10.200.1.6.

10:40:43.052861 00:0c:29:8c:82:93 > 00:0c:29:39:02:0c, ethertype 802.1Q (0x8100), length 152: vlan 31, p 0, ethertype IPv4, 172.17.4.197.33434 > 10.200.1.5.8472: OTV, flags [I] (0x08), overlay 0, instance 1
    ce:08:6c:7e:3a:ea > d2:39:c2:72:cb:3c, ethertype IPv4 (0x0800), length 98: 10.244.1.12 > 10.244.5.6: ICMP echo request, id 273, seq 587, length 64

Before making the annotation changes, VXLAN packets used to leave from eth0, whose IP address was 172.17.4.197.

And I see that the flannel that is being used is: quay.io/coreos/flannel:v0.10.0-amd64

@aisensiy

@vasu-dasari I've hit the same problem. I try to change the annotation, but it quickly comes back.

@ramesaliyev

Hello, does anyone have a solution to this?

@sab24

sab24 commented Aug 25, 2020

This still happens in 2020.......

@ramesaliyev

This still happens in 2020.......

Yeah, I'm using Calico instead.

@darkyzhou

darkyzhou commented Nov 3, 2020

I faced a similar issue using WireGuard.
You can try editing the flannel daemonset and adding --iface=wg0 to the container launch args.
Flannel will then take its default IP from the specified interface. The logs will look like:

I1103 16:08:44.769545       1 main.go:531] Using interface with name wg0 and address 10.x.x.4
I1103 16:08:44.769593       1 main.go:548] Defaulting external address to interface address (10.x.x.4)

reference: https://stackoverflow.com/questions/47845739/configuring-flannel-to-use-a-non-default-interface-in-kubernetes
