During GCI bootup config, the docker0 bridge is deleted before kubelet starts, which works fine when no NETWORK_PROVIDER is set.
If a NETWORK_PROVIDER is used (e.g. kubenet), kubelet won't restart docker. This introduces a race between the config scripts restarting docker and the docker0 bridge being deleted.
cc @bprashanth @thockin @Amey-D @fabioy @roberthbailey
We don't need to restart docker with kubenet. New containers are created on cbr0 by the network plugin, and if someone SSHes into the node and runs "docker run -it busybox /bin/sh", the container still gets an IP from docker0. Now docker0 and cbr0 should not have overlapping CIDRs, because docker0 will be created from the default 172.17.0.1 and cbr0 will get created from the podcidr range.
We should probably make sure the nodecidr doesn't overlap the 172 range, or bad things can happen.
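The overlap check can be done mechanically; a small sketch using Python's ipaddress module (the example CIDRs are illustrative only, not taken from any real cluster config):

```python
import ipaddress

# docker0 comes up on docker's default pool, 172.17.0.0/16 (gateway 172.17.0.1).
DOCKER0_DEFAULT = ipaddress.ip_network("172.17.0.0/16")

def overlaps_docker0(cidr: str) -> bool:
    """True if a candidate node/pod CIDR collides with docker0's default range."""
    return ipaddress.ip_network(cidr, strict=False).overlaps(DOCKER0_DEFAULT)

print(overlaps_docker0("10.244.0.0/16"))   # typical podcidr, no overlap
print(overlaps_docker0("172.17.42.1/16"))  # inside the 172.17 range, overlaps
```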
Also isn't the only thing running GCI the master, on which we don't use kubenet?
GCI is an option for GKE customers on the node as well. It'd be bad if it were broken for them.
@maisem can you clarify the race? We delete docker0 on the master to avoid overlapping CIDRs. There might actually be better ways to solve the race (like actually applying the master kubelet's --pod-cidr arg to cbr0 and making sure it doesn't overlap with docker0), but I don't think we need to target everything for 1.3.4. The docker0 deletion shouldn't be an issue on nodes; in fact we shouldn't even need to delete docker0, because we start docker without the --bridge option, so it's going to create docker0 anyway.
We need to restart docker at least once for it to pick up the new command line arguments.
We didn't need to explicitly restart it pre-kubenet as kubelet would do that.
Pre-kubenet, the following happens:
When we start using kubenet, the following happens:
There is a race between steps 2 and 3. If docker starts before the bridge is deleted with kubenet enabled, it crashes because it is unable to use 172.17.0.1, which causes the startup scripts to fail.
Error starting daemon: Error initializing network controller: Error creating default "bridge" network: failed to allocate gateway (172.17.0.1): Address already in use
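That failure shows up when docker starts while the old docker0 still owns 172.17.0.1. A minimal sketch of the delete-before-start ordering that avoids the race, with hypothetical helper names (the real change lives in the GCI startup scripts; assumes systemd and iproute2):

```python
import subprocess

def bootup_commands():
    # Order matters: the bridge must be gone before docker comes up with its
    # new command line arguments, or docker races the deletion and fails to
    # allocate 172.17.0.1 for the default "bridge" network.
    return [
        ["systemctl", "stop", "docker"],
        ["ip", "link", "set", "docker0", "down"],
        ["ip", "link", "delete", "docker0"],
        ["systemctl", "start", "docker"],
    ]

def run_bootup(dry_run=True):
    for cmd in bootup_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # requires root on a real node
```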
#29757 changes the order of the steps to:
I hope that clarifies things.
Thanks, will check tomorrow
Reading your previous comment, we don't need to delete docker0 with kubenet. In fact, with the "correct" ordering (2 before 3 in your list), doesn't docker just recreate docker0 for itself if we simply service restart docker? How does deleting the bridge even help?
service restart docker
There are a few situations to handle though:
The easiest option is to get rid of the docker0 deletion hack by just changing the --pod-cidr given to the master on GKE. Is there a reason it is what it is today? Then everywhere we use kubenet, never delete docker0, and respect --pod-cidr if specified, otherwise use the podCIDR in the node object.
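That fallback could look roughly like this (a sketch, not kubelet's actual code; the function name and validation are made up for illustration):

```python
import ipaddress

# docker0's default range; a chosen pod CIDR must stay clear of it so the
# docker0 deletion hack is never needed.
DOCKER0_DEFAULT = ipaddress.ip_network("172.17.0.0/16")

def choose_pod_cidr(flag_cidr, node_pod_cidr):
    # Respect --pod-cidr when the flag is set, otherwise fall back to the
    # podCIDR recorded on the Node object; reject ranges that would collide
    # with docker0's default.
    cidr = ipaddress.ip_network(flag_cidr or node_pod_cidr, strict=False)
    if cidr.overlaps(DOCKER0_DEFAULT):
        raise ValueError("%s overlaps docker0 default %s" % (cidr, DOCKER0_DEFAULT))
    return cidr
```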
This isn't an issue with the master.
This is an issue with the node.
docker0 deletion for GCI was introduced in #27016 to fix #26379. #29757 merely reorders it.
docker0 deletion isn't required for nodes.
#26379 is only an issue on the GKE master because we pass --pod-cidr=172.17.42.1/16 on GKE, which overlaps with the default range of docker0.
I take that back, #26379 was an issue because we were not using kubenet. Once we started using kubenet, GCE masters should work with or without docker0 deletion. GKE masters won't work because of the --pod-cidr argument overlapping with docker0.