This repository has been archived by the owner on Oct 16, 2020. It is now read-only.
We just hit a nasty bug running on GCE behind a Google L3 load balancer.
After our machines restart with CoreOS (991.2.0) Beta, they do not come back with their L3 Load balancing address programmed.
For example, compare this healthy instance:
michal@healthy ~ $ ip route ls table local type local scope host
104.155.21.113 dev ens4v1 proto 66
127.0.0.0/8 dev lo proto kernel src 127.0.0.1
127.0.0.1 dev lo proto kernel src 127.0.0.1
172.17.0.1 dev docker0 proto kernel src 172.17.0.1
172.29.0.19 dev ens4v1 proto kernel src 172.29.0.19
to a machine whose public IP address worked, but the L3 load balancing address didn't:
michal@bad ~ $ ip route ls table local type local scope host
127.0.0.0/8 dev lo proto kernel src 127.0.0.1
127.0.0.1 dev lo proto kernel src 127.0.0.1
172.17.0.1 dev docker0 proto kernel src 172.17.0.1
172.29.12.10 dev ens4v1 proto kernel src 172.29.12.10
The magical incantation that fixes the problem is as follows:
sudo ip route add to local 104.155.83.31/32 dev ens4v1 proto 66
(where 104.155.83.31 is the IP of your load balancer). Basically, you add a local route to the machine so that it accepts connections (answers the incoming SYN) for an address that is not assigned to any of its interfaces :) Beautiful hack.
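As a rough sketch of how the check and the fix above could be combined, the helper below reads the output of `ip route ls table local type local scope host` on stdin and prints the `ip route add` command only when the load balancer address is missing. The function names `has_local_route` and `lb_route_fix` are made up for illustration, and the sketch assumes the route listing has the same shape as the output shown above (address in the first column).

```shell
#!/bin/sh
# Hypothetical helpers; only the two `ip route` invocations come from the report above.

# Succeeds if the route listing on stdin already has a line for the given address.
# $1: IP address expected in the first column of a route line
has_local_route() {
    awk -v ip="$1" '$1 == ip { found = 1 } END { exit !found }'
}

# Prints the command that would program the load balancer address,
# or prints nothing if the local route is already there.
# $1: load balancer IP, $2: interface name, stdin: current local routes
lb_route_fix() {
    if ! has_local_route "$1"; then
        echo "ip route add to local $1/32 dev $2 proto 66"
    fi
}
```

For example, `ip route ls table local type local scope host | lb_route_fix 104.155.83.31 ens4v1` should print nothing on a healthy instance and print the fix command on an affected one; the printed command still has to be run with `sudo`.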
CoreOS provides a service in its GCE image that does this, google-address-manager.service. However, it only runs on a machine's first initialization and not on restarts, at least on the Beta channel.
So basically, to fix their CoreOS instances on GCE, users need to add this to their cloud-init
The code that extracts the L3 LB IPs from the VM's metadata server and programs them like this is here:
https://github.com/GoogleCloudPlatform/compute-image-packages/blob/master/google-daemon/usr/share/google/google_daemon/address_manager.py
Can someone from GCE or CoreOS take a look at what other GCE .service units should be re-run on machine restart?