This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

CoreOS Beta on GCE should run google-address-manager.service after restart #1195

Closed
mwitkow opened this issue Mar 30, 2016 · 1 comment

@mwitkow
mwitkow commented Mar 30, 2016

We just hit a nasty bug running on GCE behind a Google L3 load balancer.

After a restart, our machines running CoreOS Beta (991.2.0) come back without their L3 load-balancing address programmed.

For example, compare this healthy instance:

michal@healthy ~ $ ip route ls table local type local scope host
104.155.21.113 dev ens4v1  proto 66 
127.0.0.0/8 dev lo  proto kernel  src 127.0.0.1 
127.0.0.1 dev lo  proto kernel  src 127.0.0.1 
172.17.0.1 dev docker0  proto kernel  src 172.17.0.1 
172.29.0.19 dev ens4v1  proto kernel  src 172.29.0.19 

to a machine whose public IP address worked but whose L3 load-balancing address did not:

michal@bad ~ $ ip route ls table local type local scope host
127.0.0.0/8 dev lo  proto kernel  src 127.0.0.1 
127.0.0.1 dev lo  proto kernel  src 127.0.0.1 
172.17.0.1 dev docker0  proto kernel  src 172.17.0.1 
172.29.12.10 dev ens4v1  proto kernel  src 172.29.12.10 

The magical incantation that fixes the problem is as follows:

sudo ip route add to local 104.155.83.31/32 dev ens4v1 proto 66

(where 104.155.83.31 is the IP of your load balancer). Basically, you add a local route so that the machine accepts traffic (for example, answers TCP SYNs) sent to an address that is not assigned to any of its interfaces :) Beautiful hack.

The code that extracts the L3 LB IPs from the VM's metadata server and programs them like this is here:
https://github.com/GoogleCloudPlatform/compute-image-packages/blob/master/google-daemon/usr/share/google/google_daemon/address_manager.py
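To make the idea concrete, here is a minimal Python sketch of the reconcile step: compare the load balancer IPs the machine should answer for against the `proto 66` routes already in the local table, and build the `ip route` commands needed to close the gap. This is not the code from address_manager.py. The function names are hypothetical, the default device `ens4v1` is taken from the output above, and the list of wanted IPs is passed in rather than fetched from the metadata server.

```python
import ipaddress

# Routing protocol tag seen on the load balancer route in the output above.
PROTO_ID = "66"


def parse_forwarded_routes(ip_route_output, proto=PROTO_ID):
    """Return the addresses of local routes tagged with `proto`.

    Expects the text printed by:
        ip route ls table local type local scope host
    """
    found = set()
    for line in ip_route_output.splitlines():
        fields = line.split()
        if "proto" not in fields:
            continue
        idx = fields.index("proto")
        # Keep only lines such as "104.155.21.113 dev ens4v1  proto 66".
        if idx + 1 < len(fields) and fields[idx + 1] == proto:
            found.add(fields[0])
    return found


def plan_route_changes(wanted_ips, ip_route_output, dev="ens4v1", proto=PROTO_ID):
    """Build the `ip route` commands that would make the table match wanted_ips.

    Hypothetical helper: it only returns argument lists and runs nothing.
    """
    # ip_address() rejects malformed input before it reaches a command line.
    wanted = {str(ipaddress.ip_address(ip)) for ip in wanted_ips}
    present = parse_forwarded_routes(ip_route_output, proto)

    commands = []
    for ip in sorted(wanted - present):
        commands.append(["ip", "route", "add", "to", "local",
                         f"{ip}/32", "dev", dev, "proto", proto])
    for ip in sorted(present - wanted):
        commands.append(["ip", "route", "delete", "to", "local",
                         f"{ip}/32", "dev", dev, "proto", proto])
    return commands
```

Each returned list could be passed to `subprocess.check_call` as root. Keeping the planning step free of side effects makes it easy to check against captured `ip route` output before touching a live machine.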

The CoreOS GCE image provides a service that does this, google-address-manager.service. However, it only runs on a machine's first initialization and not on restarts, at least on the Beta channel.

So, to fix their CoreOS instances on GCE, users need to add this to their cloud-init:

coreos:
    units:
      - name: google-address-manager.service
        enable: true

Can someone from GCE or CoreOS take a look at which other GCE .service units should be re-run on machine restart?

@crawford
Contributor

This will be fixed in the next Alpha with the introduction of Ignition v0.6.0.
