This repository has been archived by the owner. It is now read-only.

Container linux 1662.0.0 alpha doesn't add routes if external to network #2327

Closed
jorgelon opened this issue Jan 24, 2018 · 11 comments

Comments

@jorgelon jorgelon commented Jan 24, 2018

Issue Report

Bug

Container Linux Version

1662.0.0 alpha

Environment

VMware ESXi 5.5 with OEM vmware_raw and a vmxnet3 interface.
The gateway is on a different network than the VM's IP address, for example:
vm ip : 10.5.100.23/32
gateway: 50.255.255.255
The IP and the routes come from the DHCP server.

Expected Behavior

In CoreOS stable, beta, and previous alpha releases, everything works when the machine boots: the VM gets the correct route and IP address.

Actual Behavior

In CoreOS 1662.0.0 alpha the VM gets the IP address but fails to set the gateway.

Reproduction Steps

  1. Start a VM with CoreOS alpha 1662
  2. The VM has no connectivity because the gateway is not set

Other Information

The systemd-networkd error is "Could not set DHCPv4 route: Network is unreachable".

@ajeddeloh ajeddeloh commented Jan 25, 2018

This is almost certainly from the systemd 235 -> 236 upgrade; I'll review the changes between the two and see if anything sticks out. Can you please provide the output from journalctl --no-pager -u systemd-networkd? Also ip a and the contents of the relevant lease in /run/systemd/netif/leases/ would be helpful. Thanks for the report.

Author

@jorgelon jorgelon commented Jan 25, 2018

Jan 25 14:00:26 localhost systemd[1]: Starting Network Service...
Jan 25 14:00:26 localhost systemd-networkd[577]: Enumeration completed
Jan 25 14:00:26 localhost systemd[1]: Started Network Service.
Jan 25 14:00:27 localhost systemd-networkd[577]: eth0: Interface name change detected, eth0 has been renamed to ens192.
Jan 25 14:00:27 localhost systemd-networkd[577]: lo: Configured
Jan 25 14:00:27 localhost systemd-networkd[577]: ens192: IPv6 successfully enabled
Jan 25 14:00:27 localhost systemd-networkd[577]: ens192: Gained carrier
Jan 25 14:00:27 localhost systemd-networkd[577]: ens192: DHCPv4 address <PUBLIC IP>/32 via 15.255.255.1
Jan 25 14:00:27 localhost systemd-networkd[577]: ens192: Could not set DHCPv4 route: Network is unreachable
Jan 25 14:00:27 localhost systemd-networkd[577]: ens192: Failed

ADDRESS=<PUBLIC IP>
NETMASK=255.255.255.255
ROUTER=15.255.255.1
SERVER_ADDRESS=<PUBLIC IP>
T1=21600
T2=37800
LIFETIME=43200
DNS=<DNS1> <DNS2>
ROUTES=169.254.0.0/16,15.255.255.1   << the first route is because I have a metadata server there
CLIENTID=ff2d1aa13300020000ab11948395539b53d96a

@squeed squeed commented Jan 25, 2018

Oddly enough, this change, at first glance, should have fixed this: systemd/systemd#5982

@ajeddeloh ajeddeloh commented Jan 25, 2018

Thanks for the logs. It looks like when specifying static routes (i.e. not from dhcp) you need to set GatewayOnlink to true or else it will reject the route, but there's no option to do the same with dhcp routes.
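
To illustrate why the route is rejected, here is a minimal Python sketch of the on-link check. It is a simplification and an assumption on my part: the kernel really checks whether the gateway is reachable through an existing route, while this model only looks at the interface's own subnet. The function names are hypothetical, and the addresses are the example address from the report and the gateway from the logs.

```python
import ipaddress


def gateway_is_on_link(interface_cidr: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the interface's own subnet."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(gateway) in network


def can_add_route(interface_cidr: str, gateway: str, onlink: bool = False) -> bool:
    """Simplified model: a route via `gateway` is accepted if the gateway is
    on-link, or if the caller explicitly asserts that it is (the GatewayOnlink
    behaviour for static routes)."""
    return onlink or gateway_is_on_link(interface_cidr, gateway)


# A /32 address contains only itself, so no gateway is ever "on-link".
rejected = can_add_route("10.5.100.23/32", "15.255.255.1")
# Forcing on-link treatment lets the route through.
forced = can_add_route("10.5.100.23/32", "15.255.255.1", onlink=True)
# A conventional setup where the gateway shares the subnet needs no flag.
conventional = can_add_route("10.5.100.23/24", "10.5.100.1")
```

With a /32 netmask the plain check can never succeed, which matches the "Network is unreachable" error in the log.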

@squeed That looks like it's been in systemd since 234, and this only popped up after switching from 235 to 236.

@ajeddeloh ajeddeloh commented Jan 25, 2018

I wonder if systemd/systemd#6885 is responsible. It looks like before that change the gateway route was applied first and then the static routes, whereas now the static routes are applied first, and the gateway route is applied only if no static routes were applied.

@jorgelon can you provide a little more info about what your network looks like? Specifically the dhcp options the server is sending.

You might also try swapping the order of those routes so that the gateway route comes first. I suspect that networkd has no routes yet, then tries to apply the metadata route, but without the gateway route in place it can't reach anything and thus fails.

Member

@lucab lucab commented Jan 26, 2018

A packet capture of the DHCP exchange may also be valuable, in order to check what is actually offered to networkd.

Author

@jorgelon jorgelon commented Jan 26, 2018

    option subnet-mask 255.255.255.255;
    option routers 15.255.255.1;
    option static-routes 169.254.169.254 15.255.255.1; << the first route is because I have a metadata server there
    option domain-name-servers 15.5.100.12, 15.5.101.12;

I have tried with GatewayOnlink and it works, but in stable and beta it is not necessary.

@ajeddeloh ajeddeloh commented Jan 26, 2018

Ugh, it looks like they're parsing both the classless static routes and the classful static routes into the same list, then assuming it's classless (and thus ignoring the gateway route as required when using the classless routes option, but detrimental when using the classful routes option).
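
A rough Python model of the difference, for illustration only. This is not systemd's code; the `Route` type and both function names are hypothetical. It assumes the RFC 3442 rule that a client ignores the Router option (and the classful Static Routes option) whenever the Classless Static Routes option is present, and uses the route values from the lease above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    destination: str  # CIDR notation
    gateway: str


def routes_to_install(router: Optional[str],
                      classful: List[Route],
                      classless: List[Route]) -> List[Route]:
    """Intended behaviour: only classless routes suppress the Router option."""
    if classless:
        # RFC 3442: ignore the Router option and the classful routes.
        return list(classless)
    routes = list(classful)
    if router is not None:
        routes.append(Route("0.0.0.0/0", router))
    return routes


def merged_list_routes_to_install(router: Optional[str],
                                  classful: List[Route],
                                  classless: List[Route]) -> List[Route]:
    """Model of the bug described above: both options land in one list,
    and any entry in it suppresses the Router option."""
    merged = list(classful) + list(classless)
    if merged:
        return merged
    return [Route("0.0.0.0/0", router)] if router is not None else []


# The lease from this report: one classful route plus a Router option.
metadata = Route("169.254.0.0/16", "15.255.255.1")
expected = routes_to_install("15.255.255.1", [metadata], [])
buggy = merged_list_routes_to_install("15.255.255.1", [metadata], [])
```

In this model `expected` contains both the metadata route and a default route, while `buggy` contains only the metadata route, so the machine ends up without a default gateway.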

@ajeddeloh ajeddeloh commented Jan 26, 2018

Looks like it's fixed upstream in systemd/systemd@8cdc46e. We'll backport that and it should be in the next alpha.

@ajeddeloh ajeddeloh commented Jan 26, 2018

Closed via coreos/coreos-overlay#3027. It will be fixed in the next alpha. As a workaround you can probably specify the routes manually in a networkd unit.
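
For anyone needing the workaround in the meantime, the following Python sketch renders one possible networkd unit. It is untested and rests on assumptions: the helper name is hypothetical, the interface name and addresses are taken from the logs above, and `UseRoutes=false` in the `[DHCP]` section is my guess at how to keep networkd from attempting the failing DHCP routes. `GatewayOnlink=true` is the setting reported to work earlier in this thread.

```python
from typing import Iterable


def render_network_unit(interface: str, gateway: str,
                        extra_destinations: Iterable[str] = ()) -> str:
    """Render a .network file with manually specified on-link routes."""
    lines = [
        "[Match]",
        f"Name={interface}",
        "",
        "[Network]",
        "DHCP=ipv4",
        "",
        "[DHCP]",
        # Assumption: skip DHCP-provided routes, since they are set below.
        "UseRoutes=false",
        "",
        # Default route through the off-subnet gateway.
        "[Route]",
        f"Gateway={gateway}",
        "GatewayOnlink=true",
    ]
    for destination in extra_destinations:
        lines += [
            "",
            "[Route]",
            f"Destination={destination}",
            f"Gateway={gateway}",
            "GatewayOnlink=true",
        ]
    return "\n".join(lines) + "\n"


unit_text = render_network_unit("ens192", "15.255.255.1", ["169.254.169.254/32"])
print(unit_text)
```

The rendered text would go in a file under /etc/systemd/network/ (the file name is up to you), followed by a restart of systemd-networkd.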

@ajeddeloh ajeddeloh closed this Jan 26, 2018
Member

@bgilbert bgilbert commented Feb 1, 2018

This will be fixed in the next alpha, and also in beta, since the current alpha is being promoted. Both are due shortly. Thanks for the report.
