Static routes are not per-interface, which breaks some deployments #3143
Comments
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:52:43.998253+00:00 Launchpad attachments: eni_original_new |
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:53:02.425136+00:00 Attached are the original and the juju-modified interfaces files. |
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:55:11.749329+00:00 The routes are on the wrong bond (bond2); however, the gateways are on br-bond0. In MAAS, too, they are set on the proper subnets. |
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:59:09.323381+00:00 On nodes without containers, the configuration is written to /etc/network/interfaces.d/50-cloud-init.cfg. That file is also present on all nodes, but it gets overridden. |
Launchpad user Ante Karamatić(ivoks) wrote on 2018-03-26T13:59:10.231739+00:00 ifup brings interfaces up serially. With juju's ENI, this means it would bring up bond0 before br-bond0 and br-bond1. Since layer 3, provided by br-bond1 and br-bond2, would not exist when post-up runs, post-up would fail. Because of '|| true', that would not cause ifup to fail, but it would leave the machine without routes. I believe MAAS always adds 'post-up' static routes to the last interface (which is a good approach until netplan solves this). This means juju should do the same: pick up the post-up routes from the bottom of the ENI and place them at the end of the last bridge it creates. |
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:59:40.201409+00:00 Launchpad attachments: 50-cloud-init.cfg |
Launchpad user John A Meinel(jameinel) wrote on 2018-03-26T14:24:25+00:00 Why is adding it to the "last interface" correct. Wouldn't it be more Eg, in your above scenario, bond0 is getting 100.99.4.3/24 thus things that On Mon, Mar 26, 2018 at 5:59 PM, Gábor Mészáros <
|
Launchpad user John A Meinel(jameinel) wrote on 2018-03-26T14:26:04+00:00 Note that: On Mon, Mar 26, 2018 at 6:24 PM, John Meinel john@arbash-meinel.com wrote:
|
Launchpad user John A Meinel(jameinel) wrote on 2018-03-26T14:29:09+00:00 Is this actually Field Critical? Isn't just moving the post-up to a On Mon, Mar 26, 2018 at 6:26 PM, John Meinel john@arbash-meinel.com wrote:
|
Launchpad user Ante Karamatić(ivoks) wrote on 2018-03-26T14:30:39.729724+00:00 You are right, and as soon as I wrote the comment I realized I was wrong (mixed it with using iptables in post-up). 16:07 < ivoks> so, ideally, cloud-config would be smarter here Routes should be placed on the interfaces that provide access to gateways for those routes. |
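The placement rule stated above (each route goes on the interface that can reach its gateway) can be sketched in a few lines of Python. This is only an illustration, not code from juju, MAAS, or cloud-init: the function name `assign_routes` and the input shapes are hypothetical, and the sample addresses are borrowed from the `ip r` output in the bug description, mixed from both hosts for demonstration.

```python
import ipaddress


def assign_routes(interfaces, routes):
    """Group routes by the interface whose subnet contains the gateway.

    interfaces: dict of interface name -> address in CIDR form (hypothetical shape)
    routes: list of (destination, gateway) tuples (hypothetical shape)
    Returns (assigned, unmatched).
    """
    networks = {
        name: ipaddress.ip_interface(cidr).network
        for name, cidr in interfaces.items()
    }
    assigned = {name: [] for name in interfaces}
    unmatched = []
    for destination, gateway in routes:
        gw = ipaddress.ip_address(gateway)
        # First interface whose on-link prefix contains the gateway, if any.
        owner = next((n for n, net in networks.items() if gw in net), None)
        if owner is None:
            unmatched.append((destination, gateway))
        else:
            assigned[owner].append((destination, gateway))
    return assigned, unmatched


# Sample data taken from the report; the combination is illustrative only.
interfaces = {
    "br-bond0": "100.99.4.3/24",
    "br-bond1": "100.84.4.1/24",
    "bond2": "100.68.100.26/24",
}
routes = [
    ("100.99.5.0/24", "100.99.4.254"),
    ("100.84.5.0/24", "100.84.4.254"),
    ("100.68.5.0/24", "100.68.4.254"),
]
assigned, unmatched = assign_routes(interfaces, routes)
```

With this input, the first two routes land on br-bond0 and br-bond1, while the third has no on-link gateway on any interface and ends up in `unmatched`. What to do with unmatched routes (skip, warn, or fail) is a policy decision this sketch leaves open.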
Launchpad user Witold Krecicki(wpk) wrote on 2018-03-26T14:39:51.586121+00:00 IMHO that's an obvious MAAS fault: it always writes the routes to the last device and not to the device the routes are 'attached' to. In this scenario, doing ifdown bond2 (an interface that has absolutely nothing to do with the static routes) would bring the routes down. Moreover, the assumption that the order of the devices in e/n/i is the order in which the devices are brought up might be incorrect. IMHO this should be fixed in MAAS. |
Launchpad user Mike Pontillo(mpontillo) wrote on 2018-03-27T04:49:51.691342+00:00 IMHO, this should also be fixed in cloud-init. If the input netplan contains "global" routes, the renderer (or whatever can pre-process the Netplan before rendering) should intelligently determine which interfaces have an on-link gateway that matches the global route, and automatically render the route at interface scope instead of "global". Arguably, if the route's gateway address doesn't match an on-link prefix, it should not be installed anyway (the kernel will reject it anyway, unless the |
Agreed
The expectation is that cloud-init should parse configurations and automatically fix them when it thinks they are incorrect? That sounds error prone at best, and definitely out of scope. I suggest fixing this in MAAS, if it hasn't been already. Closing |
This bug was originally filed in Launchpad as LP: #1758919
Launchpad details
Launchpad user Gábor Mészáros(gabor.meszaros) wrote on 2018-03-26T13:49:51.837880+00:00
When juju tries to deploy an lxd container on a MAAS-managed machine, the machine loses all static routes, because ifdown/ifup is issued and e/n/i has no saved data on the original state.
Machine with no lxd container deployed:
root@4-compute-4:~# ip r
default via 100.68.4.254 dev bond2 onlink
100.68.4.0/24 dev bond2 proto kernel scope link src 100.68.4.1
100.68.5.0/24 via 100.68.4.254 dev bond2
100.68.6.0/24 via 100.68.4.254 dev bond2
100.84.4.0/24 dev bond1 proto kernel scope link src 100.84.4.2
100.84.5.0/24 via 100.84.4.254 dev bond1
100.84.6.0/24 via 100.84.4.254 dev bond1
100.99.4.0/24 dev bond0 proto kernel scope link src 100.99.4.101
100.99.5.0/24 via 100.99.4.254 dev bond0
100.99.6.0/24 via 100.99.4.254 dev bond0
100.107.0.0/24 via 100.99.4.254 dev bond0
After juju deploys a container, the routes disappear:
root@4-management-1:~# ip r
default via 100.68.100.254 dev bond2 onlink
10.177.144.0/24 dev lxdbr0 proto kernel scope link src 10.177.144.1
100.68.100.0/24 dev bond2 proto kernel scope link src 100.68.100.26
100.84.4.0/24 dev br-bond1 proto kernel scope link src 100.84.4.1
100.99.4.0/24 dev br-bond0 proto kernel scope link src 100.99.4.3
After a host reboot, the routes do NOT come back; they are still gone:
root@4-management-1:~# ip r s
default via 100.68.100.254 dev bond2 onlink
100.68.100.0/24 dev bond2 proto kernel scope link src 100.68.100.26
100.84.4.0/24 dev br-bond1 proto kernel scope link src 100.84.4.1
100.84.5.0/24 via 100.84.4.254 dev br-bond1
100.84.6.0/24 via 100.84.4.254 dev br-bond1
100.99.4.0/24 dev br-bond0 proto kernel scope link src 100.99.4.3
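To compare listings like the ones above, a small Python helper can group the static (`via`) routes by device. This is a rough sketch: the function name `static_routes_by_device` is hypothetical, and it assumes the plain `ip r` line format shown in this report. It skips the default route and the kernel-generated on-link routes.

```python
from collections import defaultdict


def static_routes_by_device(ip_route_output):
    """Return {device: [(destination, gateway), ...]} for non-default 'via' routes."""
    grouped = defaultdict(list)
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Skip blanks, the default route, and routes without a gateway or device.
        if not fields or fields[0] == "default":
            continue
        if "via" not in fields or "dev" not in fields:
            continue
        destination = fields[0]
        gateway = fields[fields.index("via") + 1]
        device = fields[fields.index("dev") + 1]
        grouped[device].append((destination, gateway))
    return dict(grouped)


# The post-reboot listing from this report.
after_reboot = """\
default via 100.68.100.254 dev bond2 onlink
100.68.100.0/24 dev bond2 proto kernel scope link src 100.68.100.26
100.84.4.0/24 dev br-bond1 proto kernel scope link src 100.84.4.1
100.84.5.0/24 via 100.84.4.254 dev br-bond1
100.84.6.0/24 via 100.84.4.254 dev br-bond1
100.99.4.0/24 dev br-bond0 proto kernel scope link src 100.99.4.3
"""
result = static_routes_by_device(after_reboot)
```

For the post-reboot listing, `result` contains only the two br-bond1 routes. No static routes appear on br-bond0 or bond2.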