Add support for multiple [Route] sections so device routes work #1868
Conversation
Hi @nkukard, thanks for the submission.
Ignoring for a moment the implementation details, I want to be sure I understand the expected config input -> output targeted in this PR.
I have a question regarding the syntax in your example:
addresses:
  - 192.168.1.1/32
  - fec0::1/128
gateway4: 169.254.1.1
gateway6: "fe80::1"
routes:
  - to: 169.254.1.1/32
  - to: "fe80::1/128"
This implicitly adds default routes where none have been specified:
ip route add 169.254.1.1/32 dev eno1
ip -6 route add fe80::1/128 dev eno1
ip route add default via 169.254.1.1 dev eno1
ip -6 route add default via fe80::1 dev eno1
Under this UI, how would one express that no default is desirable for this interface? Imagine that we want the default on a different interface, for example.
Is there any reason that an explicitly declared interface default route should not be required in the configuration for each device?
Let me know if I missed anything.
Note: any change like this that alters behavior while still passing the test suite demonstrates that we have too little test coverage.
"gateway4" and "gateway6" are the default routes. If you want to add a normal route via a gateway, you would use the following, which is the documented method of doing so:

addresses:
  - 192.168.1.1/24
  - fec0::1/64
routes:
  - to: 10.0.0.1/24
    via: 192.168.1.77
  - to: "fefe::/64"
    via: "fec0::77"
Can you please provide a case which my changes break? If I read the documentation and implement configurations as documented, I do not see any regressions. There is test coverage of routes being added, and the only one I modified was adding a duplicate '[Route]' entry, which systemd has always supported.
Thanks, I see now.
The proposed commit message explains it well. Just adding a quote from the docs to corroborate the change:
Thanks for proposing this, current behavior is clearly incorrect. You're probably right on not needing more tests. I'll review the code in the morning. I need to bounce.
@holmanb Thanks so much for your time looking into this MR, it is really appreciated. If this commit gets approved, I'll make time to submit another one to slightly improve the documentation.
@nkukard Since network v2 doesn't have coverage, I'd like to add something. Using your example I put together the following test. If you agree with this, feel free to include it as part of the PR.

diff --git a/tests/unittests/net/test_networkd.py b/tests/unittests/net/test_networkd.py
index 44e3f97e1..bb781b983 100644
--- a/tests/unittests/net/test_networkd.py
+++ b/tests/unittests/net/test_networkd.py
@@ -201,6 +201,48 @@ Gateway=2a01:4f8:10a:19d2::2
"""
+V2_CONFIG_MULTI_SUBNETS = """
+network:
+ version: 2
+ ethernets:
+ eth0:
+ addresses:
+ - 192.168.1.1/24
+ - fec0::1/64
+ gateway4: 192.168.254.254
+ gateway6: "fec0::ffff"
+ routes:
+ - to: 169.254.1.1/32
+ - to: "fe80::1/128"
+"""
+
+V2_CONFIG_MULTI_SUBNETS_RENDERED = """\
+[Address]
+Address=192.168.1.1/24
+
+[Address]
+Address=fec0::1/64
+
+[Match]
+Name=eth0
+
+[Network]
+DHCP=no
+
+[Route]
+Gateway=192.168.254.254
+
+[Route]
+Gateway=fec0::ffff
+
+[Route]
+Destination=169.254.1.1/32
+
+[Route]
+Destination=fe80::1/128
+
+"""
+
class TestNetworkdRenderState:
def _parse_network_state_from_config(self, config):
@@ -309,5 +351,16 @@ class TestNetworkdRenderState:
assert rendered_content["eth0"] == V1_CONFIG_MULTI_SUBNETS_RENDERED
+ def test_networkd_render_v2_multi_subnets(self):
+ """
+ Ensure a device with multiple subnets gets correctly rendered.
+
+ Per systemd-networkd docs, [Route] can only contain a single instance
+ of Gateway.
+ """
+ with mock.patch("cloudinit.net.get_interfaces_by_mac"):
+ ns = self._parse_network_state_from_config(V2_CONFIG_MULTI_SUBNETS)
+ renderer = networkd.Renderer()
+ rendered_content = renderer._render_content(ns)
-# vi: ts=4 expandtab
+        assert rendered_content["eth0"] == V2_CONFIG_MULTI_SUBNETS_RENDERED

On main, this produces the following incorrect config; it passes with this branch.
Thoughts?
Once again, thanks for this contribution. Identifying a problem and proposing a fix is much appreciated, +1.
Review:
Using unique route IDs in a dict seems like a novel way to solve the task without changing much of the surrounding code.
Scoping route IDs via the a/r/c concatenation, etc. does feel like we're putting some not-so-obvious context into the data structure using magic numbers, which makes refactors and maintenance a bit harder going forward.
At a minimum, I think adding docstring comments to parse_routes() and update_route_section(), so that the rids have some explanation, would aid future readers.
What do you think?
Thanks to @holmanb for adding a test case for this.
LGTM, I've included it. :)
Would you like me to add a few comments in addition to the docstring to better clarify the concatenation? Or just a docstring?
Here is the bare minimum; if you're happy, I'll add it. I'm also happy to add a one-liner on each call to describe why the concatenation is used.
Thanks! Looks good, +1.
That would be great, thanks!
OK, it ended up being a two-line comment; I added it so it's a little clearer what's going on.
I expect that we'll merge this; however, I'm concerned that there might be other implementation pitfalls lurking in the systemd-networkd code, since this was originally merged as experimentally supported and could use more scrutiny. I'm not sure of the best way to go about that besides pinging someone who works a little closer with systemd-networkd. @slyon Is that something you might be able to help with?
Thanks @nkukard! I don't see a reason to hold this PR up any longer.
LGTM
Proposed Commit Message
Additional Context
So the problem is that, with a relatively simple configuration, it is impossible to create a device route. Take this snippet for an example...
The result of this is ...
Not so great if you understand that the resulting configuration will route 169.254.1.1 and fe80::1 via 192.168.254.254 and fec0::ffff respectively.
A comparable manual operation would be...
Maybe this doesn't sound like a problem, but it prevents a configuration like this from being created...
My solution is to track each [Route] section individually by using a dictionary; we prefix statics with 'a' and routes with 'r' so they sort later, which makes the output look a bit better. We keep each route's respective key-value pairs in the same section so we do not pollute one section with another section's tags.
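The keying scheme described above can be sketched in Python. This is a simplified illustration of the idea, not the actual cloud-init code; the helper name and structure are hypothetical:

```python
# Sketch of the route-ID keying scheme: each [Route] section is tracked
# under a unique string key so key-value pairs for different routes never
# land in the same section. (Hypothetical helper, not cloud-init's code.)

def build_route_sections(gateways, routes):
    """Return {route_id: {key: value}}, one inner dict per [Route] section.

    Static gateway entries get an 'a' prefix and explicit routes an 'r'
    prefix, so sorting the keys places the default-gateway sections first.
    """
    sections = {}
    for i, gw in enumerate(gateways):
        sections["a%d" % i] = {"Gateway": gw}
    for i, route in enumerate(routes):
        section = {"Destination": route["to"]}
        if "via" in route:  # no 'via' key means a device route
            section["Gateway"] = route["via"]
        sections["r%d" % i] = section
    return sections

# Render each tracked route as its own [Route] section.
sections = build_route_sections(
    ["192.168.254.254", "fec0::ffff"],
    [{"to": "169.254.1.1/32"}, {"to": "fe80::1/128"}],
)
for rid in sorted(sections):
    print("[Route]")
    for key, value in sections[rid].items():
        print(f"{key}={value}")
    print()
```

Sorting the 'a'/'r'-prefixed keys is what gives the gateway sections first and the static routes after, matching the rendered output in the test above.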
I have modified one test case to cater for multiple route sections; all other tests pass. I do not believe additional documentation is necessary, as in the absence of a "via" key the route should be assumed to have no gateway IP and be a device route.
This results in the following configuration...
Maybe it still doesn't make much sense and one is sitting with a 🤔.
Let us take a technical example where a cloud provider sets up 169.254.1.1/32 and fe80::1/128 on the LAN side of containers and virtual machines. The reason for this is that if every container/VM has a single default gateway, they can be migrated between DCs and regions without the need for any client configuration. This also bypasses the requirement to use private ranges on the LAN and NAT on their network.
The only purpose of these /32 and /128 IPs is to resolve a MAC for the traffic to be routed to. The cloud provider routes IP traffic in the other direction in a similar manner, by locking the VM/container MAC to an IP and adding this route to their LAN side.
The configuration to get this working would be similar to this...
This is only possible if we can create a device route in cloud-init with a configuration similar to this with my patch...
This does not currently work and results in all the [Route] fields being mangled into a single [Route] section.
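To illustrate the mangling, folding every route's keys into one section would produce something along these lines (a reconstruction for illustration, not copied from an actual run):

```ini
[Route]
Gateway=192.168.254.254
Gateway=fec0::ffff
Destination=169.254.1.1/32
Destination=fe80::1/128
```

Since a [Route] section describes a single route, each gateway or destination needs its own section, as in the V2_CONFIG_MULTI_SUBNETS_RENDERED expectation in the test above.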
Checklist: