libnet/i/defaultipam: use ULA prefix by default #47853
Conversation
Why does libnetwork need to know about the host ID? Why not have the daemon configure the default address pools, augmenting the user config as needed?
Until this commit, the default local address pool was initialized by the defaultipam driver if none was provided by libnet / the daemon. Now, defaultipam errors out if none is passed, and the daemon is instead made responsible for initializing it with the default values if the user doesn't set the related config parameter. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
```diff
-	netOptions, err := daemon.networkOptions(cfg, daemon.PluginStore, activeSandboxes)
+	netOptions, err := daemon.networkOptions(cfg, daemon.PluginStore, daemon.id, activeSandboxes)
```
Not sure we should depend on `daemon.id` here, or at least; what's the requirement for the ID here, and is it ok for this ID to change?

The `daemon.id` used to be a cryptographic key for signing (images schema v1), but is no longer used for that. Some systems used it for identifying daemons (and ISTR "Docker EE" / "UCP" used it for that purpose), so it was replaced by default with a generated UUID that's stored in a file in `/var/lib/docker`. HOWEVER, the content of that file (and thus the "ID") is treated as an opaque value (it's returned in `docker info`, but not used elsewhere, although we may abuse it in some integration tests), but the file is allowed to be set by the user (and as such could be "anything").

There's currently no contract whatsoever for it to be bound to any lifecycle (it's stored in `/var/lib/docker`, so in most situations it won't change unless `/var/lib/docker` is wiped, but it's not strictly defined to not change).

Note that swarm nodes do have a cryptographic ID, but that's part of the swarm cluster, and that one is not controllable by the user.
> Not sure we should depend on `daemon.id` here, or at least; what's the requirement for the ID here, and is it ok for this ID to change?
FWIW, my 2nd commit's message:
> This change generates a ULA base network by deriving a ULA Global ID
> from the Engine's Host ID and put that base network into
> 'default-address-pools'. This Host ID is stable over time (except if
> users remove their '/var/lib/docker/engine-id') and thus the GID is
> stable too.
This is loosely based on https://datatracker.ietf.org/doc/html/rfc4193#section-3.2.2.
The nature of what's passed to `deriveULABaseNetwork` doesn't matter, but it has to meet two requirements: 1. it should be fairly stable over time (eg. it's no big deal if it changes exceptionally); 2. it should be generated from a random source that provides enough entropy to make it reasonably unique across nodes.
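To make those two requirements concrete, here is a minimal sketch of what an RFC 4193 §3.2.2-style derivation could look like: hash a stable host identifier and keep 40 bits as the ULA Global ID under `fd00::/8`. This is an illustration only, not the actual Moby implementation (the real function and its exact hashing scheme may differ).

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"net/netip"
)

// deriveULABaseNetwork is a hypothetical sketch: it hashes an opaque,
// stable host ID and uses 40 bits of the digest as the ULA Global ID,
// yielding a stable fdxx:xxxx:xxxx::/48 base network per host.
func deriveULABaseNetwork(hostID string) netip.Prefix {
	sum := sha256.Sum256([]byte(hostID))
	var addr [16]byte
	addr[0] = 0xfd           // fc00::/7 with the L bit set => fd00::/8
	copy(addr[1:6], sum[:5]) // 40-bit Global ID taken from the digest
	return netip.PrefixFrom(netip.AddrFrom16(addr), 48)
}

func main() {
	// Same seed in, same /48 out — that is the stability property.
	fmt.Println(deriveULABaseNetwork("4916415c-4d48-4d72-8f4f-2b2dfe8e4e1c"))
}
```

Any seed with enough entropy works; the hash spreads it over the 40-bit Global ID space so two independently-seeded hosts are unlikely to collide.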
> HOWEVER the content of that file (and thus the "ID") is treated as an opaque value

I get that it's opaque for most of the code, but I think it shouldn't be for the `Daemon` struct, as it's what ultimately owns it. Said another way, I think it's totally reasonable for the daemon to expect it to have some level of randomness.

Since `deriveULABaseNetwork` doesn't care about its nature / source beyond what I described above ⬆️, it doesn't break the ID's opaqueness.
> so it was replaced by default with a generated UUID

It's currently a UUIDv4. The hot new RFC 9562 defines UUIDv4 in its Section 5.4 and points to its Section 6.9 "for guidelines on random data generation". That section says:

> Implementations SHOULD utilize a cryptographically secure pseudorandom number generator (CSPRNG)

Looking at github.com/google/uuid, the UUID generator uses `crypto/rand.Reader` under the hood:

- https://github.com/google/uuid/blob/6e10cd1027e225e3ad7bfcc13c896abd165b02ef/version4.go#L41
- https://github.com/google/uuid/blob/6e10cd1027e225e3ad7bfcc13c896abd165b02ef/uuid.go#L40

And `crypto/rand.Reader` is a CSPRNG.
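For reference, what github.com/google/uuid does there is essentially the following (a simplified sketch, not the library's actual code): read 16 bytes from `crypto/rand.Reader`, then stamp in the version and variant bits.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 sketches UUIDv4 generation per RFC 9562 §5.4:
// 128 random bits from a CSPRNG, with the version nibble set to 4
// and the variant bits set to 10xx.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil { // crypto/rand.Reader
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 9562 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // e.g. xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
}
```

So the engine ID's default value carries ~122 bits of CSPRNG entropy — far more than the 40 bits the ULA Global ID needs.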
> but the file is allowed to be set by the user (and as such could be "anything").

Trusting kapa.ai, this file is described nowhere in the docs, and AFAIK we always told users they shouldn't touch anything in `/var/lib/docker` and `/var/run/docker`. If users want to mess with this file, then fine -- garbage in, garbage out.
> There's currently no contract whatsoever for it to be bound to any lifecycle (it's stored in /var/lib/docker so in most situations it won't change unless /var/lib/docker is wiped, but it's not strictly defined to not change).

Since it's an implementation detail, I fail to see what specification is needed here to clarify its lifecycle beyond the code itself. However, I could properly document expectations for this value by adding some GoDoc around it and its source.
I think everything is fine here.
Yes, the default is a UUID now, but the engine-id can be manually set through the `engine-id` file;

```console
echo "my-engine-id" > /var/lib/docker/engine-id
dockerd
docker info --format '{{.ID}}'
my-engine-id
```
I guess my thinking here is that we'll be using the ID for a new purpose, and if we want something unique, we should consider producing a separate file for this: a similar file, but specifically for networking. libnetwork already has a directory for its own state;

```console
/ # ls /var/lib/docker/network/
files
/ # ls /var/lib/docker/network/files/
local-kv.db
```
Or, if "a random source that provides enough entropy to make it reasonably unique across nodes" means it's relevant for swarm nodes, and only relevant for those (?), this could even be the actual swarm node's id;

```console
docker swarm init
Swarm initialized: current node (kf26e4om953f4cnrhufgph746) is now a manager.
```

But of course that won't work if the node is not a swarm node 😅
> Yes, the default is a UUID now, but the engine-id can be manually set through the engine-id file;

Sure it can, as much as one can open `local-kv.db` and do whatever they want there. But why would any user do that? Unless kapa.ai is lying to me, it's not something documented.

And while this provides enough entropy to comply with https://datatracker.ietf.org/doc/html/rfc4193#section-3.2.2, there's still a really low probability that someone, somewhere will end up with a base ULA prefix conflicting with something on their network. Having the source host ID stored in a separate file provides an escape hatch for such a situation. In that case, we could tell them to delete the `engine-id` file and restart their daemon (but not to write it themselves).
> Or, if the "a random source that provides enough entropy to make it reasonably unique across nodes." means it's relevant for swarm nodes, and only relevant for those (?) this could even be the actual swarm node's id;

Nope, this change adds the ULA base network to the 'local' address space, which isn't used by Swarm. This change only targets standalone Docker Engine.
> But why would any user do that? Unless kapa.ai is lying to me, it's not something documented.

It's really hard to assume it's not used; at least I recall Docker EE depending on some of this, and it may even have been setting custom identifiers; even to the extent that we added a config for this, e428c82, and later had to revert a change that changed behavior; f695e98
As the commit message states, the revert was necessary in large part because the engine ID was changing across upgrades. That was solved once and for all in #43555 by migrating the ID. Aside from that, Classic Swarm depended on Engine IDs being unique across all nodes. Mirantis Container Runtime does not care about engine ID in the slightest; I checked all the patches. And I could find no evidence that Mirantis Kubernetes Engine (the product formerly known as Docker EE) cares about the engine ID, either. I have found mentions that Docker EE had migrated to using the Swarm Node ID values in place of the engine IDs way back in ~2017. There is limited evidence engine identifiers are being used these days, and no evidence that anyone is setting custom engine identifiers.
Keep in mind that no escape hatch is necessary if the generated ULA prefix is unsuitable for a given user for whatever reason. The generated prefix is merely the implicit default. The user always has the option to override the default by configuring the daemon to allocate subnets from any IPv6 address pool of their choosing. The default just has to be good enough for the common case. A good enough default ULA prefix is stable across daemon restarts, upgrades, backup and restore, and is reasonably likely to not collide with other daemons. The Engine ID has those properties, which makes it an ideal seed to derive the ULA prefix from.
What's the problem if a user does customize the engine ID? We are just using the engine ID as an opaque seed value. A valid and probabilistically unique ULA prefix will be derived from any unique seed. It is only a potential problem if the engine IDs are non-unique, when containers from multiple non-unique engines are attached to the same physical network using the macvlan or ipvlan drivers. If someone does require multiple non-unique custom engine IDs with IPv6 macvlan/ipvlan containers to coexist in the same physical LAN, they can configure an explicit IPv6 address pool on each node, explicitly configure the address range of the ipvlan/macvlan network, set the IP addresses of containers explicitly, or take advantage of any of the other configuration knobs to override the implicit default they made problematic through their own actions.
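To make the override path concrete: the generated ULA prefix is only used when the user hasn't configured the pools themselves, and configuring them is a one-line daemon setting. Something like the following in `/etc/docker/daemon.json` (the pool values here are illustrative) replaces the implicit default entirely:

```json
{
  "default-address-pools": [
    { "base": "172.17.0.0/16", "size": 24 },
    { "base": "fd00:cafe::/48", "size": 64 }
  ]
}
```

With an explicit IPv6 base network in `default-address-pools`, the engine-id-derived prefix never comes into play.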
Deriving the generated ULA prefix from the engine-id has a distinct advantage over something specific to networking: there is a single source of truth for engine identity for users to re-roll when needed. See, for example, #13278. Users only need to delete `/var/lib/docker/engine-id` before taking a VM snapshot or capturing a disk image to produce a system image which retains all the configuration and state (volumes, networks, containers, image cache, etc.) but results in each cloned instance having a distinct identity. If engine identity were scattered across multiple state files (or worse, `local-kv.db`), some Sysprep-like tool would be necessary to reliably generalize an engine's state.
Forgot to reel back to this one; we discussed this in the maintainers call, and discussed the potential risks; looks like this should not be problematic for scenarios we discussed, so no objections from me.
So far, Moby only had IPv4 prefixes in its 'default-address-pools'. To get dynamic IPv6 subnet allocations, users had to redefine this parameter to include IPv6 base network(s). This is needlessly complex and against Moby's 'batteries-included' principle.

This change generates a ULA base network by deriving a ULA Global ID from the Engine's Host ID and puts that base network into 'default-address-pools'. This Host ID is stable over time (except if users remove their '/var/lib/docker/engine-id') and thus the GID is stable too.

This ULA base network won't be put into 'default-address-pools' if users have manually configured it.

This is loosely based on https://datatracker.ietf.org/doc/html/rfc4193#section-3.2.2.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
- What I did
So far, Moby only had IPv4 prefixes in its 'default-address-pools'. To get dynamic IPv6 subnet allocations, users had to redefine this parameter to include IPv6 base network(s). This is needlessly complex and against Moby's 'batteries-included' principle.
This change generates a ULA base network by deriving a ULA Global ID from the Engine's Host ID and puts that base network into 'default-address-pools'. This Host ID is stable over time (except if users remove the file '/var/lib/docker/engine-id') and thus the GID is stable too.
This ULA base network won't be put into 'default-address-pools' if users
have manually configured it.
This is loosely based on https://datatracker.ietf.org/doc/html/rfc4193#section-3.2.2.
- How to verify it
CI -- a unit test has been added to make sure the GID is stable over time.
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)