machine-id is not reset when instance-id changes #4066
Comments
Launchpad user Robie Basak(racb) wrote on 2023-01-17T21:18:27.344924+00:00 While experimenting with this, I found that systemd-networkd uses /etc/machine-id to determine the DHCP client identifier, and dnsmasq reissues the same lease if the client identifier is the same. So starting two cloud images using libvirt with its dnsmasq DHCP support from the same "golden image", without cloud-init resetting /etc/machine-id, results in an IP conflict between those two VMs.
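To illustrate the collision described above, here is a minimal toy model, not the real systemd-networkd or dnsmasq logic. The names `client_id`, `ToyDhcpServer`, and `APP_KEY` are hypothetical, and HMAC-SHA256 is only a stand-in for whatever derivation systemd-networkd actually performs. The point is simply that any deterministic function of the machine-id yields the same identifier on every clone, so a server keyed on that identifier hands out the same address twice.

```python
import hashlib
import hmac

# Hypothetical key: a stand-in for an application-specific derivation,
# NOT the key or hash that systemd-networkd really uses.
APP_KEY = b"example-dhcp-client-id"


def client_id(machine_id: str) -> str:
    """Derive a stable identifier from a machine-id (illustrative only)."""
    digest = hmac.new(APP_KEY, machine_id.strip().encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


class ToyDhcpServer:
    """Keeps one lease per client identifier, like a simple lease table."""

    def __init__(self, prefix: str = "192.168.122."):
        self.prefix = prefix
        self.leases = {}

    def request(self, cid: str) -> str:
        # A known identifier gets its existing lease back.
        if cid not in self.leases:
            self.leases[cid] = f"{self.prefix}{len(self.leases) + 2}"
        return self.leases[cid]


golden_machine_id = "0123456789abcdef0123456789abcdef"
fresh_machine_id = "fedcba9876543210fedcba9876543210"

server = ToyDhcpServer()
vm_a = server.request(client_id(golden_machine_id))
vm_b = server.request(client_id(golden_machine_id))  # clone of the same image
vm_c = server.request(client_id(fresh_machine_id))   # machine-id was reset

print(vm_a, vm_b, vm_c)
```

In this model the two clones (`vm_a`, `vm_b`) receive the same address, while the VM with a regenerated machine-id (`vm_c`) receives a distinct one.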
Launchpad user Brett Holman(holmanb) wrote on 2023-01-19T05:08:26.484986+00:00 Agreed, automating this boot-time step seems ideal from a user experience and identity correctness perspective. Resetting machine-id is currently expected to be done by the image builder at build time. Taking responsibility for this behavior at runtime carries risk that will need to be evaluated and mitigated prior to introduction. This would require all systemd services that use machine-id to be ordered after (or potentially restarted after, if already started) whichever cloud-init service would be responsible for this behavior. If this behavior is expected to be default in upstream cloud-init, risk is multiplied across distros, since each distro may have different services and ordering. Also note that resetting machine-id at runtime may cause a slower boot by forcing delayed ordering of services.
Launchpad user Brett Holman(holmanb) wrote on 2023-01-19T21:14:36.810321+00:00 Resetting machine-id at runtime would be a pretty big break from current expectations, and correct implementation would require foreknowledge of services using machine-id that are provided in an image. The potential for bugs due to implementation complexity, potential for boot speed regression caused by services delaying until after machine-id is reset, and expected future burden of such a feature due to changes in services and variation in Ubuntu and other distros makes the perceived risk of this feature outweigh the benefit. These complexity, risk, and potential boot speed issues are not present when machine-id is correctly set at boot time, so I'm hesitant to move forward with this request. I'll mark this "Won't Fix" for now. In the meantime, I'd like to point users experiencing the same issue towards our build recommendation[1], specifically the --machine-id option. [1] https://cloudinit.readthedocs.io/en/latest/reference/cli.html#clean
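For readers who want to see what the build-time step amounts to, here is a minimal sketch. It assumes, based on my reading of the cloud-init CLI documentation, that `cloud-init clean --machine-id` sets /etc/machine-id to the literal string `uninitialized` so that systemd generates a fresh ID on the next boot. The helper name `reset_machine_id` is hypothetical and this is not cloud-init's own code. The demonstration writes to a temporary file rather than the real /etc/machine-id.

```python
import tempfile
from pathlib import Path


def reset_machine_id(path: str = "/etc/machine-id") -> None:
    """Mark the machine-id file as uninitialized (illustrative sketch).

    Assumption: systemd treats the content "uninitialized" as a request
    to generate a new machine-id during the next boot.
    """
    Path(path).write_text("uninitialized\n")


# Demonstrate against a throwaway file instead of the real system file.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "machine-id"
    demo_file.write_text("0123456789abcdef0123456789abcdef\n")
    reset_machine_id(str(demo_file))
    result = demo_file.read_text()

print(repr(result))
```

On a real image build, running `cloud-init clean --machine-id` as the final step before capturing the image is the supported route; the sketch only shows the effect on the file.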
Launchpad user Chad Smith(chad.smith) wrote on 2023-01-20T00:27:36.730359+00:00 "it's expected that cloud-init will ensure that machine-id is not carried over when a VM is cloned and this is detectable by an instance-id change." I'm not sure that statement above is wholly correct. The instance-id delta is triggered in more cases than just a clone and first instance boot event. In recent history (~5 years), some clouds trigger instance-id changes for the following events to force cloud-init to reperform all configuration on next boot (or sometimes hotplug NIC configuration):
Here is systemd's documented stance on machine-id changes per man machine-id: "The machine ID does not change based on local or network configuration" Trying to fold /etc/machine-id regeneration into every instance-id change for cloud-init will be tough to support until we have:
The reason for #2 is that cloud-init is only able to detect instance metadata after the network is already active on the system, and restarting systemd-networkd later in boot is more likely to expose a number of other racy problems. We may take a look at this further, but the conditions under which we want cloud-init to magically regenerate /etc/machine-id and cope with systemd ordering/costs would need to be limited in scope to avoid triggering other concerns.
Launchpad user Brett Holman(holmanb) wrote on 2023-03-13T16:55:28.868455+00:00 A couple of related details regarding machine-id induced IP collisions: The duplicate IP caused by duplicate machine-id will not happen with NetworkManager in Focal and later (NetworkManager versions >1.15) by default due to this change[1]. It is still possible to trigger it by setting SystemD is unlikely to follow NetworkManager[2], because doing so would only mask the bigger problem (duplicate machine-id on multiple machines). Therefore, the duplicate IP symptom is limited to distros using systemd-networkd; however, the underlying machine-id issue affects all distros. [1] https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/commit/cfd696cc3cf43f5f510046b757949546bcee4cdc
This bug was originally filed in Launchpad as LP: #2003121
Launchpad details
Launchpad user Robie Basak(racb) wrote on 2023-01-17T19:52:55.708270+00:00
As discussed in #ubuntu-server just now, it's expected that cloud-init will ensure that machine-id is not carried over when a VM is cloned and this is detectable by an instance-id change.
This would align behaviour with ssh host key regeneration behaviour.
Actual behaviour: currently if a VM is cloned and the instance-id changes, /etc/machine-id remains the same.
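One way to confirm the reported behaviour across a fleet is to collect each VM's instance-id and the content of its /etc/machine-id, then look for machine-ids that appear under more than one instance-id. The sketch below assumes that data has already been gathered into a dictionary; the function name `find_shared_machine_ids` and the sample values are hypothetical.

```python
from collections import defaultdict


def find_shared_machine_ids(instances: dict) -> dict:
    """Return machine-ids that are used by more than one instance.

    `instances` maps instance-id -> machine-id file content.
    """
    by_machine_id = defaultdict(list)
    for instance_id, machine_id in instances.items():
        # Strip the trailing newline that the file normally carries.
        by_machine_id[machine_id.strip()].append(instance_id)
    return {
        mid: sorted(ids) for mid, ids in by_machine_id.items() if len(ids) > 1
    }


# Made-up sample data: two clones kept the golden image's machine-id.
sample = {
    "i-clone-1": "0123456789abcdef0123456789abcdef\n",
    "i-clone-2": "0123456789abcdef0123456789abcdef\n",
    "i-rebuilt": "fedcba9876543210fedcba9876543210\n",
}

shared = find_shared_machine_ids(sample)
print(shared)
```

Any non-empty result indicates images that were cloned without the machine-id being reset at build time.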