Can't find proper metadata source IP - Interoperability problem with CentOS8/Stream, NetworkManager and Apache CloudStack #3839

Open
ubuntu-server-builder opened this issue May 12, 2023 · 4 comments
Labels
bug Something isn't working correctly launchpad Migrated from Launchpad

Comments

@ubuntu-server-builder
Collaborator

This bug was originally filed in Launchpad as LP: #1915216

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2021-02-09T23:35:27.906402+00:00
date_fix_committed = None
date_fix_released = None
id = 1915216
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1915216
milestone = None
owner = jdoe666
owner_name = Peter M.
private = False
status = confirmed
submitter = jdoe666
submitter_name = Peter M.
tags = []
duplicates = []

Launchpad user Peter M.(jdoe666) wrote on 2021-02-09T23:35:27.906402+00:00

System environment: Apache CloudStack 4.11; KVM zone

Both CentOS 8 and CentOS Stream use NetworkManager. The cloud-init version currently packaged there is 20.3-9.el8.

We are talking about the code of the CloudStack datasource.

On our CentOS test systems, we observe that cloud-init falls through to the default_gateway() method and returns the VR IP address 192.xxx.xxx.1. This is wrong: that IP does not serve metadata. For comparison, an Ubuntu 20.04 instance deployed on the same network resolves to 192.xxx.xxx.5.

This IP can be found under /run/NetworkManager:

./NetworkManager/resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/no-stub-resolv.conf:nameserver 192.xxx.xxx.5
./NetworkManager/devices/2:next-server=192.xxx.xxx.5
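As a rough illustration of what reading that value could look like, here is a minimal sketch. It assumes the device state file holds plain "key=value" lines, as the grep output above suggests. The function name `find_next_server` and the sample text are hypothetical, and 192.0.2.5 is a placeholder address. This is not something cloud-init currently does.

```python
import re
from typing import Optional

# Assumption: the state file contains plain "key=value" lines,
# one of which may be "next-server=<ip>".
_NEXT_SERVER_RE = re.compile(r"^\s*next-server=(\S+)\s*$", re.MULTILINE)


def find_next_server(state_text: str) -> Optional[str]:
    """Return the first non-empty next-server value, or None if absent."""
    match = _NEXT_SERVER_RE.search(state_text)
    return match.group(1) if match else None


# Made-up sample content; the real file layout may differ.
sample = "managed=1\nnext-server=192.0.2.5\n"
next_server = find_next_server(sample)
```

On a real guest, the text would come from reading a file under /run/NetworkManager/devices/, whose exact layout should be checked before relying on it.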

Although the CloudStack datasource tries several approaches to find the IP, the code does not seem to handle the case where NetworkManager is in use.

What happens instead:

  • First, it tries the data-server DNS entry. This depends on our setup; we will try it out as well.
  • Then it looks in the DHCP lease file location "/run/systemd/netif/leases". For some reason, this value is hardcoded in net/dhcp.py: NETWORKD_LEASES_DIR = '/run/systemd/netif/leases'
  • Then it finds the lease file /var/lib/NetworkManager/internal-ea2b5464-7c5e-3243-aa40-7d77805f41ee-ens3.lease, but unlike what we see on Ubuntu, it contains just one line, "ADDRESS=192.xxx.xxx.34". I am not sure why this file does not also contain the expected entry "SERVER_ADDRESS=192.xxx.xxx.5".
  • Finally, it falls back to the default gateway method.
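The fallback order described above can be sketched roughly as follows. This is a simplified model, not the actual cloud-init code: the helper names `server_from_lease` and `first_address` are hypothetical, the lease format is assumed to be "KEY=value" lines as quoted in this report, and the 192.0.2.x addresses are placeholders.

```python
import re
from typing import Callable, Iterable, Optional


def server_from_lease(lease_text: str) -> Optional[str]:
    """Return the SERVER_ADDRESS value from a KEY=value lease, if present."""
    match = re.search(r"^SERVER_ADDRESS=(\S+)$", lease_text, re.MULTILINE)
    return match.group(1) if match else None


def first_address(
    finders: Iterable[Callable[[], Optional[str]]]
) -> Optional[str]:
    """Call each finder in order and return the first non-empty result."""
    for finder in finders:
        address = finder()
        if address:
            return address
    return None


# A lease like the one described here has only an ADDRESS line, so the
# lease step yields nothing and the chain ends at the gateway.
centos_lease = "ADDRESS=192.0.2.34\n"
chosen = first_address([
    lambda: None,                             # "data-server" lookup failed
    lambda: server_from_lease(centos_lease),  # lease lacks SERVER_ADDRESS
    lambda: "192.0.2.1",                      # default gateway (placeholder)
])
```

With a lease that does include a SERVER_ADDRESS line, the second step would return that address and the gateway fallback would never be reached.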

Would you say this is a bug, or a missing feature for interoperability with NetworkManager (in the sense that cloud-init does not look under /run/NetworkManager/)?

@ubuntu-server-builder ubuntu-server-builder added bug Something isn't working correctly launchpad Migrated from Launchpad labels May 12, 2023
@ubuntu-server-builder
Collaborator Author

Launchpad user Peter M.(jdoe666) wrote on 2021-02-10T00:20:36.209793+00:00

P.S. I also asked at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/658

@ubuntu-server-builder
Collaborator Author

Launchpad user James Falcon(falcojr) wrote on 2021-02-10T20:18:10.540098+00:00

Based on the process you've laid out, as well as the documentation (http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.8/virtual_machines/user-data.html), it looks like the metadata service should be at the same IP as a DHCP server, which explains the steps taken. All the steps taken are various ways to determine your DHCP server, while falling back to your current gateway.

I'm not sure what is unique about your setup that these steps aren't working, however, checking "resolv.conf" isn't a valid solution. While it's true that a DHCP and DNS server may often reside at the same IP, that isn't guaranteed to be the case, and in most cases checking DNS is "more wrong" than inspecting DHCP leases.

Is the data-server DNS entry not working for you?
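One quick way to answer that from inside the guest is to try resolving the name directly. The sketch below is an assumption-laden check, not cloud-init's own lookup: the function name `resolve_data_server` is hypothetical, and the `resolver` parameter exists only so the lookup can be swapped out when no network is available.

```python
import socket
from typing import Callable, Optional


def resolve_data_server(
    hostname: str = "data-server",
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Optional[str]:
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None
```

Running `print(resolve_data_server())` on the affected instance prints an address if the entry resolves and `None` otherwise.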

@ubuntu-server-builder
Collaborator Author

Launchpad user Dave(livegrenier) wrote on 2021-09-03T19:58:38.161908+00:00

Hello,

I am seeing the same problem under CloudStack 4.15 + Xen when using a shared network. Because I am using a shared network, the DHCP server is not the same as the gateway, so cloud-init ends up failing, with the logs showing that it is trying to use the gateway to fetch the metadata.

I see the same behaviour on CentOS 8 and Rocky Linux.

I have also tried changing the NETWORKD_LEASES_DIR setting but did not have any luck. I am happy to provide more information or try any workarounds if someone can help.

Thanks.

Regards.

Dave

@ubuntu-server-builder
Collaborator Author

Launchpad user Dave(livegrenier) wrote on 2021-10-13T07:56:45.746799+00:00

Hi,

Please let me know if I can provide any more information to help troubleshoot this problem.

Thanks.
