sysconfig: NM_CONTROLLED=False should not be set on RHEL8 #3781

Closed
ubuntu-server-builder opened this issue May 12, 2023 · 13 comments · Fixed by #5089
Labels
bug Something isn't working correctly launchpad Migrated from Launchpad

Comments

@ubuntu-server-builder
Collaborator

This bug was originally filed in Launchpad as LP: #1894837

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2020-09-08T12:21:36.599977+00:00
date_fix_committed = None
date_fix_released = None
id = 1894837
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1894837
milestone = None
owner = amansi26
owner_name = Aman Kumar Sinha
private = False
status = triaged
submitter = amansi26
submitter_name = Aman Kumar Sinha
tags = []
duplicates = []

Launchpad user Aman Kumar Sinha(amansi26) wrote on 2020-09-08T12:21:36.599977+00:00

Environment Details:
Management Control Plane : OpenStack (Ussuri Release)
cloud-init version : 19.1 (community)
Data Source : Config Drive
OS/platform of deployed VM : RHEL 8.2

I am using cloud-init v19.1 where the control plane (OpenStack nova service) passes information (data source) via configdrive during VM deployment.

On a RHEL 8.2 VM deployed from the above environment, the IPv4 interfaces do not come up. This happens only when NM_CONTROLLED is set to no in the interface files. The value is set in the cloud-init source code at the line below:

https://github.com/canonical/cloud-init/blob/stable-19.4/cloudinit/net/sysconfig.py#L275

we are setting NM_CONTROLLED = no using the code
    iface_defaults = tuple([
        ('ONBOOT', True),
        ('USERCTL', False),
        ('NM_CONTROLLED', False),
        ('BOOTPROTO', 'none'),
        ('STARTMODE', 'auto'),
    ])

in the file [1]. As a result, NetworkManager is not able to manage the interfaces.

[1] cloudinit/net/sysconfig.py

When the above code is updated to set NM_CONTROLLED to True, the IPv4 interfaces come up fine.

In short, changing the entry to ('NM_CONTROLLED', True) fixes the issue.
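To illustrate why that single tuple entry matters, here is a minimal sketch of how boolean defaults of this kind can end up as yes/no values in an ifcfg file. The `render_ifcfg` helper is hypothetical and is not cloud-init's actual renderer; only the `iface_defaults` entries are taken from the report above.

```python
# Hypothetical helper for illustration only; NOT cloud-init's real renderer.
# The entries below are copied from the report above.
iface_defaults = (
    ("ONBOOT", True),
    ("USERCTL", False),
    ("NM_CONTROLLED", False),
    ("BOOTPROTO", "none"),
    ("STARTMODE", "auto"),
)


def render_ifcfg(defaults):
    """Render (key, value) pairs as sysconfig-style KEY=value lines."""
    lines = []
    for key, value in defaults:
        # Booleans are written as yes/no, other values are written as-is.
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"{key}={value}")
    return "\n".join(lines)


rendered = render_ifcfg(iface_defaults)
print(rendered)
```

With the tuple as reported, the output contains the line NM_CONTROLLED=no, which is the setting this bug is about.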

@ubuntu-server-builder ubuntu-server-builder added bug Something isn't working correctly launchpad Migrated from Launchpad labels May 12, 2023
@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2020-09-08T16:42:45.304727+00:00

Please add cloud-init logs to this bug so we can triage a bit more.

Logs can be obtained running cloud-init collect-logs on the vm and attaching the tar.gz to this bug.

Cloud-init specifically avoids trying to render network configuration for network-manager based interfaces as recommended in RHEL6 docs, but if RHEL8 needs to differ some documentation links around that would be helpful.

@ubuntu-server-builder
Collaborator Author

Launchpad user Chad Smith(chad.smith) wrote on 2020-09-08T16:43:20.153531+00:00

Feel free to mark this bug back to 'New' state at the top when you have a chance to respond so that we know it needs review.

@ubuntu-server-builder
Collaborator Author

Launchpad user Aman Kumar Sinha(amansi26) wrote on 2020-09-08T17:12:14.696156+00:00

RHEL8 documentation: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/pdf/configuring_and_managing_networking/Red_Hat_Enterprise_Linux-8-Configuring_and_managing_networking-en-US.pdf
Launchpad attachments: cloud-init logs

@ubuntu-server-builder
Collaborator Author

Launchpad user Aman Kumar Sinha(amansi26) wrote on 2020-09-09T16:39:00.986744+00:00

IRC chat regarding this defect: https://irclogs.ubuntu.com/2020/09/08/%23cloud-init.html

@ubuntu-server-builder
Collaborator Author

Launchpad user Aman Kumar Sinha(amansi26) wrote on 2020-09-10T07:37:16.417381+00:00

I validated this with cloud init provided by RHEL8.2 [cloud-init-19.4-1.el8.7.noarch] and I don't see NM_CONTROLLED in the code under cloudinit/net/sysconfig.py

iface_defaults = tuple([
    ('ONBOOT', True),
    ('USERCTL', False),
    ('BOOTPROTO', 'none'),
    ('STARTMODE', 'auto'),
])

As a result, NM_CONTROLLED is not written to the interface file, and its value defaults to yes.

Interface file looks like:

[root@rhle20 cloudinit]# cat /etc/sysconfig/network-scripts/ifcfg-env32

# Created by cloud-init on instance boot automatically, do not edit.

BOOTPROTO=none
DEFROUTE=yes
DEVICE=env32
GATEWAY=xxx.xxx.xxx.xxx
HWADDR=fa:c9:e6:d2:01:20
IPADDR=xxx.xxx.xxx.xxx
MTU=1500
NETMASK=xxx.xxx.xxx.xxx
ONBOOT=yes
STARTMODE=auto
TYPE=Ethernet
USERCTL=no
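A quick way to check a rendered file like the one above is to parse it and look for the key. The sketch below is an illustration only: `parse_ifcfg` and `nm_controlled` are hypothetical helpers, and treating a missing NM_CONTROLLED as "yes" is an assumption based on the statement in this comment.

```python
def parse_ifcfg(text):
    """Parse KEY=value lines, skipping blanks, comments and other text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def nm_controlled(settings):
    """Return True unless the file explicitly says NM_CONTROLLED=no.

    Assumption (from this thread): a missing key is treated as "yes".
    """
    return settings.get("NM_CONTROLLED", "yes").lower() != "no"


# Shortened copy of the interface file shown above.
sample = """\
# Created by cloud-init on instance boot automatically, do not edit.
BOOTPROTO=none
DEVICE=env32
ONBOOT=yes
USERCTL=no
"""

print(nm_controlled(parse_ifcfg(sample)))
```

Appending a line NM_CONTROLLED=no to the sample makes the same check report False, which matches the failing community build described earlier.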

@ubuntu-server-builder
Collaborator Author

Launchpad user Aman Kumar Sinha(amansi26) wrote on 2020-09-10T08:15:57.557101+00:00

Launchpad attachments: RHEL8.2 provided cloud-init v19.4

@ubuntu-server-builder
Collaborator Author

Launchpad user Divya K Konoor(dikonoor) wrote on 2020-09-10T08:16:49.359982+00:00

So with cloud-init 19.4 version shipped as part of RHEL 8.2.1, we see a difference in code compared to what is found at https://github.com/canonical/cloud-init/blob/stable-19.4/cloudinit/net/sysconfig.py#L275. Due to this we are not able to reproduce this problem with RHEL cloud-init but can reproduce it with community cloud-init.

@ubuntu-server-builder
Collaborator Author

Launchpad user Eduardo Otubo(otubo) wrote on 2020-09-10T12:45:39.182328+00:00

This makes sense to RHEL (not sure about SLES) because we do use NM to control interface configuration. We have a downstream-only patch to fix this - that's why you can't reproduce with RHEL provided cloud-init. But we simply remove the line instead of setting it to True.

diff --git a/cloudinit/net/sysconfig.py b/cloudinit/net/sysconfig.py
index 310cdf01..8bd7e887 100644
--- a/cloudinit/net/sysconfig.py
+++ b/cloudinit/net/sysconfig.py
@@ -272,7 +272,6 @@ class Renderer(renderer.Renderer):
     iface_defaults = tuple([
         ('ONBOOT', True),
         ('USERCTL', False),
-        ('NM_CONTROLLED', False),
         ('BOOTPROTO', 'none'),
         ('STARTMODE', 'auto'),
     ])

Not sure if cloud-init is designed to work with NM, perhaps that's why the default configuration upstream is NM_CONTROLLED=False.

@ubuntu-server-builder
Collaborator Author

Launchpad user Paride Legovini(paride) wrote on 2020-09-18T13:00:09.182699+00:00

Hi, I will try to recap:

  • cloud-init explicitly sets NM_CONTROLLED=False when running on RHEL. A comment in the code shows that this was done when using RHEL 6 as a reference, which I suppose wasn't using NM.

  • RHEL 8 uses NM, so setting NM_CONTROLLED=False leaves the network unconfigured. This is fixed in the RHEL-distributed cloud-init package by the patch in comment #8.

  • RHEL 6 reaches EOL on 2020-11-30, IIUC. This may affect how we tackle this issue.

I have a question on RHEL 7. Was the patch needed already, or did cloud-init work with NM_CONTROLLED=False? Would it work with NM_CONTROLLED=True?

Thanks!

@ubuntu-server-builder
Collaborator Author

Launchpad user Divya K Konoor(dikonoor) wrote on 2020-09-21T02:58:13.426842+00:00

My understanding is that things work fine on RHEL7 with NM_CONTROLLED=False. So NM_CONTROLLED=True is a downstream patch maintained by RH only for RHEL8.

@ubuntu-server-builder
Collaborator Author

Launchpad user Paride Legovini(paride) wrote on 2020-09-23T16:34:22.540381+00:00

I retitled the bug to (hopefully) make it more to the point.

We should make sysconfig.py distinguish between RHEL < 8 and RHEL >= 8.
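One way to picture that distinction is the sketch below. It is an illustration under stated assumptions, not the change that was eventually merged: `iface_defaults_for` and its `major_version` parameter are hypothetical names, and how the RHEL major version would be detected is left out.

```python
# Defaults shared by all releases (entries as quoted earlier in this thread).
BASE_DEFAULTS = (
    ("ONBOOT", True),
    ("USERCTL", False),
    ("BOOTPROTO", "none"),
    ("STARTMODE", "auto"),
)


def iface_defaults_for(major_version):
    """Return interface defaults for a given RHEL major version.

    Hypothetical helper: RHEL >= 8 omits NM_CONTROLLED, as the downstream
    patch does; older releases keep NM_CONTROLLED=False in its old position.
    """
    if major_version >= 8:
        return BASE_DEFAULTS
    return BASE_DEFAULTS[:2] + (("NM_CONTROLLED", False),) + BASE_DEFAULTS[2:]


print(iface_defaults_for(7))
print(iface_defaults_for(8))
```

Omitting the key on RHEL 8 mirrors the downstream patch quoted above, which removes the line rather than setting it to True. Judging by its title, the commit referenced at the end of this thread took a different route and switched the default renderer for the RHEL family.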

@ubuntu-server-builder
Collaborator Author

Launchpad user Jessvin Thomas(jessvin) wrote on 2021-11-16T07:00:44.959848+00:00

I think I've run into this same issue when installing network manager in place of Netplan on Ubuntu 18.04.

I'm attempting to create a VM on Azure using Ubuntu with the Azure WALinuxAgent removed.
If network manager is installed cloud-init won't get any DHCP information and bring the network up during the local boot stage. I see the following at start on the console:

[ OK ] Started Wait for Network to be Configured.
Starting Initial cloud-init job (metadata service crawler)...
[[ 13.560599] cloud-init[909]: ci-info: | Route | Destination | Gateway | Interface | Flags |
[ 13.577782] cloud-init[909]: ci-info: +-------+-------------+---------+-----------+-------+
[ 13.590712] cloud-init[909]: ci-info: | 1 | fe80::/64 | :: | eth0 | U |
[ 13.599783] cloud-init[909]: ci-info: | 3 | local | :: | eth0 | U |
[ 13.610082] cloud-init[909]: ci-info: | 4 | multicast | :: | eth0 | U |
[ 13.621363] cloud-init[909]: ci-info: +-------+-------------+---------+-----------+-------+

This ultimately causes Azure provisioning to hang for 20 minutes as it can't get the network in order to report ready.

I've tried disabling network setup with below but that didn't seem to help:
99-nonet.cfg
network:
  config: disabled

I've worked around it by changing the datasource to NoCloud, but that just keeps the device from hanging since there won't be an attempted azure check in. Fundamentally cloud-init still can't bring the network up.

Assuming setting NM_CONTROLLED=True fixes it, is it possible for this to be a configuration option? Or is there some other way to let network manager bring up interfaces?

Thank you!!

Ubuntu 18.04.6 LTS (GNU/Linux 5.4.0-1063-azure x86_64)
Cloud: azure
network manager: 1.10.6
cloud init: /usr/bin/cloud-init 21.3-1-g6803368d-0ubuntu1~18.04.4

@ubuntu-server-builder
Collaborator Author

Launchpad user Jessvin Thomas(jessvin) wrote on 2021-11-16T14:49:42.843638+00:00

I found one more item in the log. When I set the datasource to NoCloud I was able to see the following in the boot log:

[ 11.319812] cloud-init[868]: 2021-11-16 14:47:50,680 - util.py[WARNING]: Running interface command ['nmcli', 'connection', 'up', 'ifname', 'eth0'] failed

So it looks like it's trying to use nmcli but can't.
Anywhere else I should look to troubleshoot?
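As a possible next troubleshooting step, the failing command from the log can be re-run by hand with its error output captured. The sketch below is only a suggestion: `try_command` is a hypothetical helper, and the only thing taken from the log is the nmcli argument list.

```python
import subprocess

# Argument list copied from the warning in the boot log above.
NMCLI_UP = ["nmcli", "connection", "up", "ifname", "eth0"]


def try_command(cmd, run=subprocess.run):
    """Run cmd and return (returncode, stderr) without raising on failure.

    Returns (None, message) if the executable itself cannot be found.
    The `run` parameter exists only so the helper can be exercised
    without a real nmcli binary.
    """
    try:
        result = run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError as exc:
        return None, str(exc)
    return result.returncode, result.stderr.strip()
```

Calling `try_command(NMCLI_UP)` as root on the affected VM would return nmcli's exit code together with whatever it wrote to stderr, which the cloud-init warning alone does not show.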

holmanb added a commit that referenced this issue Mar 26, 2024
BREAKING_CHANGE: Use NetworkManager renderer by default for RHEL family
Fixes GH-3781