cloud-init selects sysconfig netconfig renderer if network-manager is installed on Ubuntu #3354
Comments
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-14T02:30:40.386891+00:00 Launchpad attachments: Errors during deploy |
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-14T02:33:13.606700+00:00 Launchpad attachments: Error during deploy |
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-14T02:33:44.999366+00:00 Launchpad attachments: The log under MAAS |
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-14T02:34:06.784111+00:00 Launchpad attachments: Output of dpkg_l |
Launchpad user Blake Rouse(blake-rouse) wrote on 2019-03-14T11:38:41.250137+00:00 Looks like it might be an issue either in curtin or MAAS based on the network configuration. Once the machine fails to deploy can you provide the output of: maas {profile} machine get-curtin-config {system_id} |
Launchpad user Jeff Lane (bladernr) wrote on 2019-03-14T16:22:02.476461+00:00 FYI, I've added a cert task for this. I don't know for sure this is curtin, it looks like something may have changed in one of the hundreds of dependency packages that checkbox pulls in causing curtin to fail. Rod is investigating it on our side. |
Launchpad user Jeff Lane (bladernr) wrote on 2019-03-14T19:29:51.173736+00:00 We have a bug for this as well, 1189973, but marking this one as a duplicate of it kills the MAAS (possibly curtin) task, so I un-duped it for now. |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-14T19:36:05.407887+00:00 We've traced the problem to the network-manager package, which gets pulled in by a dependency in canonical-certification-server. Apparently, curtin or cloud-init (I'm not sure which) is now skipping netplan configuration when the network-manager package is installed. |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-14T20:53:11.663789+00:00 Launchpad attachments: Output of "maas {profile} machine get-curtin-config {system_id}" on MAAS server |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-14T20:54:08.321929+00:00 Launchpad attachments: Output of "maas {profile} node-results read system_id={system_id}" on MAAS server |
Launchpad user Ryan Harper(raharper) wrote on 2019-03-14T21:29:20.282424+00:00 Neither curtin nor cloud-init will skip generating networking. However, if there is additional netplan config in the target system that cloud-init is not aware of (maybe provided by the NetworkManager package or something else), then there may be a conflict in the configuration that prevents netplan apply from bringing up the network. If possible, getting the systemd journal and the contents of /etc/netplan, /run/systemd/{netif,network} and /var/log/cloud-init.log could help us see what's going on. |
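For anyone gathering the files requested in the comment above, the following is a minimal sketch of one way to bundle them, not a tool that curtin, cloud-init, or MAAS ships. The function name `collect_diagnostics` and its `root` parameter are hypothetical. The systemd journal is not a plain file, so it is not included here and would need to be captured separately (for example, by saving the output of `journalctl -b`).

```python
import tarfile
from pathlib import Path

# Paths named in the comment above, with /run/systemd/{netif,network} expanded.
DIAGNOSTIC_PATHS = [
    "/etc/netplan",
    "/run/systemd/netif",
    "/run/systemd/network",
    "/var/log/cloud-init.log",
]


def collect_diagnostics(output, paths=DIAGNOSTIC_PATHS, root="/"):
    """Bundle whichever paths exist under `root` into a gzip tarball.

    Returns (added, missing) so the caller can report absent paths,
    which is itself useful information when debugging.
    `root` is a hypothetical convenience for pointing at a mounted target.
    """
    root = Path(root)
    added, missing = [], []
    with tarfile.open(output, "w:gz") as tar:
        for path in paths:
            relative = path.lstrip("/")
            source = root / relative
            if source.exists():
                # Directories are added recursively by tarfile.
                tar.add(source, arcname=relative)
                added.append(path)
            else:
                missing.append(path)
    return added, missing
```

Calling `collect_diagnostics("diag.tar.gz")` on the failed node (typically as root, since some of these paths may not be world-readable) writes the archive and returns the lists of included and missing paths.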
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-14T22:41:35.104569+00:00 Launchpad attachments: /var/log/cloud-init.log on a node that failed deployment |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-14T22:43:47.937184+00:00 I've attached the /var/log/cloud-init.log file from a node that failed deployment. (This is a different node from the one that generated the earlier logs.) The /etc/netplan directory is empty, and there is no /run/systemd directory on this node that failed to deploy. |
Launchpad user Ryan Harper(raharper) wrote on 2019-03-15T14:51:10.206832+00:00
2019-03-14 17:32:34,606 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: None
This is a cloud-init bug. The sysconfig renderer has NetworkManager support, which triggered cloud-init to render sysconfig instead of netplan. |
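To illustrate the failure mode described in the comment above, here is a simplified model of priority-based renderer selection. This is a sketch under stated assumptions, not cloud-init's actual code: the function names, the package names used as availability signals, and the priority order are all hypothetical. It only shows how an availability check that accepts NetworkManager as evidence for sysconfig can win over netplan when sysconfig sits earlier in the list.

```python
# Simplified model of priority-based renderer selection.
# All names and checks below are illustrative assumptions,
# not cloud-init's real API or its real availability logic.


def select_renderer(priority, available):
    """Return the first renderer in `priority` whose availability check passes."""
    for name in priority:
        if available[name]():
            return name
    raise RuntimeError("no usable renderer among: %s" % ", ".join(priority))


def make_checks(installed):
    """Build availability checks for a hypothetical set of installed tools."""
    return {
        "eni": lambda: "ifupdown" in installed,
        # The over-broad check: NetworkManager alone makes sysconfig look usable.
        "sysconfig": lambda: (
            "sysconfig-scripts" in installed or "network-manager" in installed
        ),
        "netplan": lambda: "netplan" in installed,
    }


# Assumed order for illustration; sysconfig is consulted before netplan.
PRIORITY = ["eni", "sysconfig", "netplan"]

# A plain image with only netplan selects netplan.
print(select_renderer(PRIORITY, make_checks({"netplan"})))
# Adding network-manager flips the result to sysconfig.
print(select_renderer(PRIORITY, make_checks({"netplan", "network-manager"})))
```

In this model, the second call returns sysconfig even though netplan is present, which mirrors the symptom reported in this thread: an empty /etc/netplan on a system where network-manager was pulled in as a dependency.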
Launchpad user Ryan Harper(raharper) wrote on 2019-03-15T19:19:25.169491+00:00 You can work around this issue by including the following curtin config when deploying. write_files: |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-15T19:44:41.765139+00:00 Thanks for the quick fix, Ryan! I've confirmed that your curtin config workaround in comment #15 works. Do you have an estimate for how long it'll be before a fix goes live? (I ask so we can plan whether we should push your workaround through one of the certification packages.) |
Launchpad user Ryan Harper(raharper) wrote on 2019-03-15T20:51:06+00:00 On Fri, Mar 15, 2019 at 2:50 PM Rod Smith rod.smith@canonical.com wrote:
Depends on where you need it. It can likely land upstream either today
|
Launchpad user Amy Gou(goujm1) wrote on 2019-03-18T09:54:52.332948+00:00 Hi Jeff and all, After upgrading online, MAAS 0.4.0 is shown under the version table, but the log still shows 2.4.2. At the same time, the deploy fails again. Please double-check the log and let me know if you have any comments. Best Regards, |
Launchpad user Jeff Lane (bladernr) wrote on 2019-03-18T15:54:20.830384+00:00 Hi Amy, first, which machine failed? I see a bunch of machines in the /var/log/maas/rsyslog/ directory, and I'm not sure exactly which one to look at. Secondly, the version you posted in the screen shot looks correct, can you show me the output of: ls -l /etc/maas/preseed/curtin_userdata* |
Launchpad user Jeff Lane (bladernr) wrote on 2019-03-18T15:58:34.386862+00:00 Amy: Also, could you send me a tarball containing /etc/maas/preseeds ?? |
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-19T07:59:31.300448+00:00 Launchpad attachments: Screen shot of "ls -l /etc/maas/preseed/curtin_userdata*" |
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-19T08:00:41.799679+00:00 Launchpad attachments: Tarball of "/etc/maas" |
Launchpad user Amy Gou(goujm1) wrote on 2019-03-19T11:20:07.034916+00:00 Hi Jeff, it is the SR590 Cascadelake that failed to deploy with the new MAAS 0.4.0. The attachment above was collected from the environment with the SR590 Cascadelake. Best Regards, |
Launchpad user Rod Smith(rodsmith) wrote on 2019-03-19T23:01:17.784603+00:00 Amy, I think you're confusing the MAAS version (which is 2.4.2 on one of our installations) and the maas-cert-server package version (the latest of which is 0.4.0). The maas-cert-server 0.3.9 package includes a workaround (but NOT A FIX) for this bug, and 0.4.0 provides some unrelated improvements, so the installation SHOULD succeed after you've upgraded maas-cert-server to version 0.3.9 or 0.4.0. If it's still failing, then it could be you'll need to apply the workaround described by Ryan Harper in comment #15, which is different from the workaround in maas-cert-server 0.3.9 and 0.4.0. (Post back if you need help applying Ryan's workaround.) It could also be that you're looking at a completely different problem. |
Launchpad user Amy Gou(goujm1) wrote on 2019-03-20T10:38:20.526162+00:00 hi Rod, Thanks for your update, we will use the workaround to execute the current certification test on Purley Cascadelake. Best Regards, |
Launchpad user Server Team CI bot(server-team-bot) wrote on 2019-04-22T22:46:43.736734+00:00 This bug is fixed with commit 5de83fc to cloud-init on branch master. |
Launchpad user Chad Smith(chad.smith) wrote on 2019-05-10T18:08:35.881688+00:00 This bug is believed to be fixed in cloud-init in version 19.1. If this is still a problem for you, please make a comment and set the state back to New. Thank you. |
Launchpad user Amy Gou(goujm1) wrote on 2019-05-13T09:27:29.235445+00:00 Sorry for the late reply. The issue does not occur with the current cloud-init version, 18.5-45-g3554ffe8-0ubuntu1~18.04.1. Please go ahead and close it. Thanks a lot. Best Regards, |
Launchpad user Jeff Lane (bladernr) wrote on 2019-05-13T15:28:26.883924+00:00 Hi Amy, it's likely that you're still using our patched tooling that includes a workaround. cloud-init 18.5 should not work. |
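As a rough aid for the version question raised above, the sketch below compares the upstream portion of a cloud-init package version against 19.1, the release this thread names as containing the fix. The helper names are hypothetical, and the check is only a heuristic: it assumes the package version begins with the upstream `major.minor` number, and a distribution could in principle carry the fix as a patch on an older upstream version, so a negative result is not conclusive.

```python
import re

# Upstream release named in this thread as containing the fix.
FIXED_IN = (19, 1)


def upstream_version(package_version):
    """Extract (major, minor) from a string such as
    '18.5-45-g3554ffe8-0ubuntu1~18.04.1'.

    Assumes the string starts with the upstream 'major.minor' number.
    """
    match = re.match(r"(\d+)\.(\d+)", package_version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % package_version)
    return int(match.group(1)), int(match.group(2))


def has_upstream_fix(package_version):
    """True if the upstream version is at least FIXED_IN (heuristic only)."""
    return upstream_version(package_version) >= FIXED_IN
```

Applied to the version quoted in this thread, `has_upstream_fix("18.5-45-g3554ffe8-0ubuntu1~18.04.1")` is false, which is consistent with the point that a successful deploy on 18.5 most likely came from the workaround in the patched tooling rather than from cloud-init itself.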
Launchpad user Jeff Lane (bladernr) wrote on 2019-06-19T15:54:38.308550+00:00 Just a heads up, the fix is now in -updates. I've tested this locally on a couple of deployments and it seems to resolve the issue we had before. I'm asking my team to verify on a couple more deployments for due diligence. |
Launchpad user Rod Smith(rodsmith) wrote on 2019-06-19T19:10:25.401326+00:00 I've tested this on three nodes on two MAAS servers (my own home MAAS server and maastiff, our MAAS server in the certification lab), using both 18.04 and 19.04. It looks good to me. |
Launchpad user Amy Gou(goujm1) wrote on 2019-06-20T10:16:23.729470+00:00 Thanks for your kind update. I will double-check with the latest one. |
Launchpad user Dan Watkins(oddbloke) wrote on 2019-07-23T16:08:05.314117+00:00 Hi Amy et al, I'm going to mark this Fix Released, as 19.1 has made its way in to Ubuntu. Please let us know if you don't think this is fixed! Dan |
This bug was originally filed in Launchpad as LP: #1819994
Launchpad details
Launchpad user duanbenliang(duanbl1) wrote on 2019-03-14T02:30:40.386891+00:00
Configuration:
UEFI/BIOS: TEE136S
IMM/BMC: CDI333V
CPU: Intel(R) Xeon(R) Platinum 8253 CPU @ 2.20GHz
Memory: 16G DIMM * 12
Raid card: ThinkSystem RAID 530-8i
NIC Card: Intel X722 LOM
Steps to reproduce:
1. Configure "network" as the first boot option.
2. Power on the machine.
3. Visit TC through a web browser and commission the machine.
4. When commissioning completes, deploy Ubuntu 18.04 LTS on the SUT.
5. The error appears during OS deployment.
Deploy errors look like the following (see the attachment for details):
cloud-init[xxxx] Date_and_time - handlers.py[WARNING]: failed posting event: start: modules-final/config-xxxx: running config-xxxx
cloud-init[xxxx] Date_and_time - handlers.py[WARNING]: failed posting event: finish: modules-final: SUCCESS: running modules for final