The OpenStack network_config.json implementation fails on Hyper-V compute nodes #2761

Closed
ubuntu-server-builder opened this issue May 10, 2023 · 24 comments

@ubuntu-server-builder
Collaborator

This bug was originally filed in Launchpad as LP: #1642679

Launchpad details
affected_projects = ['nova', 'nova/ocata', 'cloud-init (Ubuntu)', 'cloud-init (Ubuntu Xenial)', 'cloud-init (Ubuntu Yakkety)']
assignee = None
assignee_name = None
date_closed = 2016-12-23T17:36:57.182593+00:00
date_created = 2016-11-17T17:46:24.611596+00:00
date_fix_committed = 2016-12-23T17:36:57.182593+00:00
date_fix_released = 2016-12-23T17:36:57.182593+00:00
id = 1642679
importance = medium
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1642679
milestone = None
owner = avladu
owner_name = Adrian Vladu
private = False
status = fix_released
submitter = avladu
submitter_name = Adrian Vladu
tags = ['hyper-v', 'verification-done']
duplicates = [1609279]

Launchpad user Adrian Vladu(avladu) wrote on 2016-11-17T17:46:24.611596+00:00

=== Begin SRU Template ===
[Impact]
When a config drive provides network_data.json on Azure OpenStack,
cloud-init will fail to configure networking.

Console log and /var/log/cloud-init.log will show:
 ValueError: Unknown network_data link type: hyperv

This would also occur when the type of the network device as declared
to cloud-init was 'hw_veb', 'hyperv', 'vhostuser' or 'vrouter'.

[Test Case]
Launch an instance with config drive on hyperv cloud.

[Regression Potential]
Low to none. cloud-init is relaxing requirements and will accept things
now that it previously complained were invalid.
=== End SRU Template ===

We have discovered an issue when booting Xenial instances on OpenStack environments (Liberty or newer) and Hyper-V compute nodes using config drive as metadata source.

When applying the network_config.json, cloud-init fails with this error:
http://paste.openstack.org/show/RvHZJqn48JBb0TO9QznL/

The fix would be to add 'hyperv' as a link type here:
/usr/lib/python3/dist-packages/cloudinit/sources/helpers/openstack.py, line 587
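
To check whether a given config drive would trigger this error, the link types in its network_data.json can be compared against the types the guest recognizes. The sketch below is illustrative only: `KNOWN_LINK_TYPES` is an assumed approximation of what cloud-init accepted before the fix, the sample document is made up, and `unknown_link_types` is a hypothetical helper that is not part of cloud-init.

```python
import json

# Assumption: an approximation of the link types cloud-init accepted
# before the fix; check openstack.py in your installed version.
KNOWN_LINK_TYPES = {"ethernet", "vif", "ovs", "phy", "bridge", "tap",
                    "bond", "vlan"}

# Made-up sample of what a Hyper-V compute node might place in
# openstack/latest/network_data.json on the config drive.
SAMPLE = """
{
  "links": [
    {"id": "tap0", "type": "hyperv",
     "ethernet_mac_address": "00:15:5d:00:00:01", "mtu": 1500}
  ],
  "networks": [],
  "services": []
}
"""


def unknown_link_types(network_data, known=KNOWN_LINK_TYPES):
    """Return the sorted link types in network_data that are not in known."""
    links = network_data.get("links", [])
    seen = {link["type"] for link in links if "type" in link}
    return sorted(seen - set(known))


if __name__ == "__main__":
    # Any type listed here would hit the "Unknown network_data link type" path.
    print(unknown_link_types(json.loads(SAMPLE)))
```

A non-empty result means at least one link would be rejected by a guest that only knows the assumed set of types.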

Related bugs:
 * bug 1674946: cloud-init fails with "Unknown network_data link type: dvs"
 * bug 1642679: OpenStack network_config.json implementation fails on Hyper-V compute nodes

@ubuntu-server-builder ubuntu-server-builder added the launchpad Migrated from Launchpad label May 10, 2023

Launchpad user Scott Moser(smoser) wrote on 2016-11-22T16:51:49.342889+00:00

Hi,
I've subscribed Xiang to this as he recently pinged me on a different string that may appear as a network device. My response to him was:

| This non-sense really needs to stop.
| We need to fix openstack to stop sending arbitrary "types" of network
| devices that mean nothing to the guest.
|
| No new ones should be allowed.
|
| 'vhostuser' or 'ovs' means nothing to the guest. They just see a nic.
| They can't possibly use that information in any way, so telling them is
| not helpful. The type of the device should be 'tap' or 'ethernet'.
|
| Can you submit a merge proposal upstream that does that?
|
| We can take these things in, but they're silly and quite obviously busted,
| unless you have some information that shows why they're not.

I'm willing to take this, but let's please work to fix the source
of the problem.

Adrian,
Can you please file a merge proposal upstream to fix this?

You're welcome to use this bug. I've made it "Also affects nova".

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2016-11-22T17:16:23.907585+00:00

Fix proposed to branch: master
Review: https://review.openstack.org/400883

Launchpad user Scott Moser(smoser) wrote on 2016-11-22T17:16:44.166134+00:00

I've put up a request at https://review.openstack.org/400883

Launchpad user Adrian Vladu(avladu) wrote on 2016-11-23T15:32:47.017396+00:00

Hello,

Since nova has exposed these types this way for a few releases, it is hard to believe they will change it, because of backwards compatibility. Basically, a few stable OpenStack releases (Liberty, Mitaka, Newton, Ocata) will probably be stuck with it :(

Launchpad user Xiang Hui(xianghui) wrote on 2016-11-30T02:11:30.658719+00:00

@scott, thanks for the fix! BTW, will this cloud-init version be targeted to xenial later?

Launchpad user Robie Basak(racb) wrote on 2016-12-09T18:21:05.766978+00:00

Hello Adrian, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-49-g9e904bb-0ubuntu1~16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Launchpad user Scott Moser(smoser) wrote on 2016-12-19T15:00:46.226328+00:00

Adrian, Xiang,

Could you please verify this and mark 'verification-done' ?

At this point, this bug is blocking the release of cloud-init 0.7.8-49-g9e904bb-0ubuntu1~16.04.2 from xenial-proposed. That change contains fixes for other bugs that we need to get into -updates.

I've made requests off-bug to both Xiang Hui and to Adrian Vladu, but have not gotten a response.

Adrian has ACKed the upstream merge proposal at [1] with this fix.

While the code change does change behavior, the chance for regression is very low. See the code that was changed in context at [2]. Basically we extended the list of "physical types" to add 'hw_veb', 'hyperv', 'vhostuser'. Previously, if that condition did not match, we would raise a ValueError exception that was not handled, leaving the system basically unusable. Now, the strings are considered valid as "physical" and cloud-init will configure the devices as needed.

So:
Before: cloud-init raises an exception and no user-data or metadata is used; the user cannot log into the system.
After: cloud-init configures networking, and user-data and metadata are used.

Worst case for regression is really "still doesn't work".
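
The before/after behaviour described above can be sketched roughly as follows. This is not the cloud-init source: `classify_link` is a hypothetical stand-in for the logic in openstack.py, and the contents of `OLD_PHYSICAL_TYPES` are an assumption. Only the three added types and the error message come from this bug.

```python
# Assumption: approximate pre-fix list of types treated as physical.
OLD_PHYSICAL_TYPES = ("ethernet", "vif", "ovs", "phy", "bridge", "tap")
# The fix extends the list with the types named in this bug.
NEW_PHYSICAL_TYPES = OLD_PHYSICAL_TYPES + ("hw_veb", "hyperv", "vhostuser")


def classify_link(link, physical_types):
    """Map a network_data.json link to a config entry (hypothetical helper)."""
    link_type = link["type"]
    if link_type in physical_types:
        return {"type": "physical",
                "mac_address": link["ethernet_mac_address"]}
    if link_type in ("bond", "vlan"):
        return {"type": link_type}
    # Unhandled by the caller, so the whole network configuration fails.
    raise ValueError("Unknown network_data link type: %s" % link_type)


if __name__ == "__main__":
    link = {"type": "hyperv", "ethernet_mac_address": "00:15:5d:00:00:01"}
    try:
        classify_link(link, OLD_PHYSICAL_TYPES)
    except ValueError as err:
        print("before:", err)
    print("after:", classify_link(link, NEW_PHYSICAL_TYPES))
```

Because the change only widens the set of accepted strings, inputs that worked before take exactly the same path afterwards.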

--
[1] https://code.launchpad.net/~smoser/cloud-init/+git/cloud-init/+merge/311548
[2] https://git.launchpad.net/cloud-init/tree/cloudinit/sources/helpers/openstack.py#n599

Launchpad user Gabriel Samfira(gabriel-samfira) wrote on 2016-12-19T15:50:04.602252+00:00

Tested version 0.7.8-49-g9e904bb-0ubuntu1~16.04.2 on an OpenStack Mitaka install running Hyper-V as compute host.

VM booted successfully and cloud-init finished its run. The following output is from inside the VM after accessing it via SSH:

https://paste.ubuntu.com/23653864/

Launchpad user Launchpad Janitor(janitor) wrote on 2016-12-19T16:27:40.892634+00:00

This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1~16.04.2


cloud-init (0.7.8-49-g9e904bb-0ubuntu1~16.04.2) xenial-proposed; urgency=medium

  • cherry-pick 18203bf: disk_setup: Use sectors as unit when formatting
    MBR disks with sfdisk. (LP: #1460715)
  • cherry-pick 6e92c5f: net/cmdline: Consider ip= or ip6= on command
    line not only ip= (LP: #1639930)
  • cherry-pick 8c6878a: tests: fix assumptions that expected no eth0 in
    system. (LP: #1644043)
  • cherry-pick 2d2ec70: OpenStack: extend physical types to include
    hyperv, hw_veb, vhost_user. (LP: #1642679)

-- Scott Moser smoser@ubuntu.com Thu, 01 Dec 2016 16:57:39 -0500

Launchpad user Robie Basak(racb) wrote on 2016-12-19T16:28:00.053690+00:00

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad user Scott Moser(smoser) wrote on 2016-12-23T17:36:54.546977+00:00

This is fixed in cloud-init 0.7.9.

Launchpad user Brian Murray(brian-murray) wrote on 2017-01-12T19:47:49.811015+00:00

Hello Adrian, or anyone else affected,

Accepted cloud-init into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-68-gca3ae67-0ubuntu1~16.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Launchpad user Scott Moser(smoser) wrote on 2017-01-27T14:34:49.013219+00:00

Hi,
I'm going to mark this as verification-done as the original opener has not been able to do that, unfortunately. The change that went in for this fix can be seen at [1]. It is exactly the change that is in trunk, zesty, and also verified in stable release 16.04.

If it turns out that some interaction with yakkety made it not work, then we can re-address that.

If an sru team member wishes to disagree with my argument above, please just set it back to verification-needed, and I will attempt to get someone to do that.

Scott

--
[1] https://git.launchpad.net/cloud-init/commit/?h=ubuntu/yakkety&id=2d2ec70f06015f0624f1d0d328cc97f1fb5c29de

Launchpad user Steve Langasek(vorlon) wrote on 2017-01-27T17:35:23.555353+00:00

Because this SRU includes a large number of other bugfixes that have been verified in yakkety, we have confidence that the package is not fundamentally broken, and as you say this change has been verified on other releases, so I'm willing to accept this for the present SRU.

(But I am not releasing it on a Friday.)

Launchpad user Adrian Vladu(avladu) wrote on 2017-01-27T17:46:32.866420+00:00

Hello,

Sorry for the delay. We have successfully tested the latest yakkety image (we updated cloud-init via chroot with the one from the -proposed repo).

Thanks,
Adrian Vladu

Launchpad user Launchpad Janitor(janitor) wrote on 2017-01-30T18:18:12.820305+00:00

This bug was fixed in the package cloud-init - 0.7.8-68-gca3ae67-0ubuntu1~16.10.1


cloud-init (0.7.8-68-gca3ae67-0ubuntu1~16.10.1) yakkety; urgency=medium

  • debian/cherry-pick: add utility for cherry picking commits from upstream
    into patches in debian/patches.
  • New upstream snapshot.
    • mounts: use mount -a again to accomplish mounts (LP: #1647708)
    • CloudSigma: Fix bug where datasource was not loaded in local search.
      (LP: #1648380)
    • when adding a user, strip whitespace from group list
      [Lars Kellogg-Stedman] (LP: #1354694)
    • fix decoding of utf-8 chars in yaml test
    • Replace usage of sys_netdev_info with read_sys_net (LP: #1625766)
    • fix problems found in python2.6 test.
    • OpenStack: extend physical types to include hyperv, hw_veb, vhost_user.
      (LP: #1642679)
    • tests: fix assumptions that expected no eth0 in system. (LP: #1644043)
    • net/cmdline: Consider ip= or ip6= on command line not only ip=
      (LP: #1639930)
    • Just use file logging by default [Joshua Harlow] (LP: #1643990)
    • Improve formatting for ProcessExecutionError [Wesley Wiedenmeier]
    • flake8: fix trailing white space
    • Doc: various documentation fixes [Sean Bright]
    • cloudinit/config/cc_rh_subscription.py: Remove repos before adding
      [Brent Baude]
    • packages/redhat: fix rpm spec file.
    • main: set TZ in environment if not already set. [Ryan Harper]
    • disk_setup: Use sectors as unit when formatting MBR disks with sfdisk.
      [Daniel Watkins] (LP: #1460715)

-- Scott Moser smoser@ubuntu.com Mon, 19 Dec 2016 15:07:12 -0500

Launchpad user Scott Moser(smoser) wrote on 2017-03-29T19:52:16.339146+00:00

this is definitely not fix-released in nova.
We see more bugs like: bug 1674946

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2017-04-11T18:26:12.236785+00:00

Reviewed: https://review.openstack.org/400883
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f559be35a03f5801f527355895a97c89cdc3c336
Submitter: Jenkins
Branch: master

commit f559be35a03f5801f527355895a97c89cdc3c336
Author: Scott Moser smoser@brickies.net
Date: Fri Mar 31 17:01:33 2017 -0400

Limit exposure of network device types to the guest.

Previously, the 'type' of the hypervisor network device, was exposed to
the guest directly. That does not make sense, as
a.) this leaks needless information into the guest
b.) the guest cannot be reasonably expected to make decisions
    based on a type of link that is present underneath the
    virtual device that is presented to the guest.
c.) guests then are forced to either continuously track these types
    or to assume that unknown type is "phy".

This limits the exposure of types to a specific list. Any other
type will be shown to the guest as 'phy'.

Change-Id: Iea458fba29596cd2773d8d3565451af60b02bcca
Closes-Bug: #1642679
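
The approach in this commit message, an allow-list with a 'phy' fallback, might look roughly like the sketch below. The names `EXPOSED_TYPES` and `exposed_link_type` are hypothetical and the list contents are an assumption chosen for illustration; this is not nova's code.

```python
# Assumption: illustrative allow-list only; the real list lives in nova.
EXPOSED_TYPES = frozenset({"bridge", "ovs", "tap"})


def exposed_link_type(vif_type, exposed=EXPOSED_TYPES):
    """Return the link type to show the guest (hypothetical helper).

    Types on the allow-list pass through unchanged; anything else is
    reported as 'phy', so guests never see a type they cannot know.
    """
    return vif_type if vif_type in exposed else "phy"


if __name__ == "__main__":
    for vif_type in ("ovs", "hyperv", "vrouter"):
        print(vif_type, "->", exposed_link_type(vif_type))
```

With a mapping like this on the hypervisor side, a guest only has to understand a fixed set of strings, regardless of which network backends are added later.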

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2017-04-14T09:22:06.126431+00:00

This issue was fixed in the openstack/nova 16.0.0.0b1 development milestone.

Launchpad user Sam Stoelinga(sammiestoel) wrote on 2017-04-26T18:02:22.856725+00:00

I still hit this issue on the latest xenial cloudimg (April 25th). This is the error I saw when trying to run an Ubuntu 16.04 guest OS on a Contrail-based cloud: http://paste.openstack.org/show/608110/

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2017-06-21T15:56:26.803320+00:00

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/476195

Launchpad user Andrey Kirilochkin(andreika-mail) wrote on 2017-07-12T10:06:18.460115+00:00

Guys, we are still hitting the same bug, and it has become a huge issue for us.
http://paste.openstack.org/show/615013/
http://paste.openstack.org/show/615127/
When we boot VMs with Ubuntu 16.04, we intermittently see this bug in the VM boot log.
OpenStack: Mitaka
Juniper Contrail: 3.2.8
Please provide a fix for this.

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2017-08-12T14:03:14.514332+00:00

Reviewed: https://review.openstack.org/476195
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cec7ecdc93c3b9ba401edf3cf84088b580247cb8
Submitter: Jenkins
Branch: stable/ocata

commit cec7ecdc93c3b9ba401edf3cf84088b580247cb8
Author: Scott Moser smoser@brickies.net
Date: Fri Mar 31 17:01:33 2017 -0400

Limit exposure of network device types to the guest.

Previously, the 'type' of the hypervisor network device, was exposed to
the guest directly. That does not make sense, as
a.) this leaks needless information into the guest
b.) the guest cannot be reasonably expected to make decisions
    based on a type of link that is present underneath the
    virtual device that is presented to the guest.
c.) guests then are forced to either continuously track these types
    or to assume that unknown type is "phy".

This limits the exposure of types to a specific list. Any other
type will be shown to the guest as 'phy'.

Change-Id: Iea458fba29596cd2773d8d3565451af60b02bcca
Closes-Bug: #1642679
(cherry picked from commit f559be35a03f5801f527355895a97c89cdc3c336)

Launchpad user OpenStack Infra(hudson-openstack) wrote on 2017-08-22T11:39:35.762809+00:00

This issue was fixed in the openstack/nova 15.0.7 release.
