cloud-init fails to find CloudSigma datasource with cloud-init 0.7.8-1-g3705bb5-0ubuntu1 #2783

Closed
ubuntu-server-builder opened this issue May 10, 2023 · 14 comments
@ubuntu-server-builder

This bug was originally filed in Launchpad as LP: #1648380

Launchpad details
affected_projects = ['cloud-init (Ubuntu)', 'cloud-init (Ubuntu Xenial)', 'cloud-init (Ubuntu Yakkety)']
assignee = None
assignee_name = None
date_closed = 2016-12-23T17:36:34.003135+00:00
date_created = 2016-12-08T10:13:51.244377+00:00
date_fix_committed = 2016-12-23T17:36:34.003135+00:00
date_fix_released = 2016-12-23T17:36:34.003135+00:00
id = 1648380
importance = medium
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1648380
milestone = None
owner = smoser
owner_name = Scott Moser
private = False
status = fix_released
submitter = louis
submitter_name = Louis Bouchard
tags = ['verification-done-xenial', 'verification-done-yakkety']
duplicates = []

Launchpad user Louis Bouchard(louis) wrote on 2016-12-08T10:13:51.244377+00:00

[SRU justification]
Without this fix, images built with cloud-init 0.7.8-1-g3705bb5-0ubuntu1 and later will not boot correctly and will become unreachable.

[Impact]
Some cloud images built with this version of cloud-init may become unusable

[Fix]
Reinstate the second element of the datasources list as a tuple instead of a string.
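
The distinction is easy to miss in Python, because parentheses alone do not create a tuple; the trailing comma does. A minimal sketch, using a stand-in constant `DEP_FILESYSTEM` with the value quoted later in this report (it is not imported from cloud-init here):

```python
# Stand-in for sources.DEP_FILESYSTEM; the value is the one quoted in this report.
DEP_FILESYSTEM = "FILESYSTEM"

as_string = (DEP_FILESYSTEM)   # parentheses only group the expression: still a str
as_tuple = (DEP_FILESYSTEM,)   # the trailing comma makes a one-element tuple

print(type(as_string).__name__)       # str
print(type(as_tuple).__name__)        # tuple
print(len(as_string), len(as_tuple))  # 10 1
```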

[Test Case]
This test must be run on CloudSigma to complete correctly:

Build cloud image with only the CloudSigma datasource using cloud-init version 0.7.8-1-g3705bb5-0ubuntu1 or later
Launch an instance with this image
The instance will boot but will not be accessible through ssh

With this fix, the instance will complete its boot sequence and be accessible through ssh

[Regression]
None expected, the second element was a tuple in previous versions of the CloudSigma datasource

[Description of the problem]
The issue showed up on cloud instances launched from such images: they became unreachable through SSH, with the following message:

"Connection closed by {IP} port 22"

@ubuntu-server-builder

Launchpad user Louis Bouchard(louis) wrote on 2016-12-08T10:14:54.839659+00:00

Here is what I have found :
This seems suspect to me :

Nov 21 10:12:33 ubuntu [CLOUDINIT] __init__.py[DEBUG]: Looking for for data source in: ['CloudSigma'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM']
Nov 21 10:12:33 ubuntu [CLOUDINIT] __init__.py[DEBUG]: Searching for local data source in: []

There is no local data source returned and this is what will later trigger the python backtrace.

The message "Looking for for data source in" comes from list_sources().

list_sources(cfg_list, depends, pkg_list)
cfg_list = ['CloudSigma']
depends = ['FILESYSTEM']

Further down in list_sources() we have:
    for m_loc in m_locs:
        mod = importer.import_module(m_loc)
        lister = getattr(mod, "get_datasource_list")
        matches = lister(depends)
        if matches:
            break
    return src_list

lister is get_datasource_list(depends), which does:
    return sources.list_from_depends(depends, datasources)

list_from_depends() does:
    def list_from_depends(depends, ds_list):
        ret_list = []
        depset = set(depends)
        for (cls, deps) in ds_list:
            if depset == set(deps):
                ret_list.append(cls)
        return ret_list

ds_list is datasources, which is:
    datasources = [
        (DataSourceCloudSigma, (sources.DEP_FILESYSTEM)),
    ]
and sources.DEP_FILESYSTEM == 'FILESYSTEM'

depset will be :
depset = {'FILESYSTEM'}

deps is sources.DEP_FILESYSTEM, i.e. the string 'FILESYSTEM', so:
set(deps) = {'I', 'Y', 'M', 'T', 'L', 'S', 'F', 'E'} !!!

So the test depset == set(deps) is false, and list_from_depends() returns an empty list.
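
The comparison above can be reproduced outside cloud-init. The following is a minimal sketch: `list_from_depends()` is copied from the excerpt above, while `DEP_FILESYSTEM` and the empty `DataSourceCloudSigma` class are stand-ins defined only for this illustration.

```python
DEP_FILESYSTEM = "FILESYSTEM"  # stand-in for sources.DEP_FILESYSTEM


class DataSourceCloudSigma:
    """Placeholder class; the real datasource is not needed for the lookup."""


def list_from_depends(depends, ds_list):
    # Same logic as the function quoted above.
    ret_list = []
    depset = set(depends)
    for (cls, deps) in ds_list:
        if depset == set(deps):
            ret_list.append(cls)
    return ret_list


broken = [(DataSourceCloudSigma, (DEP_FILESYSTEM))]   # deps is a str
fixed = [(DataSourceCloudSigma, (DEP_FILESYSTEM,))]   # deps is a one-element tuple

# set() over a string yields its distinct characters, not the whole word.
print(sorted(set(DEP_FILESYSTEM)))  # ['E', 'F', 'I', 'L', 'M', 'S', 'T', 'Y']
print(list_from_depends(["FILESYSTEM"], broken))  # []
print(list_from_depends(["FILESYSTEM"], fixed) == [DataSourceCloudSigma])  # True
```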

Now I don't yet know what changed that makes that test fail

@ubuntu-server-builder

Launchpad user Louis Bouchard(louis) wrote on 2016-12-08T10:16:12.812612+00:00

OK, while my previous analysis was correct, the first stage runs to completion without finding a DataSource (at least it looks like it).

But when it runs the init-network phase, things go wrong:

Nov 21 10:12:34 ubuntu [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: no cache found
Nov 21 10:12:34 ubuntu [CLOUDINIT] util.py[DEBUG]: Attempting to remove /var/lib/cloud/instance
Nov 21 10:12:34 ubuntu [CLOUDINIT] stages.py[DEBUG]: Using distro class <class 'cloudinit.distros.ubuntu.Distro'>
Nov 21 10:12:34 ubuntu [CLOUDINIT] __init__.py[DEBUG]: Looking for for data source in: ['CloudSigma'], via packages ['', 'cloudinit.sources'] that matches dependencies ['FILESYSTEM', 'NETWORK']
Nov 21 10:12:34 ubuntu [CLOUDINIT] __init__.py[DEBUG]: Searching for network data source in: []
Nov 21 10:12:34 ubuntu [CLOUDINIT] util.py[WARNING]: No instance datasource found! Likely bad things to come!
Nov 21 10:12:34 ubuntu [CLOUDINIT] util.py[DEBUG]: No instance datasource found! Likely bad things to come!

cloud-init.log:Nov 21 10:12:34 ubuntu [CLOUDINIT] util.py[DEBUG]: No instance datasource found! Likely bad things to come!
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 247, in main_init
init.fetch(existing=existing)
File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 358, in fetch
return self._get_data_source(existing=existing)
File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 268, in _get_data_source
pkg_list, self.reporter)
File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 318, in find_source
raise DataSourceNotFoundException(msg)
cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: ()

The cleaner backtrace comes from /var/log/syslog.3.gz :

Nov 21 10:12:35 ubuntu cloud-init[974]: Can not apply stage config, no datasource found! Likely bad things to come!
Nov 21 10:12:35 ubuntu cloud-init[974]: ------------------------------------------------------------
Nov 21 10:12:35 ubuntu cloud-init[974]: Traceback (most recent call last):
Nov 21 10:12:35 ubuntu cloud-init[974]: File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 351, in main_modules
Nov 21 10:12:35 ubuntu cloud-init[974]: init.fetch(existing="trust")
Nov 21 10:12:35 ubuntu cloud-init[974]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 358, in fetch
Nov 21 10:12:35 ubuntu cloud-init[974]: return self._get_data_source(existing=existing)
Nov 21 10:12:35 ubuntu cloud-init[974]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 268, in _get_data_source
Nov 21 10:12:35 ubuntu cloud-init[974]: pkg_list, self.reporter)
Nov 21 10:12:35 ubuntu cloud-init[974]: File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 318, in find_source
Nov 21 10:12:35 ubuntu cloud-init[974]: raise DataSourceNotFoundException(msg)
Nov 21 10:12:35 ubuntu cloud-init[974]: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: ()
Nov 21 10:12:35 ubuntu cloud-init[974]: ------------------------------------------------------------
Nov 21 10:12:35 ubuntu systemd[1]: cloud-config.service: Main process exited, code=exited, status=1/FAILURE
Nov 21 10:12:35 ubuntu systemd[1]: Failed to start Apply the settings specified in cloud-config.

As mentioned in my previous comment, that failure happens when this is called:

cfg_list = ['CloudSigma']
pkg_list = ['', 'cloudinit.sources']
ds_deps = ['FILESYSTEM', 'NETWORK']

ds_list = list_sources(cfg_list, ds_deps, pkg_list)
ds_names = [type_utils.obj_name(f) for f in ds_list]

ds_list & ds_names will be empty.

I suspect that the following change between cloud-init_0.7.7~bzr1212 (version in 20160627 image) and cloud-init_0.7.8-1-g3705bb5 (version in 20161020) is the reason for that failure :

@@ -119,17 +109,13 @@
         return self.metadata['uuid']


-class DataSourceCloudSigmaNet(DataSourceCloudSigma):
-    def __init__(self, sys_cfg, distro, paths):
-        DataSourceCloudSigma.__init__(self, sys_cfg, distro, paths)
-        self.dsmode = 'net'
+# Legacy: Must be present in case we load an old pkl object
+DataSourceCloudSigmaNet = DataSourceCloudSigma

 # Used to match classes to dependencies. Since this datasource uses the serial
 # port network is not really required, so it's okay to load without it, too.
 datasources = [
     (DataSourceCloudSigma, (sources.DEP_FILESYSTEM)),
-    (DataSourceCloudSigmaNet, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
 ]

And most particularly the removal of :
(DataSourceCloudSigmaNet, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK))

In order to verify this assertion, I added the removed element in the list & ran list_sources() with the same parameters (from within the downloaded image) :

>>> import sources
>>> from sources import DataSourceCloudSigma as DataSourceCloudSigma
>>> import type_utils
>>> cfg_list = ['CloudSigma']
>>> pkg_list = ['', 'cloudinit.sources']
>>> ds_deps = ['FILESYSTEM', 'NETWORK']
>>> ds_list = sources.list_sources(cfg_list, ds_deps, pkg_list)
>>> ds_names = [type_utils.obj_name(f) for f in ds_list]
>>> ds_list
[<class 'cloudinit.sources.DataSourceCloudSigma.DataSourceCloudSigma'>]
>>> ds_names
['DataSourceCloudSigma']

This time, the proper datasource is correctly found.
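
To compare the variants of the datasources list discussed in this report, here is a small sketch that runs the quoted list_from_depends() logic against both dependency lists seen in the logs. The constants `FS` and `NET`, the placeholder class, the variant labels and the `matches()` helper are all assumptions made for illustration; the sketch models only this lookup, not the rest of cloud-init's boot flow.

```python
FS, NET = "FILESYSTEM", "NETWORK"  # stand-ins for the sources.DEP_* constants


class DataSourceCloudSigma:
    """Placeholder for the real datasource class."""


# Legacy alias, as introduced by the diff quoted above.
DataSourceCloudSigmaNet = DataSourceCloudSigma


def list_from_depends(depends, ds_list):
    # Same logic as the function quoted earlier in this report.
    depset = set(depends)
    return [cls for (cls, deps) in ds_list if depset == set(deps)]


variants = {
    "string deps": [(DataSourceCloudSigma, (FS))],
    "removed entry re-added": [
        (DataSourceCloudSigma, (FS)),
        (DataSourceCloudSigmaNet, (FS, NET)),
    ],
    "one-element tuple": [(DataSourceCloudSigma, (FS,))],
}


def matches(ds_list):
    """Report whether each dependency list from the logs finds a class."""
    return {
        "local": bool(list_from_depends([FS], ds_list)),
        "network": bool(list_from_depends([FS, NET], ds_list)),
    }


results = {name: matches(ds_list) for name, ds_list in variants.items()}
for name, found in results.items():
    print(name, found)
```

Under these assumptions, re-adding the removed entry only satisfies the lookup with ['FILESYSTEM', 'NETWORK'], while the one-element tuple is what satisfies the lookup with ['FILESYSTEM'] alone.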

Now I'm off to find out why it got removed

@ubuntu-server-builder

Launchpad user Louis Bouchard(louis) wrote on 2016-12-08T10:16:35.268938+00:00

This is the commit that removed that element in the list :

https://git.launchpad.net/cloud-init/commit/?id=7f2e99f5345c227d07849da68acdf8562b44c3e1

@ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2016-12-12T03:34:44.900645+00:00

This bug was fixed in the package cloud-init - 0.7.8-67-gc9c9197-0ubuntu1


cloud-init (0.7.8-67-gc9c9197-0ubuntu1) zesty; urgency=medium

  • debian/cherry-pick: add utility for cherry picking commits from upstream
    into patches in debian/patches.
  • New upstream snapshot.
    • mounts: use mount -a again to accomplish mounts (LP: #1647708)
    • CloudSigma: Fix bug where datasource was not loaded in local search. (LP: #1648380)
    • when adding a user, strip whitespace from group list (LP: #1354694)
    • fix decoding of utf-8 chars in yaml test
    • Replace usage of sys_netdev_info with read_sys_net (LP: #1625766)
    • fix problems found in python2.6 test.

 -- Scott Moser <smoser@ubuntu.com>  Sun, 11 Dec 2016 21:22:57 -0500

@ubuntu-server-builder

Launchpad user Robie Basak(racb) wrote on 2016-12-14T15:04:59.582259+00:00

The SRU to Xenial here is blocked by 0.7.8-49-g9e904bb-0ubuntu1~16.04.2 that is currently in xenial-proposed.

@ubuntu-server-builder

Launchpad user Mathieu Trudel-Lapierre(cyphermox) wrote on 2016-12-14T16:13:46.960595+00:00

Can't we merge the two uploads so as not to delay things until one or the other is out of proposed?

@ubuntu-server-builder

Launchpad user Robie Basak(racb) wrote on 2016-12-20T16:36:25.217336+00:00

Hello Louis, or anyone else affected,

Accepted cloud-init into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-49-g9e904bb-0ubuntu1~16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2016-12-21T02:02:23.226740+00:00

This bug was fixed in the package cloud-init - 0.7.8-49-g9e904bb-0ubuntu1~16.04.3


cloud-init (0.7.8-49-g9e904bb-0ubuntu1~16.04.3) xenial-proposed; urgency=medium

  • debian/cherry-pick: use git format-patch rather than git show
  • cherry-pick a9d41de: CloudSigma: Fix bug where datasource was not
    loaded in local (LP: #1648380)
  • cherry-pick c9c9197: mounts: use mount -a again to accomplish mounts
    (LP: #1647708)

 -- Scott Moser <smoser@ubuntu.com>  Tue, 13 Dec 2016 16:02:50 -0500

@ubuntu-server-builder

Launchpad user Steve Langasek(vorlon) wrote on 2016-12-21T02:03:13.280862+00:00

The verification of the Stable Release Update for cloud-init has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

@ubuntu-server-builder

Launchpad user Philip Roche(philroche) wrote on 2016-12-21T16:11:06.632686+00:00

Images with cloud-init version 0.7.8-49-g9e904bb-0ubuntu1~16.04.3 are now available at https://partner-images.canonical.com/cpc-cloudsigma/releases/xenial/20161221/

Please can you validate that this fixes the datasource issue.

@ubuntu-server-builder

Launchpad user Scott Moser(smoser) wrote on 2016-12-23T17:36:33.308651+00:00

This is fixed in cloud-init 0.7.9.

@ubuntu-server-builder

Launchpad user Brian Murray(brian-murray) wrote on 2017-01-12T19:49:02.692110+00:00

Hello Louis, or anyone else affected,

Accepted cloud-init into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cloud-init/0.7.8-68-gca3ae67-0ubuntu1~16.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@ubuntu-server-builder

Launchpad user Launchpad Janitor(janitor) wrote on 2017-01-30T18:18:12.820305+00:00

This bug was fixed in the package cloud-init - 0.7.8-68-gca3ae67-0ubuntu1~16.10.1


cloud-init (0.7.8-68-gca3ae67-0ubuntu1~16.10.1) yakkety; urgency=medium

  • debian/cherry-pick: add utility for cherry picking commits from upstream
    into patches in debian/patches.
  • New upstream snapshot.
    • mounts: use mount -a again to accomplish mounts (LP: #1647708)
    • CloudSigma: Fix bug where datasource was not loaded in local search.
      (LP: #1648380)
    • when adding a user, strip whitespace from group list
      [Lars Kellogg-Stedman] (LP: #1354694)
    • fix decoding of utf-8 chars in yaml test
    • Replace usage of sys_netdev_info with read_sys_net (LP: #1625766)
    • fix problems found in python2.6 test.
    • OpenStack: extend physical types to include hyperv, hw_veb, vhost_user.
      (LP: #1642679)
    • tests: fix assumptions that expected no eth0 in system. (LP: #1644043)
    • net/cmdline: Consider ip= or ip6= on command line not only ip=
      (LP: #1639930)
    • Just use file logging by default [Joshua Harlow] (LP: #1643990)
    • Improve formatting for ProcessExecutionError [Wesley Wiedenmeier]
    • flake8: fix trailing white space
    • Doc: various documentation fixes [Sean Bright]
    • cloudinit/config/cc_rh_subscription.py: Remove repos before adding
      [Brent Baude]
    • packages/redhat: fix rpm spec file.
    • main: set TZ in environment if not already set. [Ryan Harper]
    • disk_setup: Use sectors as unit when formatting MBR disks with sfdisk.
      [Daniel Watkins] (LP: #1460715)

 -- Scott Moser <smoser@ubuntu.com>  Mon, 19 Dec 2016 15:07:12 -0500

@ubuntu-server-builder

Launchpad user DesktopMan(christian-auby) wrote on 2017-02-24T16:18:25.230893+00:00

Might be the wrong place to ask, but:

I'm using Ubuntu MaaS, will this fix be available in the 16.04 LTS images? Right now my nodes fail to commission which I believe comes from this issue.
