Ubuntu/devel #5258
Merged
Bump the version in cloudinit/version.py to 24.1.3 and update ChangeLog.
Address assignment and link management are manual for isc-dhcp-client, whereas dhcpcd brings up its own interface and assigns the IP address. Interface rename code assumes that the link will be down for rename. Make sure to set dhcpcd's interface to the same state.
Rebooting an instance that has finished VMware guest customization with DataSourceVMware loads DataSourceNone because metadata is not available. This is mostly a re-post of PR#229, with a few differences: 1. Let the datasource decide whether fallback is allowed, rather than always falling back to the previously cached LOCAL datasource. 2. Do not compare the instance-id of the cached datasource with the previous instance-id, since I think they are always identical. Fixes canonicalGH-3402
When cloud-init finds any ipv6 information in the instance metadata, it automatically enables dhcp6 for the network interface. However, this brings up the instance with a broken IPv6 configuration because SLAAC should be used for almost all situations on EC2. Red Hat BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2092459 Fedora Pagure: https://pagure.io/cloud-sig/issue/382 Upstream: https://bugs.launchpad.net/cloud-init/+bug/1976526 Fixes canonicalGH-3980 Signed-off-by: Major Hayden <major@redhat.com>
On most distros, including Ubuntu, the default timeout for dhclient is 300s. There is no cloud-init controlled duration for the dhclient process, as it doesn't fork until after it receives an IP address and there is no timeout value passed to subp(). I have seen some distros configure dhclient with a timeout of 60s, but that is far less common. Given that a cloud VM is not very useful without DHCP, err on the generous side and allow up to 300 seconds for dhcpcd to get an address. Note that there is still an issue with dhcpcd retries which will be addressed later in a separate PR. Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
Update various hard-coded filepaths. Also make sure we bootstrap our Paths() config correctly so that we read from the configured rundir. Co-authored-by: Mina Galić <freebsd@igalic.co> Sponsored by: The FreeBSD Foundation Fixes canonicalGH-4766
…se (canonical#5128)

Seeing a fairly large number of lease parsing failures on Azure similar to:

```
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 851, in _get_data
    crawled_data = util.log_time(
                   ^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2828, in log_time
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 660, in crawl_metadata
    self._wait_for_pps_savable_reuse()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 1236, in _wait_for_pps_savable_reuse
    self._wait_for_hot_attached_primary_nic(nl_sock)
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 1142, in _wait_for_hot_attached_primary_nic
    primary_nic_found = self._setup_ephemeral_networking(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 440, in _setup_ephemeral_networking
    lease = self._ephemeral_dhcp_ctx.obtain_lease()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/ephemeral.py", line 293, in obtain_lease
    self.lease = maybe_perform_dhcp_discovery(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 103, in maybe_perform_dhcp_discovery
    return distro.dhcp_client.dhcp_discovery(interface, dhcp_log_func, distro)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 656, in dhcp_discovery
    lease = self.get_newest_lease(interface)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 829, in get_newest_lease
    return self.parse_dhcpcd_lease(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 787, in parse_dhcpcd_lease
    lease = dict(
            ^^^^^
ValueError: dictionary update sequence element #0 has length 1; 2 is required
```

Catch this error in parse_dhcpcd_lease() and raise InvalidDHCPLeaseFileError after logging an error.

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
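The failure mode and the fix can be sketched as follows. This is a minimal illustration, assuming the lease dump is a sequence of `key=value` lines; `parse_lease_text` is a hypothetical helper rather than cloud-init's `parse_dhcpcd_lease()`, and `InvalidDHCPLeaseFileError` is redefined locally so the block is self-contained.

```python
import logging

LOG = logging.getLogger(__name__)


class InvalidDHCPLeaseFileError(Exception):
    """Local stand-in for the exception named in the commit message."""


def parse_lease_text(lease_dump: str) -> dict:
    """Parse `key=value` lines into a dict (hypothetical helper)."""
    lines = [ln for ln in lease_dump.strip().splitlines() if ln.strip()]
    try:
        # dict() raises ValueError when a line has no "=" separator,
        # because split() then yields a 1-element sequence.
        return dict(ln.split("=", maxsplit=1) for ln in lines)
    except ValueError as error:
        LOG.error("Error parsing dhcpcd lease: %r", error)
        raise InvalidDHCPLeaseFileError from error
```

Callers then only need to handle one well-defined exception type instead of a bare `ValueError` surfacing from deep inside the parser.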
After [0, 1], dhcp6 is going to be always false after upgrading cloud-init. Correct this in the integration test. Refs: [0] canonical#3980 [1] https://bugs.launchpad.net/cloud-init/+bug/1976526
Don't log sensitive data. Since /var/log/cloud-init.log is a privileged file, this does not expose a secure system (no CVE). However, we don't want to log this information so that users can file reports without having to manually redact logs. Standardize log messages so that redacted and non-redacted logs match.
…5145) This reverts commit f0fb841. It appears that this bug was fixed already via another patch sometime between the time I found the issue and submitted the PR canonical#5104. This patch isn't needed any longer and I want to avoid causing additional problems. Signed-off-by: Major Hayden <major@redhat.com>
…cal#5144) In scenarios where a lot of retries are expected, Ubuntu 24.04 fails regularly with "Too many open files". The `ulimit -n` shows the same number of allowed open files in Ubuntu 20.04 (1024), but the connections don't close on 24.04. As retries get close to 1024 in readurl(), the open file limit is hit and exceptions sprout up in a number of places. It appears that the reuse of Session's context manager triggers a connection leak on python3-requests used in 24.04 when saving references to the requests (in excps[]).
- drop `with session as sess` context manager. Session should be able to handle all retry attempts without a context manager
- raise exceptions immediately when required rather than saving them to excps[] to raise outside of the exception handler
Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
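The second bullet can be illustrated with a small retry loop. This is only a sketch of the pattern, not cloud-init's `readurl()`: `read_with_retries` and its `fetch` callable are hypothetical, and `OSError` stands in for whatever request exception the real code handles.

```python
import time


def read_with_retries(fetch, max_attempts=3, delay=0.0):
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == max_attempts:
                # Re-raise from inside the handler. No list of saved
                # exceptions is kept, so nothing holds references to
                # earlier failed attempts.
                raise
            time.sleep(delay)
```

Because each exception goes out of scope when its handler exits, objects attached to a failed attempt can be released before the next attempt starts.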
…tion (canonical#5146)

Cloud-init does not configure rfc3442-classless-static-routes if dhclient isn't patched to support them or it is not configured with:

```
option rfc3442-classless-static-routes code 121 = array of unsigned integer 8;
```

Example lease with option configured (typical):

lease {
  interface "eth0";
  <...cut...>
  option rfc3442-classless-static-routes 0,10,0,0,1,32,168,63,129,16,10,0,0,1,32,169,254,169,254,10,0,0,1;
  <...cut...>
}

Example lease without option, where it is presented as "unknown-121":

lease {
  interface "eth0";
  <...cut...>
  option unknown-121 0:a:0:0:1:20:a8:3f:81:10:a:0:0:1:20:a9:fe:a9:fe:a:0:0:1;
  <...cut...>
}

The primary difference is that dhclient outputs the bytes in a hex-encoded format and with `:` delimiter. Extend existing parsing to support this format.

With a couple added INFO logs, here is a sample DHCP on Azure with static routes being parsed from unknown-121 option with this patch:

```
2024-04-04 16:12:01,677 - ephemeral.py[DEBUG]: Received dhcp lease on eth0 for 10.0.0.11/255.255.255.0
2024-04-04 16:12:01,677 - dhcp.py[INFO]: Parsing: '0:a:0:0:1:20:a8:3f:81:10:a:0:0:1:20:a9:fe:a9:fe:a:0:0:1'
2024-04-04 16:12:01,677 - dhcp.py[INFO]: Tokens: ['0', '10', '0', '0', '1', '32', '168', '63', '129', '16', '10', '0', '0', '1', '32', '169', '254', '169', '254', '10', '0', '0', '1']
2024-04-04 16:12:01,677 - ephemeral.py[DEBUG]: Attempting setup of ephemeral network on eth0 with 10.0.0.11/24 brd 10.0.0.255
2024-04-04 16:12:01,677 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'addr', 'add', '10.0.0.11/24', 'broadcast', '10.0.0.255', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-04-04 16:12:01,679 - subp.py[DEBUG]: Running command ['ip', '-family', 'inet', 'link', 'set', 'dev', 'eth0', 'up'] with allowed return codes [0] (shell=False, capture=True)
2024-04-04 16:12:01,681 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'append', '0.0.0.0/0', 'via', '10.0.0.1', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-04-04 16:12:01,683 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'append', '168.63.129.16/32', 'via', '10.0.0.1', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-04-04 16:12:01,684 - subp.py[DEBUG]: Running command ['ip', '-4', 'route', 'append', '169.254.169.254/32', 'via', '10.0.0.1', 'dev', 'eth0'] with allowed return codes [0] (shell=False, capture=True)
2024-04-04 16:12:01,686 - handlers.py[DEBUG]: start: azure-ds/_check_if_primary: _check_if_primary
2024-04-04 16:12:01,686 - handlers.py[DEBUG]: finish: azure-ds/_check_if_primary: SUCCESS: _check_if_primary
2024-04-04 16:12:01,687 - azure.py[DEBUG]: Obtained DHCP lease on interface 'eth0' (primary=True driver='hv_netvsc' router='10.0.0.1' routes=[('0.0.0.0/0', '10.0.0.1'), ('168.63.129.16/32', '10.0.0.1'), ('169.254.169.254/32', '10.0.0.1')] lease={'interface': 'eth0', 'fixed-address': '10.0.0.11', 'server-name': 'BL24A1071918060SOC', 'subnet-mask': '255.255.255.0', 'dhcp-lease-time': '4294967295', 'routers': '10.0.0.1', 'dhcp-message-type': '5', 'domain-name-servers': '168.63.129.16', 'dhcp-server-identifier': '168.63.129.16', 'dhcp-renewal-time': '4294967295', 'unknown-121': '0:a:0:0:1:20:a8:3f:81:10:a:0:0:1:20:a9:fe:a9:fe:a:0:0:1', 'dhcp-rebinding-time': '4294967295', 'unknown-245': 'a8:3f:81:10', 'domain-name': 'fyoqc4gghleevjxtq4h4pjbded.bx.internal.cloudapp.net', 'renew': '0 2160/05/11 22:40:16', 'rebind': '0 2160/05/11 22:40:16', 'expire': '0 2160/05/11 22:40:16'} imds_routed=True wireserver_routed=True)
```

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
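To make the two encodings concrete, here is a minimal sketch of decoding option 121 in either form. `parse_static_routes` is a hypothetical name and this is not cloud-init's parser; it assumes the value is either comma-separated decimal bytes or colon-separated hex bytes, as in the two leases above, and follows the RFC 3442 layout of prefix length, significant destination octets, then a 4-octet gateway.

```python
def parse_static_routes(option_value: str) -> list:
    """Decode RFC 3442 classless static routes into (cidr, gateway) pairs."""
    if ":" in option_value:
        # "unknown-121" form: hex-encoded bytes separated by ":".
        tokens = [int(tok, 16) for tok in option_value.split(":")]
    else:
        # Named-option form: decimal bytes separated by ",".
        tokens = [int(tok) for tok in option_value.split(",")]

    routes = []
    idx = 0
    while idx < len(tokens):
        prefix_len = tokens[idx]
        # Only the significant octets of the destination are encoded.
        n_octets = (prefix_len + 7) // 8
        end = idx + 1 + n_octets + 4
        if prefix_len > 32 or end > len(tokens):
            break  # malformed or truncated option; stop parsing
        dest = tokens[idx + 1 : idx + 1 + n_octets] + [0] * (4 - n_octets)
        gateway = tokens[idx + 1 + n_octets : end]
        routes.append(
            (
                ".".join(map(str, dest)) + f"/{prefix_len}",
                ".".join(map(str, gateway)),
            )
        )
        idx = end
    return routes
```

Fed the `unknown-121` value from the sample lease, this yields the same three routes that appear in the final `azure.py` log line.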
In the Alibaba Cloud scenario, we do not wish to define routing priority based on MAC addresses. In a cloud environment where the kernel parameter net.ifnames=0 has been configured, network interface card (NIC) names are determined by default according to their underlying Bus, Device, and Function (BDF) numbers, incrementing from eth0 to ethN, with eth0 acting as the default primary NIC name. In the previous logic, network-card has the highest priority, followed by device-number as the second priority. When _fallback_nic_order is set to NicOrder.MAC, the mac address takes the third priority. On the other hand, when _fallback_nic_order is set to NicOrder.NIC_NAME, the NIC name becomes the third priority. In AWS environments, the default setting remains as _fallback_nic_order = NicOrder.MAC, maintaining the original behavior. However, in Alibaba Cloud scenarios, we set _fallback_nic_order = NicOrder.NIC_NAME.
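The priority order described above can be sketched as a sort key. The `NicOrder` members come from the commit message; the dictionary keys (`network_card`, `device_number`, `mac`, `name`), `sort_nics`, and the numeric handling of `ethN` names are assumptions made for illustration, not the datasource's actual data model.

```python
import re
from enum import Enum


class NicOrder(Enum):
    MAC = "mac"
    NIC_NAME = "nic_name"


def _name_key(name: str):
    """Sort eth2 before eth10 by comparing the numeric suffix as an int."""
    match = re.fullmatch(r"(\D*)(\d*)", name)
    if match is None:
        return (name, 0)
    prefix, index = match.groups()
    return (prefix, int(index or 0))


def sort_nics(nics, fallback_nic_order=NicOrder.MAC):
    """Order NICs by network-card, then device-number, then the fallback."""

    def key(nic):
        if fallback_nic_order is NicOrder.NIC_NAME:
            third = _name_key(nic["name"])
        else:
            third = nic["mac"]
        return (nic["network_card"], nic["device_number"], third)

    return sorted(nics, key=key)
```

With `NicOrder.NIC_NAME`, `eth0` wins ties regardless of its MAC address, which matches the stated expectation that `eth0` acts as the primary NIC when `net.ifnames=0` is set.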
…l#5122) FreeBSD obtains the device partition name through the function find_freebsd_part, whose parameter is the name of the mounted device. This is usually the first field of /etc/fstab, such as /dev/gpt/rootfs. FreeBSD's fstab also supports gptid or ufsid as a unique ID to identify partitions, such as /dev/gptid/xxx and /dev/ufsid/xxx. Update the function to support these.
Various different activators, datasources, and networking code implementations make use of manual iproute2 calls, which has led to much code duplication in the codebase. This is a small step towards replacing distro assumptions at call sites with common interfaces, which will simplify future refactors for more distro-agnostic code. These same abstractions will also enable simpler testing.
) This reverts commit 9758673. Not needed after canonicalGH-5145.
PyYAML has built-in unicode support in Python3+. The original code[1] was added as a helper to add support for unicode to `yaml.safe_load()`. We don't need this anymore, and can jettison it and prefer `yaml.safe_load()`. [1] a7a9de1
Alpine uses mdev for device mapper. BSDs don't have device mapper.
Many failures are being treated as warnings instead of errors due to usage of logexc() to emit the failure. Add log_level parameter to allow increasing the log level without requiring an additional log. Add tests, but I'm unsure why it's not logging the backtrace when the failure occurs within the test method. Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
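A minimal sketch of what such a helper could look like, assuming it emits the message once at the caller-chosen level and once at DEBUG with the traceback attached. The signature below is illustrative and may not match cloud-init's `logexc()` exactly.

```python
import logging


def logexc(log, msg, *args, log_level=logging.WARNING, exc_info=True):
    """Log msg at log_level, then again at DEBUG with the traceback."""
    # First record: the short message at the level the caller chose.
    log.log(log_level, msg, *args)
    # Second record: same message at DEBUG, carrying the active exception.
    log.debug(msg, *args, exc_info=exc_info)
```

A caller inside an `except` block can then write `logexc(LOG, "Failed: %s", name, log_level=logging.ERROR)` to surface the failure as an error without adding a second log call.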
Currently, WSL only supports manual cloud-init configurations provided in the host Windows filesystem. This adds support for Landscape/Ubuntu Pro for WSL to provide cloud-init configurations and have them merged with or override manual user configurations. This adds support for organizations to better provision WSL instances using cloud-init. Co-authored-by: Carlos Nihelton <carlosnsoliveira@gmail.com> Co-authored-by: Chad Smith <chad.smith@canonical.com>
None of the unit tests should be reaching out to the network. Since Alpine tests run under LXD, we can still easily run tests there without network. Also, include the tzdata package that was missed in 725f5fb and remove unneeded network debug lines.
Attempting a `cloud-init clean --reboot` fails on Alpine because this command is hard-coded. This code already has an Init object available, which has a Distro attribute. Stamp out the duplicate hard-coded implementation and re-use distro reboot code to acquire cross-distro compatibility.

Error:
Could not reboot this system using "['shutdown', '-r', 'now']": Unexpected error while running command.
Command: ['shutdown', '-r', 'now']
Exit code: -
Reason: [Errno 2] No such file or directory: b'shutdown'
Stdout: -
Stderr: -
…5239) During validation process, the network schema is extracted without the network key. As such, the schema validation should work either with or without the top level network key. This change updates the schema and adds a unit test to validate.
EC2 documents that the system-uuid may be reported in different endianness[1]. A user has reported a case where cloud-init is broken due to inability to detect the system platform. Fix it.

Behavior change: Cloud-init was previously making the assumption that uuid and serial would match on ec2. This assumption was:

1) not documented as a valid way to identify ec2[1]
2) proven invalid on ec2 by the DMI_PRODUCT_SERIAL and DMI_PRODUCT_UUID reported in canonical#5105
3) used in the logic which warns about not running on the "real" ec2

Preserving this warning logic exactly as it was presents several challenges:

a) Risk of regression outside of our control: Since this logic relied upon undocumented behavior, AWS could change this at any point, which would break all cloud-init instances.

b) Risk of incorrect implementation: What format is the uuid and product serial actually in? We don't know. It's easy and safe to just swap the byteorder of the first segment of the uuid because this is documented, but matching the whole uuid is problematic because UUID formats may be presented as mixed encoding (partially little endian and partially big endian). To implement this behavior while fixing this bug we would have to make even more assumptions than before. I propose we stop assuming, and if a cloud happens to implement the same as EC2 (minus the serial/product match), then we just don't emit that warning. It's simpler, it's safer, and I really don't think that it is a huge change. This is a "change in behavior", but the change is that the code more correctly identifies EC2 and would no longer emit a warning on valid ec2 instances, so I don't think that this would require omitting this change from SRU.

c) Implementing whatever assumptions we make in b) would require implementing a byteswapping algorithm in POSIX shell, which is possible but best avoided.

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/identify_ec2_instances.html

Fixes canonicalGH-5105
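The "swap the byteorder of the first segment" check can be sketched in Python as below. The actual detection discussed in the commit runs in POSIX shell, so this is only an illustration of the logic; `uuid_matches_ec2` is a hypothetical name and the UUIDs used with it are sample values.

```python
def uuid_matches_ec2(system_uuid: str) -> bool:
    """Return True if the UUID starts with "ec2" in either byte order."""
    first_segment = system_uuid.lower().split("-")[0]
    if first_segment.startswith("ec2"):
        return True
    try:
        # Reverse the bytes of the first segment only, e.g.
        # "45e12aec" -> bytes 45 e1 2a ec -> "ec2ae145".
        swapped = bytes.fromhex(first_segment)[::-1].hex()
    except ValueError:
        return False  # not valid hex, so it cannot be a DMI UUID segment
    return swapped.startswith("ec2")
```

Only the first segment is examined, which avoids guessing how the remaining, possibly mixed-endian, segments are encoded.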
Commit acc68de introduced a change which no longer builds a wheel, however integration tests now fail when dependencies are not available. Include the base requirements in test-requirements.txt. Fixes canonicalGH-5210
When it passes locally, it would be good to know why ci fails.
…nonical#5251) The PPA provided to CLOUD_INIT_CLOUD_INIT_SOURCE can contain a lower version of cloud-init than what is currently released for a given series. In those cases install an apt preferences file to pin the cloud-init installed to the given PPA regardless of the published version.
- Add missing templates for chrony and ntp configuration files.
- AlmaLinux OS is binary compatible with RHEL, and CloudLinux OS is based on AlmaLinux OS. So, let's use distro-specific configurations from rhel.
Signed-off-by: Elkhan Mammadli <elkhan.mammadli@protonmail.com>
…nical#5226) cc_mounts configures a `Requires=cloud-init.service` in the configured mount unit via `x-systemd.requires=cloud-init.service`. This creates a requirement dependency on cloud-init.service, even if cloud-init is disabled. Fix this by changing the mount unit dependency to `x-systemd.after=cloud-init.service`. Fixes canonicalGH-2815
Do not validate network config against cloud-init's network v2 schema on netplan systems. The netplan schema supports keys not present in cloud-init's v2 schema, so cloud-init can emit schema warnings for keys that are perfectly acceptable netplan config. Given that cloud-init performs a clean passthrough of network version 2 directly to netplan without trying to process the network configuration, there is little value in providing such schema warnings unless cloud-init's network v2 schema is aligned with the specific netplan schema supported for each release. On mantic and later, cloud-init will call netplan's python API to validate schema with netplan itself, but prior to Ubuntu Mantic no python API exists, so we cannot validate network v2 against netplan. Update skip messaging and integration tests.
Since lxc/lxcfs#292 has been fixed, we can use util.uptime(). On a freshly booted container the following script:

```
import os, time
from cloudinit.util import uptime
print(f"{uptime(),os.stat('/proc/1/cmdline').st_atime, time.monotonic()}")
```

Shows the following output:

('20.09', 1714515317.703158, 28017.975603713)

Since 20 seconds is much closer to the expected uptime than the other methods, use `util.uptime()`.
Harden cloud-init against system clock changes by using `time.monotonic()` and `cloudinit.util.uptime()` instead of `time.time()`. Use `util.uptime()` when time should increment across cloud-init stages. Use `time.monotonic()` when only the time delta in a single process matters.

Observe the effects of changing system time:

```
>>> time.time()
1714528647.3474798
>>> time.monotonic()
47.738647985
>>> from cloudinit.util import uptime
>>> uptime()
'70.09'
>>> # set time back over 1 year in another terminal
>>> time.time()
1672531205.9688644
>>> time.monotonic()
106.06945439
>>> uptime()
'109.39'
```

Fixes canonicalGH-2423
Fixes canonicalGH-3149
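For the single-process case, a timeout loop built on `time.monotonic()` might look like the following sketch. `wait_for` is a hypothetical helper, not a cloud-init function; it only shows why a deadline computed from the monotonic clock is unaffected by wall-clock jumps.

```python
import time


def wait_for(condition, timeout: float, poll_interval: float = 0.01) -> bool:
    """Poll condition() until it is truthy or timeout seconds elapse."""
    # The deadline is derived from the monotonic clock, so setting the
    # system time backwards or forwards cannot shorten or extend the wait.
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Had the deadline been computed with `time.time()`, the clock change shown in the session above would have pushed it more than a year into the future.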
The current descriptions do not follow systemd guidance and are misleading since some services have multiple roles.

systemd.unit(5) on Description:

A short human readable title of the unit. This may be used by systemd (and other UIs) as a user-visible label for the unit, so this string should identify the unit rather than describe it, despite the name.

Yet these service files attempt to describe the unit rather than identify it.

Before:

```
May 01 09:49:49.238629 tiny systemd[1]: Starting cloud-config.service - Apply the settings specified in cloud-config...
```

After:

```
May 01 09:49:49.238629 tiny systemd[1]: Starting cloud-config.service - Cloud-init: Config Stage...
```
Cloud-init closes stdin on startup, which breaks interactively running cloud-init under pdb. Leave stdin open if it is connected to a tty, which fixes pdb use. Otherwise:

  File "/usr/lib/python3.12/bdb.py", line 90, in trace_dispatch
    return self.dispatch_line(frame)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/bdb.py", line 115, in dispatch_line
    if self.quitting: raise BdbQuit
                      ^^^^^^^^^^^^^
bdb.BdbQuit
------------------------------------------------------------
The program exited via sys.exit(). Exit status: 1
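The tty check can be sketched as follows. `close_stdin_unless_tty` is a hypothetical name, and the optional `stream` parameter exists only so the behavior can be exercised without touching the real stdin; cloud-init's own helper may differ in detail.

```python
import os
import sys


def close_stdin_unless_tty(stream=None) -> bool:
    """Point the stream at /dev/null unless it is attached to a terminal.

    Returns True if the stream was redirected, False if it was left open.
    """
    stream = sys.stdin if stream is None else stream
    if stream.isatty():
        # Interactive session (for example under pdb): keep stdin usable.
        return False
    with open(os.devnull) as devnull:
        # Replace the underlying file descriptor so later reads hit EOF.
        os.dup2(devnull.fileno(), stream.fileno())
    return True
```

Under pdb the debugger prompt reads from stdin, so skipping the redirect for a tty is what prevents the `BdbQuit` shown above.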
TheRealFalcon
approved these changes
May 3, 2024
Diff and changelog looks good to me!
Perform new_upstream snapshot of main into oracular for release.
consolidate debian/changelog entries
process to create branch
test procedure followed: