-
Notifications
You must be signed in to change notification settings - Fork 931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User-defined meta-data can create malformed YAML #13853
Comments
Hi @holmanb I'm afraid im not really following what it is that LXD needs to change here? Also, not sure if relevant, but using the |
Thanks for the response @tomponline!
This is the offending line. See the commit on this branch for the change that I am proposing. I'm happy to submit a PR for this, but we need to release a change in cloud-init first to accommodate this expectation. This is why I filed a bug report rather than just a PR - I want to make sure that the proposed solution is acceptable before moving forward it.
The relevant key is $ lxc launch ubuntu:noble me -c cloud-init.meta-data=instance-'id: test_1'
Creating me
Error: Failed instance creation: Failed creating instance record: Unknown configuration key: cloud-init.meta-data
$ lxc launch ubuntu:noble me -c user.meta-data=instance-'id: test_1'
Creating me
Starting me similarly: $ lxc config set me user.meta-data=instance-'id: test_1'
$ lxc config set me cloud-init.meta-data=instance-'id: test_1'
Error: Invalid config: Unknown configuration key: cloud-init.meta-data If you want to deprecate the user.meta-data key as well for uniformity I could potentially make cloud-init support a new |
Thanks! Will this break users of LXD guests with older versions of cloud-init? |
Hrm, that is curious, I wasn't expecting that, but I'd need to dig into the commit history and original pull requests to try and understand why this wasn't originally changed to have a |
Please could you explain this statement. I'm confused why a key being used by cloud-init isn't part of cloud-config? |
I suspect we'll need option 1. at least, and then potentially land the proposed changed in 2. for only the 6.x series of LXD. |
This would break any user that provides a custom instance-id (duplicate key) on an older version of cloud-init, since this would cause cloud-init to see the old key where it didn't before. From a cloud-init perspective, fixes for bugs come in new releases so the typical stability / support recommendation is "upgrade to the latest version". If we want to avoid breaking old instances, I could probably update the proposal I made above to increment the api rev number.
Agreed. Let me know if you'd like to go that route.
Cloud-config isn't required for any of the keys: vendor-data, user-data, or meta-data. Cloud-config is just one of cloud-init's configuration formats. There are several configuration format options available for user-data and vendor-data, including cloud-config, and even just running a shell script: config:
...
user.user-data: |
#!/usr/bin/bash
echo hello | tee -a /tmp/example.txt With the above example a user would see: $ lxc exec me -- cat /tmp/example.txt
hello User-data is provided by the user for the purpose of configuring an instance. Vendor-data is likewise intended to by provided by the cloud/vendor for the purpose of configuring an instance with cloud-specific information. Both vendor-data and user-data can be any of the multiple configuration formats mentioned above. Meta-data doesn't follow any of the above formats, and is not intended to be a configuration format for the instance. Instead, it supposed to tell cloud-init just a few pieces of information about the instance: its instance_id, region, etc. The lines are blurred a bit because a couple of the keys that it supports overlap with cloud-config. One of the overlapping keys is
That sounds fine by me. Let me know if my responses here or further digging revealed anything new that suggest that we shouldn't go forward with this proposal. This PR is my proposal to option 1, if you'd like to take a look. |
@holmanb Hi, would you mind booking a meeting to discuss this issue? Thanks |
I just saw this when checking back on the status of this. I'd be happy to. |
Thanks for the call @holmanb As discussed, you can change the instance-id exposed to cloud-init via LXD's devlxd metadata API (https://documentation.ubuntu.com/lxd/en/latest/dev-lxd/#meta-data) by changing To change I also think we should entirely remove the
https://discuss.linuxcontainers.org/t/lxd-first-class-cloud-init-support/12559/18 See also Removed from docs here:
There is also an issue confirming its removal here (although there's some confusion between user.user-data and user.meta-data in that thread): |
Thanks @tomponline for discussing. The volatile key and instance rename should meet our needs. Cloud-init has one test which I recently added which depends on setting the instance ID via the I just submitted a PR against cloud-init to update cloud-init's lxd documentation per our conversation. |
@holmanb @tomponline we have a second use case for the user of If the ability to set |
@blackboxsw thanks for catching that, I didn't catch that platform ssh keys can be provided in meta-data. For completeness I double checked the other potential users of meta-data. Here are the references to arbitrary meta-data keys that I see in
None of these appear to be used by our LXD datasource code, nor by any of our tests in cloud-init or pycloudlib, so it looks to me like @tomponline My apologies, I missed this requirement. It seems that As @blackboxsw suggested, if lxd were to provide the ssh key some other way (such as with a new key |
what form would |
Regarding applying these settings, we would need to be able to be set this value in a profile (preferred) or on an instance before launch (less preferred, but workable). If I understand correctly, both of these expectations are true for the other volatile keys. Regarding the datatype, it would be best if this key could contain both string and list of strings. That will ensure that users can continue inserting either a single key or multiple public keys. If only one datatype is preferred, that would probably be fine too (as just a list of strings) but would require some changes in pycloudlib to accomidate. Regarding the upgrade path, we could make cloud-init and pycloudlib fall back to setting |
No need to support two format types for this content if it adds unnecessary complexity to lxd for a For example the following multi-line string value would allow cloud-init import my two public keys
Simplistically, I'm imagining something like this --- a/lxd/devlxd.go
+++ b/lxd/devlxd.go
@@ -145,7 +145,7 @@ func devlxdMetadataGetHandler(d *Daemon, inst instance.Instance, w http.Response
value := inst.ExpandedConfig()["user.meta-data"]
- return response.DevLxdResponse(http.StatusOK, fmt.Sprintf("#cloud-config\ninstance-id: %s\nlocal-hostname: %s\n%s", inst.CloudInitID(), inst.Name(), value), "raw", inst.Type() == instancetype.VM)
+ return response.DevLxdResponse(http.StatusOK, fmt.Sprintf("#cloud-config\ninstance-id: %s\nlocal-hostname: %s\n%s%s", inst.CloudInitID(), inst.Name(), inst.CloudInitPublicKeys(), value), "raw", inst.Type() == instancetype.VM)
}
var devlxdEventsGet = devLxdHandler{
diff --git a/lxd/instance/drivers/driver_common.go b/lxd/instance/drivers/driver_common.go
index 9d547032a8..adc124451e 100644
--- a/lxd/instance/drivers/driver_common.go
+++ b/lxd/instance/drivers/driver_common.go
@@ -171,6 +171,14 @@ func (d *common) CloudInitID() string {
return d.name
}
+// CloudInitPublicKeys returns a string containing a new-line separated list of SSH authorized keys to configure for an instance
+func (d *common) CloudInitPublicKeys() string {
+ id := d.LocalConfig()["volatile.cloud-init.public-keys"]
+ if id != "":
+ id = fmt.Sprintf("public-keys: %s\n", id)
+ return id
+}
+
// Location returns instance's location.
func (d *common) Location() string {
return d.node |
I prefer if we can avoid receiving structured inputs which then require additional parsing. Unnecessary parsing inevitably leads to bugs and introduces corner cases. If a single format type is preferred, I would lean slightly towards a list of strings. This would benefit not just cloud-init with a simpler implementation, but also users because of more correct validation of inputs. |
List of strings sounds good and can easily be validated in lxd/instance/drivers/driver_common.go and be presented as YAML list expected by cloud-init's DataSourceLXD meta-data processing of public-keys. I also note that the leading #cloud-config in existing lxd devlxdMetadataGetHandler doesn't need the leading #cloud-config (as it's not cloud-config and the comment header line is ignored for meta-data anyway). |
Related: I just submitted a PR because |
So LXD config options have no concept of "list of strings", all config options are a single string. However they can contain commas, newlines etc, so depending on the expected content of the string, selecting an appropriate delimiter is important (i.e can commas appear in SSH keys?). If the format is well understood by LXD, then we can validate it, split it and deliver it to cloud-init in the desired format (i.e list of strings). I would like avoid having an undefined blob of data like the current meta-data setting is as it leads to the issues we've found where the format is not well understood in all situations.
Sounds good.
volatile keys can only be set on the instance, not the profile, so I would suggest adding a new proper config key, such as Interestingly we've recently received a request for something similar from elsewhere in Canonical, albeit it without requiring cloud-init (LXD would set up the SSH keys in the guest). So if we did add a config key like this, and LXD set up the keys directly, would this data even need to be exported to cloud-init's metadata? Ofcourse we could do both, but would they then potentially conflict? cc @mionaalex |
Good to know.
That could work, and I think as @blackboxsw suggested new line delimited would be fine. I think that we would just want to avoid passing empty strings, which might get introduced for example when a user uses two newlines between keys rather than one newline.
Agreed
Sounds good. As described above, it wouldn't be used internally anyways so
I think that it would probably be preferred if we can exercise this code path in cloud-init using LXD, but I do think that doing both would conflict. Maybe we could get away with just testing this functionality on other clouds. @blackboxsw thoughts? |
Why are there multiple ways of setting up SSH keys in cloud-init via both meta-data and user-data? |
@blackboxsw does cloud-init apply the existing |
Required information
Issue description
Problem
The user-defined meta-data key gets appended as a string to the lxd-provided meta-data. This means that duplicate keys can be added, which creates a configuration that isn't well defined. Both 1.1 and 1.2 of the YAML spec state that keys are unique, which this violates.
The configuration received by cloud-init:
Cloud-init's implementation uses PyYAML which happens to use the last defined key - which happens to produce the desired outcome (allow user to override the default meta-data), but it depends on undefined behavior of a specific library. If cloud-init were ever to move to a different YAML library this behavior could break or need to be manually worked around.
In order to preserve the current behavior while creating a path to using standard-compliant yaml while preserving backwards compatibility, we could do the following:
cloud-init could be updated to make values in
metadata['config']['user.meta-data']
override values inmetadata['meta-data']
. This wouldn't change cloud-init's current behavior, which ignores the values inmetadata['config']
. We could optionally check for a bump to the value in_metadata_api_version
before doing this, but this wouldn't be strictly required since this is functionally identical currently.Once stable distributions have this update, we could update the api to no longer append user meta-data to the default metadata (and bump the meta-data api, if desired). While we're making this change, we might want to drop the
#cloud-config
comment too. This isn't necessary because meta-data isn't part of cloud-config.canonical/cloud-init#5575
Information to attach
lxc info NAME --show-log
)lxc config show NAME --expanded
)The text was updated successfully, but these errors were encountered: