New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
schema validation not in line with specification #4004
Comments
Launchpad user Alberto Contreras(aciba) wrote on 2022-08-02T12:15:48.995391+00:00 Reproduced with: cat > /tmp/lp1983306 <<EOF
#include
http://www.canonical.com
EOF
lxc launch ubuntu-daily:kinetic lp1983306
lxc file push /tmp/lp1983306 lp1983306/root/
lxc exec lp1983306 -- cloud-init status --wait
$ lxc exec lp1983306 -- cloud-init --version
/usr/bin/cloud-init 22.2-64-g1fcd55d6-0ubuntu1~22.10.1
$ lxc exec lp1983306 -- cloud-init schema -c /root/lp1983306
Error:
Cloud config schema errors: format-l1.c1: File /root/lp1983306 needs to begin with "#cloud-config"
$ lxc exec lp1983306 -- cloud-init schema -c /root/lp1983306 --annotate
#include # E1
http://www.google.com
# Errors: -------------
# E1: File /root/lp1983306 needs to begin with "#cloud-config" Note that the query command does also not resolve include directives: $ lxc exec lp1983306 -- cloud-init query -u /root/lp1983306 userdata.asdf
Traceback (most recent call last):
File "/usr/bin/cloud-init", line 33, in <module>
sys.exit(load_entry_point('cloud-init==22.2', 'console_scripts', 'cloud-init')())
File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 1063, in main
retval = util.log_time(
File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2621, in log_time
ret = func(*args, **kwargs)
File "/usr/lib/python3/dist-packages/cloudinit/cmd/query.py", line 291, in handle_args
response = _find_instance_data_leaf_by_varname_path(
File "/usr/lib/python3/dist-packages/cloudinit/cmd/query.py", line 230, in _find_instance_data_leaf_by_varname_path
jinja_vars_with_aliases = jinja_vars_with_aliases[key_path_part]
TypeError: string indices must be integers |
Launchpad user Brett Holman(holmanb) wrote on 2022-08-04T17:03:08.404881+00:00 Hi Peter, Thanks for reporting. I wouldn't expect validation to succeed on include format user-data for the following reason: Per the man page[1] and user docs[2], the schema command is for validating cloud-config files. In the future we could potentially add validation for include or other user-data formats, but for now it doesn't support anything besides cloud-config files. If you see anywhere that the docs claim that the schema validator supports all user-data formats, please let us know so we can correct it. I don't see anywhere at this time, but it's possible I missed something in the docs.
Can you expand on this statement please? How does this block you? I'm marking this as Incomplete until we get more information on how mangled logs cannot be worked around. [1] https://github.com/canonical/cloud-init/blob/main/doc/man/cloud-init.1 |
Launchpad user Peter Surda(surda) wrote on 2022-08-05T00:26:31.795155+00:00
If you use #include, it results in truncated logs. This happens automatically and there is no obvious way to disable it. |
Launchpad user Chad Smith(chad.smith) wrote on 2022-08-08T21:13:04.246981+00:00 Thanks a lot Peter for the bugs. I agree with Brett on the sentiment that the following cmd should fail because the cloud-init schema -c <some_cloud_config_file> But two things look like they are a bug here:
Both touch points should be fixed. As mentioned by Peter, this will break anyone using non Because cloud-initlxc exec lp1983306 -- cloud-init schema -c /root/lp1983306 --annotate |
Launchpad user Chad Smith(chad.smith) wrote on 2022-08-08T21:13:41.187231+00:00 I'm going to bump this to high so we can resolve it before cloud-init 22.3 |
Launchpad user Peter Surda(surda) wrote on 2022-08-09T02:54:47.166834+00:00 I found out that while running in a terminal, the inconsistency occurs as I described above. However that appears to be an insufficient condition to cause mangled logs as well, that needs additional errors (in my case jinja2 template errors) while rendering the full cloud config. The message therefore may be confusing. In my case, it suggests it's the "#include" causing the problems, but it appears that it's an invalid jinja2 template (well, technically a missing variable resulting in a rendering error) causing the problem (mangled logs). To summarise the point I'm trying to make, for me the main problem is the practical confusion in messages between components of cloud-init, not the theoretical inconsistency between specification and validation, as I thought originally. In case I am confusing, here is a practical example:
This output is the same on a system that has mangled logs, and one that doesn't have mangled logs. Someone who has mangled logs (e.g. me) may erroneously conclude that it's the validator causing his mangled logs. The system that has mangled logs has actually a different cause for the problems, but the validator doesn't show this cause as it doesn't get past the "schema errors" message above. |
Launchpad user Chad Smith(chad.smith) wrote on 2022-08-09T05:33:06.243607+00:00 This makes sense. I think we can minimally fix the messaging seen on the commandline when running Minimally I think we need to fix https://github.com/canoncloud-config ical/cloud-init/blob/main/cloudinit/config/schema.py#L621-L623 to validated the processed/rendered userdata instead of "userdata_raw" so that I can validate that we get a less than desirable message from reproducer: terminal 1 on host system 192.168.1.8$ mkdir -p instance-data $ python3 -m http.server terminal 2 on host system 192.168.1.8$ cat > include.yaml <<EOF $ lxc exec include-k bash See raw userdata (this is ok/desired)root@include-k:~# cloud-init query userdata verify cloud-init schema --system alias for rendered user-data (not correct)root@include-k:~# cloud-init schema --system --annotate Errors: -------------E1: File None needs to begin with "#cloud-config"What I think we want: This comes with something like the following diff, but needs tweaking and unittest coverage from cloudinit import importer, safeyaml try:
|
Launchpad user Alberto Contreras(aciba) wrote on 2022-08-09T13:44:38.267617+00:00 Thanks, chad.smith, for the proposed patch. I have added unit tests to it and created a PR here: #1644 |
Launchpad user Brett Holman(holmanb) wrote on 2022-08-19T16:37:35.614544+00:00 This bug is believed to be fixed in cloud-init in version 22.3. If this is still a problem for you, please make a comment and set the state back to New Thank you. |
This bug was originally filed in Launchpad as LP: #1983306
Launchpad details
Launchpad user Peter Surda(surda) wrote on 2022-08-02T04:03:16.370717+00:00
Related to #1983303. My user-data begins with #include, as it's not a "Cloud Config Data" but an "Include File" as described in the official documentation. However, this causes the validator
cloud-init schema --system
to complain thatOk I thought, I just manually add "#cloud-config" at the top and re-test:
Well, it's not a YAML dict because it's not a cloud config data but an include file, which isn't in the YAML format.
See the specification: https://cloudinit.readthedocs.io/en/latest/topics/format.html
Also look at the implementation in
user_data.py
, function_do_include
. As you can see, this file isn't processed as YAML but parsed line by line. So the specification and implementation agree, but the schema validator doesn't and thinks it should process it as YAML.This wouldn't be a practical problem for me, but due to #1983303 I get mangled logs and can't work around it.
The text was updated successfully, but these errors were encountered: