Cloud init schema failures #1954

Chris-Peterson444 · 2024-03-25T23:44:16Z

Users attempting to do autoinstall may incorrectly send autoinstall directives as cloud-config, which will result in cloud-init schema validation errors. When loading autoinstall from cloud-config, we now check to see if there are any cloud-init schema validation errors and warn the user. Additionally, if the source of the error is from a known autoinstall error, we inform the user and halt the installation with a nonreportable AutoinstallError.

Requires #1945 and #1947.
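The behaviour described above can be pictured with a minimal sketch. Nothing here is subiquity's actual code: `AutoinstallError` is a local stand-in class, `KNOWN_AUTOINSTALL_KEYS` is a hypothetical and partial list of autoinstall top-level keys, and `check_schema_failures` is an invented name. It assumes the failing keys have already been extracted from the cloud-init schema output.

```python
# Illustrative stand-in; not subiquity's real exception class.
class AutoinstallError(Exception):
    """Raised when autoinstall directives appear to have been sent as cloud-config."""


# Hypothetical, partial set of autoinstall top-level keys (an assumption).
KNOWN_AUTOINSTALL_KEYS = {
    "autoinstall",
    "interactive-sections",
    "early-commands",
    "late-commands",
    "identity",
    "storage",
}


def check_schema_failures(failing_keys: list[str]) -> list[str]:
    """Halt on autoinstall-looking keys; otherwise return warning strings."""
    autoinstall_keys = sorted(k for k in failing_keys if k in KNOWN_AUTOINSTALL_KEYS)
    if autoinstall_keys:
        # Known autoinstall directive found at the cloud-config top level.
        raise AutoinstallError(
            "cloud-config contains autoinstall directives: "
            + ", ".join(autoinstall_keys)
        )
    # Any other schema failure is only reported as a warning.
    return [f"cloud-init schema validation failed for key: {k}" for k in failing_keys]
```

Unrelated schema failures come back as warnings, while a key such as `late-commands` at the top level raises and stops the flow.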

dbungert

A comment for now, more review tomorrow.

subiquity/cloudinit.py

Chris-Peterson444 · 2024-03-26T03:54:46Z

@blackboxsw could you please take a look at this as well?

dbungert

Chris, I'm really glad to see this work done. I think it's going to help reduce user confusion. I have some items to follow-up on.

subiquity/cloudinit.py

subiquity/server/server.py

Chris-Peterson444 · 2024-03-26T22:12:21Z

Thanks for the review @dbungert. I've gone ahead and implemented your suggestions. I added an extra test case to capture the scenario in which autoinstall keys are present in the cloud-config and the autoinstall key itself is not.

Also, by removing the dependence on the recoverable errors I was able to test this successfully on a 20.04.6 image which has cloud-init version 22.4.2-0ubuntu0~20.04.2

ogayot

Good work! A few minor comments but nothing blocking

subiquity/cloudinit.py

subiquity/server/tests/test_server.py

subiquity/cloudinit.py

Chris-Peterson444 · 2024-03-28T17:55:43Z

Thanks @ogayot I've implemented all of your suggestions

dbungert

One tweak please then LGTM.

subiquity/cloudinit.py

Chris-Peterson444 · 2024-04-02T20:23:33Z

Thanks! Since I changed CloudInitSchemaValidationError to a NonReportableException, I also added some lines to the client code so the error overlay works for those error types. Could you give that a look over too @dbungert ?

blackboxsw

Thanks for the ping on this for our awareness.

While I think consuming the CLI and reading stderr is potentially fragile, I get that subiquity may not want to tightly couple to internal cloudinit Python libraries and functions to process and iterate over the SchemaValidationErrors raised by cloudinit.config.schema.validate_cloudconfig_schema or cloudinit.config.schema.validate_cloudconfig_file.

To use these functions, subiquity would likely have to do something like the following:

from cloudinit.config.schema import validate_cloudconfig_file, get_schema, SchemaValidationError

def unexpected_key_paths():  # wrapper name is illustrative; it gives `return` a function to live in
    cloudconfig_schema = get_schema()
    try:
        validate_cloudconfig_file('/var/lib/cloud/instance/user-data.txt', schema=cloudconfig_schema)
    except SchemaValidationError as e:
        return [schema_problem.path for schema_problem in e.schema_errors if "unexpected" in schema_problem.message]
    return []  # no schema errors were raised

Note that even the structured SchemaValidationError will not set a 'path' attribute if multiple unexpected keys exist on the object. So you would still need to pattern-match on schema_problem.message to extract the separate keys, which is nearly the same logic you already have for parsing the stderr of cloud-init schema --system.

The fragility of the CLI approach in this PR comes from cloud-init relying on jsonschema for that error string: cloud-init presents the error message directly, without modification. If the jsonschema module changes its error message format, this could break subiquity's parsing.

That said, I don't see this error output format from jsonschema being any different between jammy and noble and I think cloud-init would like to work on a machine-readable representation of cloud-init schema --system --format=yaml per your feature request canonical/cloud-init#5100 that will make this easier to process in the future.

Minor changes requested, which you can take or leave as you see fit:

- better regex pattern match, splitting of the parsed jsonschema error messages
- dropping unused functions/tests

We'll make sure we keep you informed when we start tackling canonical/cloud-init#5100, so that this code can consume friendlier structured content when it is available.

subiquity/cloudinit.py

blackboxsw · 2024-04-03T04:49:52Z

subiquity/cloudinit.py

+    # Matches:
+    # ('some-key' was unexpected)
+    # ('some-key', 'another-key' were unexpected)
+    pattern = r"\((?P<args>'[\w\-]+'(,\s'[\w\-]+')*)+ (?:was|were) unexpected\)"


A couple of thoughts on this regex:

The pattern should probably be more flexible, since these keys can contain almost any character, including underscores, periods, etc. So let's match each offending key with [^']+

Also, we can push the leading and trailing single quotes outside the (?P<args>...) group so we don't have to strip them later.

We can drop the trailing + after the (?P<args>...) group, as greedy matching and the * should take care of all listed unexpected keys.

Suggested change

-    pattern = r"\((?P<args>'[\w\-]+'(,\s'[\w\-]+')*)+ (?:was|were) unexpected\)"
+    pattern = r"\('(?P<args>[^']+(,\s'[^']+)*)' (?:was|were) unexpected\)"

Thanks for the review on this especially (Regex is hard!). I think we can definitely (1) replace [\w\-] with [^'] to be more flexible and (3) remove the trailing +, but (2) isolating the key names without the quotes is difficult. The suggested regex doesn't quite work and my attempts thus far have been insufficient.

I think I'm okay with another round of processing to strip the quotes and this is still an improvement on the parsing.
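A minimal sketch of where this exchange lands, assuming the jsonschema message format quoted in the comments above. The names `RELAXED`, `SUGGESTED`, and `unexpected_keys` are illustrative, not taken from the PR. `RELAXED` applies points (1) and (3) and keeps the quotes inside the group; `SUGGESTED` is the pattern from the review, included to show why it does not match a list of several keys.

```python
import re

# Points (1) and (3): [^']+ for key names, no trailing "+" after the group.
# The quotes stay inside the "args" group and are stripped afterwards.
RELAXED = r"\((?P<args>'[^']+'(,\s'[^']+')*) (?:was|were) unexpected\)"

# Pattern as suggested in the review, with the outer quotes outside the group.
SUGGESTED = r"\('(?P<args>[^']+(,\s'[^']+)*)' (?:was|were) unexpected\)"


def unexpected_keys(message: str) -> list[str]:
    """Return the key names listed in an "unexpected" jsonschema message."""
    match = re.search(RELAXED, message)
    if match is None:
        return []
    return [arg.strip("'") for arg in match.group("args").split(", ")]


single = "Additional properties are not allowed ('some-key' was unexpected)"
multiple = (
    "Additional properties are not allowed "
    "('some.key_1', 'another-key' were unexpected)"
)

print(unexpected_keys(single))
print(unexpected_keys(multiple))

# SUGGESTED matches one key but not several: after the first name it expects
# a comma, while the input has the closing quote of that name first.
print(re.search(SUGGESTED, single) is not None)
print(re.search(SUGGESTED, multiple) is not None)
```

Splitting on ", " would still mis-split a key that itself contains a comma followed by a space; that edge case is left aside here.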

blackboxsw · 2024-04-03T05:05:59Z

subiquity/cloudinit.py

+    args_list: list[str] = search_result.group("args").split(", ")
+    no_quotes: list[str] = [arg.strip("'") for arg in args_list]
+
+    return no_quotes


If the above pattern suggestion is acceptable, then you can avoid having to strip single quotes and adapt your split instead.

Suggested change

-    args_list: list[str] = search_result.group("args").split(", ")
-    no_quotes: list[str] = [arg.strip("'") for arg in args_list]
-
-    return no_quotes
+    args_list: list[str] = search_result.group("args").split("', '")
+
+    return args_list

subiquity/cloudinit.py

subiquity/tests/test_cloudinit.py

blackboxsw · 2024-04-03T05:48:35Z

subiquity/cloudinit.py

+async def get_schema_failure_sources() -> list[str]:
+    """Retrieve the keys causing schema failure."""
+
+    cmd: list[str] = ["cloud-init", "schema", "--system"]


Note that providing --system reports schema errors in any network-config or vendor-data, as well as the user-data, provided to the instance. This is probably OK, since you probably want subiquity to raise errors for any network-config, user-data, or vendor-data provided to the instance. What you will not have with this approach is visibility into which file generated the schema errors: you are processing stderr, which only emits the specific jsonschema error message for a given key, while the file name (user-data.txt or network-config.txt) currently appears only in stdout. I don't think subiquity currently provides network-config to cloud-init, but users could provide this config via kernel command-line params such as ds=nocloud-net;http://some_url/

What you may want to do is specifically perform schema validation of only the user-data if it exists.

Suggested change

-    cmd: list[str] = ["cloud-init", "schema", "--system"]
+    if not os.path.exists("/var/lib/cloud/instance/user-data.txt"):
+        log.debug("No processed cloud-init user-data present")
+        return []
+    cmd: list[str] = ["cloud-init", "schema", "-c", "/var/lib/cloud/instance/user-data.txt"]

or network-config:

    if not os.path.exists("/var/lib/cloud/instance/network-config.json"):
        log.debug("No processed cloud-init network-config present")
        return []
    cmd: list[str] = ["cloud-init", "schema", "-c", "/var/lib/cloud/instance/network-config.json", "--schema-type", "network-config"]

I think in general we want to capture errors from all sources since we read the combined config when extracting the autoinstall config (e.g. is it possible someone sends autoinstall config in their network-config?), but capturing the file source would improve our error messaging for sure. What do you think about leaving this to future improvements?
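One possible shape for that future improvement, as a rough sketch: build one cloud-init schema command per processed config file that exists, and keep each command paired with its file so that any parsed error can be attributed to its source. The function name `schema_commands` and the pairing scheme are assumptions; the paths and flags are the ones quoted in the review comment above. Vendor-data is left out because the thread gives no command for it.

```python
import os

# Paths quoted in the review comment above.
USER_DATA = "/var/lib/cloud/instance/user-data.txt"
NETWORK_CONFIG = "/var/lib/cloud/instance/network-config.json"


def schema_commands(
    user_data: str = USER_DATA,
    network_config: str = NETWORK_CONFIG,
) -> list[tuple[str, list[str]]]:
    """Return (source file, command) pairs for each config file present."""
    commands: list[tuple[str, list[str]]] = []
    if os.path.exists(user_data):
        commands.append((user_data, ["cloud-init", "schema", "-c", user_data]))
    if os.path.exists(network_config):
        commands.append(
            (
                network_config,
                [
                    "cloud-init", "schema", "-c", network_config,
                    "--schema-type", "network-config",
                ],
            )
        )
    return commands
```

Running each command separately costs one extra process per file, but the caller then knows which file each stderr message belongs to without having to parse stdout.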

Chris-Peterson444 · 2024-04-03T18:41:52Z

@blackboxsw Thanks a lot for the extensive review! I went ahead and implemented most of your suggested changes (all but two, I left my reasoning in the comments and kept them unresolved). Based on your analysis I think we're okay with the potential fragility in the error parsing until we move to the structured output when it's available. I look forward to seeing more on canonical/cloud-init#5100!

dbungert · 2024-04-03T23:35:09Z

OK to merge, despite Noble test failures (unrelated archive problems)

Chris-Peterson444 requested a review from dbungert March 25, 2024 23:44

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from a7f4f26 to 62d34b5 Compare March 26, 2024 00:11

dbungert reviewed Mar 26, 2024

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 62d34b5 to 077d49b Compare March 26, 2024 03:14

context: also send events to log

f307b87

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 077d49b to 8af0d5d Compare March 26, 2024 03:19

Chris-Peterson444 mentioned this pull request Mar 26, 2024

[enhancement]: Query for schema failures canonical/cloud-init#5100

Open

dbungert reviewed Mar 26, 2024

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 8af0d5d to 3c4bb85 Compare March 26, 2024 22:05

ogayot approved these changes Mar 28, 2024

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 3c4bb85 to 4c47763 Compare March 28, 2024 17:54

dbungert approved these changes Apr 2, 2024

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 4c47763 to 96e7720 Compare April 2, 2024 20:05

blackboxsw reviewed Apr 3, 2024

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 96e7720 to 67bcfa3 Compare April 3, 2024 18:20

Chris-Peterson444 added 2 commits April 3, 2024 11:22

client: CloudInitSchemaValidationError in error overlay

1b2c6be

Chris-Peterson444 force-pushed the cloud-init-schema-failures branch from 67bcfa3 to 1b2c6be Compare April 3, 2024 18:22

dbungert approved these changes Apr 3, 2024

Chris-Peterson444 merged commit d77bfbe into canonical:main Apr 3, 2024
9 of 10 checks passed
