New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
identity: fail on missing section sooner #1965
identity: fail on missing section sooner #1965
Conversation
This doesn't quite work for desktop because |
I've only tested in dry-run mode but I'm not seeing an obvious difference between the previous and the new behavior. What is it that we are trying to achieve? Exit the subiquity server process before partitioning starts? |
Yes. On server, if you omit both the identity section and the user-data section then an error will be emitted when the identity controller loads the autoinstall data, but the installation with proceed until the postinstall stage where it hangs. We want to make sure this error halts the installer immediately. |
To elaborate on what Chris said, the failure mode here is really best seen to be believed. So you supply a trivial cloud-config like so to live-server:
An exception is seen in the logs complaining about the lack of THEN, at this point, the install hangs, as the post-install models are not configured, and there isn't even a visual indication on why this is other than that Exception shown all the way before partitioning even started. My goal here was to move that check earlier, since the time between the problem report and when the install stops is measured in how long it takes for partitioning + rsync to happen. |
Yeah, sorry for not being more explicit. So the way I see it this could have regressed the following conditions:
I was able to confirm all of these cases pass in VM tests. I did find a separate bug in the reset-partition-only handling but I'll file a separate bug about that. Edit: reset-partition bug LP: #2061042 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that was a bit more involved than I would have guessed. Nice work!
Conceptually this is fine, my major feedback is to just relocate some parts to a place where we might remove the duplication, or at least make it more greppable. I would be quite surprised to find some of this in identity.py
.
if ( | ||
source_config is not None | ||
and (id := source_config.get("id")) is not None | ||
and "server" not in id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wary of checking the string set on the id, it's semi free form.
What's the motivation for this section? It looks like we can drop it and get the same outcome? If I'm missing something let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in subiquity/server/controllers/source.py
in load_autoinstall_data
we get the source id with a .get("id")
on the autoinstall data and do an exact match on the IDs in the catalog via get_matching_source
in subiquity/models/source.py
so I felt that it wasn't that bad of a comparison to just check that server
wasn't in the name.
In the autoinstall case if the id the user provides doesn't match any sources in the catalog then we get a KeyError
anyways so I wasn't super concerned with weird outcomes if the user provides something unrealistic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*but this section matches the source controller/model logic of letting autoinstall inform us of the source id
Right? And thanks! Another thing I had in mind about keeping the logic in the identity controller: we probably want to remove this logic at some point in the future and put it back into the load_autoinstall function when we can just throw from there and expect an immediate halt to the overall install. Edit: re-reading your comment now I see what you're saying. I'll move the bits around tomorrow! |
d29991b
to
1f11807
Compare
Moved to separate utility functions owned by the respective controllers, as requested. There is still the open question about the source controller logic. |
|
||
# Check the sources available the same way the source controller does | ||
# but return a bool for if the variant is desktop | ||
path = "/cdrom/casper/install-sources.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move the existing magic path and reference that common definition, please, or maybe this is obsolete given the previous comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah obsolete but I still think it's worth moving the magic var definition somewhere global
source_config = self.app.autoinstall_config.get(self.autoinstall_key) | ||
if ( | ||
source_config is not None | ||
and (id := source_config.get("id")) is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm vetoing this approach, as it relies on behavior in a string that is theoretically an arbitrary value.
What I want instead is for the SourceController.start
logic to move earlier in the sequence - out of start
entirely actually - so that we can answer this question before SourceController.start()
has been called. Then we should be able to just get the info needed. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I originally looked at this bug I think I had reservations about moving the logic in SourceController.start
to somewhere sooner, but looking at it again now I don't see why we can't. We can probably move that logic to SourceController.load_autoinstall_data
. I'll give it a shot.
Moves the logic for checking if user-data or identity section is provided (on server) in the autoinstall config to the load_autoinstall_data function. Without this change, the exception thrown in apply_autoinstall_config won't halt the installer until the postinstall steps (LP: 2060547).
1f11807
to
31b6d3e
Compare
31b6d3e
to
90d4a0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have suggested a few tweaks but this seems fine. What further work do you imagine for this draft PR? What sort of VM testing has been done at this point?
The identity controller shouldn't own the logic for determining if the filesystem controller will only install the reset partition. Creates a utility function that can be called by the identity controller to determine if only installing the reset partition.
The current check for Desktop in the identity controller doesn't work because the variant isn't set until SourceController.start is called. Move this logic to earlier in the sequence so that the variant information can be referenced by later-loaded controllers.
90d4a0e
to
68ae5a4
Compare
Thanks for the suggestions! Currently I am running the following VM tests before I mark this ready:
|
The above tests all pass |
1 similar comment
The above tests all pass |
Moves the logic for checking if user-data or identity section is provided (on server) in the autoinstall config to the load_autoinstall_data function. Without this change, the exception thrown in apply_autoinstall_config won't halt the installer until
the postinstall steps (LP: 2060547).
The change is ready but I am keeping this as a draft PR until I can test this on Desktop.