New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only invoke hotplug socket when functionality is enabled (SC-251) #952
Conversation
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. minor changes for CLI usability.
cloudinit/cmd/devel/hotplug_hook.py
Outdated
udevaction=args.udevaction, | ||
) | ||
if args.hotplug_action == 'query': | ||
hotplug_init.fetch(existing="trust") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we run this before the datasource is detected, we trace here:
cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: (DataSourceNone)
We might want to catch that case and at least report "unknown" or "no-datasource" etc.
hotplug_init.fetch(existing="trust") | |
try: | |
hotplug_init.fetch(existing="trust") | |
except cloudinit.sources.DataSourceNotFoundException: | |
print("Unable to determine hotplug state. No datasource detected") | |
sys.exit(1) |
Also is it worth just baking this hotplug_init.fetch(existing="trust") into is_enabled() function because I see this same potential problem in handle_hotplug too?
I realize generally this cmdline shouldn't normally get run until after the datasource detection has occurred, just a race there until that discovery happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our udev hook should disable any hotplug commands until cloud-init has finished its initial run. How did you get this exception to happen? I don't necessarily disagree with your suggestion, just trying to understand.
Edit: Nevermind...hadn't fully swapped this code back into my head 😄 . I committed something very similar to your suggestion.
cloudinit/cmd/devel/hotplug_hook.py
Outdated
@@ -133,6 +149,14 @@ def device_detected(self) -> bool: | |||
} | |||
|
|||
|
|||
def is_enabled(hotplug_init, subsystem): | |||
scope = SUBSYSTEM_PROPERTES_MAP[subsystem][1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run try/except hotplug_init.fetch(existing="trust") here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want it here. I applied the try/except to the other place you suggested, but I don't think we always want to print and exit 1 when we get that exception. E.g., on line 174, if we have a problem there, I want it written the logs. I'm ok with the exception just falling through at that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on avoiding the this suggestion iside is_enabled. I think there is another gap that we are missing below though with cloud-init devel handle
calls.
Alter's hotplug hook to have a query mechanism checking if the functionality is enabled. This allows us to avoid using the hotplug socket and service when hotplug is disabled.
Co-authored-by: Chad Smith <chad.smith@canonical.com>
9afb86b
to
7623e9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the minor nit about the subparser title I think there are two other things:
UeventHandler.update_metadata raises a RuntimeError when a datasource is not configured to with a matching self.supported_update_event type for the subsystem and or hotplug. This configuration is static, so retrying on this RuntimeError doesn't make sense.
I'd prefer we just try handler.update_metadata and if fail we break the retry look in handle_hotplug.
Example, trying to provide the following in a lxc container:/etc/cloud/cloud.cfg
updates:
network:
when: ['boot', 'hotplug']
The result is blocking repeated RuntimeErrors logged with
RuntimeError: Datasource DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net] not updated for event hotplug
I think instead we probably want to break the retry loop with a more specific failure mode if someone tries to enable hotplug on an unsupported datasource.
Here's a diff to ponder that fixes up a couple things I've suggested
diff --git a/cloudinit/cmd/devel/hotplug_hook.py b/cloudinit/cmd/devel/hotplug_hook.py
index 5c85c623a..3e1ffeecc 100644
--- a/cloudinit/cmd/devel/hotplug_hook.py
+++ b/cloudinit/cmd/devel/hotplug_hook.py
@@ -38,7 +38,9 @@ def get_parser(parser=None):
choices=['net']
)
- subparsers = parser.add_subparsers(dest='hotplug_action')
+ subparsers = parser.add_subparsers(
+ title="Hotplug Action", dest='hotplug_action'
+ )
subparsers.required = True
subparsers.add_parser(
@@ -100,14 +102,8 @@ class UeventHandler(abc.ABC):
return self.success_fn()
def update_metadata(self):
- result = self.datasource.update_metadata_if_supported([
return self.success_fn()
def update_metadata(self):
- result = self.datasource.update_metadata_if_supported([
+ return self.datasource.update_metadata_if_supported([
EventType.HOTPLUG])
- if not result:
- raise RuntimeError(
- 'Datasource %s not updated for '
- 'event %s' % (self.datasource, EventType.HOTPLUG)
- )
- return result
class NetHandler(UeventHandler):
@@ -194,7 +190,12 @@ def handle_hotplug(
)
try:
LOG.debug('Refreshing metadata')
- event_handler.update_metadata()
+ if not event_handler.update_metadata():
+ LOG.debug(
+ "Datasource%s does not support hotplug events for"
+ " subsystem %s", event_handler.datasource.dsname, subsystem
+ )
+ break # Don't retry as supported_update_events are hard-coded
LOG.debug('Detecting device in updated metadata')
event_handler.detect_hotplugged_device()
LOG.debug('Applying config change')
cloudinit/cmd/devel/hotplug_hook.py
Outdated
@@ -133,6 +149,14 @@ def device_detected(self) -> bool: | |||
} | |||
|
|||
|
|||
def is_enabled(hotplug_init, subsystem): | |||
scope = SUBSYSTEM_PROPERTES_MAP[subsystem][1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on avoiding the this suggestion iside is_enabled. I think there is another gap that we are missing below though with cloud-init devel handle
calls.
cloudinit/cmd/devel/hotplug_hook.py
Outdated
choices=['net'] | ||
) | ||
|
||
subparsers = parser.add_subparsers(dest='hotplug_action') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve title so it matches error message if absent param on cmdline
subparsers = parser.add_subparsers(dest='hotplug_action') | |
subparsers = parser.add_subparsers(title="Hotplug Action", dest='hotplug_action') |
This changes --help from positional arguments:
to Hotplug Action:
We could just make title="Command", dest=command too as lxc & ua-tools do
@blackboxsw , thanks, I definitely missed a condition there.
I updated the code so that before we attempt to do any metadata update, we first fetch the datasource and ensure it is supported and enabled. I didn't do that for the query previously. I still haven't tested yet (so sorry if there's major dumb errors), but let me know if this idea works for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent @TheRealFalcon . tested on lxc and openstack, cli docs are good and behavior is expected
2021-08-13 20:11:27,422 - handlers.py[DEBUG]: finish: hotplug-hook: SUCCESS: Handle reconfiguration on hotplug events
:)
Proposed Commit Message
Additional Context
n/a
Test Steps
see integration test
Checklist: