New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix inconsistent device state transition #13002
base: main
Are you sure you want to change the base?
Fix inconsistent device state transition #13002
Conversation
This comment has been minimized.
This comment has been minimized.
Please hold on, I gave it more testing and there is still an issue, investigating... |
74eea20
to
f6b5399
Compare
@@ -176,6 +176,7 @@ static int device_coldplug(Unit *u) { | |||
} | |||
|
|||
static void device_catchup(Unit *u) { | |||
Manager *m = u->manager; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, can you elaborate why you need to know at all whether the udev db is trusted or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's supposed to be describe in the commit message. Which part of it should be improved ?
f6b5399
to
319ac3e
Compare
@poettering new version force pushed, hopefully I haven't missed anything this time. |
BTW this can be tested by reloading PID1 right before udevd is executed:
Without this patchset:
With this patchset:
|
Please force-push this to re-trigger CIs. |
319ac3e
to
d4cae70
Compare
No functional changes.
d->enumerated_found was used to store the new 'found' value recalculated during the enumeration phase while d->found was keeping the previous value. However there's no such needs as the previous value is already kept by d->deserialized_found, so let's use d->found instead to store the new one.
Rather than assuming that the device enumeration can be trusted based on the manager state (reloading vs reexecuting), we now simply look at the udev DB itself - ie if no single devices couldn't have been enumerated then we assume that the DB is not yet initialized [1] and thereby don't trust the enumeration phase. In this case we restore the serialized state and rely on events retriggering to update the view on device PID1 had before it reexecuted. [1] Maybe there's a better way to detect this state. Fixes: systemd#12953
No functional changes.
319ac3e
to
5b4341b
Compare
@yuwata rebased and force pushed. |
* because reload of PID1 can happen at anytime especially early during the boot | ||
* process. So in order to figure this out, we simply assume that if the DB is | ||
* fully populated, we should at least enumerate one device otherwise we consider | ||
* that the device enumeration hasn't happened yet hence we shouldn't trust it */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the assumption true even if there exist a rule which has db_persistent
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more question. What happens when udev is processing devices? That is, some devices are initialized, but some are not.
What happens if |
@yuwata yeah, you're right it's still racy :( And the whole sequence "cleanup DB/retrigger events" is completely asynchronous... I can't see any way to know when the DB is full repopulated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late review.
FOREACH_DEVICE(e, dev) { | ||
const char *sysfs; | ||
|
||
m->honor_device_enumeration = true; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, i find this a tiny bit sloppy. I wonder if we can find a better approach here.
maybe simply make systemd-udev-trigger.service a magic service: we trust udev events exactly as long as it is up and running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I don't remember all the details so I'm not sure to get your idea...
@poettering, I think the whole mess is due to the fact that udevd needs to be stopped while switching root. This implies its DB can be outdated and I guess that's the reason why the DB needs to be cleared (to not miss any "REMOVE" uevents) and we have to retrigger all events (to not miss any "ADD" uevents) before starting udevd again. Another thing, also part of the problem, is that we don't know when the DB is fully updated after systemd-udev-trigger.service has been fired up and I don't think that the fact that either It would be much simpler if udevd could survive from the switch root somehow. This way no event would be lost and all events that PID1 is monitoring would be queued. Hence reexecuting (and regardless if it was after switching root) would work like reloading, PID1 would just need to read the socket and process events (if any) as usual. In fact this might make the device enumeration step no more necessary... (BTW why do we need it when PID1 is simply reloading or reexecuting without Hmm or maybe rather than trying to make the whole udevd process survive from the switch root, we could try to leave the udev's netlink socket opened somehow and reuse it after udevd is restarted later. This way, both "ADD" and "REMOVE" device uevents could still be queued while udevd is temporary unavailable ? We have already an similar mechanism to temporary store fds in PID1 but it only works for service restarting currently. Maybe we could extend it a bit and make it possible to store fds "permanently" ? |
No description provided.