Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logind-user: switch tracking logic to user-runtime-dir@.service #30910

Merged
merged 5 commits into from Feb 15, 2024

Conversation

YHNdnzj
Copy link
Member

@YHNdnzj YHNdnzj commented Jan 12, 2024

Also fixes #31287

@github-actions github-actions bot added logind please-review PR is ready for (re-)review by a maintainer labels Jan 12, 2024
@YHNdnzj YHNdnzj marked this pull request as draft January 12, 2024 20:09
@YHNdnzj YHNdnzj added dont-merge 💣 and removed please-review PR is ready for (re-)review by a maintainer labels Jan 12, 2024
@YHNdnzj YHNdnzj added this to the v256 milestone Jan 13, 2024
@YHNdnzj YHNdnzj changed the title logind-session: use Requires= for user{,-runtime-dir}@.service logind: decouple user tracking logic from service_job Jan 13, 2024
YHNdnzj added a commit to YHNdnzj/systemd that referenced this pull request Jan 13, 2024
Consider the following situation:

1. A 'background-light' session is created (i.e. no user service manager
   is needed)
2. The new session scope unit pulls in user-runtime-dir@.service through
   Want= (weak dep)
3. The session exits. A stop job is enqueued for user-runtime-dir@.service
   due to StopWhenUnneeded=yes
4. At the same time, another session which requires user manager is started.
   However, the session scope unit has JobMode=fail, and the start job
   for user-runtime-dir@.service is deleted as it conflicts with the
   stop job.

As we do require these services, let's use Requires=.

Normally, logind should enqueue a start job for user@.service explicitly
through user_start(). Yet this cannot happen if the user is already started,
and the existing session requires no user manager. This will be fixed later
in systemd#30910.
@YHNdnzj YHNdnzj changed the title logind: decouple user tracking logic from service_job logind: don't fully rely on service_job for user tracking Jan 13, 2024
@github-actions github-actions bot added the units label Jan 13, 2024
@YHNdnzj YHNdnzj changed the title logind: don't fully rely on service_job for user tracking logind-user: switch tracking logic to user-runtime-dir@.service Jan 13, 2024
@YHNdnzj YHNdnzj marked this pull request as ready for review January 13, 2024 21:20
@github-actions github-actions bot added the please-review PR is ready for (re-)review by a maintainer label Jan 13, 2024
@YHNdnzj YHNdnzj force-pushed the logind-followup branch 2 times, most recently from 6ffb264 to 03a029a Compare January 14, 2024 08:24
@YHNdnzj YHNdnzj force-pushed the logind-followup branch 2 times, most recently from 6979992 to 6291942 Compare January 16, 2024 07:04
@YHNdnzj
Copy link
Member Author

YHNdnzj commented Jan 17, 2024

@poettering Any chance you can bless this? TEST-35-LOGIN is currently racy (e.g. https://jenkins-systemd.apps.ocp.cloud.ci.centos.org/job/upstream-centos8/9519/)

@YHNdnzj
Copy link
Member Author

YHNdnzj commented Jan 17, 2024

@mrc0mmand Could you test this PR and see if the race is indeed gone?

@poettering
Copy link
Member

so, hmmm unless i am misunderstand this this would have the effect that sessions would not wait for user@.service to fully become up before they are allowed to proceed? but that's a problem.

we should make sure that regular session login is delayed until user@.service is fully up, only those of type "manager" should be able to come up before that.

This pretty much requires the user@.service service job to be tracked in all cases.

I have the suspicion we should have two service fields: one for tracking user@.service jobs and one for tracking user-runtime-dir@.service. And then regular sessions are allowed to complete statr when both jobs are finished, and "manager" sessions are allowed once the latter session type finishes.

or am i misreading this PR?

Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all reasonable. All that login stuff is always hard to get right, but at the conceptual level, this sounds like the right approach.

(!SESSION_CLASS_WANTS_SERVICE_MANAGER(s->class) || !s->user->service_job);
return s->scope_job ||
s->user->runtime_dir_job ||
(SESSION_CLASS_WANTS_SERVICE_MANAGER(s->class) && s->user->service_manager_job);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need to check the class here? If the job is there, we might as well wait for it to finish. (It doesn't matter greatly, but the condition "are there any jobs" is easier than "are the any jobs that meeting the following criteria".)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then manager sessions would deadlock, since they'll start waiting for themselves to finish startup.

src/login/logind-dbus.c Outdated Show resolved Hide resolved
src/core/manager.c Outdated Show resolved Hide resolved
src/login/logind-user.h Outdated Show resolved Hide resolved
@poettering
Copy link
Member

lgtm, thanks for working on this.

and sorry for taking so long on the review.

@YHNdnzj
Copy link
Member Author

YHNdnzj commented Feb 14, 2024

Rebased onto #30226 and added one more commit. @poettering and @mrc0mmand, please take a final look.

Copy link
Member

@mrc0mmand mrc0mmand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been hammering this PR with TEST-35 for the past 2+ hours and it hasn't failed once, so LGTM, thank you!

@YHNdnzj YHNdnzj added good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed and removed good-to-merge/with-minor-suggestions labels Feb 15, 2024
…temd#30226)

We don't usually say ", refusing" in bus error messages.
Also, make use of unref_and_replace_full.
…ir@.service

Before systemd#30884, the user state is tied to user@.service (user service
manager). However, systemd#30884 introduced sessions that need no manager,
and we can no longer rely on that.

Consider the following situation:

1. A 'background-light' session '1' is created (i.e. no user service manager
   is needed)
2. Session '1' scope unit pulls in user-runtime-dir@.service
3. Session '1' exits. A stop job is enqueued for user-runtime-dir@.service
   due to StopWhenUnneeded=yes
4. At the same time, another session '2' which requires user manager is started.
   However, session scope units have JobMode=fail, therefore the start job
   for user-runtime-dir@.service that was pulled in by session '2' scope job
   is deleted as it conflicts with the stop job.

We want session scope units to continue using JobMode=fail, but we still need
the dependencies to be started correctly, i.e. explicitly requested by logind
beforehand. Therefore, let's stop using StopWhenUnneeded=yes for
user-runtime-dir@.service, and track users' `started` and `stopping` state
based on that when user@.service is not needed. Then, for every invocation
of user_start(), we'll recheck if we need the service manager and start it
if so.

Also, the dependency type on user-runtime-dir@.service from user@.service
is upgraded to `BindsTo=`, in order to ensure that when logind stops the
former, the latter is stopped as well.
Since we do require these basic user services, let's make
the dependency stronger. Note that logind should enqueue
start jobs for these already in user_start(), so mostly
just paranoia.
@poettering
Copy link
Member

lgtm, thanks!

@mrc0mmand
Copy link
Member

The Testing Farm fail is unrelated (it looks like the intermittent QEMU segfaults were replaced by intermittent hangs, lovely).

@bluca bluca merged commit 55d37de into systemd:main Feb 15, 2024
46 of 49 checks passed
@github-actions github-actions bot removed the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Feb 15, 2024
@YHNdnzj YHNdnzj deleted the logind-followup branch February 15, 2024 14:38
ayhamthemayhem pushed a commit to neighbourhoodie/nh-systemd that referenced this pull request Mar 25, 2024
Currently, user_start() doesn't return any error,
but let's not eat up the return value.
Preparation for systemd#30910, after which user_start()
does return error.
@mrc0mmand mrc0mmand added login and removed logind labels Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Recent logind changes break GNOME
5 participants