238 - as root, when/why is systemd-tty-ask-password-agent spawned? #9507

Closed
SjonHortensius opened this issue Jul 5, 2018 · 10 comments

@SjonHortensius
Contributor

commented Jul 5, 2018

systemd version the issue has been seen with

238.133

Used distribution

Archlinux

Expected behaviour you didn't see

systemctl restart service should work

Unexpected behaviour you saw

systemctl restart service hangs, ps shows it is waiting for /usr/bin/systemd-tty-ask-password-agent --watch; but I'm root on that machine and there shouldn't be any need to request extra permissions
This has happened to me intermittently over the years, but recently it seems to happen more often. How should I work around this? Can I debug this somehow? Sometimes it completes after a few minutes (while I'm waiting, doing nothing), sometimes it times out.

@keszybz

Member

commented Jul 6, 2018

systemd-tty-ask-password-agent is probably waiting for a password for a disk or similar. When you run systemctl, the open password requests are handled. You can check if this is the case with systemd-tty-ask-password-agent --list.

systemctl will not start the agent if it is not running from a tty, or if --no-ask-password is given.
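
The two checks mentioned here can be wrapped in small helpers. This is only a sketch: the function names are made up for illustration and example.service is a placeholder unit name; only the --list and --no-ask-password options come from the comment above.

```shell
#!/bin/sh
# Sketch only: function names and "example.service" are hypothetical.

# Show password requests that are currently pending.
# No output suggests that no request is open.
list_password_requests() {
    systemd-tty-ask-password-agent --list
}

# Restart a unit without letting systemctl spawn the password agent.
restart_without_agent() {
    systemctl --no-ask-password restart "$1"
}

# Usage:
#   list_password_requests
#   restart_without_agent example.service
```

If the restart still hangs with --no-ask-password, the agent is unlikely to be what systemctl is waiting for.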

I understand that this is surprising, but designing the right behaviour is complicated:

  • it is good if systemd immediately asks for a password which is needed to complete an operation (e.g. you say 'sudo systemctl start /encrypted', and immediately get a prompt)
  • it is good if the system was waiting for a password, and it catches the attention of an admin, even accidentally, and is then able to proceed
  • it is bad if an admin action unexpectedly blocks on an unrelated password query

But because of the asynchronous nature of systemd jobs, distinguishing the first and third cases is hard. It is not possible to say that a password is unrelated to the current job, even if the query was already open when the current job was started. Something else might have requested some other transaction that also involves this step, and not answering the query might block the latest job too.
@poettering

Member

commented Jul 7, 2018

systemctl restart service hangs, ps shows it is waiting for /usr/bin/systemd-tty-ask-password-agent --watch; but I'm root on that machine and there shouldn't be any need to request extra permissions

How is "ps" supposed to show that?

We do start the password agent in case any of the services invoked need a password, such as LUKS or some SSL certificate or so. It's terminated when systemctl goes away again. The agent is forked off, but we never really wait for it, we continue what we do, and maybe the agent shows a password prompt, or maybe it doesn't, but systemctl doesn't really care or wait for it...

"ps" is not a suitable tool to figure out what something waits for. Use "systemctl list-jobs" and system logs to see what systemd is doing and waiting for
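
A minimal sketch of that advice, assuming a placeholder unit name (example.service) and helper names invented for illustration:

```shell
#!/bin/sh
# Sketch only: function names and "example.service" are hypothetical.

# Jobs that systemd currently has queued or running.
show_jobs() {
    systemctl list-jobs
}

# Log lines for one unit from the current boot, without a pager.
unit_logs() {
    journalctl -b -u "$1" --no-pager
}

# Usage (in a second terminal while systemctl appears to hang):
#   show_jobs
#   unit_logs example.service
```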

@poettering

Member

commented Jul 7, 2018

Anyway, long story short: if "systemctl restart foobar" fails to work, then this is almost certainly not something the password agent is involved in.

@keszybz

Member

commented Jul 7, 2018

long story short: if "systemctl restart foobar" fails to work, then this is almost certainly not something the password agent is involved in.

Well, ....., hmm, ....., nope. It is.

[fedora@fuefi ~]$ echo WAIT && sudo systemctl restart systemd-logind && echo DONE
WAIT
Please enter passphrase for disk luks-44707e18-b35e-40e1-b8e2-908db3071e99 on /encrypted! 
DONE

The "DONE" appears only after I press enter at the prompt. So yeah, systemctl is blocked.

@poettering

Member

commented Jul 8, 2018

@keszybz no. systemctl doesn't care much about the password agent and what it does beyond starting it. systemctl does care about systemd jobs finishing, though, and there might be some job that needs a password to complete. So there might be some transitive blocking, but certainly no immediate blocking...

@keszybz

Member

commented Jul 8, 2018

Ah, OK. systemd-logind/start depends on basic.target, which in turn depends on any mounts and such. So I guess it's a bit surprising, but OK. Maybe we should add some text to systemctl(1)?
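
One way to inspect such a dependency chain is sketched below. The helper name is invented for illustration; `list-dependencies --after` recursively follows the After= ordering of the given unit, so a pending mount that needs a passphrase should show up somewhere in that tree.

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical.

# Recursively list the units that the given unit is ordered after,
# i.e. what has to be started before its start job can run.
show_ordering_deps() {
    systemctl list-dependencies --after "$1"
}

# Usage:
#   show_ordering_deps systemd-logind.service
```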

@SjonHortensius

Contributor Author

commented Jul 9, 2018

Thanks for your extensive replies, I'm trying to understand what's happening.

systemd-tty-ask-password-agent is probably waiting for a password for a disk or similar. When you run systemctl, the open password requests are handled. You can check if this is the case with systemd-tty-ask-password-agent --list.

I actually have a machine where I can reliably reproduce this issue. However, executing --list returns nothing. This is my service definition (a default elasticsearch service):

[Service]
Type=forking
RuntimeDirectory=elasticsearch
PIDFile=/run/elasticsearch/%I.pid

Environment=JAVA_HOME=/usr/lib/jvm/default-runtime
Environment=CONF_DIR=/etc/elasticsearch/%I
EnvironmentFile=-/etc/default/elasticsearch

WorkingDirectory=/usr/share/elasticsearch

User=elasticsearch
Group=elasticsearch

ExecStart=/usr/bin/elasticsearch -d \
            -p /run/elasticsearch/%I.pid \
            -E path.conf=${CONF_DIR}

LimitNOFILE=65536
LimitMEMLOCK=infinity

Restart=on-failure
SendSIGKILL=no
TimeoutStopSec=0
SuccessExitStatus=143

Running systemctl restart on this process will hang for a few minutes (while I see systemd-tty-ask-password-agent getting started, but --list in another terminal returns nothing) until it reaches a timeout:

Job for elasticsearch@xxx.service failed because a timeout was exceeded.
See "systemctl status elasticsearch@xxx.service" and "journalctl -xe" for details.

"ps" is not a suitable tool to figure out what something waits for. Use "systemctl list-jobs" and system logs to see what systemd is doing and waiting for

This only lists my elasticsearch@ job. Even after it times out btw

@keszybz

Member

commented Jul 9, 2018

Ah, OK, so my guess was completely wrong and @poettering was right. It's just waiting for elasticsearch to double-fork, and it's not happening correctly. Sorry, we can't help you with elasticsearch.

@SjonHortensius

Contributor Author

commented Jul 9, 2018

Okay, so systemctl retries starting the job when it fails, resulting in a 'hanging' systemctl for about 75 seconds. But why does systemd-tty-ask-password-agent get spawned every time this happens?

Here's a smaller testcase btw:

[Unit]
Description=systemd-issue-9507

[Service]
Type=forking
PIDFile=/run/test.pid

ExecStart=/bin/sh -c '/bin/systemd-issue-9507 &'

LimitNOFILE=65536
LimitMEMLOCK=infinity

Restart=on-failure
SendSIGKILL=no
TimeoutStopSec=0
SuccessExitStatus=143

[Install]
WantedBy=multi-user.target

where /bin/systemd-issue-9507 contains

#!/bin/sh
set -ue

echo $$ >/run/test.pid
sleep 0.5
exit 143

Is systemd doing something smart by interpreting the 143 as a reason to spawn systemd-tty-ask-password-agent?
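
For context on the number itself: 143 is conventionally 128 + 15, the status a shell reports for a process terminated by SIGTERM, which is presumably why the elasticsearch unit lists it in SuccessExitStatus. A small sketch, assuming a POSIX-style sh such as bash or dash:

```shell
#!/bin/sh
# The child shell terminates itself with SIGTERM (signal 15).
sh -c 'kill -TERM $$'
status=$?

# bash and dash report 128 + the signal number for a signal-terminated child.
echo "exit status: $status"
```

The test script above reaches the same number with a plain `exit 143`, without any signal being involved.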

@SjonHortensius

Contributor Author

commented Jul 11, 2018

Okay, so to summarize: it seems systemd-tty-ask-password-agent --watch always runs alongside systemctl whenever it is invoked, so its presence is not an indication that systemctl is waiting for anything.
The reason systemctl seemed to hang was simply that it was waiting for the unit to start, and since that failed, it was retrying it a few more times with increasing timeouts.

Maybe systemctl could be a bit more verbose when invoked, e.g. it could output that starting failed and that it will try again, like it currently logs in the journal.
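
Until then, a workaround sketch: queue the restart without waiting for the job, then read the unit's recent journal entries. The function name and example.service are placeholders for illustration, not anything systemd provides.

```shell
#!/bin/sh
# Sketch only: function name and "example.service" are hypothetical.

restart_and_show_log() {
    unit=$1
    # Queue the restart and return immediately instead of waiting for the job.
    systemctl --no-block restart "$unit"
    # Print the most recent journal entries for the unit, without a pager.
    journalctl -u "$unit" -n 20 --no-pager
}

# Usage:
#   restart_and_show_log example.service
```

Because --no-block returns before the job finishes, the journal output may need to be checked again a little later to see failed start attempts.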

anyway, not a bug
