RFE: Disable warning for left-over process in cases where we want them to remain (i.e. non-default Killmode=) #7864

Open
bjoe2k4 opened this issue Jan 12, 2018 · 23 comments
Labels
needs-discussion 🤔 pid1 RFE 🎁 Request for Enhancement, i.e. a feature request

Comments

@bjoe2k4

bjoe2k4 commented Jan 12, 2018

log_unit_warning(userdata,

Is there any way to avoid being overwhelmed by this warning if someone deliberately wants processes to remain? Any configuration setting?

Background:
I am using monitoring software (munin) which queries other servers (actually a lot of them) via ssh every 5 minutes (a oneshot-type service triggered by a timer). For performance reasons, i am using ssh's ControlMaster feature, which keeps the ssh connections open. Therefore i actually want these leftover ssh processes to remain. This worked very well until v235 and still works with v236; however, these warnings now spam my logs every 5 minutes.

@boucman
Contributor

boucman commented Jan 12, 2018

I'm not an ssh expert, nor completely clear on what you are trying to do, but i would create a separate service to keep the master connection open, and just have the slave connections triggered by a timer...

If need be, you could play with StopWhenUnneeded= in the master connection's unit, with proper dependencies to close the master when no timer needs it anymore...
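
A minimal sketch of that layout, for illustration only. The unit names (ssh-master.service, munin-poll.timer, munin-poll.service), the socket path and the 5-minute schedule are assumptions, not taken from the reporter's setup; the ssh options are the ones used elsewhere in this thread. The idea is that the timer pulls in the master via Wants=, so the master counts as needed while the timer is active and is stopped by StopWhenUnneeded= once the timer is stopped.

```ini
# --- ssh-master.service (hypothetical name) ---
# Long-lived service that owns the master connection.
[Unit]
Description=Persistent ssh master connection to somehost
# Stop this unit once no active unit pulls it in any more.
StopWhenUnneeded=yes

[Service]
Type=forking
# %t expands to the runtime directory (/run for system units).
ExecStart=/usr/bin/ssh -oControlMaster=yes -oControlPath=%t/ssh-master.sock -oControlPersist=yes somehost

# --- munin-poll.timer (hypothetical name) ---
[Unit]
Description=Poll somehost every 5 minutes
# While the timer is active, the master stays needed.
Wants=ssh-master.service
After=ssh-master.service

[Timer]
OnCalendar=*:0/5

[Install]
WantedBy=timers.target

# --- munin-poll.service (hypothetical name) ---
# Short-lived check that reuses the existing master socket.
[Unit]
Description=Poll somehost through the master connection

[Service]
Type=oneshot
ExecStart=/usr/bin/ssh -oControlMaster=no -oControlPath=%t/ssh-master.sock somehost dostuff
```

Because the background ssh process then lives in ssh-master.service's control group rather than in the oneshot's, the oneshot should have nothing left over when it is started again.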

@poettering
Member

Uh, not following? ssh sessions should get their own scope units anyway. It appears to be a local misconfiguration if they don't?

@bjoe2k4
Author

bjoe2k4 commented Jan 12, 2018

Okay, let me give an example:

test.service

[Unit]
Description=Test

[Service]
Type=oneshot
KillMode=process
ExecStart=/usr/bin/ssh -oControlMaster=auto -oControlPath=%h/.ssh/ssh.%%h_%%p_%%r -oControlPersist=yes somehost dostuff

Using the ssh ControlMaster options, a background process is spawned by /usr/bin/ssh as soon as this service is executed. Due to the KillMode=process setting, the ssh background process remains.

Log when starting the service two times:

Jan 12 21:47:59 localhost systemd[1]: Starting Test...
Jan 12 21:47:59 localhost systemd[1]: Started Test.
Jan 12 21:48:06 localhost systemd[1]: test.service: Found left-over process 11028 (ssh) in control group while starting unit. Ignoring.
Jan 12 21:48:06 localhost systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 12 21:48:06 localhost systemd[1]: Starting Test...
Jan 12 21:48:06 localhost systemd[1]: Started Test.

Can we at least disable this warning when KillMode= is set explicitly and we actually want processes to remain?

@boucman
Contributor

boucman commented Jan 12, 2018

Again... if you have processes remaining, why do you want the service to stop?

Don't stop the service; make it a real resident service which is just the master, and have another, timer-triggered service do the periodic check.

That way, you can easily stop the master connection when you want via systemd. I think the idea of keeping a process when the service terminates as a "normal behaviour" is a mistake...

@bjoe2k4
Author

bjoe2k4 commented Jan 12, 2018

I never said i wanted to stop the service... It's a oneshot service that goes inactive once it is done. And again, this is not my software and i don't have the time nor ability to make the necessary changes (the above example service is just a simplification down to the root of the problem).

So you are saying i should create 25 additional services (i am connecting to 25 hosts to collect their data) where each one opens a persistent ssh master process. Sounds like a real waste just to make this warning go away and make my journal usable again ...sigh

@arvidjaar
Contributor

arvidjaar commented Jan 14, 2018

@bjoe2k4

I never said i wanted to stop the service.

You said it in the unit definition. oneshot means exactly that - do something once and stop the service. According to your own explanation oneshot is wrong here. Quoting "background process is spawned ... [and] ... remains". This is a textbook case for Type=forking. I just tested it and it works fine (in my limited testing):

● ssh-master.service - test ssh master mode
   Loaded: loaded (/run/user/1000/systemd/user/ssh-master.service; static; vendor preset: enabled)
   Active: active (running) since Sun 2018-01-14 11:19:38 MSK; 56s ago
  Process: 17828 ExecStart=/usr/bin/ssh -oControlMaster=yes -oControlPath=/tmp/ssh -oControlPersist=yes 10.0.2.2 (code=exited, status=0/SUCCESS)
 Main PID: 17831 (ssh)
   CGroup: /user.slice/user-1000.slice/user@1000.service/ssh-master.service
           └─17831 ssh: /tmp/ssh [mux]                                                                  

Jan 14 11:19:37 10 systemd[2613]: Starting test ssh master mode...
Jan 14 11:19:37 10 ssh[17828]: Pseudo-terminal will not be allocated because stdin is not a terminal.
Jan 14 11:19:37 10 ssh[17828]: Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.13.0-26-generic x86_64)
Jan 14 11:19:37 10 ssh[17828]:  * Documentation:  https://help.ubuntu.com
Jan 14 11:19:37 10 ssh[17828]:  * Management:     https://landscape.canonical.com
Jan 14 11:19:37 10 ssh[17828]:  * Support:        https://ubuntu.com/advantage
Jan 14 11:19:37 10 ssh[17828]: 5 packages can be updated.
Jan 14 11:19:37 10 ssh[17828]: 0 updates are security updates.
Jan 14 11:19:38 10 systemd[2613]: Started test ssh master mode.
bor@10:~> systemctl --no-pager --user cat ssh-master.service 
# /run/user/1000/systemd/user/ssh-master.service
[Unit]
Description=test ssh master mode

[Service]
Type=forking
ExecStart=/usr/bin/ssh -oControlMaster=yes -oControlPath=/tmp/ssh -oControlPersist=yes 10.0.2.2
bor@10:~> 

@bjoe2k4
Author

bjoe2k4 commented Jan 14, 2018

I have to use Type=oneshot to start this task every 5 minutes and fetch some remote data. afaict, with Type=forking i can run it only once, which defeats the use of the ControlMaster completely.

@arvidjaar
Contributor

@bjoe2k4

So you are saying i should create 25 additional services

No, you create a single template for the master ssh connection and let your monitoring services depend on a service instantiated from this template. This way, starting a monitoring service will automatically start the master session; if the master session ever terminates, it will be started automatically the next time the monitoring service runs. And it costs you exactly one additional unit definition.
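
A rough sketch of such a template, assuming hypothetical unit names (ssh-master@.service, monitor@.service), an assumed socket path, and that the instance name is used directly as the host name. The ssh command line mirrors the Type=forking test shown earlier in this thread and has not been verified beyond that.

```ini
# --- ssh-master@.service (hypothetical name) ---
# One instance per host; %i is the instance name, used here as the host.
[Unit]
Description=ssh master connection to %i

[Service]
Type=forking
# %t expands to the runtime directory of the service manager.
ExecStart=/usr/bin/ssh -oControlMaster=yes -oControlPath=%t/ssh-master-%i.sock -oControlPersist=yes %i

# --- monitor@.service (hypothetical name) ---
# Timer-triggered check; Requires= starts the matching master if it is not running.
[Unit]
Description=Collect data from %i
Requires=ssh-master@%i.service
After=ssh-master@%i.service

[Service]
Type=oneshot
ExecStart=/usr/bin/ssh -oControlMaster=no -oControlPath=%t/ssh-master-%i.sock %i dostuff
```

With these in place, starting monitor@somehost.service would pull in ssh-master@somehost.service, so the host name only appears in the instance name of the monitoring unit.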

Can we at least disable this warning when KillMode= is set explicitly and we actually want processes to remain?

You are right in that this warning does not play nicely with ability to leave processes behind after service has stopped.

@bjoe2k4
Author

bjoe2k4 commented Jan 14, 2018

No, you create a single template for the master ssh connection and let your monitoring services depend on a service instantiated from this template. This way, starting a monitoring service will automatically start the master session; if the master session ever terminates, it will be started automatically the next time the monitoring service runs. And it costs you exactly one additional unit definition.

It doesn't matter if it is 25 separate services or instances from a single template, i'd have to keep track of the remote hosts both in the monitoring config and in the service definition as dependencies. Not very user friendly.

You are right in that this warning does not play nicely with ability to leave processes behind after service has stopped.

Thanks for the support, appreciated.

@bjoe2k4 bjoe2k4 changed the title Ability to turn off left-over process warning RFE: Disable warning for left-over process in cases where we want them to remain (i.e. non-default Killmode=) Jan 15, 2018
@bjoe2k4
Author

bjoe2k4 commented Jan 17, 2018

@poettering any opinion on this?

@poettering
Member

Hmpf. So I am sorry, but forking off bg threads from arbitrary shell scripts the user invoked is and remains highly problematic, since that means that process inherits half of the original execution context, and then reusing that in a later instance is very very questionable.

I sympathize with what you are trying to do, but from whatever angle you look at it, what you are doing is problematic and hence I think there should be a warning about this like we currently have.

To fix this properly if the ssh client wants to leave a background process around it should fork that off in an independent service (i.e. for example as a systemd --user socket activated service or so), so that it lives in a clean, independent, isolated context. Yes, I am aware openssh is unlikely the project which will embrace such a solution, but doing this mix&match of separate service instantiations (i.e. ssh instance from three invocations ago, combined with a script invocation from right now) is not a good solution, and I think logging about this (but permitting it like we do) is the least we should do.

Sorry, but I am really not convinced the usecase is convincing enough, given that the warning has a good reason and any change to it would be cosmetic at best...

Sorry!

@poettering poettering added RFE 🎁 Request for Enhancement, i.e. a feature request needs-discussion 🤔 pid1 labels Jan 22, 2018
@bjoe2k4
Author

bjoe2k4 commented Jan 22, 2018

Thanks for your detailed point of view. However, this is not only to be narrowed down to "arbitrary shell scripts" as in my case, but the same is true for the traditional sshd.service as shipped with the majority of distributions using systemd. Whenever you restart the ssh daemon with systemd v236, the journal gets hammered with 2n messages (n = number of open incoming connections).

Forking new processes for every incoming connection is the traditional way the ssh daemon has been working for decades, and it has been working well. I cannot really wrap my head around why this way suddenly should be highly problematic or questionable enough in order to force people to change this by littering their journals (which is far from being just cosmetic). Also I do not see any security implications permitting this.

As a compromise, as suggested above, please just disable this warning whenever KillMode= is set explicitly.

@arvidjaar
Contributor

@bjoe2k4

Whenever you restart the ssh daemon with systemd v236, the journal gets hammered with 2n messages (n = number of open incoming connections).

Could you please provide an example of these processes before and after the restart?

@poettering
Member

Thanks for your detailed point of view. However, this is not only to be narrowed down to "arbitrary shell scripts" as in my case, but the same is true for the traditional sshd.service as shipped with the majority of distributions using systemd.

No, that's not the case. Login sessions are generally moved to a scope unit of their own when the user logs in. This is done through pam_systemd. It makes sure that each login session is nicely separated out, and ceases to exist when the user logs out.

Which distribution are you using?

@poettering
Member

Whenever you restart the ssh daemon with systemd v236, the journal gets hammered with 2n messages (n = number of open incoming connections).

If that's the case, then it appears your distribution is not set up correctly. User session processes should not be part of sshd.service.

@bjoe2k4
Author

bjoe2k4 commented Jan 25, 2018

Sorry guys, it turned out that i had UsePAM=no set in my sshd_config.

I agree that the warning is helpful when services are leaving behind processes they are not supposed to start. But i am still of the opinion that unconditionally emitting warnings like this is a bad idea for usability, especially when it is just a suggestion imposed on the user. When you permit this setup with configuration options like KillMode= explicitly set, it should work silently, and in the end it should be the user who decides. Otherwise there is no point in having this option at all. Again, it used to work just fine as it should. Please add such a condition (which has no relevant overhead) or emit this warning only once per boot. Enough is enough.

@Alexander-Shukaev

I have another use case. The daemon (wrapped by a systemd service) has the purpose of listening to key events and executing corresponding actions. For instance, you hit some combination of keys and modifiers on your keyboard and that is configured to start some application process on the desktop. Now if that daemon is restarted, I don't want the application process that was opened to just terminate, as this would be annoying and counter-intuitive. Hence, KillMode=process is used within the service, but the warning is completely redundant and noisy in this case because this is the behavior that I asked for. How to avoid the warning in this case?

@Alexander-Shukaev

Any progress on this? The simplest example would be xfce4-panel, which when used to start new applications, would have them in its control group. If the service wrapping xfce4-panel is bounced, one gets a wall of nonsense warnings. Restarting a panel should obviously not result in all user applications being shut down. I bet the solution is as already suggested to let KillMode=process be the trigger to forfeit that warning.

@Alexander-Shukaev

#6432 looks also very relevant in the light of the current issue.

@MyLogins

MyLogins commented Apr 1, 2020

I have another example. I open an openvpn connection with NetworkManager. NetworkManager has an event dispatcher, so i can create some ssh tunnels when the connection comes up. Furthermore, i terminate the ssh tunnels before the connection goes down (via the event dispatcher and the vpn-pre-down event). In this case i need now warning. NetworkManager-dispatcher.service has KillMode=process too.

@Alexander-Shukaev

So, you want a warning that there are some "dangling" SSH connections out there if dispatcher dies earlier? Why? Either you need those connections regardless or you absolutely want them to die together with the dispatcher. For both of these cases KillMode can be set appropriately (apart from the remaining warning issue that we discuss here). Or did I miss your point?

@MyLogins

MyLogins commented Apr 1, 2020

No, i want no warning, because the dispatcher and NetworkManager won't die. NetworkManager only changes the connection state. Perhaps it's clearer with the following log:

Apr 1 13:03:15 UserPC NetworkManager[30608]: <info> [1585738995.7771] audit: op="connection-deactivate" uuid="ab20b6db-1871-45c5-8173-8c7058351fba" name="Firma" pid=17995 uid=1000 result="success"
Apr 1 13:03:15 UserPC dbus-daemon[710]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.2870' (uid=0 pid=30608 comm="/usr/sbin/NetworkManager --no-daemon ")
Apr 1 13:03:15 UserPC systemd[1]: NetworkManager-dispatcher.service: Found left-over process 17932 (sudo) in control group while starting unit. Ignoring.
Apr 1 13:03:15 UserPC systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 1 13:03:15 UserPC systemd[1]: NetworkManager-dispatcher.service: Found left-over process 17933 (sudo) in control group while starting unit. Ignoring.
Apr 1 13:03:15 UserPC systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 1 13:03:15 UserPC systemd[1]: NetworkManager-dispatcher.service: Found left-over process 17936 (ssh) in control group while starting unit. Ignoring.
Apr 1 13:03:15 UserPC systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 1 13:03:15 UserPC systemd[1]: NetworkManager-dispatcher.service: Found left-over process 17938 (ssh) in control group while starting unit. Ignoring.
Apr 1 13:03:15 UserPC systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 1 13:03:15 UserPC systemd[1]: Starting Network Manager Script Dispatcher Service...
Apr 1 13:03:15 UserPC dbus-daemon[710]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Apr 1 13:03:15 UserPC systemd[1]: Started Network Manager Script Dispatcher Service.
Apr 1 13:03:15 UserPC nm-dispatcher: req:1 'vpn-pre-down' [tun0]: new request (1 scripts)
Apr 1 13:03:15 UserPC nm-dispatcher: req:1 'vpn-pre-down' [tun0]: start running ordered scripts...
Apr 1 13:03:15 UserPC nm-dispatcher[17999]: NetworkManager: VPN connection to Firma terminated, terminate ssh tunnel
Apr 1 13:03:15 UserPC NetworkManager[30608]: <info> [1585738995.8396] device (tun0): state change: activated -> unmanaged (reason 'connection-assumed', sys-iface-state: 'external')
Apr 1 13:03:15 UserPC nm-dispatcher: req:2 'vpn-down' [tun0]: new request (2 scripts)
Apr 1 13:03:15 UserPC nm-dispatcher: req:2 'vpn-down' [tun0]: start running ordered scripts...
Apr 1 13:03:15 UserPC nm-openvpn[17889]: SIGTERM[hard,] received, process exiting
Apr 1 13:03:15 UserPC NetworkManager[30608]: <info> [1585738995.8892] vpn-connection[0x55f8c1edc6d0,ab20b6db-1871-45c5-8173-8c7058351fba,"Firma",0]: VPN plugin: state changed: stopping (5)
Apr 1 13:03:15 UserPC NetworkManager[30608]: <info> [1585738995.8893] vpn-connection[0x55f8c1edc6d0,ab20b6db-1871-45c5-8173-8c7058351fba,"Firma",0]: VPN plugin: state changed: stopped (6)
Apr 1 13:03:15 UserPC nm-dispatcher: req:3 'down' [tun0]: new request (2 scripts)
Apr 1 13:03:15 UserPC nm-dispatcher: req:3 'down' [tun0]: start running ordered scripts...

In the following line you can see my action (terminate ssh tunnel) before the openvpn connection goes down:
Apr 1 13:03:15 UserPC nm-dispatcher[17999]: NetworkManager: VPN connection to Firma terminated, terminate ssh tunnel

But before this happened, systemd showed (unwanted) warnings.

@socketpair

socketpair commented Feb 25, 2021

I use KillMode=mixed, so the warning is misleading. It makes me think that processes stay alive, but actually they do not. They do finally get killed.


7 participants