Add a linux_systemd job adapter #743
Hi, thanks for this. I've been thinking about a localhost adapter, and this may fit the bill just fine. It does need work; even the linux host adapter does too. I need time to look this over. It may go into a different branch while we work some things out. I'm not familiar with running systemd in userspace (
No privileges needed, at least not in the default systemd setup. The only thing needed is to enable lingering (loginctl enable-linger), or else the newly created service gets terminated as soon as the ssh session that created it exits. The Arch Linux wiki has really well-written documentation about much of this at https://wiki.archlinux.org/title/systemd/User
I'm fairly certain RHEL7 does not support user units, but it looks like RHEL8 does. https://access.redhat.com/solutions/5101061
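The lingering requirement above can be scripted. This is only a minimal sketch: `linger_is_enabled` and `ensure_linger` are hypothetical helper names (not part of the adapter), and it assumes `loginctl show-user` prints a `Linger=yes` or `Linger=no` line for the user.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from the adapter.

# Read "loginctl show-user" style output on stdin (lines such as
# "Linger=yes") and succeed only if lingering is reported as enabled.
linger_is_enabled() {
    grep -qx 'Linger=yes'
}

# Enable lingering for the given user (default: the current user)
# unless loginctl already reports it as enabled.
ensure_linger() {
    user="${1:-$(id -un)}"
    if loginctl show-user "$user" --property=Linger 2>/dev/null | linger_is_enabled; then
        echo "lingering already enabled for $user"
    else
        loginctl enable-linger "$user"
    fi
}
```

Running `ensure_linger` once per user on the target host, before the first job is submitted, should be enough; whether it needs extra privileges depends on the local polkit setup.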
lib/ood_core/job/adapters/linux_systemd/templates/script_wrapper.erb.sh
I'm having trouble testing this. Trying to avoid spawning a VM. I have a rockylinux:8 container with systemd but it's getting stuck on the [...]. I may still have to spawn that VM. In any case, I'm on this and trying to replicate it.
I think this is just about ready to go. One last thing: can we change from
Renamed. Thanks!
Awesome. Now, last last thing, I'm ready to pull this in. Do you want to squash the commits yourself so I can pull that commit in, or do you want me to just squash and merge and take care of the commit message?
Please go ahead, I'm sure you can do it much, much faster than I can. Thanks!
Sorry I didn't see this before, but your commits are not tied to your GitHub user. I think it's an issue with the email + username combination? I want to be sure you get credit.
If you mean that I didn't have that particular email address associated with my GitHub account, I just added it. Does that make it look better? Thanks for thinking about it...
That's it exactly. TYSM for the addition!
This is a job adapter that uses systemd, in particular systemd-run and systemctl show. It is heavily based on the linux_host adapter, with minimal changes.
Its configuration is even simpler than linux_host, e.g.:
v2:
  metadata:
    title: My Login node
  login:
    host: login1
  job:
    cluster: login1
    bin: "/usr/bin"
    adapter: linux_systemd
    submit_host: login1
    ssh_hosts:
      - login1
    site_timeout: 2678400
    strict_host_checking: false
As for how it works: systemd-run is used to create a transient systemd user unit. This bit from linux_systemd/templates/script_wrapper.erb illustrates well what can be (and is) set:
systemd-run --user -r --no-block --unit=<%= session_name %> -p RuntimeMaxSec=<%= script_timeout %> \
  -p ExecStartPre="$systemd_service_tmp_file_pre" -p ExecStartPost="$systemd_service_tmp_file_post" \
  -p StandardOutput="file:<%= output_path %>" -p StandardError="file:<%= error_path %>" \
  -p Description="<%= job_name %>" "$systemd_service_tmp_file"
This creates a user unit named after session_name (starting with "ondemand-"). Note that systemd-run takes care of much of the required functionality: timeout, pre/post scripts (for emails), description, and so on. Programs started under such units live in their own systemd slice/cgroup and can therefore be managed, stopped, and limited.
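Because each job is an ordinary user unit, it can be inspected and stopped with plain systemctl calls. The following is a sketch rather than the adapter's own code: the helper names and the `SYSTEMCTL` override variable are hypothetical, and the argument is assumed to be a unit name without the `.service` suffix (for example a session name starting with "ondemand-").

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from the adapter.
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) for a dry run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Show the current state of one job unit.
job_status() {
    "$SYSTEMCTL" --user status "$1.service"
}

# Stop one job unit; with systemd's default kill mode this ends the
# processes in the unit's cgroup, not just the main process.
job_stop() {
    "$SYSTEMCTL" --user stop "$1.service"
}

# Print the ControlGroup property, i.e. the cgroup the unit runs in.
job_cgroup() {
    "$SYSTEMCTL" --user show --property=ControlGroup "$1.service"
}
```

Setting `SYSTEMCTL=echo` before calling a helper prints the command line that would be run, which is handy when checking unit names over ssh.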
To check on a "job" status, the output of "systemctl --user show -t service --state=running ondemand-*" is parsed.
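To illustrate the kind of parsing involved, here is a rough sketch (not the adapter's actual parser). It assumes the usual `systemctl show` layout of `Key=Value` lines with a blank line between units; `parse_units` is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: parse_units is a hypothetical helper, not adapter code.

# Read "systemctl show" output on stdin and print one line per unit:
# "<Id> <ActiveState>". Units are assumed to be separated by blank lines.
parse_units() {
    awk -F= '
        $1 == "Id"          { id = $2 }
        $1 == "ActiveState" { state = $2 }
        /^$/ { if (id != "") print id, state; id = ""; state = "" }
        END  { if (id != "") print id, state }
    '
}
```

A possible invocation is `systemctl --user show -t service --state=running 'ondemand-*' | parse_units`; the pattern is quoted so that the shell passes it to systemctl instead of expanding it against local file names.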
One caution: some of the systemd features might require a new enough systemd version, in particular StandardOutput=file:/..... This was tested to work on RHEL8 with a desktop app.
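A version guard along these lines could catch hosts that are too old. Treat it as a sketch under assumptions: the helper names are hypothetical, and the threshold of 236 is my recollection of when the `file:` form of StandardOutput= appeared, so it is exposed as an overridable variable and should be checked against the systemd release notes.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not adapter code.

# Read "systemctl --version" output on stdin and print the numeric
# version taken from the first line (e.g. "systemd 239 (239-45.el8)").
systemd_version() {
    sed -n '1s/^systemd \([0-9][0-9]*\).*/\1/p'
}

# Succeed if the given version is at least the minimum assumed to
# support StandardOutput=file:. The default of 236 is an assumption.
supports_file_output() {
    [ "$1" -ge "${MIN_SYSTEMD_FILE_OUTPUT:-236}" ]
}
```

For example, `supports_file_output "$(systemctl --version | systemd_version)"` can gate job submission on the target host.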
This code clearly requires more work. In particular, it is missing tests/specs entirely (sorry, I don't know how to write them :( ), but maybe you and others could find it useful as is. More testing, and maybe more scripts for reacting to job failures and sending emails, are needed. Per-job limits (memory/CPU) could be added fairly easily; we didn't need them for our desktop app because we rely on systemd user cgroup-based limits.
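As a rough idea of what per-job limits might look like, the sketch below builds extra `-p` arguments for systemd-run. It is not part of this PR: `limit_props` is a hypothetical helper, and MemoryMax= and CPUQuota= are standard systemd resource-control properties whose effect in user units depends on the cgroup version and on which controllers are delegated to the user manager.

```shell
#!/bin/sh
# Sketch only: limit_props is a hypothetical helper, not adapter code.

# Print extra systemd-run arguments, one per line, for the given limits.
#   $1: memory limit (e.g. "2G"), empty for none
#   $2: CPU quota (e.g. "150%"), empty for none
# Assumption: MemoryMax= needs the unified (v2) cgroup hierarchy.
limit_props() {
    mem="$1"
    cpu="$2"
    if [ -n "$mem" ]; then printf '%s\n' '-p' "MemoryMax=$mem"; fi
    if [ -n "$cpu" ]; then printf '%s\n' '-p' "CPUQuota=$cpu"; fi
}
```

The printed arguments would be appended to the systemd-run command line in the wrapper script, next to the existing `-p RuntimeMaxSec=...` option.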