Merged
Changes from all commits
52 commits
28b161b
add cloud_nodes option/templating
sjpb Sep 8, 2021
a770503
Merge branch 'feature/flexi-config' into feature/autoscale
sjpb Sep 8, 2021
f787c51
add openhpc_slurm_dirs variable
sjpb Sep 23, 2021
1e87ade
Merge branch 'master' into feature/autoscale
sjpb Sep 23, 2021
b3dba86
Fix errors for login-only nodes not matching compute node specs (issu…
sjpb Sep 23, 2021
eb85c8b
Merge branch 'fix/116' into feature/autoscale
sjpb Sep 23, 2021
5548366
Merge commit 'cd381cf' into feature/autoscale
sjpb Sep 23, 2021
2f21c0a
use hostlist expressions for shorter config
sjpb Sep 23, 2021
b7b14b7
rename group_hosts filter plugin -> hostlist_expression for clarity
sjpb Sep 24, 2021
4f8fa0e
WIP: support cloud-only partitions (see TODOs and fix docs)
sjpb Sep 24, 2021
11703b9
fix State for cloud_nodes
sjpb Sep 24, 2021
2ced32b
WIP/FAIL add partition info back in
sjpb Sep 24, 2021
4e99892
fix cloud node templating
sjpb Sep 24, 2021
7cd9706
Add openhpc_suspend_exc_nodes
sjpb Sep 24, 2021
3cd5a8e
fix galaxy meta info for CI tests
sjpb Sep 23, 2021
92b4864
Revert "add openhpc_slurm_dirs variable"
sjpb Sep 28, 2021
399bd61
support specifying cpus=
sjpb Sep 28, 2021
a868d94
add support for features
sjpb Sep 28, 2021
2816433
deprecate openhpc_extra_config in favour of openhpc_config
sjpb Sep 28, 2021
1b24122
use cloud_nodes directly in forming nodenames
sjpb Sep 29, 2021
840fff3
add cloud_features support
sjpb Sep 29, 2021
1f7bcf4
fix default value fall-through - removes cpus
sjpb Sep 29, 2021
97a25b4
Merge branch 'master' into feature/autoscale
sjpb Sep 29, 2021
1a1fc8e
remove cloud_features and cope with empty features
sjpb Sep 30, 2021
07e33f5
raise error (again) if cpu/memory info not specified
sjpb Sep 30, 2021
534ae78
remove duplicate CLOUD definition
sjpb Sep 30, 2021
c23a735
support lists as values for openhpc_config
sjpb Sep 30, 2021
0d13fe2
revert partition defaults to previous documentation
sjpb Oct 5, 2021
d42c71d
fix #122 re. configless options
sjpb Oct 5, 2021
6595cce
remove cloud_nodes and features and replace with extra_nodes
sjpb Oct 5, 2021
242faaf
remove use of DEFAULT for node config to avoid configs 'falling through'
sjpb Oct 6, 2021
da5a043
add extra_nodes to README and reorganise
sjpb Oct 6, 2021
726f1c2
remove openhpc_suspend_exc_nodes - can be done through openhpc_config
sjpb Oct 7, 2021
f8fe1d2
minimise diff to master
sjpb Oct 7, 2021
8481261
Revert "minimise diff to master"
sjpb Oct 7, 2021
cd2435a
fix slurmctld location when configless
sjpb Oct 7, 2021
3b387e6
revert README to master
sjpb Oct 7, 2021
dfa9045
document fact ram_multipler is part of partition/group definition
sjpb Oct 7, 2021
a9f395a
clarify group/partition definition
sjpb Oct 7, 2021
fae7ae0
move openhpc_ram_multiplier to correct place
sjpb Oct 12, 2021
e672b65
silence risky-file-permissions lint error
sjpb Oct 12, 2021
baef09f
add molecule support for DOCKER_MTU env var
sjpb Oct 13, 2021
2f938f4
reallow empty inventory groups as per docs
sjpb Oct 13, 2021
cc262b2
allow NOT setting DOCKER_MTU for test6
sjpb Oct 13, 2021
a803905
support DOCKER_MTU for test5
sjpb Oct 13, 2021
b6416fd
fix docker networking for molecule test5
sjpb Oct 14, 2021
bca9709
add tests 13 and 14 to github CI
sjpb Oct 14, 2021
97268e6
fix test13 verification
sjpb Oct 14, 2021
1617c96
make openhpc_slurm_configless depend on openhpc_config
sjpb Nov 11, 2021
717752c
simplify enable_configless logic
sjpb Nov 11, 2021
505661e
clarify assert logic
sjpb Jan 4, 2022
9b04782
clarify why invalid IPs are used
sjpb Jan 4, 2022
6 changes: 6 additions & 0 deletions .github/workflows/ci.yml
@@ -43,6 +43,8 @@ jobs:
- test10
- test11
- test12
- test13
- test14

exclude:
- image: 'centos:7'
@@ -61,6 +63,10 @@
scenario: test11
- image: 'centos:7'
scenario: test12
- image: 'centos:7'
scenario: test13
- image: 'centos:7'
scenario: test14

steps:
- name: Check out the codebase.
19 changes: 11 additions & 8 deletions README.md
@@ -39,32 +39,35 @@ package in the image.

`openhpc_module_system_install`: Optional, default true. Whether or not to install an environment module system. If true, lmod will be installed. If false, You can either supply your own module system or go without one.

`openhpc_ram_multiplier`: Optional, default `0.95`. Multiplier used in the calculation: `total_memory * openhpc_ram_multiplier` when setting `RealMemory` for the partition in slurm.conf. Can be overriden on a per partition basis using `openhpc_slurm_partitions.ram_multiplier`. Has no effect if `openhpc_slurm_partitions.ram_mb` is set.

### slurm.conf

`openhpc_slurm_partitions`: list of one or more slurm partitions. Each partition may contain the following values:
* `groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
* `name`: The name of the nodes within this group.
* `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
* `extra_nodes`: Optional. A list of additional node definitions, e.g. for nodes in this group/partition not controlled by this role. Each item should be a dict, with keys/values as per the ["NODE CONFIGURATION"](https://slurm.schedmd.com/slurm.conf.html#lbAE) docs for slurm.conf. Note the key `NodeName` must be first.
* `ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`) in MiB. This is set using ansible facts if not defined, equivalent to `free --mebi` total * `openhpc_ram_multiplier`.

For each group (if used) or partition there must be an ansible inventory group `<cluster_name>_<group_name>`, with all nodes in this inventory group added to the group/partition. Note that:
- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
- Nodes in a group are assumed to be homogenous in terms of processor and memory.
- An inventory group may be empty, but if it is not then the play must contain at least one node from it (used to set processor information).
* `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
* `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
* `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
* `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
given by `openhpc_job_maxtime`.

For each group (if used) or partition any nodes in an ansible inventory group `<cluster_name>_<group_name>` will be added to the group/partition. Note that:
- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
- Nodes in a group are assumed to be homogenous in terms of processor and memory.
- An inventory group may be empty, but if it is not then the play must contain at least one node from it (used to set processor information).
- Nodes may not appear in more than one group.
- A group/partition definition which does not have either a corresponding inventory group or a `extra_nodes` will raise an error.

`openhpc_job_maxtime`: A maximum time job limit in hours, minutes and seconds. The default is `24:00:00`.

`openhpc_cluster_name`: name of the cluster

`openhpc_config`: Mapping of additional parameters and values for `slurm.conf`. Note these will override any included in `templates/slurm.conf.j2`.

`openhpc_ram_multiplier`: Optional, default `0.95`. Multiplier used in the calculation: `total_memory * openhpc_ram_multiplier` when setting `RealMemory` for the partition in slurm.conf. Can be overriden on a per partition basis using `openhpc_slurm_partitions.ram_multiplier`. Has no effect if `openhpc_slurm_partitions.ram_mb` is set.

`openhpc_state_save_location`: Optional. Absolute path for Slurm controller state (`slurm.conf` parameter [StateSaveLocation](https://slurm.schedmd.com/slurm.conf.html#OPT_StateSaveLocation))
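
For illustration only (not part of this diff), a partition definition combining the documented `groups`, `ram_mb` and `extra_nodes` keys might look like the sketch below; the cluster, group and node names are hypothetical:

```yaml
# Hypothetical sketch of openhpc_slurm_partitions - names and sizes are illustrative only.
openhpc_cluster_name: mycluster
openhpc_slurm_partitions:
  - name: general
    groups:
      - name: small                # hosts taken from the inventory group mycluster_small
        ram_mb: 180000             # optional override for RealMemory (MiB)
      - name: burst
        extra_nodes:               # nodes not controlled by this role; NodeName must come first
          - NodeName: burst-[0-3]
            State: DOWN
            CPUs: 8
```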

#### Accounting
3 changes: 2 additions & 1 deletion defaults/main.yml
@@ -12,6 +12,8 @@ openhpc_resume_timeout: 300
openhpc_retry_delay: 10
openhpc_job_maxtime: 24:00:00
openhpc_config: "{{ openhpc_extra_config | default({}) }}"
openhpc_slurm_configless: "{{ 'enable_configless' in openhpc_config.get('SlurmctldParameters', []) }}"

openhpc_state_save_location: /var/spool/slurm

# Accounting
@@ -49,7 +51,6 @@ ohpc_slurm_services:
ohpc_release_repos:
"7": "https://github.com/openhpc/ohpc/releases/download/v1.3.GA/ohpc-release-1.3-1.el7.x86_64.rpm" # ohpc v1.3 for Centos 7
"8": "http://repos.openhpc.community/OpenHPC/2/CentOS_8/x86_64/ohpc-release-2-1.el8.x86_64.rpm" # ohpc v2 for Centos 8
openhpc_slurm_configless: false
openhpc_munge_key:
openhpc_login_only_nodes: ''
openhpc_module_system_install: true
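
A brief sketch (not part of the diff) of how the changed `openhpc_slurm_configless` default behaves: because it is now derived from `openhpc_config`, requesting configless operation only needs the Slurm parameter itself.

```yaml
# Sketch: with the default above, this single setting...
openhpc_config:
  SlurmctldParameters:
    - enable_configless
# ...makes openhpc_slurm_configless evaluate to true without being set explicitly.
```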
33 changes: 28 additions & 5 deletions filter_plugins/group_hosts.py → filter_plugins/slurm_conf.py
@@ -12,6 +12,8 @@
# License for the specific language governing permissions and limitations
# under the License.

# NB: To test this from the repo root run:
# ansible-playbook -i tests/inventory -i tests/inventory-mock-groups tests/filter.yml

from ansible import errors
import jinja2
@@ -30,11 +32,25 @@ def _get_hostvar(context, var_name, inventory_hostname=None):
namespace = context["hostvars"][inventory_hostname]
return namespace.get(var_name)

@jinja2.contextfilter
def group_hosts(context, group_names):
return {g:_group_hosts(context["groups"].get(g, [])) for g in sorted(group_names)}
def hostlist_expression(hosts):
""" Group hostnames using Slurm's hostlist expression format.
E.g. with an inventory containing:
[compute]
dev-foo-0 ansible_host=localhost
dev-foo-3 ansible_host=localhost
my-random-host
dev-foo-4 ansible_host=localhost
dev-foo-5 ansible_host=localhost
dev-compute-0 ansible_host=localhost
dev-compute-1 ansible_host=localhost
Then "{{ groups[compute] | hostlist_expression }}" will return:
["dev-foo-[0,3-5]", "dev-compute-[0-1]", "my-random-host"]
"""

def _group_hosts(hosts):
results = {}
unmatchable = []
for v in hosts:
@@ -58,9 +74,16 @@ def _group_numbers(numbers):
prev = v
return ','.join(['{}-{}'.format(u[0], u[-1]) if len(u) > 1 else str(u[0]) for u in units])

def error(condition, msg):
""" Raise an error if condition is not True """

if not condition:
raise errors.AnsibleFilterError(msg)

class FilterModule(object):

def filters(self):
return {
'group_hosts': group_hosts
Member commented: That looks like it's backwards incompatible? Could we keep old and new easily enough?

Collaborator (author) replied: The problem is that the filter name is massively confusing when used in the templating, which also used group_hosts as a variable containing the list of hosts in that group. I can't see why backward compatibility is required, as I can't see a use-case for another playbook using this filter.

'hostlist_expression': hostlist_expression,
'error': error,
}
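
Because part of the implementation is collapsed in this diff, the following is a standalone sketch (not the role's actual code; the names `hostlist_expression_sketch` and `_ranges` are invented here) of the behaviour the docstring describes: hosts named `<prefix><number>` are collapsed into Slurm hostlist expressions, and anything else is passed through unchanged.

```python
# Standalone sketch only - not the role's implementation.
import re
from collections import defaultdict

def hostlist_expression_sketch(hosts):
    """Collapse e.g. dev-foo-0, dev-foo-3, dev-foo-4 into dev-foo-[0,3-4]."""
    prefixes = defaultdict(list)   # prefix -> list of numeric suffixes
    unmatchable = []               # hosts without a trailing number
    for host in hosts:
        m = re.match(r"^(.*\D)(\d+)$", host)
        if m:
            prefixes[m.group(1)].append(int(m.group(2)))
        else:
            unmatchable.append(host)
    expressions = ["{}[{}]".format(prefix, _ranges(sorted(nums)))
                   for prefix, nums in prefixes.items()]
    return expressions + unmatchable

def _ranges(numbers):
    """Turn a sorted list like [0, 3, 4, 5] into the string '0,3-5'."""
    runs, run = [], [numbers[0]]
    for n in numbers[1:]:
        if n == run[-1] + 1:
            run.append(n)
        else:
            runs.append(run)
            run = [n]
    runs.append(run)
    return ",".join("{}-{}".format(r[0], r[-1]) if len(r) > 1 else str(r[0]) for r in runs)

print(hostlist_expression_sketch(
    ["dev-foo-0", "dev-foo-3", "my-random-host", "dev-foo-4",
     "dev-foo-5", "dev-compute-0", "dev-compute-1"]))
# Expected: ['dev-foo-[0,3-5]', 'dev-compute-[0-1]', 'my-random-host']
```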
1 change: 1 addition & 0 deletions molecule/README.md
@@ -21,6 +21,7 @@ test10 | 1 | N | As for #5 but then tries to ad
test11 | 1 | N | As for #5 but then deletes a node (actually changes the partition due to molecule/ansible limitations)
test12 | 1 | N | As for #5 but enabling job completion and testing `sacct -c`
test13 | 1 | N | As for #5 but tests `openhpc_config` variable.
test14 | 1 | | As for #5 but also tests `extra_nodes` via State=DOWN nodes.

# Local Installation & Running

4 changes: 2 additions & 2 deletions molecule/test13/verify.yml
@@ -13,8 +13,8 @@
command: scontrol show config
register: slurm_config
- assert:
that: "item in slurm_config.stdout"
that: "item in (slurm_config.stdout_lines | map('replace', ' ', ''))"
fail_msg: "FAILED - {{ item }} not found in slurm config"
loop:
- SlurmctldSyslogDebug=error
- SlurmctldSyFirstJobId=13
- FirstJobId=13
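
For context on why this verification changed (this relies on an assumption about `scontrol` output formatting, which is not stated in the diff): `scontrol show config` pads its output with spaces around `=`, so a raw substring match on items like `FirstJobId=13` can fail; stripping spaces per line lets the loop items match exactly.

```yaml
# Illustration, assuming scontrol's padded output format:
#   raw stdout_lines item:          "FirstJobId              = 13"
#   after map('replace', ' ', ''):  "FirstJobId=13"
# which matches the loop items above exactly.
```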
30 changes: 30 additions & 0 deletions molecule/test14/converge.yml
@@ -0,0 +1,30 @@
---
- name: Converge
hosts: all
tasks:
- name: "Include ansible-role-openhpc"
include_role:
name: "{{ lookup('env', 'MOLECULE_PROJECT_DIRECTORY') | basename }}"
vars:
openhpc_enable:
control: "{{ inventory_hostname in groups['testohpc_login'] }}"
batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
runtime: true
openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
openhpc_slurm_partitions:
- name: "compute"
extra_nodes:
# Need to specify IPs for the non-existent State=DOWN nodes, because otherwise even in this state slurmctld will exclude a node with no lookup information from the config.
# We use invalid IPs here (i.e. starting 0.) to flag the fact the nodes shouldn't exist.
# Note this has to be done via slurm config rather than /etc/hosts due to Docker limitations on modifying the latter.
- NodeName: fake-x,fake-y
NodeAddr: 0.42.42.0,0.42.42.1
Member commented: Should we use internal IPs here? Like 10.42.42.0?

Member commented: Does this also affect cloud nodes that do not exist?

Collaborator (author) replied: @JohnGarbutt does 9b04782 clarify why I'm using invalid IPs rather than internal ones? It doesn't affect cloud nodes which don't exist, as they will have been listed in the config as State=CLOUD, not State=DOWN. The former specifically means slurmctld knows it doesn't know how to contact them until they're "resumed".

State: DOWN
CPUs: 1
- NodeName: fake-2cpu-[3,7-9]
NodeAddr: 0.42.42.3,0.42.42.7,0.42.42.8,0.42.42.9
State: DOWN
CPUs: 2
openhpc_cluster_name: testohpc
openhpc_slurm_configless: true
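
As a hypothetical aside (not used by test14), the same `extra_nodes` mechanism could describe autoscaled cloud nodes, per the review discussion above: such nodes would be listed with `State=CLOUD`, which tells slurmctld it cannot contact them until they are resumed.

```yaml
# Hypothetical sketch only - node names and CPU count are invented.
openhpc_slurm_partitions:
  - name: compute
    extra_nodes:
      - NodeName: cloud-[0-3]
        State: CLOUD
        CPUs: 2
```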

60 changes: 60 additions & 0 deletions molecule/test14/molecule.yml
@@ -0,0 +1,60 @@
---
name: single partition, group is partition
driver:
name: docker
platforms:
- name: testohpc-login-0
image: ${MOLECULE_IMAGE}
pre_build_image: true
groups:
- testohpc_login
command: /sbin/init
tmpfs:
- /run
- /tmp
volumes:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
- name: testohpc-compute-0
image: ${MOLECULE_IMAGE}
pre_build_image: true
groups:
- testohpc_compute
command: /sbin/init
tmpfs:
- /run
- /tmp
volumes:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
- name: testohpc-compute-1
image: ${MOLECULE_IMAGE}
pre_build_image: true
groups:
- testohpc_compute
command: /sbin/init
tmpfs:
- /run
- /tmp
volumes:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
provisioner:
name: ansible
verifier:
name: ansible
12 changes: 12 additions & 0 deletions molecule/test14/verify.yml
@@ -0,0 +1,12 @@
---

- name: Check slurm hostlist
hosts: testohpc_login
tasks:
- name: Get slurm partition info
command: sinfo --noheader --format="%P,%a,%l,%D,%t,%N" # using --format ensures we control whitespace
register: sinfo
- name:
assert: # PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
that: "sinfo.stdout_lines == ['compute*,up,60-00:00:00,6,down*,fake-2cpu-[3,7-9],fake-x,fake-y', 'compute*,up,60-00:00:00,2,idle,testohpc-compute-[0-1]']"
fail_msg: "FAILED - actual value: {{ sinfo.stdout_lines }}"
12 changes: 12 additions & 0 deletions molecule/test5/molecule.yml
@@ -16,6 +16,10 @@ platforms:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
- name: testohpc-compute-0
image: ${MOLECULE_IMAGE}
pre_build_image: true
@@ -29,6 +33,10 @@ platforms:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
- name: testohpc-compute-1
image: ${MOLECULE_IMAGE}
pre_build_image: true
@@ -42,6 +50,10 @@ platforms:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
provisioner:
name: ansible
verifier:
4 changes: 4 additions & 0 deletions molecule/test6/molecule.yml
@@ -16,6 +16,10 @@ platforms:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
networks:
- name: net1
docker_networks:
- name: net1
driver_options:
com.docker.network.driver.mtu: ${DOCKER_MTU:-1500} # 1500 is docker default
provisioner:
name: ansible
inventory:
4 changes: 3 additions & 1 deletion tasks/runtime.yml
@@ -58,6 +58,7 @@
state: touch
owner: slurm
group: slurm
mode: 0644
access_time: preserve
modification_time: preserve
when: openhpc_slurm_job_comp_type == 'jobcomp/filetxt'
@@ -85,6 +86,7 @@
src: slurm.conf.j2
dest: "{{ _slurm_conf_tmpfile.path }}"
lstrip_blocks: true
mode: 0644
delegate_to: localhost
when: openhpc_enable.control | default(false) or not openhpc_slurm_configless
changed_when: false # so molecule doesn't fail
@@ -95,7 +97,7 @@
path: "{{ _slurm_conf_tmpfile.path }}"
option: "{{ item.key }}"
section: null
value: "{{ item.value }}"
value: "{{ (item.value | join(',')) if (item.value is sequence and item.value is not string) else item.value }}"
no_extra_spaces: true
create: no
mode: 0644
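
To illustrate the list handling added above (the key and node names below are hypothetical, chosen to match the commit that removed `openhpc_suspend_exc_nodes` in favour of `openhpc_config`): a list value is joined with commas by the task that writes options into the templated slurm.conf.

```yaml
# Sketch with hypothetical node names:
openhpc_config:
  SuspendExcNodes:
    - login-0
    - compute-[0-1]
# ...is written into slurm.conf as a single line:
#   SuspendExcNodes=login-0,compute-[0-1]
```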
47 changes: 21 additions & 26 deletions templates/slurm.conf.j2
@@ -106,39 +106,34 @@ NodeName={{ node }}
PropagateResourceLimitsExcept=MEMLOCK
Epilog=/etc/slurm/slurm.epilog.clean
{% for part in openhpc_slurm_partitions %}
{% set nodelist = [] %}
{% for group in part.get('groups', [part]) %}

{% set group_name = group.cluster_name|default(openhpc_cluster_name) ~ '_' ~ group.name %}
{% if groups[group_name] | length > 0 %}
{# If using --limit, the first host in each group may not have facts available. Find one that does, or die: #}
{% set group_hosts = groups[group_name] | intersect(play_hosts) %}
{% set first_host = group_hosts | first | mandatory('Group "' + group_name + '" contains no hosts in this play - was --limit used?') %}
# openhpc_slurm_partitions group: {{ group_name }}
{% set inventory_group_hosts = groups.get(group_name, []) %}
{% if inventory_group_hosts | length > 0 %}
{% set play_group_hosts = inventory_group_hosts | intersect (play_hosts) %}
{% set first_host = play_group_hosts | first | mandatory('Group "' ~ group_name ~ '" contains no hosts in this play - was --limit used?') %}
{% set first_host_hv = hostvars[first_host] %}

NodeName=DEFAULT State=UNKNOWN \
RealMemory={% if 'ram_mb' in group %}{{group.ram_mb}}{% else %}{{ (first_host_hv['ansible_memory_mb']['real']['total'] * group.ram_multiplier | default(openhpc_ram_multiplier)) | int }}{% endif %} \
Sockets={{first_host_hv['ansible_processor_count']}} \
CoresPerSocket={{first_host_hv['ansible_processor_cores']}} \
ThreadsPerCore={{first_host_hv['ansible_processor_threads_per_core']}}
{% for node in groups[group_name] %}
NodeName={{ node }}
{% set ram_mb = (first_host_hv['ansible_memory_mb']['real']['total'] * (group.ram_multiplier | default(openhpc_ram_multiplier))) | int %}
{% for hostlist in (inventory_group_hosts | hostlist_expression) %}
NodeName={{ hostlist }} State=UNKNOWN RealMemory={{ group.get('ram_mb', ram_mb) }} Sockets={{first_host_hv['ansible_processor_count']}} CoresPerSocket={{ first_host_hv['ansible_processor_cores'] }} ThreadsPerCore={{ first_host_hv['ansible_processor_threads_per_core'] }}
{% set _ = nodelist.append(hostlist) %}
{% endfor %}{# nodes #}
{% else %}
NodeName=-nonesuch
{% endif %}{# non-empty group #}
{% endif %}{# inventory_group_hosts #}
{% for extra_node_defn in group.get('extra_nodes', []) %}
{{ extra_node_defn.items() | map('join', '=') | join(' ') }}
{% set _ = nodelist.append(extra_node_defn['NodeName']) %}
{% endfor %}
{% endfor %}{# group #}
PartitionName={{part.name}} \
Default={% if 'default' in part %}{{ part.default }}{% else %}YES{% endif %} \
MaxTime={% if 'maxtime' in part %}{{ part.maxtime }}{% else %}{{ openhpc_job_maxtime }}{% endif %} \
State=UP \
Nodes=\
{% for group in part.get('groups', [part]) %}
{% set group_name = group.cluster_name|default(openhpc_cluster_name) ~ '_' ~ group.name %}
{% if groups[group_name] | length > 0 %}{{ groups[group_name] | join(",\\\n") }}{% if not loop.last %}{{ ",\\\n" }}{% endif %}{% else %}-nonesuch{% endif %}
{% endfor %}{# group #}
{% if not nodelist %} {# empty partition - define an invalid hostname which slurm accepts #}
{% set nodelist = ['-nonesuch'] %}
NodeName={{ nodelist[0] }}
{% endif %}
PartitionName={{part.name}} Default={{ part.get('default', 'YES') }} MaxTime={{ part.get('maxtime', openhpc_job_maxtime) }} State=UP Nodes={{ nodelist | join(',') }}
{% endfor %}{# partitions #}

# Want nodes that drop out of SLURM's configuration to be automatically
# returned to service when they come back.
ReturnToService=2

# Parameters from openhpc_config (which do not override values templated above) will be below here:
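
As a rough sketch of what the rewritten template is expected to produce for the test14 partition earlier in this diff (fact-derived figures shown as placeholders, and assuming dict key order is preserved when rendering `extra_nodes`):

```
# Hypothetical rendering - memory and CPU topology come from ansible facts on the containers.
NodeName=testohpc-compute-[0-1] State=UNKNOWN RealMemory=<fact-derived> Sockets=<fact-derived> CoresPerSocket=<fact-derived> ThreadsPerCore=<fact-derived>
NodeName=fake-x,fake-y NodeAddr=0.42.42.0,0.42.42.1 State=DOWN CPUs=1
NodeName=fake-2cpu-[3,7-9] NodeAddr=0.42.42.3,0.42.42.7,0.42.42.8,0.42.42.9 State=DOWN CPUs=2
PartitionName=compute Default=YES MaxTime=<openhpc_job_maxtime> State=UP Nodes=testohpc-compute-[0-1],fake-x,fake-y,fake-2cpu-[3,7-9]
```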