Add CalcJob test over SSH (#4732)
Adds a configuration for a remote computer (slurm docker container) and uses it
to run a CalcJob test over SSH.

This is a follow-up on the memory leak tests, since the leak of the process
instance was discovered to occur only when running CalcJobs on a remote
computer via an SSH connection.

Co-authored-by: Chris Sewell <chrisj_sewell@hotmail.com>
ltalirz and chrisjsewell committed Feb 10, 2021
1 parent 13358ed commit 2e18f5b
Showing 8 changed files with 96 additions and 4 deletions.
5 changes: 5 additions & 0 deletions .github/config/README.md
@@ -0,0 +1,5 @@
# AiiDA configuration files

This folder contains configuration files for AiiDA computers, codes etc.

- `slurm_rsa`: private key that provides access to the `slurm-ssh` container
7 changes: 7 additions & 0 deletions .github/config/slurm-ssh-config.yaml
@@ -0,0 +1,7 @@
---
safe_interval: 0
username: xenon
look_for_keys: true
key_filename: "PLACEHOLDER_SSH_KEY"
key_policy: AutoAddPolicy
port: 5001
12 changes: 12 additions & 0 deletions .github/config/slurm-ssh.yaml
@@ -0,0 +1,12 @@
---
label: slurm-ssh
description: slurm container
hostname: localhost
transport: ssh
scheduler: slurm
shebang: "#!/bin/bash"
work_dir: /home/{username}/workdir
mpirun_command: "mpirun -np {tot_num_mpiprocs}"
mpiprocs_per_machine: 1
prepend_text: ""
append_text: ""
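The `work_dir` and `mpirun_command` values above contain `{username}` and `{tot_num_mpiprocs}` placeholders. A minimal sketch of how such templates can be expanded with Python's `str.format`, assuming the placeholders are filled in at submission time in a similar way. The helper name `expand_computer_templates` and the sample values are hypothetical and not part of this commit:

```python
def expand_computer_templates(work_dir, mpirun_command, username, tot_num_mpiprocs):
    """Expand the placeholders used in the computer configuration (illustrative only)."""
    expanded_work_dir = work_dir.format(username=username)
    # Split the command into arguments, then format each argument separately
    expanded_mpirun = [part.format(tot_num_mpiprocs=tot_num_mpiprocs) for part in mpirun_command.split()]
    return expanded_work_dir, expanded_mpirun


work_dir, mpirun = expand_computer_templates(
    work_dir='/home/{username}/workdir',
    mpirun_command='mpirun -np {tot_num_mpiprocs}',
    username='xenon',  # the SSH user from slurm-ssh-config.yaml
    tot_num_mpiprocs=1,
)
print(work_dir)          # /home/xenon/workdir
print(' '.join(mpirun))  # mpirun -np 1
```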
27 changes: 27 additions & 0 deletions .github/config/slurm_rsa
@@ -0,0 +1,27 @@
-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEAnCqpTQFbmi1WPX4uTUFCHAvf61AhvqXUFoJEHQEvtDYibWJZ
bI7LueA2eEKw68oynIfPeinr4+DOnejMG1+HKCWi03DzWoorBOYc0e9i3nxkU93j
hZZsiQZfBgcCenqh2t1ZLbEFdFnCqLDw6gbDH0F3W3NJW0Q30a8HQ01lqdSKyVdf
UghVLCx1HM53BxXEYGU2m2ii+uyoMIsz9TSCJdKXIAb5N4tZYqKPF8q0vf1eP2BB
SUsn4bAHpPqvx3I0HkyR6qV5UT4K91FteULLTJHjK3Y0bBUMOmNQPh0JTmfj/KNB
EtJdlGYE0Tce1XINvhHItSpdFZs8GTnmOzUaVQIDAQABAoIBAEpWsILcm5tX646Y
KzhRUUQCjxP38ChNzhjs57ma3/d8MYU6ZPEdRHN1/Nfgf1Guzcrfh29S11yBnjlj
IQ4CulbtG4ZlZSJ7VSEe3Sc+OiVIt4WIwY7M3VuY8dDvs0lUaQnDhnkOpFcPh28/
017D20xcoJGi3o+YeK3TELUD+doOeaot4+5TvR0PiLEmyjlnWB1FRkYpGAVDRKKa
F3dSAGf41ygoDOaGmtNmpH/Fn1k9cSDZsRsMKjZQTjgKfX+y/H6eOpORgHYHVmlu
eFIK8+yVVBy5k+m7nTIAUzXm01yJ5fQuT/75EcILUvjloTwmykaTfO1Ez6rNf+BC
VCdD9H0CgYEAyBjEB9vbZ5gDnnkdG0WCr34xPtBztTuVADWz5HorHYFircBUIaJ0
XOIUioXMmpgSRTzbryAXVznh+g3LeS8QgiGQJoRhIknN8rrRUWd25tgImCMte0eb
bTieJYpvUk8RPan/Arb6f1MLZjWYfJelSw8qQS6R4ydk1L2M78sri/8CgYEAx8vy
KP1e5gGfA42Q0aHvocH7vqbEAOfDK8J+RpT/EoSJ6kSu2oPvblF1CBqHo/nQMhfK
AGbAtWIfy8rs1Md2k+Y+8PXtY8sJJ/HA8laVnEvTHbPSt4X7TtrLx27a8ZWtTNYu
JH/kK8rFBHEGqLnS6VJmqvHKqglp7FIQmHNNaasCgYEApGSMcXR0zqh6mLEic6xp
EOtZZCT4WzZHVTPJxvWEBKqvOtbfh/6jIUhw3dnNXll/8ThtuHRiGLyqZrj8qWQ8
aN1QRATQlM4UEM7hd8LMUh28+dk03arYDCTO8ULJ8NKa9JF8vGs+ZGsC24c+72Xb
XE5qRcEQBJLx6UKNztiZv1sCgYACHBEuhZ5e5116eCAzVnZlStsRpEkliUzyRVd3
/1LCK0wZgSgnfoUksQ9/SmhsPtMH9GBZqLwYLjUPvdDKXmDOJvw7Jx2elCJAnbjf
1jI2OEa+ZYuwDGYe6wiDzpPZQS9XRFuwXvlVzQpPhbIAThYACLK002DEctz/dc5f
DbifiQKBgQCdXgr7tdEAmusvIcTRA1KMIOGE5pMGYfbMnDTTIihUfRMJbCnn9sHe
PrDKVVgD3W4hjOABN24KOlCZPtWZfKUKe893ali7mFAIwKNV/AKhQhDgGzJPidqc
6DIL2GhDwqtPIf3b6sI21ZvyAFDROZMKnoL5Q1xbbp5EADi2wPO55Q==
-----END RSA PRIVATE KEY-----
31 changes: 27 additions & 4 deletions .github/system_tests/pytest/test_memory_leaks.py
@@ -10,17 +10,23 @@
 """Utilities for testing memory leakage."""
 from tests.utils import processes as test_processes  # pylint: disable=no-name-in-module,import-error
 from tests.utils.memory import get_instances  # pylint: disable=no-name-in-module,import-error
-from aiida.engine import processes, run
+from aiida.engine import processes, run_get_node
 from aiida.plugins import CalculationFactory
 from aiida import orm

 ArithmeticAddCalculation = CalculationFactory('arithmetic.add')


+def run_finished_ok(*args, **kwargs):
+    """Convenience function to check that run worked fine."""
+    _, node = run_get_node(*args, **kwargs)
+    assert node.is_finished_ok, (node.exit_status, node.exit_message)
+
+
 def test_leak_run_process():
     """Test whether running a dummy process leaks memory."""
     inputs = {'a': orm.Int(2), 'b': orm.Str('test')}
-    run(test_processes.DummyProcess, **inputs)
+    run_finished_ok(test_processes.DummyProcess, **inputs)

     # check that no reference to the process is left in memory
     # some delay is necessary in order to allow for all callbacks to finish
@@ -30,8 +36,25 @@ def test_leak_run_process():

 def test_leak_local_calcjob(aiida_local_code_factory):
     """Test whether running a local CalcJob leaks memory."""
-    inputs = {'x': orm.Int(1), 'y': orm.Int(2), 'code': aiida_local_code_factory('arithmetic.add', '/usr/bin/diff')}
-    run(ArithmeticAddCalculation, **inputs)
+    inputs = {'x': orm.Int(1), 'y': orm.Int(2), 'code': aiida_local_code_factory('arithmetic.add', '/bin/bash')}
+    run_finished_ok(ArithmeticAddCalculation, **inputs)
+
+    # check that no reference to the process is left in memory
+    # some delay is necessary in order to allow for all callbacks to finish
+    process_instances = get_instances(processes.Process, delay=0.2)
+    assert not process_instances, f'Memory leak: process instances remain in memory: {process_instances}'
+
+
+def test_leak_ssh_calcjob():
+    """Test whether running a CalcJob over SSH leaks memory.
+    Note: This relies on the 'slurm-ssh' computer being set up.
+    """
+    code = orm.Code(
+        input_plugin_name='arithmetic.add', remote_computer_exec=[orm.load_computer('slurm-ssh'), '/bin/bash']
+    )
+    inputs = {'x': orm.Int(1), 'y': orm.Int(2), 'code': code}
+    run_finished_ok(ArithmeticAddCalculation, **inputs)

     # check that no reference to the process is left in memory
     # some delay is necessary in order to allow for all callbacks to finish
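The leak check in these tests boils down to "after the run, no instance of the process class may remain in memory". The sketch below illustrates that idea with the standard-library `gc` module. It is not the implementation of `get_instances` from `tests.utils.memory`, which is not shown in this diff; `find_instances` and `DummyProcess` are hypothetical names used only for illustration:

```python
import gc
import time


def find_instances(classes, delay=0.0):
    """Return all live, GC-tracked objects that are instances of ``classes``."""
    if delay > 0:
        time.sleep(delay)  # give pending callbacks a chance to finish
    gc.collect()  # free objects that are only kept alive by reference cycles
    return [obj for obj in gc.get_objects() if isinstance(obj, classes)]


class DummyProcess:
    """Stand-in for a process class; not part of AiiDA."""


proc = DummyProcess()
print(len(find_instances(DummyProcess)))  # one live instance while `proc` exists

del proc  # drop the last reference
leaked = find_instances(DummyProcess, delay=0.01)
assert not leaked, f'Memory leak: instances remain in memory: {leaked}'
```

A short delay before collecting mirrors the `delay=0.2` argument in the tests: references held by callbacks that have not yet run would otherwise be reported as leaks.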
4 changes: 4 additions & 0 deletions .github/workflows/ci-code.yml
@@ -70,6 +70,10 @@ jobs:
         image: rabbitmq:latest
         ports:
         - 5672:5672
+      slurm:
+        image: xenonmiddleware/slurm:17
+        ports:
+        - 5001:22

     steps:
     - uses: actions/checkout@v2
10 changes: 10 additions & 0 deletions .github/workflows/setup.sh
@@ -10,18 +10,28 @@ chmod 755 "${HOME}"

# Replace the placeholders in configuration files with actual values
CONFIG="${GITHUB_WORKSPACE}/.github/config"
cp "${CONFIG}/slurm_rsa" "${HOME}/.ssh/slurm_rsa"
sed -i "s|PLACEHOLDER_BACKEND|${AIIDA_TEST_BACKEND}|" "${CONFIG}/profile.yaml"
sed -i "s|PLACEHOLDER_PROFILE|test_${AIIDA_TEST_BACKEND}|" "${CONFIG}/profile.yaml"
sed -i "s|PLACEHOLDER_DATABASE_NAME|test_${AIIDA_TEST_BACKEND}|" "${CONFIG}/profile.yaml"
sed -i "s|PLACEHOLDER_REPOSITORY|/tmp/test_repository_test_${AIIDA_TEST_BACKEND}/|" "${CONFIG}/profile.yaml"
sed -i "s|PLACEHOLDER_WORK_DIR|${GITHUB_WORKSPACE}|" "${CONFIG}/localhost.yaml"
sed -i "s|PLACEHOLDER_REMOTE_ABS_PATH_DOUBLER|${CONFIG}/doubler.sh|" "${CONFIG}/doubler.yaml"
sed -i "s|PLACEHOLDER_SSH_KEY|${HOME}/.ssh/slurm_rsa|" "${CONFIG}/slurm-ssh-config.yaml"

verdi setup --config "${CONFIG}/profile.yaml"

# set up localhost computer
verdi computer setup --config "${CONFIG}/localhost.yaml"
verdi computer configure local localhost --config "${CONFIG}/localhost-config.yaml"
verdi computer test localhost
verdi code setup --config "${CONFIG}/doubler.yaml"
verdi code setup --config "${CONFIG}/add.yaml"

# set up slurm-ssh computer
verdi computer setup --config "${CONFIG}/slurm-ssh.yaml"
verdi computer configure ssh slurm-ssh --config "${CONFIG}/slurm-ssh-config.yaml" -n # needs slurm container
verdi computer test slurm-ssh --print-traceback

verdi profile setdefault test_${AIIDA_TEST_BACKEND}
verdi config runner.poll.interval 0
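The `sed -i "s|PLACEHOLDER_...|value|"` calls in `setup.sh` rewrite the configuration files in place before `verdi` reads them. A rough Python equivalent is sketched below, assuming a literal (non-regex) placeholder and replacing only the first match on each line, as `sed` does without the `g` flag. The function name `replace_placeholder` and the key path `/home/runner/.ssh/slurm_rsa` are hypothetical:

```python
import tempfile
from pathlib import Path


def replace_placeholder(path, placeholder, value):
    """Replace the first occurrence of ``placeholder`` on each line of ``path``, in place."""
    lines = path.read_text().splitlines(keepends=True)
    path.write_text(''.join(line.replace(placeholder, value, 1) for line in lines))


with tempfile.TemporaryDirectory() as tmp:
    # Write a shortened copy of slurm-ssh-config.yaml into a temporary directory
    config = Path(tmp) / 'slurm-ssh-config.yaml'
    config.write_text('username: xenon\nkey_filename: "PLACEHOLDER_SSH_KEY"\nport: 5001\n')

    replace_placeholder(config, 'PLACEHOLDER_SSH_KEY', '/home/runner/.ssh/slurm_rsa')
    result = config.read_text()

print(result)
```

Keeping placeholders in the committed YAML and substituting them at CI time avoids hard-coding the runner's home directory in the repository.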
4 changes: 4 additions & 0 deletions .github/workflows/test-install.yml
@@ -147,6 +147,10 @@ jobs:
         image: rabbitmq:latest
         ports:
         - 5672:5672
+      slurm:
+        image: xenonmiddleware/slurm:17
+        ports:
+        - 5001:22

     steps:
     - uses: actions/checkout@v2