This notebook contains example usage of the slurm-magic package

# Loading the package

to use the magic commands defined in this package you need to first install it and than load it.

In [1]:
%load_ext slurm_magic
import warnings
warnings.filterwarnings("ignore")

If you also decide to make changes to the package, you can reload it

In [21]:
%reload_ext slurm_magic

Now you can test whether the package is working

In [3]:
%squeue

Unnamed: 0,JOBID,PARTITION,NAME,USER,ST,TIME,NODES,NODELIST(REASON)
0,16680345,plgrid,cartpole,plgjakub,R,2:09:37,1,ac0410
1,16685977_0,plgrid,run_arra,plgboksa,R,11:05,1,ac0590
2,16685977_1,plgrid,run_arra,plgboksa,R,11:05,1,ac0590
3,16685977_2,plgrid,run_arra,plgboksa,R,11:05,1,ac0590
4,16685977_3,plgrid,run_arra,plgboksa,R,11:05,1,ac0590
5,16685977_4,plgrid,run_arra,plgboksa,R,11:05,1,ac0590
6,16686032,plgrid-gp,bash,plgdwole,R,10:27,1,ag0009
7,16686398,plgrid-no,bash,plgabist,R,1:00,1,ac0787


# Running jobs

## Simple srun jobs

you can either type the command as you would in terminal or use some python to levrage variables

In [22]:
%srun --nodes=1 --ntasks=1 --time=00:00:01 --partition=plgrid-testing --account=plglscclass24-cpu venv/bin/python3 example_job.py

('This is the example job output c:\n',
 'srun: job 16687522 queued and waiting for resources\nsrun: job 16687522 has been allocated resources\n')

As you can see the output from jobs' stdout was captured and we can see the job completed sucessfully. When working with line magic you can save this output into a variable which will come in handy later

In [23]:
example_job_result = %srun --nodes=1 --ntasks=1 --time=00:00:01 --partition=plgrid-testing --account=plglscclass24-cpu venv/bin/python3 example_job.py

In [24]:
print(example_job_result[0])

This is the example job output c:



## Simple sbatch jobs
You can also run batch jobs. It is achieved through whats called a cell magic

In [25]:
%%sbatch
#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:00:01
#SBATCH --partition=plgrid-testing
#SBATCH --account=plglscclass24-cpu

venv/bin/python3 example_job.py

'Submitted batch job 16687800\n'

In [26]:
%squeue

Unnamed: 0,JOBID,PARTITION,NAME,USER,ST,TIME,NODES,NODELIST(REASON)
0,16687014,plgrid,gaia,plgwbart,R,19:47,1,ac0767
1,16680345,plgrid,cartpole,plgjakub,R,2:42:49,1,ac0410
2,16685977_0,plgrid,run_arra,plgboksa,R,44:17,1,ac0590
3,16685977_1,plgrid,run_arra,plgboksa,R,44:17,1,ac0590
4,16685977_2,plgrid,run_arra,plgboksa,R,44:17,1,ac0590
5,16685977_3,plgrid,run_arra,plgboksa,R,44:17,1,ac0590
6,16685977_4,plgrid,run_arra,plgboksa,R,44:17,1,ac0590
7,16686032,plgrid-gp,bash,plgdwole,R,43:39,1,ag0009
8,16686398,plgrid-no,bash,plgabist,R,34:12,1,ac0787


In [28]:
# to access the result just read the output file
with open("slurm-16687800.out") as f:
    print(f.read())

This is the example job output c:



# Running code without file
This is not possible with the current version of package. There is option however to use built in magic commands to not leave the notebook anyway

In [30]:
%%writefile example_job_2.py
# we can use %%writefile magic to save the cell contents into a file
# after that we can run the created file like before

print("This is a script created inside notebook")

Overwriting example_job_2.py


In [31]:
%srun --nodes=1 --ntasks=1 --time=00:00:01 --partition=plgrid-testing --account=plglscclass24-cpu venv/bin/python3 example_job_2.py

('This is a script created inside notebook\n',
 'srun: job 16689478 queued and waiting for resources\nsrun: job 16689478 has been allocated resources\n')

# Interacting with input and output
For now there is no good way to pass input to the job directly. The only option is to use files. The output from srun however is captured which enables some possibilities.

In [37]:
%%writefile example_job_3.py

import pandas as pd
import numpy as np
import pickle
import base64

NUM_ROWS = 1_000_000

data = {
    'value': np.random.rand(NUM_ROWS) * 100,
    'category': np.random.choice(['A', 'B', 'C', 'D'], size=NUM_ROWS)
}
df = pd.DataFrame(data)

result_series = df.groupby('category').agg({'value': 'mean'})

result = pickle.dumps(result_series)
print(base64.b64encode(result).decode('utf-8'))


Overwriting example_job_3.py


We cannot pass the objects from jobs directly to python since we need to go through stdout. To successfuly pass an object we can use pickle and base64 encoding. After saving the result to variable we can easily access the result.

In [38]:
result = %srun --nodes=1 --ntasks=1 --time=00:01:00 --partition=plgrid-testing --account=plglscclass24-cpu venv/bin/python3 example_job_3.py

In [39]:
import pandas as pd
import pickle
import io
import base64

decoded_pickled_bytes = base64.b64decode(result[0].encode('utf-8'))
byte_stream = io.BytesIO(decoded_pickled_bytes)
df = pickle.load(byte_stream)

In [42]:
df

Unnamed: 0_level_0,value
category,Unnamed: 1_level_1
A,49.973025
B,49.960188
C,49.928621
D,50.048835


Additionaly the scripts that are being run can use imports and juypter python interpreter to extend their functionality. As an example one can even transfare whole script context back to the jupyter notebook.

In [99]:
%%writefile example_job_4.py

import pandas as pd
import numpy as np
import pickle
import base64
from copy import deepcopy

NUM_ROWS = 1_000_000

data = {
    'value': np.random.rand(NUM_ROWS) * 100,
    'category': np.random.choice(['A', 'B', 'C', 'D'], size=NUM_ROWS)
}
df = pd.DataFrame(data)

result_series = df.groupby('category').agg({'value': 'mean'})

# --- Logic to pickle all relevant variables ---
context = {}
global_keys = deepcopy(list(globals().keys()))

for k in global_keys:
    if not k.startswith('__') and k != "context" and k!= "global_keys":
        try:
            context[k] = deepcopy(globals()[k])
            print(f"copied: {k}", file=sys.stderr)
        except:
            ...

try:
    pickled_context = pickle.dumps(context)
    encoded_context = base64.b64encode(pickled_context).decode('utf-8')
    print(encoded_context)

except Exception as e:
    print(f"An error occurred during pickling or encoding: {e}", file=sys.stderr)

Overwriting example_job_4.py


In [100]:
result = %srun --nodes=1 --ntasks=1 --time=00:00:10 --partition=plgrid-testing --account=plglscclass24-cpu venv/bin/python3 example_job_4.py

In [101]:
print(result[1])

srun: job 16690963 queued and waiting for resources
srun: job 16690963 has been allocated resources



In [103]:
import pandas as pd
import pickle
import io
import base64

decoded_pickled_bytes = base64.b64decode(result[0].encode('utf-8'))
byte_stream = io.BytesIO(decoded_pickled_bytes)
loaded_context = pickle.load(byte_stream)

for k, v in loaded_context.items():
    globals()[k] = v

After using ```locals()``` to load the context we can use the variables inside the notebook

In [105]:
print(NUM_ROWS)
print(df.head())

1000000
       value category
0  41.642109        D
1  52.184944        D
2  95.215801        B
3   3.803339        A
4  24.473911        A
