# An example of submitting a parallel SoS job on the cluster

This notebook shows a toy example of how to submit an SoS job on the cluster.

For brevity I will simply write up a toy example here without explaining what each parameter is for. Please visit the [SoS website](https://vatlab.github.io/sos-docs/) for more details.

## A toy example script

The workflow below creates 10 text files, each containing a single line of text.

In [None]:
[global]
parameter: walltime = '1h'
parameter: mem = '3G'
parameter: ncore = 1
parameter: job_size = 1

In [None]:
[toy_example]
parameter: n = 10
n = [x + 1 for x in range(n)]
input: for_each = 'n'
output: f'File_{_n}.out'
task: trunk_workers = 1, trunk_size = job_size, walltime = walltime, mem = mem, cores = ncore, tags = f'{step_name}_{_output:bn}'
bash: expand = True
    echo {_n} > {_output}
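
To make the expected result concrete, here is a minimal plain-Python sketch of what the `toy_example` step produces when run serially, without SoS or a cluster. The function name `run_toy_example` and the `outdir` argument are hypothetical and used only for illustration; the file names and contents follow the `output:` and `echo` lines above.

```python
from pathlib import Path
from typing import List
import tempfile


def run_toy_example(n: int = 10, outdir: Path = Path(".")) -> List[Path]:
    """Serial stand-in for the toy_example step (illustration only)."""
    outputs = []
    # Same values as `[x + 1 for x in range(n)]` in the workflow: 1..n
    for i in range(1, n + 1):
        out = outdir / f"File_{i}.out"
        # Mirrors `echo {_n} > {_output}`: one line holding the number
        out.write_text(f"{i}\n")
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Write into a temporary directory so nothing is left behind
    with tempfile.TemporaryDirectory() as tmp:
        files = run_toy_example(10, Path(tmp))
        print(len(files), "files written, first is", files[0].name)
```

On the cluster, each of these 10 iterations becomes a separate task instead of a loop iteration.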

## Submit the workflow from login node

To run the example on a cluster, pass the cluster configuration file (e.g. `csg.yml`) with `-c` and a queue defined in that file (here `neurology`) with `-q`:

```
sos run Job_Example.ipynb toy_example -c csg.yml -q neurology
```

Then wait for the toy example to finish. In the meantime you can monitor its status, e.g. with `qstat -u <your username>`.

`csg.yml` also defines another queue called `csg`. To submit to that queue instead:

```
sos run Job_Example.ipynb toy_example -c csg.yml -q csg -s force
```

Note the `-s force` option: it makes SoS ignore the existing output files and resubmit the commands that generate them, which lets us demonstrate submitting to the `csg` queue.
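
If you switch between queues often, the two commands above can be assembled by a small helper. This is only a sketch: the function name `build_sos_command` is hypothetical, and it uses nothing beyond the `-c`, `-q` and `-s force` options already shown in this notebook.

```python
import shlex
from typing import List


def build_sos_command(notebook: str, workflow: str, config: str,
                      queue: str, force: bool = False) -> List[str]:
    """Assemble the `sos run` command line used in this notebook."""
    cmd = ["sos", "run", notebook, workflow, "-c", config, "-q", queue]
    if force:
        # Regenerate outputs even if they already exist
        cmd += ["-s", "force"]
    return cmd


if __name__ == "__main__":
    cmd = build_sos_command("Job_Example.ipynb", "toy_example",
                            "csg.yml", "csg", force=True)
    # Print the command rather than running it; paste it into the shell
    print(shlex.join(cmd))
```

The helper only builds the command; to actually run it you could pass the list to `subprocess.run` on a machine where `sos` is installed.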


## Submit the workflow through a compute node

For long-running jobs it is encouraged (in fact, **required**) to submit the above command to a compute node, which then distributes the 10 tasks, instead of running it on the login node. To do this, create a job script; for example, for the CU Neurology cluster:

```bash
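#!/bin/sh
# Wall-clock time limit for the submitter job (36 hours)
#$ -l h_rt=36:00:00
# Memory limit for the submitter job
#$ -l h_vmem=4G
# Job name shown by qstat
#$ -N job_submitter
# Run from the directory where qsub is invoked
#$ -cwd
# Interpret the script with bash
#$ -S /bin/bash
# Scheduler queue for the submitter job
#$ -q csg.q

# Make sos from the miniconda installation available on PATH
export PATH=$HOME/miniconda3/bin:$PATH
# Run the workflow; stdout and stderr both go to toy_example.log
sos run Job_Example.ipynb toy_example -c csg.yml -q neurology -s force &> toy_example.log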
```

This script requests 4 GB of memory and 36 hours of wall-clock time to manage the pipeline execution. Copy and save the contents above to a file called `toy_example.sh` and submit it with:

```
qsub toy_example.sh
```
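
If you need several such submitter scripts with different time or memory requests, the text can be generated rather than edited by hand. The sketch below reproduces the script above from a few parameters; the function name `make_submit_script` and its argument names are hypothetical, and the defaults are simply the values used in this notebook.

```python
def make_submit_script(sos_command: str,
                       walltime: str = "36:00:00",
                       mem: str = "4G",
                       job_name: str = "job_submitter",
                       sge_queue: str = "csg.q",
                       log_file: str = "toy_example.log") -> str:
    """Return the text of a submitter script like toy_example.sh."""
    lines = [
        "#!/bin/sh",
        f"#$ -l h_rt={walltime}",    # wall-clock limit
        f"#$ -l h_vmem={mem}",       # memory limit
        f"#$ -N {job_name}",         # job name
        "#$ -cwd",                   # run in the submission directory
        "#$ -S /bin/bash",           # shell used to run the script
        f"#$ -q {sge_queue}",        # scheduler queue
        "",
        "export PATH=$HOME/miniconda3/bin:$PATH",
        f"{sos_command} &> {log_file}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    script = make_submit_script(
        "sos run Job_Example.ipynb toy_example -c csg.yml -q neurology -s force")
    # Print the script; redirect to toy_example.sh to save it
    print(script, end="")
```

Saving the printed text as `toy_example.sh` gives a file that can be submitted with `qsub` as shown above.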

You can check the progress in `toy_example.log`, for example with:

```
cat toy_example.log
```

When the job finishes, the log should contain exactly the same content that was printed to the screen earlier, when you submitted the jobs from the login node.
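
Besides reading the log, you may want to confirm that all 10 output files were actually written. A minimal sketch, assuming the outputs are in the current directory and follow the `File_{_n}.out` naming from the workflow; the function name `missing_outputs` is hypothetical.

```python
from pathlib import Path
from typing import List


def missing_outputs(n: int = 10, outdir: Path = Path(".")) -> List[str]:
    """List the expected File_<i>.out names that are not present in outdir."""
    expected = [f"File_{i}.out" for i in range(1, n + 1)]
    return [name for name in expected if not (outdir / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs(10)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all outputs present")
```

An empty list means every expected file exists; any names returned point to tasks worth checking in the log.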