# How to: Work on Narval, CCDB

**Learning outcomes**

0. [CCDB account setup and ssh connection to Narval](#scrollTo=)
1. [Create a Globus account for data transfer (optional)](#scrollTo=be524736-3944-4376-b03c-6b85ffb0eb3e)
2. [Structure and Datasets in Narval](#scrollTo=) \
    a) [Storage and file management on Narval](#scrollTo=3877d880-4c49-4093-9b69-3cdfe6462132) \
    b) [Datasets in data](#scrollTo=80d1a788-1996-4b38-8775-1aa50e7cb509)
3. [Basic commands on a Narval node](#scrollTo=9f94af8c-5dce-4327-9580-793e7aaecc22)
4. [SLURM job templates](#scrollTo=3877d880-4c49-4093-9b69-3cdfe6462132)

<a id='scrollTo=be524736-3944-4376-b03c-6b85ffb0eb3e'></a>
## 0. CCDB account setup and ssh connection to Narval
See [Tutorial-1.2](https://colab.research.google.com/github/Neuro-iX/Tutorials/blob/main/Tutorial_1_NewMember/Tutorial_1_NewMember.ipynb#scrollTo=3877d880-4c49-4093-9b69-3cdfe6462132)

<a id='scrollTo=be524736-3944-4376-b03c-6b85ffb0eb3e'></a>
## 1. Create a Globus account for data transfer (optional)
1. Go to https://globus.computecanada.ca 
2. Existing organisation: select Compute Canada. 
3. Use your CCDB login information. 
4. Follow the instructions and allow everything.
5. Click "Collection: Search" -> Get Globus Connect Personal -> Show me the other supported operating systems \
On Linux: 
```bash
tar xzf globusconnectpersonal-latest.tgz
cd globusconnectpersonal-*/  # The version in the directory name (e.g. 3.2.2) depends on the release you downloaded
./globusconnect
```
6. Give a name to your new collection and create it.
7. Click "access data in this collection".
8. On the Globus website, you should see your collection selected. Click "Transfer or Sync to", then "Collection: Search" on the right side.
9. Enter Collection: Narval. Select "Compute Canada: Narval".
10. Log in using the same identity.
11. Create a bookmark for this page and use it to transfer data between the two selected collections.



<a id='scrollTo=9f94af8c-5dce-4327-9580-793e7aaecc22'></a>
## 2. Structure and Datasets in Narval

<a id='scrollTo=1601e374-1d76-4216-b7f3-0426c30add20'></a>
### a) Storage and file management on Narval
- **Overall structure:** \
See the structure of your `$HOME (~)`:

![Tree_Narval](tree_Narval.png)

See the structure of our common project space (**def-sbouix**):

![ls_projects](ls_projects.png)

Each member has their own folder.  \
You can find different datasets in `data` (see next section). \
Software that is not available as a native module (native modules are listed with `module spider`) is stored in `software`.

- **Use the right storage space for the right task:** \
`/home/<username> (=$HOME)`: source code, small parameter files and job submission scripts.  \
`/home/<username>/projects/def-sbouix`: fairly static data shared between members of the project space (**def-sbouix**). It contains your private folder (`<username>`), shared data and images (`data`), and manually installed software (`software`).  \
`/home/<username>/nearline/def-sbouix`: tape-based filesystem intended for **inactive data**.   \
`/scratch/<username> (=$SCRATCH) or $HOME/scratch`: temporary files and intensive read/write operations on large files (> 100 MB per file). **Automatic purging after 60 days.**  \
`$SLURM_TMPDIR`: large collections of small files (< 1 MB per file), **deleted at the end of the job**.

See best practices and more: https://docs.alliancecan.ca/wiki/Storage_and_file_management
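
To decide which space suits an existing directory, it can help to count how many of its files fall into the size classes listed above. The following is a minimal sketch, not an official tool: the function name `summarize_file_sizes` is hypothetical, and it assumes 1 MB = 1048576 bytes.

```bash
#!/usr/bin/env bash
# Count files by size class to help choose a storage space.
# Assumption: 1 MB is taken as 1048576 bytes and 100 MB as 104857600 bytes.

summarize_file_sizes() {
    local dir="$1"
    local small large total
    # Files strictly smaller than 1 MB: the kind of files suited to $SLURM_TMPDIR
    small=$(find "$dir" -type f -size -1048576c | wc -l)
    # Files strictly larger than 100 MB: the kind of files suited to $SCRATCH
    large=$(find "$dir" -type f -size +104857600c | wc -l)
    # All regular files (file names containing newlines would be miscounted)
    total=$(find "$dir" -type f | wc -l)
    echo "total=$((total)) small=$((small)) large=$((large))"
}
```

For example, `summarize_file_sizes /scratch/<username>/your_data` prints one line with the three counts for that directory.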

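- **Safely move a directory with symlinks by making an archive and extracting it:**
```bash
diskusage_report  # Check how much disk space is used
cd /scratch/.../your_data
mkdir -p /project/.../your_data  # Create the destination (same absolute path as in the next line)
tar cf - ./* | (cd /project/.../your_data && tar xf -)  # Archive to stdout and extract at the destination
```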

Documentation: https://docs.alliancecan.ca/wiki/Scratch_purging_policy
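
To see why the archive-and-extract pipe is safe for symlinks, the sketch below reproduces it on two throwaway directories and checks that a link arrives as a link. The directories and file names are hypothetical stand-ins for the scratch and project paths; it archives `.` rather than `./*` so that hidden files are included too.

```bash
#!/usr/bin/env bash
# Hypothetical temporary directories standing in for the scratch and project paths
src=$(mktemp -d)
dst=$(mktemp -d)

# Build a small tree: one regular file and one relative symlink pointing to it
echo "example data" > "$src/data.txt"
ln -s data.txt "$src/link.txt"

# Same pipe as above: tar stores the link itself, not a copy of its target
(cd "$src" && tar cf - .) | (cd "$dst" && tar xf -)

# Check that the link is still a link and still resolves at the destination
if [ -L "$dst/link.txt" ] && [ "$(cat "$dst/link.txt")" = "example data" ]; then
    echo "symlink preserved"
fi
```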

<a id='scrollTo=1601e374-1d76-4216-b7f3-0426c30add20'></a>
### b) Datasets in data
- AMPSCZ:
- HCPEP:  
- HCP-YA:  
- HCP-YA-1200:  
- HOA2_100_Subcortical:  
- NIMH_HC:

<a id='scrollTo=f5f45d3e-c3a6-4d7d-a4cb-3723185a4714'></a>
## 3. Basic commands on a Narval node

- **List all the modules that can be loaded, and load a specific version:**
```bash
module spider
module load freesurfer/5.3.0
```
Documentation: https://lmod.readthedocs.io/en/latest/135_module_spider.html

- **Launch interactive jobs:**
```bash
salloc --x11  -n 4 --mem-per-cpu=4000 --account=def-sbouix
```
Documentation: https://docs.alliancecan.ca/wiki/Running_jobs

- **Execute a bash script, determine the status of the job, and display the output file:**
```bash
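# sbatch options must come before the script name; anything after it is passed to the script as arguments
sbatch --output=outputfilename.out --time=00:30:00 --account=def-sbouix simple_job.sh
sq  # Show the status of your jobs
cat outputfilename.out  # Display the output once the job has written it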
```
See next section for examples. \
Documentation: https://docs.alliancecan.ca/wiki/What_is_a_scheduler%3F
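
The file `simple_job.sh` used above is not shown in this tutorial. As a minimal sketch of what such a script could look like, the block below writes a hypothetical version with the same settings expressed as `#SBATCH` directives, then checks its syntax without submitting anything. The memory request and the `echo` workload are placeholders, not the lab's actual script.

```bash
#!/usr/bin/env bash
# Write a minimal, hypothetical simple_job.sh and check its syntax without submitting it.
job_script="simple_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --account=def-sbouix
#SBATCH --time=00:30:00
#SBATCH --output=outputfilename.out
#SBATCH --mem-per-cpu=4000

# Placeholder workload: replace with your own commands
echo "Job running on $(hostname)"
EOF

# bash -n parses the script without executing it
bash -n "$job_script" && echo "syntax OK"
```

With the directives inside the script, `sbatch simple_job.sh` is enough; options given on the `sbatch` command line take precedence over the `#SBATCH` lines.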

- **Use Apptainer to run a program in a container:**
```bash
salloc --x11  -n 4 --mem-per-cpu=4000 --account=def-sbouix #Don't stay on a login node
module load apptainer
apptainer run -C -B /project -W ${SLURM_TMPDIR} myimage.sif myprogram
```
See next section for examples. \
Documentation: https://docs.alliancecan.ca/wiki/Apptainer

<a id='scrollTo=3877d880-4c49-4093-9b69-3cdfe6462132'></a>
## 4. SLURM job templates
### a) Freesurfer 7.4.1 container
<ins>Example you can try directly on Narval</ins>:
```bash
# Submit the job script, giving it the script to run inside the container as its argument
sbatch ~/projects/def-sbouix/software/Freesurfer7.4.1_container/freesurfer741_job.sh ~/projects/def-sbouix/software/Freesurfer7.4.1_container/test_freesurfer.sh
# Once the job has finished, remove the test output
rm -r ~/projects/def-sbouix/software/Freesurfer7.4.1_container/bert
```
The first script (**freesurfer741_job.sh**) runs the second one (**test_freesurfer.sh**) inside a container built from fs741.sif. \
The second script calls the Freesurfer command **recon-all** to run the -autorecon1 steps on the test image /usr/local/freesurfer/subjects/bert/mri/orig/001.mgz. \
**You can copy both scripts to your own folder and adapt them to your needs.** \
They are also available here: https://github.com/Neuro-iX/Tutorials/tree/main/Tutorial_3_Narval_CCDB/scripts

<ins>Remark 1</ins>: It is not possible to display an image in **freeview** from a script. \
Use an interactive job instead:
```bash
salloc --x11  -n 4 --mem-per-cpu=4000 --account=def-sbouix --time=2:59:0
module load apptainer/1.1.8
apptainer run --bind /lustre06/project/6074560:/mnt ~/projects/def-sbouix/software/Freesurfer7.4.1_container/fs741.sif freeview
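# Alternative: open a shell in the container, start freeview from it, then quit with Ctrl+c and exit
#apptainer shell --bind /lustre06/project/6074560:/mnt ~/projects/def-sbouix/software/Freesurfer7.4.1_container/fs741.sif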
#freeview
#Ctrl+c
#exit
```
To access $HOME/projects/def-sbouix when **opening an image in freeview**, go to **Computer -> /mnt**.


<ins>Remark 2</ins>: In freesurfer741_job.sh, the #SBATCH directive --ntasks=x, or the directives --nodes=y and --ntasks-per-node=z used together, mean that the second script will be **executed x or y*z times**, respectively (x=y=z=1 by default).

<ins>Remark 3</ins>: In freesurfer741_job.sh, the #SBATCH directive **--time=0:01:00 has to be changed** according to the time needed to run both scripts.  \
Requesting more than 2:59:59 will make the SLURM job wait longer in the queue before it is launched.
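
When picking a `--time` value, a quick check against the 2:59:59 threshold from Remark 3 can be sketched as follows. The helper names `walltime_to_seconds` and `check_walltime` are hypothetical, and the sketch only handles the `H:MM:SS` form, not the other time formats that SLURM accepts.

```bash
#!/usr/bin/env bash
# Convert a walltime written as H:MM:SS to seconds.
walltime_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that values such as 08 or 09 are not parsed as octal
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Report whether a requested time stays within the 2:59:59 threshold.
check_walltime() {
    local limit=$(( 2 * 3600 + 59 * 60 + 59 ))
    if [ "$(walltime_to_seconds "$1")" -le "$limit" ]; then
        echo "short"
    else
        echo "long"
    fi
}

check_walltime "0:01:00"   # within the threshold
check_walltime "12:00:00"  # above the threshold
```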
