# Facilitator Speaking Notes

### Overview for Facilitator
Speaking notes for an introduction to Alliance Digital Research Infrastructure (DRI) with workshop participants. Learners should already be familiar with the concept of a terminal and the basics of DRAC's infrastructure. This session will introduce learners to JupyterHub, submitting a slurm job, and running the CLASSIC model.

Instructions for the facilitator are italicized. Code sections indicate commands that the facilitator should enter and execute in their own terminal.

**Note:** This tutorial was designed for use on a Magic Castle virtual cluster.

### Introduction

*Pass out a username and password to each workshop participant; they will need them to log in to JupyterHub.*

We are going to connect to a DRAC cluster using JupyterHub. JupyterHub is a development environment that gives us a more user-friendly interface than the terminal for interacting with the cluster.

*Write the JupyterHub URL on the board and get students to navigate to it*

*Follow along with the instructions below to demonstrate what to do*
<br>You should be directed to a login page. Use the Username and Password provided to sign in. You can leave the "OTP" field blank.

You will be redirected to a "Server Options" page. Update the following fields:
- **Time (hours)**: 3
- **Memory**: 1500

Then click the "Start" button.

You will see a message that your server is starting up. It may take a minute for the server to launch. 

*Give students a quick tour of the JupyterHub interface*

*Get everyone to open a terminal*
*Get them to run basic `ls`, `pwd`, and `cd` commands to familiarize themselves, and point out how they can navigate to different files using the file navigation bar on the left-hand side of the screen.*
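
If it helps to have something on screen, a short sequence like the one below covers all three commands. This is only a suggested warm-up; `start_dir` is a variable name chosen for this example, and the output will depend on each participant's account.

```shell
start_dir=$(pwd)   # remember where we started (variable name chosen for this example)
pwd                # print the folder we are currently in
ls                 # list the files and folders it contains
cd ..              # move up one level
pwd                # confirm that the location changed
cd "$start_dir"    # go back to where we started
```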

*Point out CLASSIC files, and show them around*

### Environment Configuration

We need to configure our environment by loading a few packages into JupyterLab.

I like to think of packages as tools that I can add into my toolbox. Each package is a big or small tool that lets me accomplish a task a lot more efficiently than if I didn't have it. 

Navigate to the "Softwares" menu on the sidebar, search for "scipy-stack/2023b", and load it.

#### Import xarray

We will also need a Python package called xarray, which we have to install ourselves.

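In the terminal, enter
`pip install xarray --no-index`
The `--no-index` flag tells pip not to search the online package index, so it installs from the packages already available on the cluster.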
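
To confirm the install worked, one option is to ask Python to import the package. The sketch below assumes `python3` is the interpreter on your path; `check_import` is a helper name made up for this example.

```shell
# Hypothetical helper: report whether Python can import a package.
check_import() {
    if python3 -c "import $1" 2>/dev/null; then
        echo "$1: available"
    else
        echo "$1: missing"
    fi
}

check_import xarray
```

If the last line reports that xarray is missing, re-run the `pip install` command and watch for error messages.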

### Submitting a job

Next, we're going to send a job request to the server to run the CLASSIC model.

We need to submit our job request to DRAC's scheduling software, called slurm. DRAC needs scheduling software because demand for the cluster can be high while the available resources are finite.

Before we submit our job, let's take a look at what we're asking the computer to run. 

*Navigate to the `classic_submit_dra.sh` script*

The job we're going to submit runs the following bash script. The computer will read and execute each line one by one from top to bottom. 

Let's go through it together. 

`#!/bin/sh`
This tells the computer to run the script with the shell, the program that reads and executes the commands that follow.

```
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:30:00
#SBATCH --mem=2G
```
These lines are directives for the slurm scheduling software. They tell it which resources our job requires, which helps it schedule the job. 

When thinking about scheduling, I like to envision a laundromat.  

<img src="images/laundry.png" alt="Laundromat" width="300"/>

Remember how a cluster is made up of multiple computers? Each computer is called a node. 
So in our laundromat analogy, a node is a single laundromat, and a cluster is a whole city block filled with laundromat after laundromat.

With 
`#SBATCH --nodes=1`
we're telling slurm that we only need to use one node.

Within each node, there are multiple processors that do the actual computations. These are known as cores or CPUs. In our laundromat, each core is an individual washing machine.

With 
`#SBATCH --ntasks-per-node=4`
we're telling slurm that we need to use 4 cores.

To allocate its resources efficiently, slurm needs to know how long you expect your job to run, like the timer on a washing machine.

With
`#SBATCH --time=00:30:00`
we're telling slurm we need 30 minutes to complete our job. The timer uses the format HH:MM:SS.
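
To get a feel for the HH:MM:SS format, here is a small sketch that converts a time limit into minutes. It is purely illustrative and not part of the job script; the variable names are chosen for this example, and it assumes each field is written with two digits.

```shell
time_limit="00:30:00"

# Split the string at the colons.
hours=${time_limit%%:*}
rest=${time_limit#*:}
minutes=${rest%%:*}
seconds=${rest#*:}

# Strip one leading zero so values like "08" are not read as octal numbers.
total_seconds=$(( ${hours#0} * 3600 + ${minutes#0} * 60 + ${seconds#0} ))
echo "$((total_seconds / 60)) minutes"
```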

Finally, slurm needs to know how much memory the job will require. This is the amount of working memory, or RAM, that it takes up on the node. You can think of this as the node's short term memory. When thinking of our laundromat, the RAM is the actual physical space available inside the machines. 

With 
`#SBATCH --mem=2G`
we're telling slurm we need 2 GB of memory for our job.
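
One detail worth showing is that every `#SBATCH` line starts with `#`, so the shell treats it as a comment and only slurm reads it. The sketch below writes a practice script with the same four requests and then runs it directly. The file name `hello_job.sh` and its `echo` line are made up for this example and are not part of the workshop files.

```shell
# Write a hypothetical practice script into the current folder.
cat > hello_job.sh <<'EOF'
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:30:00
#SBATCH --mem=2G
echo "Hello from the job script"
EOF

# Count the resource requests that slurm would read from the script.
grep -c '^#SBATCH' hello_job.sh

# Run it directly: the shell skips every line that starts with '#'.
sh hello_job.sh
```

Run this way, the script only prints its greeting. The resource requests take effect only when the script is handed to slurm.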

##### Small activity

I want you all to think about the following scenario. It's a Thursday night, and your parents are arriving tomorrow for a visit where they'll stay with you. You're frantically cleaning your house, and decide to wash your laundry, towels, and bedsheets at the laundromat down the road. At the last minute your roommate asks you to wash their laundry and sheets too, seeing as "you're already going to the laundromat". 

You only have 3 hours until you want to go to sleep, but you have 5 loads of laundry! What do you do?

*Coach folks into realizing that the most efficient solution is to run 5 loads of laundry in five different machines*

Great, so by dividing our job into smaller loads and assigning each load to its own machine, we were able to complete our job faster. 

In this scenario, what does this represent when working with DRI?

*Answer: parallel processing, although they don't know the term for that yet*
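
For facilitators who want to show the idea live, the sketch below mimics the laundromat in the terminal. Each "load" is just a one-second pause, `wash_load` is a made-up function, and the `&` starts each load in the background so that all five run at once. This illustrates parallelism in general; it is not how CLASSIC or slurm distribute work.

```shell
# Made-up stand-in for one load of laundry: wait one second, then report.
wash_load() {
    sleep 1
    echo "load $1 done"
}

start=$(date +%s)              # current time in seconds
for load in 1 2 3 4 5; do
    wash_load "$load" &        # '&' starts the load without waiting for it to finish
done
wait                           # pause here until every load has finished
end=$(date +%s)

echo "All loads finished in about $((end - start)) second(s)"
```

Removing the `&` runs the loads one after another, which takes about five seconds instead.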

Thinking about CLASSIC, when we run a simulation over a large area, parallel processing makes this process much more efficient. Based on what you've learned about CLASSIC so far, what's one aspect of the model structure that lends itself to parallel processing?

*Hint: Try to think about how the model can be broken into smaller pieces*
*Answer: The gridded structure allows each grid cell to be handled individually*

##### Back to job script

Okay, moving back to our job script, the next part of the code is
```
module load gcc
module load openmpi
module load cdo
module load nco
module load netcdf-fortran-mpi
module load hdf5-mpi
```

This loads a series of packages that CLASSIC needs in order to run.

Lastly we have the line
`CLASSIC/bin/CLASSIC_parallel_intel /home/rlwhall/scratch/test6/job_options.txt 90.5/105.5/30.4/45.5`
which will execute CLASSIC.

This has three parts.
 1. `CLASSIC/bin/CLASSIC_parallel_intel`
    - This tells the computer which file we're running. 
 2. `/home/rlwhall/scratch/test6/job_options.txt`
    - This is a settings file that `CLASSIC_parallel_intel` needs to run. For example, it specifies where CLASSIC can find the input data it needs to make its predictions.
 3. `90.5/105.5/30.4/45.5`
    - This is the geographic range of the area to model with CLASSIC.
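
One way to see how the shell splits that line into parts is to pass the same arguments to a small stand-in function. In this sketch `show_parts` is a made-up name and CLASSIC itself is not run; the function only prints what each argument would be.

```shell
# Made-up stand-in for the CLASSIC executable: print each argument it receives.
show_parts() {
    echo "number of arguments: $#"
    echo "settings file: $1"
    echo "region: $2"
}

show_parts /home/rlwhall/scratch/test6/job_options.txt 90.5/105.5/30.4/45.5
```

The shell splits the line at spaces, so the program name comes first and each remaining piece is handed to the program as an argument.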

##### Submit slurm job

We're ready to submit our job using slurm. 

Open a terminal window, and use `cd` to navigate to the folder that contains the `classic_submit_dra.sh` file. 

Once you're there, enter the following command 
`sbatch classic_submit_dra.sh`

We've now submitted the job to slurm, which will schedule it and run it when resources become available on the cluster. 

We can check on the status of our job using
`sq`
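
When the job is submitted, `sbatch` replies with a line of the form `Submitted batch job <ID>`. The sketch below pulls the ID out of such a line and builds the name of the file where slurm writes the job's output by default, unless the script asks for a different name. The ID 12345 is invented for this example.

```shell
message="Submitted batch job 12345"

job_id=${message##* }               # keep everything after the last space
output_file="slurm-${job_id}.out"   # slurm's default output file name

echo "Job ID: $job_id"
echo "Output file: $output_file"

# On the cluster, the ID can be passed to squeue to check on just this job:
#   squeue -j "$job_id"
```

Once the job has started, opening that output file is a quick way to see any messages the job has printed so far.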

### Break

*Break while the model runs (30 minutes)*