# How to Run a Job on the Sherlock Computer Cluster

### Logging onto Sherlock From Terminal

Open a new terminal window on your computer. On a Mac, press 'cmd' + 'space'. A Spotlight Search bar will pop up; typing 'terminal' brings up an icon of a small black square that you can click to open the Terminal application.

The next step is to ssh (remote login) to the cluster. Type the following command into the command line prompt on your computer. 

In [None]:
ssh <your SUnet ID>@sherlock.stanford.edu

It will prompt you to enter the password associated with your Stanford email and to select a method for two-factor verification (Duo Mobile).

### Navigating the Login Node

You are now on the login node on Sherlock. You generally don't want to run any code on the login node because it has limited memory and is shared, so heavy work there will bog down performance for other users. From the login node, we can move to the scratch node (the directory stored in the variable $SCRATCH). All data on the scratch node is purged after 90 days, so be careful to move important data either onto your local machine or onto more permanent storage (more on this later).

Type the following command into the command line prompt to navigate to the scratch node.

In [None]:
cd $SCRATCH

It should now say '/scratch/users/\<your SUnet ID>' to the left of your command line prompt.
You're now ready to begin creating files and scripts.
    
To create a new folder, use the command 'mkdir' + '\<folder name>'.

In [None]:
mkdir summer_2023

You can then use the command 'cd' to change directory into that (or any other existing) folder.
This won't work for files (so 'cd python_script.py' won't work).

In [None]:
cd summer_2023

To navigate out of the directory you are currently in, use 'cd ..' to move up to the parent folder, or 'cd' without any arguments to return to your home directory on the login node.

In [None]:
cd .. # takes you up one level, to the parent folder
cd # takes you to your home directory on the login node

To find the full path to the directory you are currently in, use the command 'pwd'.

In [None]:
pwd
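As a minimal sketch, the navigation commands above can be tried out together. To keep the example runnable on any machine, it works inside a temporary folder created with `mktemp -d` (an assumption for illustration; on Sherlock you would start from `$SCRATCH` instead).

```shell
# Stand-in for $SCRATCH so the sketch runs anywhere (assumption, not a Sherlock path)
workdir="$(mktemp -d)"

cd "$workdir"        # move into the stand-in scratch folder
mkdir summer_2023    # create a new folder
cd summer_2023       # move into it
pwd                  # prints the full path, which ends in /summer_2023
cd ..                # back up one level
pwd                  # prints the path of the stand-in scratch folder again
```

Running `pwd` after each `cd` is a quick way to confirm where you are before creating or deleting anything.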

You can use the command 'vim' to create or open a file (.txt, .csv, .py, etc.) in a text editor.

In [None]:
vim data.csv
vim notes.txt
vim data_analysis.py

In vim, type 'i' to enter 'insert mode' to be able to enter text.<p>
When you are done typing, press 'esc' to exit insert mode.<p>
To save your work type ':w'.<p>
To save your work and exit the text editor type ':wq' or ':x'

You can use the command 'cat' to print the contents of a file to the terminal without opening it in an editor.

In [None]:
cat notes.txt

You can use the command 'ls' to print all files and folders in the current directory.

In [None]:
ls

You can use the command 'rm' + '\<filename>' to delete a file. 
This is permanent and cannot be undone. 

In [None]:
rm data.csv
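The file commands above can be combined into a short sketch. It creates a file with output redirection (`>`) instead of vim, which is an assumption made so the example runs without an editor, and it works in a temporary folder so nothing real is deleted.

```shell
# Work in a throwaway folder (assumption: keeps the example away from real files)
cd "$(mktemp -d)"

printf 'first note\nsecond note\n' > notes.txt   # create a small file without an editor
cat notes.txt                                    # print its contents to the terminal
ls                                               # notes.txt appears in the listing

rm notes.txt                                     # permanently delete the file
ls                                               # the listing is now empty
```

Checking with `ls` before and after `rm` is a cheap safeguard, since a deleted file cannot be recovered.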

You can use 'ctrl' + 'c' to stop/interrupt a process that is currently running.

The command 'exit' will end your session and log you out of the computer cluster.

In [None]:
exit

### Moving Files to/from Sherlock

This process is done from your local machine, either before logging into Sherlock or from another terminal window (on a Mac, additional tabs can be opened in the terminal by pressing 'cmd' + 't'). <p>
Your home directory on the login node only has ~15 GB of storage, which can easily be eaten up by a large FITS file. I recommend sending files directly to the scratch node. <p>
The format to send a file from your local machine to Sherlock is: <p> scp \<filename> \<your SUnet ID>@login.sherlock.stanford.edu:/scratch/users/\<your SUnet ID> <p>
To send a file from Sherlock to your local machine, reverse the order and specify the location to store the file (e.g. in Downloads): <p>
scp \<your SUnet ID>@login.sherlock.stanford.edu:/scratch/users/\<your SUnet ID>/\<path to file> \<local path to store file>

In [None]:
# send from local machine to scratch
scp test_data.fits <your SUnet ID>@login.sherlock.stanford.edu:/scratch/users/<your SUnet ID>

# example of sending from folder summer_2023 within scratch to local machine
scp <your SUnet ID>@login.sherlock.stanford.edu:/scratch/users/<your SUnet ID>/summer_2023/test_data.fits ~/Downloads
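Because the SUNet ID appears twice in every scp command, it can help to build the command from variables first. The sketch below only prints the commands rather than running them, so it works without a Sherlock account; the ID `jdoe` and the variable names are hypothetical placeholders.

```shell
sunet_id="jdoe"   # hypothetical SUNet ID; replace with your own

remote="${sunet_id}@login.sherlock.stanford.edu"
scratch_dir="/scratch/users/${sunet_id}"

# Local machine -> scratch
upload_cmd="scp test_data.fits ${remote}:${scratch_dir}"

# Folder summer_2023 within scratch -> local Downloads folder
download_cmd="scp ${remote}:${scratch_dir}/summer_2023/test_data.fits ${HOME}/Downloads"

# Print the commands for review instead of executing them
echo "$upload_cmd"
echo "$download_cmd"
```

Once the printed commands look right, they can be pasted into a terminal on your local machine.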

### Loading Modules on Sherlock

Sherlock is not guaranteed to have all of the modules or libraries that you need. <p>To look up the different versions of a library that are available, use the command 'module spider \<library>'.

In [None]:
module spider python
module spider pandas
module spider astropy
module spider numpy

It will list the available versions of the library, and you can use the command 'module load' (or its shorthand 'ml') to load them.

In [None]:
module load python/3.9.0
ml py-pandas/1.3.1_py39
ml py-numpy/1.20.3_py39
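After loading, it can be worth confirming what is actually active. The following is a rough sketch: it first checks that the `module` command exists (it will not on most personal machines), then loads the Python version from the example above and lists the loaded modules. The variable `modules_available` is a hypothetical name used only for illustration.

```shell
# 'module' only exists on systems that provide it (such as Sherlock),
# so check for it before trying to load anything
if command -v module >/dev/null 2>&1; then
    modules_available="yes"
    module load python/3.9.0   # version taken from the example above
    module list                # show which modules are currently loaded
    python3 --version          # confirm which Python is now active
else
    modules_available="no"
    echo "The 'module' command is not available in this shell."
fi
```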

### Creating an sbatch file to submit a job

Start off by creating a new sbatch file. You can call it whatever you want; ending the name in .sbatch is a helpful convention that makes these files easy to spot.

In [None]:
vim submit.sbatch
vim example.sbatch

In the sbatch file you need to provide information such as: <p> <b>
What partition to run your job on<p>
How much time to give your job to run<p>
Whether you want to receive email updates<p>
Where to send email updates<p>
The number of tasks<p>
CPUs per task<p>
Memory needed<p>
How to save the output file<p>
Any dependencies to load at runtime<p>
The commands you want executed<p></b> <p>
Here is an example sbatch file that will run on the KIPAC partition for up to 2 hours and request 32 GB of memory. It will run the 3 example Python scripts listed.

In [None]:
#!/bin/bash -l
#SBATCH --partition=kipac
#SBATCH -t 2:00:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<your SUnet ID>@stanford.edu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=32G
#SBATCH --output=submit.%j.out

#source ~/setup.sh
ml python/3.9.0
ml py-pandas/1.3.1_py39

python3 clean_data.py
python3 data_analysis.py
python3 correlate_data.py
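As an alternative to typing the file in vim, the same sbatch file can be written from the command line with a here-document. This is only a sketch: the SUNet ID `jdoe` and the variable names are hypothetical, and the file is written to a temporary folder so that nothing is overwritten. It also shows how the total memory request follows from the settings (tasks x CPUs per task x memory per CPU).

```shell
sunet_id="jdoe"        # hypothetical SUNet ID; replace with your own
ntasks=1
cpus_per_task=1
mem_per_cpu_gb=32

# Total memory requested = tasks x CPUs per task x memory per CPU
total_mem_gb=$(( ntasks * cpus_per_task * mem_per_cpu_gb ))
echo "Total memory requested: ${total_mem_gb}G"

# Write the file into a throwaway folder (assumption: avoids clobbering a real file)
sbatch_file="$(mktemp -d)/submit.sbatch"
cat > "$sbatch_file" <<EOF
#!/bin/bash -l
#SBATCH --partition=kipac
#SBATCH -t 2:00:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=${sunet_id}@stanford.edu
#SBATCH --ntasks=${ntasks}
#SBATCH --cpus-per-task=${cpus_per_task}
#SBATCH --mem-per-cpu=${mem_per_cpu_gb}G
#SBATCH --output=submit.%j.out

ml python/3.9.0
ml py-pandas/1.3.1_py39

python3 clean_data.py
python3 data_analysis.py
python3 correlate_data.py
EOF

# List the scheduler directives that were written, as a quick check
grep '^#SBATCH' "$sbatch_file"
```

Keeping the settings in variables makes it easy to see how a change, such as doubling `cpus_per_task`, affects the total memory the job asks for.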

After saving your sbatch file, type 'sbatch' + '\<sbatch filename>' to submit your job to the scheduler.

In [None]:
sbatch submit.sbatch
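When a job is accepted, sbatch prints a line of the form "Submitted batch job" followed by the job ID. With `--output=submit.%j.out` in the sbatch file, `%j` is replaced by that ID, so the ID tells you which output file to look for. The sketch below works on a sample message; the job ID `12345678` is made up for illustration.

```shell
# Sample of the line sbatch prints on success (the job ID here is made up)
submit_msg="Submitted batch job 12345678"

# Keep only the text after the last space, i.e. the job ID
job_id="${submit_msg##* }"

# With --output=submit.%j.out, %j becomes the job ID
log_file="submit.${job_id}.out"
echo "Job ${job_id} will write its output to ${log_file}"

# On Sherlock, the ID can then be used with standard Slurm commands, e.g.:
#   squeue -u "$USER"     # list your queued and running jobs
#   scancel "$job_id"     # cancel the job
#   cat "$log_file"       # read the output once the job has started writing
```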