# Cluster Computing 

Physics and astronomy often involve large datasets, a large number of computations, or both, for which simply running code on your personal machine will not suffice. 
You could need more firepower (e.g. more cores), more memory or just different processors (e.g. GPUs). This is where computer clusters come in. You take your code, stick it on a cluster and submit jobs to be run on the cluster remotely. Here, we'll go over the basics of how to do that. 

## Connecting to a cluster

In [Section 2](https://github.com/jeffiuliano/Penn-Summer-Computing-Training/blob/main/2_ssh_and_scp/SSH_SCP_Workbook.ipynb), we learnt to securely interact with a non-local machine. To connect to most clusters, we will use SSH as described there. For example, for those of you using *marmalade*, you will do 

`ssh username@marmalade.physics.upenn.edu`

- Note that to connect to *marmalade*, you must be on a Penn secured network (e.g. AirPennNet wi-fi) or VPN into Penn. To use the Penn VPN, you need a PennKey (UPenn username and account) and the associated account setup. For details of how to VPN into Penn, go [here](https://www.isc.upenn.edu/how-to/university-vpn-getting-started-guide). 
- Note also that *marmalade* access is granted by your PI. Please bother them to get access. 

This will then prompt you for a password. The password for your account should be the same as the password associated with your PennKey. Remember that your terminal will not indicate that you are typing while you enter your password. If you are successful, the terminal will print out something like:

`Last login: Wed May 4 15:58:52 2022 from xx.xxx.xxx.xxx`

You are now on the **head node / login node** on *marmalade*. 

### Head / Login Nodes

#### What is a head node and why will everyone get mad at you if you mess with it?

Most clusters are set up such that you land on a *head or login node* when you connect to them. This is not where actual jobs are run, and this node will not perform your computations. Its purpose is to facilitate those computations being run on a *compute node*. **Do not use the login node to run scripts.** You might wind up preventing other people from logging in by consuming resources, and this node will not have enough memory to support large jobs anyway. **This is how you make everyone in your research group and beyond angry with you.** 

Things it is okay to use the login node for:
- view and edit scripts
- view output
- perform git actions 
- submit and manage jobs
- spy on who else is using the cluster resources 

Things I will yell at you for using the login node for:
- running jobs
- debugging scripts
- I'll probably add some more things here

Grey area:
- managing your python environment
- installing other packages 
- installing code 

Note that for the "grey area" points, it's still better to perform these actions on a compute node if you can, because ultimately, those are the nodes that will be running your jobs, not your login node. For example, *marmalade* has AMD nodes as well as Intel nodes. Use the specific compute nodes that you will run your jobs on to compile your code, because there are some differences between the two types of nodes. 

#### Cores vs. Nodes

A **node** is effectively a self-contained computer. It has some memory, input/output and storage. It also has processors. The processors are sometimes referred to as **cores**, but usually, each processor is made up of a couple of cores. 

The cores then share everything the node has - they share memory, I/O and storage. But for parallelisable code, you can parallelise across these cores and make them simultaneously run tasks. 

For *marmalade*, 
- astro has 3 AMD nodes which each have 64 processors or a total of 128 cores
- CM group has Intel nodes, you should bother them for their details 
- CM also has some GPUs 
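
Once you are on a node, you can check how many cores it actually offers you. The sketch below uses the standard Linux `nproc` utility; it assumes a Linux node with GNU coreutils, which is typical for clusters but not confirmed here for *marmalade*. The variable names are made up for illustration.

```bash
# Cores this shell is allowed to use (can be fewer than the node has,
# for example inside a job that was only given part of a node)
available=$(nproc)

# Cores installed on the node, whether or not you may use them
total=$(nproc --all)

echo "This shell can use ${available} of ${total} cores"
```

Running this on the login node and again inside an interactive session on a compute node is a quick way to see the difference between the two.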

### Home 

The login node will take you to your home directory. This is where you can install your code, edit your `.bashrc`, direct your output. The home directories on *marmalade* are backed up every 24h. 

### Scratch

Depending on the cluster, a separate location called *scratch* is preferred for runtime output. This has faster input/output than the *home* directory. If your job reads or writes often, it's better to send stuff to scratch as opposed to home. 

For *marmalade*, the scratch directories are attached to the compute nodes. So if your job was running on `node11`, say, then the scratch on `node11` will hold your output. The jobs I run on *marmalade* (*m* for short) are basic MCMCs that don't need fast I/O, so I just write to home. 
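
If you do use scratch, a common pattern is to write there during the run and copy what you want to keep to a persistent location before the job ends. The sketch below only illustrates that pattern: `SCRATCH_DIR` and `RESULTS_DIR` are hypothetical variables, not something the cluster sets for you, and they default to throwaway temporary directories so the snippet runs anywhere. Ask your cluster admins for the real scratch path.

```bash
# Hypothetical locations; replace with the real scratch path and a
# directory in your home (or data) area.
SCRATCH_DIR="${SCRATCH_DIR:-$(mktemp -d)}"
RESULTS_DIR="${RESULTS_DIR:-$(mktemp -d)}"

# Write runtime output to a per-run directory on scratch
run_dir="$SCRATCH_DIR/run_$$"
mkdir -p "$run_dir"
echo "step 1 done" > "$run_dir/chain.txt"   # stand-in for your job's output

# Copy what you want to keep somewhere persistent, since node-local
# scratch stays on the node the job ran on
cp "$run_dir/chain.txt" "$RESULTS_DIR/"
echo "Saved results to $RESULTS_DIR"
```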

### Data

Again, depending on the setup of the cluster, sometimes a data directory is provided per group to share and store large quantities of data. 

Astro on *m* doesn't have this, but CM does I think. Bother your CM grad students / postdocs to learn more. 

## Cluster set-up

Now that you're logged in, how do you go about setting up your code and ensuring it runs correctly? There are a few simple places to start:
- if it's your own, personal code, you can scp it onto the cluster 
- git clone from a repository 

### Loading available software

You might also need modules to be able to run your code, e.g. an installation of C++ or fortran or python. Most clusters will have installations of these already in place, available for you to use. A good place to start looking for these is 

`module avail` 

This prints out all available pre-installed software that you can just load onto your profile on the cluster, e.g. with 

`module load git/xx.xx`

You can then check that you indeed have access to git now with 

`which git`

This should print some location of where the git you are using lives `...bin/git`

Useful modules on *m* include a few versions of GCC, git, anaconda, valgrind ... The cluster helpdesk (marmalade-manager@sas.upenn.edu) is usually responsive and will install more software to add to the list of available modules if you make a good case for it (perhaps even if you make a bad one). 

### Saving cluster set-up

Once you know which modules you will need and have compiled your codes based on these modules, you should save this set-up. 
Check which modules you currently have loaded by doing 

`module list`

Then save these in your `.bashrc` or your `.bash_profile` to ensure they are loaded every time you log in:

`module load xxx/xxx.xx`
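
In practice, the lines in your `.bashrc` might look like the fragment below. The guard around the `module load` is an optional extra and an assumption on my part, not something *marmalade* requires: it simply skips the load on machines where the `module` command doesn't exist, so the same file can be reused elsewhere. The module name and version are the same placeholders as above.

```bash
# ~/.bashrc (fragment)
if command -v module >/dev/null 2>&1; then
    # Placeholder name/version: use what `module list` printed for you
    module load xxx/xxx.xx
fi
```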

You should also save your anaconda environment (someone else will tell you more about that). This records the version numbers of your python packages, so you can later load the exact ones against which you compiled your code and know that it works. To export the package info for the active environment, do 

`conda env export > environment.yml`

These are important programming practices and will save you SO much time when things invariably break or you have to move your work to a different cluster. 

### Source cluster set-up

When you log in, the cluster usually sources your `.bashrc` and `.bash_profile`. This loads all your settings, as long as you saved them there. 

Some clusters are set up such that you don't need to reload these settings when you move to a compute node to run a job interactively or when you submit a job. Some aren't. 

For *m*, I need to source my `.bashrc` and `.bash_profile` every time I run an interactive job, and have gotten into the habit of doing the same for submitted jobs:

`source .bashrc` 


## Running jobs 

Ideally, you'd like to test your code before you try to submit a job and have to wait for it to begin, compute and possibly fail because of some bug. Clusters often have debug queues dedicated to this purpose that you can submit jobs to or directly interact with to run your code. 

Below I'll cover how to do both **assuming the cluster uses the SLURM queuing system.** Commands for this usually start with "s" and you can identify them below. 

### Interactive sessions

An interactive session is how you usually run code on your own machine. You run a command, wait for it to finish while runtime output is printed to your screen, and once it's finished, you run the next command. 

You can also run interactive sessions on clusters. The exact commands sometimes differ, but the gist is 

`srun -N <number of nodes requested> -c <cpus per task> -n <number of tasks> -p <name of partition / queue> -t <time in minutes> --pty bash` 

where you're asking SLURM to run as soon as possible (you can add a delay with further arguments) on `N` nodes, with `c` cpus per task and `n` tasks, on the `p` queue, for `t` minutes, and to open a bash shell for you. This is an overspecified problem: N and c, or N and n, or n and c are enough. 

Alternatively, you can give a similar command with an executable at the end instead of requesting a bash shell. 

Usually, I just want a few cores for a small time to debug. I'm astro so on *m* I use the *highcore* queue and ask for a bash shell:

`srun -n 4 -p highcore -t 60 --pty bash` 

There are many more options you can give to specify what kind of job you want to run. [Here](https://slurm.schedmd.com/srun.html) is a pretty exhaustive list of them. You can also try looking at the documentation for other clusters because, as long as they too use SLURM, the commands should transfer. 

NERSC should have its own documentation, but unfortunately *m* doesn't have any. 

### Submitting jobs

Another, usually better way to run things is to submit a job to the queuing system. The system schedules your job, starts it without further intervention and, if you ask it to, emails you when it ends. You can then go do other things and still delude yourself into believing you're being productive. I highly recommend it. 

#### SLURM specifications

To do this, we write a job submission script that specifies, for example, the queue to submit to, how many nodes you want, how many cores, for how long, when to email you about the job, etc. Here's an example:
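
The script itself is not reproduced in this copy of the tutorial, so what follows is a sketch reconstructed from the description in this section rather than the original file. The option spellings are standard `sbatch` long options, the email address is a placeholder, and the one-day limit is written in SLURM's `days-hours:minutes:seconds` form.

```bash
#!/bin/bash
#SBATCH --job-name=lcdm
#SBATCH --output=j_lcdm_base_mcmc.txt
#SBATCH --ntasks-per-node=6
#SBATCH --cpus-per-task=4
#SBATCH --nodes=1
# 1 day, written as days-hours:minutes:seconds
#SBATCH --time=1-00:00:00
#SBATCH --partition=highcore
#SBATCH --mail-type=ALL
# Placeholder address: replace with your own
#SBATCH --mail-user=your_pennkey@sas.upenn.edu
```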

Here, I'm doing the following:
- specifying that this is a bash script
- setting job name = lcdm
- sending output to the file j_lcdm_base_mcmc.txt in the same directory as the one from which I launch the script 
- asking for 6 tasks per node
- and 4 cpus per task. So in total, this takes up 24 cpus. 
- for the queue I submit to, 24 cpus easily fit into one node, so I don't need more than that. This also puts all my tasks on the same node, important for my specific code. 
- asking this to run for 1 day
- on the highcore queue
- requesting an email about everything - this includes job beginning, finishing, timing out, or failing
- to my email address. For marmalade, I think this has to be a Penn address

Any line that begins with `#SBATCH` is read by SLURM. All others are taken to be bash commands.
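
The resource arithmetic in the list above can be checked with a few lines of shell. This is only a sanity-check sketch using the numbers from this example; the variable names are made up for illustration and SLURM does not read them.

```bash
nodes=1
tasks_per_node=6
cpus_per_task=4

# Tasks across all nodes, then cpus across all tasks
total_tasks=$((nodes * tasks_per_node))
total_cpus=$((total_tasks * cpus_per_task))

echo "${total_tasks} tasks x ${cpus_per_task} cpus per task = ${total_cpus} cpus"
```

With the values above this gives 6 tasks and 24 cpus, the numbers quoted in the list.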

#### Which queue is for you? 

Which queue you submit to is important. You might request too many resources and your job will never be scheduled, you might not have permissions to submit to certain queues, etc., all of which will delay science. Or make people mad. Or both! 

For *marmalade*, if you're astro, you can submit to:
- highcore (AMD nodes)
- low_compute (Intel nodes)
- low_gpu
- low

For CM on marmalade, please check with folks in your group. Besides the partitions of your group, you can also submit to
- low_highcore 

This is just so astro folk have principal access to nodes they paid for and the same for CM. 

For *NERSC*, check with your group. 

#### Job commands

The commands above tell SLURM *how* I want to run my job. Here's the rest of the script telling it what I actually want to run:
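
As with the header, the original commands are not reproduced in this copy, so this is a sketch reconstructed from the description in this section. The directory is the one mentioned in the text; `my_script.py` is a hypothetical name standing in for the real script, and launching it through `python` is an assumption.

```bash
# Start from the home directory
cd ~

# Load modules, python environment and PATH settings
source .bashrc
source .bash_profile

# Sanity check: print which executables the job will actually use
which python
which gcc
which mpirun

# Move to the directory the run should happen in
cd /home/karwal/some/directory/

# Launch 6 MPI processes, matching the 6 tasks requested from SLURM.
# my_script.py is a hypothetical name, not the original script.
mpirun -np 6 python my_script.py
```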

Let's use knowledge you've hopefully built over this tutorial and see that here I'm doing:
- change directory to my home directory. That's where `~` takes you
- source my bashrc and bash_profile. This loads all necessary modules, sets my python environment, and makes sure anything that needs to be on my PATH is added
- then for redundancy and to debug, I check which python, gcc and mpirun the programme will call. If these are different from what I expect, one or more of my environment variables didn't set correctly
- change directory to the one I actually want to run my script in 
- run my parallel script. Note that I'm asking mpirun to launch 6 processes here, which matches what I told SLURM: 6 tasks per node on 1 node, so 6 x 1 = 6 tasks. 

Remember, the job output file that we specified earlier will be in whatever location you submitted your job script from, not in `~` or in `/home/karwal/some/directory/`. 

So now we have our job script. Let's save this in some file `job_script.sh`.  We submit simply by 

`sbatch job_script.sh`

That's it. The job is submitted. SLURM will take note of the resources you have requested and will allocate them when it can. Your job will begin, you'll get an email about it beginning and again when it finishes (or crashes) (or runs out of time). 
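
When a submission succeeds, `sbatch` prints a line of the form `Submitted batch job <ID>`. If you want to reuse that ID in later commands, you can pull it out of the line. In the sketch below a sample line with a made-up job ID stands in for the real output so that the snippet runs anywhere; the variable names are illustrative.

```bash
# On the cluster you would capture the real output instead:
#   submit_output=$(sbatch job_script.sh)
submit_output="Submitted batch job 123456"

# Keep only the last word of the line, which is the job ID
job_id=${submit_output##* }
echo "Job ID: ${job_id}"

# The ID can then be handed to other SLURM commands, for example:
#   scontrol show jobid "$job_id"
#   scancel "$job_id"
```

`sbatch --parsable job_script.sh` is an alternative that prints the ID without the surrounding text.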

#### Submitting batch jobs 
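
One common way to submit a batch of similar jobs in SLURM is a job array, where a single script is run several times with a different index each time. The sketch below assumes that is the kind of batch you want; the model names, array range, time limit and output filename are all hypothetical. The index defaults to 0 so you can also run the script by hand to test it.

```bash
#!/bin/bash
#SBATCH --job-name=lcdm_array
# %A is the array job ID, %a is the index of the individual task
#SBATCH --output=j_array_%A_%a.txt
# Three tasks, with indices 0, 1 and 2
#SBATCH --array=0-2
#SBATCH --ntasks=1
# Time limit in minutes
#SBATCH --time=60
#SBATCH --partition=highcore

# Hypothetical list of runs; each array task picks one by its index
models=(lcdm wcdm owcdm)

# SLURM sets SLURM_ARRAY_TASK_ID for each task in the array
task_id=${SLURM_ARRAY_TASK_ID:-0}
model=${models[$task_id]}

echo "Task ${task_id} would run model ${model}"
```

Submitting this once with `sbatch` queues one task per index, and each task shows up separately in `squeue`.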

## Useful SLURM commands

I have hopefully driven the point home that a cluster is a community resource. If you abuse that resource or don't follow rules, people will get mad at you. 
And what do you do to make them less mad? You cancel your offending jobs! 

`scancel <job ID number>` will terminate that job. 

`sinfo` tells you about what resources are in use on the cluster. Nodes can be:
- alloc for completely allocated 
- mix for some allocated and some free cores on the node 
- drain for a node being shut down for maintenance, but the jobs on the node will finish first 
- down for a node out of commission and
- idle for free nodes! That await a purpose in life! And that purpose is SCIENCE! 

`squeue` can be used to get info on jobs currently running on the cluster. You can add optional arguments to this to get more detailed info, for example 

`squeue -u username` tells you what jobs that user currently has running or waiting in the queue. 

I usually define a new command in my .bash_profile as 

`alias sqme="squeue -u karwal"` for quickly checking on my own jobs. 

You can similarly do 

`squeue -p highcore` to check on a specific partition and so on. 


Lastly, a really useful command to check when your job is scheduled to start, end and other relevant details is 

`scontrol show jobid ####`

This fills in a few seconds after submission, once SLURM has had a chance to schedule the job. 


## Bonus: open remote text files with atom

[Atom](https://atom.io/) is a text editor that has a tonne of useful packages and is very customisable. 

[One](https://atom.io/packages/hydrogen) of these incredibly useful packages lets you run python code line by line. 

[Another](https://atom.io/packages/ftp-remote-edit) lets you SFTP more elegantly, and view and edit the text files on a cluster in Atom, just as if they were on your machine. 

To set this up, first download and install [atom](https://atom.io/).

Then, from its package manager, get the ftp-remote-edit package. 

Hitting control+space on your laptop should then prompt you to put in a master password. Don't forget it, I don't know how to help you if you do. 



Then, on the left "Remote" column that comes up, right click and Edit servers. Fill in the details, usually for protocol you'd select SFTP for secure file transfer protocol. Here's an example:

![ftp_remote_edit](ftp_remote_edit_example.png) 

And voila, you're done. 

You can now access remote files and edit them with ease locally. 