# Getting Started on the HPC

### Questions:
- What is the HPC?
- How can I enroll for access?
- How can I log in?
- How can I access the class code repository on GitHub?

### Objectives:
- Log in to the HPC
- Download or "clone" the class GitHub repository

### Keypoints:
- We are going to use the HPC to run large-scale bioinformatics jobs.
- We are going to use GitHub as a way to share the latest code and Jupyter notebooks for our project.
- Let's get started!


## What is the HPC?

HPC is an acronym for "high-performance computing," and it generally means using a cluster of computers. Students have access to several clusters (puma, ocelote, elgato) at the University of Arizona. To use a cluster, you usually submit a batch job, along with a description of the resources you need (e.g., memory, number of CPUs, number of nodes), to a scheduler that starts your job when the resources become available. We will discuss schedulers, and in particular SLURM, the scheduler used on UA clusters, later in this class as we dive deeper into using the HPC and building bioinformatics pipelines. For now, we will get started with the command line via the HPC web-portal.

When you log in to the HPC web-portal and ask for "shell access" (see steps below), you are placed on the head node. **YOU ARE NOT ALLOWED TO RUN COMPUTE-INTENSIVE JOBS ON THE HEAD NODE**. When we start running real bioinformatics jobs, you will run them on a compute node by submitting the job through the SLURM scheduler (more on that later!). Because we are starting with simple Unix commands and scripting that are not computationally intensive, we will log in to the head node to do a few simple tasks.
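As a preview of the batch jobs described above, a minimal SLURM script might look like the sketch below. The file name `hello_hpc.sh` and the resource values are assumptions chosen for illustration. The account `bh_class` and the `standard` partition are taken from the settings used later in this lesson and may differ for your own work.

```shell
#!/bin/bash
# Hypothetical batch script (hello_hpc.sh); every value below is illustrative.
#SBATCH --job-name=hello_hpc
#SBATCH --account=bh_class      # PI group used in this lesson (assumed to apply here)
#SBATCH --partition=standard    # queue named in this lesson
#SBATCH --nodes=1               # number of nodes
#SBATCH --ntasks=1              # number of tasks (CPUs)
#SBATCH --mem=1G                # memory
#SBATCH --time=00:05:00         # wall-clock limit (HH:MM:SS)

# Record and print the name of the machine that runs this script.
node_name="$(uname -n)"
echo "Running on $node_name"
```

You would submit a script like this from the shell with `sbatch hello_hpc.sh`; the scheduler then runs it on a compute node rather than on the head node.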

## Steps for HPC access

To get started, please make sure that you have completed these two steps:

1. Enroll in NetID+ to access HPC systems: https://webauth.arizona.edu/netid-plus/

2. Create an HPC account (if you don't already have one): https://account.arizona.edu

I have already added each of you to the bh_class group on the HPC.


## Logging into the HPC Online Portal at the University of Arizona

Once those steps are done, we can check that you are able to access the HPC.

1. Go to the HPC web-portal: https://ood.hpc.arizona.edu/pun/sys/dashboard and log in with your UA NetID and password.

2. On the top menu bar select "Clusters" and "_Shell access" from the pull-down list. 

![image.png](attachment:image.png)

3. A shell terminal will open for you in a new window. Type "ocelote" at the prompt and press Enter. You will be placed in your home directory automatically.

![image-2.png](attachment:image-2.png)

Once you log in, the system places you in your home directory. In my case, this is "/home/u20/bhurwitz".
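To confirm where you landed, a couple of standard commands can be run at the prompt. This is a minimal sketch; the variable name `current_dir` is only for illustration, and the output will show your own path rather than mine.

```shell
# Move to your home directory (this changes nothing right after logging in).
cd ~

# Save and print the directory you are in.
current_dir="$(pwd)"
echo "Current directory: $current_dir"

# List the files and folders in that directory.
ls
```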


## Downloading the Class GitHub Repository

Now that you can access the HPC, you are ready to download or "clone" the class GitHub repository to your home directory. As we go through the project, I will add exercises and assignments to our class GitHub repository. Before you start a new exercise or assignment, you will need to download the latest version of the repository with any new or modified material.

Let's walk through how to "clone" the class GitHub Repository, and how to keep it up to date.

1. In the shell terminal from above, type the following commands. Note that you will only need to do this once to download the repository to your home directory on the HPC.

```
cd ~
git clone https://github.com/hurwitzlab/be487-fall-2024.git
```

You should see something like this...

![image.png](attachment:image.png)

2. To update the repository with any new material, type the following commands in the shell:

```
cd ~/be487-fall-2024
git pull
```

If the repository is up to date you will see this:

![image-2.png](attachment:image-2.png)

Otherwise you will see the following:

![image-3.png](attachment:image-3.png)
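If you would rather not remember which of the two steps applies, the clone and update commands can be combined. The sketch below is one possible approach, not part of the class materials: the function name `sync_repo` and its two arguments are hypothetical. It clones the repository when the destination does not exist yet and runs `git pull` otherwise.

```shell
#!/bin/bash
# sync_repo URL DEST: clone URL into DEST the first time, update DEST afterwards.
# The function name and arguments are illustrative only.
sync_repo() {
    local url="$1"
    local dest="$2"
    if [ -d "$dest/.git" ]; then
        # The repository is already there: fetch and merge the latest changes.
        git -C "$dest" pull
    else
        # First run: download a fresh copy.
        git clone "$url" "$dest"
    fi
}

# Usage for this class (uncomment to run):
# sync_repo https://github.com/hurwitzlab/be487-fall-2024.git ~/be487-fall-2024
```

Checking for the `.git` folder, rather than just the directory, avoids running `git pull` in a folder that happens to have the same name but is not a repository.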

## Opening Your First Jupyter Notebook

In this class, all of our exercises and assignments (for the class project) will be implemented in Jupyter Notebooks. So, from here on out, you will use Jupyter Notebooks in the GitHub repository for each exercise and assignment. Let's try doing this for the first exercise.

1. Go to the HPC web-portal: https://ood.hpc.arizona.edu/pun/sys/dashboard and log in with your UA NetID and password.

2. On the top menu bar select "Interactive Apps" and "Jupyter Notebook" from the pull-down list.

3. Select the Ocelote Cluster, Standard queue, PI Group: bh_class, and all other defaults.

4. When the session is running, click the "Connect to Jupyter" button.

![image.png](attachment:image.png)

5. This will open the Jupyter server, where you will be able to see and navigate to the class repository in your home directory.

![image-2.png](attachment:image-2.png)

6. Click on assignments, and go into the 01_getting_started folder. Here you will see your first assignments to learn about using Jupyter Notebooks and the HPC. 
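If you want to check from the shell that the folder is in place before opening Jupyter, something like the following sketch works. It assumes the repository was cloned into your home directory and that the folder path is `assignments/01_getting_started`, as described in step 6; the variable name `nb_dir` is illustrative.

```shell
# Assumed location of the first set of notebooks inside the cloned repository.
nb_dir="$HOME/be487-fall-2024/assignments/01_getting_started"

if [ -d "$nb_dir" ]; then
    # List the notebooks and other files in the folder.
    ls "$nb_dir"
else
    echo "Not found: $nb_dir (clone or update the repository first)"
fi
```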