# HPC Tutorial
This notebook provides an overview of the High-Performance Computing (HPC) system setup at NYU, focusing on accessing resources, running jobs, and managing environments.


## Running Interactive Jobs
1. SSH into Greene’s login node. See [Accessing HPC](https://sites.google.com/nyu.edu/nyu-hpc/accessing-hpc) and [Tunneling and X11 Forwarding](https://sites.google.com/nyu.edu/nyu-hpc/training-support/general-hpc-topics/tunneling-and-x11-forwarding).
2. From the login node, run `ssh burst`.
3. Request a job to access CPU or GPU resources (see the example `srun ...` commands below).
4. Execute commands inside the assigned node as needed.


### Available Partitions
- **Account**: `ds_ga_1008_002-2025sp`
- **Partitions**: interactive, n2c48m24, n1s8-v100-1, n1s16-v100-2, g2-standard-12, g2-standard-24, c12m85-a100-1, c24m170-a100-2


## Running Jobs
### CPU-only Interactive Job for 4 Hours

In [None]:
srun --account=ds_ga_1008_002-2025sp --partition=interactive --time=04:00:00 --pty /bin/bash

### GPU Jobs
- **1 V100 GPU for 4 hours**

In [None]:
srun --account=ds_ga_1008_002-2025sp --partition=n1s8-v100-1 --gres=gpu:v100:1 --time=04:00:00 --pty /bin/bash

- **1 A100 GPU for 4 hours**

In [None]:
srun --account=ds_ga_1008_002-2025sp --partition=c12m85-a100-1 --gres=gpu --time=04:00:00 --pty /bin/bash
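
The `srun` commands above differ only in partition, GPU request, and time limit. As a rough sketch, a small helper can assemble the command so the account string is typed once. The function name `srun_cmd` and its argument order are hypothetical; it only prints the command, so nothing is submitted until you run the printed line yourself.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) an interactive srun command.
# Usage: srun_cmd PARTITION HOURS [GRES]
ACCOUNT="ds_ga_1008_002-2025sp"   # course account from this tutorial

srun_cmd() {
    local partition="$1" hours="$2" gres="${3:-}"
    local cmd="srun --account=${ACCOUNT} --partition=${partition}"
    # Only add --gres when a GPU request was given (CPU-only jobs skip it).
    if [ -n "$gres" ]; then
        cmd="${cmd} --gres=${gres}"
    fi
    # Slurm accepts HH:MM:SS; pad the hour count to two digits.
    printf '%s --time=%02d:00:00 --pty /bin/bash\n' "$cmd" "$hours"
}

srun_cmd interactive 4
srun_cmd n1s8-v100-1 4 gpu:v100:1
```

The two calls reproduce the CPU-only and V100 commands shown above; copy the printed line into the terminal to start the job.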

## Running Jupyter Notebook on HPC
1. Open a web browser and go to `https://ood-burst-001.hpc.nyu.edu/`
2. Log in and open Jupyter Notebook from the Interactive Apps section.
3. Submit your request with the following settings:
   - Number of GPUs: 1
   - Slurm Account: `ds_ga_1008_002-2025sp`
   - Slurm Partition: `c12m85-a100-1` or `n1s8-v100-1`
   - Root Directory: `scratch`
   - Number of Hours: 1

## Setting Up Singularity and Conda
1. Get on a GPU node.


In [None]:
srun --account=ds_ga_1008_002-2025sp --partition=n1s8-v100-1 --gres=gpu:v100:1 --time=01:00:00 --pty /bin/bash

### Navigate to Scratch Directory


In [None]:
cd /scratch/[netid]

### Download Overlay Filesystem


In [None]:
scp greene-dtn:/scratch/work/public/overlay-fs-ext3/overlay-25GB-500K.ext3.gz .

### Unzip the Image
Unzipping takes about 5 minutes.

In [None]:
gunzip -vvv ./overlay-25GB-500K.ext3.gz

### Copy the Singularity Image


In [None]:
scp -rp greene-dtn:/scratch/work/public/singularity/ubuntu-20.04.3.sif .

### Start Singularity and Install Conda


In [None]:
#Start Singularity:
singularity exec --bind /scratch --nv --overlay /scratch/[netid]/overlay-25GB-500K.ext3 /scratch/[netid]/ubuntu-20.04.3.sif /bin/bash

In [None]:
# Inside Singularity (run these at the `Singularity>` prompt):
cd /ext3/
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

In [None]:
#Install Conda
bash ./Miniconda3-latest-Linux-x86_64.sh -b -p /ext3/miniconda3

### Set Up Conda Path
Add Conda to your PATH for easy access.


In [None]:
source /ext3/miniconda3/etc/profile.d/conda.sh
export PATH=/ext3/miniconda3/bin:$PATH

## Installing Python Libraries
Create a Conda environment and install necessary libraries.

In [None]:
conda create -n my_env python=3.9
conda activate my_env
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

### Exit the Session
To exit, press `Ctrl+D` or type `exit`.

### Reactivating Singularity

After the initial setup, every time you want to start an interactive Singularity session, run:

In [None]:
ssh greene
ssh burst
# Request compute (see above)
singularity exec --bind /scratch --nv --overlay /scratch/[netid]/overlay-25GB-500K.ext3:rw /scratch/[netid]/ubuntu-20.04.3.sif /bin/bash
# Inside Singularity:
source /ext3/miniconda3/etc/profile.d/conda.sh
conda activate my_env

## Running Batch Jobs
Use a batch job for longer experiments or to run several jobs without keeping an interactive session open.
### Writing the Batch Script


In [None]:
#!/bin/bash
#SBATCH --job-name=job_wgpu
#SBATCH --account=ds_ga_1008_002-2025sp
#SBATCH --partition=n1s8-v100-1
#SBATCH --open-mode=append
#SBATCH --output=./%j_%x.out
#SBATCH --error=./%j_%x.err
#SBATCH --export=ALL
#SBATCH --time=00:10:00
#SBATCH --gres=gpu:1
#SBATCH --requeue

singularity exec --bind /scratch --nv --overlay /scratch/[netid]/overlay-25GB-500K.ext3:rw /scratch/[netid]/ubuntu-20.04.3.sif /bin/bash -c "
source /ext3/miniconda3/etc/profile.d/conda.sh
conda activate my_env
cd /scratch/[netid]/nlp_tutorial/
python ./test_gpu.py
"
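
The batch script calls `test_gpu.py`, which this tutorial does not show. As a stand-in, the sketch below writes a small hypothetical version that reports whether PyTorch can see a GPU. The file contents and the `TARGET_DIR` variable are assumptions for illustration, not the course's actual script; point `TARGET_DIR` at the directory the batch script changes into.

```shell
#!/bin/bash
# Write a hypothetical test_gpu.py (assumed contents, not the course's file).
TARGET_DIR="${TARGET_DIR:-.}"   # e.g. the nlp_tutorial directory on scratch
mkdir -p "$TARGET_DIR"

# Quoted 'EOF' keeps the shell from expanding anything inside the script.
cat > "$TARGET_DIR/test_gpu.py" <<'EOF'
import torch

# True only if the job was given a GPU and the container was started with --nv.
available = torch.cuda.is_available()
print("CUDA available:", available)
if available:
    print("Device:", torch.cuda.get_device_name(0))
EOF

echo "Wrote $TARGET_DIR/test_gpu.py"
```

If the job reports that CUDA is not available, check that the `--gres` request and the `--nv` flag are both present.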

### Submit the Batch Job
Save the script above as `gpu_job.slurm`, then submit it.

In [None]:
sbatch gpu_job.slurm
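
On success, `sbatch` prints a line of the form `Submitted batch job <id>`. As a sketch, the job ID can be pulled from that line to locate the log files, which the `--output` and `--error` patterns in the script name `<jobid>_<jobname>.out` and `.err` (`%j` is the job ID, `%x` the job name). The function names here are hypothetical, and the sample ID is made up.

```shell
#!/bin/bash
# Hypothetical helpers for locating a job's log files.

# Extract the numeric ID from sbatch's "Submitted batch job <id>" line.
job_id_from() {
    echo "${1##* }"   # keep everything after the last space
}

# Expand the tutorial's ./%j_%x.out pattern: %j = job ID, %x = job name.
log_file() {
    local job_id="$1" job_name="$2" ext="${3:-out}"
    echo "./${job_id}_${job_name}.${ext}"
}

# Example with a made-up submission line (no job is actually submitted).
line="Submitted batch job 123456"
id="$(job_id_from "$line")"
log_file "$id" job_wgpu        # ./123456_job_wgpu.out
log_file "$id" job_wgpu err    # ./123456_job_wgpu.err
```

In practice you might capture the real line with `line="$(sbatch gpu_job.slurm)"` and then follow the log with `tail -f "$(log_file "$id" job_wgpu)"`.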

### Check Job Status
Check your job status in the queue.

In [None]:
squeue -u [netid]
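
The `ST` column in `squeue` output uses short state codes. The sketch below maps a few common ones to plain descriptions; the function name `describe_state` is hypothetical, and the list is not exhaustive.

```shell
#!/bin/bash
# Hypothetical helper: explain common codes from squeue's ST column.
describe_state() {
    case "$1" in
        PD) echo "pending (waiting to be scheduled)" ;;
        R)  echo "running" ;;
        CG) echo "completing" ;;
        CD) echo "completed" ;;
        F)  echo "failed" ;;
        TO) echo "timed out" ;;
        CA) echo "cancelled" ;;
        *)  echo "unknown code: $1" ;;
    esac
}

describe_state PD
describe_state R
```

A job that stays in `PD` for a long time is usually waiting for the requested partition to free up.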

### Cancel a Job
To cancel a queued or running job, pass its job ID to `scancel`.

In [None]:
scancel [job_id]