# Setting up llama on Greene
Setting up llama involves the following steps:
1. Connecting to the Greene cluster
2. Creating a Singularity overlay and installing the required packages
3. Cloning the llama repository
4. Requesting a compute node
5. Running the code

### Step 1: Logging in to Greene

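1. Connect to the NYU VPN (not needed if you are on campus).
2. SSH directly to Greene, replacing `netid` with your own NetID: `ssh netid@greene.hpc.nyu.edu`
3. Enter your password when prompted. You should then see the Greene banner followed by a login-node prompt: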
```
| \ | \ \ / / | | | | | | |  _ \ / ___|
|  \| |\ V /| | | | | |_| | |_) | |
| |\  | | | | |_| | |  _  |  __/| |___
|_| \_| |_|  \___/  |_| |_|_|    \____|

  ____
 / ___|_ __ ___  ___ _ __   ___
| |  _| '__/ _ \/ _ \ '_ \ / _ \
| |_| | | |  __/  __/ | | |  __/
 \____|_|  \___|\___|_| |_|\___|
 ```
 `[yb970@log-2 ~]$ `
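
To avoid retyping the full hostname, the login above can be stored as an SSH host alias. The following is a minimal sketch and not part of the original instructions: the function name `add_greene_host` and the alias `greene` are hypothetical, and it assumes your SSH client reads `~/.ssh/config`.

```shell
# Hypothetical helper: append a "greene" host alias to an SSH config file.
# Usage: add_greene_host <netid> [config_file]
add_greene_host() {
    gh_netid="$1"
    gh_config="${2:-$HOME/.ssh/config}"

    if [ -z "$gh_netid" ]; then
        echo "usage: add_greene_host <netid> [config_file]" >&2
        return 1
    fi

    mkdir -p "$(dirname "$gh_config")"
    touch "$gh_config"

    # Do nothing if the alias is already defined.
    if grep -qE '^Host[[:space:]]+greene$' "$gh_config"; then
        echo "Host alias 'greene' already present in $gh_config"
        return 0
    fi

    # Append the alias; the hostname is the one used in the ssh command above.
    cat >> "$gh_config" <<EOF

Host greene
    HostName greene.hpc.nyu.edu
    User $gh_netid
EOF
    chmod 600 "$gh_config"
}
```

After running `add_greene_host <your NetID>` once on your own machine, `ssh greene` should behave like the full command in item 2.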



### Step 2: Setting up the environment
Set up Singularity with Miniconda by following https://sites.google.com/nyu.edu/nyu-hpc/hpc-systems/greene/software/singularity-with-miniconda <br>
*For reproducibility, let's all use the overlay `/scratch/work/public/overlay-fs-ext3/overlay-25GB-500K.ext3.gz`.* <br>
Once this is done, you have a conda environment that you can activate after requesting a compute node, when you are ready to run the code.
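
The linked guide starts by copying the compressed overlay into your own scratch space and unpacking it. Below is a minimal sketch of that first step only, assuming a hypothetical helper name `prepare_overlay` and a destination directory of your choice. The remaining steps (installing Miniconda into the overlay and creating `/ext3/env.sh`) should be taken from the guide itself.

```shell
# Hypothetical helper: copy a compressed overlay image and unpack it.
# Usage: prepare_overlay <overlay.ext3.gz> <destination_dir>
prepare_overlay() {
    po_src="$1"
    po_dest="$2"

    if [ ! -f "$po_src" ]; then
        echo "overlay archive not found: $po_src" >&2
        return 1
    fi

    mkdir -p "$po_dest" || return 1
    po_name=$(basename "$po_src")

    # Refuse to overwrite an overlay that may already hold an environment.
    if [ -e "$po_dest/${po_name%.gz}" ]; then
        echo "overlay already exists: $po_dest/${po_name%.gz}" >&2
        return 1
    fi

    cp -p "$po_src" "$po_dest/" || return 1
    # Unpack the copy; this leaves <destination_dir>/<name>.ext3
    gunzip "$po_dest/$po_name"
}
```

On Greene this might be called as `prepare_overlay /scratch/work/public/overlay-fs-ext3/overlay-25GB-500K.ext3.gz /scratch/$USER/capstone`, where the `capstone` directory is only an example that matches the overlay path used in Step 3.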

### Step 3: Get llama repo

1. Generate an SSH key pair on Greene and add the public key to your GitHub account <br>
https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent
2. Change to your scratch directory so there is enough space for the code (replace `yb970` with your own NetID) <br>
`cd /scratch/yb970/`
3. Clone the llama repository <br>
`git clone git@github.com:facebookresearch/llama.git`
4. Change the directory to llama <br>
`cd llama`
5. Request a compute node (be sure to request enough memory, otherwise llama cannot run) <br>
`srun --pty -c 2 --mem=25GB --gres=gpu:v100:1 /bin/bash`
6. Start the Singularity container and activate the conda environment (replace the overlay path with your own, i.e. the overlay you set up in Step 2) <br>
```
singularity exec --bind /scratch --nv --overlay /scratch/yb970/capstone/overlay-25GB-500K.ext3:rw \
/scratch/work/public/singularity/cuda11.7.99-cudnn8.5-devel-ubuntu22.04.2.sif /bin/bash

source /ext3/env.sh

conda activate
```
7. Install llama according to https://github.com/facebookresearch/llama <br>
`pip install -e .` <br>
`bash download.sh` <br>
The script prompts for the URL from the email that grants you access. The URL expires after 24 hours; if it has expired, fill in the request form again to get a new one.
8. Test that the installation succeeded. The repository README pairs `example_chat_completion.py` with the chat checkpoint (`llama-2-7b-chat/`), so adjust `--ckpt_dir` to match the model you actually downloaded. <br>
```
torchrun --nproc_per_node 1 example_chat_completion.py \
--ckpt_dir llama-2-7b/ \
--tokenizer_path tokenizer.model \
--max_seq_len 512 --max_batch_size 6
```
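
If the test command fails right away, one thing worth ruling out is a missing checkpoint directory or tokenizer file. The sketch below checks for both before launching `torchrun`. The function name `check_llama_files` is hypothetical, and its default paths are simply the ones used in the test command.

```shell
# Hypothetical pre-flight check, meant to be run from inside the llama directory.
# Usage: check_llama_files [ckpt_dir] [tokenizer_path]
check_llama_files() {
    cl_ckpt="${1:-llama-2-7b}"
    cl_tok="${2:-tokenizer.model}"
    cl_status=0

    # The checkpoint directory is created by download.sh.
    if [ ! -d "$cl_ckpt" ]; then
        echo "missing checkpoint directory: $cl_ckpt" >&2
        cl_status=1
    fi

    # The tokenizer file is passed to the script via --tokenizer_path.
    if [ ! -f "$cl_tok" ]; then
        echo "missing tokenizer file: $cl_tok" >&2
        cl_status=1
    fi

    return "$cl_status"
}
```

It can be chained in front of the test command, for example `check_llama_files llama-2-7b/ tokenizer.model && torchrun ...`, so that `torchrun` only starts when both paths exist.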

