Compute clusters generally consist of login nodes and compute nodes. When you first ssh from your local computer to a compute cluster, you land on a login node. The login node lets you ask for compute resources like CPUs and GPUs, which are hosted on compute nodes. srun is the command to ask for interactive compute, and is useful for development.
Local -- ssh --> login node -- srun --> compute node (NOT USING FOR THIS COURSE)

Note that I save this folder in the home directory of my remotes as ~/my_setup. You can copy this using cd ~ ; git clone https://github.com/livctr/my_setup.git. You can remove git tracking by removing the .git folder inside it.
Access the BigPurple VPN with Big-IP Edge Client. Use your Langone Health email password. If the link doesn't work, do a Google search for NYU Langone VPN. You may need to ask your advisor to get you access to the VPN (if your password doesn't work).
Access the Greene VPN with Cisco AnyConnect. Follow the download instructions here. If the link doesn't work, see this page or do a Google search for NYU VPN. Use the password for your NYU edu email. Note that if you are using NYU Wi-Fi, you are connected to the network and do not need a VPN.
If you're able to connect to the VPN, great, you can continue. If you are on Windows (even if you use WSL), open the command prompt. (You can also connect through your WSL environment, but if you want to set up VSCode, I recommend the command prompt.) If you are on Mac, open the terminal. From now on, I will refer to either as the terminal.
Type ssh <langoneid>@bigpurple.nyumc.org, where <langoneid> is the string preceding the @ in your NYU Langone email. You will be prompted for a password. Type your Langone email password. You should be able to see a login screen and something like [<langoneid>@bigpurple-ln<1-4> ~]$ in your terminal. This is the login node.
Greene is slightly more involved. First, you'll have to ssh into a gateway server that sits between your local computer and Greene. Type ssh <nyuid>@gw.hpc.nyu.edu, where <nyuid> is the string preceding the @ in your NYU email. Type in your NYU email password. You should be able to see a login screen and something like [<nyuid>@pco01la-1520a:~]$ in your terminal. Then do another ssh into Greene with ssh <nyuid>@greene.hpc.nyu.edu. If you're prompted for a password, use the same one. You should see the NYU HPC Greene login letters and something like [<nyuid>@log-<1-3> ~]$ in your terminal.
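The two hops above can usually be collapsed into one command with OpenSSH's -J (ProxyJump) option, which routes the connection through the gateway for you. This is a minimal sketch, assuming your local ssh is OpenSSH 7.3 or newer; the helper name greene_ssh_command and the placeholder ID are made up for illustration. It only prints the command so you can look it over before running it.

```shell
# Build the one-line equivalent of the two ssh hops described above.
# $1 is your NYU ID (the part before the @ in your NYU email).
greene_ssh_command() {
    nyuid=$1
    printf 'ssh -J %s@gw.hpc.nyu.edu %s@greene.hpc.nyu.edu\n' "$nyuid" "$nyuid"
}

# Print the command for a placeholder ID; replace it with your own.
greene_ssh_command "your_nyuid"
```

You may still be prompted for your password once per hop.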
There are two ways to submit jobs.
- srun: a Slurm command used to submit interactive jobs to a compute cluster.
- sbatch: a Slurm command used to submit batch jobs to a compute cluster. Submit these so you don't have to sit in front of your computer waiting for things to finish so you can type your next command.
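To make the sbatch route concrete, here is a rough sketch of a batch script that asks for the same resources as the interactive BigPurple request in this guide. The file name gpu_test.sbatch, the job name, and the output file pattern are placeholders I picked for illustration; the partition and resource values are copied from the srun example and may need adjusting for your cluster.

```shell
# Write a small batch script. sbatch reads the #SBATCH lines as options;
# to bash they are ordinary comments.
cat > gpu_test.sbatch <<'EOF'
#!/bin/bash
# Placeholder job name and log file; %j expands to the job ID.
#SBATCH --job-name=gpu_test
#SBATCH --output=gpu_test_%j.out
# Same resources as the interactive BigPurple example.
#SBATCH --partition=gpu4_dev
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=2
#SBATCH --gres=gpu:1
#SBATCH --time=00:05:00

# Commands to run on the compute node once the job starts.
hostname
nvidia-smi
EOF

# On a login node, submit the script and check on it with:
#   sbatch gpu_test.sbatch
#   squeue -u "$USER"
```

Once the job finishes, whatever the commands printed ends up in the output file instead of your terminal.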
Type the following to request a GPU on BigPurple.
srun -p gpu4_dev --ntasks-per-node=1 --cpus-per-task=2 --gres=gpu:1 --time=00:05:00 --pty bash

What does each component mean? Use ChatGPT to find out 😊. In short, srun is a Slurm command used to submit interactive jobs to a compute cluster. Here, you are requesting 5 minutes of GPU time on the gpu4_dev partition (-p selects a partition, not a job name). This is what I see after I run the command:

srun: job 52038204 queued and waiting for resources
srun: job 52038204 has been allocated resources
[<userid>@gn-0002 ~]$

This means we have access to a GPU! gn-0002 is the hostname of the compute node (type hostname to find out). After 5 minutes, the session should disconnect. You can type exit to close the connection.
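If you'd rather not ask ChatGPT, here is an annotated sketch of the BigPurple request, built up one flag at a time. The function name print_bigpurple_srun is made up for illustration, and it only prints the command rather than submitting anything; the flag descriptions are my short summaries of the Slurm documentation.

```shell
# Assemble the BigPurple srun request piece by piece, then print it.
print_bigpurple_srun() {
    set -- srun
    set -- "$@" -p gpu4_dev           # partition (queue) to run in
    set -- "$@" --ntasks-per-node=1   # launch one task on each allocated node
    set -- "$@" --cpus-per-task=2     # give that task 2 CPU cores
    set -- "$@" --gres=gpu:1          # generic resource request: 1 GPU
    set -- "$@" --time=00:05:00       # wall-clock limit, HH:MM:SS (5 minutes)
    set -- "$@" --pty bash            # attach a pseudo-terminal and start bash
    echo "$*"
}

print_bigpurple_srun
```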
Do the same as above and type the following to request a GPU on Greene.
srun -c8 --gres=gpu:rtx8000:1 --mem=32000 -t 0:10:00 --pty bash

You can type exit twice to go back to your local terminal.
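Once the prompt changes, a quick sanity check helps confirm you really are on a compute node with a GPU attached. The sketch below is one way to do that; where_am_i is a made-up helper name. It relies on Slurm setting SLURM_JOB_ID inside a job, and it assumes nvidia-smi is installed on the GPU nodes (it falls back to a message if not).

```shell
# Report where this shell is running and which GPUs it can see.
where_am_i() {
    echo "host: $(hostname)"
    # Slurm sets SLURM_JOB_ID inside a job; it is unset on a login node.
    echo "job:  ${SLURM_JOB_ID:-none (not inside a Slurm job)}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L    # one line per visible GPU
    else
        echo "gpus: nvidia-smi not found on this machine"
    fi
}

where_am_i
```

Run on a login node, the job line should report that you are not inside a Slurm job; inside the srun session it should show the job number that srun printed.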
That's it! That's all there is to ssh-ing! However, you'll need to do some extra things so you can develop quickly:
- See README_SSH.md for setting up ssh-ing with VSCode (and doing it without passwords). Also how to bypass the gateway server and forward directly to Greene.
- See README_REMOTE_ENV.md for setting up Singularity and conda so you can get started quickly.
- See README_SLURM_REFERENCES.md for already-very-good references on Slurm.
- See README_ALIASES.md for setting up aliases so you don't have to keep typing very long commands.
- See misc/ for miscellaneous. Includes bash scripts for moving two large files in my home directory to scratch (unsure what the effect of moving .vscode-server is when scratch does its quarterly erase... we will see 😊).
- Please make sure to read Greene HPC, even if you are primarily on BigPurple. It's good. For example, read Greene HPC Data Management so that you don't overload your home directory and get headaches. You'll have to create symlinks from your home directory to your scratch directory.
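The home-to-scratch symlink step mentioned above could look roughly like this. It is a sketch, not the script in misc/: the function name relocate_to_scratch is made up, and the usage line assumes your cluster defines a SCRATCH variable pointing at your scratch directory. Pass absolute paths so the symlink resolves from anywhere.

```shell
# Move a directory to scratch and leave a symlink at its old location.
# Usage: relocate_to_scratch <absolute-path-in-home> <absolute-scratch-root>
relocate_to_scratch() {
    src=$1
    dest_root=$2
    name=$(basename "$src")

    if [ -L "$src" ]; then
        echo "$src is already a symlink; nothing to do"
        return 0
    fi
    if [ -e "$dest_root/$name" ]; then
        echo "$dest_root/$name already exists; refusing to overwrite" >&2
        return 1
    fi

    mkdir -p "$dest_root"
    mv "$src" "$dest_root/$name"        # the data now lives on scratch
    ln -s "$dest_root/$name" "$src"     # the old path keeps working via the link
}

# Example (assumes SCRATCH is set on your cluster):
#   relocate_to_scratch "$HOME/.cache" "$SCRATCH"
```

Keep in mind that anything on scratch is subject to the periodic purge, so only move things you can regenerate.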
For more information, the following links (some may not be available to non-NYU students) are useful:
- Greene HPC: Very, very useful. I recommend reading this fairly thoroughly. You can use it to understand how to get started, the hardware available, what dtn means, what home and scratch mean, how to run jobs, and how to set up VSCode.
- BigPurple HPC. Found this to be relevant
- DSGA 1011 HPC Tutorial: for setting up on your remote side.