# Introduction to Basic Operations in HPC

## Outline
- SSH connection
- Basic Linux commands
- File transfer
- Job submission
- Tasks

## SSH Connection

### 1. Why SSH?

- PCs and laptops are not suitable for heavy computation
- HPC clusters are designed for high-performance computing
- We need a way to connect to HPC clusters remotely
- SSH (Secure Shell) is a protocol for secure remote login and other secure network services over an insecure network

### 2. How to SSH?

- Apply for an account on the HPC clusters
    - University clusters: [NUS-VANDA](https://nusit.nus.edu.sg/hpc/get-an-hpc-account/)
    - National clusters: [NSCC](https://user.nscc.sg/saml/)

- Install an SSH client
    - Windows:
        - [MobaXterm](https://mobaxterm.mobatek.net/): Easy for beginners, with built-in X server and SFTP
        - [Xshell](https://www.xshell.com/zh/free-for-home-school/): Popular SSH client, works well with [Xftp](https://www.netsarang.com/en/xftp/) for file transfer
        - [PuTTY](https://www.putty.org/): Lightweight, but old-fashioned
    - macOS/Linux:
        - [iTerm2](https://iterm2.com/): Popular terminal emulator for macOS, works well with [FileZilla](https://filezilla-project.org/) for file transfer
        - Terminal (built-in)
    - Optional: 
        - [VSCode](https://code.visualstudio.com/)

- Connect to remote server
    - The core command to connect to an HPC cluster is `ssh user@server`:
        - `ssh`: Secure Shell, a protocol for securely connecting to remote servers.
        - `user`: Your username on the HPC cluster, generally your `NUS ID`.
        - `server`: The address of the HPC cluster you are trying to connect to.
            - Atlas: `atlas8.nus.edu.sg`
            - VANDA: `vanda.nus.edu.sg`
            - NSCC: `aspire2a.nus.edu.sg`
    - When connecting through a GUI client such as MobaXterm or Xshell, enter the same values (server address and username) in the corresponding fields of the session dialog.
    - Note: 
        - The first time you connect to a new server, you may see a message asking if you want to trust the server's host key. Type `yes` and press Enter to continue.
        - VPN is required for off-campus access to the above clusters.
- Check your connection:
    - After connecting, you should see a command prompt that looks something like this: `user@server:~$`
    - You can also check the hostname of the server you are connected to by typing `hostname` and pressing Enter.
    - You are now logged in to the remote server and can interact with it by typing commands in the terminal or through the client's GUI.
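
Typing the full `user@server` address on every login gets tedious. The sketch below registers a short alias in the per-user SSH configuration file, assuming the OpenSSH client is in use. The username `e0123456` is a placeholder, and the alias name `vanda` is an arbitrary choice.

```bash
#!/bin/bash
# Sketch: register a short alias for a cluster in the per-user SSH config.
# "e0123456" is a placeholder username; replace it with your own.
config_file="$HOME/.ssh/config"

mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"

# Append the entry only once, so re-running the script does not duplicate it.
if ! grep -q '^Host vanda$' "$config_file" 2>/dev/null; then
    cat >> "$config_file" <<'EOF'
Host vanda
    HostName vanda.nus.edu.sg
    User e0123456
EOF
fi

# ssh refuses config files that other users can write to.
chmod 600 "$config_file"
```

With this entry in place, `ssh vanda` is equivalent to `ssh e0123456@vanda.nus.edu.sg`.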

## Basic Linux Commands

### 1. Check current directory and list files
- `pwd`: Print Working Directory, shows the current directory you are in.
- `ls`: List files and directories in the current directory.
    - `ls -l`: List files in long format, showing permissions, ownership, size, and modification date.
    - `ls -a`: List all files, including hidden files (those starting with a dot).

### 2. Change to the desired directory
- `cd <directory_path>`: Change Directory, move to the specified directory.
    - `cd ..`: Move up one directory level.
    - `cd ~`: Move to your home directory.
    - `cd -`: Move to the previous directory.
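
The commands from the two sections above can be tried safely in a scratch directory. This is a minimal sketch; the directory and file names (`project`, `data`, `.hidden_notes`) are made up for illustration.

```bash
#!/bin/bash
# Create a scratch directory tree to explore (safe to delete afterwards).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/project/data"
touch "$demo_root/project/.hidden_notes"

cd "$demo_root/project"
pwd                 # prints the absolute path of project/
ls                  # shows only: data
ls -a               # also shows . .. and .hidden_notes
cd data             # go down into project/data
cd ..               # back up to project/
cd - > /dev/null    # jump back to the previous directory (project/data)
here=$(pwd)
echo "Now in: $here"
```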

### 3. Create, copy, move, and delete files and directories
- `mkdir <directory_name>`: Create a new directory with the specified name.
- `touch <file_name>`: Create a new empty file with the specified name.
- `cp <source> <destination>`: Copy a file or directory from source to destination.
    - `cp -r <source_directory> <destination_directory>`: Copy a directory and its contents recursively.
- `mv <source> <destination>`: Move or rename a file or directory.
- `rm <file_name>`: Delete a file.
    - `rm -r <directory_name>`: Delete a directory and its contents recursively.

**Note: Be careful with the `rm` command, as deleted files cannot be easily recovered, especially for `rm -rf`.**
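
The following sketch runs each of the commands above once inside a temporary directory, so nothing important can be deleted by accident. All file and directory names are illustrative.

```bash
#!/bin/bash
# Work inside a throwaway directory.
demo_root=$(mktemp -d)
cd "$demo_root"

mkdir results                          # new directory
touch input.txt                        # new empty file
cp input.txt results/input_copy.txt    # copy a file
cp -r results results_backup           # copy a directory recursively
mv input.txt renamed_input.txt         # rename a file
rm results/input_copy.txt              # delete a file
rm -r results_backup                   # delete a directory and its contents

ls                                     # shows: renamed_input.txt results
```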

### 4. Edit files
Linux text editors fall into two groups: command-line editors (`vim`, `nano`, and the stream editor `sed`) and <font color="red">graphical editors</font>.
- Command-line editors are more powerful and flexible, but have a steeper learning curve.
- Graphical editors are more user-friendly, but may not be available in all environments.
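
For edits that follow a fixed pattern, `sed` can change a file without opening an interactive editor. The sketch below is only an illustration: the file names and the walltime values are made up, and the result is written to a new file rather than edited in place.

```bash
#!/bin/bash
# Create a one-line sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '#PBS -l walltime=12:00:00\n' > "$tmpdir/run_job.pbs"

# Substitute the walltime and write the result to a new file,
# leaving the original untouched.
sed 's/walltime=12:00:00/walltime=02:00:00/' "$tmpdir/run_job.pbs" \
    > "$tmpdir/run_job_short.pbs"

cat "$tmpdir/run_job_short.pbs"
```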

### 5. View file content
- `cat <file_name>`: Concatenate and display the content of a file.
- `less <file_name>`: View the content of a file one page at a time, allowing for easy navigation.
- `head <file_name>`: Display the first 10 lines of a file.
- `tail <file_name>`: Display the last 10 lines of a file.
    - `tail -f <file_name>`: Continuously monitor a file for new content, useful for log files.
- `grep <pattern> <file_name>`: Search for a specific pattern in a file and display matching lines.
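
A typical use of these commands is inspecting the output of a calculation. The sketch below generates a small sample log (its contents are invented) and then looks at it with `head`, `tail`, and `grep`.

```bash
#!/bin/bash
# Build a small sample log in a temporary file (contents are made up).
log=$(mktemp)
for i in $(seq 1 25); do
    echo "step $i energy=-$i.0"
done > "$log"
echo "step 26 ERROR: not converged" >> "$log"

head -n 3 "$log"          # first 3 lines instead of the default 10
tail -n 2 "$log"          # last 2 lines
grep "ERROR" "$log"       # only the lines containing ERROR
grep -c "energy" "$log"   # count the matching lines
```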

## File Transfer

1. Upload/download files to/from remote server
    - Use a graphical client to transfer files between the local machine and the remote server (some clients support drag and drop)
        - Windows: MobaXterm, Xftp, FileZilla
        - macOS/Linux: FileZilla
    - Use `scp` (secure copy) command in terminal
        - Upload: `scp <local_file_path> user@server:<remote_file_path>`
        - Download: `scp user@server:<remote_file_path> <local_file_path>`

2. Download files from public websites to the remote server (run these commands on the remote server)
    - Use `wget` command in terminal
        - `wget <file_url>`: Download a file from the specified URL to the current directory.
    - Use `curl` command in terminal
        - `curl -O <file_url>`: Download a file from the specified URL to the current directory, preserving the original filename.
    - Use `git` command in terminal (for version control)
        - `git clone <repository_url>`: Clone a remote Git repository to the current directory.
        - `git pull`: Update the local repository with changes from the remote repository.
        - `git push`: Upload local changes to the remote repository.
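
The `scp` commands above can be wrapped in a small script. The sketch below is an illustration under several assumptions: the username `e0123456`, the file names, and the `run` helper are all hypothetical. By default the helper only prints each command, so the script can be checked before any data is moved; set `DRY_RUN=0` to actually transfer. Remote paths without a leading `/` are resolved relative to the remote home directory.

```bash
#!/bin/bash
# Placeholder account; replace with your own username and cluster.
HPC_USER="e0123456"
HPC_HOST="vanda.nus.edu.sg"
REMOTE="${HPC_USER}@${HPC_HOST}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Upload one file into project/ under the remote home directory.
run scp input.dat "${REMOTE}:project/"

# Download a whole directory; -r copies recursively.
run scp -r "${REMOTE}:project/results" ./results
```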

## Job Submission

A job management system centrally manages and schedules computing resources, distributing user-submitted jobs across different nodes or cores within the same node to maximize cluster utilization and simplify operations.

### 1. Submit job
There are two ways to submit a job:
- Interactive job:
    - Interactive jobs allow you to access compute nodes directly for testing or debugging.
    - `qsub -I -l select=1:ncpus=24:mem=64gb -l walltime=02:00:00`
- Batch job:
    - Batch jobs are submitted via a script and run automatically when resources are available.
    - `qsub run_job.pbs`

**Example batch script (`run_job.pbs`):**
```bash
#!/bin/bash
#PBS -N my_job
#PBS -l select=1:ncpus=24:mem=64gb
#PBS -l walltime=12:00:00
#PBS -q normal
#PBS -j oe

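# Start in the directory from which the job was submitted
cd "$PBS_O_WORKDIR"
# Load the software environment the job needs
module load vasp
# Run the workload
python test_script.py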
```

Notes:
- `#PBS -N my_job`: Job name
- `#PBS -l select=1:ncpus=24:mem=64gb`: Request 1 chunk (typically one node) with 24 CPU cores and 64 GB of memory
- `#PBS -l walltime=12:00:00`: Set maximum wall time to 12 hours
- `#PBS -q normal`: Specify the queue to submit the job to
- `#PBS -j oe`: Combine standard output and error into a single file
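
Before submitting a long calculation, it helps to test the workflow with a tiny job. The sketch below writes a minimal batch script and submits it only if `qsub` is available. The job name, directory, and resource values are illustrative, and the `-q` line is omitted on the assumption that the site defines a default queue.

```bash
#!/bin/bash
# Keep the test job in its own directory under the home directory.
job_dir="$HOME/hello_job"
mkdir -p "$job_dir"
cd "$job_dir"

# Write a minimal batch script; the quoted 'EOF' keeps $PBS_O_WORKDIR
# and $(hostname) unexpanded until the job actually runs.
cat > hello_job.pbs <<'EOF'
#!/bin/bash
#PBS -N hello_job
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -l walltime=00:05:00
#PBS -j oe

cd "$PBS_O_WORKDIR"
echo "Running on $(hostname)"
EOF

# Check the script for shell syntax errors without running it.
bash -n hello_job.pbs

# Submit only where a PBS client is installed (for example on a login node).
if command -v qsub >/dev/null 2>&1; then
    job_id=$(qsub hello_job.pbs)
    echo "Submitted job: $job_id"
else
    echo "qsub not found; run this script on the cluster login node"
fi
```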

### 2. Check job status
- `qstat`: Check the status of your jobs in the queue.
    - `Q`: the job is queued and waiting for resources.
    - `R`: the job is currently running.
    - `E`: the job is exiting after having run. This state does not by itself indicate an error.
    - `C`: the job has completed. This code is used by Torque; PBS Professional reports finished jobs as `F`, visible with `qstat -x`.
- `qstat -f <job_id>`: Get detailed information about a specific job.
- `qdel <job_id>`: Delete a job from the queue using its job ID.
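
Job monitoring can also be scripted. The sketch below defines two hypothetical helpers: `parse_job_state` extracts the state letter from `qstat -f` output, which contains a line of the form `job_state = R`, and `wait_for_job` polls once a minute until the job is neither queued nor running. The job ID `12345.pbs` and the sample text are made up for the demonstration.

```bash
#!/bin/bash
# Hypothetical helper: read `qstat -f` output on stdin and print the
# one-letter job state.
parse_job_state() {
    awk -F' = ' '/job_state/ { print $2; exit }'
}

# Hypothetical helper: poll until the job is neither queued (Q) nor running (R).
wait_for_job() {
    local job_id="$1" state
    while true; do
        state=$(qstat -f "$job_id" 2>/dev/null | parse_job_state)
        case "$state" in
            Q|R) sleep 60 ;;
            *)   break ;;
        esac
    done
    echo "Job $job_id is no longer queued or running (last state: ${state:-unknown})"
}

# Demonstration on sample text shaped like `qstat -f` output; prints R.
printf 'Job Id: 12345.pbs\n    Job_Name = my_job\n    job_state = R\n' | parse_job_state
```

On the cluster, the same helper is used as `qstat -f <job_id> | parse_job_state`, and `wait_for_job <job_id>` blocks until the job leaves the `Q` and `R` states.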

## Tasks

### 1. Connect to the HPC cluster using SSH.
### 2. Navigate through directories and manage files using basic Linux commands.
### 3. Configure your environment
- Download the needed files: `git clone https://github.com/Y-Chao/s25-cm5100.git`
- Install the required dependencies: `cd s25-cm5100 && bash config_hpc.sh`
- Enter the IP address of the file server when prompted.
- Submit a toy job to test your configuration: `qsub sub_vasp.pbs`