# TSCC User Guide

* * *

## What is TSCC?

TSCC houses the 640-core supercomputer as part of a resource sharing system which allows researchers to perform calculations and experiments when they need extra computing power.  
  
* TSCC user guide: http://rci.ucsd.edu/computing/index.html
* The main contact for questions about TSCC is the TSCC users mailing list. The main contact for problems with TSCC is [Jim Hayes](mailto:jhayes@sdsc.edu).
    * TSCC users: tscc-l@mailman.ucsd.edu
    * Jim Hayes: jhayes@sdsc.edu

## My First Supercomputer Login Session

Your first login session will familiarize you with the cluster, teach you how to run some useful tasks through the queue, and help you set up a common, useful directory structure.

Learning goals:

* Learn about public and private keys
* Learn how to log in to TSCC
* Create directories for your code and Jupyter notebooks
* Learn how to submit jobs to TSCC

### Set up your logins

To check whether you are authorized to log in, a server has a few options. It can ask you for a password, which has to match exactly, or it can check whether you hold a private key (for example `~/.ssh/id_rsa`, stored in your `~/.ssh/` directory) that matches one of the public keys listed in the server's `authorized_keys` file. This approach is called "public-key cryptography" and rests on number theory; learn more about it [here](https://en.wikipedia.org/wiki/Public-key_cryptography).

#### Mac/Linux

Copy the `biom262_rsa` private key that was emailed to you into `~/.ssh`. To avoid errors, you may also need a copy of the key with the extension `.pri`, which we will create as well.

```
cp ~/Downloads/biom262_rsa ~/.ssh
cp ~/Downloads/biom262_rsa ~/.ssh/biom262_rsa.pri
```

Use `ssh-add` to add this private key to the list of keys that `ssh` offers when a server checks whether you are one of its accepted users.

```
ssh-add ~/.ssh/biom262_rsa
ssh-add ~/.ssh/biom262_rsa.pri
```
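
To make the public/private key relationship described under "Set up your logins" more concrete, here is a minimal sketch that generates a throwaway key pair in a temporary directory and checks that the public half can be re-derived from the private half. The file name `demo_rsa` and the temporary directory are assumptions made for illustration; nothing here touches `biom262_rsa` or your `~/.ssh` directory.

```shell
# Throwaway demo: nothing here touches ~/.ssh or the biom262_rsa key.
demo_dir="$(mktemp -d)"
derived=""
stored=""

if command -v ssh-keygen >/dev/null 2>&1; then
    # Create a key pair with an empty passphrase (-N "") and no chatter (-q).
    ssh-keygen -q -t rsa -b 2048 -N "" -f "$demo_dir/demo_rsa"

    # -y re-derives the public key from the private key file.
    derived="$(ssh-keygen -y -f "$demo_dir/demo_rsa" | awk '{print $1, $2}')"
    # The .pub file written next to it should hold the same key material.
    stored="$(awk '{print $1, $2}' "$demo_dir/demo_rsa.pub")"

    if [ "$derived" = "$stored" ]; then
        echo "public key matches private key"
    fi
else
    echo "ssh-keygen not found; skipping demo"
fi

rm -rf "$demo_dir"
```

A server only ever needs the public half (the `.pub` content) in its `authorized_keys` file; the private half stays on your laptop.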

#### Windows
(Instructions compiled from [Cornell IT](http://www.it.cornell.edu/services/managed_servers/howto/file_transfer/fileputty.cfm), the [Analyzing Next-Gen Seq (ANGUS) data workshop](http://ged.msu.edu/angus/tutorials/using-putty-on-windows.html), and [How To Forge](https://www.howtoforge.com/how-to-configure-ssh-keys-authentication-with-putty-and-linux-server-in-5-quick-steps).)

Get the latest PuTTY executables ([link to zip](http://the.earth.li/~sgtatham/putty/latest/x86/putty.zip)). PuTTY doesn't need to be installed, so you can put the files on your desktop or wherever you like to keep your programs. You'll need the `putty.exe`, `puttygen.exe`, `pscp.exe`, and `pageant.exe` files.


First, open `puttygen.exe`. Click "Load" to load an existing private key file, and select the `biom262_rsa` file as the key. Save the key as a private key, and don't set a passphrase.

Next, open `pageant.exe`. Check your taskbar on the bottom-right (you may need to click the triangle to expand the taskbar) and click the Pageant icon. Add the key you just saved with `puttygen.exe`; it should appear in the Pageant Key List.

(proceed to the steps shown below now)


## Log in to TSCC

In your terminal, type the following (you'll need to replace "`##`" with your number). This should not ask for a password. If it asks for a password, raise your hand.

### Mac/Linux

```
$ ssh ucsd-train##@tscc.sdsc.edu
Rocks 6.2 (SideWinder)
Profile built 17:40 06-Jan-2016

Kickstarted 18:26 06-Jan-2016
TSCC Cluster Login Node

For information on using the TSCC, please visit http://idi.ucsd.edu/computing
By using the TSCC, you agree to the Acceptable Use Policy found on
http://idi.ucsd.edu/_files/TSCC-Acceptable-Use-Policy.pdf

PLEASE NOTE: a portion of the /oasis/tscc/scratch filesystem failed during
Tuesday's outage.  We are working on repairs; in the meantime, we have
mounted the old (pre 12/7) filesystem read-only.
[ucsd-train##@tscc-login1 ~]$ 
```

#### Possible error messages

You may get a message like the one below. This is expected the first time you connect to any server. Type `yes` and press "Enter" (pressing "Enter" on its own is not accepted as an answer).

```
The authenticity of host '192.168.0.100 (192.168.0.100)' can't be established.
RSA key fingerprint is 3f:1b:f4:bd:c5:aa:c1:1f:bf:4e:2e:cf:53:fa:d8:59.
Are you sure you want to continue connecting (yes/no)? 
```



If you get this error:
```
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0644 for '/Users/olga/.ssh/biom262_rsa' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /Users/olga/.ssh/biom262_rsa
Password: 
```

Then you need to make the key not readable by "group" and "other". Do this with:

```
chmod go-r ~/.ssh/biom262_rsa
```

This subtracts the "r" (read) permission from "g" (group) and "o" (other).
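
To see what `chmod go-r` changes without touching the real key, the sketch below applies it to a temporary stand-in file (the file is created by `mktemp` purely for illustration) and prints the permission string before and after.

```shell
# Stand-in file so the real key is left alone (created just for this demo).
demo_key="$(mktemp)"

chmod 644 "$demo_key"                        # the "too open" mode from the warning
before="$(ls -l "$demo_key" | cut -c1-10)"

chmod go-r "$demo_key"                       # same command as above
after="$(ls -l "$demo_key" | cut -c1-10)"

echo "before: $before"
echo "after:  $after"
rm -f "$demo_key"
```

The permission string should change from `-rw-r--r--` (mode 0644) to `-rw-------` (mode 0600), which is what `ssh` expects for a private key.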

### Windows

Double-click the PuTTY exe.

![](images/windows-putty-connect.png)

Enter `tscc.sdsc.edu` as the host (not `lyorn.idyll.org` as shown in the screenshot above). You may get a warning like the one below. This is completely normal the first time you log on to a new server. Click "Yes".

![](images/windows-putty-security-alert.png)

You'll get a terminal like the one below. Make sure to log in with your "`ucsd-train##`" username.

![](images/windows-putty-lyorn-login.png)

## 2. Organize your home directory
Phew, we made it to TSCC!

Create the base storage location for your code development (or just use your home area):
```
mkdir code
mkdir notebooks
mkdir data
ln -s /oasis/tscc/scratch/$USER $HOME/scratch
```

Now look at what's in your home directory with `ls -l` (the `-l` stands for "long listing").

The output should look like this:

```
total 10
drwxr-xr-x 2 ucsd-train12 biom262-group  2 Jan  4 11:57 code
drwxr-xr-x 2 ucsd-train12 biom262-group  2 Jan  4 11:57 data
drwxr-xr-x 2 ucsd-train12 biom262-group  2 Jan  4 11:57 notebooks
lrwxrwxrwx 1 ucsd-train12 biom262-group 32 Jan  4 11:57 scratch -> /oasis/tscc/scratch/ucsd-train12
```
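
If you ever need to repeat this setup, plain `mkdir` and `ln -s` complain when the targets already exist. The following is a minimal re-runnable sketch of the same layout. It uses a temporary directory (`demo_home`, an assumption for illustration) in place of `$HOME` so that it can be tried anywhere; the scratch path is the one used above.

```shell
# demo_home stands in for $HOME so this can be tried anywhere (assumption).
demo_home="$(mktemp -d)"
user_name="${USER:-$(id -un)}"

# -p creates missing directories and stays quiet if they already exist.
mkdir -p "$demo_home/code" "$demo_home/notebooks" "$demo_home/data"

# Only create the link if nothing named "scratch" is there yet.
if [ ! -e "$demo_home/scratch" ] && [ ! -L "$demo_home/scratch" ]; then
    ln -s "/oasis/tscc/scratch/$user_name" "$demo_home/scratch"
fi

ls -l "$demo_home"
```

Off the cluster the `scratch` link will point at a path that does not exist; that is fine for the demo. Remove the demo directory with `rm -rf "$demo_home"` when you are done.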

## 3. Environment Variables and your Bash Profile

The commands you type are interpreted by a shell called [BASH](https://en.wikipedia.org/wiki/Bash_(Unix_shell)), which stands for "Bourne-again shell": the Bourne shell was an earlier shell, and BASH was written as an improved replacement.

Set a BASH environment variable
```
export STR="hello world"
```

Access a variable
* `echo $STR`

The most important environment variable is `$PATH`. The folders listed in it are searched automatically when the shell looks for executable tools, for example via auto-complete or `which`:
* `echo $PATH`
* `which programname.sh`

Customize your BASH profile by editing your `~/.bashrc` file with `nano`:

```
nano ~/.bashrc
```
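
One common customization is extending `$PATH`. The sketch below illustrates the lookup with a throwaway directory: it creates a hypothetical tool (`hello_tool.sh`, a made-up name), prepends its directory to `$PATH`, and asks the shell where it finds the tool. In a real `~/.bashrc` the `export PATH=...` line would name a directory of your own, such as one under `~/code`.

```shell
# Hypothetical tool directory and script, created only for this demo.
tool_dir="$(mktemp -d)"
printf '#!/bin/bash\necho "hello from my tool"\n' > "$tool_dir/hello_tool.sh"
chmod +x "$tool_dir/hello_tool.sh"

# Prepend the directory; in ~/.bashrc this line would use a real path.
export PATH="$tool_dir:$PATH"

# command -v reports where the shell finds the tool (similar to `which`).
found="$(command -v hello_tool.sh)"
echo "$found"
hello_tool.sh
```

Prepending means your directory is searched first; appending (`"$PATH:$tool_dir"`) would let system tools with the same name win instead.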


Your `~/.bashrc` is read each time you log in to TSCC. To apply changes to your current session without logging out, run:
* `source ~/.bashrc`

A convenient line to add to your `.bashrc` is an alias that produces a "long listing" of files with a few keystrokes:

```
alias ll='ls -lha'
```

To add this to your `.bashrc`, type `nano ~/.bashrc` (or use another editor that you prefer), which opens a text-only editor, and add the line above. Then save and quit.
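
If you would rather add the line from the command line, a minimal sketch is shown below. It appends the alias only when an identical line is not already present, so running it twice does not create duplicates. The helper name `add_line_once` is hypothetical, and `bashrc_file` points at a temporary file rather than your real `~/.bashrc` so the demo is harmless.

```shell
# bashrc_file stands in for ~/.bashrc so the demo is harmless (assumption).
bashrc_file="$(mktemp)"
alias_line="alias ll='ls -lha'"

# Hypothetical helper: append line $2 to file $1 unless it is already there.
add_line_once() {
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

add_line_once "$bashrc_file" "$alias_line"
add_line_once "$bashrc_file" "$alias_line"   # second call changes nothing

cat "$bashrc_file"
```

To use it for real, set `bashrc_file="$HOME/.bashrc"` and then run `source ~/.bashrc`.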

( *optional* ) Additional details on BASH profile customization
* https://wiki.archlinux.org/index.php/Bash

## 4. Submitting jobs to TSCC compute cluster

When you log in to TSCC, you are connected to a "login node".  When executing a task, you should always use an "execution node".  
* More details in the TSCC user guide: http://rci.ucsd.edu/computing/index.html

Write a small script to test with. The following command will create `test_script.sh` if it doesn't exist already. Write `echo "hello I am a test"` into the file.

```
$ nano test_script.sh
```

Now you should be able to look at the file with `cat` and see the contents you just wrote.

```
$ cat test_script.sh
echo "hello I am a test"
```

To submit a script that you wrote (in this case `test_script.sh`) to TSCC with a 10-minute time limit, use `qsub` (time is given as `hours:minutes:seconds`):

```
$ qsub -q hotel -l nodes=1:ppn=1 -l walltime=0:10:00 test_script.sh
3962194.tscc-mgr.local
```
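
The options passed to `qsub` can also be stored inside the script itself as `#PBS` comment lines, which TORQUE/PBS-style schedulers read at submission time. The following is a minimal sketch that assumes the same queue and limits as the command above; the file name `test_job.sh` and the job name `hello_test` are illustrative. Because `#PBS` lines are ordinary comments to `bash`, the script can be dry-run on any machine before it is submitted.

```shell
# Write a job script whose header carries the qsub options (sketch).
cat > test_job.sh <<'EOF'
#!/bin/bash
#PBS -N hello_test
#PBS -q hotel
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:10:00
echo "hello I am a test"
EOF

# Dry run on the current machine; bash ignores the #PBS comment lines.
local_output="$(bash test_job.sh)"
echo "$local_output"

# On TSCC the script would then be submitted with:  qsub test_job.sh
```

Keeping the options in the file means the exact resource request is recorded alongside the commands it applies to.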

### Wait, what just happened??

What happened is this: you typed some keystrokes on your laptop, which were sent to the "head node" on TSCC (the first node you land on when you log in; it appears as `tscc-login1` or `tscc-login2`). A "node" is equivalent to a single computer.

![](images/head_node.png)

Then, the head node told the job scheduler to run your script, `test_script.sh`, for a maximum of 10 minutes, on one node (one computer) and with one processor (`ppn` sets how many cores of that computer to use; the maximum is 16, but be nice and request 8 or fewer). In general, increase `ppn` (processors per node) first, rather than `nodes`. Your program probably shares memory between its processes, which is harder to do across computers, so spreading a job over several nodes tends to make it slower.

The nice thing about supercomputers is that you can submit a job, close your laptop, get on an airplane, and it will still be running! 


#### Well what about the input and output?

Since we didn't specify a name for the job, two files will be created from the results of our script:

```
test_script.sh.o3962194  # o = output
test_script.sh.e3962194  # e = error
```

The `.o` file captures the output sent to `stdout` and the `.e` file captures the output sent to `stderr`. In this case the `.e` file should be empty:

```
$ cat test_script.sh.e3962194
```

(No output)

And the `.o` file should contain the text we wrote and the node this script was run on:

```
$ cat test_script.sh.o3962194
hello I am a test
Nodes:        tscc-0-25
```
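
The output file names follow directly from the job ID that `qsub` printed. As a small sketch, the snippet below rebuilds both names from the example job ID using shell parameter expansion; the variable names are illustrative.

```shell
# Job ID as printed by qsub in the example above.
job_id="3962194.tscc-mgr.local"
script_name="test_script.sh"

# Strip everything from the first "." onward to keep only the number.
job_num="${job_id%%.*}"

stdout_file="${script_name}.o${job_num}"
stderr_file="${script_name}.e${job_num}"
echo "$stdout_file"
echo "$stderr_file"
```

On TSCC the first line could instead capture a live ID with `job_id="$(qsub ... test_script.sh)"`, since `qsub` prints the ID on `stdout`.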

### Additional options for `qsub`

#### Interactive jobs

To submit interactive jobs (which you will need for running Jupyter notebooks), do:
* `qsub -I -q hotel -l nodes=1:ppn=1 -l walltime=0:30:00`

#### Different Queues

To submit to the `home-scrm` queue, name it with `-q` and add `-W group_list=scrm-group` to your `qsub` command:
* `qsub -I -l walltime=0:30:00 -q home-scrm -W group_list=scrm-group`


#### Check job status
Check the status of your jobs:
* `qstat -u $USER`

#### Check status of array jobs

To check the status of your array jobs, add `-t` to see the status of the individual array pieces:
* `qstat -t -u $USER`

Example output:

```
tscc-mgr.local: 
                                                                                  Req'd    Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
3924211.tscc-mgr.local  obotvinnik  home-scr STDIN             21407     1      8    --  168:00:00 R 133:31:05
```

#### Kill jobs

Kill a single job by its ID:
* `qdel 2006527`

Kill an array job:
* `qdel 2006527[]`

Kill all your jobs:
* `qdel $(qselect -u $USER)`

## 5. Which queue do I submit to? (check status of queues)

**Check the status of the queue** (so you know which queues to NOT submit to!)

Example output is:  

    $ qstat -q

    server: tscc-mgr.local

    Queue            Memory CPU Time Walltime Node  Run Que Lm  State
    ---------------- ------ -------- -------- ----  --- --- --  -----
    home-dkeres        --      --       --      --    2   0 --   E R
    home-komunjer      --      --       --      --    0   0 --   E R
    home-ong           --      --       --      --    2   0 --   E R
    home-tg            --      --       --      --    0   0 --   E R
    home-yeo           --      --       --      --    3   1 --   E R
    home-visres        --      --       --      --    0   0 --   E R
    home-mccammon      --      --       --      --   15  29 --   E R
    home-scrm          --      --       --      --    1   0 --   E R
    hotel              --      --    168:00:0   --  232  26 --   E R
    home-k4zhang       --      --       --      --    0   0 --   E R
    home-kkey          --      --       --      --    0   0 --   E R
    home-kyang         --      --       --      --    2   1 --   E R
    home-jsebat        --      --       --      --    1   0 --   E R
    pdafm              --      --    72:00:00   --    1   0 --   E R
    condo              --      --    08:00:00   --   18   6 --   E R
    gpu-hotel          --      --    336:00:0   --    0   0 --   E R
    glean              --      --       --      --   24  75 --   E R
    gpu-condo          --      --    08:00:00   --   16  36 --   E R
    home-fpaesani      --      --       --      --    4   2 --   E R
    home-builder       --      --       --      --    0   0 --   E R
    home               --      --       --      --    0   0 --   E R
    home-mgilson       --      --       --      --    0   4 --   E R
    home-eallen        --      --       --      --    0   0 --   E R
                                                   ----- -----
                                                     321   180
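
With this many queues, it can help to sort them by the number of waiting jobs. The sketch below does that with `awk` and `sort`, using a few rows copied from the sample output above so that it runs anywhere. It assumes the column layout shown, where the seventh whitespace-separated field of a data row is `Que`.

```shell
# A few rows copied from the sample `qstat -q` output above.
sample_output="$(cat <<'EOF'
Queue            Memory CPU Time Walltime Node  Run Que Lm  State
---------------- ------ -------- -------- ----  --- --- --  -----
home-yeo           --      --       --      --    3   1 --   E R
home-scrm          --      --       --      --    1   0 --   E R
hotel              --      --    168:00:0   --  232  26 --   E R
condo              --      --    08:00:00   --   18   6 --   E R
glean              --      --       --      --   24  75 --   E R
EOF
)"

# Field 7 is "Que" (jobs waiting); keep rows where it is a number,
# then sort so the queues with the fewest waiting jobs come first.
least_busy="$(printf '%s\n' "$sample_output" |
    awk '$7 ~ /^[0-9]+$/ {print $7, $1}' | sort -n)"
echo "$least_busy"
```

On TSCC the same filter can be fed from the live command: `qstat -q | awk '$7 ~ /^[0-9]+$/ {print $7, $1}' | sort -n`. A short wait does not mean you are allowed to use a queue; the `home-*` queues in particular belong to specific groups.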

### Show available processors

Use `showbf` to see how many processors are free and for how long:

```
$ showbf
backfill window (user: 'ucsd-train12' group: 'biom262-group' partition: ALL) Mon Jan  4 11:51:55

1258 procs available for       6:41:01
1246 procs available for       6:46:49
1234 procs available for       6:47:55
1222 procs available for       6:58:56
1210 procs available for       7:03:56
1198 procs available for       7:07:21
1197 procs available for       7:53:57
1196 procs available for    2:22:28:47
1189 procs available for    3:21:36:13
1181 procs available for    3:21:36:37
1171 procs available for    7:18:01:48
1169 procs available for    7:18:02:38
1168 procs available for   11:13:35:19
1152 procs available for   11:13:38:46
1151 procs available for   11:13:39:00
1150 procs available for   11:13:39:06
1149 procs available for   11:13:39:08
1148 procs available for   11:13:39:20
1146 procs available for   12:16:17:21
1145 procs available for   12:16:24:43
1144 procs available for   12:19:30:54
1128 procs available for   12:19:32:45
1112 procs available for   12:19:47:11
1097 procs available for   13:02:59:20
1095 procs available for   18:00:46:12
1085 procs available for   18:00:52:42
1073 procs available for   18:01:18:11
1061 procs available for   19:08:20:06
1059 procs available for   32:23:59:07
1055 procs available for   39:06:02:29
1051 procs available for   39:08:29:56
1047 procs available with no timelimit
```
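
To read this table for a specific job, find the first row whose time window is at least as long as the walltime you plan to request. The sketch below does this for a few rows copied from the output above. It assumes the windows are written as `H:MM:SS` or `D:HH:MM:SS`, and `need_hours` is a hypothetical value chosen for the example.

```shell
# A few rows copied from the sample `showbf` output above.
sample_showbf="$(cat <<'EOF'
1258 procs available for       6:41:01
1197 procs available for       7:53:57
1196 procs available for    2:22:28:47
1047 procs available with no timelimit
EOF
)"

need_hours=24   # hypothetical walltime you plan to request

# Convert each window to seconds (assuming H:MM:SS or D:HH:MM:SS) and
# print the first, i.e. largest, processor count whose window is long enough.
procs="$(printf '%s\n' "$sample_showbf" | awk -v need="$((need_hours * 3600))" '
    $4 == "for" {
        n = split($5, t, ":")
        if (n == 4) {
            secs = ((t[1] * 24 + t[2]) * 60 + t[3]) * 60 + t[4]
        } else {
            secs = (t[1] * 60 + t[2]) * 60 + t[3]
        }
        if (secs >= need) { print $1; exit }
    }
    $4 == "with" { print $1; exit }
')"
echo "$procs"
```

For the sample rows and a 24-hour request, the first window that is long enough is `2:22:28:47`, so the sketch reports 1196 processors.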

Show the specs of all nodes (only the first 20 lines are shown for brevity):
```
$ pbsnodes -a | head -n 20
tscc-0-0
     state = job-exclusive
     np = 16
     properties = rack0,ib,ibswitch1,mem64,hotel-node,ibgroup0,sandy
     ntype = cluster
     jobs = 0/3939246.tscc-mgr.local,1/3939246.tscc-mgr.local,2/3939246.tscc-mgr.local,3/3939246.tscc-mgr.local,4/3939246.tscc-mgr.local,5/3939246.tscc-mgr.local,6/3939246.tscc-mgr.local,7/3939246.tscc-mgr.local,8/3939246.tscc-mgr.local,9/3939246.tscc-mgr.local,10/3939246.tscc-mgr.local,11/3939246.tscc-mgr.local,12/3939246.tscc-mgr.local,13/3939246.tscc-mgr.local,14/3939246.tscc-mgr.local,15/3939246.tscc-mgr.local
     status = rectime=1451937165,varattr=,jobs=3939246.tscc-mgr.local,state=free,netload=326963399719,gres=,loadave=1.00,ncpus=16,physmem=66068376kb,availmem=62016840kb,totmem=68116372kb,idletime=834178,nusers=1,nsessions=1,sessions=25478,uname=Linux tscc-0-0.sdsc.edu 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003

tscc-0-1
     state = free
     np = 16
     properties = rack0,ib,ibswitch1,mem64,hotel-node,ibgroup0,sandy
     ntype = cluster
     jobs = 0/3940449[3162].tscc-mgr.local,1/3940449[3162].tscc-mgr.local,2/3940449[3162].tscc-mgr.local,3/3940449[3162].tscc-mgr.local,4/3940449[3162].tscc-mgr.local,5/3940449[3162].tscc-mgr.local,6/3940449[3162].tscc-mgr.local,7/3940449[3162].tscc-mgr.local
     status = rectime=1451937126,varattr=,jobs=3940449[3162].tscc-mgr.local,state=free,netload=25069731960580,gres=,loadave=15.57,ncpus=16,physmem=66068376kb,availmem=52672300kb,totmem=68116372kb,idletime=10776,nusers=1,nsessions=1,sessions=60315,uname=Linux tscc-0-1.sdsc.edu 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003
```
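
The full `pbsnodes` records are hard to scan, so a one-line summary per node can be easier to read. The sketch below condenses a trimmed copy of the two records above into "node, state, processor count". It assumes the layout shown: an unindented node name followed by indented `key = value` lines.

```shell
# Trimmed copy of the two node records shown above.
sample_nodes="$(cat <<'EOF'
tscc-0-0
     state = job-exclusive
     np = 16
     ntype = cluster

tscc-0-1
     state = free
     np = 16
     ntype = cluster
EOF
)"

# Unindented lines start a node record; indented "key = value" lines
# belong to the most recent node.
node_summary="$(printf '%s\n' "$sample_nodes" | awk '
    /^[^ ]/       { node = $1 }
    $1 == "state" { state = $3 }
    $1 == "np"    { print node, state, $3 }
')"
echo "$node_summary"
```

On TSCC the same `awk` program can read the live data with `pbsnodes -a | awk '...'`.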

* * *

## Additional Resources

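* The `qsub`, `qstat`, `qdel`, and `pbsnodes` commands in this guide are TORQUE/PBS-style scheduler commands. [SLURM](http://slurm.schedmd.com/) is a different scheduler with its own commands (such as `sbatch` and `squeue`), so check the official guides below to see which one TSCC currently uses.
* Official guides from the San Diego Supercomputer Center (SDSC):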
    * [TSCC Quick Start Guide](http://www.sdsc.edu/support/user_guides/tscc-quick-start.html) explains how to write all submit parameters into a single script
    * [TSCC User Guide](http://www.sdsc.edu/support/user_guides/tscc.html)