# Opening Jupyter Notebooks on TSCC

[Jupyter](http://jupyter.org/) notebooks are a great tool to keep track of the workflow for your data analysis. You can load up your results, manipulate them, make pretty figures, and export your final data and figures to a file, all in one place!

Jupyter was installed along with miniconda [here](https://github.com/biom262/cmm262-2020/blob/master/Module_2/Notebooks/Downloading_Miniconda.ipynb), so if you completed that step, nothing more needs to be installed. You can see the Jupyter executables in your miniconda `bin` directory.

```
$ ls ~/miniconda3/bin
```
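To check for Jupyter specifically rather than scanning the whole listing, a small sketch like the following may help. It assumes miniconda was installed to `~/miniconda3` as above; the `have_tool` helper is a hypothetical name made up for this illustration.

```shell
# Hypothetical helper: succeed if a command is found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Assumes miniconda lives in ~/miniconda3, as in the listing above.
if have_tool jupyter; then
    echo "jupyter found at: $(command -v jupyter)"
elif [ -x "$HOME/miniconda3/bin/jupyter" ]; then
    echo "jupyter is installed, but ~/miniconda3/bin is not on your PATH"
else
    echo "jupyter not found; revisit the miniconda installation notebook"
fi
```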

## Before we do anything, let's grab an interactive job so that we are not all computing on the login node:
```
$ qsub -I -l nodes=1:ppn=2 -l walltime=03:00:00
```

This may take a couple of minutes, so be patient!

Let's break down the command:

- `qsub`: Request a job on the cluster.
- `-I`: Make the job interactive. Once the cluster allocates your job, you get a shell prompt on the node it assigned.
- `-l nodes=1:ppn=2`: Request 2 cores on a single node in the cluster.
- `-l walltime=03:00:00`: Request that this job runs for 3 hours. The job will automatically terminate 3 hours after it starts.
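If you expect to request jobs of different sizes, it can help to see the resource flags as parameters. The sketch below only builds and prints the command; the variable names (`NODES`, `PPN`, `WALLTIME`, `QSUB_CMD`) are assumptions chosen for this example, not anything TSCC requires.

```shell
# Assumed variable names; adjust the values to your needs.
NODES=1             # number of nodes
PPN=2               # cores (processors) per node
WALLTIME=03:00:00   # HH:MM:SS; the job ends when this runs out

QSUB_CMD="qsub -I -l nodes=${NODES}:ppn=${PPN} -l walltime=${WALLTIME}"

# Print the command for review; paste it at the login-node prompt to submit.
echo "$QSUB_CMD"
```

With the values shown, this prints the same command used in this section.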


## To start a notebook, on TSCC:


### 1. Run the following command: 

Replace the `####` at the end with a random number between 2000 and 9999. Do a good job of picking randomly! If someone else is already using your number, Jupyter will bump you to the next available higher number, so check the startup messages for the port it actually used. **Save this 4-digit number for Steps 4 and 5.**
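If you would rather not pick the number by hand, one possible approach is to let `awk` draw it. This is only a sketch; the variable name `PORT` is an assumption for this example.

```shell
# Draw a pseudo-random integer in the range 2000-9999 (inclusive).
# srand() seeds from the clock, so two runs within the same second
# return the same number.
PORT=$(awk 'BEGIN { srand(); print 2000 + int(rand() * 8000) }')
echo "Use port: $PORT"
```

You would then type the printed number wherever this guide shows `####`.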

**Recommended:** Add the '&' sign at the end of the line to allow this command to "run in the background". Without it, you will not be able to return to the command line while running a notebook in this window. If you choose not to run it in the background, just start a new tab in your terminal, re-login to TSCC, and run your commands there.

```
$ jupyter notebook --no-browser --port #### &
```   
  
Wait a minute for the following to appear on your screen:

    [1] 40110
    [ucsd-train01@tscc-4-55 ~]$ [W 12:06:56.912 NotebookApp] Unrecognized JSON config file version, assuming version 1
    [I 12:06:57.957 NotebookApp] [nb_conda_kernels] enabled, 2 kernels found
    [I 12:06:59.812 NotebookApp] ✓ nbpresent HTML export ENABLED
    [W 12:06:59.813 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named nbbrowserpdf.exporters.pdf
    [I 12:06:59.837 NotebookApp] [nb_conda] enabled
    [I 12:07:00.201 NotebookApp] [nb_anacondacloud] enabled
    [I 12:07:00.222 NotebookApp] Serving notebooks from local directory: /home/ucsd-train01
    [I 12:07:00.222 NotebookApp] 0 active kernels 
    [I 12:07:00.222 NotebookApp] The Jupyter Notebook is running at: http://localhost:6221/
    [I 12:07:00.222 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
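
The line to look for is the one ending in `http://localhost:####/`, because that is the port Jupyter actually used. As a rough sketch, the port can be pulled out of that line with `sed`; the variable names here are assumptions, and the sample line is copied from the messages above.

```shell
# Example line copied from the startup messages above.
LOG_LINE='[I 12:07:00.222 NotebookApp] The Jupyter Notebook is running at: http://localhost:6221/'

# Keep only the digits that follow "localhost:".
ACTUAL_PORT=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.*localhost:\([0-9][0-9]*\).*/\1/p')
echo "$ACTUAL_PORT"
```

On the cluster itself, `jupyter notebook list` should also print the address of any notebook server you have running, assuming the classic Notebook server shown in this log.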
    
### 2. Then press enter to return to the command line. 
(Presumes that you are running jupyter notebook in the background.)

### 3. Find your compute node:
Look for the node your interactive job is running on. You can find this in your command line prompt. We need to tunnel to this same node, so the node name is important for the next step. For example, my prompt shows that I am running my interactive job on `tscc-4-55`.

    [ucsd-train01@tscc-4-55 ~]$ 

### 4. Now move to a new tab on your local machine *(not TSCC)*

Follow instructions for your OS:

**MAC** 
---

We are going to tunnel our connection through our local laptop in order to view Jupyter notebooks in a web browser. Remember, TSCC does not have a web interface so we have to take this extra step. Run the following command:

```
$ ssh -NL ####:localhost:#### ucsd-train##@tscc-#-##.sdsc.edu
```

Three things in this command will be specific to you:

1. Both 4-digit numbers should be the number you chose above.
2. The two digits after `ucsd-train` should be the ones in your assigned login username.
3. The `tscc-#-##` after the `@` symbol should be the specific node you found in step 3.

You may be prompted to enter your password. Do that to continue.
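To make the substitutions concrete, here is a sketch that fills in the template and prints the resulting command. All three values are placeholders borrowed from the examples in this guide, and the variable names are assumptions for this illustration; use your own port, username, and node.

```shell
# Placeholder values taken from the examples in this guide.
PORT=6221                 # the port Jupyter reported in step 1
TRAIN_USER=ucsd-train01   # your assigned login username
NODE=tscc-4-55            # the node you found in step 3

# -N: do not run a remote command, just forward the port.
# -L: forward local PORT to PORT on the remote node.
TUNNEL_CMD="ssh -NL ${PORT}:localhost:${PORT} ${TRAIN_USER}@${NODE}.sdsc.edu"
echo "$TUNNEL_CMD"
```

Because of `-N`, the real command prints nothing and appears to hang once connected. That is expected: leave the tab open for as long as you are using the notebook.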

**WINDOWS** 
---

Step 1: Create a new PuTTY session.
So that you only need to have one PuTTY session open, we'll make a new TSCC session. Create one with ucsd-train##@login_node.sdsc.edu, and call the session "TSCC Jupyter".

Step 2: Add your private key and allow forwarding.
Go to "Connection > SSH > Auth" and load your private key file.

Step 3: Add a tunnel.
Go to "Connection > SSH > Tunnels". Then:

- Click the checkbox next to "Local ports accept connections from other hosts"
- Enter your #### as the source port
- Enter localhost:#### as the destination
- Click "Local"
- Click "Add"

Step 4: Save your settings!
So you don't have to do this every time, save your settings! Go all the way back to the "Session" window and click "Save". Remember to save this under a different name than your normal login session, such as the "TSCC Jupyter" name from Step 1.

Step 5: Click "Open" and continue through the login prompts.

### 5. Open a web browser. 
In the address bar, enter the following address, using your own 4-digit number:

```
localhost:####
```

## Success! You have now started a Jupyter notebook! 

You can confirm that this notebook is running on TSCC with the following command:
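The command itself is not shown here. One check that would serve, offered as an assumption rather than the original step, is to print the machine name from a terminal opened inside Jupyter, or from a notebook cell by prefixing the command with `!` (for example `!uname -n`):

```shell
# Print the name of the machine this shell is running on.
# On TSCC this should match the node from step 3 (for example tscc-4-55),
# not the name of your laptop.
WHERE=$(uname -n)
echo "Running on: $WHERE"
```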

### 6. Starting a new or old notebook 
Play around with all the features of the notebooks that you see. We will work through these together initially. Notice that when you select "File > New Notebook", you can choose Python to open a new Python notebook.

### 7. Exiting Jupyter notebook. 
If you **did not** run Jupyter notebook in the background (i.e. you left off the '&' at the end of your command), use Control-C to stop the server and shut down all kernels (press it twice to skip confirmation).

If you ran Jupyter notebook in the background (i.e. included '&' at the end of your command), Control-C will not stop the server.
Instead, you'll need to 'kill' the Jupyter notebook process: first find its PID, then pass that PID to the `kill` command. To list your processes:

```
$ ps -u ucsd-trainXX
```

For example, my output looks like:

      PID TTY          TIME CMD
     4758 ?        00:00:00 sshd
     4759 pts/108  00:00:00 bash
    24349 ?        00:00:00 sshd
    24350 pts/298  00:00:00 bash
    40110 pts/136  00:00:01 jupyter-noteboo
    40400 ?        00:00:02 sshd
    40401 pts/209  00:00:00 bash
    51978 pts/136  00:00:00 ps
    60325 ?        00:00:00 sshd
    60326 pts/136  00:00:00 bash
    
To kill my Jupyter notebook, I can run:

```
$ kill -9 40110
```

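Notice that 40110 is the PID of the notebook: it is the `jupyter-noteboo` line in the `ps` output (long command names are cut off), and it is also the number the shell printed (`[1] 40110`) when the notebook was started in the background.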
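`kill -9` ends the process immediately, without giving it a chance to clean up. A gentler alternative, sketched below, sends the default signal (SIGTERM) first and falls back to `-9` only if the process is still there a few seconds later. The `stop_pid` helper is a hypothetical name for this illustration, not part of Jupyter or TSCC.

```shell
# Hypothetical helper: ask a process to exit, forcing it only if it lingers.
stop_pid() {
    pid=$1
    # Plain kill sends SIGTERM; if that fails, the process is already gone.
    kill "$pid" 2>/dev/null || return 0
    for i in 1 2 3 4 5; do
        # Signal 0 sends nothing; it only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
    done
    # Last resort: SIGKILL cannot be caught or ignored.
    kill -9 "$pid" 2>/dev/null || true
}

# Usage with the PID from the ps output above:
# stop_pid 40110
```

To narrow the `ps` listing to the line you need, `ps -u ucsd-trainXX | grep jupyter` shows only the Jupyter process.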

Note: If you do not exit the server, your files are all safe :) But you won't be able to re-enter Jupyter notebook with the same 4-digit number next time (you'll automatically get bumped to the next higher number available).