# Opening Jupyter Notebooks on TSCC

[Jupyter](http://jupyter.org/) notebooks are a great tool for keeping track of your data analysis workflow. You can load your results, manipulate them, make pretty figures, and export your final data and figures to a file, all in one place!

Jupyter was installed by default when we downloaded Anaconda, so no further installation is necessary. You can see the Jupyter executables in your Anaconda bin directory.

    ls ~/anaconda2/bin
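
To narrow that listing to the Jupyter executables only, a filter along these lines may help. This is a minimal sketch: `list_jupyter_tools` is a hypothetical helper name, and the default path assumes Anaconda lives in `~/anaconda2` as in the listing above.

```shell
# Hypothetical helper: list only the entries whose names start with "jupyter".
# The default directory assumes Anaconda is installed in ~/anaconda2.
list_jupyter_tools() {
    dir="${1:-$HOME/anaconda2/bin}"
    ls "$dir" 2>/dev/null | grep '^jupyter'
}

list_jupyter_tools || echo "no jupyter executables found"
```

Pass a different directory as the first argument if your Anaconda installation lives elsewhere.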
    

**0) Identify which login node you are on.** Pay attention to the number after "login" on your command line. This is your login node. TSCC has two login nodes, and you are randomly assigned one when logging in. However, we need to tunnel through the same login node, so this number is important for the later steps. In this example, I am on login node 1.

    [ucsd-train01@tscc-login1 ~]$  
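
If you would rather not read the number off the prompt, it can be pulled out of the host name. The sketch below assumes login hosts are named like the one in the prompt above (`tscc-login` followed by digits); `login_node_number` is a hypothetical helper name.

```shell
# Hypothetical helper: print the digits that follow "login" in a host name.
# Prints nothing if the name does not contain "login<digits>".
login_node_number() {
    printf '%s\n' "$1" | sed -n 's/.*login\([0-9][0-9]*\).*/\1/p'
}

login_node_number "tscc-login1"   # host name taken from the prompt above
```

On the cluster you would likely pass `"$(hostname)"` instead of the literal string, assuming `hostname` reports the same name that the prompt shows.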

**1) Submit an interactive job to obtain your own node.** Because some of the packages that we will be running are somewhat memory intensive, we are going to get in the habit of obtaining an interactive node before opening Jupyter notebooks. This lets us "reserve" a node ahead of time and run tasks interactively without having to request compute resources for each job. The following requests 1 compute node (4 processors) for 4 hours:

    qsub -I -l nodes=1:ppn=4 -l walltime=4:00:00 -q home-yeo 
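
To make it clearer which part of that request controls what, here is the same command assembled from named variables. This is only a sketch: the variable names and the `qsub_command` helper are made up for illustration, and the helper prints the command instead of submitting it.

```shell
# Each variable maps onto one part of the qsub request shown above.
NODES=1            # number of compute nodes
PPN=4              # processors per node
WALLTIME=4:00:00   # hours:minutes:seconds
QUEUE=home-yeo     # queue name used in the command above

# Hypothetical helper: print the command rather than submit it.
qsub_command() {
    echo "qsub -I -l nodes=${NODES}:ppn=${PPN} -l walltime=${WALLTIME} -q ${QUEUE}"
}

qsub_command
```

Changing `WALLTIME` to `2:00:00`, for example, would print a request for a two-hour session.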
    
Wait to be assigned a node and for your job to start. Once your requested node is obtained, your display will look something like this:

    qsub: waiting for job 11846292.tscc-mgr.local to start
    qsub: job 11846292.tscc-mgr.local ready

    [ucsd-train01@tscc-0-25 ~]$ 

This means that you were successful in your request and will now be able to launch Jupyter.
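
Before launching Jupyter, it can be worth confirming that the shell really has moved off the login node. The sketch below classifies a host name using only the naming pattern visible in the two prompts above (login hosts contain "login"); `node_kind` is a hypothetical helper, and the pattern is an assumption about TSCC host names.

```shell
# Hypothetical helper: classify a host name by the pattern seen in the prompts above.
node_kind() {
    case "$1" in
        *login*) echo "login" ;;
        *)       echo "compute" ;;
    esac
}

node_kind "tscc-login1"   # prints: login
node_kind "tscc-0-25"     # prints: compute
```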


**2) To start a notebook, run the following command on TSCC.** Replace #### with a random 4-digit number between 2000 and 9999. Do a good job of picking randomly! If anyone else is using this number, your notebook will not load. Add the & sign at the end of the line so the command runs in the background. Without it, you will not be able to return to the command line while the notebook is running in this window.

    jupyter notebook --no-browser --port #### &
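
Picking the number by hand works, but the shell can also do it. The following is a minimal sketch that assumes bash (it relies on `RANDOM` and on bash's `/dev/tcp` redirection); `pick_port` and `port_in_use` are hypothetical helper names. The check only sees ports on the machine where it runs, so a clash on the login node is still possible.

```shell
# Hypothetical helper: print a random port in the 2000-9999 range.
pick_port() {
    # RANDOM is a bash variable that yields an integer from 0 to 32767.
    echo $(( RANDOM % 8000 + 2000 ))
}

# Hypothetical helper: succeed if something on this host already accepts
# connections on the given port (uses bash's /dev/tcp redirection).
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Keep drawing until a port is found that nothing is listening on locally.
port=$(pick_port)
while port_in_use "$port"; do
    port=$(pick_port)
done
echo "$port"
```

The printed number can then be used in place of #### in the command above and in the tunnel commands that follow.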
    
Wait a minute for the following to appear on your screen:

    [1] 40110
    [ucsd-train01@tscc-login1 ~]$ [W 12:06:56.912 NotebookApp] Unrecognized JSON config file version, assuming version 1
    [I 12:06:57.957 NotebookApp] [nb_conda_kernels] enabled, 2 kernels found
    [I 12:06:59.812 NotebookApp] ✓ nbpresent HTML export ENABLED
    [W 12:06:59.813 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named nbbrowserpdf.exporters.pdf
    [I 12:06:59.837 NotebookApp] [nb_conda] enabled
    [I 12:07:00.201 NotebookApp] [nb_anacondacloud] enabled
    [I 12:07:00.222 NotebookApp] Serving notebooks from local directory: /home/ucsd-train01
    [I 12:07:00.222 NotebookApp] 0 active kernels 
    [I 12:07:00.222 NotebookApp] The Jupyter Notebook is running at: http://localhost:6221/
    [I 12:07:00.222 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
    


**3) Wait until your login token appears on the screen, then press enter to return to the command line.** 
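
The address you need is the `http://localhost:...` part of the startup log. As a sketch, assuming the log text is available on standard input, a filter like the hypothetical `notebook_url` below picks it out.

```shell
# Hypothetical helper: print the first localhost URL found in log text on stdin.
notebook_url() {
    grep -oE 'http://localhost:[0-9]+/[^[:space:]]*' | head -n 1
}

# Sample line copied from the startup log above.
echo '[I 12:07:00.222 NotebookApp] The Jupyter Notebook is running at: http://localhost:6221/' | notebook_url
```

If the log has already scrolled away, running `jupyter notebook list` should reprint the address of each running server.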


**4) Set up a tunnel FROM TSCC to your local machine.** This step is absolutely necessary when initiating Jupyter from an interactive job. Enter the following command, replacing #### with your port number and # with the tscc login node you are logged in on (we found this in step 0):

    ssh -NR ####:localhost:#### tscc-login# &
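
To see what the filled-in command looks like, the sketch below substitutes example values for the placeholders and prints the result. `reverse_tunnel_command` is a hypothetical helper; the port and node number are the example values from the log and prompt earlier in this guide, not values you should reuse.

```shell
# Hypothetical helper: print the reverse-tunnel command for a port and login node.
#   -N  open the tunnel without running a remote command
#   -R  forward the port on the login node back to the same port on this node
reverse_tunnel_command() {
    echo "ssh -NR ${1}:localhost:${1} tscc-login${2} &"
}

reverse_tunnel_command 6221 1   # example port and login node number
```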

**5) Now move to a new tab on your local machine (not TSCC)**

**MAC** 

We are going to tunnel our connection through our local laptop in order to view Jupyter notebooks in a web browser. Remember, TSCC does not have a web interface so we have to take this extra step. Run the following command:

    ssh -NL ####:localhost:#### ucsd-train##@tscc-login#.sdsc.edu

There are a few things that will be specific to you. 1) The 4-digit numbers should be the same port number that you chose above. 2) The two numbers after ucsd-train should be the numbers you were assigned as your login username. 3) The number after login should be the login node you found in step 0.

You will be prompted to enter your password. Do that to continue.
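
As an illustration of the three substitutions, the sketch below fills them in and prints the resulting command. `local_tunnel_command` is a hypothetical helper, and the arguments are the example port, username, and login node that appear earlier in this guide.

```shell
# Hypothetical helper: print the local tunnel command.
#   $1 = port, $2 = username, $3 = login node number
#   -N  open the tunnel without running a remote command
#   -L  forward the local port to the same port on the login node
local_tunnel_command() {
    echo "ssh -NL ${1}:localhost:${1} ${2}@tscc-login${3}.sdsc.edu"
}

local_tunnel_command 6221 ucsd-train01 1
```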

**WINDOWS** 

Step 1: Create a new PuTTY session.
So that you only need to have one PuTTY session open, we'll make a new TSCC session. Create one with ucsd-train##@tscc-login#.sdsc.edu, and call the session "TSCC Jupyter".

Step 2: Add your private key and allow forwarding
Go to Connection > SSH > Auth > Load your private key file

Step 3: Add a tunnel
Go to "Connection > SSH > Tunnels". Then:
Click the checkbox next to "Local ports accept connections from other hosts"
Add your #### for your source port
Add localhost:#### for your Destination
Click "Local"
Click "Add"

Step 4: Save your settings!
So you don't have to do this every time... Save your settings! Go all the way back to the "Session" window and click "Save". Remember to save this with a different name than your normal login information. Maybe "tscc_jupyter".

Step 5: Click open and continue through the login information. 

**6) Open a web browser.** In the address bar, paste the entire URL, including the token, that was printed on TSCC:

    http://localhost:4253/?token=7afd94d117fb2dec855e562a463769e42fcd75c7b21f87de&token=7afd94d117fb2dec855e562a463769e42fcd75c7b21f87de
    
**7) You have now successfully started a Jupyter notebook!** You should be able to see your TSCC directory. Make a new folder in your home directory called jupyter_notebooks (either in this interface, or on the command line with mkdir). Move into that folder to start a new notebook. You can confirm that this notebook is running on TSCC with the following command, replacing username with your own:

    ps -u username
    
For example, my output looks like this:

      PID TTY          TIME CMD
     4758 ?        00:00:00 sshd
     4759 pts/108  00:00:00 bash
    24349 ?        00:00:00 sshd
    24350 pts/298  00:00:00 bash
    40110 pts/136  00:00:01 jupyter-noteboo
    40400 ?        00:00:02 sshd
    40401 pts/209  00:00:00 bash
    51978 pts/136  00:00:00 ps
    60325 ?        00:00:00 sshd
    60326 pts/136  00:00:00 bash
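
Rather than scanning the table by eye, the PID can be filtered out of the `ps` output. This is a sketch only: `notebook_pids` is a hypothetical helper, and it assumes the four-column layout shown above, with the command name in the last column.

```shell
# Hypothetical helper: read `ps -u` output on stdin and print the PID of every
# row whose CMD column starts with "jupyter".
notebook_pids() {
    awk '$4 ~ /^jupyter/ { print $1 }'
}

# Sample rows copied from the output above.
notebook_pids <<'EOF'
  PID TTY          TIME CMD
40110 pts/136  00:00:01 jupyter-noteboo
40401 pts/209  00:00:00 bash
EOF
```

On TSCC you would pipe the real listing through it, for example `ps -u "$USER" | notebook_pids`.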
    
If I want to kill my jupyter notebook I can do that with:

    kill -9 40110
    
Notice 40110 is the PID of the notebook.
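
`kill -9` cannot be caught by the program, so the server gets no chance to shut down cleanly. A gentler pattern is to send the default signal first and force the issue only if the process is still alive after a short wait. The sketch below shows that pattern; `stop_process` is a hypothetical helper, and it is demonstrated on a stand-in `sleep` process rather than a real notebook server.

```shell
# Hypothetical helper: ask a process to exit, and force it only if it is still
# alive after a grace period in seconds (default 5).
stop_process() {
    pid="$1"
    grace="${2:-5}"
    kill "$pid" 2>/dev/null || return 0       # SIGTERM; nothing to do if already gone
    while [ "$grace" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        grace=$((grace - 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null            # SIGKILL as a last resort
    fi
    return 0
}

# Demonstration on a stand-in process rather than a real notebook server.
sleep 30 &
stop_process "$!" 2
```

For the notebook in the example above, the call would be `stop_process 40110`.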

Alternatively, you can wait for your interactive job to reach the end of its walltime, which will shut everything down automatically.
