# Jupyter Notebook

[Jupyter](http://jupyter.org/) notebooks let code, text, images, and math all live together in harmony. This concept can be called "[Literate Programming](https://en.wikipedia.org/wiki/Literate_programming)", where all code that is written also has comments or an explanation of *why* it was written (which is not always obvious). In particular, the Jupyter notebook allows you to write some code, run (evaluate) it, and see the output all at once. This is called a "read-eval-print loop" or [REPL](https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop).


By the end of this notebook, you will have...

* Installed Anaconda Python and R on your TSCC account
* Created a `biom262`-specific environment that contains only the packages you need
* Started a Jupyter notebook server on TSCC
* Viewed your remotely hosted (i.e. living on TSCC) Jupyter notebooks on your personal laptop

* * *


## Log in to TSCC

```
ssh username@tscc.sdsc.edu
```



## Install Anaconda Python, R, and Jupyter to your account on TSCC

Check your python version:  
* `python -V`

Download the installer for the Anaconda Python/R package manager using `wget` (web-get). The link below is from the Anaconda [downloads page](https://www.continuum.io/downloads). This takes some time.

```
wget https://3230d63b5fc54e62148e-c95ac804525aac4b6dba79b00b39d1d3.ssl.cf1.rackcdn.com/Anaconda3-2.4.1-Linux-x86_64.sh
```

To install Anaconda, run the shell script with `bash` (this will take some time). It will ask you a series of questions; accept the default for each one by pressing enter.

```
bash Anaconda3-2.4.1-Linux-x86_64.sh
```

This has added the folder `~/anaconda3` to your home directory and added a line to `~/.bashrc` that puts it on your `$PATH`. However, the `$PATH` of your current shell has not been updated yet, so the terminal has no idea where this newfangled thing is. If you try to run any `conda` command, you'll get an error:

```
[ucsd-train12@tscc-login2 ~]$ conda --help
-bash: conda: command not found
```

To activate `conda`, use `source` on your `.bashrc`:

```
source ~/.bashrc
```
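To confirm that reloading `.bashrc` worked, you can ask the shell where it finds `conda`. The snippet below is a minimal sketch, not part of the original instructions; `on_path` is a hypothetical helper name, and it relies only on the standard `command -v` shell builtin.

```shell
# Report whether a command can be found on the current $PATH.
# (on_path is a hypothetical helper name used for illustration.)
on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found; try: source ~/.bashrc"
        return 1
    fi
}

# Check for conda; "|| true" keeps the shell going if it is missing.
on_path conda || true
```

If `conda` is still reported as missing after `source ~/.bashrc`, it is worth checking that the installer actually wrote a line mentioning `anaconda3` into `~/.bashrc`.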

## Let's make an environment


"Environments" are sandboxes where you can install specific versions of Python packages without them conflicting with other versions installed elsewhere. They're very helpful if you're testing your software package with different versions of R or Python but don't want to mess up your own installation.

For `biom262`, we'll want to install these packages:
* R
* Jupyter notebook
* [`pandas`](http://pandas.pydata.org) (for dataframes/datatables in Python)
* [`matplotlib`](http://matplotlib.org/) (plotting in Python)
* [`seaborn`](http://stanford.edu/~mwaskom/software/seaborn/) (nicer plots in Python)
* [`scikit-learn`](http://scikit-learn.org/) (machine learning in Python)

Create a `biom262`-specific `conda` environment. By specifying these packages, we're also specifying all their dependencies (the other packages that each of these requires).

```
conda create --name biom262 --channel https://conda.anaconda.org/r r r-irkernel jupyter pandas matplotlib seaborn scikit-learn
```

Here's that big command broken down:

* `conda` - the base command (like how `git` was the base command you used for git stuff). Every `conda` subcommand is actually `conda-subcommand` e.g. `conda-create` under the hood, but we use it with just the spaces for convenience.
* `create` - The conda subcommand to create an environment
* `--name biom262` - Name of the environment to create, in this case, `biom262` is the environment name
* `--channel https://conda.anaconda.org/r` - A "channel" is a URL to a folder that contains packages that you can install. Anaconda doesn't come with the R channel by default so we have to specify it here.
* `r r-irkernel jupyter pandas matplotlib seaborn scikit-learn` - The packages to install.
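
One way to see how the pieces in the breakdown fit together is to assemble the command from named parts and print it before running anything. This is only an illustrative sketch; the variable names `ENV_NAME`, `CHANNEL`, `PACKAGES`, and `create_cmd` are hypothetical and not something `conda` itself knows about.

```shell
# Hypothetical variable names; the values are the ones used in this tutorial.
ENV_NAME=biom262
CHANNEL=https://conda.anaconda.org/r
PACKAGES="r r-irkernel jupyter pandas matplotlib seaborn scikit-learn"

# Assemble the full command as a string so it can be inspected first.
create_cmd="conda create --name $ENV_NAME --channel $CHANNEL $PACKAGES"
echo "$create_cmd"
```

The printed line should match the `conda create` command shown earlier; changing `ENV_NAME` or `PACKAGES` is how you would adapt it for a different project.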
 
The output of `conda create` is quite long; it will look something like this:

```
Fetching package metadata: ......
Solving package specifications: ...............................................................
Package plan for installation in environment /home/ucsd-train12/anaconda3/envs/biom262:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    r-base64enc-0.1_3          |         r3.2.2_0          25 KB  r
    r-digest-0.6.8             |         r3.2.2_2          93 KB  r
    r-jsonlite-0.9.17          |         r3.2.2_0         927 KB  r
    r-magrittr-1.5             |         r3.2.2_1         154 KB  r
    r-repr-0.3                 |         r3.2.2_0          44 KB  r
    r-rzmq-0.7.7               |         r3.2.2_3          60 KB  r
    r-stringi-1.0_1            |         r3.2.2_0        10.7 MB  r
    r-uuid-0.1_2               |         r3.2.2_0          18 KB  r
    r-irdisplay-0.3            |         r3.2.2_0          23 KB  r
    r-stringr-1.0.0            |         r3.2.2_0          78 KB  r
    r-evaluate-0.8             |         r3.2.2_0          39 KB  r
    r-irkernel-0.5             |         r3.2.2_1          71 KB  r
    ------------------------------------------------------------
                                           Total:        12.2 MB

The following NEW packages will be INSTALLED:

    cairo:            1.12.18-6          defaults
    cycler:           0.9.0-py35_0       defaults
    decorator:        4.0.6-py35_0       defaults
    fontconfig:       2.11.1-5           defaults
    freetype:         2.5.5-0            defaults
    glib:             2.43.0-2           r       
    harfbuzz:         0.9.35-6           r       
    ipykernel:        4.1.1-py35_0       defaults
    ipython:          4.0.1-py35_0       defaults
    ipython_genutils: 0.1.0-py35_0       defaults
    ipywidgets:       4.1.0-py35_0       defaults
    jbig:             2.1-0              defaults
    jinja2:           2.8-py35_0         defaults
    jpeg:             8d-0               defaults
    jsonschema:       2.4.0-py35_0       defaults
    jupyter:          1.0.0-py35_1       defaults
    jupyter_client:   4.1.1-py35_0       defaults
    jupyter_console:  4.0.3-py35_0       defaults
    jupyter_core:     4.0.6-py35_0       defaults
    libffi:           3.0.13-0           defaults
    libgcc:           4.8.5-1            r       
    libgfortran:      1.0-0              defaults
    libpng:           1.6.17-0           defaults
    libsodium:        1.0.3-0            defaults
    libtiff:          4.0.6-1            defaults
    libxml2:          2.9.2-0            defaults
    markupsafe:       0.23-py35_0        defaults
    matplotlib:       1.5.0-np110py35_0  defaults
    mistune:          0.7.1-py35_0       defaults
    nbconvert:        4.1.0-py35_0       defaults
    nbformat:         4.0.1-py35_0       defaults
    ncurses:          5.9-4              r       
    notebook:         4.0.6-py35_0       defaults
    numpy:            1.10.2-py35_0      defaults
    openblas:         0.2.14-3           defaults
    openssl:          1.0.2d-0           defaults
    pandas:           0.17.1-np110py35_0 defaults
    pango:            1.36.8-3           r       
    path.py:          8.1.2-py35_1       defaults
    pcre:             8.31-0             defaults
    pexpect:          3.3-py35_0         defaults
    pickleshare:      0.5-py35_0         defaults
    pip:              7.1.2-py35_0       defaults
    pixman:           0.32.6-0           defaults
    ptyprocess:       0.5-py35_0         defaults
    pygments:         2.0.2-py35_0       defaults
    pyparsing:        2.0.3-py35_0       defaults
    pyqt:             4.11.4-py35_1      defaults
    python:           3.5.1-0            defaults
    python-dateutil:  2.4.2-py35_0       defaults
    pytz:             2015.7-py35_0      defaults
    pyzmq:            15.1.0-py35_0      defaults
    qt:               4.8.7-1            defaults
    qtconsole:        4.1.1-py35_0       defaults
    r:                3.2.2-0            r       
    r-base:           3.2.2-0            r       
    r-base64enc:      0.1_3-r3.2.2_0     r       
    r-boot:           1.3_17-r3.2.2_0    r       
    r-class:          7.3_14-r3.2.2_0    r       
    r-cluster:        2.0.3-r3.2.2_0     r       
    r-codetools:      0.2_14-r3.2.2_0    r       
    r-digest:         0.6.8-r3.2.2_2     r       
    r-evaluate:       0.8-r3.2.2_0       r       
    r-foreign:        0.8_66-r3.2.2_0    r       
    r-irdisplay:      0.3-r3.2.2_0       r       
    r-irkernel:       0.5-r3.2.2_1       r       
    r-jsonlite:       0.9.17-r3.2.2_0    r       
    r-kernsmooth:     2.23_15-r3.2.2_0   r       
    r-lattice:        0.20_33-r3.2.2_0   r       
    r-magrittr:       1.5-r3.2.2_1       r       
    r-mass:           7.3_45-r3.2.2_0    r       
    r-matrix:         1.2_2-r3.2.2_0     r       
    r-mgcv:           1.8_9-r3.2.2_0     r       
    r-nlme:           3.1_122-r3.2.2_0   r       
    r-nnet:           7.3_11-r3.2.2_0    r       
    r-recommended:    3.2.2-r3.2.2_0     r       
    r-repr:           0.3-r3.2.2_0       r       
    r-rpart:          4.1_10-r3.2.2_0    r       
    r-rzmq:           0.7.7-r3.2.2_3     r       
    r-spatial:        7.3_11-r3.2.2_0    r       
    r-stringi:        1.0_1-r3.2.2_0     r       
    r-stringr:        1.0.0-r3.2.2_0     r       
    r-survival:       2.38_3-r3.2.2_0    r       
    r-uuid:           0.1_2-r3.2.2_0     r       
    readline:         6.2-2              defaults
    scikit-learn:     0.17-np110py35_1   defaults
    scipy:            0.16.1-np110py35_0 defaults
    seaborn:          0.6.0-np110py35_0  defaults
    setuptools:       19.1.1-py35_0      defaults
    simplegeneric:    0.8.1-py35_0       defaults
    sip:              4.16.9-py35_0      defaults
    six:              1.10.0-py35_0      defaults
    sqlite:           3.8.4.1-1          defaults
    terminado:        0.5-py35_1         defaults
    tk:               8.5.18-0           defaults
    tornado:          4.3-py35_0         defaults
    traitlets:        4.0.0-py35_0       defaults
    wheel:            0.26.0-py35_1      defaults
    xz:               5.0.5-0            defaults
    zeromq:           4.1.3-0            defaults
    zlib:             1.2.8-0            defaults

Proceed ([y]/n)? y

Fetching packages ...
r-base64enc-0. 100% |######################################################| Time: 0:00:00 347.71 kB/s
r-digest-0.6.8 100% |######################################################| Time: 0:00:01  81.73 kB/s
r-jsonlite-0.9 100% |######################################################| Time: 0:00:18  52.38 kB/s
r-magrittr-1.5 100% |######################################################| Time: 0:00:00 436.33 kB/s
r-repr-0.3-r3. 100% |######################################################| Time: 0:00:00  75.39 kB/s
r-rzmq-0.7.7-r 100% |######################################################| Time: 0:00:00 294.29 kB/s
r-stringi-1.0_ 100% |######################################################| Time: 0:00:03   3.71 MB/s
r-uuid-0.1_2-r 100% |######################################################| Time: 0:00:00 247.64 kB/s
r-irdisplay-0. 100% |######################################################| Time: 0:00:00 329.61 kB/s
r-stringr-1.0. 100% |######################################################| Time: 0:00:00 366.92 kB/s
r-evaluate-0.8 100% |######################################################| Time: 0:00:00 381.25 kB/s
r-irkernel-0.5 100% |######################################################| Time: 0:00:00 342.46 kB/s
Extracting packages ...
[      COMPLETE      ]|#########################################################################| 100%
Linking packages ...
[      COMPLETE      ]|#########################################################################| 100%
#
# To activate this environment, use:
# $ source activate biom262
#
# To deactivate this environment, use:
# $ source deactivate
#

```

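Activate the environment by following the instructions at the end of the output above, i.e. run `source activate biom262`. Your prompt should then start with `(biom262)`.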
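
To double-check that the environment is really active, you can look at which `python` the shell finds first: inside an environment it should live under `envs/biom262`, the directory named in the package plan above. The following is a small sketch under that assumption; `in_env` is a hypothetical helper name.

```shell
# Succeed if the path given as $1 lies inside the conda environment named $2.
# (in_env is a hypothetical helper name used for illustration.)
in_env() {
    case "$1" in
        */envs/"$2"/*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Check whichever python is first on the current $PATH.
if in_env "$(command -v python)" biom262; then
    echo "python comes from the biom262 environment"
else
    echo "python is NOT from biom262; run: source activate biom262"
fi
```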

## My First Jupyter Notebook

Start a Jupyter notebook server, where "`####`" is a number of your choosing larger than 1024 (this is a unique "port" number, yes, like a port for boats and ships, that your notebook will run on). The `&` ("ampersand") at the end is important because it tells the Jupyter process to run in the background, so we can keep running other commands in the same terminal.

```
jupyter notebook --port #### &
```

This should create output like this:

```
(biom262)[ucsd-train12@tscc-login2 ~]$ [I 13:23:05.786 NotebookApp] Writing notebook server cookie secret to /home/ucsd-train12/.local/share/jupyter/runtime/notebook_cookie_secret
[I 13:23:06.291 NotebookApp] Serving notebooks from local directory: /home/ucsd-train12
[I 13:23:06.291 NotebookApp] 0 active kernels
[I 13:23:06.291 NotebookApp] The IPython Notebook is running at: http://localhost:7788/
[I 13:23:06.291 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

```
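
If you are unsure which port number to use, one option is to draw it at random, which also makes it less likely that two people on the same login node pick the same one. The snippet below is a sketch, not part of the original instructions: `PORT` is a hypothetical variable name, the range 1025-65535 combines the "larger than 1024" rule above with the maximum valid TCP port number, and a randomly drawn port can still turn out to be in use.

```shell
# Draw a random port in the range 1025-65535.
# awk is used so the snippet does not depend on bash's $RANDOM.
PORT=$(awk 'BEGIN { srand(); print 1025 + int(rand() * 64511) }')

# Print the command to run rather than starting the server directly.
echo "Try: jupyter notebook --port $PORT &"
```

Whichever port you end up with, the `http://localhost:.../` address in the server's log output tells you which one it is actually using.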

Now, back on your laptop, open another tab in your terminal window. To forward the notebook from TSCC to your laptop, use this command (replace `####` and `username` with your own port and username, and `tscc-login2` with the login node shown in your TSCC prompt if it differs):

```
ssh -NL ####:localhost:#### username@tscc-login2.sdsc.edu &
```

Connect to the Jupyter notebook server by opening `http://localhost:####/` in the web browser on your laptop.
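
The tunnel command packs several pieces of information into one line: the port on your laptop, the port on TSCC, your username, and the login node. The sketch below spells them out as variables and prints the resulting command; `PORT`, `TSCC_USER`, `LOGIN_NODE`, and `tunnel_cmd` are hypothetical names, and the values are placeholders taken from the examples in this notebook.

```shell
# Hypothetical placeholder values; substitute your own.
PORT=7788                 # the port your notebook server reported
TSCC_USER=username        # your TSCC username
LOGIN_NODE=tscc-login2    # the login node where the server is running

# Build the tunnel command as a string so it can be inspected before running.
#   -N  do not run a remote command, only forward ports
#   -L  forward local PORT to localhost:PORT as seen from the login node
tunnel_cmd="ssh -N -L ${PORT}:localhost:${PORT} ${TSCC_USER}@${LOGIN_NODE}.sdsc.edu"
echo "$tunnel_cmd"
```

Using the same number on both sides of `localhost` keeps things simple: the address you open on your laptop matches the one printed in the server log on TSCC.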
    
Start a new notebook using the dropdown menu in the top right of the screen:
![New doc image reference](newdoc.png "New doc image reference")

You should see a page