# Jupyter/IPython Notebooks

[This file](./JupyterNotebooks.ipynb) is a Jupyter notebook. The notebook is a convenient way to leverage your web browser to do interactive computing. To unpack that a bit more, it gives you access to a rich set of elements such as documentation, code cells, visualizations, interactive controls etc. with which you can capture your work or research **as you think**. As a side effect you end up with a very effective tool for communicating your ideas (and learning from others), and you can move smoothly from exploring an idea to a production workflow on HPC systems.

The Jupyter notebook took its inspiration from systems like Mathematica and MATLAB, which in turn took their inspiration from old-school physical notebooks. Each link in that chain brings something new and powerful to the scientific workflow, and it is worth exploring that a bit more by looking at the project history.

# Interactive Computation

A typical research workflow starts with a problem, then you have a loop something like ...

  1. Capture some relevant data
  1. Explore the data: Make some plots, clean it, slice it, dice it
  1. Look for structure, test some hypotheses
  1. Build a model

In most cases you're doing this sitting in front of the same screen, and interactive computation is about making that process as fluid, pleasant and effective as possible. It's like machine learning with a fancy ASIC (you!): interactive computing is about making sure we feed that resource as efficiently and effectively as possible.

# The Jupyter ecosystem

Loosely, the Jupyter ecosystem encompasses a wide range of software and standards which try to live up to the promise of interactive computing:

* Software: Jupyter, JupyterHub, JupyterLab, IPython, IRkernel, Binder, ...
* Protocols: Jupyter Kernel Protocol, Notebook Format; Notebook server, JupyterHub REST APIs
* Community: [jupyter](https://github.com/jupyter), [jupyterhub](https://github.com/jupyterhub), [scipy](https://conference.scipy.org/), [jupytercon](https://conferences.oreilly.com/jupyter/jup-ny)

## A Little History
* IPython started as an interactive shell for Python in 2001, created by *Fernando Perez (CU)*
  * Inspired by Mathematica's notebook interface
  * Python interactive shell c.f. IDLE etc.
  * .ipynb JSON lists of cells + state = interactive REPL
  * Command history, integrated help, *%magics* etc.
  
* 2010 architecture shifted to ZMQ, introducing a two-process implementation of IPython: Kernel/Client (Qt)

Already at this point the idea that the protocols are the important thing started peeking through. If you specify a standard for .ipynb files, people can write other clients without worrying about you breaking things, e.g. nteract from Netflix as a UI for notebooks, or the Qt interface, or the notebook interface. As long as the standard stays the same, all of them will work and can _continue development_!
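
To make the point about the file format concrete, here is a minimal sketch of what any client has to do to read a notebook. The field names follow version 4 of the notebook format; the two-cell notebook and the `summarize` helper are made up for illustration.

```python
import json

# A hypothetical two-cell notebook, serialized the way an .ipynb file stores it.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Hello"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello World')"],
        },
    ],
})


def summarize(ipynb_text):
    """Return (cell_type, source) pairs for every cell in a notebook."""
    notebook = json.loads(ipynb_text)
    return [(cell["cell_type"], "".join(cell["source"])) for cell in notebook["cells"]]


cells = summarize(notebook_json)
for cell_type, source in cells:
    print(f"{cell_type:>8}: {source}")
```

Because the file is plain JSON, nothing here depends on Jupyter being installed, which is exactly why independent frontends can exist.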

* 2011 notebook added as a frontend (*James Gao*)
  * IO cells which can contain code, narrative, mathematics, plots, interactive controls and rich media.
  <div style="text-align:center"><img src="./images/ipy_0.13.png" width="600px" /></div>

* Live Code
* Narrative Text
* Mathematical Equations
* Visualizations
* Interactive Control
* Rich Media

* **2014** Spin-off project from IPython called Project Jupyter - formalizes and extends the kernel/client split
  * Python, R, Julia, Haskell, Octave, Matlab, ... (~40 kernels)
  * Now IPython *ONLY* refers to the Python kernel
* **2015** JupyterHub - Multiuser version of the notebook. Pluggable authentication and spawners
* **2018** JupyterLab - An _extensible_! IDE

<div style="padding-top:40px;">
<div style="float:left;margin-left:150px;"><img src="./images/jupyter.png" width="250px"></div>
<div style="float:right;margin-right:150px;"><img src="./images/foz_login_page.png" width="250px"></div>
</div>

## Project Jupyter

* Strong support and governance anchored at UC Berkeley
* Widely adopted (~$7\times10^6$ notebooks visible on GitHub)
* Also picked up some awards
  * 2013 Free Software Foundation Advancement of Free Software Award
  * 2018 ACM Software System Award
* Reference example of open standards in Interactive Computing

<div style="float:left;margin-left:150px;"><img src="./images/Sponsors.png" width="600px"></div>

<div style="float:left;margin-left:150px;"><img src="./images/InstitutionalSupport.png" width="600px"></div>

## [syzygy.ca](https://syzygy.ca)

A partnership between [PIMS](https://www.pims.math.ca), [Compute Canada](https://www.computecanada.ca/) and [Cybera](https://cybera.ca) to provide [JupyterHubs](https://github.com/jupyterhub/) for Canadian researchers and students.

* Focus on University/Organization/Workshop Hubs
* 30+ Hubs, 21 Universities, ~26500 accounts
* Leverages University SSO (Shibboleth)
* R, Python3 kernels

Research is a "Work in progress". All of the tools we'll talk about today are available, but not always in the smoothest way!


## Accessing Jupyter

1. On your laptop: [anaconda](https://www.anaconda.com/)
1. On syzygy: (https://westgrid.syzygy.ca)
1. On CC-HPC systems: check the [wiki](https://docs.computecanada.ca/wiki/Jupyter)

## [Anaconda](https://www.anaconda.com/)

This is the local option: you install it on your device and can control all aspects of it.

* Installers for Mac, Windows, Linux
* Provides a large package index for Python/R/etc.
* Work inside environments (`conda env`) - practise safe conda

Well integrated and always available (flights, conference "wifi" etc.). 

```
# Create an empty environment ...
$ conda create --name westgrid-rss
# ... or, instead, create it with a pinned Python version
$ conda create -n westgrid-rss python=3.7
$ source activate westgrid-rss
```

Your prompt should be updated to remind you which env you are in.
```
$ conda list -n westgrid-rss
$ conda install -n westgrid-rss spacy
$ conda install -n westgrid-rss pip
```
You can capture and share environments with `conda env export > environment.yml`, which other people can then rebuild with `conda env create -f environment.yml`.

## [syzygy.ca](https://syzygy.ca)

Org based hubs, e.g. https://westgrid.syzygy.ca
* Access is controlled by the institution
* Common packages should "just work"
* R and Python3 installed
* Some extensions enabled (nbgitpuller and RISE)

Capacity is limited, 1 core + 2GB Mem + 1GB storage.

## HPC-CC

Most CC systems don't have a hub associated with them (we're trying to change this!) but all have parts of the Jupyter ecosystem available. 

* See https://docs.computecanada.ca/wiki/Jupyter

It might take a little playing with environment variables and software modules, but it gives you access to large resources.

# The Notebook

* [documentation](https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html)
* Client/Server app
  * Server is a language kernel (or kernels) running somewhere else
  * Client is a collection of Input and Output Cells (notice the [##])
* When cells with code are evaluated (Shift-Enter) the code gets sent to the kernel
* The kernel sends back results which get displayed as an output cell
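
The request/reply loop above can be sketched with a toy class. `ToyKernel` is a made-up name, not part of Jupyter: a real kernel runs in a separate process and talks to the client using the Jupyter kernel protocol. The sketch only shows the two ideas that matter here: state survives between cells, and the value of a trailing expression is sent back as the result.

```python
import ast


class ToyKernel:
    """Toy stand-in for a language kernel (hypothetical, for illustration only)."""

    def __init__(self):
        self.namespace = {}        # variables live here between requests
        self.execution_count = 0   # the number shown in In [n] / Out[n]

    def execute(self, code):
        self.execution_count += 1
        tree = ast.parse(code)
        result = None
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            # Split off a trailing expression so its value can be returned.
            last = ast.Expression(tree.body.pop().value)
            exec(compile(tree, "<cell>", "exec"), self.namespace)
            result = eval(compile(last, "<cell>", "eval"), self.namespace)
        else:
            exec(compile(tree, "<cell>", "exec"), self.namespace)
        return {"execution_count": self.execution_count, "result": result}


kernel = ToyKernel()
first = kernel.execute("x = 20")   # an assignment produces no result
second = kernel.execute("x + 1")   # a trailing expression does
print(first, second)
```

The second request can see `x` because both were run in the same namespace, just as later cells in a notebook can see variables defined by earlier ones.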


Press Shift-Enter on the following cells. Let's start with a traditional "Hello World".

In [None]:
print("Hello World")

If the last line of an input cell returns an object (most things in Python are objects) then the notebook will give the text representation of that object in the output area _and_ give it a reference `Out[XX]`.

In [None]:
1+1

The input and output make up your command history, and you can reference things as you work. To see what is in your command history, enter `%history` in a cell and execute it; use the `-n` argument to get the line numbers.

In [None]:
%history -n

### Editing Cells - Cut/Copy/Paste etc.

Take a look at the actions available to you in the menu bar.

Notebooks are just a collection of cells, and at any one time only one cell will have focus in your browser (marked by a green outline). When a cell has focus the notebook will be in one of two modes: `Edit mode` or `Command mode`. In `Edit mode` you can modify the cell contents, while in `Command mode` the key combinations you enter will be interpreted as commands.

Command mode is pretty powerful, allowing you to cut/copy/paste cells, split existing cells & much more. Enter `Command mode` (press `Esc`) and press `h` to display a help overlay. If you are in command mode (no cursor) you can press Enter to go into `Edit mode` on the cell with focus.

Two particularly useful commands insert a new cell either above or below the current cell. In command mode `a` will place a new cell `a`bove your current cell, while `b` will place one `b`elow it. Try it. The same functionality is (mostly) available in the menu, but the keyboard shortcuts will make your life easier as you get more familiar.



In [None]:
# Exercise: This cell will waste 100s of your time! Find a keyboard shortcut to interrupt it!
import time
time.sleep(100)

## Getting Help

The notebook also gives you easy access to documentation and even source code for functions, objects, modules etc. To access it you can either wrap the object you are interested in with `help()` or, more commonly, put a question mark after the object, e.g. to find out about the Python 3 built-in `range`, put `range?` in an empty cell and execute it.

In [None]:
range?

When the cell is executed it should open a help window as a horizontal split. You can read the documentation for the range function to understand how it works. This is really useful for methods or functions in modules which support lots of arguments (`pandas.read_csv` leaps to mind...). If you need more information you can add another question mark to see the source code of the function (when available).

In [None]:
import pandas as pd

In [None]:
pd.read_csv??
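
The `?` and `??` shortcuts only work inside IPython. As a rough sketch of what they look up, the standard-library `inspect` module can fetch the same docstrings and source from a plain Python script. `json.dumps` is used here only because it ships with Python; it is not part of the original example.

```python
import inspect
import json

# Roughly what `range?` shows: the docstring of the object.
doc = inspect.getdoc(range)
print(doc.splitlines()[0])

# Roughly what `json.dumps??` adds: the signature and the source code.
signature = inspect.signature(json.dumps)
source = inspect.getsource(json.dumps)
print(signature)
print(source[:200])
```

`inspect.getsource` only works for objects written in Python, which is why `??` shows source "when available": built-ins such as `range` are implemented in C.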

#### Tab completion

Pressing tab to get available completions works in almost every system you will use and Jupyter is no different. It lets you explore module contents, grab difficult-to-remember function names and a lot more. Try `from numpy.random import <TAB>` to see some of what is available inside the `numpy.random` module (we'll come back to numpy in gory detail later!).


In [None]:
from numpy.random import 
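
If you want the same list without the keyboard, `dir()` returns the names that completion draws on. This is a small sketch using the standard-library `random` module (an assumption made so that it runs without numpy installed).

```python
import random

# Names in the `random` module that start with "rand", similar to what
# typing `random.rand<TAB>` would offer in the notebook.
matches = [name for name in dir(random) if name.startswith("rand")]
print(matches)
```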

## Markdown

Markdown is a lightweight, plain-text mark*up* language which is widely used:

* hackmd.io
* github.com / gist.github.com <- Also natively render entire notebooks!
* Jupyter - [documentation]
* Jekyll/Hugo
* pandoc
* ...

It is very small, so it is easy to pick up, but it is well structured and can be converted into a large number of useful formats. In most flavours it supports lists, headings, tables, code blocks, images etc. Within Jupyter you can also use $\LaTeX$, and because we are in a browser you can also render images and other elements.

![Jupyter Logo](https://jupyter.org/assets/main-logo.svg)

With Markdown, you can handle
  * Headings: - #, ##, ###, ...
  * LaTeX Code: - inline between `$` symbols, or "equation mode" between `$$`
  * Embedded code blocks: - use single or triple ticks: `` ` `` or ```` ``` ```` (also syntax highlighting)
  * Tables: |---|---|--| etc.
  * General HTML: (`<img>`, `<video>` etc.)
  * Links to local & remote files `[name of link](source of link, e.g. https://www.westgrid.ca)`


**Exercise**: Write a markdown cell describing your research, give it a heading, subheading and a list of your interests. Try to add some links to sources describing your research interests in words small enough for me to understand.

Your browser and Jupyter actually support a much richer set of display elements. Have a look at the items you can import from `IPython.display`; there's even a `YouTubeVideo` element (V0D2mhVt7NE).

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('V0D2mhVt7NE')
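
Your own objects can take part in rich display too. IPython-based frontends look for methods such as `_repr_html_` on the object a cell returns and render that instead of the plain text `repr`. The `Badge` class below is a hypothetical example, not something from `IPython.display`.

```python
from html import escape


class Badge:
    """Hypothetical example class with a rich HTML representation."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        # Plain-text fallback, used by terminals and the standard Python REPL.
        return f"Badge({self.label!r})"

    def _repr_html_(self):
        # Called by the notebook frontend, when present, to render the object as HTML.
        return f"<b style='color: teal'>{escape(self.label)}</b>"


badge = Badge("R & Python")
html = badge._repr_html_()
print(repr(badge))
print(html)
```

In a notebook, leaving `badge` as the last line of a cell should show the styled label rather than the text representation.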

### Widgets & Interaction

The notebook supports the concept of widgets. Widgets will typically give you some input element (select list, text box, radio buttons, sliders etc.) which you can connect to your code. See [the documentation](https://ipywidgets.readthedocs.io/en/stable/) for some more exciting examples.

In [None]:
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

In [None]:
def double(x):
    return 2 * x

In [None]:
interact(double, x=5)

`interact` is making some guesses about reasonable parameter ranges, then connecting a little chunk of JavaScript to our (Python) kernel. When we move the slider, the widget sends the new value to our `double` function and gets the result back. We'll come back to widgets once we know a little bit more Python.
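
The round trip can be mimicked in plain Python. `ToySlider` below is a made-up class and deliberately not the ipywidgets API; it is only a sketch of the callback pattern, where changing the value re-runs the connected function.

```python
class ToySlider:
    """Hypothetical stand-in for a slider widget (not the ipywidgets API)."""

    def __init__(self, value, on_change):
        self._value = value
        self._on_change = on_change
        self.output = on_change(value)  # show an initial result

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # "Moving the slider" re-runs the connected function with the new value.
        self._value = new_value
        self.output = self._on_change(new_value)


def double(x):
    return 2 * x


slider = ToySlider(5, double)
first = slider.output
slider.value = 8
second = slider.output
print(first, second)
```

In the real widget the value change originates in the browser and travels to the kernel as a message, but the shape of the interaction is the same.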

## Running System Commands

I *think* you had an introduction to the Linux CLI last week. You can run many of these commands directly from the notebook interface by prefixing them with an exclamation mark `!`, e.g. `!date`.

In [None]:
!date

This will run the command in a shell on whichever machine your notebook server is running. Try running `!ls` to see a directory listing or `!uname -a` to see some information about the machine. If you're surprised by the output, think through where the notebook server is actually running. One more gotcha: try changing directory with `!cd ..`, then check what `!pwd` says.

Each command is executed in a subshell so the `cd` didn't "take". If you actually wanted to change directories you could use `%cd`. For the same reason you'll find shadow magic commands for some other common utilities (`%mkdir`, `%cp` etc.). Until you are familiar with them it's worth checking the help before running anything which will make a change!
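
The same behaviour can be reproduced from plain Python, which may help explain it. This is a sketch using `subprocess` and `os` as rough analogues of `!cd` and `%cd`; it is not how IPython implements them.

```python
import os
import subprocess

before = os.getcwd()

# Like `!cd ..`: the directory change happens in a child shell that exits
# straight away, so the current process never sees it.
subprocess.run("cd ..", shell=True, check=True)
after_subshell = os.getcwd()

# Like `%cd ..`: the change is made in the current process, so it persists.
os.chdir("..")
after_chdir = os.getcwd()

os.chdir(before)  # restore the original directory
print("unchanged by subshell:", before == after_subshell)
print("after chdir:", after_chdir)
```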

Going a little further, we can capture the output of the commands by assignment, e.g.

In [None]:
images = !ls images
images

We get a list (actually we get an `IPython.utils.text.SList`, check the documentation if you're interested in the difference). We can then do things with the result, e.g. translate all of the filenames to uppercase.

In [None]:
for image in images:
    print(image.upper())
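
The `name = !command` syntax is IPython-only. In a regular script the closest equivalent is `subprocess.run`, sketched below. The command is a stand-in that prints two made-up file names, so the example does not depend on an `images` directory existing.

```python
import subprocess
import sys

# Stand-in command that prints two made-up file names, one per line.
command = [sys.executable, "-c", "print('plot.png'); print('logo.svg')"]
completed = subprocess.run(command, capture_output=True, text=True, check=True)

# Split the captured text into a list of lines, much like `images = !ls images`.
names = completed.stdout.splitlines()
upper = [name.upper() for name in names]
print(upper)
```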

## Magics

IPython has a rich library of so called "[magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html)" functions. They are a part of IPython (not Python) so you can only use them in an interactive setting. Line magics start with a single percent sign (cell magics use two) and you can get a list of them using `%lsmagic`. You can also use the question mark `?` to get help on the magic commands.

* %timeit
* %time
* %profile
* %pastebin

In [None]:
%lsmagic

In [None]:
%history -n

In [None]:
%pastebin 3

In [None]:
%pdb
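
Magics only exist inside IPython, but several of them wrap tools from the standard library. As a rough sketch, this is approximately what `%timeit double(21)` measures, written with the `timeit` module so that it also runs in a plain script. The `double` function and the repeat counts are arbitrary choices for illustration; `%timeit` picks its own counts automatically.

```python
import timeit


def double(x):
    return 2 * x


# Run the call 10,000 times per trial, repeat for 5 trials, keep the best trial.
trials = timeit.repeat(lambda: double(21), number=10_000, repeat=5)
best_per_call = min(trials) / 10_000
print(f"best of 5: {best_per_call * 1e9:.0f} ns per call")
```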