# Basics of Jupyter Notebooks

If you haven't done so already, please fetch the source from GitHub:

```bash
$ ssh <username>@tegner.pdc.kth.se
$ cd /cfs/klemming/nobackup/<initial>/<username>/prace-jupyter
$ git clone https://github.com/PDC-support/jupyter-notebook.git
```

While we are still exploring how Jupyter works, we can work on the login node (please don't run anything demanding!).
Launch a Jupyter server in your `prace-jupyter` directory:

```bash
$ module load anaconda/py36/5.0.1
$ source activate prace
$ ipnport=$(shuf -i8000-9999 -n1)
$ ipnip=$(hostname -i)
$ jupyter-notebook --no-browser --port=$ipnport --ip=$ipnip --certfile=mycert.pem --keyfile mykey.key
```

Then look at the output and find the line containing `https://<IP.address>:<port>/`. 
Copy/paste this URL into your local browser and enter your password. You should now be connected to the Jupyter server. 

## Navigating Jupyter notebooks
 - Notebook Dashboard
   * `Files` tab shows files in current directory
   * `Running` tab shows kernels running on your computer
   * `Clusters` tab lets you launch kernels for parallel computing
 - Fully-fledged terminal (you can run emacs and vi)
 - Text editor for source code in many different languages  
 

## Cells

- **Markdown cells** contain formatted text written in Markdown 
- **Code cells** contain code to be interpreted by the *kernel* (Python, R, Julia, Octave/Matlab...)

![Components](img/notebook_components.png)

## Markdown cells

This cell contains simple [Markdown](https://daringfireball.net/projects/markdown/syntax), a lightweight markup language for writing text that can be automatically converted to other formats such as HTML or LaTeX.

**Bold**, *italics*, **_combined_**, ~~strikethrough~~, `inline code`.

* bullet points

or

1. numbered
2. lists

**Equations:**   
inline $e^{i\pi} + 1 = 0$
or on new line  
$$e^{i\pi} + 1 = 0$$

Images ![Beskow](https://www.pdc.kth.se/polopoly_fs/1.771286!/image/Beskow%20front%20row%20no%20floor%20from%20right_670pW_300pH_72ppi_KTH_l_grey_to_tranps_bg.jpg)

Links:  
[One of many markdown cheat-sheets](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#emphasis)


## Code cells

In [1]:
# a code cell can run statements of code.
# when you run this cell, the code is sent
# from the web page to a back-end process (the kernel), run there,
# and the results are displayed to you
print("hello world")

hello world


## In this lesson you will learn 
- How *markdown* and *code* cells work
- How to use keyboard shortcuts to speed up your work
- How to use *widgets*
- How to use notebook *magic commands* and create new custom magics
- How to mix in different markup and programming languages (html, LaTeX, bash, ruby, perl, R, octave)

# Data analysis and visualization in Jupyter Notebooks


### Let us look into some Jupyter features
- Toggle between code and markdown cells
- Edit mode and Command mode
- Executing a cell
- Inserting, copying, pasting and removing cells
- Execution order - prompt numbers
- Meaning of _
- Getting help with ?

### <font color="red"> *Exercise 1.1* </font>

Spend a couple of minutes playing around with Markdown and code cells:
1. Create a new cell below this one, and make it a Markdown cell 
2. Go to Edit mode, and add a heading along with some bullet points and an equation
3. Add another cell below, and make it a code cell
4. Add some code which returns output (either use `print()` or type the variable name at the end of the cell)
5. Try some of the keyboard shortcuts listed below

Here are some useful hints:
* You can edit the cell by double-clicking on it, or pressing `Enter` when it's selected
* You can run the cell by pressing the play-button in the toolbar, or press `Shift-Enter`
* You can change the type of the cell from the toolbar, or press `m` for Markdown and `y` for code

**Questions**
* What is the difference between executing a cell with `Shift-Enter`, `Ctrl-Enter` or `Alt-Enter`?


If you already know all this or if you want to move on:
- Go to exercise 1.2 below

### Keyboard shortcuts 

Some shortcuts only work in Command or Edit mode.

* `Enter` key to enter Edit mode (`Escape` to enter Command mode)
* `Ctrl`-`Enter`: run the cell
* `Shift`-`Enter`: run the cell and select the cell below
* `Alt`-`Enter`: run the cell and insert a new cell below
* `Ctrl`-`s`: save the notebook 
* `Tab` key for code completion or indentation (Edit mode)
* `m` and `y` to toggle between Markdown and Code cells (Command mode)
* `d-d` to delete a cell (Command mode)
* `z` to undo deleting (Command mode)
* `a/b` to insert cells above/below current cell (Command mode)
* `x/c/v` to cut/copy/paste cells (Command mode)
* `Up/Down` or `k/j` to select previous/next cells (Command mode)
* `h` for help menu for keyboard shortcuts (Command mode)
* Append `?` for help on commands/methods, `??` to show source (Edit mode) 

### Shell commands
  - You can run shell commands by prefixing them with `!`
    - NB: on Windows, Git Bash needs to have the following option enabled:   
    `Use Git and the optional Unix tools from the Windows Command Prompt` 
  - Useful, e.g., for managing the Python environment
  - Make sure the command in your cell doesn't require interaction

In [None]:
!echo "hello"

In [None]:
!pip list

 - Many common Linux shell commands are available as magics: `%ls`, `%pwd`, `%mkdir`, `%cp`, `%mv`, `%cd`, *etc.* More on magics [later in the lesson](#Magics)
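
A plain Python script has no `!` syntax. The sketch below only approximates a `!` line using the standard `subprocess` module; the child command is an illustrative stand-in, chosen so that it runs wherever Python is available. Connecting standard input to `subprocess.DEVNULL` is one way to make sure a command cannot sit waiting for interaction.

```python
import subprocess
import sys

# Illustrative child command: it tries to read from standard input,
# the way an interactive prompt would.
child = [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]

# stdin=DEVNULL means the child sees end-of-file immediately
# instead of blocking while it waits for someone to type.
result = subprocess.run(
    child,
    stdin=subprocess.DEVNULL,
    capture_output=True,
    text=True,
    check=True,
)
answer = result.stdout.strip()
print(answer)
```

The child receives an empty string rather than hanging, which is the behaviour you want from any command started from a notebook cell.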

<a id="exercise_git"></a>
### <font color="red"> *Exercise 1.2* </font>

Try to only use keyboard shortcuts for the following steps:

1. Create a new code cell below this one
2. Run a `git diff`
3. Toggle the output of this cell since it's too long


Since the file format of notebooks doesn't play nicely with version control, a tool called `nbdime` has been developed.

**Optional step:**

1. Install `nbdime`, activate `nbdime` with git, and rerun `git diff`. For further instructions [click here](#Version-control-of-notebooks)
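
To see why plain `git diff` is noisy on notebooks, it may help to look at the file format itself. The sketch below builds a simplified, hand-written notebook structure (real notebooks carry more metadata) and compares two versions in which the code is identical but the cell has been re-run. The helper `make_notebook` and the file names are made up for this illustration.

```python
import difflib
import json


def make_notebook(execution_count):
    """Build a minimal notebook dictionary with one executed code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": ["hello world\n"],
                    }
                ],
                "source": ['print("hello world")'],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 2,
    }


# Same code in both versions; only the prompt number differs after a re-run.
before = json.dumps(make_notebook(1), indent=1, sort_keys=True).splitlines()
after = json.dumps(make_notebook(7), indent=1, sort_keys=True).splitlines()

diff = list(
    difflib.unified_diff(before, after, "before.ipynb", "after.ipynb", lineterm="")
)
print("\n".join(diff))
```

The diff reports a change even though the source of the cell is untouched, because prompt numbers and outputs are stored in the same JSON file as the code.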

## Interactive plotting

Jupyter supports interactive plotting with matplotlib and other visualization libraries (including for other languages). Matplotlib can be used with different backends, which determine how plots appear in the Notebook.

In [None]:
%matplotlib --list

In [None]:
#%matplotlib notebook
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,2*np.pi,100)
y = np.sin(x)
plt.plot(x,y, 'r-')
plt.show()

## Widgets

Widgets add more interactivity to Notebooks, allowing one to visualize and control changes in data, parameters etc.

In [None]:
from ipywidgets import interact

#### Use `interact` as a function

In [None]:
def f(x, y, s):
    return (x, y, s)

interact(f, x=True, y=1.0, s="Hello");

#### Use `interact` as a decorator

In [None]:
@interact(x=True, y=1.0, s="Hello")
def g(x, y, s):
    return (x, y, s)

## More interactive plotting using widgets

In [None]:
from ipywidgets import interact # IPython.html.widgets before IPython 4.0

@interact
def plot(n=(1,6)):
    x = np.linspace(0,2*np.pi,100)
    y = np.sin(n*x)
    plt.plot(x,y, 'r-')
    plt.show()

### <font color="red"> *Exercise 1.3* </font>

- Execute the cell below. It fits a 5th-order polynomial to a Gaussian function with some random noise
- Use the `@interact` decorator together with the function `fit`, so that you can visualize fits with polynomial orders `n` ranging from, say, 3 to 30


In [None]:
# gaussian function
def gauss(x,param):
    [a,b,c] = param
    return a*np.exp(-b*(x-c)**2)

# gaussian array y in interval -5 <= x <= 5
nx = 100
x = np.linspace(-5.,5.,nx)
p = [2.0,0.5,1.5] # some parameters
y = gauss(x,p)

# add some noise
noise = np.random.normal(0,0.2,nx)
y += noise

# we fit a 5th order polynomial to it

def fit(n):
    pfit = np.polyfit(x,y,n)
    yfit = np.polyval(pfit,x)
    plt.plot(x,y,"r",label="Data")
    plt.plot(x,yfit,"b",label="Fit")
    plt.legend()
    plt.ylim(-0.5,2.5)
    plt.show()
    
# call function fit
# these lines are unnecessary when you use the interact widget
n=5
fit(n)

## Magics

Magics are a simple command language that significantly extends the power of Jupyter.

Two kinds of magics:

  - **Line magics**: commands prefixed with a single `%` character; their arguments extend only to the end of the current line.
  - **Cell magics**: commands prefixed with two percent characters (`%%`); they receive the whole cell as their argument and must be on the first line of the cell.

Other features:
  - Use `%lsmagic` magic to list all available line and cell magics
  - Question mark shows help: `%lsmagic?`
  - `%quickref` gives a short reference of available magic (and other) functionality 
  - Additional magics can be created, see below for example
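
As a minimal sketch of how a custom line magic could be defined: a line magic is essentially a function that receives the rest of the line as a single string. The name `reverse` is made up for this illustration, and the registration step only runs when the code is executed inside IPython or Jupyter; elsewhere `reverse` is just an ordinary function.

```python
def reverse(line):
    """Return the text that follows the magic name, reversed."""
    return line[::-1]


# Register the function as a line magic only when an IPython shell is active.
try:
    from IPython import get_ipython
    shell = get_ipython()  # None when not running under IPython
except ImportError:
    shell = None

if shell is not None:
    shell.register_magic_function(reverse, magic_kind="line", magic_name="reverse")

# The function can still be called directly.
print(reverse("Jupyter"))
```

After running this in a notebook cell, `%reverse Jupyter` in a later cell should call the same function.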

In [None]:
%lsmagic

In [None]:
%quickref

You can capture the output of line magics (and shell commands):

In [None]:
!ls

In [None]:
ls_out = %ls
ls_out

In [None]:
%sx?

In [None]:
ls_out = %sx ls
ls_out
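
The result of `%sx` behaves like a list with one entry per line of output. For comparison, a rough plain-Python counterpart can be sketched with `subprocess`; the command below is an illustrative stand-in for `ls` that prints two made-up file names, so that it runs on any platform.

```python
import subprocess
import sys

# Illustrative command that prints two lines; in a notebook this could be `ls`.
command = [sys.executable, "-c", "print('foo.py'); print('notebook.ipynb')"]

completed = subprocess.run(command, capture_output=True, text=True, check=True)

# Split the captured text into a list with one entry per output line.
lines = completed.stdout.splitlines()
print(lines)
```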

### %timeit
- Timing execution
- Both Line and Cell level

In [None]:
%timeit import time ; time.sleep(1)

In [None]:
import numpy as np

In [None]:
%%timeit 
a = np.random.rand(100, 100)
np.linalg.eigvals(a)
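
Outside a notebook, the standard-library `timeit` module offers a comparable facility. The sketch below times an arbitrary example statement; the statement, the repeat count and the loop count are assumptions chosen for illustration.

```python
import timeit

# Statement to time, with the setup kept out of the measurement.
setup = "data = list(range(1000))"
statement = "sorted(data, reverse=True)"

# Five repeats of 1000 loops each; each entry is the total time of one repeat.
loops = 1000
timings = timeit.repeat(statement, setup=setup, repeat=5, number=loops)

# Report the fastest repeat, converted to time per loop.
best_per_loop = min(timings) / loops
print(f"best of 5: {best_per_loop * 1e6:.1f} microseconds per loop")
```

Keeping the data creation in `setup` plays the same role as putting preparation code in a separate cell before `%%timeit`.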

### %%writefile
Writes the cell contents as a named file

In [None]:
%%writefile foo.py
print('Hello world')

### %run 
 - Executes Python code from .py files 
 - Can also execute other Jupyter notebooks

In [None]:
%run foo
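
For readers who want the same write-then-run round trip in an ordinary script, here is a hedged sketch using only the standard library. It is comparable to, not identical with, `%%writefile` followed by `%run`; the temporary directory is an assumption made to avoid leaving files behind.

```python
import contextlib
import io
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    script = Path(tmpdir) / "foo.py"

    # Comparable to `%%writefile foo.py`: store the source text in a file.
    script.write_text("print('Hello world')\n")

    # Comparable to `%run foo`: execute the file as if it were the main script,
    # capturing what it prints.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        namespace = runpy.run_path(str(script), run_name="__main__")

output = buffer.getvalue()
print(output, end="")
```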

### %load
 - Loads code directly into a cell, either from a file on local disk or from a URL
 - After you uncomment and execute the line below, the content of the cell is replaced with the content of the file

In [None]:
# %load https://matplotlib.org/_downloads/annotate_transform.py

### %prun
 - Python code profiler
 - Cell and Line magic
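
In a notebook you could profile a call with something like `%prun slow_sum(200_000)`. The sketch below shows a comparable workflow in plain Python with the standard `cProfile` and `pstats` modules; the function `slow_sum` is a made-up example workload.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data only while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200_000)
profiler.disable()

# Format the collected statistics, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and time per function, which is the kind of table `%prun` displays.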

### Mixing in other languages (assuming that they're installed)

The `%%script` magic is like the #! (shebang) line of a Unix script,
specifying a program (bash, perl, ruby, etc.) with which to run.  
But one can also directly use these:
- %%ruby
- %%perl
- %%bash
- %%html
- %%latex
- %%R

Why would you want to mix programming languages in the same notebook?
 - leverage strengths from different languages
 - using code from colleagues
 - a fantastic library exists in another language than your favorite one
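
Conceptually, `%%script` starts the named program and hands it the rest of the cell as input. A rough plain-Python approximation is sketched below; to stay runnable without ruby or perl installed, it uses a second Python interpreter as the illustrative target program.

```python
import subprocess
import sys

# Text that would be the body of the cell, below the `%%script` line.
cell_body = "print('Hi, this is a child interpreter.')\n"

# Start the chosen program and feed it the cell body on standard input.
# The "-" argument tells the Python interpreter to read its script from stdin.
completed = subprocess.run(
    [sys.executable, "-"],
    input=cell_body,
    capture_output=True,
    text=True,
    check=True,
)
script_output = completed.stdout
print(script_output, end="")
```

The same pattern applies to any interpreter that accepts a program on standard input, which is why the language must be installed on the machine running the Jupyter server.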

In [None]:
%%ruby
puts 'Hi, this is ruby.'

In [None]:
%%script ruby
puts 'Hi, this is also ruby.'

In [None]:
%%perl
print "Hello, this is perl\n";

In [None]:
%%bash
echo "Hullo, I'm bash"

In [None]:
%%html
<table>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>

In [None]:
%%latex
\begin{align}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{align}

### R

In [None]:
# first we need to install the necessary packages
#!conda install -c r r-essentials 
#!conda install -y rpy2
#%load_ext rpy2.ipython

In [None]:
%%R
myString <- "Hello, this is R"
print ( myString)

Inline plotting in R is straightforward 

In [None]:
%%R 
# Define the cars vector with 5 values
cars <- c(1, 3, 6, 4, 9)

# Graph cars using blue points overlaid by a line 
plot(cars, type="o", col="blue")

# Create a title with a red, bold/italic font
title(main="Autos", col.main="red", font.main=4)

# Summing up

## Key features of Jupyter Notebooks
- Excels at [literate programming](https://en.wikipedia.org/wiki/Literate_programming)
- Many features of integrated development environment (IDE): code completion, easy access to help
- [Support for many programming languages](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels)

## Use cases
- Experimenting with new ideas, testing new libraries/databases 
- Interactive code, data analysis and visualization development
- Sharing and explaining code to colleagues
- Learning from other notebooks
- Keeping track of interactive sessions, like a digital lab notebook
- Supplementary information with published articles
- Teaching (programming, experimental/theoretical science)
- Presentations

## When not to use notebooks?

- Large codebases are difficult to manage in notebooks
- More difficult to follow good software development practices
    - doesn't play well with version control (see below)
    - not as easy to do automated testing
    - not as useful as IDE to ensure PEP8-compliance

## [JupyterHub](https://github.com/jupyterhub)

- A multi-user hub to spawn, manage and proxy multiple instances of the Jupyter Notebook server
- Purpose: supporting multiple users, who can log in and start notebooks
- Used by: student classes, corporate data science workgroup, scientific research group, high-performance computing group

## [JupyterLab](https://github.com/jupyterlab/jupyterlab)

- Natural evolution of the Jupyter Notebook user interface
- An "IDE": *Interactive* Development Environment
- Flexible user interface for assembling the building blocks of interactive computing
- Adaptable to multiple workflows. Switch between Notebook/narrative focus and script/console focus
- A stable version suitable for general usage was released in Feb. 2018

![jupyterlab](img/jlab-screenshot-nb-con-term-2_40.png)

## Lesson key points

- Keyboard shortcuts simplify using Jupyter
- Magics allow you to
    - access the filesystem
    - time, debug and profile your code
    - run shell commands in underlying system
- You can also create your own magics
- You can add inline plots, and widgets provide more interactivity
- The JSON format of Jupyter Notebooks is not optimal for version control with Git, but the nbdime tool helps
- Jupyter can run many kernels, among them Python, Octave, Julia and R (assuming they are installed on the host running Jupyter)