# Using Jupyter

_Parts of this notebook are adapted from the Jupyter documentation_

## Introduction

Jupyter is an interactive computing environment that enables users to author notebook documents that include executable code, markdown text with MathJax, multimedia, and static and interactive charts. These documents provide a complete and self-contained record of a computation that can be converted to various formats and shared with others. Jupyter thus supports a form of [literate programming](https://en.wikipedia.org/wiki/Literate_programming).

Documentation is available at https://jupyter-notebook.readthedocs.io/en/stable/

Jupyter combines three components:

- **The notebook application**: An interactive application for writing and running code and authoring notebook documents. Several applications are available, running on the CLI, natively, or in a browser. We will use JupyterLab, which runs in the browser and will soon replace the older Jupyter Notebook application (note: as of April 2018 it is recommended that Windows users stick to Jupyter Notebook, as installing JupyterLab on Windows has some issues).
- **Kernels**: Separate processes started by the notebook web application that run users’ code in a given language and return output back to the notebook web application. The kernel also handles things like computations for interactive widgets, tab completion and introspection.
- **Notebook documents**: Self-contained documents (JSON text in files with .ipynb extension) that contain a representation of all content visible in the notebook web application, including inputs and outputs of the computations, narrative text, equations, images, and rich media representations of objects. Each notebook document has its own kernel.

Notebooks consist of a linear sequence of cells. There are three basic cell types in the current notebook format:

- **Code cells**: Input and output of live code that is run in the kernel
- **Markdown cells**: Narrative text with embedded LaTeX equations. Headings go here too: older versions of the notebook format had a separate heading cell type with 6 levels of hierarchical organization, but it was removed in version 4 of the format, so headings are now written as markdown headings
- **Raw cells**: Unformatted text that is included, without modification, when notebooks are converted to different formats using nbconvert
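
Because a notebook document is just JSON text, its cells can be inspected with ordinary tools. The following is a minimal sketch, assuming the version 4 notebook format; the `notebook` dictionary is a hand-written illustration rather than a real file, and in practice you would load an `.ipynb` file with `json.load` instead.

```python
import json

# A hand-written, minimal notebook in the version 4 layout (illustrative only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["1+2+3+4"],
        },
    ],
}

# Round-trip through JSON text, as happens when a notebook is saved and reopened.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# List the type of each cell in order.
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```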


## Running JupyterLab

Install with:

    conda install -c conda-forge jupyterlab
    
and run with:

    jupyter lab

(On Windows, use Jupyter Notebook, which can be launched directly from Anaconda Navigator or from an Anaconda Python console with:

    jupyter notebook
 
Jupyter Notebook uses separate browser tabs for each notebook and the file browser rather than integrating everything within one browser tab; nonetheless it should be fairly straightforward to follow along).

Jupyter Lab will open in a web browser tab displaying a main menu along the top, a file browser on the left listing the notebooks and files in the current directory, and a tabbed interface on the right for open notebooks, terminals, etc.

![https://jupyterlab.readthedocs.io/en/stable/_images/interface_jupyterlab.png](https://jupyterlab.readthedocs.io/en/stable/_images/interface_jupyterlab.png)

The top of the notebook list displays clickable breadcrumbs of the current directory. By clicking on these breadcrumbs or on sub-directories in the notebook list, you can navigate your file system.

To create a new notebook, use the menu option "File - New - Notebook" (in Jupyter Notebook, click on the “New” button at the top of the list instead) and select a kernel from the dropdown. Which kernels are listed depends on what’s installed on the server.

The file manager shows a green dot next to running notebooks. Notebooks remain running until you explicitly shut them down; closing the notebook’s page is not sufficient.

To view the running notebooks and shut down a notebook, click the "Running" tab on the left side of the window, and then the "Shutdown" button next to the notebook in question.

To delete, duplicate, or rename a notebook, right-click on it in the file browser; a context menu will show these options along with others.

If you create a new notebook or open an existing one, you will be taken to the notebook user interface within a tab, which allows you to run code and author notebook documents interactively. The notebook UI has a toolbar at the top, followed by the notebook area below, which consists of one or more *cells*.

The notebook UI is modal. If you click in a cell or press ENTER you enter "edit mode" and can type into the cell. If you click outside a cell or press ESC you will be in "command mode" which allows you to edit the notebook structure.

You can execute a code cell or render a markdown cell by pressing Shift-Enter. Focus will move to the next cell (a new cell will be created if you executed the last cell).

If for some reason your notebook kernel hangs (e.g. waiting on some I/O that never happens, or due to a long running process), you can interrupt the execution by selecting "Interrupt Kernel" from the Kernel menu.

## Tutorial

These tutorials are for the old Jupyter Notebook UX, not Jupyter Lab, but are still useful as there is much in common.

Basics: https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Notebook%20Basics.html

http://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/

## Markdown

Jupyter supports GitHub-flavored markdown; see https://guides.github.com/pdfs/markdown-cheatsheet-online.pdf for a quick reference.


## Re-ordering, Inserting, Deleting and Executing Cells

![https://jupyter-notebook.readthedocs.io/en/stable/_images/menubar_toolbar.png](https://jupyter-notebook.readthedocs.io/en/stable/_images/menubar_toolbar.png)

The notebook toolbar, shown above for Jupyter Notebook but similar in Jupyter Lab, has options to save the notebook, add (+), delete (cut) cells, copy cells, paste cells, move cells up or down in the notebook, run the cell, interrupt or restart the kernel, and change the cell type.

To run a cell that has focus, use Shift-Enter. The output of the execution will be added to the notebook. Try it now:

In [None]:
1+2+3+4

More generally, you can call Python's `print` function to print info to the cell output, or you can put a Python expression (often just a variable name) at the end of the code cell and have its value displayed automatically. It's possible to extend this latter functionality to multiple expressions by executing this code in a cell:

```python
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
```

or, if you want this to happen automatically every time you start Jupyter, you can edit or create the file `~/.ipython/profile_default/ipython_config.py` and add the contents:

```python
c = get_config()

# Run all nodes interactively
c.InteractiveShell.ast_node_interactivity = "all"
```
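
To make the difference between the default behaviour and `"all"` concrete, here is a rough sketch using the standard `ast` module. It is not IPython's actual implementation; the helper name `displayed_expressions` is made up for illustration, and it only models the `"last_expr"` (default) and `"all"` settings.

```python
import ast

def displayed_expressions(source, interactivity="last_expr"):
    """Return the source of the top-level expressions whose values would be shown."""
    nodes = ast.parse(source).body
    if interactivity == "all":
        # Every bare expression statement in the cell is displayed.
        chosen = [n for n in nodes if isinstance(n, ast.Expr)]
    else:
        # "last_expr": only a trailing bare expression is displayed.
        chosen = [n for n in nodes[-1:] if isinstance(n, ast.Expr)]
    return [ast.unparse(n) for n in chosen]  # ast.unparse needs Python 3.9+

cell = "x = 10\nx + 1\nx * 2"
print(displayed_expressions(cell))         # only the final expression
print(displayed_expressions(cell, "all"))  # every bare expression
```

The assignment `x = 10` is never displayed in either mode, because it is a statement rather than an expression.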

If you end the last line with a semi-colon the output will be suppressed:

In [None]:
1+2+3+4;

The most recent cell result (the last value that was displayed as `Out`) is assigned to a special variable, `_`:

In [None]:
print(_)

Note how the cells are marked with `In` and `Out`, and a count. The count allows you to keep track of the order in which cells were executed. `Out` is the result of execution, unless the execution of the cell resulted in some output to `stdout`/`stderr`; in this case that is shown instead with no `Out` label. Output to `stderr` is shown in red:

In [None]:
import sys
print('hello', file=sys.stderr)

`In` and `Out` are also variables that contain the history of execution (`In` is a list/array of strings, while `Out` is a dictionary/hash table):

In [None]:
print(In)

In [None]:
print(Out)

## Executing Shell Commands

You can execute a shell command in a cell by starting it with '!'. For example, if your notebook relies on certain packages, you may want to start with a cell that uses shell commands to pip install the dependencies.

Try it:

In [None]:
!ls -l

It's possible to assign this to a variable:

In [None]:
x = !ls -l

In [None]:
print(x)
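
The `!` syntax only works inside IPython and Jupyter. If you need the same result in a plain Python script, the standard `subprocess` module can capture a command's output as a list of lines. This is a minimal sketch; the helper name `shell_lines` is made up for illustration, and `echo hello` stands in for whatever command you actually want to run.

```python
import subprocess

def shell_lines(command):
    """Run a shell command and return its standard output as a list of lines."""
    result = subprocess.run(
        command,
        shell=True,           # let the shell parse the command string, as `!` does
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.splitlines()

lines = shell_lines("echo hello")
print(lines)
```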

## Checkpoints and Saving

The notebook is saved automatically at regular intervals. Explicitly saving the notebook from the toolbar or file menu actually creates a time-stamped "checkpoint", and you can revert to a saved checkpoint from the file menu.

## Tips and Tricks


You can write math with MathJax. Double-click on the mathematical formula below to edit the associated MathJax and Markdown:

$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$

Pi is $\pi$ okay?

You can get help on a Python function (view its _docstring_) by following it with `?` in Jupyter:

In [None]:
import os

os.path.exists?

That requires you to execute the code; you can do the same without executing the whole cell by typing Shift-TAB after the function name. You can also use the built-in Python function `help`:

In [None]:
help(len)

In [None]:
help(os.path.exists)

This works on your own functions too so writing docstrings is always recommended.
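
As a small illustration, here is a function with a docstring; the function itself is a made-up example. Running `help(celsius_to_fahrenheit)` or `celsius_to_fahrenheit?` in a notebook would show the same text that `inspect.getdoc` returns here.

```python
import inspect

def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Fetch the docstring programmatically; help() and `?` display the same text.
doc = inspect.getdoc(celsius_to_fahrenheit)
print(doc)
```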

You can go a step further and use `??` to get the source code of a function.

In [None]:
import pandas as pd
pd.concat??
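
Outside of Jupyter, the standard `inspect` module offers a similar lookup. The sketch below uses `textwrap.dedent` purely as an example of a function written in Python; `inspect.getsource` cannot retrieve source for functions implemented in C.

```python
import inspect
import textwrap

# Retrieve the source text of a pure-Python function from the standard library.
source = inspect.getsource(textwrap.dedent)

# Show just the first line (the function signature).
print(source.splitlines()[0])
```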

Jupyter supports tab-completion; try it below:

In [None]:
from itertools import combinatio


## Cell and Line Magics

There are numerous special commands called IPython magics that can be used to control things in Jupyter. These are either *line magics* that start with `%` or *cell magics* that start with `%%`. A line magic consists of a single line, while a cell magic consists of everything from the `%%` to the end of the cell.

`%load` can load code from external scripts. We will use that for hiding the answers to some exercises.

`%run` will let you run an external script or another notebook.

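`%%time` at the start of a cell will time the execution of the cell and print a summary when done. `%%timeit` will run the code repeatedly (the number of loops is chosen automatically unless you set it with `-n`) over several runs (set with `-r`) and then print a timing summary; recent IPython versions report the mean and standard deviation per loop.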

`%env` can be used to set environment variable values.

`%%writefile` writes the contents of a cell to a file.

`%pycat` shows the syntax-highlighted contents of the specified Python file in a pop-up window.

`%%debug` runs the contents of the cell under control of the Python debugger, while `%pdb` toggles whether the debugger is entered automatically when an exception is raised.

You can use `%lsmagic` to see all available magics.

More documentation on magics is available here: http://ipython.readthedocs.io/en/stable/interactive/magics.html
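
The timing magics described above are built on ideas from the standard `timeit` module, which you can also call directly in plain Python. This is a minimal sketch; the statement being timed and the loop and repeat counts are arbitrary choices for illustration.

```python
import timeit

LOOPS = 1000   # how many times the statement runs per measurement
REPEATS = 5    # how many measurements to take

# Each entry in `times` is the total time in seconds for LOOPS executions.
times = timeit.repeat("sum(range(100))", number=LOOPS, repeat=REPEATS)

# The fastest run is usually the least disturbed by other activity on the machine.
best_per_loop = min(times) / LOOPS
print(f"best of {len(times)} runs: {best_per_loop * 1e6:.2f} microseconds per loop")
```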

A very common one for data science is `%matplotlib inline`; this is necessary if using the matplotlib or Seaborn plotting libraries to make sure the plots appear as cell outputs in the notebook. If you have a retina Mac you can use retina-resolution for plots by executing `%config InlineBackend.figure_format = 'retina'`.

## Custom Magics

You can easily create your own magics; they are just Python functions. You just need to import the appropriate Python decorators and then annotate your function. We're getting ahead of ourselves but a quick example should illustrate:

In [None]:
from IPython.core.magic import register_line_magic

@register_line_magic
def greet(line):
    print(f'Hello {line}!')

In [None]:
%greet Dave

Read more here: http://ipython.readthedocs.io/en/stable/config/custommagics.html



## The Jupyter Display System

Jupyter can display many different types of output from cells, not just text. The format is normally determined by the MIME type of the result, but you can take explicit control too with the `IPython.display` module:

In [None]:
from IPython.display import display, Image

display(Image('https://www.python.org/static/community_logos/python-logo.png'))

In [None]:
from IPython.display import YouTubeVideo
# a talk about IPython at Sage Days at U. Washington, Seattle.
YouTubeVideo('1j_HxD4iLn8')
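
Your own classes can take part in the display system as well. IPython looks for specially named methods such as `_repr_html_` on the result of a cell and, if one is present, uses the returned string as the HTML representation. The sketch below uses a made-up `Temperature` class for illustration; in a notebook, ending a cell with `t` would render the bold HTML, while a plain Python console falls back to `__repr__`.

```python
class Temperature:
    """A value that knows how to render itself as HTML in a notebook."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # Plain-text fallback used by consoles and by print().
        return f"Temperature({self.celsius})"

    def _repr_html_(self):
        # Rich representation picked up by the notebook display system.
        return f"<b>{self.celsius}</b> &deg;C"

t = Temperature(21.5)
html = t._repr_html_()
print(repr(t))
print(html)
```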

## Going Further

More good tips here: https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/

RISE is an extension that allows you to create a slide deck in Jupyter and present it. Your deck can include live code that you execute, so it is great for Python programming talks :-). See https://github.com/damianavila/RISE for more details.

Here's an example, which is an overview of Jupyter :-) http://quasiben.github.io/dfwmeetup_2014/#/

For more advanced users, Jupyter can be extended and customized in multiple ways; you can read about them here: https://mindtrove.info/4-ways-to-extend-jupyter-notebook/

Diffing notebooks for use with SCMs like git can be tricky as they are complex JSON files. A tool to help is nbdime: http://nbdime.readthedocs.io/en/stable/

You can find a gallery of interesting notebooks at https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks