## 1b. Jupyter

Jupyter is an execution environment for Python, a web-based REPL (Read Eval Print Loop). It creates interactive documents which can be used to tell a story, walk through the exploration process, showcase both the code and its outcomes and display rich media such as images or animations.

### Table of contents

- [The Jupyter Environment](#Jupyter)
 - [Navigation](#Navigation)
     - [Keyboard Shortcuts](#Keyboard-Shortcuts)
     - [The Interface](#Interface-Interaction)
 - [Features](#Term-Definitions)
     - [Cells](#About-Cells)
     - [Kernel](#About-the-Kernel)
     - [Notebook](#About-the-Notebook)
 - [Code Cells](#Code-Cells)
     - [Execution History](#Execution-History)
     - [Outputting Tricks](#Outputting-Tricks)
 - [Text Cells](#Text-Cells)
   - [Markdown](#Markdown)
   - [Latex](#Latex)
   - [HTML](#HTML)
 - [Magics](#Magics)
   - [Timing](#Timing)
   - [Auto-reloading](#Auto-reloading)
 - [Shell Integration](#Shell-Integration)
 - [Interactivity](#Interactivity)


- [Topics Not Covered](#Topics-left-uncovered)
- [Further Reading](#Further-Reading)

### Term Definitions

A **notebook** is a text file, with the extension `ipynb`. It is composed of multiple cells, and can be executed inside a kernel.

A **cell** is a container that stores either code or text. 

The **kernel** is the Python engine in which code is executed. It stores variable values, imported libraries and other environment state. A notebook is the document containing the pieces of code, while the kernel gives it the ability to run them. There is at most one kernel per notebook.

**👾 Trivia**: Jupyter supports more than just Python. In fact, its name comes from the original three languages: Julia, Python and R. The extension name comes from Jupyter's precursor, [IPython](https://ipython.org), and the word _notebook_.

_Note_: the cloud version (Colaboratory) supports a restricted subset of these features. The interface is also stripped down. More restrictions (time, persistence, memory) apply.

### Keyboard Shortcuts

Jupyter has a lot of functionality; this workshop presents just the most useful commands and concepts:

 - **`ctrl+s`** save

Running:
 - **`ctrl+enter`** run cell
 - **`shift+enter`** run cell and advance to next one
 - **`ii`** interrupt kernel
 - **`00`** restart kernel
 - [Edit] > [Clear all outputs]

Cell operations:
- **`esc`** exit out of edit mode
- **`z`** undo
- **`shift+z`** redo
- **`x`** cut (use as delete)
- **`c`** copy
- **`v`** paste
- **`a`** insert a new cell above
- **`b`** insert a new cell below
- **`shift+m`** merge cells
- **`ctrl + shift + -`** split a cell at the current cursor position (in edit mode)

Change cell type:
- **`m`** to text/markdown
- **`y`** to code (default when creating a cell)
- **`1`** through **`6`** to header 1 through 6

In [None]:
x = 2

In [None]:
2 + 3

In [None]:
import time
time.sleep(2)

### About Cells
 - Code cells are executed in the current state of the kernel and output their result.
 - If the cell's output is `None`, nothing is displayed.
 - Text cells can be "executed" (same shortcuts) to render their formatting.


 - A cell that has not been executed has a `[ ]` before it
 - After execution, each cell is given an index, turning `[ ]` into `[n]`, where `n` is the cell's execution order
 - A cell currently in execution has `[*]` before it. Multiple cells can be queued for execution, all having `[*]` before them.


 - While writing code, `tab` completion is available.
 - When calling a function, pressing `shift+tab` after the opening bracket shows its docstring (arguments, default values, examples, etc).
 
 
 - Multiple cells can be selected (using `shift`), and cell operations are then performed on all of them.
 - When multiple cells are being scheduled to be executed, they will run in order. Execution stops when one of them raises an error.
 - The [Run] menu (at the top) contains options to run all cells above/below the selected one.
 - More Vim-inspired shortcuts are available for navigation, such as **`j`** and **`k`**.

### About the Kernel
 - A new kernel is automatically started when you open a notebook.
 - The kernel can be shut down or restarted. 
 - All cell contents persist beyond kernels, but variables are given values only in the context of a running kernel.
 
 
 - While a cell is running, the kernel is _working_, indicated by ⬤ in the top right corner.
 - After finishing execution, the kernel is _idle_, indicated by ◯ in the top right corner.
 - The kernel can be interrupted if it is stuck, or if processing takes too long. Variables are intact; only the cell that was running is stopped.
 
 
 - To clean-up the results from all cells, use [Kernel] > [Restart Kernel and Clear Outputs]
 - To re-execute the entire notebook, use [Kernel] > [Restart Kernel and Run All Cells]
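
To make the idea of kernel state more concrete, here is a minimal sketch that models a kernel as one dictionary shared by every executed cell. `ToyKernel`, `run_cell` and `restart` are hypothetical names used only for illustration; a real kernel is a separate process and is far more involved.

```python
class ToyKernel:
    """A toy stand-in for a kernel: one namespace shared by all cells."""

    def __init__(self):
        self.namespace = {}       # variables and imports live here
        self.execution_count = 0  # the `n` shown as [n] next to a cell

    def run_cell(self, source):
        """Execute a cell's source code against the shared namespace."""
        self.execution_count += 1
        exec(source, self.namespace)
        return self.execution_count

    def restart(self):
        """Forget all state; the cell sources themselves are untouched."""
        self.namespace = {}
        self.execution_count = 0


kernel = ToyKernel()
cells = ["x = 2", "y = x + 3"]  # cell contents live outside the kernel

for source in cells:
    kernel.run_cell(source)
print(kernel.namespace["y"])    # the second cell could see `x` from the first

kernel.restart()
print("x" in kernel.namespace)  # the variables are gone after a restart
```

The list of cell sources survives the restart, which mirrors how cell contents persist while variable values do not.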

### About the Notebook

- By default, Jupyter autosaves your notebook every 120 seconds. A checkpoint copy of your `.ipynb` file is kept in the `.ipynb_checkpoints` folder, separate from the file you are working on (and with no running kernel). You can go back to it using [File] > [Revert Notebook to Checkpoint].
- Inline images can drastically increase the size of a notebook. 
- Code platforms (such as Github, Bitbucket, Gitlab, Atlassian's Git Workflow) render notebooks natively.
- [NBViewer](https://nbviewer.jupyter.org) is an online tool which allows you to visualize any notebook without having to start an instance of Jupyter locally.

**ℹ️ Tip**: restarting the kernel and running all cells ensures that you have no stale variables and that the notebook is ready to share.
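
Since a notebook is a text file, its content can be inspected with ordinary tools. The sketch below builds a tiny hand-written document in the JSON layout used by `.ipynb` files and reads the cells back. It only shows the general shape; it is not a complete or validated notebook, and the cell contents are made up for the example.

```python
import json

# A minimal notebook document: one text cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # shown as [ ] until the cell is run
            "outputs": [],            # stored outputs, including inline images
            "source": ["x = 2"],
        },
    ],
}

text = json.dumps(notebook, indent=1)  # the kind of text a .ipynb file stores
loaded = json.loads(text)

for cell in loaded["cells"]:
    print(cell["cell_type"], "".join(cell["source"]))
```

Because outputs are stored inside the same file, large inline images translate directly into a larger notebook.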

### Interface Interaction
 - Navigate folders and files in the [File Browser] (on the sidebar to the left, or **`ctrl+b`**).
 - Switch between open notebooks using the tabs at the top.
 - Select multiple files and move them using drag-n-drop.
 - Download/upload using drag-n-drop to/from your system's file browser.
 - Split the screen and view multiple files at once by arranging them through drag-n-drop.
 - As in any Unix system, notebooks can be safely renamed without affecting their current state.
 
 - [View] > [Show line numbers] to more easily track error lines.
 - Each cell's output and input can be collapsed by clicking the blue bar to the left of it.
 
 
 - Create a new file by opening the _File Browser_ and clicking the plus (**`+`**) sign.
 - Besides Python kernels, it also gives the option of creating _Terminal_ instances for running system commands.
 
 
 - Search all commands in the [Commands Palette] (on the sidebar to the left).
 - All commands presented above are also available in the menu bar at the top (and many more).
 
 - Jupyter supports many kinds of useful data types. Some samples are provided in the `example_files` folder:
  - plain text (`.txt`) 
  - images (`.png`, `.jpg`, etc)
  - JSON files (`.json`), non-editable but allows collapsing and expanding nodes
  - comma-separated-values (`.csv`), big files can be navigated smoothly
  - markdown files (`.md`) with a lightweight rendition of the formatting
  - Python source files (`.py`), with syntax highlighting
  - other languages such as HTML, with syntax highlighting

### Code Cells

#### Execution History

`In` and `Out` are special variables which automatically store the input and output of each executed cell (in their order of execution).

In [None]:
In[2]  # the content of the second executed cell, as a string

In [None]:
Out[2]  # the output of the second executed cell, if it had any

There are also some shortcuts:

In [None]:
_2  # a shortcut for Out[2]

In [None]:
_  # output of the most recently run cell

In [None]:
__  # output of the cell run before that

**👾 Trivia**: `__` is called a _dunder_ (for double underscore) in Python.

#### Outputting Tricks

You can combine the assignment of a variable and its display into a single cell. Instead of using two cells:

In [None]:
a = 1 + 2

In [None]:
a

you can use a single one, because the output of a cell is the result of the last expression in it:

In [None]:
a = 1 + 2
a

---

The following cell evaluates the expression inside of it, a string, and shows its output, which is the same string:

In [None]:
'hello'

The following cell, on the other hand, evaluates the expression inside of it, a call to `print`, and shows its output (`None`, so nothing is displayed), while also showing the console output (the string given as argument):

In [None]:
print('hello')

To better illustrate this, the following function produces both an output (the number 24) and console output (the string):

In [None]:
def custom_print(s):
    print(s)
    return 24

custom_print('hello')
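
The two kinds of output can also be told apart in plain Python. This sketch, which uses only the standard library, captures what the function prints separately from what it returns; `buffer`, `returned` and `console_output` are names chosen for the example.

```python
import contextlib
import io


def custom_print(s):
    print(s)
    return 24


# Capture whatever the call prints, separately from what it returns.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    returned = custom_print('hello')

console_output = buffer.getvalue()
print(repr(console_output))  # the text written to standard output
print(repr(returned))        # the value a notebook shows as the cell's output
```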

---

Sometimes we just want to run a statement for its side effects and don't care about the returned value:

In [None]:
def launch_missile():
    print('missile launched')
    success_odds = 0.8
    return success_odds

In [None]:
launch_missile()

We can either assign it to a dummy variable (conventionally `_`):

In [None]:
_ = launch_missile()

Or terminate the current statement using `;`:

In [None]:
launch_missile();

---

You can change this behavior to show more than just the last expression:

In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

In [None]:
name = 'Tommy'

In [None]:
name[:3]
name[-1]

In [None]:
InteractiveShell.ast_node_interactivity = 'last_expr'  # back to the default
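
The difference between the two settings can be modeled in a few lines. The following is a simplified sketch, not IPython's actual implementation: the hypothetical `run_cell` function parses a cell, executes its statements, and collects the values of the expressions that would be displayed under `'last_expr'` or `'all'`.

```python
import ast


def run_cell(source, namespace, interactivity='last_expr'):
    """Run `source` and return the values a notebook would display."""
    tree = ast.parse(source)
    shown = []
    for position, node in enumerate(tree.body):
        is_last = position == len(tree.body) - 1
        is_expression = isinstance(node, ast.Expr)
        if is_expression and (interactivity == 'all' or is_last):
            # Evaluate the bare expression and keep its value for display.
            code = compile(ast.Expression(node.value), '<cell>', 'eval')
            value = eval(code, namespace)
            if value is not None:  # `None` results are not displayed
                shown.append(value)
        else:
            # Statements (and expressions that are not displayed) are only executed.
            code = compile(ast.Module([node], type_ignores=[]), '<cell>', 'exec')
            exec(code, namespace)
    return shown


ns = {}
run_cell("name = 'Tommy'", ns)
print(run_cell("name[:3]\nname[-1]", ns))                       # only the last expression
print(run_cell("name[:3]\nname[-1]", ns, interactivity='all'))  # every expression
```

An assignment is a statement rather than an expression, which is why a cell ending in `a = 1 + 2` displays nothing in this model.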

### Text Cells

As illustrated throughout this document, _text_ cells can contain more than just plain text.

#### Markdown

_Markdown_ is a lightweight markup language with an intuitive, plain-text formatting syntax. Though not as powerful as other markup languages (such as HTML), it is widely used thanks to its simplicity and expressivity (GitHub readmes, Slack messages, StackOverflow posts, static site generators, project management tools).

Double-click this cell to see the source that creates these styles.


Text styles:
 - regular
 - **bold**
 - _italic_ 
 - `code` 
 - [link](https://www.example.com)
 
 
 
> this is a quote


block of code (text, not executable):
```html
<div id="greeting">hello</div>
```


Ordered and unordered lists:
 1. first
 2. second
 3. third
   - this is
   - a sublist

Headers (1`#` through 6`######`)

Below is a separation line:

---

Below is an embedded image (note the `!` before the link):

![usc logo](http://i238.photobucket.com/albums/ff58/Portergirl2311/University_of_Southern_California_s.png)

You can also embed gifs:

![tommy flag](https://media.giphy.com/media/DpoZg6IWyRSsE/giphy.gif)

**ℹ️ Tip**: remember the format for a link like this:
 - the first part is what's displayed, acts like a button thus it is surrounded by square brackets `[link]`
 - the second part is what it links to `(address.com)`

#### Latex

_Latex_ is the standard in scientific documents. It can be used to typeset beautiful equations such as $e^{i\pi} + 1 = 0$
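
As a short sketch of the syntax in a text cell: single dollar signs keep the formula inside the sentence, while double dollar signs put it on its own centered line. The formulas below are arbitrary examples chosen for illustration.

```latex
Inline: the identity $e^{i\pi} + 1 = 0$ sits inside the sentence.

Display:
$$
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
$$
```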

**ℹ️ Tip**: the combination of Markdown and Latex is a common one. It blends quick organization with complex snippets when needed. This makes it very useful in contexts such as note taking (and not only for STEM fields). I recommend [Typora](https://typora.io) for a standalone editor and [StackEdit](https://stackedit.io/) for a web-based editor.

#### HTML

HTML formatting is available for more complex formatting: 

<div style="text-align: center; color: orange; font-size: 30px">Hello from HTML!</div>

You can also render it from inside code cells:

In [None]:
from IPython.display import HTML

In [None]:
HTML('<div style="text-align: center; color: orange; font-size: 30px">Hello from HTML!</div>')
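
Your own objects can provide an HTML rendering as well: when a cell's result has a `_repr_html_` method, Jupyter uses the string it returns. The `Banner` class below is a hypothetical example; outside a notebook the method can simply be called to inspect the markup.

```python
import html


class Banner:
    """Renders as a centered, colored line of text when displayed in a notebook."""

    def __init__(self, text, color='orange'):
        self.text = text
        self.color = color

    def _repr_html_(self):
        # Escape the text so that characters such as < and & are shown literally.
        safe_text = html.escape(self.text)
        return (
            f'<div style="text-align: center; color: {self.color}; '
            f'font-size: 30px">{safe_text}</div>'
        )


banner = Banner('Hello from HTML!')
print(banner._repr_html_())  # in a notebook, ending a cell with `banner` renders it
```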

Just a small example is shown in this workshop, but you can use (almost) everything from HTML in a notebook, even JS scripts:

In [None]:
HTML('''
    alert("Hello from JavaScript!")
''')

**💪 Exercise**: put `<script>` `</script>` tags  around the `alert` statement inside the string above and run the cell!

### Magics

A _magic_ is a special command for Jupyter. Line magics start with one `%` and cell magics start with `%%`.

#### Timing

Measure execution time for logging or optimization purposes:

In [None]:
from time import sleep

Measure how long the entire cell takes to run:

In [None]:
%%time
for i in range(3):
    print(i)
    sleep(.5)

Measure how long a single line takes to run:

In [None]:
%time _=[n ** 2 for n in range(1_000_000)]

In [None]:
%time _=list(map(lambda n: n ** 2, range(1_000_000)));

Due to external environment variations, running it again might yield different results. Doing multiple trials is more robust to such noise:

In [None]:
%timeit _=[n ** 2 for n in range(1_000_000)]

In [None]:
%timeit _=list(map(lambda n: n ** 2, range(1_000_000)))
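
Outside a notebook, the standard library's `timeit` module supports a similar repeated-trials approach. This is a sketch rather than a reproduction of what `%timeit` reports; the function names are made up for the example and the input size is reduced so it finishes quickly.

```python
import timeit


def squares_comprehension():
    return [n ** 2 for n in range(10_000)]


def squares_map():
    return list(map(lambda n: n ** 2, range(10_000)))


for func in (squares_comprehension, squares_map):
    # 5 trials of 10 calls each; the fastest trial is the least affected by noise.
    trials = timeit.repeat(func, number=10, repeat=5)
    best = min(trials) / 10
    print(f'{func.__name__}: {best * 1000:.3f} ms per call (best of 5)')
```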

---

### Interactivity

"Animations" can be created by clearing the output of a cell and then filling it again:

In [None]:
from IPython.display import clear_output

In [None]:
for i in range(5):
    clear_output()
    print(i)
    sleep(.5)

---

Besides code and text, Jupyter also supports _widgets_. They can be used as alternative input methods which also refresh on change.

_Note:_ you might have to run `!jupyter labextension install @jupyter-widgets/jupyterlab-manager` and possibly re-run Jupyter (`ctrl-C, ctrl-C` and `jupyter lab`) if this shows an error.

In [None]:
# import ipywidgets
# from ipywidgets import interact

In [None]:
# def power(base, exp, negative):
#     result = base ** exp
#     if negative:
#         result *= -1
#     return round(result, 2)

# interact(power, base=2.5, exp=3, negative=False);

---

Due to how Python modules are structured, this trick is needed in order to import from nested folders:

In [None]:
from sys import path
path.append('..')  # add the parent folder to the list of directories searched for packages

In [None]:
from example_files.lucky import lucky_number

In [None]:
lucky_number()
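
The effect of the search path can be reproduced in isolation. The sketch below creates a throwaway package in a temporary folder (the names `demo_files` and `lucky_demo` are hypothetical stand-ins for the workshop files) and shows that the import works once the folder containing the package is on `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project: <root>/demo_files/lucky_demo.py
root = Path(tempfile.mkdtemp())
package = root / 'demo_files'
package.mkdir()
(package / '__init__.py').write_text('')
(package / 'lucky_demo.py').write_text('def lucky_number():\n    return 7\n')

# Python only searches the directories listed in sys.path,
# so the project root has to be on that list before importing.
sys.path.append(str(root))
importlib.invalidate_caches()  # make sure the newly created files are noticed

from demo_files.lucky_demo import lucky_number

print(lucky_number())
```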

#### Auto-reloading

Continuously scan an external file and re-import it upon changes:

In [None]:
%load_ext autoreload
%autoreload 2
# first load the extension, then activate it

In [None]:
from example_files.lucky import lucky_number

In [None]:
lucky_number()

**💪 Exercise**: Change the number in `lucky.py`, save the file and then run the cell above again.
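
For comparison, this is roughly the manual step that auto-reloading saves you from. The sketch writes a hypothetical module `lucky_reload_demo` to a temporary folder, edits it on disk, and shows that the change only becomes visible after an explicit `importlib.reload`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep this demo free of cached bytecode files

# Create a throwaway module on the import path.
folder = Path(tempfile.mkdtemp())
module_file = folder / 'lucky_reload_demo.py'
module_file.write_text('def lucky_number():\n    return 7\n')
sys.path.append(str(folder))
importlib.invalidate_caches()

import lucky_reload_demo

before = lucky_reload_demo.lucky_number()

# Edit the file on disk; the already imported module does not change by itself.
module_file.write_text('def lucky_number():\n    return 13\n')
stale = lucky_reload_demo.lucky_number()

# Re-import explicitly so that the new source is executed.
importlib.reload(lucky_reload_demo)
after = lucky_reload_demo.lucky_number()

print(before, stale, after)
```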

---

Pass variables between notebooks:
 - `%store x` to pass the `x` variable from the source notebook
 - `%store -r x` to assign the passed value to the variable `x` in the destination notebook

**ℹ️ Tip**: Jupyter is meant to be a complement for your IDE, not a replacement. The bulk of your processing (functions, classes, etc) should be organized in `.py` files, while notebooks should be used for importing public functions/classes, running them and inspecting the results.

### Shell Integration

Direct interaction with the shell can be achieved by running system commands prepended with a bang (`!`):

In [None]:
!ls

In [None]:
!echo 'Hello from shell!'

Command outputs can be assigned to Python variables:

In [None]:
n_files = !ls | wc -l

In [None]:
n_files  # does not always come in the desired format

In [None]:
int(n_files[0].strip())  # but can be easily converted
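
In a regular `.py` file the bang syntax is not available, but the standard `subprocess` module can run a command and capture its output in a similar way. This is a sketch of that alternative; the Python interpreter itself is used as the command so that the example does not depend on a particular shell.

```python
import subprocess
import sys

# Run a child process and capture what it prints instead of letting it
# write straight to the screen.
command = [sys.executable, '-c', "print('Hello from shell!')"]
result = subprocess.run(command, capture_output=True, text=True, check=True)

lines = result.stdout.splitlines()  # a list of lines, much like `n_files` above
print(lines)
```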

A frequent use case is installing packages without leaving the notebook:

In [None]:
!pip install --upgrade pip

---

**ℹ️ Tip**: you might see other people/tutorials using Jupyter **Notebook**. The first version of the Jupyter environment was called _Jupyter Notebook_. We are now using the latest version, called _Jupyter Lab_, which has extra functionality. The files both versions operate on are called _Jupyter notebooks_ (or _notebooks_ for short). Extra functionality in Jupyter Lab includes tabs, split view, cell operations across notebooks and support for multiple kinds of files, making it a more capable and efficient environment.

## Topics left uncovered

- [Jupyter extensions](https://jupyterlab.readthedocs.io/en/stable/user/extensions.html)
- turning notebooks into presentations

## Further Reading
These are some of the resources that cover the important information and do so efficiently:

 - Jupyter: 
   - [built-in magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html)
   - [built-in widgets](https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20List.html)
   - [gallery of interesting notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks#data-visualization-and-plotting)
   - [IPython options](https://ipython.readthedocs.io/en/stable/config/options/terminal.html)
   - [list of tips and tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
   - [contrib extensions](https://github.com/ipython-contrib/jupyter_contrib_nbextensions)
 - Latex: [tutorial series](https://www.sharelatex.com/blog/latex-guides/beginners-tutorial.html)
 - Markdown: [GFM guide](https://guides.github.com/features/mastering-markdown/)
 - HTML: [interactive tutorial](https://www.codecademy.com/learn/learn-html)
 - JavaScript: [MDN guide](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide)
 - Command line: [interactive tutorial](https://www.codecademy.com/learn/learn-the-command-line)