# Jupyter use cases

## <font color="red"> *Type-along exercise 1* </font>

Let's spend a few minutes working on a toy project. We will be using the [word-count project](https://github.com/coderefinery/word-count) from earlier lessons.

1. Go to the File browser, select New launcher, and create a new notebook.
2. Drag the new notebook so that it sits side by side with this notebook.
3. Drag this cell into the new notebook, and then go to single-notebook view again.
4. Use the `Commands` palette from the left-hand menu (`Ctrl(⌘)-Shift-C`) to open a new terminal, and drag it to the bottom.
5. In the terminal, change directory to the `word-count` example project (if you don't already have it, clone `https://github.com/coderefinery/word-count.git`). Run `snakemake -s Snakefile_all` (this could be done from the notebook too).
6. In the notebook, use a magic to load the word-count project README file at the top, and add a heading.
7. In the notebook, create a directory `zipf-test`, and copy the `word-count/processed_data` folder to it.
8. Copy-paste the code below to a code cell (pretending that we just wrote it), and save it to a file `zipf.py`.

```python
def load_word_counts(filename):
    """
    Load a list of (word, count, percentage) tuples from a file where each
    line is of the form "word count percentage". Lines starting with # are
    ignored.
    """
    counts = []
    with open(filename, "r") as input_fd:
        for line in input_fd:
            if not line.startswith("#"):
                fields = line.split()
                counts.append((fields[0], int(fields[1]), float(fields[2])))
    return counts

def top_n_word(counts, n):
    """
    Given a list of (word, count, percentage) tuples,
    return the top n word counts.
    """
    limited_counts = counts[0:n]
    count_data = [count for (_, count, _) in limited_counts]
    return count_data

def zipf_analysis(input_file, n=10):
    counts = load_word_counts(input_file)
    top_n = top_n_word(counts, n)
    return top_n
```

9. Run the `zipf_analysis()` function on a processed data file. Plot the output and compare it with a 1/N function, using the following code:

```python
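import matplotlib.pyplot as plt

nmax = 10
z = zipf_analysis("processed_data/isles.dat", nmax)
n = range(1, nmax + 1)
# Normalize by the count of the most frequent word
z_norm = [i / z[0] for i in z]
plt.plot(n, z_norm)
# Ideal Zipf curve for comparison
inv_n = [1.0 / i for i in n]
plt.plot(n, inv_n)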
```
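
To put a number on the comparison instead of judging the two curves by eye, one option is to compute the largest gap between the normalized counts and the 1/n curve. The sketch below is only an illustration: the helper name `max_deviation_from_zipf` and the example counts are made up and do not come from `isles.dat`.

```python
def max_deviation_from_zipf(counts):
    """
    Normalize counts by the first (largest) count and return the largest
    absolute difference from the ideal Zipf curve 1/n.
    """
    if not counts or counts[0] <= 0:
        raise ValueError("counts must start with a positive value")
    z_norm = [c / counts[0] for c in counts]
    inv_n = [1.0 / n for n in range(1, len(counts) + 1)]
    return max(abs(z - ideal) for z, ideal in zip(z_norm, inv_n))

# Illustrative counts only; replace with the output of zipf_analysis().
example_counts = [120, 60, 40, 30, 24]
deviation = max_deviation_from_zipf(example_counts)
print(deviation)
```

A value close to zero means the counts follow the 1/n curve closely; larger values indicate a poorer fit.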


## Widgets

Widgets add interactivity to notebooks, letting you visualize and control changes in data and parameters.

In [None]:
from ipywidgets import interact

#### Use `interact` as a function

In [None]:
def f(x, y, s):
    return (x, y, s)

interact(f, x=True, y=1.0, s="Hello");

#### Use `interact` as a decorator

In [None]:
@interact(x=True, y=1.0, s="Hello")
def g(x, y, s):
    return (x, y, s)

## <font color="red"> *Type-along exercise 2* </font>

Let's see how we can use an interactive widget to analyze Zipf's law in the word-count project!
> Hint: you can for example try these two widget parameters:   
> `@interact(nmax=(6,14), p=-1.0)`
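
One possible starting point, assuming the widget should redraw a power law n^p next to the measured counts: the function below only computes the curve, so it runs without `ipywidgets`. The names `power_law` and `zipf_plot` are made up for this sketch.

```python
def power_law(nmax, p):
    """Return [1**p, 2**p, ..., nmax**p]; p = -1.0 gives the 1/n curve."""
    return [float(n) ** p for n in range(1, nmax + 1)]

# In a notebook, a widget-driven version might look like this
# (commented out here because it needs ipywidgets and matplotlib):
#
# @interact(nmax=(6, 14), p=-1.0)
# def zipf_plot(nmax=10, p=-1.0):
#     plt.plot(range(1, nmax + 1), power_law(nmax, p))

curve = power_law(4, -1.0)
print(curve)
```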

## Additional useful magic commands

### %timeit
- Timing execution
- Both Line and Cell level

In [None]:
%timeit import time ; time.sleep(1)

In [None]:
%%timeit import numpy as np
# The import on the %%timeit line is setup code: it runs but is not timed.
a = np.random.rand(100, 100)
np.linalg.eigvals(a)
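
Outside IPython, the standard-library `timeit` module offers similar measurements. A minimal sketch follows; the statement being timed is an arbitrary example, and the repeat counts are chosen only to keep it fast.

```python
import timeit

# Time a small statement: 3 repeats of 1000 executions each.
timings = timeit.repeat(
    stmt="sum(range(100))",
    repeat=3,
    number=1000,
)

# The minimum is usually the least noisy estimate of the cost.
best_per_loop = min(timings) / 1000
print(f"best of 3: {best_per_loop * 1e6:.2f} microseconds per loop")
```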

### %run 
 - Executes Python code from `.py` files
 - Can also execute other Jupyter notebooks

In [None]:
%run foo
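
The standard library's `runpy.run_path` does something comparable in plain Python. The sketch below writes a throwaway script to a temporary directory (the file name and its contents are invented for the example) and executes it.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "foo.py"
    script.write_text("answer = 6 * 7\n")

    # run_path executes the file and returns its global namespace as a dict.
    namespace = runpy.run_path(str(script), run_name="__main__")

print(namespace["answer"])
```

In a notebook, `%run` goes one step further and places the script's variables directly in the interactive namespace, so they can be used in later cells.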

### %debug
Activate interactive debugger

Let's try using `%debug` to hunt down a bug. We first execute the cell below, which fails with an exception, and then run the `%debug` magic.

In [None]:
def calc_reciprocal(x):
    inv_x = []
    for i in x:
        inv_x.append(1.0 / i)
    return inv_x

x = [1,5,2,0,5]
y = calc_reciprocal(x)

Run the debugger post-mortem. If an exception has just occurred, the `%debug` magic lets you inspect its stack frames interactively.

In [None]:
%debug

**Don't forget to exit the debugger by typing `q` and `Enter`!**  
If you don't, the background process will not be ready for your next command.
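
The information you see in the debugger can also be pulled out programmatically. The sketch below is only an illustration, not how `%debug` is implemented: it catches the exception and walks its traceback to the innermost frame, where the local variables show what went wrong.

```python
def calc_reciprocal(x):
    inv_x = []
    for i in x:
        inv_x.append(1.0 / i)
    return inv_x

try:
    calc_reciprocal([1, 5, 2, 0, 5])
except ZeroDivisionError as err:
    tb = err.__traceback__
    # Follow the chain to the frame where the exception was raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    # Copy the local variables of that frame for inspection.
    frame_locals = dict(tb.tb_frame.f_locals)

print(frame_locals["i"])      # the value that triggered the error
print(frame_locals["inv_x"])  # results collected before the failure
```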

### %prun
 - Python code profiler
 - Cell and Line magic

In [None]:
%%prun 
a = np.random.rand(1000, 1000)
np.linalg.eigvals(a)
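
Outside a notebook, the standard-library `cProfile` and `pstats` modules give a similar report. A minimal sketch, using a deliberately simple pure-Python workload (`slow_sum` is a made-up function for this example):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Deliberately simple workload to have something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```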

## Exercises

> Now open the [exercises](exercises.ipynb) notebook and start working on the exercises!

## Mixing in other languages (assuming they're installed)

Why would you want to mix programming languages in the same notebook?
 - Leverage the strengths of different languages
 - Reuse code from colleagues
 - Use a fantastic library that exists only in a language other than your favorite one

In [None]:
%%ruby
puts 'Hi, this is ruby.'

In [None]:
%%script ruby
puts 'Hi, this is also ruby.'

In [None]:
%%perl
print "Hello, this is perl\n";

In [None]:
%%bash
echo "Hullo, I'm bash"
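
Conceptually, these cell magics hand the cell body to an external interpreter. A rough plain-Python analogue uses `subprocess`, assuming a `bash` (or at least `sh`) executable is on the `PATH`. This is a sketch of the idea, not IPython's implementation.

```python
import shutil
import subprocess

# Fall back to plain sh if bash is not installed.
shell = shutil.which("bash") or shutil.which("sh")

cell_body = 'echo "Hullo, I\'m bash"\n'

# Feed the "cell" to the interpreter on standard input and capture its output.
completed = subprocess.run(
    [shell],
    input=cell_body,
    capture_output=True,
    text=True,
    check=True,
)
print(completed.stdout, end="")
```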

In [None]:
%%html
<table>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>

In [None]:
%%latex
\begin{align}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{align}

### R

The R world already has a powerful IDE, RStudio, where one can annotate code using Markdown and export to HTML. But Jupyter is good for mixing languages.

In [None]:
# first we need to install the necessary packages
#!conda install -c r r-essentials 
#!conda install -y rpy2

To run R from the Python kernel, we need to load the rpy2 IPython extension.

In [None]:
%load_ext rpy2.ipython

In [None]:
%%R
myString <- "Hello, this is R"
print ( myString)

Inline plotting in R is straightforward.

In [None]:
%%R 
# Define the cars vector with 5 values
cars <- c(1, 3, 6, 4, 9)

# Graph cars using blue points overlaid by a line
plot(cars, type="o", col="blue")

# Create a title with a red, bold/italic font
title(main="Autos", col.main="red", font.main=4)

Data defined in one R cell persists in later R cells.

In [None]:
%%R 
barplot(cars)

## When to use notebooks
- Experimenting with new ideas, testing new libraries/databases 
- Interactive code, data analysis and visualization development
- Interactive work on HPC clusters
- Sharing and explaining code to colleagues
- Learning from other notebooks
- Keeping track of interactive sessions, like a digital lab notebook
- Supplementary information with published articles
- Teaching (programming, experimental/theoretical science)
- Presentations with slides using [RISE](https://github.com/damianavila/RISE), which is built on Reveal.js

On the other hand, notebooks are: 
- Less useful for large codebases 
- More difficult to do automated testing on 
- Tricky when it comes to non-linear execution of cells; discipline is needed!

## Sharing notebooks

- You can enter a URL, GitHub repo or username, or Gist ID in [`nbviewer`](https://nbviewer.jupyter.org/) and view a rendered Jupyter notebook
    - try entering just "coderefinery" and see if you can find this current notebook
- Read the Docs can render Jupyter Notebooks via the [nbsphinx package](https://nbsphinx.readthedocs.io/)
- [Binder](https://mybinder.org/) creates live notebooks based on a GitHub repository
- [CoCalc](https://cocalc.com/) (formerly SageMathCloud) allows collaborative editing of notebooks in the cloud 
- Google's [Colaboratory](https://colab.research.google.com/) lets you work on notebooks in the cloud, and you can [read and write to notebook files on Drive](https://colab.research.google.com/notebooks/io.ipynb)
- [Microsoft Azure Notebooks](https://notebooks.azure.com/) also offers free notebooks in the cloud
- [JupyterLab](https://github.com/jupyterlab/jupyterlab) supports sharing and collaborative editing of notebooks via Google Drive 
- [Notedown](https://github.com/aaren/notedown), [Jupinx](https://github.com/QuantEcon/sphinxcontrib-jupyter) and [DocOnce](https://github.com/hplgit/doconce) can take Markdown or Sphinx files and generate Jupyter Notebooks
- The `jupyter nbconvert` tool can convert a notebook file (`.ipynb`) to:
    - Python code (`.py` file)
    - an HTML file
    - a LaTeX file
    - a PDF file
    - a slide-show in the browser

Note: the Google, Microsoft and CoCalc platforms are free, but they offer paid subscriptions for faster access to cloud resources.
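
The `jupyter nbconvert` conversions listed above are feasible because an `.ipynb` file is plain JSON. As a rough illustration (not how `nbconvert` itself works), the sketch below builds a tiny notebook by hand and extracts its code cells, similar in spirit to exporting to a `.py` file. The helper name `code_cells_to_script` is made up for this example.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout.
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": ["print(y)"]},
    ],
})

def code_cells_to_script(raw):
    """Join the source of all code cells, separated by blank lines."""
    nb = json.loads(raw)
    chunks = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            source = cell["source"]
            # "source" may be a list of lines or a single string.
            chunks.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(chunks) + "\n"

script = code_cells_to_script(notebook_json)
print(script)
```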

## Key points

- Jupyter is powerful for data analysis and quick prototyping of code.
- Allows fast feedback in your test-code-refactor loop (see [test-driven development](https://en.wikipedia.org/wiki/Test-driven_development)).
- Widgets provide more interactivity.
- [Support for many programming languages](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels)
    - and different languages can be mixed
- Platforms exist for sharing notebooks and collaborating on them with colleagues.


## Final discussion

- If you are already using Jupyter, what tasks do you use it for? 
- If you are new to Jupyter, do you see any possible use cases?
- Do you think Jupyter Notebooks can help tackle the problem of irreproducible results?