# Jupyter Notebook Bootcamp
Zening Qu, April 7 2020

## 1. Why Jupyter Notebook?

1. The author can show how they came to their data analysis conclusions.

2. The reader can reproduce the analysis, or learn how something is implemented.

3. Text, [links](https://jupyter.org/), `code`, equations $E = mc^2$, plots, pictures, and videos all live in one place, which creates an experience similar to a technical blog, *and* it’s interactive!

In [1]:
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/d9XhQkzcciY?start=106" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"></iframe>')



In [2]:
import altair as alt
import pandas as pd
import numpy as np

# generate fake data
source = pd.DataFrame({'gender': ['M']*1000 + ['F']*1000,
               'height':np.concatenate((np.random.normal(69, 7, 1000),
                                       np.random.normal(64, 6, 1000))),
               'weight': np.concatenate((np.random.normal(195.8, 144, 1000),
                                        np.random.normal(167, 100, 1000))),
               'age': np.concatenate((np.random.normal(45, 8, 1000),
                                        np.random.normal(51, 6, 1000)))
        })

selector = alt.selection_single(empty='all', fields=['gender'])

color_scale = alt.Scale(domain=['M', 'F'],
                        range=['#1FC3AA', '#8624F5'])

base = alt.Chart(source).properties(
    width=250,
    height=250
).add_selection(selector)

points = base.mark_point(filled=True, size=200).encode(
    x=alt.X('mean(height):Q',
            scale=alt.Scale(domain=[0,84])),
    y=alt.Y('mean(weight):Q',
            scale=alt.Scale(domain=[0,250])),
    color=alt.condition(selector,
                        'gender:N',
                        alt.value('lightgray'),
                        scale=color_scale),
)

hists = base.mark_bar(opacity=0.5, thickness=100).encode(
    x=alt.X('age',
            bin=alt.Bin(step=5), # step keeps bin size the same
            scale=alt.Scale(domain=[0,100])),
    y=alt.Y('count()',
            stack=None,
            scale=alt.Scale(domain=[0,350])),
    color=alt.Color('gender:N',
                    scale=color_scale)
).transform_filter(
    selector
)


points | hists

👉 **Caution:** If you are not careful, a notebook can look like <font color='darkgreen'><b>"a code vomit."</b></font> You don’t want your notebook to look like that!

<img src="img/code-vomit.png" align="left"/>

## 2. Install & Start Jupyter Notebook In Your Command Line Console

***Prerequisites:***

Python (and Conda, which the install command below uses)

***Install:***
```
conda install -c conda-forge notebook
```

***Start:***
```
jupyter notebook
```

See the [official documentation](https://jupyter.org/install) for more explanations.

## 3. Markdown

Click on `Help -> Markdown -> Basic Writing and formatting syntax` to see the full guide.

Some of the most commonly used ones are listed here:

# The largest heading
## The second largest heading
###### The smallest heading

*italicized text*

**bold text**

***bold AND italicized text***

~~strikethrough text~~

<font color='blueviolet'><b>bold</b></font>
<font color='cornflowerblue'><b>and</b></font>
<font color='crimson'><b>colorful</b></font>
<font color='darkslateblue'><b>text</b></font>

Allow me to quote the Holy Bible:

> *Devote yourselves to prayer, being watchful and thankful. And pray for us, too, that God may open a door for our message, so that we may proclaim the mystery of Christ, for which I am in chains. Pray that I may proclaim it clearly, as I should. Be wise in the way you act toward outsiders; make the most of every opportunity. Let your conversation be always full of grace, seasoned with salt, so that you may know how to answer everyone.*
>
> <div style="text-align: right"><em>Holy Bible, The Letter to the Colossians, Chapter 4, Verse 2-6, New International Version</em></div>

In case you want to mention some code `foo` ***inline.***

Or if you want to ***block quote some code:***
```
def foo():
    # my smart implementation
    return
```

Links, images, videos, and plots? Find examples in this notebook to see how it's done!

## 4. Python

## The Hello World

In [20]:
print('Hello World ❤️')

Hello World ❤️


## Import Your Favorite Libraries and Inspect Their Versions (Useful for Debugging)

In [19]:
import sys
print(sys.version)

3.8.1 (v3.8.1:1b293b6006, Dec 18 2019, 14:08:53) 
[Clang 6.0 (clang-600.0.57)]


In [18]:
import sklearn
print(sklearn.__version__)

0.22.2.post1


In [21]:
import pandas as pd
print(pd.__version__)

1.0.3


In [22]:
import numpy as np
print(np.__version__)

1.18.2
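If you would rather collect all the versions in one cell, here is a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8). The helper `report_versions` is a made-up name for illustration. Note that it takes the *distribution* name, so scikit-learn is listed as `scikit-learn` even though you import it as `sklearn`.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # the package is not installed in this environment
            found[name] = None
    return found


versions = report_versions(['numpy', 'pandas', 'scikit-learn'])
for name, ver in versions.items():
    print(name, ver or 'not installed')
```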


## Very User-Friendly Python Built-in Functions: `all` and `any`

In [24]:
a = [True, True, False]
if any(a):
    print('At least one True')
if all(a):
    print('All true')
if any(a) and not all(a):
    print('Some true but not all true')

At least one True
Some true but not all true
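`any` and `all` accept any iterable, so you can feed them a generator expression instead of building a list of booleans first. A small sketch with made-up scores:

```python
scores = [72, 85, 91]  # made-up example data

# A generator expression produces the True/False values on the fly
has_failure = any(score < 60 for score in scores)
all_passed = all(score >= 60 for score in scores)

print(has_failure)  # False
print(all_passed)   # True

# Edge case: on an empty iterable, any() is False and all() is True
print(any([]), all([]))  # False True
```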


## Numpy Arrays and Comparison Operators: A Cool Trick About Mask

In [57]:
np.array([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

In [58]:
np.array([1, 2, 3, 4, 5]) < 3

array([ True,  True, False, False, False])

In [55]:
# filtering an array with a boolean mask (NumPy's docs call this boolean array indexing)
a = np.array([1, 2, 3, 4, 5])
a[a<3]

array([1, 2])
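Masks can also be combined. The sketch below reuses the same five-element array. The variable names are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Combine masks with & (and), | (or), ~ (not). The parentheses are required
# because these operators bind more tightly than the comparisons.
middle = a[(a > 1) & (a < 5)]
ends = a[(a < 2) | (a > 4)]
not_small = a[~(a < 3)]

print(middle)     # [2 3 4]
print(ends)       # [1 5]
print(not_small)  # [3 4 5]
```

Python's `and` / `or` do not work here: they try to reduce the whole array to a single truth value and raise a `ValueError`.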

In [56]:
# you can threshold an image
a = np.array([
[12, 13, 14, 12, 16, 14, 11, 10,  9],
[11, 14, 12, 15, 15, 16, 10, 12, 11],
[10, 12, 12, 15, 14, 16, 10, 12, 12],
[ 9, 11, 16, 15, 14, 16, 15, 12, 10],
[12, 11, 16, 14, 10, 12, 16, 12, 13],
[10, 15, 16, 14, 14, 14, 16, 15, 12],
[13, 17, 14, 10, 14, 11, 14, 15, 10],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 19, 12, 14, 11, 12, 14, 18, 10],
[14, 22, 17, 19, 16, 17, 18, 17, 13],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 16, 12, 14, 11, 12, 14, 18, 11],
[10, 19, 12, 14, 11, 12, 14, 18, 10],
[14, 22, 12, 14, 11, 12, 14, 17, 13],
[10, 16, 12, 14, 11, 12, 14, 18, 11]])

b = a < 15
b.astype(int)  # the np.int alias was removed in NumPy 1.24; the built-in int works

array([[1, 1, 1, 1, 0, 1, 1, 1, 1],
       [1, 1, 1, 0, 0, 0, 1, 1, 1],
       [1, 1, 1, 0, 1, 0, 1, 1, 1],
       [1, 1, 0, 0, 1, 0, 0, 1, 1],
       [1, 1, 0, 1, 1, 1, 0, 1, 1],
       [1, 0, 0, 1, 1, 1, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 0, 0, 0, 0, 0, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1],
       [1, 0, 1, 1, 1, 1, 1, 0, 1]])
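Two more things you can do with a mask like this, sketched on a tiny made-up "image" (`img` and the values 0 and 255 are illustrative choices, not part of the example above):

```python
import numpy as np

# a small made-up "image" for illustration
img = np.array([[12, 16, 14],
                [15, 10, 19]])

mask = img < 15

# Count the pixels below the threshold: each True counts as 1
n_dark = mask.sum()

# Build a black-and-white image: 0 where the mask holds, 255 elsewhere
binary = np.where(mask, 0, 255)

print(n_dark)  # 3
print(binary)
```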

## Check Out [`collections`](https://docs.python.org/3/library/collections.html)

In [45]:
from collections import OrderedDict
OrderedDict(a = 1, b = 2, c = 3)

OrderedDict([('a', 1), ('b', 2), ('c', 3)])

In [41]:
from collections import Counter
Counter('101 Dalmations')

Counter({'1': 2,
         '0': 1,
         ' ': 1,
         'D': 1,
         'a': 2,
         'l': 1,
         'm': 1,
         't': 1,
         'i': 1,
         'o': 1,
         'n': 1,
         's': 1})
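Two more handy pieces of `collections`, as a quick sketch on a made-up word list: `Counter.most_common` ranks items by frequency, and `defaultdict` spares you from checking whether a key already exists.

```python
from collections import Counter, defaultdict

words = ['spam', 'eggs', 'spam', 'ham', 'spam', 'eggs']  # made-up data

counts = Counter(words)
# most_common(n) returns the n most frequent items, highest count first
top_two = counts.most_common(2)
print(top_two)  # [('spam', 3), ('eggs', 2)]

# defaultdict creates a missing key's value (here an empty list) on first access
by_length = defaultdict(list)
for word in set(words):
    by_length[len(word)].append(word)
print(sorted(by_length[4]))  # ['eggs', 'spam']
```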

## [`*args` and `**kwargs`](https://www.geeksforgeeks.org/args-kwargs-python/)

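These let you pass a variable number of arguments to your functions: `*args` collects the extra positional arguments into a tuple, and `**kwargs` collects the extra keyword arguments into a dict.

`kw` stands for "keyword".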
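Here is a minimal sketch of both in action. The function `describe` and its arguments are made up for illustration.

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments this call received."""
    # args is a tuple of the positional arguments
    # kwargs is a dict of the keyword arguments
    return args, kwargs


positional, keyword = describe(1, 2, 3, color='crimson', size=200)
print(positional)  # (1, 2, 3)
print(keyword)     # {'color': 'crimson', 'size': 200}

# The same symbols unpack a list or dict at the call site
numbers = [4, 5]
options = {'color': 'blueviolet'}
unpacked_positional, unpacked_keyword = describe(*numbers, **options)
print(unpacked_positional)  # (4, 5)
print(unpacked_keyword)     # {'color': 'blueviolet'}
```

The names `args` and `kwargs` are only a convention; the `*` and `**` do the work.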

## XKCD

In [23]:
import antigravity

## 5. Keyboard Shortcuts

Shortcuts save time. Here are some of the most frequently used ones:

`Enter` go to 'edit' mode

`Shift + Enter` run the current cell, select below

`Esc` go to 'command' mode

`Esc + A` insert cell above

`Esc + B` insert cell below

`Esc + D + D` (press the key twice) delete selected cells

`Esc + Y` change the cell type to Code

`Esc + M` change the cell type to Markdown

`Esc + P` open the command palette

For more shortcuts, click on `Help -> Keyboard Shortcuts`, or see Ventsislav Yordanov's blog post [Jupyter Notebook Shortcuts](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330).

## 6. Troubleshooting

If you are using Python 3, have installed a package, but are seeing a `ModuleNotFoundError` in Jupyter Notebook:

<img width='700px' src="img/pip3.png" align="left"/>

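You might fix this error by installing the package via `pip3`, which targets your Python 3 installation. Note that the package you import as `sklearn` is published under the name `scikit-learn`:

```
pip3 install scikit-learn
```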
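If the error persists, the notebook kernel may be running a different Python than the one your console installs into. A quick way to check, using only the standard library, is to run this in a notebook cell:

```python
import sys

# Path of the Python interpreter that runs this notebook's kernel
print(sys.executable)

# Directories this interpreter searches when importing a module
for path in sys.path:
    print(path)
```

If the printed interpreter is not the one you expected, installing from inside the notebook with the `%pip install <package>` magic (available in recent IPython versions) should target the kernel's own environment.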