# Authoring Jupyter Notebooks for Data Science Education

This demo notebook shows how to create Jupyter notebooks for teaching. It was prepared for the 2023 National Workshop on Data Science Education at UC Berkeley.

## Importing Modules

First, import the libraries used in this notebook. Select the following cell and press `shift` + `enter` to run it, or click the "Run" button in the Jupyter toolbar above. When working with students, a common mistake is forgetting to run this cell, which leads to a `NameError` in later cells.

In [3]:
# Don't change this cell; just run it. 
import numpy as np
from datascience import *

## Adding Markdown Cells

Instructions and background information can be added to Jupyter notebooks with Markdown cells. Double-click this cell (or select it and press `enter`) and play around with editing its Markdown source.

### Some Useful Features

You can insert equations with LaTeX:

\begin{equation} 
e^{\pi i} + 1 = 0
\end{equation}

You can add tables:

| Function | Description                                                   |
|----------|---------------------------------------------------------------|
| `abs`    | Returns the absolute value of its argument                    |
| `max`    | Returns the maximum of all its arguments                      |
| `min`    | Returns the minimum of all its arguments                      |
| `pow`    | Raises its first argument to the power of its second argument |
| `round`  | Rounds its argument to the nearest integer                    |
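
To see the functions in the table in action, here is a short sketch using Python's built-ins. The input values are arbitrary examples and not part of the original notebook. One detail worth knowing before showing this to students: in Python 3, `round` breaks exact ties toward the nearest even integer.

```python
# Illustrative calls to the built-ins listed in the table; inputs are arbitrary.
absolute = abs(-7.5)     # distance from zero
largest = max(3, 9, 4)   # largest of its arguments
smallest = min(3, 9, 4)  # smallest of its arguments
power = pow(2, 10)       # 2 raised to the 10th power
rounded = round(3.7)     # nearest integer to 3.7
tie = round(2.5)         # an exact tie goes to the nearest even integer

print(absolute, largest, smallest, power, rounded, tie)
```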


You can denote important information in **bold**, *italics*, or <span style="color: #BC412B">**color**</span>.

You can add [links](https://www.espn.com/college-football/game/_/gameId/401404041).

You can insert images:

<img src="images/data8.png">

## Loading Data into a Table

The `datascience` module offers several ways to create tables. One is to build a table manually with `with_columns`:

In [4]:
t = Table().with_columns(
    'Player',  ['Curry', 'James', 'Jokic', 'Butler'],
    'Points',  [31, 25, 29, 19],
    'Assists', [6, 9, 12, 10],
)
t

Player,Points,Assists
Curry,31,6
James,25,9
Jokic,29,12
Butler,19,10
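
In the cell above, `with_columns` receives alternating column labels and value lists, so the data is supplied column by column while the displayed output reads row by row. The sketch below uses plain Python lists instead of the `datascience` library (an assumption made so that it runs anywhere) to show how those columns line up into rows. The names `labels`, `columns`, and `rows` are illustrative only.

```python
# Plain-Python stand-in for the table above; not the datascience API.
labels = ['Player', 'Points', 'Assists']
columns = [
    ['Curry', 'James', 'Jokic', 'Butler'],
    [31, 25, 29, 19],
    [6, 9, 12, 10],
]

# Every column must have the same length, or the rows cannot line up.
assert len({len(values) for values in columns}) == 1

# zip(*columns) takes the i-th entry of each column to form row i.
rows = list(zip(*columns))
for row in rows:
    print(dict(zip(labels, row)))
```

A mismatch in column lengths is a frequent source of confusing errors when tables are typed in by hand, which is why the length check comes first.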


More commonly, we will want to load data from an existing CSV file.

In [5]:
power_plants = Table.read_table('data/California_Power_Plants.csv')
power_plants.show(5)

X,Y,OBJECTID,CECPlantID,PlantName,Retired_Plant,OperatorCompanyID,County,Capacity_Latest,Units,PriEnergySource,StartDate
-119.568,36.1372,1865,S0335,Corcoran 2 Solar LLC CED,0,CED California Holdings LLC,Kings,19.8,1,SUN,2015/06/10 00:00:00+00
-119.58,36.1443,1866,S0520,Corcoran 3 Solar,0,CED California Holdings LLC,Kings,20.0,Unit 1,SUN,2016/02/11 00:00:00+00
-119.648,36.2696,1867,C0007,Hanford - Retired October 2011,1,Hanford LP,Kings,24.0,GEN 1,PC,1990/09/01 00:00:00+00
-119.647,36.2703,1868,G0832,Hanford Energy Park Peaker,0,"MRP San Joaquin Energy, LLC",Kings,92.0,"1, 2",NG,2001/09/01 00:00:00+00
-119.128,36.2663,1869,S0608,Exeter Solar,0,Tulare PV I LLC,Tulare,3.5,ES,SUN,2014/02/12 00:00:00+00
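
Some values in this file contain commas themselves, such as the operator name `MRP San Joaquin Energy, LLC`, which is why they appear in double quotes. As a rough illustration of how a CSV parser keeps such fields intact, the sketch below uses Python's standard `csv` module on two rows copied from the output above, with only four of the columns kept for brevity. It is a stand-in for illustration, not a description of how `read_table` is implemented.

```python
import csv
import io

# Header plus two rows taken from the output above (subset of columns).
sample = '''PlantName,OperatorCompanyID,County,Units
Hanford Energy Park Peaker,"MRP San Joaquin Energy, LLC",Kings,"1, 2"
Exeter Solar,Tulare PV I LLC,Tulare,ES
'''

# DictReader uses the first line as column labels and honors quoted fields.
rows = list(csv.DictReader(io.StringIO(sample)))

operator = rows[0]['OperatorCompanyID']  # the embedded comma stays inside one field
units = rows[0]['Units']                 # read as text, not as a number

print(operator)
print(units)
```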


You can also load data from a URL.

In [6]:
sat = Table.read_table('https://www.inferentialthinking.com/data/sat2014.csv')
sat.show(5)

State,Participation Rate,Critical Reading,Math,Writing,Combined
North Dakota,2.3,612,620,584,1816
Illinois,4.6,599,616,587,1802
Iowa,3.1,605,611,578,1794
South Dakota,2.9,604,609,579,1792
Minnesota,5.9,598,610,578,1786


## Writing Questions

TODO

## Common Student Errors

In [11]:
sat = Table.read_table('https://www.inferentialthinking.com/data/sat2014.csv')
sat.show(5)

State,Participation Rate,Critical Reading,Math,Writing,Combined
North Dakota,2.3,612,620,584,1816
Illinois,4.6,599,616,587,1802
Iowa,3.1,605,611,578,1794
South Dakota,2.9,604,609,579,1792
Minnesota,5.9,598,610,578,1786


One common student error is destructively modifying data and then rerunning earlier cells. For example, try running the cell below twice. The first run works fine, but the second raises an error: `sat` has been overwritten with a table that no longer has a `State` column, so `sat.column('State')` cannot find it.

In [12]:
states_arr = sat.column('State')
sat = sat.drop('State')
sat.show(5)

Participation Rate,Critical Reading,Math,Writing,Combined
2.3,612,620,584,1816
4.6,599,616,587,1802
3.1,605,611,578,1794
2.9,604,609,579,1792
5.9,598,610,578,1786
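
One way to avoid this error is to give the result a new name instead of overwriting `sat`, so the cell can be rerun safely. The sketch below illustrates the idea with a plain dict standing in for the table (an assumption made so it runs without the `datascience` library). `load_sat`, `split_off_state`, and `sat_no_state` are hypothetical names, and the dict raises `KeyError` where a real table would raise its own error.

```python
def load_sat():
    """Stand-in for reloading the SAT table (hypothetical helper, first three rows only)."""
    return {
        'State': ['North Dakota', 'Illinois', 'Iowa'],
        'Combined': [1816, 1802, 1794],
    }

def split_off_state(table):
    """Return the 'State' values and a copy of the table without that column."""
    states = table['State']  # raises KeyError if 'State' is already gone
    rest = {label: values for label, values in table.items() if label != 'State'}
    return states, rest

# Destructive pattern: `sat` is overwritten, so running the same step again fails.
sat = load_sat()
states_arr, sat = split_off_state(sat)
try:
    states_arr, sat = split_off_state(sat)
    second_run_failed = False
except KeyError:
    second_run_failed = True

# Rerun-safe pattern: the result gets a new name and `sat` is left intact.
sat = load_sat()
for _ in range(2):
    states_arr, sat_no_state = split_off_state(sat)

print(second_run_failed, list(sat), list(sat_no_state))
```

In notebook terms, the equivalent habit is to write something like `sat_no_state = sat.drop('State')` rather than reassigning `sat`.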


## Keyboard Shortcuts

Jupyter provides keyboard shortcuts that speed up authoring, navigating, and running notebooks. Some useful ones are listed below; a more comprehensive guide is available [here](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330).

The following shortcuts work at any time: 
- `shift` + `enter` : Run selected cell and move to the cell directly below
- `ctrl` / `cmd` + `enter` : Run selected cell and stay on the same cell (useful if you need to run the same cell multiple times) 

The following shortcuts work only in command mode (i.e., when you are not editing the contents of a cell):
- `enter` : Edit the selected cell
- `A` : Insert a new code cell above selected cell  
- `B` : Insert a new code cell below selected cell
- `D, D` : Delete selected cell 
- `Y` : Convert selected cell to Code cell 
- `M` : Convert selected cell to Markdown cell