# <center> Getting Started Guide for Jupyter notebooks @ UW </center>

### Index
_Note: you can also view the structure of a document using the "Table of Contents" view on the left navigation pane._
- [UW JupyterHub for Teaching components](#components)
- [Working with documents](#working-with-ipynb)
- [Working with cells](#working-with-cells)
  - [Working with code cells](#working-with-cells-code)
  - [Working with Markdown cells](#working-with-cells-markdown)
- [Example development workflow](#development-workflow)
- [Tips and tricks](#tips-and-tricks)
- [Best practices](#best-practices)

## What is Jupyter?

The Jupyter Notebook is a powerful open source tool designed to support interactive documents (the "notebook") with text, visualizations, live code, code outputs, and many other elements displayed inline. The JupyterHub for Teaching service at the UW makes a hosted Jupyter Notebook environment available for UW courses, and requires only a browser to use.

<a id='components'></a>
## UW JupyterHub for Teaching components
- **JupyterHub**: software which manages a group of Notebooks, and provides UW NetID login and shared features, as well as management features for instructors.
- **Jupyter Notebook environment**: software provided by [Project Jupyter](https://jupyter.org/) which can be run on your own computer or on a hosted server (such as the one you're in now). In a hosted environment, you have access not only to the features provided by the Jupyter Notebook software, but also some of the capabilities of the underlying Linux operating system. The environment has installed software and packages selected by the course instructor and is fully configured. This environment is only running while you are using it, and will shut down after a period of inactivity. Depending on your environment, additional software may be available, such as RStudio. You can interact with the underlying operating system by going to "File > New > Terminal" in the menu bar.
- **JupyterLab**: this is the software responsible for the user interface part of the Notebook. You can change the theme by going to "Settings > Theme" in the menu bar. Depending on the environment you are using, other interfaces may be available. Check out [the documentation](https://jupyterlab.readthedocs.io/en/stable/user/interface.html#:~:text=The%20JupyterLab%20interface%20consists%20of,inspector%2C%20and%20the%20tabs%20list.) for detailed information about the JupyterLab UI.
- **Jupyter notebook document**: a structured file ending with the '.ipynb' extension which contains marked up elements that the Jupyter software can interpret and display. There can be many documents on your server, and they can be uploaded, downloaded, shared, and exported. Start a new Notebook by going to "File > New > Notebook" in the menu bar. Upload a Notebook by dragging and dropping an .ipynb file from your computer into the files pane on the left side, or by clicking on the upload icon at the top of the files pane.
- **Jupyter kernel**: the live code portions of a document will be run in a programming language such as Python or R, and the kernel is the part of the environment that actually runs that code. Your environment may have more than one kernel installed, and which language is being used is determined when you create a new notebook or by clicking on the kernel name in the top right corner of this pane. Additional kernel features can be accessed by clicking "Kernel" on the menu bar.
- **Home directory**: this hosted environment is designed to always start from the same state, so changes made to the operating system or installed software will be reverted the next time you start up a session. The sole exception to this is your home directory, which is the root of the file system as displayed within the Jupyter environment. Files saved to this location will persist for the academic quarter, plus a retention period after the quarter is over.


<a id='working-with-ipynb'></a>

## Working in a document
- This notebook is using the Python kernel, so you will be entering Python syntax in code cells. If you look in the top right corner of the notebook, you will see which kernel is in use. Your environment can contain multiple kernels if you are working with more than one language, but a single document can only use one kernel at a time.
- Press the `ENTER` key or click on a cell (double click for rendered Markdown cells) to go into Edit mode, make any changes, then `SHIFT + ENTER` to run or render the cell. If you want to return to Command mode without running the cell, use the `ESC` key.
- While in Command mode, `a` inserts a new cell above your current selected cell and `b` inserts one below.
- To change the type of cell, use the dropdown in the menu bar of this pane, or while in Command mode, select the cell and press `m` for a markdown cell or `y` for a code cell.
- While in Command mode, pressing `d` twice will delete the currently selected cell.
- While in Command mode, `SHIFT + ENTER` will run or render the _current_ cell.
- To save your work and create a checkpoint, use the save icon at the top left corner of this pane, use "File > Save Notebook" in the menu bar, or use the save shortcut appropriate for your OS (i.e., `CTRL + S` for Windows and Linux, `⌘ + S` for a Mac). Note: notebooks have autosave functionality. If you wish to revert to the last time you manually saved, go to "File > Revert Notebook to Checkpoint" in the menu bar.

<a id='working-with-cells'></a>

## Working in cells
- Expressions in a code cell are not evaluated until you run the cell using `SHIFT + ENTER`, or "Run > Run Selected Cells" from the menu bar.
- Output from a cell is saved in the document. This is useful if you have a complex expression or visualization that you don't want to re-run every time, but it also increases the size of your document. You can clear output by right-clicking a cell, or going to "Edit > Clear Output" in the menu bar. "Edit > Clear All Outputs" will clear the outputs for the entire document. Note: saving a document saves only the rendered output, not the variables in the kernel. Conversely, clearing the outputs removes only the rendered output and does not affect any variable assignments that were performed by the cell(s).
- See comments in the cells below for additional information.
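
As a rough illustration of what clearing outputs changes in the saved file, the sketch below strips outputs from a notebook that has been loaded as a Python dictionary. It assumes the common nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function name `clear_outputs` and the hand-written `sample` notebook are hypothetical and exist only for this example.

```python
import copy
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop the rendered output
            cell["execution_count"] = None  # reset the execution counter
    return cleaned


# A tiny hand-written notebook used only for demonstration.
sample = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "source": ["print(42)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
        },
    ],
}

cleaned = clear_outputs(sample)
print(len(json.dumps(sample)), "->", len(json.dumps(cleaned)), "characters")
```

The cell source is left untouched, and nothing here interacts with the kernel, which matches the point that clearing outputs does not change any variables.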

<a id='working-with-cells-code'></a>

### Working with code cells

In [None]:
# Variables exist in a global namespace for the current document and kernel. Restarting the kernel will reset everything. 
# SHIFT + ENTER to run this cell and view the output.
a = 42

print(a)

In [None]:
# the variable you defined above is available to any other cells
print(a)

# Because variables are global, reassigning them will clobber the previous value which may lead to unexpected results 
# if you had code that depended on the variable.
a = 'foo' 

print(a)

In [None]:
# This will produce an error if the cell above has been run, since it expects that 'a' can be converted to an integer
print(f'var a is the Answer: {int(a) == 42}') 
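
One way to make a check like the one above less fragile is to pass the value into a function explicitly and handle the failed conversion there, instead of relying on whatever a global variable currently holds. This is only a sketch; the name `answer_check` is made up for illustration.

```python
def answer_check(value) -> bool:
    """Compare a value with 42 without reading any global variable."""
    try:
        return int(value) == 42
    except (TypeError, ValueError):
        # Values such as 'foo' or None cannot be converted to an integer.
        return False


print(answer_check(42))
print(answer_check("foo"))
```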

In [None]:
# this will produce an error unless you run the cell below first
print(b)

In [None]:
b = 'cells run sequentially in order if you select "Run > Run All Cells" from the menu, otherwise they only run \
if you explicitly run them with SHIFT + ENTER'

In [None]:
# Any installed packages can be imported and used in an expression
import datetime

# The signed 32-bit Unix time counter overflows on 19 January 2038 (UTC)
twenty_thirtyeight_bug = datetime.date(2038, 1, 19)
now = datetime.datetime.now().date()

print(f"There are {(twenty_thirtyeight_bug - now).days} days left to make sure you're on Linux kernel \
>= 5.10 and using the 'big timestamps' feature!")
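
For reference, the moment of the 2038 rollover can be derived instead of typed in by hand: a signed 32-bit time counter holds at most `2**31 - 1` seconds past the Unix epoch. A small sketch using only the standard library (the variable names are illustrative):

```python
import datetime

# Largest value a signed 32-bit integer can hold.
max_int32 = 2**31 - 1

# Interpret it as seconds since 1970-01-01 00:00:00 UTC.
rollover = datetime.datetime.fromtimestamp(max_int32, tz=datetime.timezone.utc)

print(rollover.isoformat())  # 2038-01-19T03:14:07+00:00
```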


In [None]:
# You can install your own packages, but be aware they will need to be installed each time you start a new Notebook session
!pip install emoji

In [None]:
from emoji import emojize
print(emojize(":thumbs_up:"))
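
Because self-installed packages disappear between sessions, it can help to check whether a package is present before importing it. The sketch below uses `importlib.util.find_spec` from the standard library; the helper name `is_installed` is hypothetical, and it is meant for top-level package names only.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


for name in ("datetime", "emoji"):
    status = "available" if is_installed(name) else "missing (install it with pip)"
    print(f"{name}: {status}")
```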

In [None]:
# Note the ! operator will generically run any command you can run at the command line, e.g.:
!whoami

<a id='working-with-cells-markdown'></a>
### Working with Markdown cells

- You can double click on a cell with rendered markdown to view the source, or select the cell and press `ENTER`. `SHIFT + ENTER` will re-render the cell.
- [Markdown basics](https://www.markdownguide.org/basic-syntax)

#### Lists
- a
- b 

#### Equations
- Jupyter uses [MathJax](https://www.mathjax.org/) to render equations between a set of `$$` symbols. See some [examples of MathJax notation](https://jojozhuang.github.io/tutorial/mathjax-cheat-sheet-for-mathematical-notation/).

$$
\frac{\partial (\epsilon c)}{\partial t} = \frac{\partial}{\partial x}\left( \epsilon D_{eff}\frac{\partial c}{\partial x} \right) + a (1-t_+^0) j_n
$$

#### Tables

| Name | | Column |
|:---:|:---:|:---:|
||This is a fun table||
|To Do| Fill in the rest| |

#### Inline HTML
<h4>HTML can be placed in a markdown cell</h4>
<center> <img src="https://ipython.org/_images/ipy_0.13.png" width="50%"/> </center>

In [None]:
%%html
<!-- # Use a cell magic in a code cell to apply a style to tables to align them to the left -->
<style>
table { align:left;display:block }
</style>

- Note: Inline styles are not currently supported due to a security issue

<a id='development-workflow'></a>

### Jupyter Notebooks allow you to explore problems and break your workflow up into discrete parts

#### Data intensive step

In [None]:
# Import the pandas package
import pandas as pd

# Read in the data from our file
# Remember to check the dataset size - very large datasets may take a long time to process and consume storage space in your home directory
file_location = 'https://raw.githubusercontent.com/mdmurbach/ECS-Hack-Day-2017/master/time-data(833).txt?raw=true'
data = pd.read_csv(file_location)

# assign names to each of the columns for easy reference
data.columns = ['time(s)', 'current(A)', 'potential(V)', 'frequency(Hz)', 'amplitude(A)']

# print out the first 5 rows of our data
data.head()

#### Followed by a visualization or exploratory data analysis

In [None]:
# Import the matplotlib package
import matplotlib.pyplot as plt

# Plot the current and voltage vs time
plt.plot(data['time(s)'], data['current(A)'])
plt.plot(data['time(s)'], data['potential(V)'])

plt.show()

In [None]:
mean_current = data["current(A)"].mean()
mean_potential = data["potential(V)"].mean()

print('Mean current = {0} A;  Mean potential = {1} V'.format(mean_current, mean_potential))

<a id='tips-and-tricks'></a>

### Tips and tricks

#### Keyboard shortcuts
- Command mode vs. Edit mode: Toggle with `ESC` and `ENTER` keys

#### Debugging
- Click the "debugger" icon in the top right of this pane (right next to the kernel picker) to access debugging information about the current notebook.

#### Useful command mode shortcuts
|Command|Action|
|:-----:|:----:|
|a|Create new cell above|
|b|Create new cell below|
|d d|Delete current cell|
|z| Undo delete cell|
|m|Change cell to markdown|
|y|Change cell to code|
|h| Bring up the list of shortcuts|


#### Useful editing shortcuts

|Command|Action|
|:-----:|:----:|
|Ctrl-a|Select all|
|Ctrl-c|Copy|
|Ctrl-v|Paste|
|Ctrl-s|Save|
|Tab| Autocomplete |
|Shift-tab| Tooltips|

### %magics!

Commands that add additional (usually more advanced) features to the notebook

|Magic|Action|
|:-----:|:----:|
| %lsmagic | List all magics |
| {magic}? | Get help on a magic <br>(or check the [official documentation](https://ipython.readthedocs.io/en/stable/interactive/magics.html))|
| ! | Execute a shell command |
| %who / %whos | See list of variables in current kernel |
| %time / %timeit | Time a Python expression <br>(%timeit gives control over number of executions, etc.)|

Many more built-in magic functions: http://ipython.readthedocs.io/en/stable/interactive/magics.html

#### Cell magics (%%) vs. line magics (%)

Some magics have versions that apply to the entire cell when %%magic is placed on the first line of the cell (e.g., %%timeit).

In [None]:
# The timeit magic can be used to check how long an expression takes to run (over several runs, so be cautious about large expressions).
%timeit ",".join(str(n) for n in range(100))
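
Outside of a notebook, where magics are not available, a similar measurement can be made with the standard library `timeit` module. This is a minimal sketch, not a reproduction of what the magic does internally; the run count of 1000 is an arbitrary choice.

```python
import timeit

# Run the statement 1,000 times and collect the total elapsed seconds.
total_seconds = timeit.timeit('",".join(str(n) for n in range(100))', number=1000)

# Convert the total into an average per run, in microseconds.
print(f"{total_seconds / 1000 * 1e6:.2f} microseconds per run (average of 1000 runs)")
```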

<a id='best-practices'></a>

### Best practices
- The amount of available space in your home directory is limited (typically to 5GB), and filling it up can cause your notebook sessions to fail to start. You can use the command `df -h ~` to view the amount you've used. Be careful of saving large data sets and running expressions that produce significant amounts of output data. If you're near the limits of storage, you can reduce the size of your notebook by going to "Edit > Clear All Outputs" from the menu bar, then saving the file. If you run out of room and your notebook fails to start, send a help request to [help@uw.edu](mailto:help@uw.edu) with "JupyterHub for Teaching" in the subject line.
- Only save files and data in your home directory - any other location on the file system will be reset when your current session ends.
- Close the browser tab when you're done. This allows the system to shut down idle sessions and conserve resources.
- If you want to share a notebook, there are a few options. You can export it to PDF or HTML ("File > Save and Export Notebook As...") and download the result to your computer; the exported file will not be editable or contain runnable code. Alternatively, you can download the .ipynb file itself (right-click on the notebook and select "Download"), which someone else can upload to their own environment and interact with, assuming they have the same packages installed. The option labelled "Copy Shareable Link" will only work if your notebook session is actively running and the user you share it with has admin rights in the JupyterHub for your course.

In [None]:
# Check usage of your home directory
!df -h ~
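
If you prefer to stay in Python, `shutil.disk_usage` reports similar numbers. A minimal sketch: like `df`, it describes the whole file system that contains your home directory, so the totals may not match a per-user quota exactly.

```python
import shutil
from pathlib import Path

# Query the file system that holds the home directory.
usage = shutil.disk_usage(Path.home())

gib = 1024**3  # bytes per GiB
print(f"total: {usage.total / gib:.2f} GiB")
print(f"used:  {usage.used / gib:.2f} GiB")
print(f"free:  {usage.free / gib:.2f} GiB")
```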

In [None]:
# Search for large files in your home directory
!find ~ -type f -size +50M -exec du -sh {} \;
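
The same kind of search can be sketched in Python with `pathlib`. The helper `find_large_files` is hypothetical, and the demonstration runs against a throwaway directory with a small threshold so that it finishes quickly and leaves your files alone.

```python
import tempfile
from pathlib import Path


def find_large_files(root: Path, min_bytes: int = 50 * 1024**2):
    """Yield (path, size_in_bytes) for regular files larger than min_bytes."""
    for path in root.rglob("*"):
        try:
            if path.is_file() and not path.is_symlink():
                size = path.stat().st_size
                if size > min_bytes:
                    yield path, size
        except OSError:
            # Skip files that vanish or cannot be read while scanning.
            continue


# Demonstrate on a temporary directory with a 1 KiB threshold.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "small.txt").write_bytes(b"x" * 10)
    (root / "big.bin").write_bytes(b"x" * 2048)
    found = [(p.name, size) for p, size in find_large_files(root, min_bytes=1024)]

print(found)  # [('big.bin', 2048)]
```

Calling `find_large_files(Path.home())` with the default threshold would roughly correspond to the `find` command in the cell above.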