# Jupyter Notebooks

This is a quick introduction to Jupyter notebooks.

<div class="alert alert-success">
Jupyter notebooks are a way to combine executable code, code outputs, and text into one connected file.
</div>

<div class="alert alert-info">
The official documentation from Project Jupyter is available 
<a href="https://jupyter-notebook.readthedocs.io/en/stable/" class="alert-link">here</a>,
and they also have some example notebooks 
<a href="https://github.com/jupyter/notebook/tree/master/docs/source/examples/Notebook" class="alert-link">here</a>.
</div>

## Menu Options & Shortcuts

To get a quick tour of the Jupyter user-interface, click on the 'Help' menu, then click 'User Interface Tour'.

There are also a large number of useful keyboard shortcuts. Click on the 'Help' menu, and then 'Keyboard Shortcuts' to see a list. 

## Cells

<div class="alert alert-success">
The main organizational unit of a notebook is the 'cell'.
</div>

Cells can be Markdown (text) cells, like this one, or code cells (we'll get to those).

### Markdown cells

Markdown cells are useful for communicating information about our notebooks.

They support basic text formatting, including italics, bold, headings, links, and images.

Double-click on any of the cells in this section to see what the plain-text looks like. Run the cell to then see what the formatted Markdown text looks like.

# This is a heading

## This is a smaller heading

### This is a really small heading

We can italicize text either like *this* or like _this_.

We can embolden text either like **this** or like __this__.

Here is an unordered list of items:
* This is an item
* This is an item
* This is an item

Here is an ordered list of items:
1. This is my first item
2. This is my second item
3. This is my third item

We can have a list of lists by using indentation:
* This is an item
* This is an item
	* This is an item
	* This is an item
* This is an item

We can also combine ordered and unordered lists:
1. This is my first item
2. This is my second item
	* This is an item
	* This is an item
3. This is my third item

We can make a link to this [useful markdown cheatsheet](https://www.markdownguide.org/cheat-sheet/) as such.

If we don't use the markdown syntax for links, it will just show the link itself as the link text: https://www.markdownguide.org/cheat-sheet/

### LaTeX-formatted text

$$ P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)} $$

### Code Cells

In [17]:
# In a code cell, comments can be typed
a = 1
b = 2

In [18]:
# Cells can also have output, that gets printed out below the cell.
print(a + b)

3


In [19]:
my_string = 'hello world'

In [20]:
print(my_string)

hello world


In [21]:
# Tab completion also works on variables: type 'my_string.' and press Tab to list methods such as upper()
my_string.upper()

'HELLO WORLD'

In [22]:
my_list = ['a','b','c']

In [23]:
print(my_list)

['a', 'b', 'c']
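The quotes around `'HELLO WORLD'` above, which are absent from the `print` outputs, come from how a notebook shows results: the value of the last expression in a cell is displayed using its representation, while `print` writes the plain text. A minimal sketch of that difference in plain Python, reusing `my_string` from the cells above (the names `printed_form` and `echoed_form` are only for illustration):

```python
my_string = 'hello world'

# What print() writes: the plain text of the string
printed_form = str(my_string.upper())

# What a notebook echoes for a bare last expression: the representation,
# which for strings includes the surrounding quotes
echoed_form = repr(my_string.upper())

print(printed_form)
print(echoed_form)
```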


## Accessing Documentation

<div class="alert alert-success">
Jupyter has useful shortcuts. Add a single '?' after a function or class to get a window with the documentation, or a double '??' to pull up the source code. 
</div>

In [24]:
# Import numpy for examples
import numpy as np

In [25]:
# Check the docs for a numpy array
np.array?

In [26]:
# Check the full source code for numpy append function
np.append??

In [27]:
# get information about variables you've created
my_string?
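The '?' and '??' syntax only works inside IPython/Jupyter. As a rough plain-Python counterpart, the standard library `inspect` module can fetch a docstring or source code. This is a sketch of the idea, not how Jupyter implements the feature, and `str.upper` and `textwrap.dedent` are just convenient examples:

```python
import inspect
import textwrap

# Roughly what 'str.upper?' shows: the docstring
doc = inspect.getdoc(str.upper)
print(doc)

# Roughly what 'textwrap.dedent??' shows: the source code.
# getsource only works for objects that are written in Python.
source = inspect.getsource(textwrap.dedent)
print(source[:200])
```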

## Autocomplete

<div class="alert alert-success">
Jupyter also has 
<a href="https://en.wikipedia.org/wiki/Command-line_completion" class="alert-link">tab completion</a>
capabilities, which can autocomplete what you are typing and help you explore what code is available.
</div>

In [None]:
# Move your cursor just after the period and press Tab; a drop-down menu will appear showing all possible completions
np.

In [None]:
# Autocomplete does not have to be at a period. Move to the end of 'ra' and hit tab to see completion options. 
ra

In [None]:
# If there is only one option, tab-complete will auto-complete what you are typing
ran
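The idea behind these completions can be sketched in plain Python by filtering known names by prefix. The `complete` helper below is hypothetical and far simpler than Jupyter's real completer; it only looks at built-in names and keywords:

```python
import builtins
import keyword

def complete(prefix):
    """Return the built-in names and keywords that start with prefix."""
    candidates = set(dir(builtins)) | set(keyword.kwlist)
    return sorted(name for name in candidates if name.startswith(prefix))

print(complete("ra"))   # more than one candidate, so a menu is needed
print(complete("ran"))  # a single candidate, so it can be filled in directly

# The same idea after a period: list the attributes of an object
print([name for name in dir(str) if name.startswith("up")])
```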

## Kernel & Namespace

You do not need to run cells in order! This is useful for flexibly testing and developing code. 

The numbers in the square brackets to the left of a cell show which cells have been run, and in what order.

However, it can also be easy to lose track of what has already been declared / imported, leading to unexpected behaviour from running cells.

The kernel is what connects the notebook to your computer behind-the-scenes to execute the code. 

It can be useful to restart the kernel, which clears everything that has been declared or imported. You can do this from the 'Kernel' drop-down menu at the top, optionally also clearing all outputs.
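How execution order affects results can be sketched in plain Python. In this illustration, each string stands in for a cell and one dictionary plays the role of the kernel's namespace. The three cells and the `run_cells` helper are hypothetical:

```python
# Each string stands in for one notebook cell (hypothetical cells)
cells = {
    1: "total = 10",
    2: "total = total + 5",
    3: "result = total * 2",
}

def run_cells(order):
    """Run cells in the given order against one fresh, shared namespace."""
    namespace = {}  # a fresh dict is like a freshly restarted kernel
    for number in order:
        exec(cells[number], namespace)
    return namespace["result"]

in_order = run_cells([1, 2, 3])         # each cell run once, top to bottom
cell_2_twice = run_cells([1, 2, 2, 3])  # cell 2 accidentally run twice
print(in_order, cell_2_twice)

# Skipping cell 1 after a restart fails, because 'total' was never defined
try:
    run_cells([2, 3])
except NameError as error:
    skipped_error = str(error)
    print(skipped_error)
```

The same three cells give different results depending on how they were run, which is why restarting the kernel and running everything from the top is a good check.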

## Magic Commands

<div class="alert alert-success">
'Magic Commands' are a special (command-line like) syntax in IPython/Jupyter to run special functionality. They can run on lines and/or entire cells. 
</div>

<div class="alert alert-info">
The IPython <a href="http://ipython.readthedocs.io/en/stable/interactive/magics.html" class="alert-link">documentation</a> has more information on magic commands.
</div>

Magic commands are designed to succinctly solve various common problems in standard data analysis. Magic commands come in two flavors: line magics, which are denoted by a single % prefix and operate on a single line of input, and cell magics, which are denoted by a double %% prefix and operate on multiple lines of input.

In [29]:
# access quick reference sheet
%quickref

In [30]:
# You can check a list of available magic commands
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%python  %%python

In [31]:
# see current working directory
%pwd

'/Users/shannonellis/Desktop/Teaching/COGS108/Tutorials'

In [32]:
# all variables
%who

a	 b	 my_list	 my_string	 np	 


In [42]:
# all variables; more info
%whos

Variable    Type      Data/Info
-------------------------------
a           int       1
b           int       2
my_list     list      n=3
my_string   str       hello world
np          module    <module 'numpy' from '/an<...>kages/numpy/__init__.py'>
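A listing like the one above can be approximated in plain Python by walking a namespace dictionary. This is only a sketch: the `summarize` helper is hypothetical, and the `namespace` dictionary is a stand-in for the variables defined earlier in this notebook:

```python
def summarize(namespace):
    """Return (name, type name) pairs for the public names in a namespace."""
    return [
        (name, type(value).__name__)
        for name, value in sorted(namespace.items())
        if not name.startswith("_")
    ]

# A stand-in namespace holding the variables defined earlier in this notebook
namespace = {"a": 1, "b": 2, "my_list": ["a", "b", "c"], "my_string": "hello world"}

rows = summarize(namespace)
for name, type_name in rows:
    print(f"{name:<12}{type_name}")
```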


In [41]:
# history
%hist


### Line Magics


Line magics use a single '%', and apply to a single line. 

In [35]:
# For example, we can time how long it takes to create a large list
%timeit list(range(100000))

1.6 ms ± 26.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
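Outside a notebook, a similar measurement can be made with the standard library `timeit` module. The sketch below mirrors the "mean ± std. dev. of 7 runs" summary; the loop count of 20 is an arbitrary choice to keep it quick, and the timings will differ from machine to machine:

```python
import statistics
import timeit

loops = 20  # arbitrary; fewer loops than %timeit chose above
runs = timeit.repeat("list(range(100000))", repeat=7, number=loops)

# Convert each run's total time into seconds per loop
per_loop = [total / loops for total in runs]
mean_ms = statistics.mean(per_loop) * 1e3
stdev_ms = statistics.stdev(per_loop) * 1e3
print(f"{mean_ms:.2f} ms ± {stdev_ms:.2f} ms per loop (7 runs, {loops} loops each)")
```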


### Cell Magics

Cell magics use a double '%%', and apply to the whole cell. 

In [36]:
%%timeit
# For example, we could time a whole cell
a = list(range(100000))
b = [n + 1 for n in a]

6.59 ms ± 97.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


### Running terminal commands

Another nice thing about notebooks is being able to run terminal commands.

In [37]:
# You can run a terminal command by adding '!' to the start of the line
!pwd

# Note that in this case, '!pwd' is equivalent to line magic '%pwd'. 
# The '!' syntax is more general though, allowing you to run anything you want through command-line 

/Users/shannonellis/Desktop/Teaching/COGS108/Tutorials


In [38]:
%%bash
# Equivalently, (for bash) use the %%bash cell magic to run a cell as bash (command-line)
pwd

/Users/shannonellis/Desktop/Teaching/COGS108/Tutorials


In [39]:
# list files in directory
!ls

00-Introduction.ipynb              13-OrdinaryLeastSquares.ipynb
01-JupyterNotebooks.ipynb          14-LinearModels.ipynb
02-DataAnalysis.ipynb              15-Clustering.ipynb
03-Python.ipynb                    16-DimensionalityReduction.ipynb
04-DataSciencePython.ipynb         17-Classification.ipynb
05-DataGathering.ipynb             18-NaturalLanguageProcessing.ipynb
06-DataWrangling.ipynb             A1-PythonPackages.ipynb
07-DataCleaning.ipynb              A2-Git.ipynb
08-DataPrivacy&Anonymization.ipynb LICENSE
09-DataVisualization.ipynb         README.md
10-Distributions.ipynb             files
11-TestingDistributions.ipynb      img


In [40]:
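# change directory; note that '!cd' runs in a temporary subshell, so the change does not persist.
# Use the '%cd' line magic to change the notebook's working directory.
!cd .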
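A rough analogy for what '!' does is launching a separate shell process from Python. The sketch below uses the standard library `subprocess` module as a stand-in (Jupyter's own mechanism may differ). It also shows that a directory change made in the child shell does not affect the parent process:

```python
import os
import subprocess

# Run a command through the system shell and capture what it prints
result = subprocess.run(
    "echo hello from the shell", shell=True, capture_output=True, text=True
)
print(result.stdout.strip())

# A directory change made inside the child shell does not affect this process
before = os.getcwd()
subprocess.run("cd ..", shell=True)
after = os.getcwd()
print(before == after)
```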

<div class="alert alert-info">
For more useful information, check out Jupyter Notebooks 
<a href="https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/" class="alert-link">tips & tricks</a>,
and more information on how 
<a href="http://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html" class="alert-link">notebooks work</a>.
</div>