<DIV ALIGN=CENTER>

# Introduction to the IPython Notebook
## Professor Robert J. Brunner
  
</DIV>  
-----
-----

## Introduction to the IPython Notebook

The [IPython notebook][i] is a browser-based scientific notebook that is
extremely useful for learning and doing data science tasks. While
[IPython][1] initially supported only the [Python][2] programming
language, the IPython notebook was expanded to support additional
programming languages, including [R][3], [Julia][4], and [Haskell][5].
In addition, you can directly embed scripts, including [Bash shell][6]
scripts, in a cell. This technology has rapidly gained tremendous
popularity and continues to be developed. Going forward, the project has
been reorganized as [Project Jupyter][7] to highlight the programming
language agnostic view that the notebook concept has embraced. In
addition, the IPython team develops and maintains a set of
[Docker images][8] to simplify the adoption of IPython notebooks.

In this notebook, we will explore the IPython Notebook, specifically
focusing on

- IPython magics
- writing markdown formatted cells
- including math formulae
- writing and executing Python code
- visualizing plots
- writing and executing Unix commands
- writing and executing Bash shell scripts

Our coverage of most of these topics will be brief, given the time
limitations of this course. For some topics, links will be provided for
more detailed reference; a good example is IPython's [Rich Output][9]
capabilities. However, we will explore some topics, such as writing and
executing code, in more detail through the rest of this course. 

The IPython team has provided a nice [introduction to working][10] in an
IPython Notebook, including how to use the menu commands, toolbar, and
keyboard shortcuts. Of these, the most important points are to use the
mouse to select a cell by single-clicking, and to enter a cell for
editing by double-clicking. To have the IPython kernel process a cell,
you can either enter control-return, which processes and remains in the
current cell, or shift-return, which processes the cell and advances to
the next cell. Finally, another important keyboard trick is to place a
question mark, `?`, at the end of an IPython magic or Python keyword to
bring up an IPython message window that provides online details for the
magic or keyword.

-----
[i]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/What%20is%20the%20IPython%20Notebook.ipynb
[1]: http://www.ipython.org
[2]: http://www.python.org
[3]: http://rpy.sourceforge.net/rpy2/doc-2.4/html/interactive.html#module-rpy2.ipython.rmagic
[4]: https://github.com/JuliaLang/IJulia.jl
[5]: https://github.com/gibiansky/IHaskell
[6]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Script%20Magics.ipynb
[7]: http://jupyter.org
[8]: https://registry.hub.docker.com/repos/ipython/
[9]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/IPython%20Kernel/Rich%20Output.ipynb
[10]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Notebook%20Basics.ipynb

## IPython Magics

IPython has [specific commands][1], known as magics, that you can
execute within a code cell to provide enhanced functionality to the
current IPython notebook. Magics are not part of the Python programming
language, but can often make programming easier, especially within the
notebook, and some magics can be used to improve your data processing
work flow. Magics come in two types:

1. line magics, and
2. cell magics.

To see the list of currently available magics, execute the following cell.

In [6]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %install_default_config  %install_ext  %install_profiles  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%latex  %%

-----

### line magics

A line magic is prefixed by a single `%` character, and has all of its
arguments specified on the same line. As a caveat to this statement,
if the `%automagic` setting is `on`, the preceding `%` character is not
required. Some useful line magics include:

- `%lsmagic`, which lists all currently defined line and cell magics for the current notebook,
- `%matplotlib`, which allows inline plotting to be enabled (and is preferred over the old `%pylab` magic),
- `%run`, which will run the named file as a program in the current cell,
- `%autosave`, which sets the default autosave frequency in seconds, and
- `%timeit`, which in line mode times the execution of a single line of code.
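
Outside a notebook, the standard-library `timeit` module offers roughly
the same measurement that `%timeit` automates. The following is a
minimal sketch, not what IPython does internally; the statement being
timed and the repeat counts are arbitrary choices made for illustration.

```python
import timeit

# Statement to time; chosen only for illustration.
stmt = "sum(range(1000))"

# Run the statement 1000 times per trial, for 5 trials.
# timeit.repeat returns the total time (in seconds) of each trial.
trials = timeit.repeat(stmt, number=1000, repeat=5)

# The fastest trial is the one least disturbed by other activity.
best_per_loop = min(trials) / 1000
print("best of 5: %.2e seconds per loop" % best_per_loop)
```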

-----


### cell magics

A cell magic is prefixed by two `%` characters, and its arguments can
include both the rest of the current line and the remaining lines in
the current cell. Thus, a cell magic must be placed on the first line of
a cell, and in general you can only have one cell magic per cell. Some
useful cell magics include:

- `%%timeit`, which can be used to time a multi-line Python statement,
- `%%writefile`, which writes the contents of the cell into the named file,
- `%%script`, which can be used to create and run a script in a subprocess including Python, Bash, or R, and
- `%%bash`, which lets you run a Bash shell script and optionally capture the STDOUT and STDERR streams into variables.
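
To make the behavior of two magics concrete, the sketch below imitates
`%%writefile` followed by the `%run` line magic using only the standard
library. It is an approximation under stated assumptions, not IPython's
implementation: the file name `hello_script.py` and its contents are
hypothetical, and `runpy.run_path` merely stands in for `%run`.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script body; in a notebook this would be the text
# that follows a %%writefile line in the cell.
source = 'message = "Hello World!"\nprint(message)\n'

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "hello_script.py"  # hypothetical file name

    # Roughly what %%writefile does: save the cell text to a file.
    script.write_text(source)

    # Roughly what %run does: execute the file as a program and
    # hand back the names it defined.
    namespace = runpy.run_path(str(script), run_name="__main__")

print(namespace["message"])
```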

If you are uncertain how to use a particular magic, you can always
obtain help from the IPython kernel by entering the magic by itself in a
cell, adding a `?` character at the end, and executing the cell to bring
up the IPython help window, as shown below for the `%writefile` magic.

![IPython magic help](images/ipython-help.png)
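
Much of what the help window shows for a Python object can also be
reached from plain Python through the standard `inspect` module. The
sketch below uses a hypothetical function, `hypotenuse`, defined only
for this illustration; it is a rough analogue of the notebook feature,
not the code IPython runs.

```python
import inspect
import math


def hypotenuse(a, b):
    """Return the length of the hypotenuse of a right triangle."""
    return math.sqrt(a**2 + b**2)


# The call signature and the docstring are the core of what a
# help window displays for a function.
signature = inspect.signature(hypotenuse)
doc = inspect.getdoc(hypotenuse)

print("hypotenuse%s" % signature)
print(doc)
```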

-----

## Header Cells

While you can use Markdown, described next, to create formatted header
lines, the recommended technique is to explicitly make a header cell.
You can create a cell to hold the header text, for example "Introduction
to the IPython Notebook" and change the cell type to the appropriate
header level by using the cell toolbar, as shown below.

![Cell Toolbar](images/cell-toolbar.png)

-----

## Raw NBConvert Cells

The IPython Notebook can be converted into a number of different output
formats. The `nbconvert` tool is used to convert from the default JSON
native format of an `ipynb` file into the desired output format. The
contents of any Raw NBConvert cell are left unmodified during output.
This allows for post-processing of the generated file, such as with
LaTeX or reStructuredText. These cells are beyond the scope of this
course.
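
Because an `ipynb` file is plain JSON, its cells can be inspected with
nothing more than the standard `json` module. The sketch below builds a
tiny notebook document by hand and counts its cells by type. The
contents are made up for illustration, and the layout assumes the
version 4 notebook format, in which cells are listed under `cells` and
each one carries a `cell_type`.

```python
import json
from collections import Counter

# A hand-written, minimal notebook document (illustrative contents).
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["## Markdown Cells"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print(\"Hello World!\")"]},
        {"cell_type": "raw", "metadata": {},
         "source": ["\\newpage"]},
    ],
})

# Parse the JSON text, as one would after reading an .ipynb file.
notebook = json.loads(notebook_text)

# Tally the cells by type; "raw" cells are the Raw NBConvert cells.
counts = Counter(cell["cell_type"] for cell in notebook["cells"])
print(dict(counts))
```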

-----

## Markdown Cells

[Markdown][1] is a plain text formatting syntax that you can easily use
to write text that can be converted to formatted text, for example,
HTML. Markdown was developed by John Gruber, who runs the popular
[Daring Fireball][df] blog. Markdown has found many uses, two of which
are relevant for this course:

1. GitHub documentation pages, and
2. IPython Notebook documentation cells.

Markdown is [free software][2] that is available under a BSD-style open
source license. The IPython organization has an [IPython Notebook][3]
that demonstrates how to use Markdown within a cell to produce formatted
text, and of course the documentation for this course consists of GitHub
Markdown files and IPython Notebooks that are documented by using
Markdown cells.

In the rest of this section, we will briefly review some of the more
useful Markdown formats.

### Header Text

You can mark different sections of text by using different header levels
(you can also use the IPython Notebook _Header_ cells). Markdown
provides support for six header levels, which are generally marked by
one or more hash `#` characters, where the number of hash characters
indicates the header level. For example, to make a second level header
text called _Markdown Cells_, you would use the following:
```
## Markdown Cells
```

### Paragraphs

Normally, you simply write text as you would for any document.
Paragraphs are one or more lines of text that are separated by blank
lines. If you need to insert a line break between two lines of text
(sometimes useful when writing out lists), you simply add two or more
blank spaces at the end of the line to break.

### Emphasis

You can easily indicate text that should be _italicized_ by enclosing
the appropriate text in either single asterisk `*` characters or single
underscore `_` characters. Likewise, for __bold__ text, you enclose the
text in two asterisk `**` or underscore `__` characters. My personal
preference is to use underscore characters; for example, the following
Markdown text will first create __bold text__ followed by _italics text_:
```
first create __bold text__ followed by _italics text_
```

### Lists

You can easily create two types of lists in Markdown: unordered and
ordered, both of which can be nested. To make a simple unordered list,
we simply prefix the list entries (that is a line) by a single asterisk
`*`, dash `-`, or plus `+` character. To create a nested list, you
indent the nested list by one or more spaces. For consistency, I prefer
to use the dash `-` character to indicate unordered lists. For example,
the following Markdown:
```
- Item 1
- Item 2
 - Item 2.1
 - Item 2.2
- Item 3
```
will produce the following list:
- Item 1
- Item 2
 - Item 2.1
 - Item 2.2
- Item 3

Ordered lists are similar, but are prefixed by numbers and a period, for
example 1. for the first item. While the Markdown numbers do not have to
start at one nor be sequential, the resulting text will be formatted to
begin at one and proceed sequentially. For example, the following
Markdown:
```
2. Item 1
4. Item 2  
1. Item 3
```
will produce the following list:
2. Item 1
4. Item 2  
1. Item 3

### Representing code elements

One of the most useful features of Markdown is the ability to include
formatted code directly in the text. Code elements can be included in
two ways: inline and block. Inline code elements are simply wrapped in
single backtick \` characters (also called tick marks). For example,
the Markdown \`print("Hello World!")\` renders as `print("Hello
World!")`. Code blocks can also be placed on a single line with no
backtick characters by indenting the line four characters:

    print("Hello World!")

For longer code blocks, you can enclose the block in three backtick
characters \`\`\`. For example, the following code block in Markdown:  
\`\`\`  
x = 3  
y = 4  
z = 5  
  
print("%s" % (x\*\*2 + y\*\*2 == z\*\*2))  
\`\`\`

will produce the following code block:  

```
x = 3
y = 4
z = 5

print("%s" % (x**2 + y**2 == z**2))
```

You can also use what is known as GitHub Flavored Markdown to indicate
the target programming language by adding the language name after the
initial three backtick characters. For example, to indicate Python, you
would use \`\`\`python, while to indicate JavaScript, you would use
\`\`\`javascript.

### Quoting text

You also can write quoted text by prefixing the line with a greater-than
`>` character. You can also write multi-line block quoted text by
prefixing every line with a greater-than `>` character. For example, the
following Markdown:  
  
\> Here is a long line of text  
\> that we wrapped over multiple lines by inserting a Markdown line break  
\> that is of course multiple space characters inserted at the end  
\> of a line.

will produce the following formatted text:

> Here is a long line of text  
> that we wrapped over multiple lines by inserting a Markdown line break  
> that is of course multiple space characters inserted at the end  
> of a line.

For a more detailed discussion, see the official [Markdown][1]
documentation and the IPython [Markdown demonstration notebook][3].

-----

[df]: http://daringfireball.net
[1]: http://daringfireball.net/projects/markdown/
[2]: http://daringfireball.net/projects/markdown/license
[3]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Working%20With%20Markdown%20Cells.ipynb

## Math Formulae

We can include detailed math formulae in a Markdown cell by using
[LaTeX][1], a general purpose text formatting language that is commonly
used in academia for scientific articles. LaTeX can be difficult to
master, but is fairly simple for writing simple math formulae. The LaTeX
formulae are translated for display in a web browser by the [MathJax][2]
JavaScript display engine.

To indicate a LaTeX formula, the simplest approach is to enclose the
relevant LaTeX between dollar sign characters, `$`. Many specific
functions or math symbols are prefixed with a backslash character,
`\`. For example, to write the LaTeX formula for the lowercase Greek
character theta, you would write `$\theta$`. The [support for
mathematical expressions][3] in LaTeX is quite extensive, and there are
tools, such as [LaTeXit][4] or browser add-ons, that can help build and
test LaTeX expressions.

For example, the LaTeX expression 

`$\int_0^{\pi} \sin(\theta)\ d\theta = 2$` 

is translated into 

$\int_0^{\pi} \sin(\theta)\ d\theta = 2$ 

in an IPython Markdown cell by MathJax.
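
For reference, the value of this integral follows directly from the
antiderivative of the sine function. The short derivation below is
added here as a worked example and is written with the same
dollar-sign syntax:

$\int_0^{\pi} \sin(\theta)\ d\theta = \left[-\cos(\theta)\right]_0^{\pi} = -\cos(\pi) + \cos(0) = 2$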

LaTeX can also be used in code blocks to provide cleaner or more
descriptive plot labels (for example, theta versus $\theta$).

-----
[1]: http://latex-project.org
[2]: http://www.mathjax.org
[3]: https://en.wikibooks.org/wiki/LaTeX/Mathematics
[4]: http://pierre.chachatelier.fr/latexit/latexit-home.php?lang=en

## Unix Commands

[Previously](1_unixdp.ipynb), we discussed how to perform different
data science tasks at the Unix command line. We can actually execute
nearly all of these commands from within the IPython Notebook, by using a
_Code Cell_ and preceding the Unix command by an exclamation point. For
example, to display the current working directory, we would enter `!pwd`
and subsequently execute this code cell.

This capability is actually very useful; for example, we can put a
`wget` command at the start of an IPython Notebook to retrieve a data
set that will be used in the rest of the notebook. This makes the
notebook self-contained and easy to share or distribute to run in a
Docker container on another machine. This sequence can be seen in the
following figure:

![IPython notebook running Unix commands](images/ipynb-unix.png)

If you try this out in your IPython Notebook, you will also see the
non-blocking capability of an IPython Code Cell, since the Unix command
runs in the background allowing you to continue working within the
Notebook.
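
A Unix command can also be launched from ordinary Python code with the
standard `subprocess` module, which is handy when the output needs to
be captured in a variable. This is a sketch of a related technique, not
a description of how the `!` syntax is implemented, and it assumes a
Unix-like host where the `pwd` program is available.

```python
import os
import subprocess

# Run the Unix `pwd` command and capture what it prints.
result = subprocess.run(["pwd"], capture_output=True, text=True, check=True)

# Strip the trailing newline that the command adds.
working_dir = result.stdout.strip()
print(working_dir)

# Python can report the same directory without a subprocess.
print(os.getcwd())
```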

-----

## Writing and Executing Code

Of course, the reason we are using IPython Notebooks is that they allow
for in place development and execution of Python code. There are a
number of direct benefits you accrue by developing and executing code in
an IPython Notebook:

1. Run code in place with the output displayed in the notebook,  
2. Display visualizations inline,  
3. Run code in the background, while you edit or run code in other cells,  
4. Clean restarts of the IPython kernel, and  
5. Built-in support for parallelization.

The simplest of these capabilities to demonstrate is developing and
running code in the Notebook. The code can be a single line or multiple
lines. Code cells can be executed by using one of two key combinations:
CTRL-return, which executes the cell in place, or SHIFT-return, which
executes the code and advances to the next cell. For example, as shown
below, we have a single line of Python code that can be executed with
the output directly shown.

In [2]:
print("Hello World!")

Hello World!


-----
### Python Programs

An IPython Notebook cell can take a full Python program, including
importing Python libraries, which are in scope for the remainder of the
Notebook. For example, we can compute the integral shown earlier by
importing a constant and a function from the `numpy` library and an
integration function from the `scipy` library:

$\int_0^{\pi} \sin(\theta)\ d\theta = 2$ 

In [2]:
import numpy as np
from scipy.integrate import quad

print("The Integral = %3.1f" % quad(np.sin, 0., np.pi)[0])

The Integral = 2.0
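
As an independent cross-check that needs neither NumPy nor SciPy, the
same integral can be approximated with a simple midpoint rule. This is
a sketch for illustration, not the method `quad` uses; the helper name
`midpoint_integral` and the choice of 1000 subintervals are assumptions
made here.

```python
import math


def midpoint_integral(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    # Evaluate f at the center of each subinterval and sum the areas.
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


approx = midpoint_integral(math.sin, 0.0, math.pi)
print("The Integral = %3.1f" % approx)
```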


-----

### Inline Figures

When making data visualizations, the resulting plots or images can be
displayed inline. The recommended way to accomplish this is to use the
`%matplotlib` line magic, which informs the IPython kernel to display
the image inline. This magic can take either the `inline` or the
`notebook` value; the `notebook` value will be preferred in this class.
Note that you may see suggestions to use the `%pylab` line magic, but
this approach is no longer recommended since it pollutes the global
namespace by importing several Python libraries.

In [3]:
%matplotlib notebook

theta = np.arange(0., np.pi, 0.01)
y = np.sin(theta)

import matplotlib.pyplot as plt

plt.plot(theta, y)
plt.xlabel(r"$\theta$")
plt.ylabel(r"$\sin$($\theta$)")
plt.title("My Awesome Title")

<IPython.core.display.Javascript object>

<matplotlib.text.Text at 0x7effe9ac5780>

-----

### Running code in the background

One of the features of the IPython kernel that novices often fail to
appreciate is the ability for code, by default, to run in the
background. This is useful, as we will see throughout this course, both
when developing and when executing code. For example, the following
code block slowly prints out a series of numbers, by default the
integers from 0 to 19. When we execute the cell, the integers slowly
print out while we are free to edit or run code in other cells.

In [5]:
# First we handle our imports.
import sys
from time import sleep

# Parameters that we can change
s = 20
t = 2

# Now loop, printing out a new number before sleeping
for i in range(s):
    sys.stdout.write("%3d," % i)
    sys.stdout.flush()
    sleep(t)
    
# The code continues to run in the background

  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,

-----

### Clean kernel restarts

On some occasions, our code might cause the Python interpreter to crash.
While this normally might be a serious concern, the IPython kernel can
detect this condition and initiate a clean restart. In this Notebook, we
won't intentionally do this; however, the IPython development team does
provide a [notebook][1] capable of demonstrating this on a Linux or Mac
host computer.

-----
[1]: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Running%20Code.ipynb

### Additional References

1. [IPython videos](http://ipython.org/videos.html).
2. [IPython documentation](http://ipython.org/ipython-doc/stable/index.html).

-----

### Return to the [Course Index](index.ipynb).

-----