<p style="float:right">
<img src="images/cu.png" style="display:inline" />
<img src="images/cires.png" style="display:inline" />
<img src="images/nasa.png" style="display:inline" />
</p>

# Python, Jupyter & pandas tutorial: Module 1

## Introduction and background

### Python

- First released in 1991 and actively developed since then
- Extremely successful and widely adopted
- But, as an interpreted language, relatively slow vs e.g. C or Fortran
- Also, Python's built-in `list` type can be awkward in numerical contexts, since `*` repeats the list rather than scaling its elements:

In [1]:
v1 = [1.0, 2.0, 3.0, 4.0]
v1 * 3

[1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
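
The multiplication above repeats the list instead of scaling its elements. Using only the standard library, elementwise scaling needs an explicit loop or comprehension. A minimal sketch (the names `repeated` and `scaled` are illustrative only):

```python
v1 = [1.0, 2.0, 3.0, 4.0]

# The * operator on a list repeats it; it does not multiply the elements
repeated = v1 * 3

# Elementwise scaling requires visiting each element explicitly
scaled = [x * 3 for x in v1]

print(len(repeated))  # 12
print(scaled)         # [3.0, 6.0, 9.0, 12.0]
```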

### NumPy

- Python's applicability to problems in science was bolstered by NumPy
    - Released in 2006
    - Built on the capabilities of Numeric, which appeared in 1995
    - Technically part of the larger SciPy ecosystem, but can be installed independently

In [2]:
import numpy as np

- NumPy provides support for large, multidimensional arrays / matrices with natural semantics...

In [3]:
v2 = np.array(v1)
v2 * 3

array([  3.,   6.,   9.,  12.])

- ... and functions useful for working with such objects:

In [4]:
v2.reshape(2,2)

array([[ 1.,  2.],
       [ 3.,  4.]])
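
As a small extension of the `reshape` call above, the sketch below shows a few other common array operations. The variable name `m` is illustrative, and the array is rebuilt here so the block runs on its own:

```python
import numpy as np

v2 = np.array([1.0, 2.0, 3.0, 4.0])

# -1 asks NumPy to infer that dimension from the array's size
m = v2.reshape(2, -1)

print(m.shape)        # (2, 2)
print(m.T)            # transpose: rows become columns
print(m.sum(axis=0))  # column sums: [4. 6.]
print(m.sum(axis=1))  # row sums: [3. 7.]
```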

- The `array()` function is just a wrapper around the ubiquitous (but more complicated) `ndarray` (_n-dimensional array_) type, which lies at the heart of NumPy:

In [5]:
type(v2)

numpy.ndarray

- `ndarray` has [many useful methods](http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.ndarray.html), and NumPy as a whole provides a [wide range of numerical routines](https://docs.scipy.org/doc/numpy/reference/routines.html) in areas such as linear algebra, logic, trigonometry, and (in releases before 1.20) finance, all built to work on NumPy data structures.
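
To give a flavor of these routines, here is a brief sketch that touches linear algebra, trigonometry, and comparison. The system of equations and the variable names are made up for illustration:

```python
import numpy as np

# Linear algebra: solve the system A @ x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Trigonometry: functions apply elementwise to whole arrays
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)

# Comparison: check the solution to within floating-point tolerance
solution_ok = np.allclose(A @ x, b)

print(x)
print(sines)
print(solution_ok)
```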

- In addition to functionality, NumPy provides better performance by
    - Storing `ndarray` data in a type-homogeneous, densely packed block of memory, vs `list`'s dynamic array of references to separate objects
    - Implementing underlying routines in C or Fortran, with convenient Python wrappers
    - Reusing well-tuned libraries like BLAS for linear algebra
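
The homogeneous, densely packed layout described above can be inspected directly through a few `ndarray` attributes. A minimal sketch, with illustrative variable names:

```python
import numpy as np

mixed = [1, 2.5, "three"]   # a list may hold objects of any type
v = np.array([1.0, 2.0, 3.0, 4.0])

print(v.dtype)                  # float64: one type shared by every element
print(v.itemsize)               # 8 bytes per element
print(v.nbytes)                 # 32 bytes in a single buffer
print(v.flags['C_CONTIGUOUS'])  # True: elements are stored back to back

# Mixing int and float in one array upcasts to a common type
upcast = np.array([1, 2.5])
print(upcast.dtype)             # float64
```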

### pandas

- pandas builds on NumPy and adds higher-level data-manipulation capabilities.

In [8]:
import pandas as pd

- Just as `ndarray` is NumPy's essential data structure, `DataFrame` is pandas' essential data structure.
- The pandas documentation describes it as a _"Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns)."_
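
A small sketch may make that description concrete. The station data below is entirely hypothetical, as are the column names and row labels. It shows columns of different types, labeled rows and columns, and a column added after construction:

```python
import pandas as pd

# Hypothetical sample data: three weather stations
df = pd.DataFrame(
    {
        "station": ["A", "B", "C"],
        "elevation_m": [1655, 2540, 3018],
        "temp_c": [12.5, 7.0, 3.2],
    },
    index=["s1", "s2", "s3"],   # row labels
)

print(df.dtypes)               # heterogeneous: each column has its own type
print(df.loc["s2", "temp_c"])  # label-based lookup: 7.0

# Size-mutable: add a derived column in place
df["temp_f"] = df["temp_c"] * 9 / 5 + 32
print(df.shape)                # (3, 4)
```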

### Jupyter

- Jupyter is the evolution of the IPython project
    - IPython provides an interactive Python shell with facilities for code and text editing, data visualization, and compute parallelism
    - Similar to Mathematica notebooks
    - IPython Notebook uses a web browser interface
- Jupyter extends IPython to support "over 40" programming languages -- not just Python
- Browser environment
    - Home tab
        - _Files_
        - _Running_
        - _Clusters_
    - Notebook tab
        - Cells
            - When to create new cells
            - Shift + Return to evaluate a cell
            - _Cell_ menu: Run all, all above, etc.
            - Clearing output: _Cell_ > _Current Outputs_ > _Clear_ vs _Cell_ > _All Output_ > _Clear_
            - Restarting kernel
            - Cut/copy/paste/move cells
            - Checkpoints
            - Keyboard shortcuts
        - Text formatting with Markdown
            - You're looking at it!
        - LaTeX support
            - Use in Markdown cells, e.g.:
$$e^x=\sum_{i=0}^\infty \frac{1}{i!}x^i$$
        - Magics

In [40]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %install_default_config  %install_ext  %install_profiles  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%latex  %%

In [47]:
%ls

images/         module-0.ipynb  module-1.ipynb  module-2.ipynb  module-3.ipynb  module-4.ipynb  module-5.ipynb  module-6.ipynb


In [41]:
import time
%timeit time.sleep(1)

1 loops, best of 3: 1 s per loop
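
Magics such as `%timeit` only work inside IPython/Jupyter. In a plain Python script, the standard-library `timeit` module gives a rough equivalent. The sketch below is an approximation of what the magic reports, not its exact implementation, and it times a cheap statement instead of `time.sleep(1)` so that it finishes quickly:

```python
import timeit

# repeat=3 mirrors the "best of 3" in the %timeit output;
# number=1000 is the count of loops per timing run
runs = timeit.repeat("sum(range(1000))", repeat=3, number=1000)

# Seconds per loop for the fastest of the three runs
best = min(runs) / 1000
print(f"1000 loops, best of 3: {best * 1e6:.1f} µs per loop")
```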


In [45]:
%%time
time.sleep(1)
time.sleep(2)

CPU times: user 376 µs, sys: 569 µs, total: 945 µs
Wall time: 3.01 s


In [42]:
%%ruby
3.times { puts 'hello' }

hello
hello
hello


In [43]:
%%script bash
whoami

pmadden


In [44]:
%%latex
$n^2 + 1$

<IPython.core.display.Latex object>