# A Numpy Primer

>  *[NumPy] is everywhere. It is all around us. Even now, in this very room. You can see it
> when you look out your window or when you turn on your television. You can feel it when 
> you go to work … when you go to church … when you pay your taxes.*

     - Morpheus, The Matrix

[Source](https://www.safaribooksonline.com/library/view/elegant-scipy/9781491922927/ch01.html)

## What is Numpy

Numpy is the core package for array computation in Python.
In this notebook we will review a few basic concepts on how to use Numpy arrays.

## Importing Numpy

Python has only a small number of [builtins](https://docs.python.org/3/library/functions.html). All the other functions are organized in packages
that need to be imported. Here we import numpy:

In [None]:
import numpy as np

In [None]:
np.__version__  # most packages have a "version string"

All the functions provided by numpy are now accessible with the prefix `np.`.

<div class="alert alert-info">
<b>Running a cell:</b><br>
You can run a cell with <code>SHIFT+ENTER</code>. See menu <i>Help -> User Interface Tour</i> for more info.
</div>

<div class="alert alert-info">
<b>Autocompletion:</b><br>
Use the <code>TAB</code> key to auto-complete commands. Pressing <code>TAB</code> twice shows the list of alternatives.
Autocompletion is a great help in avoiding spelling errors!
</div>

### About namespaces

The `np` prefix is called a *[namespace](https://www.google.com/search?q=python+namespace)* and helps avoid confusion when
different packages have a function with the same name. A classic example
is the Python builtin `max()` and numpy's `max()`.
We call the latter by typing `np.max()`, so the "namespace" resolves the
ambiguity.
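
As a small illustration of how a namespace keeps same-named functions apart, the sketch below uses `sqrt`, which exists both in the standard-library `math` module and in Numpy. The choice of `sqrt` and the variable names are assumptions made only for this example.

```python
import math
import numpy as np

values = [1.0, 4.0, 9.0]

# math.sqrt works on a single number
single_root = math.sqrt(values[1])

# np.sqrt works element-wise on a whole sequence and returns an array
all_roots = np.sqrt(values)

print(single_root)  # 2.0
print(all_roots)    # [1. 2. 3.]
```

The prefix tells the reader (and Python) which of the two functions is meant.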

<div class="alert alert-warning">
<b>Trivia:</b><br>
Can you find out the difference between the builtin <code>max()</code> and <code>np.max()</code>?
</div>

## Numpy array creation

Manually entering an array:

In [None]:
np.array([[5, 2, 3],
          [7, 8, 1]])  # NOTE: line splitting here is only for aesthetics

Array zeros:

In [None]:
np.zeros((3,2))  # NOTE the second set of ()

Array of random values:

In [None]:
np.random.random((3,2))  # uniform distribution
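
Every array created this way records its shape and element type. The following sketch inspects the `shape` and `dtype` attributes of arrays like the ones above; the variable names are chosen only for illustration.

```python
import numpy as np

manual = np.array([[5, 2, 3],
                   [7, 8, 1]])
zeros = np.zeros((3, 2))
randoms = np.random.random((3, 2))

print(manual.shape)   # (2, 3): 2 rows, 3 columns
print(zeros.shape)    # (3, 2): the tuple passed to np.zeros
print(zeros.dtype)    # float64

# np.random.random draws from the half-open interval [0, 1)
print(randoms.min() >= 0.0 and randoms.max() < 1.0)  # True
```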

**Exercise:** Guess how to create a 3x4 array of ones:

**Exercise:** Create an array of 10 numbers starting from 0 to 9 using the function [`np.arange`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html):

See also: [Array creation cheatsheet](#Array-creation)

## Numpy Indexing

- Index starts at 0, i.e. 0 is the first element
- Index can be negative: -1 is the last element, -2 is the second last, etc...

### Scalars

In [None]:
x = np.arange(10)
x

In [None]:
x[0], x[2], x[-1], x[-2]
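
One way to think about negative indices: index `-k` refers to the same element as index `len(x) - k`. A minimal sketch checking this (the helper variable `n` is introduced here only for illustration):

```python
import numpy as np

x = np.arange(10)
n = len(x)

# A negative index counts from the end of the array
print(x[-1] == x[n - 1])  # True
print(x[-2] == x[n - 2])  # True
```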

<div class="alert alert-info">
<b>NOTE for MATLAB users:</b><br>
Python uses <code>[ ]</code> when indexing and <code>( )</code>
when calling a function. MATLAB uses <code>( )</code> for both.
</div>

### Slicing

- Slice one dimension: **[start : stop : step]**

- You can omit *start*, *stop* or *step* and this will happen:
    1. omitting **start**: slices from the beginning
    1. omitting **stop**: slices to the end
    1. omitting **step**: use step=1
    
Before running the next cells try to "predict" the output of the following commands:

In [None]:
x[2:10]

In [None]:
x[2:10:3]

In [None]:
x[::]

In [None]:
x[:]

In [None]:
x[::2]

In [None]:
x[2::2]

In [None]:
x[::-1]
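
To double-check the rules for omitted values, the sketch below spells out the defaults explicitly. It uses a hypothetical array of letters, `letters`, so that indices and values cannot be confused.

```python
import numpy as np

letters = np.array(list("abcdefgh"))
n = len(letters)

# With a positive step, omitted start/stop/step fall back to 0, n and 1
print(np.array_equal(letters[:], letters[0:n:1]))    # True
print(np.array_equal(letters[::3], letters[0:n:3]))  # True
print(letters[::3])                                  # ['a' 'd' 'g']

# With a negative step the slice starts from the last element instead
print(letters[::-1][0])  # h
```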

**Exercise:** Discard the first and last elements in `x`, then take
every second one of the remaining elements (output should be `array([1, 3, 5, 7])`):

<div class="alert alert-info">
<b>NOTE</b><br>
Unlike in MATLAB, in Python indexing can be chained. For example
<code>x[3:-1][::2]</code> is equivalent to <code>x[3:-1:2]</code>.
</div>
<br> 
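
The equivalence stated in the note can be checked directly. This is a small sketch; the names `chained` and `single` are only illustrative.

```python
import numpy as np

x = np.arange(10)

chained = x[3:-1][::2]  # first drop elements, then take every second one
single = x[3:-1:2]      # the same selection written as one slice

print(chained)                          # [3 5 7]
print(np.array_equal(chained, single))  # True
```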

**Exercise:** Discard the first two elements and the last elements in x, then invert the order. Try to get the result with two slices (`x[ ][ ]`) or with one slice (`x[ ]`):

### Boolean mask

Get all elements in `x` larger than 5:

In [None]:
x[x > 5]

What is the object `x > 5`?

In [None]:
x > 5
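
The mask is itself an array, with the same shape as `x` and one boolean per element. A short sketch inspecting it (the name `mask` is an assumption for this example):

```python
import numpy as np

x = np.arange(10)
mask = x > 5

print(mask.dtype)              # bool
print(mask.shape == x.shape)   # True
print(np.count_nonzero(mask))  # 4 elements of x are larger than 5
```

Indexing with `x[mask]` keeps exactly the elements where the mask is `True`.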

**Exercise:** Create an array `y` of 10 random numbers in [0..1], then select
all the elements between 0.2 and 0.7:

Boolean masks can be negated with `~`, combined with `*` (**AND**) or `+` (**OR**) or compared with `==`.

For example:

In [None]:
(~(x > 5))*(~(x < 7)) == ~((x > 5)+(x < 7))

The previous expression always returns all True, for any `x`.
This is one of [De Morgan's laws](https://en.wikipedia.org/wiki/De_Morgan%27s_laws)
in boolean logic.

In [None]:
all((~(x > 5))*(~(x < 7)) == ~((x > 5)+(x < 7)))
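
Besides `*` and `+`, Numpy boolean arrays can also be combined with the bitwise operators `&` (**AND**) and `|` (**OR**), which many people find easier to read. The sketch below, with illustrative names `a` and `b`, checks that both spellings agree and repeats the De Morgan test with them.

```python
import numpy as np

x = np.arange(10)
a = x > 5
b = x < 7

# On boolean arrays, * behaves like AND and + behaves like OR
print(np.array_equal(a * b, a & b))  # True
print(np.array_equal(a + b, a | b))  # True

# De Morgan: NOT (a OR b) equals (NOT a) AND (NOT b)
print(np.all(~(a | b) == (~a & ~b)))  # True
```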

### 2D Arrays

Numpy arrays can have multiple dimensions. Here is a 2D array (it will be
indexed by row, column):

In [None]:
A = np.arange(20).reshape(5, 4)
A

#### Indexing rules

- Index: **[ rows, cols ]**
- **rows** and **cols** can be scalars, slices or arrays
- Trailing dimension (`cols`) can be omitted

> **MEMO:** Even for rows and columns, the index starts at 0!

In [None]:
A[1,2]

In [None]:
A[:2, :3]

In [None]:
A[0, :]

In [None]:
A[0]

In [None]:
A[0,0]

In [None]:
A[A>5]

In [None]:
A[::-1]
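
The cells above use scalars, slices and a boolean mask. The indexing rules also allow a list (or array) of indices for rows or columns; a minimal sketch with the same `A`, using illustrative variable names:

```python
import numpy as np

A = np.arange(20).reshape(5, 4)

rows_0_and_2 = A[[0, 2]]   # list of row indices, columns omitted
column_1 = A[:, 1]         # all rows, a single column
block = A[1:3, [0, 3]]     # slice for rows, list for columns

print(rows_0_and_2)
print(column_1)
print(block)
```

Note that picking a single column with a scalar returns a 1D array, while a list of indices keeps that dimension.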

<div class="alert alert-success">
<b>COMPLETION</b><br>
If you mastered all the code above you are now a <b>powerful apprentice</b>!
<br>
<br>
You are ready for the workshop. If you want to challenge yourself, you'll find one more exercise below!
</div>

> Questions? Ask them on the slack channel!


## Numpy Cheatsheets

### Array creation

In the following cheatsheet the `np.` prefix is omitted. Does the following make sense?

![Numpy%20tutorial%20-%20NRougier%20-%20array%20creation.png](attachment:Numpy%20tutorial%20-%20NRougier%20-%20array%20creation.png)
> Source: [Numpy Tutorial](https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html#id6) by *Nicolas P. Rougier*.

### Slicing cheatsheets 1

At this point, you should be able to understand this:

![Python-for-Data-Analysis_by_Wes_McKinney_mod.png](attachment:Python-for-Data-Analysis_by_Wes_McKinney_mod.png)
> Source: **Python for Data Analysis** by *Wes McKinney*, [Ch4. NumPy Basics: Arrays and Vectorized Computation](https://www.safaribooksonline.com/library/view/python-for-data/9781449323592/ch04.html).

### Slicing cheatsheets 2

A little bit more of "slicing" fun:

![Numpy%20tutorial%20-%20NRougier%20-%20array%20slicing.png](attachment:Numpy%20tutorial%20-%20NRougier%20-%20array%20slicing.png)
> Source: [Numpy Tutorial](https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html#id6) by *Nicolas P. Rougier*.

### Advanced Indexing

**Exercise:** Given the 2D array `a` in the figure, can you index `a` 
to obtain the 3 selections highlighted by different colors?

![numpy_fancy_indexing_nosolution.png](attachment:numpy_fancy_indexing_nosolution.png)
> Source: **Scipy Lecture Notes** by *Emmanuelle Gouillart, Didrik Pinte, Gaël Varoquaux, and Pauli Virtanen*. [Chapter: The Numpy Arrays Object](http://www.scipy-lectures.org/intro/numpy/numpy.html#the-numpy-array-object)

In [None]:
# Here I create `a` for you
a = np.arange(0, 51, 10)[:,np.newaxis] + np.arange(6)  # broadcasting trick
a
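
The "broadcasting trick" used to build `a` can be unpacked step by step. In this sketch the intermediate names `tens` and `units` are hypothetical, introduced only to show the shapes involved.

```python
import numpy as np

tens = np.arange(0, 51, 10)[:, np.newaxis]  # 0, 10, ..., 50 as a column
units = np.arange(6)                        # 0, 1, ..., 5 as a 1D row

print(tens.shape)   # (6, 1)
print(units.shape)  # (6,)

# Adding a (6, 1) column and a (6,) row broadcasts both to (6, 6)
a = tens + units
print(a.shape)      # (6, 6)
print(a[2, 3])      # 23, i.e. 20 from the row plus 3 from the column
```

Each element of `a` therefore encodes its own position: the tens digit is the row index and the units digit is the column index.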

# References

## Basic and Intermediate

- [Numpy Tutorial](https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html#quick-references) by *Nicolas P. Rougier*

> Get hooked with Numpy by simulating the [game of life](https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life),
>
> solve some [one-line numpy trivia](https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html#exercises),
>
> or, skip to the [Quick Reference](https://www.labri.fr/perso/nrougier/teaching/numpy/numpy.html#quick-references) section for great graphical examples 
> of Numpy's indexing.

- [NumPy: creating and manipulating numerical data](http://www.scipy-lectures.org/intro/numpy/numpy.html#the-numpy-array-object) by *Emmanuelle Gouillart, Didrik Pinte, Gaël Varoquaux, and Pauli Virtanen*

> Chapter about Numpy from the famous **Scipy Lecture Notes** book.

## Advanced

This is more advanced material not covered in the workshop:

- [Elegant Scipy - Ch 1](https://www.safaribooksonline.com/library/view/elegant-scipy/9781491922927/ch01.html) by *Harriet Dashnow, Stéfan van der Walt, Juan Nunez-Iglesias*

> This free chapter of the *Elegant Scipy* book shows the power and elegance of Numpy
> by analyzing gene-expression data.


- https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing

> The official reference of advanced indexing

- https://jakevdp.github.io/PythonDataScienceHandbook/02.07-fancy-indexing.html

> Good explanation of fancy indexing using 2D arrays as examples.
 
 
- https://github.com/stefanv/teaching/blob/master/2010_scipy_numpy_kittens_dragons/kittens_dragons_scipy2010.pdf

> Array broadcasting explained with figures, plus the classical "Jack's problem" from the mailing list

- https://stackoverflow.com/questions/11942747/numpy-multi-dimensional-array-indexing-swaps-axis-order

> Why are axes reordered when fancy indexing is mixed with basic indexing?

## Hic Sunt Leones

- [Einstein Summation in Numpy](https://obilaniu6266h16.wordpress.com/2016/02/04/einstein-summation-in-numpy/)