## 1. Python basics: Numpy

NumPy is an essential package for scientific computing. Most notably, it provides a powerful N-dimensional array object, similar to the matrices in MATLAB.

IMPORTANT: the NumPy documentation is quite good, and the notebook interface helps you reach it. Use auto-completion with Tab, and press Shift+Tab to display the full documentation of the current function (for instance, when the cursor is between the parentheses of the function call).

## Getting started

First, you need to make sure that you have numpy installed. To do this, open `Anaconda Prompt` and type `conda list` to list all installed packages. You can also run shell commands from a notebook cell by starting the line with an exclamation mark, like this:

In [None]:
!conda list

If you can't find numpy in the list, install it by running `conda install numpy` <br> and restart the kernel.
If the following line raises an error, verify that you've installed numpy and restart the kernel.

In [None]:
import numpy as np

## Creation of arrays

Arrays of a given shape (`np.zeros`, `np.ones`) are created by passing the shape as an iterable (a `list` or `tuple`, as shown in the first tutorial). A single integer is also accepted for a one-dimensional array.

`np.eye` creates an identity matrix.

In [None]:
np.zeros(4)

In [None]:
np.eye(3)

In [None]:
np.array([[1,3,4],[2,5,6]])

In [None]:
np.arange(10)  # NB : np.array(range(10)) is a slightly more complicated equivalent

In [None]:
np.random.randn(3, 4) # normally distributed values (standard normal)

In [None]:
# 3-D tensor
tensor_3 = np.ones((2, 4, 2))
tensor_3

The type of these arrays is:

In [None]:
type(np.eye(3))
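
The shape conventions described above can be sketched as follows. This is a minimal illustration; the variable names are made up for this example and are not used elsewhere in the tutorial.

```python
import numpy as np

# Shape given as a tuple: 2 rows, 3 columns
zeros_2d = np.zeros((2, 3))

# A bare integer is shorthand for a one-dimensional shape
ones_1d = np.ones(4)

# A list is accepted in the same way as a tuple
ones_from_list = np.ones([2, 3])

# np.eye takes the number of rows (and optionally columns), not a shape tuple
identity = np.eye(3)

print(zeros_2d.shape)        # (2, 3)
print(ones_1d.shape)         # (4,)
print(ones_from_list.shape)  # (2, 3)
print(identity.shape)        # (3, 3)
```

Note that the shape of a one-dimensional array is still a tuple, here with a single entry.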

## ndarray basics

An ndarray Python object is just a reference to the data location and its characteristics.

Most NumPy operations on an array can be called either as `np.function(a)` or as `a.function()` (e.g. `np.sum(a)` or `a.sum()`).

It has an attribute `shape`, a tuple giving the size of each dimension of the ndarray. It also has an attribute `dtype` that describes the type of the stored data (the default is `float64`).

In [None]:
# The ndarray tensor_3 is a 2 x 4 x 2 array filled with 64-bit floating point numbers
tensor_3.shape, tensor_3.dtype

In [None]:
a = np.array([[1.0, 2.0], [5.0, 4.0]])
b = np.array([[4, 3], [2, 1]])
(b.dtype, a.dtype) # each array has a data type 

In [None]:
np.array(["Mickey", "Mouse"]) # can hold more than just numbers
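
As a small sketch of the points above, the function form and the method form give the same result, and the `dtype` is inferred from the values unless you pass it explicitly. The names `m`, `ints`, `floats` and `as_float` are chosen for this example only, and `astype` is an addition that the tutorial itself does not use.

```python
import numpy as np

m = np.array([[1.0, 2.0], [5.0, 4.0]])

# Function form and method form are interchangeable here
print(np.sum(m), m.sum())   # 12.0 12.0

# dtype is inferred from the values: integers give a
# platform-dependent integer type, floats give float64
ints = np.array([4, 3, 2, 1])
floats = np.array([4, 3, 2, 1], dtype=np.float64)  # explicit dtype
print(np.issubdtype(ints.dtype, np.integer))  # True
print(floats.dtype)                            # float64

# astype returns a converted copy; the original keeps its dtype
as_float = ints.astype(np.float64)
print(as_float.dtype)                          # float64
print(np.issubdtype(ints.dtype, np.integer))  # True
```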

When you create an array, its values are written somewhere in memory.
The name the array is assigned to (`a` in the example below) is only a reference to those values. <br>
This has some consequences that you need to be careful with:

In [None]:
a = np.array([[1.0, 2.0], [5.0, 4.0]])
b = a  # Copying the reference only.
b[0,0] = 3 # Modifying b modifies the original object because b is a reference to the same thing as a
a

Using the `copy` method allocates a second block of memory and writes the same values there, creating a new object that is independent of the old one.

In [None]:
a = np.array([[1.0, 2.0], [5.0, 4.0]])
b = a.copy()  # Deep-copy of the data
b[0,0] = 3
a
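
If you are unsure whether two names refer to the same data, you can check it directly. The sketch below uses `is` and `np.shares_memory`; the names `original`, `alias` and `independent` are chosen for this example so that the arrays of the tutorial are left untouched.

```python
import numpy as np

original = np.array([[1.0, 2.0], [5.0, 4.0]])
alias = original               # second name for the same array
independent = original.copy()  # new array with its own data

print(alias is original)                        # True
print(np.shares_memory(original, alias))        # True
print(np.shares_memory(original, independent))  # False

alias[0, 0] = 3.0         # visible through `original`
independent[1, 1] = -1.0  # not visible through `original`
print(original[0, 0])     # 3.0
print(original[1, 1])     # 4.0
```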

**Basic operators work element-wise (+, -, *, /)**

In [None]:
np.ones((2, 4)) * np.random.randn(2, 4)

In MATLAB, if you create a 1d array such as [1 3 4 5], it is automatically converted to a 2D array of size 1x4. In Python you can actually have 1d arrays.<br>

For example:

In [None]:
array_1d = np.array([1,3,4,5])
array_1d.shape

If you add this to a 2d array, numpy automatically treats the 1d array as if it had a leading dimension of size 1 (here 1x4) and repeats it along that dimension, provided the shapes are otherwise compatible:

In [None]:
array_2d = np.array([[3, 5, 5, 8], [3, 4, 4, 4]])
array_2d.shape

In [None]:
array_2d + array_1d

When applying operators to arrays of different shapes, numpy follows very specific rules (called broadcasting) that you might want to understand in the future:
http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
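
A short sketch of how these rules play out: shapes are compared starting from the last axis, and two sizes are compatible if they are equal or if one of them is 1. The arrays `row`, `grid` and `col` are made up for this example.

```python
import numpy as np

row = np.array([1, 3, 4, 5])                   # shape (4,)
grid = np.array([[3, 5, 5, 8], [3, 4, 4, 4]])  # shape (2, 4)

# (2, 4) with (4,): the last axes match, so row is applied to every row of grid
print((grid + row).shape)   # (2, 4)

# (2, 4) with (2, 1): the size-1 axis is stretched along the columns
col = np.array([[10], [20]])
print(grid + col)
# [[13 15 15 18]
#  [23 24 24 24]]

# (2, 4) with (3,): last axes are 4 and 3, which is not compatible
try:
    grid + np.array([1, 2, 3])
except ValueError as err:
    print("cannot broadcast:", err)
```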

**Accessing elements and slicing**

In [None]:
print(a)
print(a[0])  # Get first row (index on the first dimension)
print(a[:, 1])  # Get second column (index on the second dimension)
print(a[0, 1])  # Get the element in the first row, second column

In [None]:
r = np.random.randint(0, 9, size=(3, 4))
r

In [None]:
r[0], r[1]

In [None]:
r[0:2]

In [None]:
r[1][2] 

In [None]:
r[1, 2] # This is equivalent

In [None]:
r[:, 1:3]
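
Because `r` is random, the results above change on every run. The same indexing patterns on a fixed array make the outcome easier to check. This sketch uses a hypothetical array `grid`; it also shows that a basic slice is a view, so writing through it changes the original array.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(grid[1, 2])    # 6, the same element as grid[1][2]
print(grid[0:2])     # first two rows
print(grid[:, 1:3])  # columns 1 and 2 of every row
print(grid[-1])      # negative indices count from the end: last row

# A basic slice is a view on the same data
block = grid[:, 1:3]
block[0, 0] = 100
print(grid[0, 1])    # 100
```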

**Change the shape of an array**

`ravel` returns a flattened (1-D) view of the array whenever possible, whereas `flatten` always returns a flattened copy.

`reshape` returns an array with a new shape (a view of the same data when possible); it does not modify the original array in place. `transpose` permutes the dimensions.

`np.newaxis` inserts a new dimension of size 1.

In [None]:
a = np.array([[1.0, 2.0], [5.0, 4.0]])
b = np.array([[4, 3], [2, 1]])
v = np.array([0.5, 2.0])

In [None]:
print(a)
print(a.T)  # Equivalent : a.transpose(), np.transpose(a)
print(a.ravel())

In [None]:
a.reshape((-1, 1)) # -1 means 'infer this dimension from the total number of elements'

In [None]:
c = np.random.randn(4,5)
print(c.shape)
print(c[np.newaxis].shape)  # Adding a dimension
print(c.T.shape)  
print(c.reshape([10,2]).shape)
print(c)
print(c.reshape([10,2]))
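
The difference between views and copies can be checked with `np.shares_memory`. The following is a minimal sketch on a small contiguous array; the names `m`, `reshaped`, `flat_view`, `flat_copy` and `vec` are chosen for this example only.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

# reshape returns a new array object; m itself keeps its shape
reshaped = m.reshape(3, 2)
print(m.shape, reshaped.shape)         # (2, 3) (3, 2)
print(np.shares_memory(m, reshaped))   # True: same data, different shape

# ravel gives a view here because m is contiguous; flatten always copies
flat_view = m.ravel()
flat_copy = m.flatten()
print(np.shares_memory(m, flat_view))  # True
print(np.shares_memory(m, flat_copy))  # False

# np.newaxis inserts a length-1 axis at the position where it appears
vec = np.array([0.5, 2.0])
print(vec[np.newaxis, :].shape)        # (1, 2)
print(vec[:, np.newaxis].shape)        # (2, 1)
```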

**Reduction operations**<br>
Reduction operations (`np.sum`, `np.max`, `np.min`, `np.std`) work on the flattened ndarray by default. You can specify the reduction axis with the `axis` argument.

In [None]:
np.sum(a), np.sum(a, axis=0), np.sum(a, axis=1) # reduce-operations reduce the whole array if no axis is specified
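
To see which axis is which, it helps to work through a small array by hand. In this sketch `m` holds the same values as `a` above; `keepdims` is an extra argument that the tutorial does not otherwise use.

```python
import numpy as np

m = np.array([[1.0, 2.0], [5.0, 4.0]])

print(m.sum())        # 12.0     all elements
print(m.sum(axis=0))  # [6. 6.]  axis 0 is collapsed: one sum per column
print(m.sum(axis=1))  # [3. 9.]  axis 1 is collapsed: one sum per row
print(m.max(axis=0))  # [5. 4.]  largest value in each column

# keepdims=True keeps the reduced axis with size 1,
# which is convenient for broadcasting against the original array
print(m.sum(axis=1, keepdims=True).shape)  # (2, 1)
```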

**Linear algebra operations**

In [None]:
np.dot(a, b) # matrix multiplication

In [None]:
# Other ways of writing matrix multiplication, the '@' operator for matrix multiplication
# was introduced in Python 3.5
a @ b

In [None]:
# For other linear algebra operations, use the np.linalg module
np.linalg.eig(a)  # Eigen-decomposition

In [None]:
np.linalg.inv(a) # Inverse
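
A quick way to convince yourself of what these functions return is to check their defining properties numerically. This is a sketch with a matrix `m` holding the same values as `a`; `np.allclose` is used because floating point results are rarely exact.

```python
import numpy as np

m = np.array([[1.0, 2.0], [5.0, 4.0]])

# The inverse times the matrix should give the identity
m_inv = np.linalg.inv(m)
print(np.allclose(m @ m_inv, np.eye(2)))   # True

# eig returns the eigenvalues and a matrix whose COLUMNS are the eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(m)
for k in range(len(eigenvalues)):
    v_k = eigenvectors[:, k]
    # Defining property: m @ v = lambda * v
    print(np.allclose(m @ v_k, eigenvalues[k] * v_k))  # True

# Reminder: * is element-wise, @ is the matrix product
print(np.array_equal(m * m, m @ m))        # False
```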

**Binary masks**

Applying a comparison or logical operation to an array gives a boolean mask. Indexing with such a mask acts as a filter and returns only the elements where the mask is True. The result of `r[mask]` is a copy, but assigning through the mask (`r[mask] = value`) modifies the original array in place.

In [None]:
r > 5  # Binary element-wise result

In [None]:
r[r > 5]  # Use the binary mask as filter

In [None]:
r[r > 5] = 999  # Modify the corresponding values with a constant

In [None]:
r
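
The same steps on a fixed array make the behaviour easier to follow. This sketch uses a hypothetical array `values`; combining conditions with `&` is an addition that the cells above do not show.

```python
import numpy as np

values = np.array([[1, 7, 3], [8, 2, 9]])
mask = values > 5
print(mask.dtype)        # bool

# Indexing with the mask returns a 1-D copy of the selected elements
selected = values[mask]
print(selected)          # [7 8 9]
selected[0] = -1
print(values[0, 1])      # 7, the original is untouched

# Assigning through the mask writes into the original array
values[mask] = 0
print(values)
# [[1 0 3]
#  [0 2 0]]

# Conditions are combined with & (and), | (or), ~ (not); parentheses are required
print(values[(values > 0) & (values < 3)])  # [1 2]
```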

## Scipy

SciPy is a collection of libraries more specialized than NumPy.

Have a look at their collection: http://docs.scipy.org/doc/scipy/reference/

Many traditional functions are implemented there.

In [None]:
X = np.random.randn(1000) # White noise

In [None]:
# Fast Fourier transform
import matplotlib.pyplot as plt
from scipy.fft import fft  # scipy.fftpack is the legacy interface; scipy.fft needs SciPy >= 1.4
plt.plot(fft(X).real)  # real part of the spectrum
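
The spectrum of white noise has no visible structure, so a signal with a known frequency may be a clearer illustration. The sketch below is an assumption-laden example rather than part of the original tutorial: the sample rate, the number of samples and the 5 Hz sine are made up, and the import falls back to `numpy.fft` (which offers functions of the same name) in case SciPy is not installed.

```python
import numpy as np

try:
    from scipy.fft import fft, fftfreq
except ImportError:
    # NumPy ships equivalent functions, used here only as a fallback
    from numpy.fft import fft, fftfreq

sample_rate = 100.0  # samples per second (made-up value)
n_samples = 200      # 2 seconds of signal
t = np.arange(n_samples) / sample_rate
sine = np.sin(2 * np.pi * 5.0 * t)  # pure 5 Hz sine wave

# Magnitude of the spectrum and the frequency of each bin
spectrum = np.abs(fft(sine))
freqs = fftfreq(n_samples, d=1.0 / sample_rate)

# Look only at positive frequencies and find the strongest one
positive = freqs > 0
peak_freq = freqs[positive][np.argmax(spectrum[positive])]
print(peak_freq)     # close to 5.0
```

Plotting `spectrum[positive]` against `freqs[positive]` with `plt.plot` should show a single sharp peak at that frequency.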

## Some remarks

- The numpy library is huge. If you have to do something with arrays that seems cumbersome to implement, there is probably a function in numpy that does it for you. 
- **Use Google!** Python is open source and has a huge user base, so someone has probably run into the same problem before you did (unless you are doing something very exotic). Searches like "How do I set the diagonal of a numpy array to 0" will lead you either to the documentation of the function you need or to a Stack Overflow post where someone explains how to do it.
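
For the diagonal example mentioned above, one function such a search is likely to turn up is `np.fill_diagonal`. A minimal sketch, with an array `m` made up for this example:

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)

# fill_diagonal modifies the array in place and returns None
np.fill_diagonal(m, 0)
print(m)
# [[0 2 3]
#  [4 0 6]
#  [7 8 0]]
```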

Now that you know how to deal with numbers, we can do something a little more interesting next, like plotting and looking at actual data.

## Next tutorial: plotting with matplotlib