# Introduction to NumPy
Python lists quickly show their limits when it comes to scientific computing:

In [1]:
liste = [1,2,3,4]

In [2]:
liste + 1

TypeError: can only concatenate list (not "int") to list

In [3]:
liste * 2

[1, 2, 3, 4, 1, 2, 3, 4]

Indeed, unlike Matlab and IDL, Python has no support for numerical multidimensional arrays built into the core language.
 

## NumPy arrays

That is why there is a library, NumPy, that provides them. NumPy is the basic building block of the whole scientific Python ecosystem.

In [4]:
import numpy as np

NumPy provides an N-dimensional numerical array type, `ndarray`, created with `array()`. NumPy is implemented in C (transparently to the user) and is therefore fast. The user interface is very close to Matlab's:

In [5]:
# to create a NumPy array, call array() on a sequence 
my_array = np.array([0,1,2,3,4])

print(my_array)
print(type(my_array))

[0 1 2 3 4]
<class 'numpy.ndarray'>


In [6]:
my_array + 1

array([1, 2, 3, 4, 5])

In [7]:
my_array * 2

array([0, 2, 4, 6, 8])

In [8]:
my_array ** 2

array([ 0,  1,  4,  9, 16])
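
For comparison, here is a minimal sketch of what the same operation requires with a plain list. The list needs an explicit loop or comprehension, whereas the array expression applies to every element at once. The variable names are illustrative only.

```python
import numpy as np

values = [0, 1, 2, 3, 4]

# With a list, each element has to be processed explicitly
list_result = [v + 1 for v in values]

# With an array, the operation is applied element-wise
array_result = np.array(values) + 1

print(list_result)
print(array_result)
```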

Because NumPy arrays were designed with performance in mind, they have several specific properties:

* Unlike Python lists, a NumPy array cannot mix types: all its elements share a single data type (`dtype`).

* The numerical data type can be specified if needed:

In [9]:
np.array([1.1, 2.2, 3.3]) # auto-guess 

array([ 1.1,  2.2,  3.3])

In [10]:
np.array([1.1, 2.2, 3.3]).dtype

dtype('float64')

In [11]:
np.array([1.1, 2.2, 3.3], dtype='int') # cast to int (the fractional part is truncated)

array([1, 2, 3])

In [12]:
np.array([1.1, 2.2, 3.3], dtype='complex') 
# note that 1j is the imaginary unit in Python (a bare j is just a variable name)

array([ 1.1+0.j,  2.2+0.j,  3.3+0.j])
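
Two practical consequences of the single data type are worth seeing in a short sketch: mixed inputs are promoted to a common type when the array is created, and values assigned later are cast to the array's existing type. The names `mixed` and `ints` are illustrative.

```python
import numpy as np

# Mixing integers and floats: everything is promoted to float64
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Assigning a float into an integer array silently truncates it
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(ints)
```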

In [13]:
array1 = np.arange(0, 9).reshape((3,3))
array2 = np.arange(9, 0, -1).reshape((3,3))
array1 + array2

array([[9, 9, 9],
       [9, 9, 9],
       [9, 9, 9]])

**Warning**: unlike Matlab, the `*` operator on NumPy arrays performs an element-wise product (see the Broadcasting section):

In [14]:
array1 * array2

array([[ 0,  8, 14],
       [18, 20, 20],
       [18, 14,  8]])

In [15]:
array1 * np.eye(3)

array([[ 0.,  0.,  0.],
       [ 0.,  4.,  0.],
       [ 0.,  0.,  8.]])

The matrix product (or *inner* product) is obtained with the `dot` function:

In [16]:
np.dot(array1, array2)

array([[ 12,   9,   6],
       [ 66,  54,  42],
       [120,  99,  78]])

In [17]:
array1.dot(np.eye(3))

array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.]])
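
In Python 3.5 and later, the `@` operator is an alternative spelling of the matrix product for NumPy arrays. The sketch below rebuilds the two arrays used above under the illustrative names `m1` and `m2` and checks that both forms agree.

```python
import numpy as np

m1 = np.arange(0, 9).reshape((3, 3))
m2 = np.arange(9, 0, -1).reshape((3, 3))

product_dot = np.dot(m1, m2)
product_at = m1 @ m2  # same matrix product, operator form

print(np.array_equal(product_dot, product_at))
```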

## Fonctions utiles

Just like Matlab, NumPy offers many functions to create, and optionally allocate, arrays:

In [18]:
np.arange(0, 10, 1) # start, stop, step

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
np.zeros((5,3)) # the shape is a tuple

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [20]:
np.eye(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])

In [21]:
np.linspace(0, 10, num=6)

array([  0.,   2.,   4.,   6.,   8.,  10.])

NumPy also provides the usual mathematical functions, which operate element-wise on N-dimensional arrays:

In [22]:
array = np.arange(0,9)
print(array)

[0 1 2 3 4 5 6 7 8]


In [23]:
array.shape
# also : ndim, size (!=matlab)

(9,)

In [24]:
array.reshape((3,3))

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [25]:
array.min()

0

In [26]:
array.max()

8

In [27]:
# 9 random integers from 0 (included) to 10 (excluded)
array = np.random.randint(0, 10, 9)
print(array)

[4 1 9 4 8 2 4 2 0]


In [28]:
# get the array index of the first maximum value
array.argmax()

2

In [29]:
np.sum(array)

34

In [30]:
np.sqrt(array)

array([ 2.        ,  1.        ,  3.        ,  2.        ,  2.82842712,
        1.41421356,  2.        ,  1.41421356,  0.        ])

In [31]:
np.tan(array) / np.cos(array)

array([ -1.77133417,   2.8824747 ,   0.49643358,  -1.77133417,
        46.7334012 ,   5.25064634,  -1.77133417,   5.25064634,   0.        ])

In [32]:
array3 = np.random.rand(3,3)
array3_inv = np.linalg.inv(array3)
print(array3_inv)

[[ 1.8417421  -1.14930143 -0.30758123]
 [-2.11880067  0.8970884   1.60786316]
 [ 1.47460841  1.86325903 -1.21222493]]


In [33]:
array3.dot(array3_inv) # OK

array([[  1.00000000e+00,  -2.38535564e-17,   2.49315340e-17],
       [  9.55012945e-17,   1.00000000e+00,   3.38148315e-17],
       [ -1.07507083e-16,   2.41201868e-17,   1.00000000e+00]])

## Slicing arrays

Extracting slices of values works the same way as for lists:

In [34]:
array

array([4, 1, 9, 4, 8, 2, 4, 2, 0])

In [35]:
array[1:3]

array([1, 9])

In [36]:
array[1::2]

array([1, 4, 2, 2])

**Warning**: *slices* are *views* of the original array.

This means that any modification of their elements is visible in the original array!

In [37]:
a = np.arange(16).reshape((4,4))

print(a)

a[1:3,1:3] = -1
print(a)



[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
[[ 0  1  2  3]
 [ 4 -1 -1  7]
 [ 8 -1 -1 11]
 [12 13 14 15]]
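
When an independent sub-array is needed, a slice can be copied explicitly. The following sketch uses `np.shares_memory` to check which objects share their data with the original. The names `view` and `independent` are illustrative.

```python
import numpy as np

original = np.arange(16).reshape((4, 4))

view = original[1:3, 1:3]                 # a view: shares memory with original
independent = original[1:3, 1:3].copy()   # an explicit copy

independent[:] = -1                       # does not touch original
print(np.shares_memory(original, view))
print(np.shares_memory(original, independent))
print(original[1, 1])                     # unchanged by the line above
```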


## Pay attention

* Some `ndarray` methods, such as `a.sort()` or `a.fill(0)`, operate *in-place*.
* The corresponding functions that take an `ndarray` as an argument, such as `np.sort(a)`, return a *modified copy*.
* With NumPy arrays, `a = b` creates a new reference to `b`, not a copy.

In [38]:
a = np.random.rand(5)
b = np.arange(5)
b = a # b is a new reference to a.
b[0] = 10
print(b)

[ 10.           0.52370363   0.4348391    0.40267459   0.99096507]


In [39]:
# Because b refers to a, modifying b also modifies a!
print(a)

[ 10.           0.52370363   0.4348391    0.40267459   0.99096507]


In [40]:
b=a.copy() 
b[0] = 20
print(b)
print(a) # a has not been modified.

[ 20.           0.52370363   0.4348391    0.40267459   0.99096507]
[ 10.           0.52370363   0.4348391    0.40267459   0.99096507]
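
The difference between an in-place method and a function returning a copy can be sketched with sorting, which exists in both forms. The array `data` is an illustrative example.

```python
import numpy as np

data = np.array([3, 1, 2])

sorted_copy = np.sort(data)  # function: returns a sorted copy
before = data.tolist()       # data is still in its original order here
print(before)

data.sort()                  # method: sorts data in place and returns None
print(data)
```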


## Broadcasting

Broadcasting rules describe how arrays with different numbers of dimensions and/or shapes can be combined in computations.

The general rule: shapes are compared dimension by dimension, starting from the last one, and two dimensions are compatible when they are equal or when one of them is 1.

![](http://www.astroml.org/_images/fig_broadcast_visual_1.png)

In [41]:
a = np.arange(3)
b = 5

In [42]:
a+b

array([5, 6, 7])

In [43]:
a = np.ones((3,3))
b = np.arange(3)
a+b

array([[ 1.,  2.,  3.],
       [ 1.,  2.,  3.],
       [ 1.,  2.,  3.]])

In [44]:
a = np.arange(3).reshape(3,1)
b = np.arange(3)
a+b

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
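
A short sketch of the compatibility rule in action: `np.broadcast` reports the shape that would result from combining two arrays, and incompatible shapes raise a `ValueError`. The arrays `column` and `row` are illustrative.

```python
import numpy as np

column = np.ones((3, 1))
row = np.ones(3)

# (3, 1) and (3,) are compatible: the result has shape (3, 3)
result_shape = np.broadcast(column, row).shape
print(result_shape)

# (3, 2) and (3,) are not: the last dimensions are 2 and 3
incompatible = False
try:
    np.ones((3, 2)) + row
except ValueError as error:
    incompatible = True
    print("incompatible shapes:", error)
```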

<div class='exercice'><h1>Exercise</h1>
Generate the array [2^0, 2^1, 2^2, 2^3]

</div>

In [45]:
2**np.arange(4)

array([1, 2, 4, 8], dtype=int32)

<div class='exercice'><h1>Exercise</h1>
Let a and b be two vectors such that shape(a)=(4,1) and shape(b)=(1,3). Compute the outer product (a o b)_ij = a_i.b_j. The result must have shape (4,3).
</div>

In [46]:
a = np.arange(4).reshape(4,1) # reshapes into column vector
b = np.arange(3).reshape(1,3) # reshapes into row vector
a*b

array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4],
       [0, 3, 6]])

<div class='exercice'><h1>Quiz</h1>
What will the following do? <br>
a = np.ones((4, 5)) <br>
a[0] = 2
</div>


In [47]:
a = np.ones((4, 5))
a[0] = 2 
a

array([[ 2.,  2.,  2.,  2.,  2.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

## Fancy indexing, masking

Slicing is great when the indices follow a regular pattern.

When arbitrary indices are needed, use *fancy indexing*: the index is an integer array or a list of integers.

This requires a copy of the original array (hence a performance cost).

In [48]:
a = np.arange(10)
print(a)

index = [3,1,6]
print(a[index])

[0 1 2 3 4 5 6 7 8 9]
[3 1 6]
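
The copy semantics can be checked directly. In this sketch, modifying the result of a fancy-indexing read leaves the original untouched, while assigning *through* a fancy index does modify the original. The names `values` and `selection` are illustrative.

```python
import numpy as np

values = np.arange(10)
selection = values[[3, 1, 6]]   # fancy indexing: a new array

selection[0] = -1               # modifies the copy only
print(values[3])                # unchanged
print(np.shares_memory(values, selection))

# Assigning through a fancy index does modify the original
values[[3, 1, 6]] = 0
print(values)
```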




Masking is like fancy indexing, except that the index is a *Boolean* array with the same shape as the indexed array (not a Python list of 0s and 1s!).

As with fancy indexing, applying a mask to an array produces a copy of the original data.


In [49]:
mask = np.array([0,1,0,1,1,0,0,0,0,0], dtype=bool) # same length as a
a[mask]


array([1, 3, 4])

**Pay attention**: the following does not behave as one might expect from Matlab!

In [50]:
a[[0,1,0,1,1,0]] # Here we use a Python list. 

array([0, 1, 0, 1, 1, 0])

<div class='exercice'><h1>Quiz</h1>
Can you explain the last example?
</div>

The mask can be generated in the indexing operation itself

In [51]:
a[a > 5]

array([6, 7, 8, 9])

It is also possible to combine masks with operators

In [52]:
a[(a>5) & (a<=8)]

array([6, 7, 8])
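
Masks are combined with the element-wise operators `&` (and), `|` (or) and `~` (not), because Python's `and`, `or` and `not` do not work element-wise on arrays. The sketch below uses an illustrative array `values`. The parentheses are required because these operators bind more tightly than comparisons.

```python
import numpy as np

values = np.arange(10)

outside = values[(values < 2) | (values > 7)]   # OR of two masks
not_large = values[~(values > 5)]               # negation of a mask

print(outside)
print(not_large)
```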

The NumPy function `where()` takes a Boolean array as argument and returns a tuple of the indices where the condition holds (True). It is the equivalent of Matlab's `find`.

In [53]:
np.where(a > 5)

(array([6, 7, 8, 9], dtype=int64),)

In [54]:
a[np.where(a > 5)]

array([6, 7, 8, 9])
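
The result is a tuple because there is one index array per dimension. A small sketch with a two-dimensional array makes this visible. The names `grid`, `rows` and `cols` are illustrative.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

rows, cols = np.where(grid > 3)   # one index array per dimension
print(rows, cols)
print(grid[rows, cols])           # the matching values
```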

<div class='exercice'><h1>Exercise</h1>
Retrieve the plasma current (signal SIPMES) for shot 47979 and plot it only where Ip>50 kA.
</div>

# Other operations

## Element-wise comparisons:

In [55]:
a = np.array([1,2,3,4])
b = np.array([0,2,3,1])
a == b

array([False,  True,  True, False], dtype=bool)

In [56]:
a > b

array([ True, False, False,  True], dtype=bool)

## Array-wide comparisons:

In [57]:
np.array_equal(a,b)

False

In [58]:
# Test whether all array elements evaluate to True.
np.all([1, 1, 0]) 

False

In [59]:
# Test whether any array element evaluates to True.
np.any([1, 1, 0])

True

<div class='exercice'><h1>Quiz</h1>
These functions are handy for testing equality between arrays. What will the result below be?
</div>

In [60]:
a = np.array([1, 2, 3, 2])
b = np.array([2, 2, 3, 2])
c = np.array([6, 4, 4, 5])

In [61]:
((a <= b) & (b <= c)).all()

True

## Comparing floats

In [62]:
a = np.array([1.00001, 2, 3, 4])
b = np.array([1,2,3,4])

In [63]:
a == b

array([False,  True,  True,  True], dtype=bool)

In [64]:
np.isclose(a,b)

array([ True,  True,  True,  True], dtype=bool)

In [65]:
np.array_equal(a,b)

False

In [66]:
np.allclose(a, b, rtol=1e-5)

True
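
`np.isclose(a, b)` tests `|a - b| <= atol + rtol * |b|`, with default values `rtol=1e-5` and `atol=1e-8`. The sketch below applies it to a classic rounding example and writes the tolerance test out by hand. The variable names are illustrative.

```python
import numpy as np

total = 0.1 + 0.2
print(total == 0.3)             # exact comparison fails because of rounding
print(np.isclose(total, 0.3))   # comparison within tolerance succeeds

# The same tolerance test written out by hand (default rtol and atol)
rtol, atol = 1e-5, 1e-8
manual = abs(total - 0.3) <= atol + rtol * abs(0.3)
print(manual)
```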

## Logical operations:

In [67]:
a = np.array([0,0,1,1])
b = np.array([0,1,0,1])
np.logical_and(a,b)

array([False, False, False,  True], dtype=bool)

## Transposition

In [68]:
a = np.triu(np.ones((3, 3))) # Upper triangle of a 3x3 array
a

array([[ 1.,  1.,  1.],
       [ 0.,  1.,  1.],
       [ 0.,  0.,  1.]])

In [69]:
a.T

array([[ 1.,  0.,  0.],
       [ 1.,  1.,  0.],
       [ 1.,  1.,  1.]])

## Reductions

In [70]:
a = np.array([[1,1],[2,2]])
a.sum() # sums all elements by default

6

In [71]:
a.sum(axis=0)

array([3, 3])

In [72]:
a.sum(axis=1)

array([2, 4])

The same applies to most functions: min, max, etc.
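
A short sketch of the `axis` argument with other reductions, using an illustrative array `table`. Here `axis=0` reduces along the rows (one result per column) and `axis=1` reduces along the columns (one result per row). The `keepdims=True` option keeps the reduced axis with length 1, which is convenient for later broadcasting.

```python
import numpy as np

table = np.array([[1, 5, 3],
                  [4, 2, 6]])

column_min = table.min(axis=0)   # minimum of each column
row_max = table.max(axis=1)      # maximum of each row
overall_mean = table.mean()      # mean over all elements

# keepdims=True keeps the reduced axis with length 1
row_sums = table.sum(axis=1, keepdims=True)

print(column_min, row_max, overall_mean)
print(row_sums.shape)
```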

## Statistiques

In [73]:
a = np.random.rand(int(1e6)) # the size must be an integer


In [74]:
a.mean() # or np.mean(a)

0.49974077726350163

In [75]:
a.std() # or np.std(a)

0.2887288190022515
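
These values can be checked against theory: for a uniform distribution on [0, 1), the expected mean is 1/2 and the expected standard deviation is 1/sqrt(12), about 0.2887. The sketch below assumes NumPy 1.17 or later for `np.random.default_rng`. The seed is arbitrary and only makes the run reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded generator, for reproducibility
samples = rng.random(10**6)           # uniform samples in [0, 1)

expected_mean = 0.5
expected_std = 1 / np.sqrt(12)        # about 0.2887

print(samples.mean(), expected_mean)
print(samples.std(), expected_std)
```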

## Grids

In [76]:
x, y = np.arange(4), np.arange(4).reshape(4,1)
# or, a trick for y: np.arange(4)[:, np.newaxis]
x * y

array([[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 4, 6],
       [0, 3, 6, 9]])

In [77]:
x, y = np.meshgrid(np.arange(4), np.arange(4))
print(x)
print(y)

[[0 1 2 3]
 [0 1 2 3]
 [0 1 2 3]
 [0 1 2 3]]
[[0 0 0 0]
 [1 1 1 1]
 [2 2 2 2]
 [3 3 3 3]]
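
A typical use of such grids is to evaluate a function of two variables at every grid point. The sketch below uses the illustrative function f(x, y) = x**2 + y**2 and shows that broadcasting gives the same result without building the full coordinate arrays.

```python
import numpy as np

x = np.arange(4)
y = np.arange(4)
xx, yy = np.meshgrid(x, y)

# Evaluate f(x, y) = x**2 + y**2 at every grid point
f_grid = xx**2 + yy**2

# Same result with broadcasting, without building the full grids
f_broadcast = x**2 + y[:, np.newaxis]**2

print(np.array_equal(f_grid, f_broadcast))
```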


## Flattening

In [78]:
a = np.array([[1,2,3],[4,5,6]])
a

array([[1, 2, 3],
       [4, 5, 6]])

In [79]:
a.ravel() # Flattening

array([1, 2, 3, 4, 5, 6])

In [80]:
a.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [81]:
a.T.ravel() # or a.transpose().ravel()

array([1, 4, 2, 5, 3, 6])

<div class='exercice'><h1>Exercise</h1>
Use the functions ravel() and flatten(). What is the difference between these two functions? (hint: which one returns a view and which one a copy?)
</div>

In [82]:
a = np.array([[1,2,3],[4,5,6]])
b = a.ravel() # Return a flattened array (a view when possible).
c = a.flatten() # Return a copy of the array collapsed into one dimension.

In [83]:
a[0,0] = -1
print(b)
print(c)


[-1  2  3  4  5  6]
[1 2 3 4 5 6]
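
The same conclusion can be reached without modifying the data, by asking NumPy whether two arrays share memory. In this sketch (the name `a2d` is illustrative), `ravel()` returns a view only when the memory layout allows it. On the transposed array it has to copy.

```python
import numpy as np

a2d = np.array([[1, 2, 3], [4, 5, 6]])

ravel_shares = np.shares_memory(a2d, a2d.ravel())        # view of a contiguous array
flatten_shares = np.shares_memory(a2d, a2d.flatten())    # always a copy
transposed_shares = np.shares_memory(a2d, a2d.T.ravel()) # not contiguous: a copy

print(ravel_shares, flatten_shares, transposed_shares)
```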


## Reshaping

In [84]:
a

array([[-1,  2,  3],
       [ 4,  5,  6]])

Reshaping without specifying all the dimensions:

In [85]:
# unspecified (-1) value is inferred
a.reshape(3, -1) 

array([[-1,  2],
       [ 3,  4],
       [ 5,  6]])
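
The inferred dimension is the total number of elements divided by the product of the dimensions that are given, so the division must be exact. A minimal sketch, with an illustrative array `values`:

```python
import numpy as np

values = np.arange(6)

two_columns = values.reshape(-1, 2)   # 6 elements / 2 columns -> 3 rows
flat_again = two_columns.reshape(-1)  # back to one dimension
print(two_columns.shape, flat_again.shape)

# 6 elements cannot be split into rows of 4
reshape_failed = False
try:
    values.reshape(-1, 4)
except ValueError as error:
    reshape_failed = True
    print("cannot reshape:", error)
```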

## Linear algebra
The [numpy.linalg](http://docs.scipy.org/doc/numpy-1.10.0/reference/routines.linalg.html) module contains tools for:

* Matrix and vector products
* Decompositions
* Matrix eigenvalues
* Norms and other numbers
* Solving equations and inverting matrices

In [86]:
a = [[1, 0], [0, 1]]
b = [[4, 1], [2, 2]]
print(a) 
print(b)
np.dot(a, b)

[[1, 0], [0, 1]]
[[4, 1], [2, 2]]


array([[4, 1],
       [2, 2]])
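
A brief sketch of a few `numpy.linalg` routines, applied to the matrix `b` from the example above. The right-hand side `rhs` is an arbitrary choice made for illustration.

```python
import numpy as np

b = np.array([[4, 1], [2, 2]])
rhs = np.array([1, 2])

solution = np.linalg.solve(b, rhs)    # solves b @ solution = rhs
determinant = np.linalg.det(b)
eigenvalues = np.linalg.eigvals(b)

print(solution)
print(determinant)
print(eigenvalues)
print(np.allclose(b @ solution, rhs))  # check the solution
```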

In [87]:
# CSS Styling
from IPython.display import HTML

def css_styling():
    """Load the CSS sheet 'custom.css' located in the current directory."""
    with open('./custom.css', 'r') as css_file:
        styles = "<style>\n%s\n</style>" % css_file.read()
    return HTML(styles)

css_styling()