# NumPy
- the NumPy module
- performs calculations on vectors or matrices, element by element
- introduces a new type of object called an array.

  
Further documentation: https://numpy.org/

### *Other modules:* 
- Pandas (dataframes)
- SciPy (scientific calculations)
- Matplotlib

## Intro to NumPy

In [4]:
import numpy as np
a = [1, 2, 3]
a = np.array(a)

In [2]:
np.array(range(10))

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [5]:
print(a)
# arrays are printed WITHOUT commas

[1 2 3]


In [None]:
type(a)
# ndarray = n-dimensional array

A NumPy array holds elements of a single type (its dtype). If the input mixes types, everything is converted to one common type.

In [9]:
np.array([5, 1.5, 2])
#transforms everything into float
np.array([5, "tiger", 89])
#transforms everything into str

array(['5', 'tiger', '89'], dtype='<U21')
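
The type conversion described above can be checked through the `dtype` attribute. This is a minimal sketch; the variable names `ints`, `mixed` and `text` are illustrative and not part of the original notebook.

```python
import numpy as np

# Three inputs: only integers, integers mixed with a float, numbers mixed with a string
ints = np.array([5, 1, 2])
mixed = np.array([5, 1.5, 2])
text = np.array([5, "tiger", 89])

print(ints.dtype.kind)   # 'i' = integer
print(mixed.dtype)       # float64: the integers were converted to floats
print(text.dtype.kind)   # 'U' = unicode string: the numbers were converted to str
print(text[0] == "5")    # True: 5 is now the string '5'
```

The exact integer width (for example `int64`) can depend on the platform, which is why only the dtype kind is printed for `ints`.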

In [11]:
a = np.arange(10)
a + 1
# does not modify a (the original array); to update it, reassign:
# a = a + 1

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [12]:
# the same can be done with a list comprehension, but it is less efficient
list1 = list(range(10))
list1
[i+1 for i in list1]

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
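
As a small sketch of the two points above (the array expression gives the same result as the list comprehension, and `a + 1` leaves `a` untouched), the following can be run on its own. The names `shifted` and `from_list` are illustrative.

```python
import numpy as np

a = np.arange(10)

# Vectorized version: one expression, no explicit loop
shifted = a + 1

# Equivalent list comprehension on a plain Python list
from_list = [i + 1 for i in a.tolist()]

print(np.array_equal(shifted, from_list))  # True: same values
print(a[0])                                # 0: a + 1 did not modify a

# In-place update, equivalent in effect to a = a + 1
a += 1
print(a[0])                                # 1
```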

In [14]:
a + a 
# array ADDITION, add every i-th element of array1 to i-th element of array2
b = np.array(a)
a + b

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [17]:
# multiple dimensions, e.g. a 2d array
a2 = np.array([[1,2], [3,4], [5,6]])
print(a2)
a2.shape # shape is an ATTRIBUTE, not a method
# 3 rows, 2 columns

[[1 2]
 [3 4]
 [5 6]]


(3, 2)

In [19]:
print(a)
a.shape
# for 1d arrays, shape returns a tuple of one element,
# which is the number of elements in the 1d array (here, 10)
print(a.ndim) #number of dimensions (here, 1d)
print(a.size)


[0 1 2 3 4 5 6 7 8 9]
1
10
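
To compare the three attributes side by side, here is a short sketch that prints `shape`, `ndim` and `size` for the 1d and 2d arrays used above.

```python
import numpy as np

a = np.arange(10)                          # 1d array
a2 = np.array([[1, 2], [3, 4], [5, 6]])    # 2d array: 3 rows, 2 columns

for arr in (a, a2):
    # shape: tuple of lengths per dimension; ndim: number of dimensions;
    # size: total number of elements
    print(arr.shape, arr.ndim, arr.size)
# (10,) 1 10
# (3, 2) 2 6
```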


In [22]:
print(a)
print(a.reshape((5,2)))
print(a.reshape((2,5)))
# returns the contents of the original array with a different shape
# first: 5 rows, 2 columns
# second: 2 rows, 5 columns

[0 1 2 3 4 5 6 7 8 9]
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
[[0 1 2 3 4]
 [5 6 7 8 9]]


#### IMPORTANT:

The original array is **unchanged**: `reshape()` returns a new array object and leaves the shape of `a` as it was.

In [23]:
print(a)

[0 1 2 3 4 5 6 7 8 9]
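
One subtlety is worth a sketch: the shape of `a` is unchanged, but for a contiguous array like this one the reshaped result is a view that shares memory with `a`, so writing into it also changes the values seen through `a`. The name `r` is illustrative.

```python
import numpy as np

a = np.arange(10)
r = a.reshape((5, 2))

print(a.shape)                 # (10,): the shape of a is unchanged
print(np.shares_memory(a, r))  # True: r is a view on the same data

# Writing through the view changes the data seen through a as well
r[0, 0] = 99
print(a[0])                    # 99

# An independent array requires an explicit copy
c = a.reshape((5, 2)).copy()
c[0, 0] = -1
print(a[0])                    # still 99
```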


#### .resize()

**changes** the original array (in-place modification)

In [26]:
a.resize(5,2)
print(a)

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


In [27]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [28]:
a.resize((5,3), refcheck = False)
# refcheck=False skips the check for other references to the array
# when the new shape needs more elements than the array has, the missing entries are filled with zeros
a

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8],
       [9, 0, 0],
       [0, 0, 0]])

In [29]:
a.resize((2,2), refcheck = False)
# refcheck=False skips the check for other references to the array
# when the new shape needs fewer elements, the array is truncated: only the first elements are kept
a

array([[0, 1],
       [2, 3]])

In [30]:
a = np.arange(10)
b = np.resize(a, (5,2))
print(a)
print(b)
# np.resize(array, new_shape)
# does not modify in place: it returns a new array
# and leaves the original array unchanged

[0 1 2 3 4 5 6 7 8 9]
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
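
The function and the method also differ when the new shape needs more elements than are available: `np.resize` fills the result with repeated copies of the input, while the `.resize()` method pads with zeros. A small sketch, with illustrative names `repeated` and `padded`:

```python
import numpy as np

a = np.arange(4)                       # [0 1 2 3]

# Function: returns a new array, filled by repeating the input
repeated = np.resize(a, (2, 3))
print(repeated)
# [[0 1 2]
#  [3 0 1]]

# Method: modifies the array in place, missing entries are zeros
padded = a.copy()
padded.resize((2, 3), refcheck=False)
print(padded)
# [[0 1 2]
#  [3 0 0]]

print(a)                               # [0 1 2 3]: untouched in both cases
```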


## Indexing arrays

**Arrays are subscriptable!**


In [31]:
print(a)
a[8]
# returns a NumPy scalar (its repr shows the type, here np.int64)

[0 1 2 3 4 5 6 7 8 9]


np.int64(8)

In [32]:
a[2:6]
# a slice returns a NumPy array

array([2, 3, 4, 5])

In [37]:
print(b)
print(b[3,1])
# 4th row, 2nd column

print(b[3,: ]) # 4th row, all columns
print(b[:,0]) # 1st column, all rows
# both return a one-dimensional array

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
7
[6 7]
[0 2 4 6 8]
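
Indexing follows the same conventions as Python lists (negative indices, `start:stop:step` slices), applied per dimension. A few more cases, sketched on the same 5x2 array, rebuilt here so the example runs on its own:

```python
import numpy as np

b = np.arange(10).reshape((5, 2))

last_row = b[-1]          # negative index: last row
middle = b[1:3, :]        # rows 1 and 2 (stop index excluded), all columns
every_other = b[::2, 1]   # every second row, 2nd column

print(last_row)           # [8 9]
print(middle)
# [[2 3]
#  [4 5]]
print(every_other)        # [1 5 9]
```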


### Calculations with NumPy

- .mean() for the mean

- .sum() for the sum

- .std() for the standard deviation

- .max() to extract the maximum

- .min() to extract the minimum

- To compute along columns or rows specifically: **axis**

In [39]:
print(b)
b.mean(axis=0)
# axis=0: mean of each column
# axis=1: mean of each row
b.mean(axis=1)

[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


array([0.5, 2.5, 4.5, 6.5, 8.5])
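
Only the result of the last expression of the cell is displayed above. The sketch below prints both axes explicitly and applies the same `axis` argument to other methods from the list; the variable names are illustrative.

```python
import numpy as np

b = np.arange(10).reshape((5, 2))

col_means = b.mean(axis=0)   # one value per column
row_means = b.mean(axis=1)   # one value per row
col_sums = b.sum(axis=0)
row_max = b.max(axis=1)

print(col_means)   # [4. 5.]
print(row_means)   # [0.5 2.5 4.5 6.5 8.5]
print(col_sums)    # [20 25]
print(row_max)     # [1 3 5 7 9]
print(b.mean())    # 4.5: without axis, the mean of all elements
```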

In [40]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [41]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [42]:
np.full((3,3), 10)

array([[10, 10, 10],
       [10, 10, 10],
       [10, 10, 10]])

In [48]:
np.loadtxt("val.dat")

array([[  1.,   7., 310.],
       [ 15.,  -4.,  35.],
       [ 78.,  95.,  79.]])
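
The file `val.dat` itself is not shown in these notes. As a sketch, assuming it is a plain text file of whitespace-separated numbers, a file with the same values can be written with `np.savetxt` and read back with `np.loadtxt`. The file name `val_demo.dat` is hypothetical.

```python
import os
import tempfile

import numpy as np

# Values copied from the output above
values = np.array([[1, 7, 310], [15, -4, 35], [78, 95, 79]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "val_demo.dat")  # hypothetical file name
    np.savetxt(path, values, fmt="%d")            # one row per line, space-separated
    loaded = np.loadtxt(path)

print(loaded.dtype)   # float64: loadtxt returns floats by default
print(loaded.shape)   # (3, 3)
```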

## Linear algebra

In [50]:
b = np.loadtxt("val.dat")
# Transpose: .T
# = reflects the values across the main diagonal (rows become columns)
b.T

array([[  1.,  15.,  78.],
       [  7.,  -4.,  95.],
       [310.,  35.,  79.]])

In [None]:
# np.linalg = NumPy submodule for linear algebra
np.linalg.inv(b)

In [51]:
np.dot(b, b)
# matrix product (for 2d arrays, np.dot performs matrix multiplication)

array([[24286., 29429., 25045.],
       [ 2685.,  3446.,  7275.],
       [ 7665.,  7671., 33746.]])
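
The matrix product should not be confused with `*`, which multiplies element by element. A small sketch on a 2x2 matrix (the name `m` is illustrative); the `@` operator gives the same result as `np.dot` for 2d arrays.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

matrix_product = np.dot(m, m)   # rows times columns
same_with_operator = m @ m      # @ is the matrix multiplication operator
elementwise = m * m             # each element multiplied by itself

print(matrix_product)
# [[ 7 10]
#  [15 22]]
print(elementwise)
# [[ 1  4]
#  [ 9 16]]
```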

Inverse matrix (M^-1): multiplying it by the original matrix gives the identity matrix:

- [1, 0, 0]
- [0, 1, 0]
- [0, 0, 1]

M * M^-1 = I

In [54]:
np.dot(b, np.linalg.inv(b))
# returns the identity matrix;
# because of floating-point rounding we do not get a clean matrix of 1s and 0s, but floats very close to 1 and very close to 0

array([[ 1.00000000e+00,  3.96817995e-17,  5.09575021e-18],
       [ 1.95156391e-17,  1.00000000e+00, -3.34476370e-17],
       [ 4.33680869e-19,  9.66024136e-17,  1.00000000e+00]])
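
Since the product is only approximately the identity, an exact comparison with `==` is not the right test. A sketch using `np.allclose`, which compares within a tolerance; the matrix values are copied from the `val.dat` output above so the example runs without the file.

```python
import numpy as np

# Same values as the matrix loaded from val.dat
b = np.array([[1.0, 7.0, 310.0],
              [15.0, -4.0, 35.0],
              [78.0, 95.0, 79.0]])

product = np.dot(b, np.linalg.inv(b))
identity = np.eye(3)                      # exact 3x3 identity matrix

print(np.allclose(product, identity))     # True: equal up to rounding error
print(np.round(product, 10))              # rounding gives a readable matrix of 1s and 0s
```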

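### Eigenvalues
An eigenvector (or characteristic vector) is a nonzero vector whose direction is unchanged, or exactly reversed, by a given linear transformation: the vector is only scaled, and the scale factor is the corresponding eigenvalue.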

In [58]:
eigvals, eigvects = np.linalg.eig(b)
print(eigvals)
print(eigvects)

[ 216.37258678 -119.22032306  -21.15226372]
[[-0.81413393 -0.9323703   0.79392922]
 [-0.14472907  0.01162499 -0.60648502]
 [-0.5623517   0.36131769 -0.0430385 ]]


PCA (Principal Component Analysis) is based on the **eigenvalues** and eigenvectors of the data's covariance matrix.

In [59]:
eigvects[0,:]

array([-0.81413393, -0.9323703 ,  0.79392922])

In [60]:
eigvects[0]
# on a 2d array, a single index in brackets returns the whole corresponding row as an array

array([-0.81413393, -0.9323703 ,  0.79392922])
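
Note that `np.linalg.eig` returns the eigenvectors as the **columns** of its second result: the eigenvector paired with `eigvals[i]` is `eigvects[:, i]`, not the row `eigvects[i]`. The sketch below checks the defining relation M v = λ v for each pair, again with the matrix values copied from the `val.dat` output.

```python
import numpy as np

b = np.array([[1.0, 7.0, 310.0],
              [15.0, -4.0, 35.0],
              [78.0, 95.0, 79.0]])

eigvals, eigvects = np.linalg.eig(b)

checks = []
for i in range(len(eigvals)):
    v = eigvects[:, i]  # i-th column = eigenvector for eigvals[i]
    # The transformation only scales v by the eigenvalue
    checks.append(np.allclose(np.dot(b, v), eigvals[i] * v))

print(checks)  # [True, True, True]
```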

## Boolean masks

In [70]:
import random as rd
a = np.array([rd.randint(1,10) for i in range(9)])
a

array([10,  7,  2,  7,  9,  9,  8,  3,  2])

In [72]:
a = np.resize(a, (3,3))
a

array([[10,  7,  2],
       [ 7,  9,  9],
       [ 8,  3,  2]])

In [73]:
a > 6
#outputs the array of booleans

array([[ True,  True, False],
       [ True,  True,  True],
       [ True, False, False]])

In [74]:
a[a > 6]
# indexing with a boolean mask keeps only the elements where the mask is True (the result is a 1d array)

array([10,  7,  7,  9,  9,  8])

In [75]:
a[a < 6] = 0
a[a >= 6] = 1
a
# we can use boolean masks to transform the matrix: binarisation

array([[1, 1, 0],
       [1, 1, 1],
       [1, 0, 0]])
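
The binarisation above overwrites `a`. As an alternative sketch, `np.where` builds the same result in a new array and leaves the input intact; masks can also be combined with `&` (and) and `|` (or), and summed to count matches. The matrix values are copied from the random draw shown above.

```python
import numpy as np

a = np.array([[10, 7, 2],
              [7, 9, 9],
              [8, 3, 2]])

# 1 where the condition holds, 0 elsewhere; a itself is not modified
binary = np.where(a >= 6, 1, 0)
print(binary)
# [[1 1 0]
#  [1 1 1]
#  [1 0 0]]

# Combined conditions: each one needs its own parentheses
between = a[(a > 2) & (a < 9)]
print(between)          # [7 7 8 3]

# True counts as 1, so summing a mask counts the matching elements
count = (a > 6).sum()
print(count)            # 6
```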