# NumPy reference

NumPy is the go-to library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays.
This page is a reference for the most important functionality, but be sure to consult the official [documentation](https://numpy.org/doc/stable/reference/index.html) as well.

To use NumPy, we first need to import the `numpy` package:

In [1]:
import numpy as np

## Arrays

A NumPy array is a grid of values, all of the same type, indexed by integers. The _shape_ of an array is a tuple of integers giving the size of the array along each dimension.

In [2]:
# Create a vector
a = np.array([1, 2, 3])
print(a)
print(f"type(a): {type(a)}\na.shape: {a.shape}\na[0]: {a[0]}\na[1]: {a[1]}\na[2]: {a[2]}")


[1 2 3]
type(a): <class 'numpy.ndarray'>
a.shape: (3,)
a[0]: 1
a[1]: 2
a[2]: 3


In [3]:
# Change an element of the array
a[0] = 5
print(a)

[5 2 3]


In [4]:
# Create a 2-dimensional array
b = np.array([[1, 2, 3], [4, 5, 6]])
print(b)
print(
    f"type(b): {type(b)}\nb.shape: {b.shape}\nb[0, 0]: {b[0, 0]}\nb[0, 2]: {b[0, 2]}\nb[1, 0]: {b[1, 0]}"
)

[[1 2 3]
 [4 5 6]]
type(b): <class 'numpy.ndarray'>
b.shape: (2, 3)
b[0, 0]: 1
b[0, 2]: 3
b[1, 0]: 4


Indexing also works with a tuple of integers, which lets us store an index in a variable.

In [5]:
index = (0, 2)
print(b[index])

3


### Special arrays

NumPy also has many functions that create special arrays from a length or shape parameter.

In [6]:
# Create a range of 5 values
np.arange(5)

array([0, 1, 2, 3, 4])

In [7]:
# Create an evenly spaced range between 0 and 1 with 5 values
np.linspace(0, 1, 5)  # 5 values from 0 to 1, both end points included

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
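
The difference between `np.arange` and `np.linspace` is easy to mix up, so here is a minimal sketch comparing the two. The variable names `steps` and `points` are only illustrative.

```python
import numpy as np

# arange: fixed step size, the end point is excluded
steps = np.arange(0, 1, 0.25)

# linspace: fixed number of values, the end point is included by default
points = np.linspace(0, 1, 5)

print(steps)
print(points)
```

As a rule of thumb, `np.arange` fits best when the step size matters and `np.linspace` when the number of values matters.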

In [8]:
# Create an array of all zeros
a = np.zeros((3, 2))
print(a)

[[0. 0.]
 [0. 0.]
 [0. 0.]]


In [9]:
# Create an array of all ones
b = np.ones((4, 5))
print(b)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [10]:
# Create a constant array
c = np.full((2, 3), 7)
print(c)

[[7 7 7]
 [7 7 7]]


In [11]:
# Create a 5x5 identity matrix
d = np.eye(5)
print(d)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


In [12]:
# Create an array filled with random values
rng = np.random.default_rng()  # Create a random number generator
e = rng.random((2, 3))
print(e)

[[0.48062712 0.33460006 0.06931556]
 [0.32320907 0.67773802 0.24421142]]
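
The random values above differ on every run. If reproducible results are needed, the generator can be given a seed. The sketch below assumes an arbitrary seed value of 42; any integer works.

```python
import numpy as np

# Two generators created with the same (arbitrary) seed
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)

x1 = rng1.random((2, 3))
x2 = rng2.random((2, 3))

# Same seed, same sequence of values
print(np.array_equal(x1, x2))
```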


### _Slicing_
Like Python _lists_, NumPy arrays can be sliced with index ranges. Because arrays can be multidimensional, we specify a slice for each dimension separately.

In [13]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Last column of a
print(a[:, 3])
# or
print(a[:, -1])

[ 4  8 12]
[ 4  8 12]


In [14]:
# First two rows and columns 1 and 2 of a
print(a[0:2, 1:3])

[[2 3]
 [6 7]]
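
A detail that often causes shape errors: an integer index removes a dimension, while a slice keeps it. The following sketch illustrates this on the same array, together with a slice that uses a step. The variable names are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_1d = a[1, :]         # integer index drops the dimension -> shape (4,)
row_2d = a[1:2, :]       # slice keeps the dimension -> shape (1, 4)
every_other = a[:, ::2]  # step of 2 along the columns -> columns 0 and 2

print(row_1d.shape, row_2d.shape)
print(every_other)
```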


:::{warning}
When we assign a _slice_ to another variable, we get a _view_ of the original array, not a copy. The view shares its memory with the original, so modifying the view also modifies the original array.
:::

In [15]:
# Assign the first 2 rows and columns 1 and 2 of a to b
b = a[:2, 1:3]
print(f"a: {a}")
print(f"b: {b}")


a: [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
b: [[2 3]
 [6 7]]


In [16]:
# Now change an element of b
b[0, 0] = 777
print(f"a: {a}")
print(f"b: {b}")


a: [[  1 777   3   4]
 [  5   6   7   8]
 [  9  10  11  12]]
b: [[777   3]
 [  6   7]]


If this behavior is not desired, we have to take an explicit copy of the slice.

In [17]:
c = a[:2, 1:3].copy()  # Explicit copy
print(f"a: {a}")
print(f"c: {c}")

a: [[  1 777   3   4]
 [  5   6   7   8]
 [  9  10  11  12]]
c: [[777   3]
 [  6   7]]


In [18]:
c[0, 0] = 999
print(f"a: {a}")  # Verify that a is unchanged
print(f"c: {c}")

a: [[  1 777   3   4]
 [  5   6   7   8]
 [  9  10  11  12]]
c: [[999   3]
 [  6   7]]
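
To check whether two arrays share memory without modifying them, `np.shares_memory` and the `.base` attribute can help. This is a small sketch; the names `view` and `copy` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]         # a slice: shares memory with a
copy = a[:2, 1:3].copy()  # an explicit copy: owns its own memory

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
print(view.base is a)             # True: the view refers back to a
print(copy.base is None)          # True: the copy owns its data
```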


### Boolean indexing

You can use _boolean_ values (`True` or `False` in Python) to select specific elements from an array. This is frequently used to filter arrays based on a condition.

In [19]:
a = np.array([[1, 2], [3, 4], [5, 6]])

bool_idx = a > 2  # Find the elements of a that are greater than 2

print(bool_idx)
print(a[bool_idx])
# or
print(a[a > 2])

[[False False]
 [ True  True]
 [ True  True]]
[3 4 5 6]
[3 4 5 6]


NumPy also provides functions that evaluate a condition element-wise, such as `np.isnan`.

In [20]:
a = np.array([[1, np.nan], [3, 4], [np.nan, 6]])
print(a)

[[ 1. nan]
 [ 3.  4.]
 [nan  6.]]


In [21]:
np.isnan(a)  # Find the nan elements of a

array([[False,  True],
       [False, False],
       [ True, False]])

In [22]:
# Keep only the elements that are not NaN
print(a[np.isnan(a) == False])
# Or, more idiomatically, negate the mask with ~
print(a[~np.isnan(a)])

[1. 3. 4. 6.]
[1. 3. 4. 6.]
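
A boolean mask can also appear on the left-hand side of an assignment. As an illustration, the sketch below replaces every NaN by the mean of the valid values. Filling with the mean is just one possible choice, made here for the example.

```python
import numpy as np

a = np.array([[1, np.nan], [3, 4], [np.nan, 6]])

mask = np.isnan(a)            # True where a value is missing
valid_mean = a[~mask].mean()  # mean of 1, 3, 4 and 6

filled = a.copy()             # leave the original array untouched
filled[mask] = valid_mean     # assign through the mask

print(valid_mean)
print(filled)
```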


### Data types (_dtype_)
NumPy tries to infer the data type from the input, but you can also pass an explicit type or _cast_ an existing array.

In [23]:
a = np.array(["fruit", "meat", "vegetable", "dairy"])
print(a)
print(a.dtype)

['fruit' 'meat' 'vegetable' 'dairy']
<U9


In [24]:
a.astype("float64")

ValueError: could not convert string to float: np.str_('fruit')

In [25]:
a = np.arange(10)
print(a)
print(a.dtype)

[0 1 2 3 4 5 6 7 8 9]
int64


In [26]:
print(a.astype("float32"))
print(a.astype("float32").dtype)

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
float32


In [27]:
b = np.array([1.5, 2.5, 3.5], dtype=np.int32)
print(b)
print(b.dtype)

[1 2 3]
int32
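
Note that casting floats to integers drops the fractional part instead of rounding. If rounding is what you want, one option is to round first and cast afterwards, as in this sketch (the example values are chosen for illustration):

```python
import numpy as np

x = np.array([1.5, 2.5, -1.7])

# astype truncates toward zero
truncated = x.astype(np.int32)

# np.rint rounds to the nearest integer (halves go to the nearest even value)
rounded = np.rint(x).astype(np.int32)

print(truncated)
print(rounded)
```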


### Element-wise operations

In [28]:
a = np.array([[1, 2], [3, 4]], dtype=np.float64)
print(f"a: {a}\n")
print(f"a * 3: {a * 3}")

a: [[1. 2.]
 [3. 4.]]

a * 3: [[ 3.  6.]
 [ 9. 12.]]


In [29]:
print(f"a - 3: {a - 3}")

a - 3: [[-2. -1.]
 [ 0.  1.]]


In [30]:
b = np.array([[5, 6], [7, 8]], dtype=np.float64)
print(f"b: {b}\n")
print(f"a + b: {a + b}")

b: [[5. 6.]
 [7. 8.]]

a + b: [[ 6.  8.]
 [10. 12.]]


In [31]:
print(f"a: {a}\n")
print(f"b: {b}\n")
print(f"a*b: {a * b}")  # Element-wise or "Hadamard" product

a: [[1. 2.]
 [3. 4.]]

b: [[5. 6.]
 [7. 8.]]

a*b: [[ 5. 12.]
 [21. 32.]]
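
The other arithmetic operators work element-wise in the same way, and each operator has a function counterpart. A short sketch using the same two arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]], dtype=np.float64)
b = np.array([[5, 6], [7, 8]], dtype=np.float64)

quotient = a / b    # element-wise division
squared = a ** 2    # element-wise power
roots = np.sqrt(a)  # many NumPy functions are applied element-wise too

# The operators are shorthand for the corresponding functions
print(np.array_equal(a + b, np.add(a, b)))
print(np.array_equal(a * b, np.multiply(a, b)))
```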


### Dot product
:::{warning}
The dot product of two arrays is not simply `a*b` (which gives the Hadamard product). For the dot product or matrix multiplication, use `a.dot(b)` or `a @ b`.
:::

In [32]:
a = np.array([9, 10])
b = np.array([11, 12])

print(f"a: {a}\n")
print(f"b: {b}\n")
print(f"aTb: {a.dot(b)}")

a: [ 9 10]

b: [11 12]

aTb: 219


In [33]:
print(f"aTb: {a[0] * b[0] + a[1] * b[1]}")

aTb: 219


In [34]:
# Alternative syntax for matrix multiplication
print(f"aTb: {a @ b}")

aTb: 219


In [35]:
a = np.array([[1, 2], [3, 4], [5, 7]])
b = np.array([[5, 6], [7, 8], [10, 11]])

print(f"a: {a}\n")
print(f"b: {b}\n")
print(f"aTb: {a @ b}")

a: [[1 2]
 [3 4]
 [5 7]]

b: [[ 5  6]
 [ 7  8]
 [10 11]]



ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)

In [36]:
print(f"aTb: {a.T @ b}")

aTb: [[ 76  85]
 [108 121]]
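
The error above comes from the shape rule for matrix multiplication: an `(n, k)` array can only be multiplied by a `(k, m)` array, and the result has shape `(n, m)`. The sketch below checks this for the same two arrays; the names `gram` and `outer` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 7]])    # shape (3, 2)
b = np.array([[5, 6], [7, 8], [10, 11]])  # shape (3, 2)

# a @ b fails because the inner dimensions (2 and 3) differ.
gram = a.T @ b   # (2, 3) @ (3, 2) -> (2, 2)
outer = a @ b.T  # (3, 2) @ (2, 3) -> (3, 3)

print(gram.shape, outer.shape)
```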


## Linear algebra with NumPy

NumPy offers powerful functions for linear algebra operations that are essential for machine learning. Here are some of the most important ones.

### Vector and matrix norms

The norm of a vector or matrix is a measure of its "size". The most commonly used norms are:

In [37]:
# Vector norms
v = np.array([3, 4])
print(f"Vector: {v}")
print(f"L2 norm (Euclidean): {np.linalg.norm(v)}")
print(f"L1 norm (Manhattan): {np.linalg.norm(v, ord=1)}")
print(f"L∞ norm (Maximum): {np.linalg.norm(v, ord=np.inf)}")

# Matrix norms
A = np.array([[1, 2], [3, 4]])
print(f"\nMatrix:\n{A}")
print(f"Frobenius norm: {np.linalg.norm(A, 'fro')}")

Vector: [3 4]
L2 norm (Euclidean): 5.0
L1 norm (Manhattan): 7.0
L∞ norm (Maximum): 4.0

Matrix:
[[1 2]
 [3 4]]
Frobenius norm: 5.477225575051661
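
To make the definitions concrete, the same norms can be computed by hand from their formulas. This is only a sketch for illustration; in practice `np.linalg.norm` is the more convenient choice.

```python
import numpy as np

v = np.array([3, 4])
A = np.array([[1, 2], [3, 4]])

l2 = np.sqrt(np.sum(v ** 2))   # square root of the sum of squares
l1 = np.sum(np.abs(v))         # sum of absolute values
linf = np.max(np.abs(v))       # largest absolute value
fro = np.sqrt(np.sum(A ** 2))  # L2 norm over all matrix entries

print(np.isclose(l2, np.linalg.norm(v)))
print(np.isclose(fro, np.linalg.norm(A, "fro")))
```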


### Determinant

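The determinant of a matrix tells us about the linear transformation the matrix represents: its absolute value is the factor by which the transformation scales areas (or volumes), and a determinant of 0 means the matrix is singular.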

In [38]:
A = np.array([[1, 2], [3, 4]])
det_A = np.linalg.det(A)
print(f"Matrix A:\n{A}")
print(f"Determinant: {det_A}")

# A matrix with determinant 0 is singular (not invertible)
B = np.array([[1, 2], [2, 4]])
det_B = np.linalg.det(B)
print(f"\nMatrix B:\n{B}")
print(f"Determinant: {det_B} (singular!)")

Matrix A:
[[1 2]
 [3 4]]
Determinant: -2.0000000000000004

Matrix B:
[[1 2]
 [2 4]]
Determinant: 0.0 (singular!)
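
The result `-2.0000000000000004` shows that `np.linalg.det` works in floating point. For a 2×2 matrix `[[a, b], [c, d]]` the determinant is `a*d - b*c`, so the value can be checked by hand. The sketch below also compares against zero with `np.isclose` rather than `==`, which is a cautious habit and not something NumPy requires.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[1, 2], [2, 4]])

# 2x2 formula: a*d - b*c = 1*4 - 2*3
det_manual = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
det_numpy = np.linalg.det(A)
print(np.isclose(det_numpy, det_manual))

# Compare with a tolerance instead of testing for exact equality with 0
is_singular = np.isclose(np.linalg.det(B), 0.0)
print(is_singular)
```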


### Matrix inverse

The inverse of a square matrix A is the matrix A⁻¹ such that A⁻¹A = AA⁻¹ = I (the identity matrix):

In [39]:
A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)

print(f"Matrix A:\n{A}")
print(f"\nInverse A⁻¹:\n{A_inv}")

# Verification: A × A⁻¹ should be the identity matrix
identity_check = A @ A_inv
print(f"\nA × A⁻¹ (should be identity):\n{identity_check}")
print(f"\nIdentity matrix:\n{np.eye(2)}")

Matrix A:
[[1 2]
 [3 4]]

Inverse A⁻¹:
[[-2.   1. ]
 [ 1.5 -0.5]]

A × A⁻¹ (should be identity):
[[1.0000000e+00 0.0000000e+00]
 [8.8817842e-16 1.0000000e+00]]

Identity matrix:
[[1. 0.]
 [0. 1.]]
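
Because of rounding errors such as the `8.8817842e-16` above, an exact comparison with the identity matrix can fail. A tolerance-based check with `np.allclose` is more robust. The sketch also shows what to expect for a singular matrix: `np.linalg.inv` raises `np.linalg.LinAlgError`. The flag `inverted` is only there for the illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)

# Both products should equal the identity, up to rounding
print(np.allclose(A @ A_inv, np.eye(2)))
print(np.allclose(A_inv @ A, np.eye(2)))

# A singular matrix has no inverse
B = np.array([[1, 2], [2, 4]])
try:
    np.linalg.inv(B)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False
print(inverted)
```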


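### Eigenvalues and eigenvectors

Eigenvalues and eigenvectors are fundamental in linear algebra and machine learning. For a square matrix A, an eigenvector v is a nonzero vector that A only scales: Av = λv, where the scalar λ is the corresponding eigenvalue.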

In [40]:
A = np.array([[4, 2], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

print(f"Matrix A:\n{A}")
print(f"\nEigenvalues: {eigenvalues}")
print(f"\nEigenvectors:\n{eigenvectors}")

# Verification: A × v = λ × v for each eigenvalue λ and eigenvector v
for i in range(len(eigenvalues)):
    lambda_i = eigenvalues[i]
    v_i = eigenvectors[:, i]  # Eigenvectors are stored as columns

    left_side = A @ v_i
    right_side = lambda_i * v_i

    print(f"\nEigenvalue {i + 1}: {lambda_i:.3f}")
    print(f"A × v = {left_side}")
    print(f"λ × v = {right_side}")
    print(f"Equal: {np.allclose(left_side, right_side)}")

Matrix A:
[[4 2]
 [1 3]]

Eigenvalues: [5. 2.]

Eigenvectors:
[[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]

Eigenvalue 1: 5.000
A × v = [4.47213595 2.23606798]
λ × v = [4.47213595 2.23606798]
Equal: True

Eigenvalue 2: 2.000
A × v = [-1.41421356  1.41421356]
λ × v = [-1.41421356  1.41421356]
Equal: True
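
When the eigenvectors are linearly independent, as they are for this matrix, A can be rebuilt from its eigendecomposition A = VΛV⁻¹, where V holds the eigenvectors as columns and Λ is the diagonal matrix of eigenvalues. A minimal sketch, with `Lambda` and `A_rebuilt` as illustrative names:

```python
import numpy as np

A = np.array([[4, 2], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# A = V @ Lambda @ V^-1
Lambda = np.diag(eigenvalues)
A_rebuilt = eigenvectors @ Lambda @ np.linalg.inv(eigenvectors)
print(np.allclose(A, A_rebuilt))

# Two handy sanity checks: sum of eigenvalues = trace, product = determinant
print(np.isclose(eigenvalues.sum(), np.trace(A)))
print(np.isclose(eigenvalues.prod(), np.linalg.det(A)))
```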


### Solving systems of linear equations

To solve linear systems of the form Ax = b, use `np.linalg.solve`:

In [41]:
# Suppose we have the system:
# 2x + 3y = 7
# x + 4y = 6

A = np.array([[2, 3], [1, 4]])
b = np.array([7, 6])

# Solve with np.linalg.solve
x = np.linalg.solve(A, b)
print(f"Coefficient matrix A:\n{A}")
print(f"Constant vector b: {b}")
print(f"Solution x: {x}")

# Verification
verification = A @ x
print(f"\nVerification A × x = {verification}")
print(f"Should equal b = {b}")
print(f"Correct: {np.allclose(verification, b)}")

Coefficient matrix A:
[[2 3]
 [1 4]]
Constant vector b: [7 6]
Solution x: [2. 1.]

Verification A × x = [7. 6.]
Should equal b = [7 6]
Correct: True
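
The same solution can be obtained as x = A⁻¹b, but `np.linalg.solve` avoids forming the inverse explicitly and is generally the preferred route. The sketch below compares both approaches and also solves for two right-hand sides at once. The second right-hand side `[5, 5]` is made up for the example.

```python
import numpy as np

A = np.array([[2, 3], [1, 4]])
b = np.array([7, 6])

x_solve = np.linalg.solve(A, b)
x_inv = np.linalg.inv(A) @ b  # works, but computes the full inverse first
print(np.allclose(x_solve, x_inv))

# Several right-hand sides can be passed as the columns of a matrix
B = np.array([[7, 5], [6, 5]])  # columns: [7, 6] and the made-up [5, 5]
X = np.linalg.solve(A, B)       # each column of X solves one system
print(X)
```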


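### Singular Value Decomposition (SVD)

SVD is a powerful matrix factorization technique that is widely used in machine learning. It factors any m × n matrix A as A = UΣVᵀ, where U and V are orthogonal and Σ is a diagonal matrix holding the non-negative singular values.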

In [42]:
# Create an example matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(f"Original matrix A ({A.shape}):\n{A}")

# SVD: A = U × Σ × V^T
U, s, Vt = np.linalg.svd(A)

print(f"\nU shape: {U.shape}")
print(f"s (singular values): {s}")
print(f"Vt shape: {Vt.shape}")

# Reconstruct the original matrix
# We have to turn s into a diagonal matrix of the right size
S = np.zeros((U.shape[1], Vt.shape[0]))
S[: len(s), : len(s)] = np.diag(s)

A_reconstructed = U @ S @ Vt
print(f"\nReconstructed matrix:\n{A_reconstructed}")
print(f"\nReconstruction accurate: {np.allclose(A, A_reconstructed)}")

Original matrix A ((4, 3)):
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

U shape: (4, 4)
s (singular values): [2.54624074e+01 1.29066168e+00 2.40694596e-15]
Vt shape: (3, 3)

Reconstructed matrix:
[[ 1.  2.  3.]
 [ 4.  5.  6.]
 [ 7.  8.  9.]
 [10. 11. 12.]]

Reconstruction accurate: True
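
Padding Σ with zeros can be avoided by asking for the reduced SVD with `full_matrices=False`. Keeping only the largest singular values then gives a low-rank approximation. The sketch below is one way to do this; the choice of `k` is arbitrary and only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

# Reduced SVD: U is (4, 3) instead of (4, 4), so no zero padding is needed
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(U.shape, s.shape, Vt.shape)

A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))

# Low-rank approximation: keep only the k largest singular values
k = 1
A_rank1 = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The third singular value is numerically zero, so rank 2 already recovers A
A_rank2 = U[:, :2] @ np.diag(s[:2]) @ Vt[:2, :]
print(np.allclose(A, A_rank2))
```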


### Matrix _rank_

The rank of a matrix is the number of linearly independent rows (or, equivalently, columns).

In [43]:
# Full rank matrix
A = np.array([[1, 2], [3, 4]])
rank_A = np.linalg.matrix_rank(A)
print(f"Matrix A:\n{A}")
print(f"Rank: {rank_A} (full rank for 2x2 matrix)")

# Rank deficient matrix
B = np.array([[1, 2], [2, 4]])
rank_B = np.linalg.matrix_rank(B)
print(f"\nMatrix B:\n{B}")
print(f"Rank: {rank_B} (rank deficient!)")

# Larger matrix
C = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rank_C = np.linalg.matrix_rank(C)
print(f"\nMatrix C:\n{C}")
print(f"Rank: {rank_C} (rank deficient for 3x3 matrix)")

Matrix A:
[[1 2]
 [3 4]]
Rank: 2 (full rank for 2x2 matrix)

Matrix B:
[[1 2]
 [2 4]]
Rank: 1 (rank deficient!)

Matrix C:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Rank: 2 (rank deficient for 3x3 matrix)
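
Numerically, the rank is the number of singular values that are clearly above zero. The sketch below counts them by hand. The tolerance is an assumption modelled on the default described in the `np.linalg.matrix_rank` documentation (largest singular value × largest dimension × machine epsilon); other thresholds are possible.

```python
import numpy as np

C = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Only the singular values are needed, not U and Vt
s = np.linalg.svd(C, compute_uv=False)

# Assumed tolerance, following the documented default of matrix_rank
tol = s.max() * max(C.shape) * np.finfo(s.dtype).eps
rank_from_svd = int(np.sum(s > tol))

print(s)
print(rank_from_svd)
```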


### Array _reshaping_ for linear algebra

We often need to reshape arrays before using them in matrix operations.

In [44]:
# 1D array to column vector
v = np.array([1, 2, 3, 4])
print(f"Original 1D array: {v} (shape: {v.shape})")

# Different ways to create a column vector
col_vector1 = v.reshape(-1, 1)
col_vector2 = v[:, np.newaxis]
col_vector3 = np.expand_dims(v, axis=1)

print(f"\nColumn vector (reshape): \n{col_vector1} (shape: {col_vector1.shape})")
print(f"\nColumn vector (newaxis): \n{col_vector2} (shape: {col_vector2.shape})")
print(f"\nColumn vector (expand_dims): \n{col_vector3} (shape: {col_vector3.shape})")

# Row vector
row_vector = v.reshape(1, -1)
print(f"\nRow vector: {row_vector} (shape: {row_vector.shape})")

# Flatten a matrix back to 1D
matrix = np.array([[1, 2], [3, 4]])
flattened = matrix.flatten()
print(f"\nMatrix:\n{matrix}")
print(f"Flattened: {flattened} (shape: {flattened.shape})")

Original 1D array: [1 2 3 4] (shape: (4,))

Column vector (reshape): 
[[1]
 [2]
 [3]
 [4]] (shape: (4, 1))

Column vector (newaxis): 
[[1]
 [2]
 [3]
 [4]] (shape: (4, 1))

Column vector (expand_dims): 
[[1]
 [2]
 [3]
 [4]] (shape: (4, 1))

Row vector: [[1 2 3 4]] (shape: (1, 4))

Matrix:
[[1 2]
 [3 4]]
Flattened: [1 2 3 4] (shape: (4,))
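
The distinction between row and column vectors matters as soon as they are multiplied: row × column gives a 1×1 inner product, while column × row gives a full outer-product matrix. A short sketch, with `inner` and `outer` as illustrative names:

```python
import numpy as np

v = np.array([1, 2, 3, 4])
col = v.reshape(-1, 1)  # shape (4, 1)
row = v.reshape(1, -1)  # shape (1, 4)

inner = row @ col  # (1, 4) @ (4, 1) -> (1, 1)
outer = col @ row  # (4, 1) @ (1, 4) -> (4, 4)

print(inner.shape, outer.shape)
print(np.array_equal(outer, np.outer(v, v)))
```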
