<div style="margin: 0 auto 30px; height: 60px; border: 2px solid gray; border-radius: 6px;">
  <div style="float: left;"><img src="img/epfl.png" /></div>
  <div style="float: right; margin: 20px 30px 0; font-size: 10pt; font-weight: bold;"><a href="https://moodle.epfl.ch/course/view.php?id=18253">COM202 - Signal Processing</a></div>
</div>
<div style="clear: both; font-size: 30pt; font-weight: bold; color: #483D8B;">
    Lab 2: Introduction to NumPy
</div>

In [1]:
# First things first, import numpy (as almost always)
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

NumPy is a solid library that provides many built-in functions specifically designed for vector and matrix operations. While this may seem a bit useless to you now (after having to compute all these matrices by hand without knowing why), it will quickly become one of the most useful tools in many applications throughout your studies. 

To get you started, this lab will focus on some simple linear algebra operations using NumPy and present a final application in image processing using the so-called Haar Basis.

# NumPy Arrays

There are two standard ways of creating NumPy arrays:
 * transform a standard Python list (or nested list) using ``np.array`` 
 * use NumPy's built-in functions

In [2]:
# Here we create an array from a nested Python list.
matr = np.array([[1,2,3],[4,5,6]])

# There exist various built-in functions to easily create different array-like objects:
vect = np.ones(5)

id_matrix = np.eye(5)

one_matrix = np.ones((5,3))

zero_matrices = np.zeros((2,3,3))

print("1D vector: \n{}\n".format(vect))
print("2D matrix: \n{}\n".format(matr))
print("Identity matrix: \n{}\n".format(id_matrix))
print("1 matrix: \n{}\n".format(one_matrix))
print("array of 0 matrix: \n{}\n".format(zero_matrices))

1D vector: 
[1. 1. 1. 1. 1.]

2D matrix: 
[[1 2 3]
 [4 5 6]]

Identity matrix: 
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

1 matrix: 
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

array of 0 matrix: 
[[[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]]



## Data types

Even though Python is not a statically typed language, you can still annotate a variable's type (the interpreter does not enforce the annotation) via the syntax 

``variable:type=value``

Note that all NumPy arrays always refer to an underlying data type; you can specify the type at creation time via the directive

``dtype=...``

or you can cast existing arrays using the ``.astype()`` method.

In [None]:
# Creating an array of integers
int_vect = np.array([1,2,3,4,5], dtype=np.int32)
print(int_vect.dtype)

# Cast int array to float array:
float_vect = int_vect.astype(np.float32)
print(float_vect.dtype)

# As we can see, multiplying two vectors with the standard * operator simply multiplies
# their components pairwise and outputs another vector. We will see why later on.
print(int_vect * float_vect)
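
As a small sketch of how data types interact (the variable names below are only illustrative): combining an `int32` array with a `float32` array promotes the result to a type that can represent both operands, and casting floats to integers drops the fractional part instead of rounding.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
floats = ints.astype(np.float32)

# Mixed-type arithmetic promotes to a dtype that can represent both operands.
product = ints * floats
print(product.dtype)  # float64

# Casting floats to integers truncates toward zero (1.7 -> 1, -1.7 -> -1).
truncated = np.array([1.7, -1.7, 2.2]).astype(np.int32)
print(truncated)
```

If rounding is what you want, apply `np.round` before the cast.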

The features of NumPy arrays can be inspected via a series of attributes such as ``size``, ``shape``, and ``ndim``. This is particularly useful when working with multi-dimensional vectors or matrices.  

In [None]:
matr = np.array([[1,2,3,4],[5,6,7,8]])
print("The number of dimensions of the matrix is: {}\n".format(matr.ndim))
print("The shape of the matrix is: {}\n".format(matr.shape))
print("The size of the matrix is: {}\n".format(matr.size))
print("The data type of the matrix is: {}\n".format(matr.dtype))

## Reshaping

The structure of NumPy arrays can be modified using the ``.reshape()`` method; note that the new shape must contain the same number of elements as the original shape. 

In [None]:
# Creating a matrix of sequential numbers:
matr = np.arange(10,20).reshape(5,2)
print("A matrix of size 5,2:\n{}\n".format(matr))

matr = matr.reshape(2,5)
print("A matrix of size 2,5:\n{}\n".format(matr))
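
The element-count constraint can be seen in a short sketch (the array `seq` is just an illustration): passing `-1` lets NumPy infer one dimension from the total size, while a shape with the wrong number of elements raises a `ValueError`.

```python
import numpy as np

seq = np.arange(12)

# -1 asks NumPy to infer that dimension from the total number of elements.
inferred = seq.reshape(3, -1)
print(inferred.shape)  # (3, 4)

# 5 * 3 = 15 elements do not match the 12 available, so this is rejected.
try:
    seq.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```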

## Accessing data with slicing

One elegant and easy way of accessing data in arrays is by means of slicing. Slices are specific subsets of the array data. 

The standard notation for a 1D array would be ``start:end:step`` where ``end`` is exclusive. If ``end`` is omitted, it defaults to the end of the array (inclusive).

In [None]:
# Create a 1D array of sequential numbers.
vect = np.arange(12)
print("The first five elements are: {}\n".format(vect[:5]))
print("The even elements of the array are: {}\n".format(vect[::2]))

# It is also useful when trying to change only specific parts of the array.
vect_1_added = vect.copy()
vect_1_added[::2] += 1
print(vect_1_added, "\n")

# Creating a matrix of size 11x11 filled with 0s.
matr = np.zeros((11,11), dtype=np.int32)

# Setting the column with index 5 (the middle column) to 1:
matr[:, 5] = 1
print("The matrix with 1s only on column 5:\n{}\n".format(matr))

# Setting the row with index 5 (the middle row) to 1:
matr[5, :] = 1
print("The matrix with 1s only on column 5 and row 5:\n{}\n".format(matr))
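
A few more slicing patterns, as a hedged sketch on a fresh array: a full ``start:end:step`` slice, negative steps and indices, and the fact that a basic slice is a *view* on the original data rather than a copy.

```python
import numpy as np

vect = np.arange(12)

print(vect[2:8:3])   # start=2, end=8 (exclusive), step=3 -> elements 2 and 5
print(vect[::-1])    # a negative step walks the array backwards
print(vect[-3:])     # negative indices count from the end -> elements 9, 10, 11

# A basic slice is a view: writing through it changes the original array.
head = vect[:3]
head[0] = 99
print(vect[0])       # 99

# Use .copy() when an independent array is needed.
tail = vect[-3:].copy()
tail[0] = -1
print(vect[-3])      # still 9
```

This is why the cell above calls `vect.copy()` before modifying `vect_1_added`.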

## Broadcasting

Last but not least, broadcasting is a built-in mechanism used by NumPy to apply transformations to arrays in an elegant and concise way. When applying operations to arrays of different shapes, NumPy attempts to extend ("broadcast") the smaller array so that it matches the shape of the larger one. 

For example, when adding a vector ``vect`` to a matrix ``M``, the broadcasting step will expand the vector into a properly sized matrix.

In [None]:
# Create a vector of length 5
vect = np.array([1,2,3,4,5])

# Matrix of 1s
matr = np.ones((4,5), dtype=np.int64)

print("The matrix: \n{}\n and the vector:{}\n".format(matr, vect))

res_1 = vect + matr 
print("Adding vect to matrix gives: \n{}\n".format(res_1))

# Broadcasting also works with higher-dimensional arrays, as long as the trailing dimensions are compatible:
matr_2 = np.ones((4,4,5), dtype=np.int64)
res_2 = matr_2 + vect

print("Adding vect to array of matrices gives: \n{}\n".format(res_2))

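Be aware that, for broadcasting to work, the shapes must be compatible: comparing them from the last dimension backwards, each pair of dimensions must either be equal or contain a 1. The following cell fails because the trailing dimensions are 6 and 5: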

In [None]:
# Create a vector of length 5
vect = np.array([1,2,3,4,5])

# Matrix of 1s of size 4x6
matr = np.ones((4,6), dtype=np.int64)
res_3 = matr + vect

More details on broadcasting are available [here](https://numpy.org/devdocs/user/basics.broadcasting.html).
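
As a sketch of the compatibility rule (the arrays `row` and `col` are illustrative names): a vector of length 6 is broadcast along the rows of a 4x6 matrix, a 4x1 column is broadcast along its columns, and a plain vector of length 4 is rejected because the trailing dimensions (6 and 4) differ.

```python
import numpy as np

matr = np.ones((4, 6), dtype=np.int64)
row = np.arange(6)                  # shape (6,):   added to every row
col = np.arange(4)[:, np.newaxis]   # shape (4, 1): added to every column

print((matr + row).shape)  # (4, 6)
print((matr + col).shape)  # (4, 6)

# Shapes (4, 6) and (4,) are incompatible: the trailing dimensions are 6 and 4.
try:
    matr + np.arange(4)
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```

Adding an axis with `np.newaxis` is a common way to make a vector line up with the dimension you intend.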

## Exercise

**Task 1**: Imagine you have a 4096-element vector representing pixels of an image. Reshape the vector into a square image.

In [None]:
img_ex = np.array(plt.imread('img/cameraman.jpg'), dtype=int)
vect = np.ravel(img_ex, "F")

img = ... #Your code

# You can use these to compare your result. 
plt.matshow(img)
plt.matshow(img_ex)

In [None]:
# About slicing :)
arr = np.array([[1,  2,  3,  4],
                [5,  6,  7,  8],
                [9, 10, 11, 12]])

Use slicing to extract the following subarrays:
- The second row of the array.
- The second column of the array.
- The subarray consisting of the first two rows and the first two columns.
- The subarray consisting of the last two rows and the last two columns.

Use slicing to modify the following subarrays:
- Replace the second row with the values [13, 14, 15, 16].
- Replace the third column with the values [17, 18, 19].
- Replace the subarray consisting of the first two rows and the first two columns with the values [[20, 21], [22, 23]].

Finally, extract the diagonal of the array, and use slicing to extract all the elements of the array except for the first row and the last column.

In [None]:
# Extract subarrays
print(...) # Second row
print(...) # Second column
print(...) # First two rows and columns
print(...) # Last two rows and columns

# Modify subarrays
arr[...] = ... # Replace second row
arr[...] = ... # Replace third column
arr[...] = ... # Replace first two rows and columns
print(arr)

# Extract diagonal
print(...)

# Extract all elements except first row and last column
print(...)

# Linear algebra with NumPy

In NumPy, most of the common linear algebra operations are already implemented. 
- inner product (aka dot product) of two vectors: ``np.vdot``
- matrix-vector or matrix-matrix standard multiplication: ``np.dot(M1,v1)`` or ``@`` 
- matrix inversion: ``np.linalg.inv``. (But avoid it for solving linear systems; prefer ``np.linalg.solve``.)
- matrix transposition: ``.T``. 
- vector L2 norm (for matrices, the default is the Frobenius norm): ``np.linalg.norm``.
- solving a system of linear equations: ``np.linalg.solve``.

You can find much more useful information [here](https://numpy.org/doc/stable/reference/routines.linalg.html).

In [None]:
# If we instantiate two vectors:
v_1 = np.random.rand(5)
v_2 = np.random.rand(5) * (-1)

v_dot = np.vdot(v_1, v_2)
print("v_1 is:{}\nv_2 is:{}".format(v_1, v_2))
print("Their dot product is: {}".format(v_dot))

# If we instantiate one matrix and one vector:
M_1 = np.arange(1, 7).reshape(3, 2)
v_1 = np.ones(2)
M_dot = np.dot(M_1, v_1)
print("The matrix vector product gives a vector of length 3: {}".format(M_dot))

# If we instantiate two matrices:
M_1 = np.array([[1, 2], [3, 4], [5, 6]])
M_2 = np.array([[1, 2, 3], [4, 5, 6]])
M_dot = np.dot(M_1, M_2)
print("The matrix product gives a matrix of size 3x3: \n{}".format(M_dot))
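
The remaining routines from the list can be sketched on a small system. The matrix `M` and right-hand side `rhs` below are hypothetical values chosen only for illustration; the point is that `np.linalg.solve` gives the same answer as multiplying by the explicit inverse, without having to form that inverse.

```python
import numpy as np

# Hypothetical 3x3 system M x = rhs, chosen only for illustration.
M = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
rhs = np.array([1.0, 2.0, 3.0])

x_solve = np.linalg.solve(M, rhs)   # preferred: no explicit inverse is formed
x_inv = np.linalg.inv(M) @ rhs      # same result here, but generally slower and less accurate

print(np.allclose(x_solve, x_inv))        # True
print(np.allclose(M @ x_solve, rhs))      # True: the solution satisfies the system
print(np.linalg.norm(M @ x_solve - rhs))  # residual norm, close to 0

# For real vectors, np.vdot, np.dot and @ all give the same inner product.
u = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])
print(np.vdot(u, w), np.dot(u, w), u @ w)
```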

## Exercises

### Linear system of equations

**Task 2**: Suppose we have the following system of linear equations. Solve this system.
- 2x + y = 10
- x - 3y = -4

In [None]:
# define the coefficient matrix A and the constants vector b
A = ... #Your code
b = ...

# solve the system of equations
x = ...

# print the solution
print("x =", x[0])
print("y =", x[1])

# verify the solution using the equations.
eq1 = ...
eq2 = ...

if np.isclose(eq1, 10) and np.isclose(eq2, -4):
    print("The solution is verified.")
else:
    print("The solution is not verified.")

### Matrix-vector operations

Let $A$ be the matrix: $$A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & 1 \\ 0 & 2 & 4 \end{bmatrix}$$
and $v, w$ the vectors: $$v = \begin{bmatrix} 1 & 2 & 4 \end{bmatrix}, \qquad w = \begin{bmatrix} 2 & 4 & 6 \end{bmatrix}$$

**Task 3**: Complete the code below. 

In [None]:
# define the matrix A and the vectors v and w
A = ... #Your code
v = ...
w = ...

# compute the product of matrix A and vector v
Av = ...
print("A * v =", Av)

# compute the product of vector w and matrix A
wA = ...
print("w * A =", wA)

# compute the dot product of vectors v and w
vw = ...
print("v * w =", vw)

# compute the transpose of matrix A
At = ...
print("A transpose =", At)

## Change of basis

Suppose we have the representation of a vector in a given basis, and we want to get its representation in another basis. We can do this by using the so-called _change-of-basis matrix_. In this exercise, we'll practice computing change-of-basis matrices and applying them to vectors in NumPy.

Create two 2x2 NumPy arrays to represent each basis. Each column of these matrices represents a basis vector.

**Task 4**: Write a function called ``change_of_basis_matrix()`` that takes two bases as input and returns the change-of-basis matrix from the first basis to the second.

In [5]:
def change_of_basis_matrix(basis_1: np.ndarray, basis_2: np.ndarray) -> np.ndarray:
    ... #Your code
    return np.array([[5, 4],[-4, -3]])

In [6]:
# Test your implementation of change_of_basis_matrix():
basis_1 = np.array([[1, 2], [3, 4]])
basis_2 = np.array([[5, 6], [7, 8]])

A = change_of_basis_matrix(basis_1, basis_2)

assert np.allclose(A, np.array([[5, 4],[-4, -3]]))

[[ 5  4]
 [-4 -3]]


In [None]:
# Test your implementation of the change_of_basis_matrix(): 
# create a vector using basis_1, and verify its representation in basis_2 is consistent.
vect = np.array([4,1]) # vector representation in basis_1
vect_2 = ... #Your code

assert np.allclose(basis_1 @ vect, basis_2 @ vect_2)

# Basis vectors for telecommunications

The communication channel between a sender and receiver almost always perturbs the sender's message, either due to noise or other channel effects such as distortion, temporal compression, etc.
To mitigate these effects, one possibility is to transmit *coded* messages in place of raw messages by changing the *basis* used to transmit the signal.

As an example, consider sending the binary message $m = \{0, 1, 1, 0\}$ over a discrete channel such that each bit may be randomly flipped with probability $\epsilon$. If, instead of sending the raw message $m$, we send the repeated message $m' = \{0, 0, 1, 1, 1, 1, 0, 0\}$ obtained by doubling each symbol, then the probability of correctly reconstructing the sender's message $m$ is much higher.

What happened above is a change of the message basis from the canonical basis $B_{1} = Id$ to the *repetition basis*
$$
B_{2} = 
\left[
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1
\end{array}
\right].
$$
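
The coding step can be sketched in NumPy. Building $B_2$ with `np.repeat` and decoding by averaging each pair of symbols are choices made for this illustration, and the decoding shown only covers the error-free case.

```python
import numpy as np

m = np.array([0, 1, 1, 0])

# Repetition matrix: each row of the 4x4 identity is duplicated, giving an 8x4 matrix.
B2 = np.repeat(np.eye(4, dtype=int), 2, axis=0)

# Coding the message is a matrix-vector product.
coded = B2 @ m
print(coded)    # [0 0 1 1 1 1 0 0]

# Without channel errors, each symbol is recovered by averaging its two copies
# (the columns of B2 are orthogonal, each with squared norm 2).
decoded = (B2.T @ coded) // 2
print(decoded)  # [0 1 1 0]
```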


## The search for a good basis

What we are after is a basis for the space of images that fulfills a set of prerequisites:

 * We want the basis to be orthonormal (or at least orthogonal) so that basis decomposition is obtained simply via a series of inner products.
 * We want the basis to be able to represent the signal information robustly so that a few coefficients capture most of the signal. This will ensure resilience against data loss.
 * We want the basis to be easy to compute.
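
To make the first requirement concrete, here is a minimal sketch with a hypothetical orthonormal basis of $\mathbb{R}^2$ (the matrix `Q` and the vector `x` are illustrative): each coefficient is a single inner product, and the signal is rebuilt as the weighted sum of the basis vectors.

```python
import numpy as np

# Hypothetical orthonormal basis of R^2, stored as the rows of Q.
Q = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)

x = np.array([3.0, 1.0])

# Each coefficient is the inner product of x with one basis vector.
coeffs = np.array([np.vdot(q, x) for q in Q])

# The signal is the sum of the basis vectors weighted by their coefficients.
x_rebuilt = sum(c * q for c, q in zip(coeffs, Q))

print(coeffs)     # approximately [2.828 1.414]
print(x_rebuilt)  # approximately [3. 1.]
```

With a basis that is not orthonormal, finding the coefficients would instead require solving a linear system.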

One such basis is the **Haar basis**. While we cannot go into too many details in this notebook, curious people can read further on the topic [here](https://chengtsolin.wordpress.com/2015/04/15/real-time-2d-discrete-wavelet-transform-using-opengl-compute-shader/). Mathematical formulas aside, the Haar basis works by encoding the information in a *hierarchical* way: the first basis vectors encode the broad information and the higher coefficients encode the details.

First of all, to keep things simple, we will remain in the space of square matrices whose size is a power of two. The code to generate the Haar basis matrices is given below. (It is not required to understand the underlying computation for the rest of this section).

**Task 5**: As a simple first task, fill in the function `check_orthogonality` that verifies if a matrix is orthogonal. 

In [None]:
# Create a Haar matrix of size NxN; N must be a power of 2.
def haarmatrix(N):
    n = int(np.log2(N))

    z = np.resize(1.*np.arange(N)/N, (len(1.*np.arange(N)), len(1.*np.arange(N)/N)))
    k = np.transpose(np.resize(1.*np.arange(N), (len(1.*np.arange(N)/N), len(1.*np.arange(N)))))
    
    p  = np.floor(np.log(np.maximum(1,k))/np.log(2))
    q  = k - (2**p) + 1
    z1 = (q-1)/(2**p)
    z2 = (q-0.5)/(2**p)
    z3 = q/(2**p)
    A  = (1/np.sqrt(N)) * ((( 2**(p/2.)) * ((z >= z1) & (z < z2))) + ((-2**(p/2.)) * ((z >= z2) & (z < z3))))
    A[0,:] = 1/np.sqrt(N)
    return A

# Check if the matrix is orthogonal.
def check_orthogonality(matrix, size):
    ...  #Your code
    print("Orthogonal!")

## Signal decomposition in the Haar Basis

Let's first see if we can correctly encode a basic signal in the Haar basis and reconstruct it afterwards. Keep in mind that the compressed signal won't *look* like the original since information is stored differently!

First we create a box signal and the Haar matrix corresponding to the Haar basis for the right dimension. The code to generate a sinusoid is also given (commented out), so you can try it yourself.

In [None]:
# Define a signal to be compressed
fs = 64 # Sampling frequency
t = np.arange(0, 1, 1/fs) # Time vector
# x = np.sin(2*np.pi*5*t) # Sine wave with frequency 5 Hz

# Box signal
box_start = 2/5
box_end = 3/5
x = np.zeros_like(t)
x[(t >= box_start) & (t <= box_end)] = 1

# Compute the Haar matrix of order N
N = len(x)
H = haarmatrix(N)

# Check if the matrix is orthogonal
check_orthogonality(H, N)

plt.figure(figsize=(10, 6))
plt.subplot(2, 1, 1)
plt.plot(t, x)
plt.title('Original signal')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.subplot(2, 1, 2)
plt.plot(t, H[1])
plt.title('One basis vector of the Haar basis')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()

We can see that the signal is the box function. You can play with the second plot, `plt.plot(t, H[i])`, where `i` is the index of the basis vector.

Next we actually change the basis. 

**Task 6**: Compute the Haar coefficients of the signal, or in other words the representation of your signal in the Haar basis.

*Hint*: Recall what you saw in section 2.2 and what the Haar matrix is in essence.

In [None]:
c = ... #Your code

As explained earlier, one key feature of the Haar basis is that it encodes information in a *hierarchical* way. So it would be natural to use it either for compression or for transmitting a signal with a high error probability. We first consider compression. The way to do so is by sorting the coefficients in decreasing order of magnitude and keeping only the `k` largest ones as our signal. Setting everything else to 0 gives the last plot below. We can see that almost all the information is captured by half of the coefficients.

In [None]:
# Sort the Haar coefficients in decreasing order of magnitude and keep only the top k coefficients
k = int(N/2)

idx = np.argsort(np.abs(c))[::-1][:k]
c_compress = np.zeros_like(c)
c_compress[idx] = c[idx]
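
The selection step is easier to follow on a toy vector. The values below are made up for illustration and are not real Haar coefficients; the names `coeffs`, `keep` and `top` are likewise hypothetical.

```python
import numpy as np

coeffs = np.array([0.1, -3.0, 2.0, 0.5, -0.2])  # toy values, not real Haar coefficients
keep = 2

# argsort sorts in ascending order, so reverse it to get the largest magnitudes first.
top = np.argsort(np.abs(coeffs))[::-1][:keep]

# Copy only the selected coefficients; everything else stays at 0.
compressed = np.zeros_like(coeffs)
compressed[top] = coeffs[top]

print(top)         # indices 1 and 2: the entries -3.0 and 2.0
print(compressed)  # only -3.0 and 2.0 survive
```

Note that the sign of each kept coefficient is preserved: only the ranking uses the absolute value.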

In [None]:
# Plot the original signal and the compressed signal
plt.figure(figsize=(10, 8))
plt.subplot(3, 1, 1)
plt.plot(t, x)
plt.title('Original signal')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.subplot(3, 1, 2)
plt.plot(t, c)
plt.title('Signal in Haar Basis')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.subplot(3, 1, 3)
plt.plot(t, c_compress)
plt.title('Compressed signal in Haar Basis')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()

Now imagine that we want to decompress the signal. We take the compressed signal and apply the formula $\hat s = H^{T}\hat c $. This gives the "decompressed" signal $\hat s$ which is almost correctly reconstructed.
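
A short derivation suggests why the reconstruction is only "almost" correct, and by how much. It assumes that $H$ is orthogonal ($H H^{T} = I$) and that the uncompressed coefficients $c$ reconstruct the signal exactly, $s = H^{T} c$; the index set $\mathrm{idx}$ denotes the kept coefficients, as in the code above.

$$
\| s - \hat s \|^2 = \| H^{T}(c - \hat c) \|^2 = (c - \hat c)^{T} H H^{T} (c - \hat c) = \| c - \hat c \|^2 = \sum_{i \notin \mathrm{idx}} c_i^2 .
$$

In other words, the squared reconstruction error equals the energy of the discarded coefficients, which is why dropping the smallest ones loses the least.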

**Task 7**: Compute the inverse Haar transform of the top k coefficients to obtain the decompressed signal.

In [None]:
# Reconstruct the original signal from the compressed signal and the Haar basis
c_reconstruct = np.zeros_like(c)
c_reconstruct[idx] = c_compress[idx]

x_decompressed = ... #Your code

# Plot the original signal, the compressed signal, and the reconstructed signal
plt.figure(figsize=(10, 6))
plt.subplot(3, 1, 1)
plt.plot(t, x)
plt.title('Original signal')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.subplot(3, 1, 2)
plt.plot(t, c_compress)
plt.title('Compressed signal in Haar Basis')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.subplot(3, 1, 3)
plt.plot(t, x_decompressed)
plt.title('Reconstructed signal')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.tight_layout()
plt.show()

You can play with the value of `k` to see at which point the compression factor starts to affect decompression! 

Try to redo these examples by uncommenting the sinusoid function (and commenting out the box function). This gives a nice graphical intuition of how hierarchically the information is encoded!