<a href="https://colab.research.google.com/github/flazman/SaturdaysAI_2022/blob/main/ML_BasicMaths_Exercise.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Machine Learning - Week 2 - Exercise**


# **Objective**

Machine Learning depends heavily on mathematics to achieve incredible results from huge amounts of data. Processing each data point sequentially is very slow and does not scale easily.

Matrix and tensor operations make it possible to operate on a complete dataset in a simple way while achieving great performance in both resource usage and time.

The main goal of this week is to learn how to apply this mathematics to real-life problems.


# **Dimensions**

Working with matrices and tensors is a way to see the world from another point of view: it is all about dimensions.



## **Scalar**

Scalars are single numbers and are an example of a 0th-order tensor.

In mathematics it is necessary to describe the set of values to which a scalar belongs. The notation $x  \in  \mathbb{R}$ states that the (lowercase) scalar value $x$ is an element of (or member of) the set of real-valued numbers, $\mathbb{R}$.

There are various sets of numbers of interest within Machine Learning. $\mathbb{N}$ represents the set of positive integers ($1, 2, 3, ...$). $\mathbb{Z}$ represents the integers, which include positive, negative and zero values. $\mathbb{Q}$ represents the set of *rational* numbers that may be expressed as a fraction of two integers.


## **Vector**

A vector is an ordered array of single numbers and is an example of a 1st-order tensor.

Vectors are members of objects known as vector spaces. A vector space can be thought of as the entire collection of *all* possible vectors of a particular length (or dimension). The three-dimensional real-valued vector space, denoted by $\mathbb{R}^{3}$ is often used to represent our real-world notion of three-dimensional space mathematically.

More formally a vector space is an $n$-dimensional Cartesian product of a set with itself, along with proper definitions on how to add vectors and multiply them with scalar values. If all of the scalars in a vector are real-valued then the notation $\boldsymbol{x} \in \mathbb{R}^{n}$ states that the (boldface lowercase) vector value $\boldsymbol{x}$ is a member of the $n$-dimensional vector space of real numbers, $\mathbb{R}^{n}$.

Sometimes it is necessary to identify the components of a vector explicitly. The $i$th scalar element of a vector is written as ${x_{i}}$. Notice that this is non-bold lowercase since the element is a scalar. An $n$-dimensional vector itself can be explicitly written using the following notation:

$$\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

Given that scalars exist to represent values why are vectors necessary? One of the primary use cases for vectors is to represent physical quantities that have both a magnitude and a direction. Scalars are only capable of representing magnitudes.

For instance scalars and vectors encode the difference between the speed of a car and its velocity. The velocity contains not only its speed but also its direction of travel. It is not difficult to imagine many more physical quantities that possess similar characteristics such as gravitational and electromagnetic forces or wind velocity.

In Machine Learning vectors often represent feature vectors, with their individual components specifying how important a particular feature is. Such features could include relative importance of words in a text document, the intensity of a set of pixels in a two-dimensional image or historical price values for a cross-section of financial instruments.
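
As a small illustration of the speed and velocity distinction described above, the sketch below stores a car's velocity as a NumPy vector and recovers the speed as its magnitude. The numbers and variable names are made up for the example.

```python
import numpy as np

# Hypothetical velocity of a car in m/s: 3 towards the east, 4 towards the north
velocity = np.array([3.0, 4.0])

# Speed is the magnitude (Euclidean norm) of the velocity vector
speed = np.linalg.norm(velocity)

# Direction is the unit vector pointing the same way as the velocity
direction = velocity / speed

print("speed:", speed)          # 5.0
print("direction:", direction)  # [0.6 0.8]
```

The scalar `speed` alone cannot say where the car is heading; the vector carries both pieces of information.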


## **Matrix**

A matrix is a rectangular array of numbers and is an example of a 2nd-order tensor.

If $m$ and $n$ are positive integers, that is $m,n \in \mathbb{N}$ then the $m \times n$ matrix contains $mn$ numbers, with $m$ rows and $n$ columns.

If all of the scalars in a matrix are real-valued then a matrix is denoted with uppercase boldface letters, such as $\boldsymbol{A} \in \mathbb{R}^{m \times n}$. That is, the matrix lives in an $m \times n$-dimensional real-valued vector space. Hence matrices are really vectors that are just written in a two-dimensional table-like manner.

Its components are now identified by two indices $i$ and $j$: $i$ is the index of the matrix row, while $j$ is the index of the matrix column. Each component of $\boldsymbol{A}$ is identified by $a_{ij}$.

The full $m \times n$ matrix can be written as:

$$
\boldsymbol{A} = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\ 
a_{21} & a_{22} & a_{23} & \cdots & a_{2n}\\ 
a_{31} & a_{32} & a_{33} & \cdots & a_{3n}\\ 
\vdots & \vdots & \vdots & \ddots  & \vdots\\ 
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}\\ 
\end{bmatrix}
$$

It is often useful to abbreviate the full matrix component display into the following expression:

$$
\boldsymbol{A} = \left [ a_{ij} \right ]_{m \times n}
$$

Where $a_{ij}$ is referred to as the $(i,j)$-element of the matrix $\boldsymbol{A}$. The subscript of $m \times n$ can be dropped if the dimension of the matrix is clear from the context.
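
The $(i,j)$ notation maps directly onto array indexing, with one caveat: mathematical indices start at 1 while NumPy indices start at 0. The following sketch, using a made-up $2 \times 3$ matrix, shows the correspondence.

```python
import numpy as np

# A 2 x 3 matrix (m = 2 rows, n = 3 columns)
A = np.array([[11, 12, 13],
              [21, 22, 23]])

m, n = A.shape

# Mathematical notation is 1-based, NumPy indexing is 0-based:
# the element a_{ij} is stored at A[i - 1, j - 1]
a_12 = A[0, 1]
a_23 = A[1, 2]

print(m, n)   # 2 3
print(a_12)   # 12
print(a_23)   # 23
```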

Note that a column vector is a size $m \times 1$ matrix, since it has $m$ rows and 1 column. Unless otherwise specified all vectors will be considered to be column vectors.
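
In code the difference between a plain vector and a column vector is visible in the shape. A minimal NumPy sketch, with an arbitrary example vector:

```python
import numpy as np

# A 1-D NumPy array has shape (3,): it is neither a row nor a column
x = np.array([1, 2, 3])

# Reshape it into an explicit 3 x 1 column vector (a matrix with one column)
x_col = x.reshape(-1, 1)

print(x.shape)      # (3,)
print(x_col.shape)  # (3, 1)
```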

Matrices represent a type of function known as a linear map. It is possible to define multiplication operations between matrices or between matrices and vectors. Such operations are immensely important across the physical sciences, quantitative finance, computer science and Machine Learning.

Matrices can encode geometric operations such as rotation, reflection and transformation. Thus if a collection of vectors represents the vertices of a three-dimensional geometric model in Computer Aided Design software then multiplying these vectors individually by a pre-defined rotation matrix will output new vectors that represent the locations of the rotated vertices. This is the basis of modern 3D computer graphics.

In Artificial Neural Networks weights are stored as matrices, while feature inputs are stored as vectors. Formulating the problem in terms of linear algebra allows compact handling of these computations. By casting the problem in terms of tensors and utilising the machinery of linear algebra, rapid training times on modern GPU hardware can be obtained.
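
To make the geometric idea concrete, here is a minimal sketch of a rotation applied to a set of vertices. It is a two-dimensional simplification of the three-dimensional case described above, and the triangle vertices are hypothetical. The vertices are stored as the columns of one matrix so that a single product rotates all of them.

```python
import numpy as np

# Rotation by 90 degrees counter-clockwise in the plane
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Hypothetical vertices of a triangle, one vertex per column
vertices = np.array([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0]])

# Multiplying by R rotates every vertex at once
rotated = R @ vertices

# Round away tiny floating-point errors before printing
print(np.round(rotated, 6))
```

The vertex $(1, 0)$ ends up at $(0, 1)$, $(0, 1)$ at $(-1, 0)$ and $(1, 1)$ at $(-1, 1)$, up to floating-point rounding.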


## **Tensor**

The more general entity of a tensor encapsulates the scalar, vector and the matrix. It is sometimes necessary, both in the physical sciences and Machine  Learning, to make use of tensors with order that exceeds two.

A tensor is a container which can house data in $n$ dimensions, along with its linear operations.

In theoretical physics, and general relativity in particular, the Riemann curvature tensor is a 4th-order tensor that describes the local curvature of spacetime. In Machine Learning, a 3rd-order tensor can be used to describe the intensity values of multiple channels (red, green and blue) from a two-dimensional image.

Tensors will be identified via the boldface sans-serif notation, $\mathsf{A}$. For a 3rd-order tensor elements will be given by $a_{ijk}$, whereas for a 4th-order tensor elements will be given by $a_{ijkl}$.

Mathematically speaking, tensors are more than simply a data container, however. Aside from holding numeric data, tensors also include descriptions of the valid linear transformations between tensors. Examples of such transformations, or relations, include the cross product and the dot product.
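
As an illustration of a 3rd-order tensor used purely as a data container, the sketch below builds a tiny hypothetical $2 \times 2$ image with three colour channels and reads individual elements $a_{ijk}$.

```python
import numpy as np

# Hypothetical 2 x 2 image with 3 colour channels (red, green, blue)
image = np.zeros((2, 2, 3), dtype=np.uint8)

# Set the top-left pixel to pure red and the bottom-right pixel to white
image[0, 0] = [255, 0, 0]
image[1, 1] = [255, 255, 255]

print(image.ndim)      # 3
print(image.shape)     # (2, 2, 3)

# Element a_{ijk}: row i, column j, channel k (0-based in NumPy)
print(image[0, 0, 0])  # 255
print(image[0, 0, 1])  # 0
```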



||![](https://www.kdnuggets.com/wp-content/uploads/scalar-vector-matrix-tensor.jpg)||
|---|---|---|


# **Working with Numpy and Tensorflow**


## **Numpy**

NumPy is the core library for scientific computing in Python. Data manipulation in Python is nearly synonymous with NumPy array manipulation: the library provides a high-performance multidimensional array object and tools for working with these arrays.

A numpy array, also known as n-dimensional array `ndarray`, is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array, and the shape of an array is a tuple of integers giving the size of the array along each dimension.

The first step is to import the Numpy library.


In [None]:
# Import Numpy

import numpy as np


A NumPy array can be created from a number or from nested Python lists.

In [None]:
# Numpy: creating tensors

np_scalar = np.array(27)

np_vector = np.array([1, 2.0, 3, 4, 5])  # The "2.0" forces the array to be float

np_matrix = np.array([[1, 4, 7],
                      [2, 5, 8],
                      [3, 6, 9]])

np_tensor = np.array([[[1, 4, 7],
                       [2, 5, 8],
                       [3, 6, 9]],
                      [[10, 40, 70],
                       [20, 50, 80],
                       [30, 60, 90]],
                      [[100, 400, 700],
                       [200, 500, 800],
                       [300, 600, 900]]])


The array object has several attributes that give more information about the data it holds.

Some of these attributes are:
- ndim: the number of dimensions
- shape: the size of each dimension
- size: the total size of the array
- dtype: the data type of the array

Applying these attributes to the arrays created before shows the differences between them:


In [None]:
# Scalar
print("SCALAR")
print("ndim: ", np_scalar.ndim)
print("shape:", np_scalar.shape)
print("size: ", np_scalar.size)
print("dtype:", np_scalar.dtype)
print("")

# Vector
print("VECTOR")
print("ndim: ", np_vector.ndim)
print("shape:", np_vector.shape)
print("size: ", np_vector.size)
print("dtype:", np_vector.dtype)
print("")

# Matrix
print("MATRIX")
print("ndim: ", np_matrix.ndim)
print("shape:", np_matrix.shape)
print("size: ", np_matrix.size)
print("dtype:", np_matrix.dtype)
print("")

# Tensor
print("TENSOR")
print("ndim: ", np_tensor.ndim)
print("shape:", np_tensor.shape)
print("size: ", np_tensor.size)
print("dtype:", np_tensor.dtype)
print("")



As stated before, the number of dimensions `ndim` shows the type of tensor it is.

All of these objects are tensors with a different number of dimensions. The first three, with 0 to 2 dimensions, have special names: scalar, vector and matrix.

NumPy provides many functions to create arrays, which is very useful when working with big arrays:

- Array with all values equal to 0
- Array with all values equal to 1
- Array with all values equal to a given number
- Identity matrix
- Array with random numbers



In [None]:
# Numpy provides many functions to create arrays:

print("Array with 0s")
np_a = np.zeros((2,2))    # Create an array of all zeros
print(np_a)               # Prints "[[ 0.  0.]
                          #          [ 0.  0.]]"
print("")

print("Array with 1s")
np_b = np.ones((1,2))     # Create an array of all ones
print(np_b)               # Prints "[[ 1.  1.]]"
print("")

print("Array with a value")
np_c = np.full((2,2), 7)  # Create a constant array
print(np_c)               # Prints "[[7 7]
                          #          [7 7]]" (integers, because 7 is an int)
print("")

print("Identity Matrix")
np_d = np.eye(2)          # Create a 2x2 identity matrix
print(np_d)               # Prints "[[ 1.  0.]
                          #          [ 0.  1.]]"
print("")

print("Array with random numbers")
np.random.seed(0)
np_e = np.random.random((2,2))  # Create an array filled with random values
print(np_e)                     # Might print "[[0.5488135  0.71518937]
                                #               [0.60276338 0.54488318]]"
print("")


## **TensorFlow**

Similar to NumPy `ndarray` objects, `tf.Tensor` objects have a data type and a shape. Additionally, `tf.Tensor`s can reside in accelerator memory (like a GPU). TensorFlow offers a rich library of operations (`tf.add`, `tf.matmul`, `tf.linalg.inv`, etc.) that consume and produce `tf.Tensor`s. These operations automatically convert native Python types.


### **NumPy Compatibility**

Converting between a TensorFlow `tf.Tensor` and a NumPy `ndarray` is easy:

- TensorFlow operations automatically convert NumPy ndarrays to Tensors.
- NumPy operations automatically convert Tensors to NumPy ndarrays.

Tensors are explicitly converted to NumPy ndarrays using their `.numpy()` method. These conversions are typically cheap since the array and `tf.Tensor` share the underlying memory representation, if possible. However, sharing the underlying representation isn't always possible since the `tf.Tensor` may be hosted in GPU memory while NumPy arrays are always backed by host memory, and the conversion involves a copy from GPU to host memory.
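
The conversions described above can be sketched as follows. This assumes TensorFlow 2.x running in its default eager mode; the variable names are only illustrative.

```python
import numpy as np
import tensorflow as tf

ndarray = np.ones((2, 2))

# A TensorFlow operation accepts the ndarray and returns a tf.Tensor
tensor = tf.multiply(ndarray, 3)

# A NumPy operation accepts the tf.Tensor and returns an ndarray
back = np.add(tensor, 1)

# Explicit conversion with the .numpy() method
explicit = tensor.numpy()

print(isinstance(tensor, tf.Tensor))  # True
print(type(back))                     # <class 'numpy.ndarray'>
print(type(explicit))                 # <class 'numpy.ndarray'>
```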


### **GPU acceleration**

Many TensorFlow operations are accelerated using the GPU for computation. Without any annotations, TensorFlow automatically decides whether to use the GPU or CPU for an operation, copying the tensor between CPU and GPU memory, if necessary. Tensors produced by an operation are typically backed by the memory of the device on which the operation executed.

This makes a project easier to scale, since the calculations run on the best-performing device available.
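
A short sketch of how device placement can be inspected, assuming TensorFlow 2.x. On a machine without a GPU the list of GPUs is simply empty and everything runs on the CPU.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see (an empty list means CPU only)
gpus = tf.config.list_physical_devices('GPU')
print("GPUs available:", len(gpus))

# Every tensor records the device whose memory backs it
x = tf.random.uniform((3, 3))
print("Tensor device:", x.device)

# Explicit placement on the CPU using a device context
with tf.device('/CPU:0'):
    y = tf.matmul(x, x)
print("Result device:", y.device)
```

Explicit placement is rarely needed in an exercise like this one; the automatic choice is usually sufficient.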


As before, the first step is to import the library, in this case TensorFlow.

In [None]:
# Import TensorFlow

import tensorflow as tf


Just like a NumPy ndarray, a TensorFlow tensor can be created from a number or from nested Python lists.

In [None]:
# TensorFlow: creating tensors

tf_scalar = tf.constant(27)

tf_vector = tf.constant([1, 2.0, 3, 4, 5])  # The "2.0" forces the tensor to be float

tf_matrix = tf.constant([[1, 4, 7],
                         [2, 5, 8],
                         [3, 6, 9]])

tf_tensor = tf.constant([[[1, 4, 7],
                          [2, 5, 8],
                          [3, 6, 9]],
                         [[10, 40, 70],
                          [20, 50, 80],
                          [30, 60, 90]],
                         [[100, 400, 700],
                          [200, 500, 800],
                          [300, 600, 900]]])


Fewer attributes are used here to inspect a TensorFlow tensor. In this case they are:

- shape: the size of each dimension
- dtype: the data type of the tensor


In [None]:
# Scalar
print("SCALAR")
print("shape:", tf_scalar.shape)
print("dtype:", tf_scalar.dtype)
print("")

# Vector
print("VECTOR")
print("shape:", tf_vector.shape)
print("dtype:", tf_vector.dtype)
print("")

# Matrix
print("MATRIX")
print("shape:", tf_matrix.shape)
print("dtype:", tf_matrix.dtype)
print("")

# Tensor
print("TENSOR")
print("shape:", tf_tensor.shape)
print("dtype:", tf_tensor.dtype)
print("")




TensorFlow also supports many functions to create tensors, like Numpy:

- Array with all values equal to 0
- Array with all values equal to 1
- Array with all values equal to a given number
- Identity matrix
- Array with random numbers


In [None]:
# TensorFlow provides many functions to create arrays:

print("Array with 0s")
tf_a = tf.zeros((2,2))    # Create an array of all zeros
print(tf_a)               # Prints "[[ 0.  0.]
                          #          [ 0.  0.]]"
print("")

print("Array with 1s")
tf_b = tf.ones((1,2))     # Create an array of all ones
print(tf_b)               # Prints "[[ 1.  1.]]"
print("")

print("Array with a value")
tf_c = tf.fill((2,2), 7)  # Create a constant array
print(tf_c)               # Prints "[[7 7]
                          #          [7 7]]" (integers, because 7 is an int)
print("")

print("Identity Matrix")
tf_d = tf.eye(2)          # Create a 2x2 identity matrix
print(tf_d)               # Prints "[[ 1.  0.]
                          #          [ 0.  1.]]"
print("")

print("Array with random numbers")
tf.random.set_seed(0)
tf_e = tf.random.uniform((2,2)) # Create an array filled with random values
print(tf_e)                     # Might print "[[0.29197514 0.20656645]
                                #               [0.53539073 0.5612575 ]]"


# **Operations with tensors**

As seen before, it is all about dimensions. The tensor entity makes it possible to work with $n$-dimensional data or objects.

Most operations are defined and explained for 2-dimensional tensors, that is, matrices.

It is easier to understand the calculations in 2 dimensions first and then apply them to higher-dimensional tensors.

At this stage it is not likely to be clear why these operations will be useful in the context of Machine Learning.

Understanding the basics will put you in a much better position to grasp the more complex ideas that form the backbone of Artificial Neural Networks.

Such operations include addition and multiplication. While scalar addition and multiplication are familiar to most people, the rules differ somewhat when dealing with more general tensor entities.

While matrix operations can be done with both the NumPy and TensorFlow libraries, this exercise will focus on the latter because it is compatible with NumPy and adds options for scalability.


# **Matrix Addition**

Matrices can be added to scalars, vectors and other matrices. Each of these operations has a precise definition.



## **Matrix-Matrix Addition**

Given two matrices of size $m \times n$:

$$
\boldsymbol{A} = \left [ a_{ij} \right ]
$$

$$
\boldsymbol{B} = \left [ b_{ij} \right ]
$$

... it is possible to define the matrix sum:

$$
\boldsymbol{C} = \boldsymbol{A} + \boldsymbol{B}
$$

Where:

$$
\boldsymbol{C} = \left [ c_{ij} \right ]
$$

$$
c_{ij} = a_{ij} + b_{ij}
$$


That is, $\boldsymbol{C}$ is constructed by element-wise summing the respective elements of $\boldsymbol{A}$ and $\boldsymbol{B}$. This operation is only defined where the two matrices have equal size. The definition implies that $\boldsymbol{C}$ also has size $m \times n$.

Matrix addition is *commutative*. This means that it doesn't matter which way around the matrices are added:

$$
\boldsymbol{A} + \boldsymbol{B} = \boldsymbol{B} + \boldsymbol{A}
$$

It is also *associative*. This means that when three matrices are added, the result is the same regardless of which pair is added first:

$$
\boldsymbol{A} + ( \boldsymbol{B} + \boldsymbol{C} )= ( \boldsymbol{A} + \boldsymbol{B} ) + \boldsymbol{C}
$$

Both of these results follow from the fact that normal scalar addition is itself commutative and associative, because it is just adding the elements together.
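
Both properties are easy to check numerically. The sketch below uses NumPy for brevity and three arbitrary $3 \times 2$ matrices; it is a spot check on one example, not a proof.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[10, 20], [30, 40], [50, 60]])
C = np.array([[7, 7], [8, 8], [9, 9]])

# Commutativity: A + B equals B + A
commutative = np.array_equal(A + B, B + A)
print(commutative)   # True

# Associativity: A + (B + C) equals (A + B) + C
associative = np.array_equal(A + (B + C), (A + B) + C)
print(associative)   # True
```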


### **Example**

Given two matrices of shape $3 \times 2$:

$$
\boldsymbol{A} = \begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}\\
a_{31} & a_{32}\\
\end{bmatrix}\qquad
\boldsymbol{B} = \begin{bmatrix}
b_{11} & b_{12}\\
b_{21} & b_{22}\\
b_{31} & b_{32}\\
\end{bmatrix}
$$

The matrix sum is:

$$
\boldsymbol{A} + \boldsymbol{B} =
\begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}\\
a_{31} & a_{32}\\
\end{bmatrix} +
\begin{bmatrix}
b_{11} & b_{12}\\
b_{21} & b_{22}\\
b_{31} & b_{32}\\
\end{bmatrix} =
\begin{bmatrix}
a_{11}+b_{11} & a_{12}+b_{12}\\
a_{21}+b_{21} & a_{22}+b_{22}\\
a_{31}+b_{31} & a_{32}+b_{32}\\
\end{bmatrix}
$$

This can be calculated with Tensorflow:


In [None]:
# Matrix sum with Tensorflow

a = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

b = tf.constant([[10, 20],
                 [30, 40],
                 [50, 60]])

c = tf.add(a,b)

print(c)


## **Matrix-Scalar Addition**

It is possible to add a scalar value $x$ to a matrix $\boldsymbol{A} = [ a_{ij}]$ to produce a new matrix $\boldsymbol{B} = [ b_{ij}]$ where:

$$
b_{ij} = x + a_{ij}
$$

This simply means that the same scalar value is added to every element of the matrix. It is written as:

$$
\boldsymbol{B} = x + \boldsymbol{A}
$$


Scalar-matrix addition is once again commutative and associative, because normal scalar addition is both commutative and associative.



### **Example**

Given a scalar and a matrix of shape $3 \times 2$:

$$
x
\quad
,
\quad
\boldsymbol{A} = \begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}\\
a_{31} & a_{32}\\
\end{bmatrix}\qquad
$$

The matrix sum is:

$$
x + \boldsymbol{A} =
x + 
\begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}\\
a_{31} & a_{32}\\
\end{bmatrix} =
\begin{bmatrix}
x+a_{11} & x+a_{12}\\
x+a_{21} & x+a_{22}\\
x+a_{31} & x+a_{32}\\
\end{bmatrix}
$$

This can be calculated with Tensorflow as follows:


In [None]:
# Matrix-scalar sum with Tensorflow

x = 3

a = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

b = tf.add(x,a)

print(b)


## **Exercise 01**

As in most companies, employees usually work on several projects during the month. To keep good control of resources, there is a record of the hours each employee dedicates to each project.

This is a copy of the data for the past month:

||proj_01|proj_02|proj_03|
|-|-|-|-|
|emp_01|10|20|30|
|emp_02|20|10|20|
|emp_03|0|30|10|
|emp_04|20|20|10|
|emp_05|10|10|30|


It is time to prepare the bills for each project and employee for the last 3 months.

How much time has employee 04 spent on project 02?

This is the data exported to Python:


In [None]:
# Time resources - Exported

month_01 = [[10,20,30],
            [20,10,20],
            [0,30,10],
            [20,20,10],
            [10,10,30]]

month_02 = [[20,30,10],
            [20,20,20],
            [10,20,20],
            [20,30,10],
            [20,30,0]]

month_03 = [[20,20,10],
            [30,20,0],
            [30,10,10],
            [0,30,20],
            [10,20,30]]


## **Exercise 01 - Solution**

( Your solution here )

In [None]:
# Your solution code


# **Matrix multiplication**

The rules for matrix addition are relatively simple and intuitive. However when it comes to multiplication of matrices the rules become more complex.



## **Matrix-Matrix Multiplication**

This is a more complex operation than matrix addition because it does not simply involve multiplying the matrices element-wise. Instead a more complex procedure is utilised, for each element, involving an entire row of one matrix and an entire column of the other.

The operation is only defined for matrices of specific sizes. The first matrix must have as many columns as the second matrix has rows, otherwise the operation is not defined.

The definition below can be a bit tricky to understand initially, so have a look at it first and then try working through the examples to see how specific numeric instances match up to the general formula.

Given two matrices:

$$
\boldsymbol{A} = [a_{ij}]_{m \times n}
\qquad
\boldsymbol{B} = [b_{ij}]_{n \times p}
$$

The matrix product:

$$
\boldsymbol{C} = \boldsymbol{A}\boldsymbol{B} = [c_{ij}]_{m \times p}
$$

Where:

$$
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
$$


That is the elements $c_{ij}$ of the matrix $\boldsymbol{C} = \boldsymbol{A}\boldsymbol{B}$ are given by summing the products of the elements of the $i$-th row of $\boldsymbol{A}$ with the elements of the $j$-th column of $\boldsymbol{B}$.
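
The definition can be translated almost literally into three nested loops. The following is a minimal sketch in NumPy with made-up matrices; the helper name `matmul_by_definition` is hypothetical, and in practice a library routine such as `tf.matmul` or the `@` operator should be used instead.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Multiply two matrices using c_ij = sum_k a_ik * b_kj."""
    m, n = A.shape
    n_b, p = B.shape
    if n != n_b:
        raise ValueError("columns of A must match rows of B")
    C = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):          # rows of A
        for j in range(p):      # columns of B
            for k in range(n):  # sum over the shared dimension
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 0, 2],
              [0, 3, 1]])
B = np.array([[4, 1],
              [2, 0],
              [1, 5]])

C = matmul_by_definition(A, B)
print(C)                         # [[ 6 11]
                                 #  [ 7  5]]
print(np.array_equal(C, A @ B))  # True
```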

Note that matrix-matrix multiplication is not commutative in general. That is, for most pairs of matrices:

$$
\boldsymbol{A}\boldsymbol{B}\neq\boldsymbol{B}\boldsymbol{A}
$$
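
Even for square matrices of the same size, where both products are defined and have the same shape, the order matters. A small NumPy sketch with arbitrary $2 \times 2$ matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

AB = A @ B   # multiplying by B on the right swaps the columns of A
BA = B @ A   # multiplying by B on the left swaps the rows of A

print(AB)    # [[2 1]
             #  [4 3]]
print(BA)    # [[3 4]
             #  [1 2]]
print(np.array_equal(AB, BA))  # False
```

Some special pairs do commute, for example any square matrix with the identity matrix of the same size, which is why the statement holds only in general.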


### **Example**

Given two matrices:

$$
\boldsymbol{A} = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{bmatrix}\qquad
\boldsymbol{B} = \begin{bmatrix}
1 & 2\\
3 & 4\\
5 & 6\\
\end{bmatrix}
$$

It is possible to construct the product $\boldsymbol{A}\boldsymbol{B}$ of size $2 \times 2$:

$$
\boldsymbol{A}\boldsymbol{B} =
\begin{bmatrix}
1 \cdot 1 + 2 \cdot 3 + 3 \cdot 5 & 1 \cdot 2 + 2 \cdot 4 + 3 \cdot 6 \\
4 \cdot 1 + 5 \cdot 3 + 6 \cdot 5 & 4 \cdot 2 + 5 \cdot 4 + 6 \cdot 6 \\
\end{bmatrix} =
\begin{bmatrix}
22 & 28\\
49 & 64\\
\end{bmatrix}
$$

It is also possible to construct the product $\boldsymbol{B}\boldsymbol{A}$ of size $3 \times 3$:

$$
\boldsymbol{B}\boldsymbol{A} =
\begin{bmatrix}
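1 \cdot 1 + 2 \cdot 4 & 1 \cdot 2 + 2 \cdot 5 & 1 \cdot 3 + 2 \cdot 6\\
3 \cdot 1 + 4 \cdot 4 & 3 \cdot 2 + 4 \cdot 5 & 3 \cdot 3 + 4 \cdot 6\\
5 \cdot 1 + 6 \cdot 4 & 5 \cdot 2 + 6 \cdot 5 & 5 \cdot 3 + 6 \cdot 6\\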
\end{bmatrix} =
\begin{bmatrix}
9 & 12 & 15\\
19 & 26 & 33\\
29 & 40 & 51\\
\end{bmatrix}
$$

This can be calculated with Tensorflow:


In [None]:
# Matrix product with Tensorflow

a = tf.constant([[1, 2, 3],
                 [4, 5, 6]])

b = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

c = tf.matmul(a,b)

print('AB:')
print(c)
print('')

d = tf.matmul(b,a)

print('BA:')
print(d)
print('')



## **Scalar-Matrix Multiplication**

Scalar-matrix multiplication is simpler than matrix-matrix multiplication. Given a matrix $\boldsymbol{A} = [a_{ij}]_{m \times n}$ and a scalar $x \in \mathbb{R}$, the scalar-matrix product $x\boldsymbol{A}$ is calculated by multiplying every element of $\boldsymbol{A}$ by $x$ such that $x\boldsymbol{A} = [xa_{ij}]_{m \times n}$.




### **Example**

Given a scalar and a matrix:

$$
x = 2
\qquad
\boldsymbol{A} = \begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
\end{bmatrix}
$$

The scalar-matrix product is:

$$
x\boldsymbol{A} =
\begin{bmatrix}
2 \cdot 1 & 2 \cdot 2 & 2 \cdot 3 \\
2 \cdot 4 & 2 \cdot 5 & 2 \cdot 6 \\
\end{bmatrix} =
\begin{bmatrix}
2 & 4 & 6\\
8 & 10 & 12\\
\end{bmatrix}
$$

This can be calculated with Tensorflow:


In [None]:
# Scalar-Matrix product with Tensorflow

a = tf.constant([[1, 2, 3],
                 [4, 5, 6]])

x = tf.constant(2)

xa = tf.math.scalar_mul(x,a)

print('xA:')
print(xa)


# **Image manipulation**

Image manipulation or editing is one of the fields where matrix multiplication is used the most. Images are used in almost every activity, from marketing to entertainment and from road signs to art, to mention just a few.

Images are everywhere, but only some people look at the magic behind working with them. A simple brightness correction or applying a filter to an image requires a lot of calculations on millions of elements.

For example, a Full HD image has 1920 x 1080 pixels with 3 color channels: RGB. This is 6,220,800 values from 0 to 255. One second of Full HD video at 60 frames per second contains 60 such images, which amounts to 373,248,000 values.
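
The figures above follow from simple multiplication, as this short check shows. The frame rate of 60 images per second is the one assumed in the text.

```python
# Full HD resolution and number of colour channels (RGB)
width, height, channels = 1920, 1080, 3

values_per_frame = width * height * channels
print(values_per_frame)    # 6220800

# Frame rate assumed in the text
frames_per_second = 60
values_per_second = values_per_frame * frames_per_second
print(values_per_second)   # 373248000
```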

Another way to see a Full HD image is as a tensor of shape `(1080, 1920, 3)`, that is (height, width, channels), which is the layout produced when an image is loaded as an array. Each slice along the 3rd dimension is the image in one of the RGB color channels. The same layout works for images of any size. Some images also have transparent parts, which requires an additional channel named Alpha. When an image has an alpha channel, the color channels are RGBA and the 3rd dimension has size 4 instead of 3.

Even a simple change, like converting a color image to grayscale, needs a lot of calculations.

This is where matrix multiplication can be used.


## **Exercise 02**

In a project using Artificial Neural Networks the input is an image. Hundreds of images, in fact, are used to train the model to detect specific text in them.

For this reason, the color data is not relevant and the images can be converted to grayscale. Moreover, the images have an alpha channel for transparency.

Converting the color images with RGBA channels to grayscale reduces the size of each one by 75%, which reduces the overall data size, model complexity and calculation time.

In order to convert a color image to grayscale, the following calculation has to be done:

$$
\boldsymbol{GRAY} = 
0.2989 \times \boldsymbol{R}
+ 0.5870 \times \boldsymbol{G}
+ 0.1140 \times \boldsymbol{B}
$$

Where:

$
\boldsymbol{R} = \text{Red channel} \\
\boldsymbol{G} = \text{Green channel} \\
\boldsymbol{B} = \text{Blue channel}
$ 


The following code loads an image and displays it. The grayscale colormap is only applied to single-channel images, so an RGB or RGBA image is still shown in full color.


In [None]:
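# Import libraries
import io
import urllib.request

import matplotlib.pyplot as plt

# Image URL
url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/2/2d/Tensorflow_logo.svg/224px-Tensorflow_logo.svg.png'

# Read image: recent Matplotlib versions reject URLs in imread,
# so download the bytes first and read them from memory
with urllib.request.urlopen(url) as response:
    img_np = plt.imread(io.BytesIO(response.read()), format='png')

# Convert image to tensor
img = tf.convert_to_tensor(img_np, dtype=tf.float32)

# Plot tensor (the gray colormap only affects single-channel images)
plt.imshow(img, cmap=plt.get_cmap('gray'))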

The objective is to convert this image to grayscale using matrix multiplication.


## **Exercise 02 - Solution**

( Your solution here )

In [None]:
# Your solution code

---
