# 3b Linear algebra

## 3b.1 Basic operations

Linear algebra is the branch of mathematics that deals with vector spaces. Its operations are very useful when manipulating numeric datasets: they often let us avoid writing explicit loops, which makes our code more readable, more concise, and faster to execute. Linear algebra is closely connected to vectorization, but its operations also have specific mathematical properties. Many machine learning algorithms are formulated so that they can be performed efficiently using linear algebra, and we will introduce a few of them in the following sections.
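As a small illustration of the point about avoiding loops, the sketch below computes the same sum of products twice: once with an explicit Python loop and once with array operations. The vectors `a` and `b` and the variable names are made up for this example.

```python
import numpy

# Two small example vectors (values chosen only for illustration).
a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([4.0, 5.0, 6.0])

# Explicit loop: accumulate the products one pair at a time.
loop_total = 0.0
for x, y in zip(a, b):
    loop_total += x * y

# Array operations: multiply elementwise, then sum, with no Python-level loop.
vectorized_total = numpy.sum(a * b)

print(loop_total, vectorized_total)  # 32.0 32.0
```

Both versions give the same number; the second is shorter and delegates the looping to numpy's compiled code.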

## Exercise 3b.1

```python
import numpy
```

## 3b.2 Vector Multiplication (using dot product)

The dot product between two vectors of equal length returns a scalar value: the sum of the products of their corresponding elements. The dot product is often used for calculating physical relationships (like the amount of work done or the power exerted) or when computing the similarity between vectors (see lecture 4 on distance measures).

In numpy the dot product (or scalar product) between two vectors (one-dimensional arrays) is treated as
a special case of multi-dimensional array multiplication. 

The definition of the dot product between two vectors $a$ and $b$ of length $N$ is:

$$\langle a, b\rangle = \sum_{i=1}^N a_i b_i$$

Other notations that you will come across for this operation are:

$a \cdot b$

$a^T b$

In order to compute the dot product between two vectors a and b, we use:

```python
numpy.dot(a, b)
```
or
```python
a.dot(b)
```
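A minimal worked example may help here. The vectors below are arbitrary illustrative values, and the `@` operator is an additional spelling that these notes do not otherwise use; for one-dimensional arrays it should give the same scalar as the two forms above.

```python
import numpy

a = numpy.array([1.0, 2.0, 3.0])
b = numpy.array([4.0, -5.0, 6.0])

elementwise = a * b             # array of pairwise products: [4, -10, 18]
dot_function = numpy.dot(a, b)  # function form
dot_method = a.dot(b)           # method form
dot_operator = a @ b            # operator form (assumed equivalent for 1-D arrays)

print(elementwise)
print(dot_function, dot_method, dot_operator)  # 12.0 12.0 12.0
```

Note that the elementwise product is still a vector, while the dot product is the sum of that vector's entries.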

## Exercise 3b.2

Create two vectors of length 100 consisting of random values between -10 and 10. Compute:
- elementwise product between them
- dot product between them


## 3b.3 Matrix multiplication (using dot product)

Matrix multiplication does not produce a single scalar. Instead, it produces a full matrix of values, each calculated using the dot product formula from above. Each row in the first matrix is paired with each column in the second matrix, and the dot product between those two vectors is one value in the resulting matrix.
Because of this, the number of columns in the first matrix needs to be equal to the number of rows in the second matrix. For matrices $A_{m\times n}$ and $B_{n \times p}$, the resulting matrix will be $C_{m \times p}$. The value of each entry in the resulting matrix $C$ is the result of computing the dot product between a row from $A$ and a column from $B$:

![](https://upload.wikimedia.org/wikipedia/commons/thumb/e/eb/Matrix_multiplication_diagram_2.svg/470px-Matrix_multiplication_diagram_2.svg.png)

The equation for calculating each cell is the same as the vector dot product formula:

$$
C_{i,j} = \sum_{k=1}^n A_{i,k}B_{k,j}
$$
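To make the summation concrete, here is a sketch that applies the formula literally with three nested loops over plain Python lists. The function name `matmul_loops` is hypothetical and exists only for this illustration; it is not how numpy computes the product.

```python
def matmul_loops(A, B):
    """Multiply two matrices (nested lists) by applying the summation formula."""
    m, n = len(A), len(A[0])
    rows_b, p = len(B), len(B[0])
    if n != rows_b:
        raise ValueError("columns of A must equal rows of B")
    C = [[0] * p for _ in range(m)]
    for i in range(m):          # row of A
        for j in range(p):      # column of B
            for k in range(n):  # index being summed over
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[7, 8], [9, 10], [11, 12]]   # 3 x 2
C = matmul_loops(A, B)
print(C)  # [[58, 64], [139, 154]]
```

For instance, the top-left entry is $1\cdot 7 + 2\cdot 9 + 3\cdot 11 = 58$, the dot product of the first row of `A` with the first column of `B`.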

In order to compute the matrix product of two matrices `A` and `B`, we use:

```python
numpy.dot(A, B)
```
or

```python
A.dot(B)
```
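The shape rule can be checked on a small example. The matrices below are arbitrary illustrative values built with `numpy.arange`; the sketch shows the shape of the result, confirms one entry against a row-times-column dot product, and shows that swapping the operands fails here because the inner dimensions no longer match.

```python
import numpy

A = numpy.arange(6).reshape(2, 3)    # 2 x 3: [[0, 1, 2], [3, 4, 5]]
B = numpy.arange(12).reshape(3, 4)   # 3 x 4

C = numpy.dot(A, B)
print(C.shape)  # (2, 4)

# Entry (1, 2) of C is row 1 of A dotted with column 2 of B.
entry = numpy.dot(A[1, :], B[:, 2])
print(C[1, 2], entry)  # 80 80

# B is 3 x 4 and A is 2 x 3, so B times A is not defined.
try:
    numpy.dot(B, A)
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)  # False
```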

## Exercise 3b.3

- Create a random matrix $A_{3\times 4}$ and another random matrix $B_{4 \times 2}$. Multiply AB.  
- Create a random matrix $C_{3\times 3}$ and $D_{3\times3}$. Multiply CD. Multiply DC. Is matrix multiplication commutative?
- Create an identity matrix $I_{3\times 3}$. Multiply IC, CI, DI, ID. What do you notice?
- What will be the result of multiplying a matrix $Z_{m\times n}$ by a matrix $O_{n \times p}$ whose entries are all zero? Check your answer using some examples.


## 3b.4 Transpose

We have already encountered matrix transpose when we discussed reshaping numpy arrays. Transposing a matrix simply means making the rows into columns and columns into rows. This is often required to align a matrix properly for matrix multiplication.

The mathematical notation for the transpose of matrix $A$ is $A^T$. If $A$ is $m \times n$ then $A^T$ is $n \times m$. The values are:

$$A^T_{i,j} = A_{j,i}$$

The function to compute the transpose of matrix A is either `numpy.transpose(A)` or simply `A.T`.
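A short sketch of both spellings, using an arbitrary $2 \times 3$ example matrix. It also illustrates the alignment point: `A` cannot be multiplied by itself, but it can be multiplied by its transpose.

```python
import numpy

A = numpy.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
At = A.T

print(A.shape, At.shape)                         # (2, 3) (3, 2)
print(numpy.array_equal(At, numpy.transpose(A)))  # True: both spellings agree
print(At[2, 0] == A[0, 2])                       # True: indices are swapped

# (2 x 3) times (3 x 2) is well defined and gives a 2 x 2 matrix.
G = A.dot(A.T)
print(G.shape)  # (2, 2)
```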

## 3b.5 Inverse

When using scalar values, dividing by a nonzero number $n$ is straightforward: it merely requires multiplying by $\frac{1}{n}$. This value is the multiplicative inverse (also known as the reciprocal) of $n$, and it can also be written as $n^{-1}$.

The inverse has certain properties, including:
- $n^{-1}n = 1$
- $(n^{-1})^{-1} = n$

For matrices, there is an analogous concept. The inverse of an $m \times m$ matrix $A$ is written $A^{-1}$ and it satisfies three critical properties:

- $A^{-1}A = I$ where $I$ is the $m \times m$ identity matrix
- $(A^{-1})^{-1} = A$
- $(A^T)^{-1} = (A^{-1})^T$

Not all matrices are invertible: a matrix needs to be square (that is, the number of rows must equal the number of columns), and its [determinant](https://en.wikipedia.org/wiki/Determinant) needs to be non-zero. 
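The two conditions can be sketched as a small check. The helper `is_invertible` and its tolerance `tol` are hypothetical names introduced for this illustration, and comparing the determinant against a fixed tolerance is a crude test; it is meant to mirror the definition, not to be a robust numerical method.

```python
import numpy

def is_invertible(M, tol=1e-12):
    """Hypothetical helper: True if M is square and its determinant is not ~0."""
    M = numpy.asarray(M, dtype=float)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False  # not square
    return abs(numpy.linalg.det(M)) > tol

singular = numpy.array([[1.0, 2.0], [2.0, 4.0]])  # second row is twice the first
regular = numpy.array([[2.0, 1.0], [1.0, 3.0]])   # determinant is 2*3 - 1*1 = 5
rectangular = numpy.ones((2, 3))                  # not square

print(is_invertible(singular))     # False
print(is_invertible(regular))      # True
print(is_invertible(rectangular))  # False
```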

Computing the inverse of a matrix is critical for many data science techniques, including computing the best fitting parameter weights for ordinary least squares linear regression (an example we will look at next). 

Perhaps surprisingly, numpy does not provide an inverse function at the top level of the package; it is available as `numpy.linalg.inv`. Here we will use the equivalent function `inv` from the `linalg` submodule of the `scipy` package.


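```python
from scipy.linalg import inv

# Build a 3x3 matrix with entries drawn uniformly from [0, 1).
A = numpy.random.uniform(0, 1, (3, 3))
print(A)       # the random matrix
print(inv(A))  # its inverse
```

Output of one run (the values differ each time because `A` is random):

```
[[ 0.67157492  0.59570437  0.05014716]
 [ 0.19629943  0.23241896  0.43308784]
 [ 0.20548254  0.88270923  0.25071074]]
[[ 1.72705903  0.5601077  -1.31299892]
 [-0.21201821 -0.84250941  1.49779257]
 [-0.66901822  2.50726584 -0.20867464]]
```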


## Exercise 3b.4

Generate a $4 \times 5$ matrix of random values (called A).

Verify that the shape and values of $A^T$ match what is specified in 3b.4.

First, guess what the outcome of $(A^T)^T$ would be. Then check your guess.

## Exercise 3b.5

Generate a $3 \times 3$ matrix of random values named A.

Part A: Verify that $A^{-1}A = I$, where $I$ is the $m \times m$ identity matrix

Part B: Verify that $(A^{-1})^{-1} = A$

Part C: Verify that $(A^T)^{-1} = (A^{-1})^T$