# 1. Dot Product

Write a function `matrix_multiply(A, B)` using for loops, `+` and `*` that takes in two matrices (each can be a list of lists or a 2d numpy array) and returns their dot product (matrix multiplication). It should also work with column vectors ($k \times 1$ dimensions) and row vectors ($1 \times k$ dimensions).


```
import numpy as np

A = [
    [1,2,3],
    [4,5,6]
]

B = [
    [1,2,3],
    [4,5,6],
    [7,8,9]
]

matrix_multiply(A,B)

RETURNS: 
[[30, 36, 42],
 [66, 81, 96]]

---------example 2-------------
# This is a row vector
A = np.array([
    [1,2,3]
])

# This is a column vector
B = np.array([
    [1],
    [4],
    [7]
])

matrix_multiply(A,B)

RETURNS:
[[30]]

```

Use `np.dot` to test your output

```python
import numpy as np
import pandas as pd
```

```python
class MatricesAreNotDefineable(Exception):
    """Raised when the product AB is not defined for the given matrices."""


def are_defined(A, B):
    """
    Check that AB is defined and return the dimensions involved.

    We assume all matrices have at least 1 entry in each row and column.

    This follows the following corollary:

    For matrices A and B, where
    A is a matrix of dimensions m x nA
    and B is a matrix of dimensions nB x k,

    AB is defined if nA = nB.

    And, should it be defined,
    AB will be a matrix of dimensions m x k.
    """
    m = len(A)
    if m == 0: raise MatricesAreNotDefineable()
    nA = len(A[0])
    if nA == 0: raise MatricesAreNotDefineable()
    nB = len(B)
    if nB == 0: raise MatricesAreNotDefineable()
    if nA != nB: raise MatricesAreNotDefineable()
    k = len(B[0])
    return {
        'm': m,
        'nA': nA,
        'nB': nB,
        'k': k,
        'm x k': '{m} x {k}'.format(m=m,k=k)
    }


def mult_1d(a, b, at=0):
    """Multiply two 1d vectors (recursive dot product starting at index `at`)."""
    if at >= len(a): return 0
    return (a[at] * b[at]) + mult_1d(a,b,at+1)


def column_at(index, matrix):
    """Get a matrix's column in 1d vector form by its index."""
    # Read from the `matrix` argument, not from the global B
    return [matrix[row][index] for row in range(len(matrix))]


def matrix_multiply(A, B):
    dim = are_defined(A, B)
    result = [[None] * dim['k'] for _ in range(dim['m'])]
    for m in range(dim['m']):
        Ai = A[m]
        for k in range(dim['k']):
            # Entry (m, k) is row m of A times column k of B
            result[m][k] = mult_1d(Ai, column_at(k, B))
    return result


A = [ [1,2,3], [4,5,6] ]

B = [ [1,2,3], [4,5,6], [7,8,9] ]

check = matrix_multiply(A,B)
checkNP = np.dot(A,B)
check == checkNP
```

```
array([[ True,  True,  True],
       [ True,  True,  True]])
```
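
The exercise asks for for loops only, so for comparison here is a minimal loop-based sketch of the same product, checked on the row-vector and column-vector example from the problem statement. The name `matrix_multiply_loops` is hypothetical, and the sketch assumes both inputs are non-empty and rectangular.

```python
import numpy as np


def matrix_multiply_loops(A, B):
    """Multiply A (m x n) by B (n x k) using only for loops, + and *."""
    m, n, k = len(A), len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("columns of A must match rows of B")
    result = [[0] * k for _ in range(m)]
    for i in range(m):
        for j in range(k):
            for l in range(n):
                # Accumulate row i of A times column j of B
                result[i][j] = result[i][j] + A[i][l] * B[l][j]
    return result


row = np.array([[1, 2, 3]])        # 1 x 3 row vector
col = np.array([[1], [4], [7]])    # 3 x 1 column vector

product = matrix_multiply_loops(row, col)
print(np.array_equal(product, np.dot(row, col)))  # True
```

Unlike a recursive inner product, the innermost loop does not depend on Python's recursion limit, so it also handles long vectors.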

# 2 Matrix Math torture

**2.1** Give 3 examples of non-invertible square matrices that are non-zero

**2.2** Explain why the identity matrix $I$ is necessarily a square matrix with only $1$'s on the diagonal (hint: use the dot product from Q1)

**2.3** The **trace** is commutative for two matrices so $tr(AB) = tr(BA)$. Give an example where this is false for 3 matrices which can all be multiplied together.

**2.4** Give an example of a nonzero $4 \times 4$ idempotent matrix (where $A \cdot A = A^2 = A$)

**2.5** Solve the following system of equations for `x`, `y` and `z` using matrices and `numpy.linalg.solve`

$$x + y + z = 6$$

$$2y + 5z = -4$$

$$2x + 5y - z = 27$$

#### 2.1

A square matrix is non-invertible exactly when its determinant is equal to zero.

For 2x2 matrices, this includes any matrix with one row of zeros (the matrix is non-zero as long as $a$ or $b$ is non-zero):

$$
\begin{bmatrix}
0 & 0\\
a & b
\end{bmatrix}
\quad \text{or} \quad
\begin{bmatrix}
a & b\\
0 & 0
\end{bmatrix}
$$

For 3x3 matrices, the determinant of a given matrix $A$

$$
A =
\begin{bmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{bmatrix}
$$

follows the cofactor expansion along the first row, with alternating signs:

\begin{equation}
\begin{vmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{vmatrix} = 
a 
\begin{vmatrix}
e & f\\
h & i
\end{vmatrix} -  
b
\begin{vmatrix}
d & f\\
g & i
\end{vmatrix} +  
c
\begin{vmatrix}
d & e\\
g & h
\end{vmatrix}
\end{equation}

This means that any 3x3 matrix that meets one of the following criteria will be non-invertible:

- its first row is all zeros, so every term of the expansion is multiplied by zero

$$
\begin{bmatrix}
0 & 0 & 0\\
d & e & f\\
g & h & i
\end{bmatrix}
$$

- two of its columns have zeros as their 2nd and 3rd entries, so every 2x2 determinant in the expansion contains a column of zeros

\begin{equation}
\begin{bmatrix}
a & b & c\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
 \text{ or } 
\begin{bmatrix}
a & b & c\\
0 & e & 0\\
0 & h & 0
\end{bmatrix}
 \text{ or } 
\begin{bmatrix}
a & b & c\\
0 & 0 & f\\
0 & 0 & i
\end{bmatrix}
 \text{ or } 
\begin{bmatrix}
a & b & c\\
d & 0 & 0\\
g & 0 & 0
\end{bmatrix}
\end{equation}

- any matrix that combines the two points above
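
To give three concrete instances, the sketch below evaluates the standard cofactor expansion along the first row on three non-zero matrices. The matrices and the helper names `det2` and `det3` are hypothetical, chosen only for illustration; the third matrix (two identical rows) is an extra case that is also singular.

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along its first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * det2(e, f, h, i) - b * det2(d, f, g, i) + c * det2(d, e, g, h)


# Three hypothetical non-zero matrices
zero_row = [[0, 0, 0], [1, 2, 3], [4, 5, 6]]       # first row is all zeros
two_columns = [[1, 2, 3], [0, 5, 0], [0, 8, 0]]    # columns 1 and 3 are zero below row 1
repeated_row = [[1, 2, 3], [1, 2, 3], [4, 5, 6]]   # two identical rows

for M in (zero_row, two_columns, repeated_row):
    print(det3(M))  # prints 0 each time
```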


#### 2.2
An identity matrix $I$ of dimension $m \times n$ satisfies the following rule for any matrix $A$ with $n$ rows and any number of columns $k$:

\begin{equation} I \cdot A = A \end{equation}

The dot product $I \cdot A$ is only possible if $I$ has $n$ columns, $n$ being the number of rows of $A$. By the rule in Q1, the result of a dot product always has as many rows as $I$ and as many columns as $A$.

In dimensional pseudo-expression, $I \cdot A$ gives:

\begin{equation} (m \times n) \cdot (n \times k) = (m \times k) \end{equation}

Since the result must equal $A$, which is $n \times k$:

\begin{equation} (m \times k) = (n \times k) \end{equation}

so:

\begin{equation} n = m \end{equation}

This confirms that the identity matrix must be a square matrix $m \times n$ where $m = n$.

For the entries, the dot product from Q1 gives entry $(i, j)$ of $I \cdot A$ as $\sum_{l} x_{il} a_{lj}$, where $x_{il}$ are the entries of $I$ and $a_{lj}$ those of $A$. For this sum to equal $a_{ij}$ for every possible $A$, row $i$ of $I$ must keep the entry at row index $i$ of each column of $A$ while cancelling the other entries in that column. Therefore $x_{il} = 1$ if $l = i$, and $x_{il} = 0$ otherwise.

The only entries of a matrix whose row and column indexes are the same are the diagonal entries, so $I$ has $1$'s on the diagonal and $0$'s everywhere else.
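
A small numerical sketch of both parts of the argument, using a hypothetical $3 \times 2$ matrix `A`: a square matrix with $1$'s on the diagonal reproduces `A`, a non-square one produces the wrong shape, and an extra off-diagonal $1$ changes the entries.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # hypothetical 3 x 2 matrix

I = np.eye(3, dtype=int)                 # 3 x 3 identity: 1's on the diagonal
print(np.array_equal(I @ A, A))          # True

# A 2 x 3 matrix with 1's on its diagonal can still be multiplied with A,
# but the product is 2 x 2, so it cannot equal the 3 x 2 matrix A.
rect = np.eye(2, 3, dtype=int)
print((rect @ A).shape)                  # (2, 2)

# A square matrix with an off-diagonal 1 mixes rows of A instead of keeping them.
not_identity = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])
print(np.array_equal(not_identity @ A, A))  # False
```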

#### 2.3


Consider the cyclic permutation property of the trace, which for, say, 3 matrices gives:

$$tr(ABC) = tr(BCA) = tr(CAB)$$

Arbitrary (non-cyclic) permutations of the factors do not preserve the trace in general, so it is possible that:

$$tr(ABC) \neq tr(ACB)$$

For example, take the following 3 square matrices of the same size, which can all be multiplied together in any order:

$$
A = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}
$$

Then $ABC = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix}$, so $tr(ABC) = 1$, while $AC$ is the zero matrix, so $tr(ACB) = 0$.

Note that for the products $\alpha = ABC$, $\beta = ACB$ and $\gamma = CBA$, the last two are cyclic permutations of each other, so only $\alpha$ can differ:

$$tr(\alpha) \neq tr(\beta)$$
$$tr(\alpha) \neq tr(\gamma)$$
$$tr(\beta) = tr(\gamma)$$
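
The difference between cyclic and non-cyclic orderings can be checked numerically. The sketch below assumes three small hypothetical $2 \times 2$ matrices and uses `np.trace` on each product.

```python
import numpy as np

A = np.array([[0, 1], [0, 0]])
B = np.array([[0, 0], [1, 0]])
C = np.array([[1, 0], [0, 0]])

# Cyclic permutations share the same trace
print(np.trace(A @ B @ C), np.trace(B @ C @ A), np.trace(C @ A @ B))  # 1 1 1

# Swapping two factors is not a cyclic permutation
print(np.trace(A @ C @ B))  # 0
```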

#### 2.4
Technically, the identity matrix $I_{4}$ is idempotent.

$$
I_{4}^2 =
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}
$$
$$
=
\begin{bmatrix}
1x1 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0\\
0x0 + 0x0 + 0x0 + 0x0 &  0x0 + 1x1 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0\\
0x0 + 0x0 + 0x0 + 0x0 &  0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 1x1 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0\\
0x0 + 0x0 + 0x0 + 0x0 &  0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 0x0 & 0x0 + 0x0 + 0x0 + 1x1 
\end{bmatrix}
$$

$$
=
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix} 
$$
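
The identity is not the only option. As a sketch, the two hypothetical $4 \times 4$ matrices below (not taken from the text above) are also non-zero and idempotent, which can be checked by comparing $A \cdot A$ with $A$.

```python
import numpy as np

# Hypothetical examples chosen for illustration
P = np.diag([1, 1, 0, 0])    # keeps the first two coordinates, zeroes the rest
Q = np.full((4, 4), 0.25)    # replaces every coordinate with the average

for M in (P, Q):
    print(np.allclose(M @ M, M))  # True for both
```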

#### 2.5

$$x + y + z = 6$$

$$2y + 5z = -4$$

$$2x + 5y - z = 27$$

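can be written as the following matrix equation (the second equation has no $x$ term, so its row starts with $0$):


\begin{equation}
\begin{bmatrix}
1 & 1 & 1\\
0 & 2 & 5\\
2 & 5 & -1
\end{bmatrix}
\begin{bmatrix}
x\\
y\\
z
\end{bmatrix} = 
\begin{bmatrix}
6\\
-4\\
27
\end{bmatrix}
\end{equation}

```python
# Coefficient matrix: one row per equation, one column per unknown (x, y, z)
m = np.array([[1,1,1], [0,2,5], [2,5,-1]])
n = np.array([[6], [-4], [27]])

solved = np.linalg.solve(m,n)
# Round to hide floating-point noise in the printed values
print("x is ", round(solved[0][0], 6))
print("y is ", round(solved[1][0], 6))
print("z is ", round(solved[2][0], 6))
```

```
x is  5.0
y is  3.0
z is  -2.0
```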
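
A solution returned by `numpy.linalg.solve` can be checked by substituting it back into the three original equations. A minimal sketch, with hypothetical variable names `coefficients`, `constants` and `solution`:

```python
import numpy as np

# Coefficients read off the three equations; the second has no x term
coefficients = np.array([[1, 1, 1],
                         [0, 2, 5],
                         [2, 5, -1]])
constants = np.array([6, -4, 27])

solution = np.linalg.solve(coefficients, constants)

# Substituting the solution back should reproduce the right-hand sides
print(np.allclose(coefficients @ solution, constants))  # True
print(np.allclose(solution, [5, 3, -2]))                # True
```

By hand: $5 + 3 - 2 = 6$, $2 \cdot 3 + 5 \cdot (-2) = -4$ and $2 \cdot 5 + 5 \cdot 3 - (-2) = 27$.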


# 3.1 Boston regression

Using statsmodels and the `boston` dataset, make a regression model to predict house prices. Don't forget to add a constant (intercept) term. Note that statsmodels can take a `pd.DataFrame` as an input for `X`.

Report the $R^2$ and coefficients on each feature

# 3.2 Polynomial features

Use polynomial features to improve your regression model in `3.1`. You can use squared and cubic features. Try to find a model that minimizes the `AIC` or `BIC` of your output table.

# 3.3 Feature plotting

Now that you have a better model, make a regression plot for one of its important features. The regression plot should be like the ones made at the end of part 3 of this lecture (scatterplot + regression line). It should have the following:

- Have the `x` axis be the values from one of your important features. The values should range from the `[min, max]` of the observed values in the dataset.

- The y axis on each chart is the target value (house price)

- You should have a scatter plot of the datapoints for the feature plus the regression line of predicted values on the same chart

- If you used non-linearities (squared and/or cube input) the regression curve should be nonlinear as well

- When plotting values for a single variable, you can set all the other values to their `mean` or `median` when you put them in your model's prediction
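
The last point can be sketched as follows. This is only an illustration under stated assumptions: synthetic data stands in for the `boston` dataset, a least-squares fit from `numpy.linalg.lstsq` stands in for the statsmodels model, and the names `X`, `y` and `grid_for_feature` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 50 rows, 3 features (NOT the boston dataset)
X = rng.normal(size=(50, 3))
y = 2.0 + X @ np.array([1.5, -0.5, 0.0]) + rng.normal(scale=0.1, size=50)

# Least-squares fit with an intercept column, standing in for the real model
design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)


def grid_for_feature(X, feature, num=100):
    """Rows where `feature` sweeps [min, max] and every other column sits at its mean."""
    grid = np.tile(X.mean(axis=0), (num, 1))
    grid[:, feature] = np.linspace(X[:, feature].min(), X[:, feature].max(), num)
    return grid


grid = grid_for_feature(X, feature=0)
predicted = np.column_stack([np.ones(len(grid)), grid]) @ beta

# grid[:, 0] and predicted are the x and y values of the regression line;
# the scatter plot uses X[:, 0] and y.
print(grid.shape, predicted.shape)  # (100, 3) (100,)
```

If the model uses squared or cubic columns, those columns would need to be recomputed from the swept feature in `grid` before predicting, so that the plotted curve is nonlinear.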

# 3.4 Multi-feature plotting

Make a single matplotlib `figure` object with the same kind of chart as in **3.3**, but with 4 charts, one for each of your 4 most important features.

Do not copy-paste code for each feature you visualize in the plot. Extract your code into a function so you can just have something like

```python
# plt.subplots(2, 2) returns the figure and a 2 x 2 array of axes
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2)
reg_plot_on_ax(feature_1, ax1)
reg_plot_on_ax(feature_2, ax2)
reg_plot_on_ax(feature_3, ax3)
reg_plot_on_ax(feature_4, ax4)
```