In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Linear Algebra 2

## Elementwise Multiplication

$\begin{bmatrix}
1 \\ 2 \\ 3
\end{bmatrix}
*
\begin{bmatrix}
4 \\ 5 \\ 6
\end{bmatrix}$

$\begin{bmatrix}
1 \\ 2 \\ 3
\end{bmatrix}
*
\begin{bmatrix}
4 & 5 & 6
\end{bmatrix}$

In [None]:
# How can we convert np.array([1, 2, 3]) into a vertical array,
# that is, a 3 x 1 array, using .reshape?
v1 = np.array([1, 2, 3]).reshape(-1, 1)
v1

In [None]:
v2 = np.array([4, 5, 6]).reshape(-1, 1)
v2

$\begin{bmatrix}
1 \\ 2 \\ 3
\end{bmatrix}
*
\begin{bmatrix}
4 \\ 5 \\ 6
\end{bmatrix}$
\=
$\begin{bmatrix}
4 \\ 10 \\ 18
\end{bmatrix}$

In [None]:
v1 * v2 # 1*4, 2*5, 3*6

In [None]:
v2.T

$\begin{bmatrix}
1 \\ 2 \\ 3
\end{bmatrix}
*
\begin{bmatrix}
4 & 5 & 6
\end{bmatrix}$
\=
?

In [None]:
v1 * v2.T # how is this working?

## Broadcast

When computing `A * B`:
- If A and B have the same number of dimensions:
    - Match the size of each dimension by stretching size 1 => N (rule 1)
- else:
    - Add dimensions of size 1 to the beginning of the shorter shape (rule 2), then apply rule 1

The elementwise multiplication `v1 * v2.T` will automatically
- "Broadcast" `v1` to 3 x 3 (stretching the second dimension) and
- "Broadcast" `v2.T` to 3 x 3 (stretching the first dimension).
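
As a rough illustration of the two rules, the sketch below uses `np.broadcast_shapes` (NumPy 1.20 or later) and `np.broadcast_to` to make the stretching visible. The names `v2t`, `target_shape`, `v1_stretched`, and `v2t_stretched` are introduced here for illustration only.

```python
import numpy as np

v1 = np.array([1, 2, 3]).reshape(-1, 1)    # shape (3, 1)
v2t = np.array([4, 5, 6]).reshape(1, -1)   # shape (1, 3), same values as v2.T

# Shape that NumPy broadcasts both operands to
target_shape = np.broadcast_shapes(v1.shape, v2t.shape)

# Read-only views with the size-1 dimension stretched to 3
v1_stretched = np.broadcast_to(v1, target_shape)
v2t_stretched = np.broadcast_to(v2t, target_shape)

print(target_shape)
print(v1_stretched)
print(v2t_stretched)
print(v1 * v2t)   # NumPy does the same stretching implicitly
```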

In [None]:
v1.shape

In [None]:
v2.T.shape

How can we manually replicate that? 

#### `np.concatenate([a1, a2, ...], axis=0)`.
- `a1, a2, …`: sequence of arrays
- `axis`: the dimension along which we want to join the arrays
    - default value is 0, which is for row dimension (vertically)
    - value of 1 is for column dimension (horizontally)
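
A small sketch of the two `axis` values, using made-up arrays (`top`, `bottom`, and `side` are illustrative names, not part of the lecture code):

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])    # shape (2, 2)
bottom = np.array([[5, 6]])         # shape (1, 2)
side = np.array([[7], [8]])         # shape (2, 1)

# axis=0 joins vertically: the arrays must agree on the number of columns
stacked_rows = np.concatenate([top, bottom], axis=0)   # shape (3, 2)

# axis=1 joins horizontally: the arrays must agree on the number of rows
stacked_cols = np.concatenate([top, side], axis=1)     # shape (2, 3)

print(stacked_rows)
print(stacked_cols)
```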

In [None]:
v1

In [None]:
v1.shape

In [None]:
# Broadcast v1 to 3 x 3 (stretching the second dimension)
v1_broadcast = ???
v1_broadcast

In [None]:
v2.T

In [None]:
v2.T.shape

In [None]:
# Broadcast v2.T to 3 x 3 (stretching the first dimension)
v2t_broadcast = ???
v2t_broadcast

In [None]:
v1_broadcast * v2t_broadcast # same as v1 * v2.T

In [None]:
v1 * v2.T

In [None]:
v2.T * v1
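
One possible way to replicate the broadcast by hand is to repeat each array three times with `np.concatenate`. This is only a sketch of the idea; other approaches (for example `np.tile`) would work as well.

```python
import numpy as np

v1 = np.array([1, 2, 3]).reshape(-1, 1)   # shape (3, 1)
v2 = np.array([4, 5, 6]).reshape(-1, 1)   # shape (3, 1)

# Stretch v1 along the second dimension: three copies side by side
v1_broadcast = np.concatenate([v1, v1, v1], axis=1)

# Stretch v2.T along the first dimension: three copies stacked vertically
v2t_broadcast = np.concatenate([v2.T, v2.T, v2.T], axis=0)

manual = v1_broadcast * v2t_broadcast
automatic = v1 * v2.T

print(np.array_equal(manual, automatic))
print(np.array_equal(automatic, v2.T * v1))   # operand order does not matter
```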

#### Generate a multiplication table from 1 to 10

In [None]:
# 1. generate a range of numbers from 1 to 10
# 2. reshape to make it 2D
digits = ???
digits

In [None]:
???

In [None]:
# Convert the multiplication table into a DataFrame
pd.DataFrame(digits * digits.T, ???)
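
A possible sketch of the multiplication table, assuming `np.arange` for the range and using the digits themselves as row and column labels (the `labels` choice is ours, not prescribed by the exercise):

```python
import numpy as np
import pandas as pd

# Column vector holding 1..10, shape (10, 1)
digits = np.arange(1, 11).reshape(-1, 1)

# (10, 1) * (1, 10) broadcasts to a 10 x 10 table
table = digits * digits.T

# Label rows and columns with the digits so lookups read naturally
labels = digits.ravel()
table_df = pd.DataFrame(table, index=labels, columns=labels)
print(table_df)
```

With these labels, `table_df.loc[7, 8]` looks up 7 times 8.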

#### Back to bug example

Let's do a more complex broadcasting example.

In [None]:
# Read "bug.jpg" into a numpy array
a = plt.imread("bug.jpg")
a.shape

In [None]:
# Display "bug.jpg"
plt.imshow(a)

#### GOAL: create a fade effect (full color on the left, to black on the right)

- To achieve this, we need to multiply:
    1. the leftmost columns by numbers close to 1 (retains the original color)
    2. the rightmost columns by numbers close to 0 (0 gives us black)
    3. the middle columns by numbers close to 0.5

In [None]:
a.shape

In [None]:
# Create an array called fade with 2521 numbers
fade = ???
print(fade.shape)
fade
# How many dimensions does fade have? 1

In [None]:
a.shape

How can we multiply `a` and `fade`? That is, how should we `reshape` `fade`?

Can we reshape fade to 1688 x 2521 x 3?

In [None]:
fade.reshape(1688, 2521, 3)

The answer is no, because `reshape` can never add or delete values. After `reshape`, we must still have exactly 2521 values, but a 1688 x 2521 x 3 array holds 12,766,344.
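
The same constraint can be seen on a tiny array. This sketch uses 6 values instead of 2521 so the shapes are easy to check by hand; `values` and `reshape_failed` are illustrative names.

```python
import numpy as np

values = np.arange(6)   # 6 values

# Allowed: the product of the new dimensions is still 6
print(values.reshape(2, 3).shape)
print(values.reshape(1, 6, 1).shape)

# Not allowed: 2 * 6 * 3 = 36 values would be needed
try:
    values.reshape(2, 6, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```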

In [None]:
# Keep in mind that we need to multiply each column by a number,
# so which dimension should be 2521?
fade.reshape(???)

In [None]:
# Let's multiply a by the reshaped fade
plt.imshow(a * fade.reshape(1, 2521, 1))

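Why doesn't this work? Remember that pixels can be represented using values from 0 to 255 or from 0 to 1. `a` is on the 0 to 255 scale and `fade.reshape(...)` is on the 0 to 1 scale. Their product is a float array with values up to 255, but `plt.imshow` expects float pixel values between 0 and 1.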

In [None]:
plt.imshow(a / ??? * fade.reshape(1, 2521, 1))

In [None]:
# version 2
plt.imshow(a / 255.0 * ???)
# BROADCAST: (2521, 1) = rule 2 => (1, 2521, 1) = rule 1 => (1688, 2521, 3)
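
The whole fade can be sketched on a small synthetic image, since `bug.jpg` is not available here. The assumptions are ours: a random 4 x 6 stand-in image, and `np.linspace` as one way to build a fade running from 1 down to 0.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width = 4, 6

# Stand-in for the real image: uint8 pixels on the 0..255 scale
img = rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)

# One multiplier per column: 1 on the left, 0 on the right (assumed linear)
fade = np.linspace(1, 0, width)

# Rescale pixels to 0..1 before fading
scaled = img / 255.0

# Version 1: explicit 3D shape (1, width, 1)
faded = scaled * fade.reshape(1, width, 1)

# Version 2: (width, 1) gets a leading size-1 dimension added (rule 2),
# then every size-1 dimension is stretched (rule 1)
faded_v2 = scaled * fade.reshape(width, 1)

print(faded.shape)
print(np.array_equal(faded, faded_v2))
```

Passing `faded` to `plt.imshow` would then display the result, because every value lies between 0 and 1.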

## Matrix Multiplication

$\begin{bmatrix}
1 & 2 & 3
\end{bmatrix}
\cdot
\begin{bmatrix}
4 \\ 5 \\ 6
\end{bmatrix}$

In [None]:
v1

In [None]:
v2

#### `m1 @ m2`

In [None]:
v1.T @ v2     # 1*4 + 2*5 + 3*6

#### `.item()` gives you just the value

In [None]:
(v1.T @ v2).???   # pulls out the only number in the result
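
A short sketch of the difference between the 1 x 1 result of `@` and the plain number returned by `.item()` (`product` and `value` are illustrative names):

```python
import numpy as np

v1 = np.array([1, 2, 3]).reshape(-1, 1)   # shape (3, 1)
v2 = np.array([4, 5, 6]).reshape(-1, 1)   # shape (3, 1)

# (1, 3) @ (3, 1) gives a 1 x 1 array holding 1*4 + 2*5 + 3*6
product = v1.T @ v2
print(product.shape)

# .item() converts a single-element array to a Python scalar
value = product.item()
print(value)
```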

## Predicting with Matrix Multiplication

1. use case for matrix multiplication:
    - `y = Xc + b`
2. ones column
3. multiplying a matrix by a vector

$\begin{bmatrix}
1 & 2 \\ 3 & 4\\
\end{bmatrix}
\cdot
\begin{bmatrix}
10 \\ 1 \\
\end{bmatrix}$
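
The product above can be checked in NumPy. Each row of the matrix is multiplied elementwise with the column vector and summed; `m`, `v`, and `result` are illustrative names.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])         # shape (2, 2)
v = np.array([[10],
              [1]])            # shape (2, 1)

# Row 0: 1*10 + 2*1, row 1: 3*10 + 4*1
result = m @ v                 # shape (2, 1)
print(result)
```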

In [None]:
houses = pd.DataFrame([[2, 1, 1985],
                       [3, 1, 1998],
                       [4, 3, 2005],
                       [4, 2, 2020]],
                      columns=["beds", "baths", "year"])
houses

In [None]:
def predict_price(house):
    """
    Takes row (as Series) as argument,
    returns estimated price (in thousands)
    """
    return ((house["beds"]*42.3) + (house["baths"]*10) + 
            (house["year"]*1.67) - 3213)

predict_price(???)

In [None]:
# How do we convert a DataFrame into a numpy array?
X = houses.???
X

In [None]:
# Extract just first row of data
house0 = ???
house0

In [None]:
# Create a vertical array (3 x 1) with the coefficients
c = ???
c

In [None]:
# horizontal @ vertical
house0 @ c

`y = Xc + b`

In [None]:
house0 @ c - 3213
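
As a sanity check, the matrix form should agree with `predict_price` for the first house. The sketch below fills in the steps in one possible way (`.values` for the conversion, a `0:1` slice to keep `house0` two-dimensional); `by_matmul` and `by_formula` are illustrative names.

```python
import numpy as np
import pandas as pd

houses = pd.DataFrame([[2, 1, 1985],
                       [3, 1, 1998],
                       [4, 3, 2005],
                       [4, 2, 2020]],
                      columns=["beds", "baths", "year"])

def predict_price(house):
    """Estimated price (in thousands) for one row of houses."""
    return ((house["beds"] * 42.3) + (house["baths"] * 10) +
            (house["year"] * 1.67) - 3213)

X = houses.values                  # DataFrame -> 2D numpy array, shape (4, 3)
house0 = X[0:1, :]                 # first row, kept 2D: shape (1, 3)
c = np.array([42.3, 10, 1.67]).reshape(-1, 1)   # coefficients, shape (3, 1)

by_matmul = (house0 @ c - 3213).item()          # y = Xc + b with b = -3213
by_formula = predict_price(houses.iloc[0])
print(by_matmul, by_formula)
```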

Let's append the intercept to the `c` vector, so that a single matrix multiplication computes the whole prediction.

In [None]:
c = np.array([42.3, 10, 1.67, -3213]).reshape(-1, 1)
c

If we try the matrix multiplication directly now, it won't work because the shapes no longer match: `house0` has 3 columns but `c` has 4 rows.

In [None]:
house0 @ c

In [None]:
house0.shape

In [None]:
c.shape
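
The mismatch can be reproduced in isolation. In this sketch `house0` is written out by hand as a 1 x 3 array, and `matmul_failed` is an illustrative name.

```python
import numpy as np

house0 = np.array([[2, 1, 1985]])                        # shape (1, 3)
c = np.array([42.3, 10, 1.67, -3213]).reshape(-1, 1)     # shape (4, 1)

# The inner dimensions (3 and 4) must be equal for @ to work
try:
    house0 @ c
    matmul_failed = False
except ValueError as err:
    matmul_failed = True
    print("matmul failed:", err)
```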

#### Ones column

- Solution: add a column of 1s to `X` using `np.concatenate`

In [None]:
# How can we generate an array of 1's using numpy?
ones_column = ???
ones_column

In [None]:
X = ???
X

In [None]:
# Let's extract house0 again
house0 = X[0:1, :]
house0

In [None]:
# Let's try house0 @ c now
house0 @ c

In [None]:
# Extracting each house and doing the prediction
# Cumbersome
house0 = X[0:1, :]
print(house0 @ c)
house1 = X[1:2, :]
print(house1 @ c)
house2 = X[2:3, :]
print(house2 @ c)
house3 = X[3:4, :]
print(house3 @ c)

In [None]:
# all at once
???
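
One way the ones column and the all-at-once prediction might look, assuming `np.ones` for the new column. The feature matrix is written out by hand here so the sketch runs on its own.

```python
import numpy as np

# Features: beds, baths, year (same four houses as above)
X = np.array([[2, 1, 1985],
              [3, 1, 1998],
              [4, 3, 2005],
              [4, 2, 2020]])

# A (4, 1) column of 1s lets the intercept act as a fourth coefficient
ones_column = np.ones((X.shape[0], 1))
X = np.concatenate([X, ones_column], axis=1)    # shape (4, 4)

c = np.array([42.3, 10, 1.67, -3213]).reshape(-1, 1)   # shape (4, 1)

# (4, 4) @ (4, 1) predicts every house in one step
predictions = X @ c
print(predictions)
```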

### Fitting with `np.linalg.solve`

**Above:** we estimated house prices using a linear model based on matrix multiplication:

$Xc = y$

* $X$ (known) is a matrix with house features (from DataFrame)
* $c$ (known) is a vector of coefficients (our model parameters)
* $y$ (computed) are the prices

**Below:** what if $X$ and $y$ are known, and we want to find $c$?

In [None]:
houses = pd.DataFrame([[2, 1, 1985, 196.55],
                       [3, 1, 1998, 260.56],
                       [4, 3, 2005, 334.55],
                       [4, 2, 2020, 349.60]],
                      columns=["beds", "baths", "year", "price"])
houses

If we assume price is a linear function of the features, described by this equation:

* $beds*c_0 + baths*c_1 + year*c_2 + 1*c_3 = price$

Then we get four equations:

* $2*c_0 + 1*c_1 + 1985*c_2 + 1*c_3 = 196.55$
* $3*c_0 + 1*c_1 + 1998*c_2 + 1*c_3 = 260.56$
* $4*c_0 + 3*c_1 + 2005*c_2 + 1*c_3 = 334.55$
* $4*c_0 + 2*c_1 + 2020*c_2 + 1*c_3 = 349.60$


#### `c = np.linalg.solve(X, y)`

- documentation: https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html
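
A minimal sketch of solving the four equations above with `np.linalg.solve`, which requires a square, non-singular `X`. Adding the ones column by assigning `houses["ones"] = 1` is one possible approach, not necessarily the one intended by the exercise.

```python
import numpy as np
import pandas as pd

houses = pd.DataFrame([[2, 1, 1985, 196.55],
                       [3, 1, 1998, 260.56],
                       [4, 3, 2005, 334.55],
                       [4, 2, 2020, 349.60]],
                      columns=["beds", "baths", "year", "price"])

# Column of 1s so the intercept becomes the fourth unknown
houses["ones"] = 1

X = houses[["beds", "baths", "year", "ones"]].values   # shape (4, 4)
y = houses[["price"]].values                           # shape (4, 1)

# Solve Xc = y for c (4 equations, 4 unknowns)
c = np.linalg.solve(X, y)
print(c)

# Multiplying back should reproduce the prices
print(np.allclose(X @ c, y))
```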

In [None]:
# Add a column of 1s to this DataFrame
houses???
houses

In [None]:
X = houses[["beds", "baths", "year", "ones"]].values
X

In [None]:
y = houses[["price"]].values
y

In [None]:
# Let's take a look at the coefficients which we were using for our prediction
c

In [None]:
c = ???
c

In [None]:
X @ c

In [None]:
dream_house = ???
dream_house

In [None]:
dream_house @ c
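
A sketch of predicting a new house. The dream house here (5 beds, 3 baths, built in 2024) is purely hypothetical, and `c` is set to the coefficients used earlier in the lecture rather than recomputed.

```python
import numpy as np

# Coefficients for beds, baths, year, and the intercept
c = np.array([42.3, 10, 1.67, -3213]).reshape(-1, 1)   # shape (4, 1)

# Hypothetical house: beds, baths, year, plus the trailing 1 for the intercept
dream_house = np.array([[5, 3, 2024, 1]])              # shape (1, 4)

# 5*42.3 + 3*10 + 2024*1.67 - 3213
price = (dream_house @ c).item()
print(price)
```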