# Crash Course Lesson 3

In this lesson we learn about:

* **Linear Transformations** from $\mathbb{R}^n$ to $\mathbb{R}^m$.
* What a **matrix** is.
* Three different perspectives on the matrix-vector product $A \vec{x}$:
    * As a linear transformation applied to $\vec{x}$.
    * As a linear combination of the columns of $A$, weighted by the components of $\vec{x}$.
    * As the dot product of the rows of $A$ with $\vec{x}$.
* How to rephrase everything we learned in LA1 and LA2 using the three different perspectives.
* Understanding **matrix multiplication** as composition of linear maps.
* The **inverse** of a matrix.
* The **transpose** of a matrix $A^\top$.
* The **four fundamental subspaces** of a matrix $A$ and their interrelationships:
    * The **column space** which is the span of the columns of $A$.
    * The **row space** which is the span of the columns of $A^\top$ (which are the rows of $A$).
    * The **null space** which is the set of solutions of $A\vec{x} = \vec{0}$.
    * The **left null space** which is the set of solutions of $A^\top \vec{y} = \vec{0}$.

**Definition**:  A **linear transformation** is any function $L : \mathbb{R}^n \to \mathbb{R}^m$ which satisfies the following two conditions:
* Respects vector sums: If $\vec{v},\vec{w} \in \mathbb{R}^n$ are any two vectors then
$$
L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})
$$
* Respects scalar multiplication: If $c \in \mathbb{R}$ is any scalar and $\vec{v} \in \mathbb{R}^n$ is any vector then
$$
L(c\vec{v}) = cL(\vec{v})
$$

Note:  we could have combined these two conditions, and just said "$L$ respects linear combinations".  Could you write that intuitively phrased condition as a formal equation?

**Exercise 1**:  Let $L: \mathbb{R}^2 \to \mathbb{R}^3$ be a linear map.  Say you know that

$$
\begin{align*}
L\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\\
L\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\\
\end{align*}
$$

* Use the properties of linear transformations to figure out $L\left( \begin{bmatrix} 2 \\ 1 \end{bmatrix}\right)$.
* Use the properties of linear transformations to figure out $L\left( \begin{bmatrix} x \\ y \end{bmatrix}\right)$ for any two real numbers $x$ and $y$.

If you did the exercise you might now find the following idea to be believable:

**Idea**:  If you know the outputs of a linear transformation $L: \mathbb{R}^n \to \mathbb{R}^m$ for each of the standard basis vectors $\vec{e}_1, \vec{e}_2, \vec{e}_3, \dots, \vec{e}_n$, then you can figure out what the output of $L$ is for *any* input by taking an appropriate linear combination of the basis vector outputs.

We record this information in a **matrix** (2 dimensional array of numbers) as follows:

Let $L: \mathbb{R}^n \to \mathbb{R}^m$. The **matrix of $L$ with respect to the standard basis** (which we will write $M_L$) is an array of numbers with $m$ rows and $n$ columns.  The $j^{th}$ column is the output of the linear transformation when the input is $\vec{e}_j$.

$$
M_L = 
\begin{bmatrix}
\vert & \vert & \vert & \dots & \vert\\
L(\vec{e}_1) & L(\vec{e}_2) & L(\vec{e}_3) & ... & L(\vec{e}_n)\\
\vert & \vert & \vert & \dots & \vert
\end{bmatrix}
= 
\begin{bmatrix}
M_{1,1} & M_{1,2} & M_{1,3} & \dots & M_{1,n}\\
M_{2,1} & M_{2,2} & M_{2,3} & \dots & M_{2,n}\\
& & & \vdots & \\
M_{m,1} & M_{m,2} & M_{m,3} & \dots & M_{m,n}\\
\end{bmatrix}
$$

Note the convention that we are introducing here:  the entry of the matrix $M_{i,j}$ is in the $i^{th}$ row and $j^{th}$ column. 

**Example**:  The linear transformation we introduced in the exercise had

$$
\begin{align*}
L\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\\
L\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\\
\end{align*}
$$

So the matrix would be 

$$
M_L = \begin{bmatrix}1 & 1\\ 2 & -1 \\ 3 & 1 \end{bmatrix}
$$

For an example of indexing conventions: $M_{1,2} = 1$ and $M_{2,1} = 2$.
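
As a rough NumPy sketch of this construction (the names `L_e1`, `L_e2`, and `M_L` are illustrative choices, not fixed notation), we can stack the outputs of the standard basis vectors as columns. One thing to watch for: NumPy indexes from 0, so the entry written $M_{1,2}$ above is `M_L[0, 1]` in code.

```python
import numpy as np

# Outputs of L on the standard basis vectors e_1 and e_2 (from the example above).
L_e1 = np.array([1, 2, 3])
L_e2 = np.array([1, -1, 1])

# The j-th column of the matrix is L(e_j).
M_L = np.column_stack([L_e1, L_e2])
print(M_L.shape)   # (3, 2): m = 3 rows, n = 2 columns

# NumPy is 0-indexed, so M_{1,2} is M_L[0, 1] and M_{2,1} is M_L[1, 0].
print(M_L[0, 1], M_L[1, 0])
```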

**Exercise 2**:  Let $L: \mathbb{R}^n \to \mathbb{R}^m $ be the linear transformation with matrix

$$
M = \begin{bmatrix}
 1 & 2 & 3 & 4\\
 5 & 6 & 7 & 8
\end{bmatrix}
$$

* What is the dimension of the domain (aka what is $n$)?  What is the dimension of the codomain (aka what is $m$)?
* What is the output of the vector whose coordinates are all $1$ from the domain?
* Find a non-zero vector which is mapped to $\vec{0}$ by $L$.

**Practice**:  Try as many of [these exercises](https://teambasedinquirylearning.github.io/linear-algebra/2023/exercises/#/bank/AT1/1/) and [these exercises](https://teambasedinquirylearning.github.io/linear-algebra/2023/exercises/#/bank/AT2/1/) as you want until you feel comfortable understanding linear transformations, the matrix of a linear transformation, and how to multiply a matrix by a vector.  Work both by hand and with NumPy.

Hint: to multiply the matrix $M$ by the vector $v$ in NumPy use np.dot(M,v)

To gain a geometric understanding of linear maps, I highly recommend watching [this 3Blue1Brown video](https://www.youtube.com/watch?v=kYB8IZa5AuE).

**Three Perspectives** on matrix-vector products:

Consider the matrix vector product 

$$
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\end{bmatrix} \begin{bmatrix} 7 \\ 8 \\ 9\end{bmatrix}
$$

We can think of this three different ways:

1. We can think of it as a linear transformation $L : \mathbb{R}^3 \to \mathbb{R}^2$ being applied to a vector in $\mathbb{R}^3$.  From this perspective, the matrix is taking 3D space and "smashing it" onto 2D space in such a way that parallelograms always get mapped to parallelograms.  The one vector we are plugging in is just coming along for the ride.

2. We can think of it as a linear combination of the columns of the matrix:

$$
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\end{bmatrix} \begin{bmatrix} 7 \\ 8 \\ 9\end{bmatrix} = 7 \begin{bmatrix}1 \\ 4 \end{bmatrix}  + 8 \begin{bmatrix} 2 \\ 5\end{bmatrix} + 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix}
$$

From this perspective the matrix is just a list of column vectors, and a matrix-vector product is a recipe for forming a desired linear combination of those column vectors.

3. We can think of it as dotting the rows of the matrix with the vector:

$$
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6\end{bmatrix} \begin{bmatrix} 7 \\ 8 \\ 9\end{bmatrix} = \begin{bmatrix} 1(7) + 2(8) + 3(9) \\ 4(7) + 5(8) + 6(9)\end{bmatrix}
$$

This is a little harder to interpret, but is an especially useful perspective when $M \vec{v} = 0$:  it says that the rows of $M$ are all perpendicular to the vector $\vec{v}$.
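
The three perspectives can be sketched side by side in NumPy. This is only an illustration on the example above; the variable names are made up for this sketch, and `M @ v` is the operator form of `np.dot(M, v)` for these shapes.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])
v = np.array([7, 8, 9])

# Perspective 1: apply the linear map (the built-in matrix-vector product).
as_map = M @ v

# Perspective 2: linear combination of the columns, weighted by the entries of v.
as_columns = sum(v[j] * M[:, j] for j in range(M.shape[1]))

# Perspective 3: dot each row of M with v.
as_rows = np.array([np.dot(row, v) for row in M])

# All three give the same vector, with entries 50 and 122.
print(as_map, as_columns, as_rows)
```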

We define matrix multiplication so that it corresponds to composition of the associated linear maps:

**Definition**:  Let $A$ be an $n \times m$ matrix and $B$ be a $k \times n$ matrix.  Then we have associated linear maps $L_A: \mathbb{R}^m \to \mathbb{R}^n$ and $L_B: \mathbb{R}^n \to \mathbb{R}^k$.  We define the matrix product  $BA$ to be the matrix of the linear map $L_B \circ L_A: \mathbb{R}^m \to \mathbb{R}^k$.

In other words, if $A$ has columns $\vec{c}_1, \vec{c}_2, \vec{c}_3, \dots, \vec{c}_m \in \mathbb{R}^n$, so that

$$
A = 
\begin{bmatrix}
 \vert & \vert & \vert & \dots & \vert\\
 \vec{c}_1  &  \vec{c}_2  &  \vec{c}_3  &  \dots  &  \vec{c}_m  \\
  \vert & \vert & \vert & \dots & \vert
 \end{bmatrix}
$$

then 

$$
BA = 
\begin{bmatrix}
 \vert & \vert & \vert & \dots & \vert\\
 B\vec{c}_1  &  B\vec{c}_2  &  B\vec{c}_3  &  \dots  &  B\vec{c}_m  \\
  \vert & \vert & \vert & \dots & \vert
 \end{bmatrix}
$$

If $B$ has rows $\vec{r}_1, \vec{r}_2, \vec{r}_3, \dots, \vec{r}_k$ (one row for each of the $k$ output coordinates, each row having $n$ entries), so that

$$
B  = 
\begin{bmatrix} 
\rule[.5ex]{3.5em}{0.4pt} & \vec{r}_1 &  \rule[.5ex]{3.5em}{0.4pt}\\
\rule[.5ex]{3.5em}{0.4pt} & \vec{r}_2 &  \rule[.5ex]{3.5em}{0.4pt}\\
\rule[.5ex]{3.5em}{0.4pt} & \vec{r}_3 &  \rule[.5ex]{3.5em}{0.4pt}\\
\rule[.5ex]{3.5em}{0.4pt} & \vdots &  \rule[.5ex]{3.5em}{0.4pt}\\
\rule[.5ex]{3.5em}{0.4pt} & \vec{r}_k &  \rule[.5ex]{3.5em}{0.4pt}\\
\end{bmatrix}
$$

then we have

$$
BA = \begin{bmatrix} 
\vec{r}_1 \cdot \vec{c}_1 & \vec{r}_1 \cdot \vec{c}_2 & \vec{r}_1 \cdot \vec{c}_3 & \dots & \vec{r}_1 \cdot \vec{c}_m\\
\vec{r}_2 \cdot \vec{c}_1 & \vec{r}_2 \cdot \vec{c}_2 & \vec{r}_2 \cdot \vec{c}_3 & \dots & \vec{r}_2 \cdot \vec{c}_m\\
\vec{r}_3 \cdot \vec{c}_1 & \vec{r}_3 \cdot \vec{c}_2 & \vec{r}_3 \cdot \vec{c}_3 & \dots & \vec{r}_3 \cdot \vec{c}_m\\
\vdots & \vdots & \vdots & \vdots & \vdots\\
\vec{r}_k \cdot \vec{c}_1 & \vec{r}_k \cdot \vec{c}_2 & \vec{r}_k \cdot \vec{c}_3 & \dots & \vec{r}_k \cdot \vec{c}_m\\
\end{bmatrix}
$$

**Practice**:  Try as many of [these exercises](https://teambasedinquirylearning.github.io/linear-algebra/2023/exercises/#/bank/MX1/1/) as you want, both by hand and using NumPy, until you feel comfortable with matrix multiplication.

Hint:  the NumPy code for the matrix product $BA$ is np.dot(B,A).
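
Here is a small sketch, on made-up random integer matrices, checking that the product computed by `np.dot(B, A)` behaves the way the definition says: it represents "apply $A$, then $B$", its columns are $B$ applied to the columns of $A$, and its entries are row-column dot products. The shapes and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 2))   # n x m = 3 x 2, so L_A : R^2 -> R^3
B = rng.integers(-5, 5, size=(4, 3))   # k x n = 4 x 3, so L_B : R^3 -> R^4
x = rng.integers(-5, 5, size=2)        # an input vector in R^2

BA = np.dot(B, A)                      # k x m = 4 x 2

# Composition: applying A and then B agrees with applying BA once.
print(np.array_equal(np.dot(B, np.dot(A, x)), np.dot(BA, x)))

# Column picture: the j-th column of BA is B times the j-th column of A.
print(all(np.array_equal(BA[:, j], np.dot(B, A[:, j])) for j in range(A.shape[1])))

# Row-column picture: entry (i, j) of BA is (row i of B) . (column j of A).
print(BA[1, 0] == np.dot(B[1, :], A[:, 0]))
```

Because the entries are integers, the comparisons are exact; with floating-point matrices `np.allclose` would be the safer check.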

**Definition**:  The $n\times n$ **identity** matrix is the $n \times n$ matrix $I$ with ones on the diagonal and zeros everywhere else.  It has the property that for any $k \times n$ matrix $A$ or $n \times k$ matrix $B$ we have

$$
AI = A
$$

$$
IB = B
$$

In [15]:
import numpy as np

#Example:

A = np.random.random((2,3))
B = np.random.random((3,2))
I = np.eye(3)

#If A is equal to AI and B is equal to IB, both of the following will be True.

print(np.array_equal(A, np.dot(A,I)), np.array_equal(B, np.dot(I,B)))

True True


**Definition**:  Let  $A$ be an $n \times n$ matrix.  $A$  **has an inverse** if and only if there is a matrix $A^{-1}$ with

$$
AA^{-1} = A^{-1}A = I
$$

In [16]:
# Example:

A =  np.random.random((5,5))
A_inv = np.linalg.inv(A)

print("A is \n", A)
print("The inverse of A is \n",A_inv)

# The following are both True if A A_inv = I and A_inv A = I (after rounding to 5 decimal places)
print(np.array_equal(np.eye(5), np.round(np.dot(A,A_inv), 5)))
print(np.array_equal(np.eye(5), np.round(np.dot(A_inv,A), 5)))

A is 
 [[1.18991450e-01 4.75920505e-01 8.84478968e-01 8.54399392e-02
  1.17582013e-01]
 [3.96987954e-02 9.41267488e-01 4.57710041e-01 8.46848802e-01
  6.23257425e-04]
 [7.87653555e-01 5.03543033e-01 7.97327498e-01 8.53321324e-01
  6.63456076e-01]
 [5.53349912e-01 5.44573296e-01 3.19998403e-01 7.39816710e-01
  1.64055084e-01]
 [9.32238423e-01 8.99669016e-01 9.75981682e-01 8.16359179e-01
  2.02184540e-01]]
The inverse of A is 
 [[ -10.49889363    4.32744479    3.38650446  -20.14054562   11.32206186]
 [ -51.21146545   28.38218893   19.52941591 -106.5781967    52.08926006]
 [  31.28845542  -16.56262247  -11.68369822   62.48164291  -30.5040502 ]
 [  40.53069821  -21.63227504  -15.56301531   85.69398391  -41.96836817]
 [ -38.39920731   21.04877119   16.72257366  -80.50624143   37.66213943]]
True
True


**Theorem**:  An $n \times n$ matrix has an inverse if and only if its columns form a basis of $\mathbb{R}^n$.

This should make some intuitive sense from perspective 2:

$$
\begin{align*}
\textrm{Columns of $A$ form a basis} 
&\Longleftrightarrow \textrm{The columns of $A$ are linearly independent and span $\mathbb{R}^n$}\\
&\Longleftrightarrow \textrm{For every vector $\vec{v}$ in $\mathbb{R}^n$ there is one and only one linear combination of the columns yielding that vector}\\
&\Longleftrightarrow \textrm{For every vector $\vec{v}$ in $\mathbb{R}^n$ there is one and only one vector $\vec{\beta}$ for which $A\vec{\beta} = \vec{v}$ }\\
\end{align*}
$$
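
To see the theorem in action, here is a minimal sketch comparing a matrix whose columns are linearly independent with one whose second column is a multiple of the first. The helper `has_inverse` is a hypothetical name introduced only for this illustration. It relies on `np.linalg.inv` raising `np.linalg.LinAlgError` for an exactly singular matrix; for matrices that are only *nearly* singular in floating point, no error may be raised, so treat this as a demonstration rather than a robust test.

```python
import numpy as np

def has_inverse(A):
    """Return True if np.linalg.inv succeeds on A (hypothetical helper)."""
    try:
        np.linalg.inv(A)
        return True
    except np.linalg.LinAlgError:
        return False

# Columns (1, 1) and (1, -1) are linearly independent, so they form a basis of R^2.
independent = np.array([[1.0, 1.0],
                        [1.0, -1.0]])

# Second column is 2 times the first, so the columns do not form a basis.
dependent = np.array([[1.0, 2.0],
                      [2.0, 4.0]])

print(has_inverse(independent), has_inverse(dependent))
```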

**Idea**:  The inverse of a matrix is useful for solving a system of equations, when it exists.

For example to solve

$$
\begin{cases}
x + y + z &= 6\\
x - y - 2z&= 4\\
3x + 2y + z &= 7
\end{cases}
$$

you can reformulate this as

$$
\begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & -2 \\ 3 & 2 & 1 \end{bmatrix}  \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 6 \\ 4 \\ 7\end{bmatrix}
$$

and apply the inverse of the matrix (which I will call $A$) to both sides on the left to get

$$
\begin{bmatrix} x \\ y \\ z \end{bmatrix} = A^{-1} \begin{bmatrix} 6 \\ 4 \\ 7\end{bmatrix}
$$

In [17]:
A = np.array([[1,1,1],[1,-1,-2],[3,2,1]])
Ainv = np.linalg.inv(A)
v = np.array([[6],[4],[7]])
solution = np.dot(Ainv,v)
print(solution)
print("checking")
# Round before converting: int() alone truncates, so 14.999... would print as 14.
x, y, z = (int(round(s)) for s in solution[:, 0])
print(x, " + ", y, " + ", z, " = ", v[0,0])
print(x, " -(", y, ") -2(", z, ") = ", v[1,0])
print("3(", x, ") + 2(", y, ") + ", z, " = ", v[2,0])

[[ 15.]
 [-29.]
 [ 20.]]
checking
15  +  -29  +  20  =  6
15  -( -29 ) -2( 20 ) =  4
3( 15 ) + 2( -29 ) +  20  =  7
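
As an alternative sketch: NumPy also provides `np.linalg.solve`, which solves $A\vec{x} = \vec{v}$ directly without explicitly forming $A^{-1}$, and is generally the preferred route numerically. The check at the end substitutes the answer back into the system instead of printing each equation.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, -1, -2],
              [3, 2, 1]])
v = np.array([6, 4, 7])

# Solve A x = v directly, without forming the inverse.
solution = np.linalg.solve(A, v)
print(solution)

# Check by substituting back into the system (allclose tolerates rounding error).
print(np.allclose(np.dot(A, solution), v))
```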


**Definition**:  The **transpose** of an $m \times n$ matrix $A$ is the $n \times m$ matrix $A^\top$ whose entries have their indices flipped.  In other words, the rows of $A$ are the columns of $A^\top$. For example:

$$
\begin{bmatrix} 
1 &  2 & 3 \\
4 & 5 & 6
\end{bmatrix}^\top = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
$$

In [18]:
# python example

A = np.random.randint(-10,10,(2,3))
print('A is \n', A)
print('The transpose of A is \n', np.transpose(A))

A is 
 [[-9  8  0]
 [ 5 -3  1]]
The transpose of A is 
 [[-9  5]
 [ 8 -3]
 [ 0  1]]


**Idea**:  If you want to find all of the vectors perpendicular to the span of the columns of $A$, this is equivalent to finding all of the vectors which are mapped to $0$ by $A^\top$:

$$
\begin{align*}
A^\top \vec{v} = 0 
&\Longleftrightarrow \vec{v} \cdot \textrm{(every row of $A^\top$)} = 0 \textrm{ , by ``perspective 3"}\\
&\Longleftrightarrow \vec{v} \cdot \textrm{(every column of $A$)} = 0 \textrm{ , since rows of $A^\top$ are columns of $A$}\\
&\Longleftrightarrow \vec{v} \textrm{ is orthogonal to the span of the columns of $A$}
\end{align*}
$$
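
A quick numerical illustration of this chain of equivalences, using the $3 \times 2$ matrix from the earlier example. The vector `v` below was picked by hand for this sketch so that it is perpendicular to both columns.

```python
import numpy as np

A = np.array([[1, 1],
              [2, -1],
              [3, 1]])

# A hand-picked vector that is perpendicular to both columns of A.
v = np.array([5, 2, -3])

# Perspective 3: A^T v dots each column of A with v, so the result is the zero vector.
print(np.dot(A.T, v))

# Consequently v is orthogonal to any linear combination of the columns of A.
w = 2 * A[:, 0] - 7 * A[:, 1]
print(np.dot(v, w))
```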

**Important relationship**:  if $\vec{x} \in \mathbb{R}^n$, $\vec{y} \in \mathbb{R}^m$ and $A$ is an $m \times n$ matrix, then  

$$
\vec{y} \cdot A\vec{x} = A^\top \vec{y} \cdot \vec{x}
$$

or (using the alternative notation for dot products)

$$
\langle \vec{y}, A  \vec{x}\rangle = \langle A^\top \vec{y}, \vec{x} \rangle
$$
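
This relationship is easy to spot-check numerically. The sketch below uses arbitrary random integer data (the sizes and seed are illustrative assumptions), so the two sides can be compared exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.integers(-10, 10, size=(m, n))   # an m x n matrix
x = rng.integers(-10, 10, size=n)        # x in R^n
y = rng.integers(-10, 10, size=m)        # y in R^m

lhs = np.dot(y, np.dot(A, x))            # y . (A x)
rhs = np.dot(np.dot(A.T, y), x)          # (A^T y) . x
print(lhs, rhs, lhs == rhs)
```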

**Definitions**:

* The **image** of a linear transformation $L: \mathbb{R}^n \to \mathbb{R}^m$ is the collection of all vectors $\vec{w} \in \mathbb{R}^m$ for which we can find a $\vec{v} \in \mathbb{R}^n$ with $L(\vec{v}) = \vec{w}$. It is a subspace of $\mathbb{R}^m$. You may have heard the same concept referred to as the "range" in a pre-Calculus or Calculus class.
    * Note:  From "perspective 2" on matrix-vector products, we can also think of this as the span of the columns.  For this reason the image of the linear transformation is also called the **column space** of the associated matrix.
    * Note: We will use the notation $\textrm{Im}(L)$ (read:  "image of $L$") interchangeably with $\textrm{Col}(M_L)$ (read: "column space of $M_L$").
* The **null space** of a linear transformation $L : \mathbb{R}^n \to \mathbb{R}^m$ is the collection of all vectors $\vec{v} \in \mathbb{R}^n$ which are sent to $\vec{0}$ by $L$.  It is a subspace of $\mathbb{R}^n$.  We will write $\textrm{Null}(L)$ or $\textrm{Null}(M_L)$ for this subspace.
    * Note:  If $\textrm{Null}(L) \neq \{\vec{0}\}$, then (using perspective 2 again) some nontrivial linear combination of the columns of $M_L$ equals $\vec{0}$, so the columns are linearly dependent.  On the other hand if $\textrm{Null}(L) = \{\vec{0}\}$, then the columns of $M_L$ are linearly independent.  You can also check that this is equivalent to the map $L$ being one-to-one.
* The **row space** of  $A$ is the column space of $A^\top$.  In other words, it is the span of the rows of $A$.  We will just write $\textrm{Col}(A^\top)$ for this space.
* The **left null space** of $A$ is the null space of $A^\top$.  We will just write $\textrm{Null}(A^\top)$ for this space.
    * Earlier we said "If you want to find all of the vectors perpendicular to the span of the columns of $A$, this is equivalent to finding all of the vectors which are mapped to $0$ by $A^\top$".  We can rephrase this as $\textrm{Null}(A^\top) = \textrm{Col}(A)^\perp$.

We can use these ideas to write a new implementation of the orthogonal projection function we wrote in the second lesson which does **not** rely on Gram-Schmidt.

Say we have a vector $\vec{y} \in \mathbb{R}^m$.  We also have vectors $\vec{y}_1, \vec{y}_2, ..., \vec{y}_n \in \mathbb{R}^m$.  We want to project $\vec{y}$ onto the span of $\vec{y}_1, \vec{y}_2, ..., \vec{y}_n$.

Let $A$ be the $m \times n$ matrix with columns $\vec{y}_1, \vec{y}_2, ..., \vec{y}_n$.  Then we can restate our problem using the new vocabulary as "We want to project $\vec{y}$ onto the column space of $A$".

Call the projected vector $\hat{y}$, and let $\vec{r} = \vec{y} - \hat{y}$.

Since $\hat{y} \in \textrm{Col}(A)$, we know that $\hat{y}$ is a linear combination of the columns of $A$.  However, using the equivalence of perspectives $1$ and $2$, we can interpret this as meaning that there is a vector $\vec{\beta} \in \mathbb{R}^n$ with $\hat{y} = A \vec{\beta}$.

We want $\vec{r}$ to be perpendicular to the columns of $A$.  As explained above, this is equivalent to $\vec{r}$ being in the null space of $A^\top$.

Putting it together we want

$$
\begin{align*}
A^\top (\vec{y}  - A \vec{\beta}) &= \vec{0}\\
A^\top \vec{y} &= A^\top A \vec{\beta} 
\end{align*}
$$

If the columns of $A$ are linearly independent, then $A^\top A$ (a square matrix!) will be invertible, and we can solve

$$
\begin{align*}
\vec{\beta} &= (A^\top A)^{-1} A^\top \vec{y}\\
A\vec{\beta} &= A(A^\top A)^{-1} A^\top \vec{y}\\
\hat{y} &=  A(A^\top A)^{-1} A^\top \vec{y}\\
\end{align*}
$$

So we have the following formula for the projection of a vector $\vec{y}$ onto the column space of the matrix $A$ (assuming these columns are linearly independent):

$$
\textrm{proj}_{\textrm{Col}(A)} (\vec{y}) = A(A^\top A)^{-1} A^\top \vec{y}
$$


**Exercise 3**: Implement 

$$
\textrm{proj}_{\textrm{Col}(A)} (\vec{y}) = A(A^\top A)^{-1} A^\top \vec{y}
$$

as a python function:

In [19]:
#A and y are numpy arrays

#def proj(A, y):
    # your code here

# Exercise Solutions

> **Exercise 1**:  Let $L: \mathbb{R}^2 \to \mathbb{R}^3$ be a linear map.  Say you know that
> 
> $$
> \begin{align*}
> L\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\\
> L\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) &= \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\\
> \end{align*}
> $$
> 
> * Use the properties of linear transformations to figure out $L\left( \begin{bmatrix} 2 \\ 1 \end{bmatrix}\right)$.

$$
\begin{align*}
L\left( \begin{bmatrix} 2 \\ 1 \end{bmatrix}\right) 
&=  L\left( \begin{bmatrix} 2 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) \\
&=  L\left( \begin{bmatrix} 2 \\ 0 \end{bmatrix} \right) +  L\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)  \textrm{ since $L$ respects vector sums}\\
&=  L\left( 2\begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) +  L\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)\\
&=  2L\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) +  L\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) \textrm{ since $L$ respects scalar multiplication}\\
&=2\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\\
&= \begin{bmatrix} 3 \\ 3 \\ 7\end{bmatrix}
\end{align*}
$$

> * Use the properties of linear transformations to figure out $L\left( \begin{bmatrix} x \\ y \end{bmatrix}\right)$ for any two real numbers $x$ and $y$.

$$
\begin{align*}
L\left( \begin{bmatrix} x \\ y \end{bmatrix}\right) 
&=  L\left( \begin{bmatrix} x \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ y \end{bmatrix}\right) \\
&=  L\left( \begin{bmatrix} x \\ 0 \end{bmatrix} \right) +  L\left(\begin{bmatrix} 0 \\ y \end{bmatrix}\right)  \textrm{ since $L$ respects vector sums}\\
&=  L\left( x \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) +  L\left( y \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right)\\
&=  xL\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) +  yL\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) \textrm{ since $L$ respects scalar multiplication}\\
&=x\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + y\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\\
&= \begin{bmatrix} x+ y \\ 2x-y \\ 3x + y\end{bmatrix}
\end{align*}
$$
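
If you want to double-check both answers with NumPy, one possible sketch is below. It uses the matrix $M_L$ of this transformation and spot-checks the general formula at one arbitrarily chosen point, which is a sanity check rather than a proof.

```python
import numpy as np

# Matrix of L: its columns are L(e_1) and L(e_2).
M_L = np.array([[1, 1],
                [2, -1],
                [3, 1]])

# First part: L applied to (2, 1) should have entries 3, 3, 7.
print(np.dot(M_L, np.array([2, 1])))

# Second part: compare M_L (x, y) with (x + y, 2x - y, 3x + y) at an arbitrary point.
x, y = 4, -5
print(np.array_equal(np.dot(M_L, np.array([x, y])),
                     np.array([x + y, 2*x - y, 3*x + y])))
```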

> **Exercise 2**:  Let $L: \mathbb{R}^n \to \mathbb{R}^m $ be the linear transformation with matrix
> 
> $$
> M = \begin{bmatrix}
>  1 & 2 & 3 & 4\\
> 5 & 6 & 7 & 8
> \end{bmatrix}
> $$
> 
> * What is the dimension of the domain (aka what is $n$)?  What is the dimension of the codomain (aka what is $m$)?

Each column corresponds to the output of a standard basis vector of the domain.  Since there are 4 columns, we must have $n = 4$.  Since each column is 2 dimensional we have $m = 2$.

> * What is the output of the vector whose coordinates are all $1$ from the domain?

$$
\begin{align*}
L\left( \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}\right) 
&= L\left( \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}\right)  + L\left( \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}\right) +  L\left( \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}\right) +  L\left( \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right)\\
& = \begin{bmatrix} 1 \\ 5\end{bmatrix} + \begin{bmatrix} 2 \\ 6\end{bmatrix} +  \begin{bmatrix} 3 \\ 7\end{bmatrix} + \begin{bmatrix} 4 \\ 8\end{bmatrix}\\
&= \begin{bmatrix} 10 \\ 26 \end{bmatrix}
\end{align*}
$$

> * Find a non-zero vector which is mapped to $\vec{0}$ by $L$.

Let $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$ be a vector which is mapped to $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ by $L$.  We have


$$
\begin{align*}
L\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) 
&= x_1L\left( \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}\right)  + x_2L\left( \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}\right) +  x_3L\left( \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}\right) +  x_4L\left( \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right)\\
& = x_1\begin{bmatrix} 1 \\ 5\end{bmatrix} + x_2\begin{bmatrix} 2 \\ 6\end{bmatrix} +  x_3\begin{bmatrix} 3 \\ 7\end{bmatrix} + x_4\begin{bmatrix} 4 \\ 8\end{bmatrix}\\
&= \begin{bmatrix} x_1 + 2x_2 + 3x_3 + 4x_4 \\ 5x_1 + 6x_2 + 7x_3 + 8x_4 \end{bmatrix}
\end{align*}
$$

So we need to find solutions to 

$$
\begin{cases}
1x_1 + 2x_2 + 3x_3 + 4x_4 &= 0\\
5x_1 + 6x_2 + 7x_3 + 8x_4 &= 0 
\end{cases}
$$

We can solve this with SymPy to get all the solutions: 

In [20]:
from sympy import linsolve, symbols
x_1, x_2, x_3, x_4 = symbols("x_1, x_2, x_3,x_4")
Eqns = [1*x_1 + 2*x_2 + 3*x_3 + 4*x_4, 5*x_1 + 6*x_2 + 7*x_3 + 8*x_4]
linsolve(Eqns, x_1, x_2, x_3, x_4)

{(x_3 + 2*x_4, -2*x_3 - 3*x_4, x_3, x_4)}

We can let $x_3$ and $x_4$ be anything we like and get a solution this way.  To be explicit about it, let's get one solution with $x_3 = 1$ and $x_4 = 0$.  Then $x_1 = 1$ and $x_2 = -2$ so 

$$
\begin{bmatrix}
1\\
-2\\
1\\
0
\end{bmatrix}
$$

is one solution.  Similarly 

$$
\begin{bmatrix}
2\\
-3\\
0\\
1
\end{bmatrix}
$$

is another solution, and together these two vectors span the space of all possible solutions (we will learn that this is called the "null space").
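
As a cross-check, SymPy can also produce a spanning set for these solutions directly from the matrix via `Matrix.nullspace()`. The sketch below verifies that each returned vector, and an arbitrarily chosen linear combination of them, is sent to the zero vector.

```python
from sympy import Matrix, zeros

M = Matrix([[1, 2, 3, 4],
            [5, 6, 7, 8]])

# A list of column vectors that span the set of solutions of M x = 0.
basis = M.nullspace()
for b in basis:
    print(b.T, (M * b).T)      # each vector, followed by its image under M

# Any linear combination of these vectors is also sent to the zero vector.
combo = 3 * basis[0] - 2 * basis[1]
print((M * combo).T)
```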

> **Exercise 3**: Implement 
> 
> $$
> \textrm{proj}_{\textrm{Col}(A)} (\vec{y}) = A(A^\top A)^{-1} A^\top \vec{y}
> $$
> 
> as a python function:

In [21]:
#A and y are numpy arrays

def proj(A, y):
    At = np.transpose(A)
    AtA = np.dot(At, A)
    AtAinv = np.linalg.inv(AtA)
    matrices = [A, AtAinv, At, y]
    y_hat = np.linalg.multi_dot(matrices)
    return y_hat

# Note:  you can test that this gives you the same results as the much more complicated function orthoproj from the LA2-dot-product notebook.
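
One way to sanity-check the function without the earlier notebook is to test the two properties a projection should have: the residual $\vec{y} - \hat{y}$ is perpendicular to the columns of $A$, and projecting a second time changes nothing. The sketch below repeats a compact version of `proj` so that it runs on its own; the matrix `A` and vector `y` are made-up test data.

```python
import numpy as np

def proj(A, y):
    # Same formula as the solution above: A (A^T A)^{-1} A^T y.
    At = np.transpose(A)
    return np.linalg.multi_dot([A, np.linalg.inv(np.dot(At, A)), At, y])

# Hypothetical test data: two linearly independent columns in R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

y_hat = proj(A, y)
r = y - y_hat

print(y_hat)
print(np.allclose(np.dot(A.T, r), 0))      # residual is perpendicular to the columns of A
print(np.allclose(proj(A, y_hat), y_hat))  # projecting again changes nothing
```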