# Matrix = some column vectors

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d

In [None]:
def make_blobs(N=150, k=3, d=2, seed=None):
    """
    Input:
        N: an integer, number of samples
        k: an integer, number of blobs
        d: an integer, dimension of the space
        seed: an integer or None, seed for NumPy's random number generator
    Output:
        a dataset X of shape (N, d)
    """
    np.random.seed(seed)
    X = np.random.randn(N,d)
    blob_size = N // k
    centers = np.random.randn(k, d) * 3
    for i in range(k):
        left = blob_size * i
        right = blob_size * (i+1) if i != k-1 else N
        X[left:right] += centers[i]
    return X

## Main idea

Let $S = \{{\bf u}_1, \ldots, {\bf u}_n\}$ be a collection of vectors.  
A **linear combination** of $S$ is a vector of the form  
$${\bf v} = c_1{\bf u}_1 + \cdots + c_n{\bf u}_n$$
where $c_1,\ldots,c_n$ are real numbers.  
The **span** of $S$, denoted by $\operatorname{span}(S)$, is the set of all linear combinations of $S$.

Let  
$$A = \begin{bmatrix} 
 | & ~ & | \\
 {\bf u}_1 & \cdots & {\bf u}_n \\
 | & ~ & | \\
\end{bmatrix}$$
be an $m\times n$ matrix.  Let 
$${\bf v} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}$$
be a vector in $\mathbb{R}^n$.  

Then  
$$A{\bf v} = c_1{\bf u}_1 + \cdots + c_n{\bf u}_n$$  
and  
$$\{A{\bf v}: {\bf v}\in\mathbb{R}^n\} = \operatorname{span}(\{{\bf u}_1, \ldots, {\bf u}_n\}),$$
which is called the **column space** $\operatorname{Col}(A)$ of $A$.
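
The identity $A{\bf v} = c_1{\bf u}_1 + \cdots + c_n{\bf u}_n$ can be checked on a small example. The sketch below is only an illustration; the vectors `u1`, `u2` and the coefficients `c1`, `c2` are arbitrary values chosen here, not part of the exercises.

```python
import numpy as np

# Two column vectors in R^3 (arbitrary illustrative values)
u1 = np.array([1.0, -1.0, 0.0])
u2 = np.array([1.0, 0.0, -1.0])
A = np.column_stack([u1, u2])   # 3x2 matrix whose columns are u1 and u2

# Coefficient vector v = (c1, c2)
c1, c2 = 2.0, -3.0
v = np.array([c1, c2])

# The matrix-vector product and the explicit linear combination
lhs = A.dot(v)
rhs = c1 * u1 + c2 * u2

print(lhs)
print(rhs)
print(np.allclose(lhs, rhs))
```

Every choice of `v` gives one vector in the column space of `A`, and every vector in the column space arises this way.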

## Side stories

- space, column space
- `np.meshgrid`
- center of mass
- shift the data
- NumPy broadcasting

## Experiments

##### Exercise 1
Let 
```python
A = np.array([[1,-1], 
              [1,1]])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()
```

###### 1(a)
Plot the points ${\bf v}$ where the x,y-coordinates are stored in `xs` and `ys`, respectively.
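
To see what `np.meshgrid` followed by `ravel` produces, it may help to look at a smaller grid first. The sketch below uses a 3-by-2 grid (an assumption made here for readability, instead of the 5-by-5 grid above) and lists the resulting points.

```python
import numpy as np

# A small grid: x in {0, 1, 2}, y in {0, 1}
gx, gy = np.meshgrid(np.arange(3), np.arange(2))
print(gx)   # x-coordinate of every grid point, shape (2, 3)
print(gy)   # y-coordinate of every grid point, shape (2, 3)

# ravel flattens both arrays in the same order,
# so (xs[i], ys[i]) is the i-th grid point
xs = gx.ravel()
ys = gy.ravel()
points = list(zip(xs.tolist(), ys.tolist()))
print(points)
```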

In [None]:
### your answer here
A = np.array([[1,-1], 
              [1,1]])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

plt.scatter(xs,ys)

###### 1(b)
Plot the points $A{\bf v}$, where the x,y-coordinates of ${\bf v}$ are stored in `xs` and `ys`, respectively.  
Hint:  You might need the function `np.vstack` .

In [None]:
### your answer here
A = np.array([[1,-1], 
              [1,1]])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

v = np.vstack([xs,ys])
Av = A.dot(v)
plt.axis('equal')
plt.scatter(Av[0],Av[1])

###### 1(c)
Draw the column vectors of $A$ on the figure that you drew earlier.

In [None]:
### your answer here
A = np.array([[1,-1], 
              [1,1]])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

v = np.vstack([xs, ys])
Av = A.dot(v)

plt.scatter(Av[0], Av[1])
plt.axis('equal')
plt.arrow(0, 0, A[0,0], A[1,0], color='red', 
          head_width=0.3, length_includes_head=True)
plt.arrow(0, 0, A[0,1], A[1,1], color='green', 
          head_width=0.3, length_includes_head=True)

##### Exercise 2
Let 
```python
A = np.array([[1,1,1],
              [-1,0,0],
              [0,-1,0]])
B = np.array([[1,1,0],
              [-1,0,1],
              [0,-1,-1]])
grid = np.meshgrid(np.arange(5), np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()
zs = grid[2].ravel()
```

###### 2(a)
Draw the grid using the columns of $A$ in three-dimensional space.  
Remember that you need the following to set up 3D axes.  
```python
%matplotlib notebook
ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)
```

In [None]:
### your answer here

ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)

A = np.array([[1,1,1],
              [-1,0,0],
              [0,-1,0]])
grid = np.meshgrid(np.arange(5), np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()
zs = grid[2].ravel()

v = np.vstack([xs, ys, zs])
Av = A.dot(v)
ax.scatter(Av[0], Av[1], Av[2])
# ax.scatter is used because plt.scatter accepts only x and y; the 3D axes method also takes z

###### 2(b)
Draw the grid using the columns of $B$ in three-dimensional space.  
What's the main difference between (a) and (b)?
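
The visual difference between the two grids can also be checked numerically. The sketch below uses `np.linalg.matrix_rank`, which is not used elsewhere in this notebook and is brought in here only as one possible check, assuming the notion of rank is familiar.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [-1, 0, 0],
              [0, -1, 0]])
B = np.array([[1, 1, 0],
              [-1, 0, 1],
              [0, -1, -1]])

# The rank is the dimension of the column space
rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
print(rank_A, rank_B)

# The third column of B is a combination of the first two
print(np.array_equal(B[:, 2], B[:, 1] - B[:, 0]))
```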

In [None]:
### your answer here

ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)

B = np.array([[1,1,0],
              [-1,0,1],
              [0,-1,-1]])

grid = np.meshgrid(np.arange(5), np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()
zs = grid[2].ravel()

v = np.vstack([xs, ys, zs])
Bv = B.dot(v)
ax.scatter(Bv[0], Bv[1], Bv[2])
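# The grid in (a) fills a three-dimensional region, but the grid in (b) lies in a plane,
# because the third column of B equals the second column minus the first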

###### 2(c)
Let ${\bf u}_1, {\bf u}_2, {\bf u}_3$ be the column vectors of $A$.  
Draw the grid using $S = \{{\bf u}_1, {\bf u}_2\}$ and draw an arrow for ${\bf u}_3$.  
Is ${\bf u}_3$ in $\operatorname{span}(S)$?  
Hint:  You might need `ax.quiver`.
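
Whether ${\bf u}_3$ lies in $\operatorname{span}(S)$ can also be tested without a picture. The sketch below is one possible approach, not the method the exercise asks for: it uses `np.linalg.lstsq` to find the combination of ${\bf u}_1, {\bf u}_2$ closest to ${\bf u}_3$ and then measures the remaining gap. The names `closest`, `gap`, and `in_span` are introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [-1, 0, 0],
              [0, -1, 0]], dtype=float)
S = A[:, :2]    # columns u1 and u2
u3 = A[:, 2]    # column u3

# Coefficients c that minimize the length of S c - u3
c, residuals, rank, sv = np.linalg.lstsq(S, u3, rcond=None)
closest = S.dot(c)
gap = np.linalg.norm(closest - u3)
print(c, gap)

# u3 is in span(S) exactly when the gap is (numerically) zero
in_span = bool(np.isclose(gap, 0.0))
print(in_span)
```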

In [None]:
### your answer here

ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)

A = np.array([[1,1,1],
              [-1,0,0],
              [0,-1,0]])

grid = np.meshgrid(np.arange(5), np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()
zs = grid[2].ravel()

S = np.array([[1,1],
              [-1,0],
              [0,-1]])

u3 = A[:,2]
v = np.vstack([xs, ys])
Sv = S.dot(v)

ax.scatter(Sv[0], Sv[1], Sv[2])
ax.quiver(0, 0, 0, u3[0], u3[1], u3[2], color='red')

# u3 isn't in span(S): the arrow points out of the plane of grid points

## Exercises

##### Exercise 3
Let  
```python
x = np.array([0,1,2])
y = np.array([3,4,5])
```

###### 3(a)
Guess and understand the meaning of `x - y` .

In [None]:
### your answer here
#Guess: [0,1,2] - [3,4,5] = [-3,-3,-3]
x = np.array([0,1,2])
y = np.array([3,4,5])
x - y

###### 3(b)
Guess and understand the meaning of `x[:,np.newaxis] - y` .

In [None]:
### your answer here
#Guess:no idea
x = np.array([0,1,2])
y = np.array([3,4,5])
x[:,np.newaxis] - y

###### 3(c)
Guess and understand the meaning of `x[:,np.newaxis] - y[np.newaxis,:]` .

In [None]:
### your answer here
#Guess:no idea
x = np.array([0,1,2])
y = np.array([3,4,5])
x[:,np.newaxis] - y[np.newaxis,:]

###### 3(d)
Let  
```python
ys = np.arange(15).reshape(3,5)
```
Guess and understand the meaning of `x[:,np.newaxis] - ys` .
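
The four expressions in this exercise differ mainly in the shapes that NumPy broadcasting produces. The sketch below prints those shapes; the names `col`, `row`, and `table` are introduced here only for illustration.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([3, 4, 5])
ys = np.arange(15).reshape(3, 5)

col = x[:, np.newaxis]      # shape (3, 1): x as a column
row = y[np.newaxis, :]      # shape (1, 3): y as a row

print((x - y).shape)        # same shapes, entrywise difference: (3,)
print((col - y).shape)      # (3, 1) against (3,) broadcasts to (3, 3)
print((col - row).shape)    # (3, 1) against (1, 3) broadcasts to (3, 3)
print((col - ys).shape)     # (3, 1) against (3, 5) broadcasts to (3, 5)

# Entry (i, j) of col - y is x[i] - y[j]
table = col - y
print(table)
```

In the last case, `x[i]` is subtracted from every entry of row `i` of `ys`.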

In [None]:
### your answer here
#Guess:no idea
ys = np.arange(15).reshape(3,5)
x = np.array([0,1,2])
y = np.array([3,4,5])
x[:,np.newaxis] - ys

##### Exercise 4
Let  
```python
A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,1,1])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)
```

###### 4(a)
Draw a red point at the origin and an arrow for `p` .

In [None]:
### your answer here

ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)

A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,1,1])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)
ax.scatter(0,0,0, color = 'r')
ax.quiver(0,0,0,1,1,1)

###### 4(b)
Let `shifted_vs = p[:,np.newaxis] + new_vs` .  
What is the meaning of `shifted_vs` ?  
Draw the points (columns) in `shifted_vs` along with a red point at the origin and an arrow for `p` .  
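
The effect of `p[:,np.newaxis] + ...` is easier to read on a tiny array. In the sketch below, `pts` is a hypothetical 3-by-3 array of three points stored as columns (values chosen arbitrarily); the loop checks that each column is moved by `p`.

```python
import numpy as np

p = np.array([1, 1, 1])
# Three points stored as the columns of a 3x3 array (arbitrary example values)
pts = np.array([[0, 1, 2],
                [0, -1, 0],
                [0, 0, -2]])

# p as a (3, 1) column is broadcast across all columns of pts
shifted = p[:, np.newaxis] + pts

# Column j of shifted is p + (column j of pts)
for j in range(pts.shape[1]):
    print(np.array_equal(shifted[:, j], p + pts[:, j]))
print(shifted)
```

Without `np.newaxis`, NumPy would try to match the length of `p` against the number of columns; for a 3-by-25 array such as `new_vs`, this raises a shape error.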

In [None]:
### your answer here

ax = plt.axes(projection='3d')
ax.set_xlim(-5,5)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)

A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,1,1])
grid = np.meshgrid(np.arange(5), np.arange(5))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)
ax.scatter(0,0,0, color = 'r')
ax.quiver(0,0,0,1,1,1)

shifted_vs = p[:,np.newaxis] + new_vs
ax.scatter(shifted_vs[0], shifted_vs[1], shifted_vs[2])

##### Exercise 5
Let  
```python
A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,0,0])
grid = np.meshgrid(np.linspace(-10,10,100), np.linspace(-10,10,100))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)
```

###### 5(a)
Calculate 
```python
diff = p[:,np.newaxis] - new_vs
dist = np.sqrt(np.sum(diff**2, axis=0))
```
and guess the meaning of `dist`.

In [None]:
### your answer here
A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,0,0])
grid = np.meshgrid(np.linspace(-10,10,100), np.linspace(-10,10,100))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)

diff = p[:,np.newaxis] - new_vs
dist = np.sqrt(np.sum(diff**2, axis=0))

print(diff,dist)

# dist[j] is the Euclidean distance from p to the j-th point (column) of new_vs.

###### 5(b)
Use `np.min` to find the shortest distance between `p` and a point in `new_vs` .  
Use `np.argmin` to find this point in `new_vs` .  
(Up to the resolution of the grid, this point is the projection of `p` onto the column space of `A` .)
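
The grid search only samples the column space, so its answer is approximate. As a rough comparison, the sketch below also solves the normal equations $A^\top A{\bf c} = A^\top{\bf p}$ with `np.linalg.solve`. This closed-form route is not part of the exercise and is included here only as an assumption about how one might cross-check the result; the names `grid_point` and `exact_point` are introduced for illustration.

```python
import numpy as np

A = np.array([[1, 1],
              [-1, 0],
              [0, -1]], dtype=float)
p = np.array([1.0, 0.0, 0.0])

# Grid search, as in the exercise
grid = np.meshgrid(np.linspace(-10, 10, 100), np.linspace(-10, 10, 100))
vs = np.vstack([grid[0].ravel(), grid[1].ravel()])
new_vs = A.dot(vs)
dist = np.sqrt(np.sum((p[:, np.newaxis] - new_vs)**2, axis=0))
grid_point = new_vs[:, np.argmin(dist)]

# Closed form: solve (A^T A) c = A^T p, then map c back through A
c = np.linalg.solve(A.T.dot(A), A.T.dot(p))
exact_point = A.dot(c)

print(grid_point, exact_point)
print(np.min(dist), np.linalg.norm(p - exact_point))
```

The two points should agree up to roughly the grid spacing, and the grid distance can never be smaller than the exact one.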

In [None]:
### your answer here
A = np.array([[1,1],
              [-1,0],
              [0,-1]])
p = np.array([1,0,0])
grid = np.meshgrid(np.linspace(-10,10,100), np.linspace(-10,10,100))
xs = grid[0].ravel()
ys = grid[1].ravel()

vs = np.vstack([xs,ys])
new_vs = A.dot(vs)

diff = p[:,np.newaxis] - new_vs
dist = np.sqrt(np.sum(diff**2, axis=0))

print(np.min(dist))
new_vs[:,np.argmin(dist)]

##### Exercise 6
Let `X = make_blobs(k=1)` .  

###### 6(a)
Draw a red point at the origin and the points (rows) in `X` .

In [None]:
### your answer here

X = make_blobs(k=1)

# plot all rows of X at once (a loop over range(100) would skip the last 50 points)
plt.scatter(X[:,0], X[:,1], color='b')
plt.scatter(0, 0, color='r')

###### 6(b)
Suppose $\{{\bf x}_i\}_{i=1}^n$ are some points.  
Then the **center of mass** is at $\frac{1}{n}\sum_{i=1}^n {\bf x}_i$.  
Let $\{{\bf x}_i\}_i$ be the rows of `X` .  
Use `mu = X.mean( ... )` to find the center of mass.

In [None]:
### your answer here
X = make_blobs(k=1)

mu = X.mean(axis=0)
print(mu)

###### 6(c)
Let `new_X = X - mu`.  
(Guess its meaning.)  
Draw a red point at the origin and the points (rows) in `new_X` .
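
One way to test a guess about `new_X` is to compute its center of mass. The sketch below uses stand-in data generated with `np.random.default_rng` rather than `make_blobs`, so that it runs on its own; the seed and the offset `[3.0, -2.0]` are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
# 150 points in the plane, clustered away from the origin
X = rng.standard_normal((150, 2)) + np.array([3.0, -2.0])

mu = X.mean(axis=0)   # center of mass, shape (2,)
new_X = X - mu        # mu is broadcast across all 150 rows

print(mu)
print(new_X.mean(axis=0))   # numerically zero
```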

In [None]:
### your answer here
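# new_X is the deviation of each point from the average,
# i.e. the data shifted so that its center of mass is at the origin

new_X = X - mu

# plot all rows of new_X at once (a loop over range(100) would skip the last 50 points)
plt.scatter(new_X[:,0], new_X[:,1], color='b')
plt.scatter(0, 0, c='red')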

##### Exercise 7
For each of the following equalities, pick some random matrices or vectors and check whether the equality holds.

###### 7(a)
The **trace** of an $n\times n$ matrix $A=\begin{bmatrix}a_{ij}\end{bmatrix}$ is  
$$\operatorname{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}.$$  
For any $n\times n$ matrix $A=\begin{bmatrix}a_{ij}\end{bmatrix}$,  
$$\operatorname{tr}(A^\top A) = \sum_{i=1}^n\sum_{j=1}^n a_{ij}^2.$$

In [None]:
### your answer here
n = 5
x = np.random.randn(n,n)

ATA = np.trace(x.T.dot(x))   # trace of A^T A
S = np.sum(x ** 2)           # sum over all entries; the built-in sum only adds along axis 0

print("A:",x)
print("trace of ATA:", ATA)
print("square sum:", S)

##### Veronica  

To calculate the value of $\sum_{i=1}^n\sum_{j=1}^n a_{ij}^2$, sum over all entries of `x**2`. Python's built-in `sum(x ** 2)` adds only along the first axis and returns a vector of column sums, so use the following code.

```python
print((x**2).sum())
```


###### 7(b)
Let $A$ be an $m\times n$ matrix and $B$ an $n\times \ell$ matrix.  
Then $(AB)^\top = B^\top A^\top$.  

In [None]:
m = 6
n = 5
l = 7
A = np.random.randn(m,n)
B = np.random.randn(n,l)

a = np.dot(A,B).T
b = np.dot(B.T,A.T)
print(a,'\n\n',b)
print()
# compare up to floating-point error instead of rounding
print(np.allclose(a, b))

###### 7(c)
Let $A$ be an $m\times n$ matrix, ${\bf x}\in\mathbb{R}^n$, ${\bf y}\in\mathbb{R}^m$.  
Then $\langle A{\bf x}, {\bf y}\rangle = {\bf y}^\top A{\bf x} = \langle {\bf x}, A^\top{\bf y}\rangle$.  

In [None]:
m = 6
n = 5
A = np.random.randn(m,n)
x = np.random.randn(n)
y = np.random.randn(m)

a = np.dot(np.dot(A,x),y)      # <Ax, y>
b = y.dot(np.dot(A,x))         # y^T A x
c = np.dot(x,np.dot(A.T,y))    # <x, A^T y>
# compare up to floating-point error instead of rounding
print(np.isclose(a, b))
print(np.isclose(a, c))
print(np.isclose(b, c))