## TODO Give an example of how the outer product can be useful in ML.

## What’s the geometric interpretation of the dot product of two vectors?

Given two vectors $u$ and $v$ with angle $\theta$ between them, the dot product is $u\cdot v=\|u\|\|v\|\cos\theta$. Here $\|u\|\cos\theta$ is the signed length of the projection of $u$ onto $v$, so the dot product is the length of the projection of one vector onto the other, multiplied by the other's magnitude.
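
To make the projection reading concrete, here is a minimal sketch in plain Python. The helper names `dot` and `norm` and the sample vectors are assumptions chosen for illustration, not part of the original question.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

# Illustrative vectors (assumed for this example).
u = [3.0, 4.0]
v = [2.0, 0.0]

cos_theta = dot(u, v) / (norm(u) * norm(v))
# Signed length of the projection of u onto the direction of v.
scalar_projection = norm(u) * cos_theta

# The dot product equals (projection of u onto v) * (length of v).
assert math.isclose(dot(u, v), scalar_projection * norm(v))
```

Here $u$ projects onto the $x$-axis with length 3, and $v$ has length 2, which gives the dot product 6.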

## Given a vector $u$, find vector $v$ of unit length such that the dot product of $u$ and $v$ is maximum.

From the previous question, if $u$ is given and $v$ has unit length, i.e. $\|v\|=1$, the dot product is $u\cdot v=\|u\|\cos\theta$. It is maximized when $\cos\theta = 1$, i.e. $\theta=0$, so $v$ points in the same direction as $u$: $v = u/\|u\|$ (assuming $u \ne 0$).
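
As a quick numeric check (the vector is chosen here only for illustration), take $u=[3,4]$, so $\|u\|=5$. The unit vector in the same direction is $[0.6, 0.8]$, and its dot product with $u$ is $3\cdot 0.6+4\cdot 0.8=5=\|u\|$. Any other unit vector does worse, for example $[1,0]$ gives $3$ and $[0,1]$ gives $4$.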

## Given two vectors $a=[3,2,1]$ and $b=[-1,0,1]$. Calculate the outer product $a\otimes b$.

$a\otimes b = ab^T=\begin{bmatrix}3\\2\\1\end{bmatrix}\begin{bmatrix}-1 & 0 & 1 \end{bmatrix} = \begin{bmatrix}-3 & 0 & 3 \\ -2 & 0 & 2 \\ -1 & 0 & 1\end{bmatrix}$
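
The same computation can be sketched in a few lines of plain Python. The helper name `outer` is an assumption for this example; entry $(i, j)$ of the result is $a_i b_j$.

```python
def outer(a, b):
    """Outer product a b^T as a list of rows: entry (i, j) is a[i] * b[j]."""
    return [[ai * bj for bj in b] for ai in a]

a = [3, 2, 1]
b = [-1, 0, 1]

for row in outer(a, b):
    print(row)
# [-3, 0, 3]
# [-2, 0, 2]
# [-1, 0, 1]
```

Every row is a multiple of $b$, which is why an outer product of two non-zero vectors always has rank 1.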

## What does it mean for two vectors to be linearly independent?

A sequence of vectors $v_1,\dots,v_k$ is said to be "linearly independent" if the equation $a_1v_1 + \dots + a_kv_k = 0$ can only be satisfied by $a_i = 0$ for all $i=1,\dots,k$.

This implies:
1. No vector in the sequence can be written as a linear combination of the remaining vectors in the sequence.
2. The *zero vector* cannot belong to a "linearly independent" sequence.

The vectors are said to be "linearly dependent" if there exist scalars $a_1, \dots, a_k$ not all zero, such that
$a_1v_1 + \dots + a_kv_k = 0$. Thus, a set of vectors is linearly dependent if and only if one of them is zero or a linear combination of the others.
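
As a small worked example (the vectors are chosen purely for illustration), take $v_1=[1,0]$, $v_2=[0,1]$ and $v_3=[1,1]$. The pair $v_1, v_2$ is linearly independent, since $a_1v_1+a_2v_2=[a_1,a_2]=0$ forces $a_1=a_2=0$. All three together are linearly dependent, since $v_1+v_2-v_3=0$ with coefficients $(1,1,-1)$ that are not all zero; equivalently, $v_3=v_1+v_2$.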

References:
1. [Linear independence on Wikipedia](https://en.wikipedia.org/wiki/Linear_independence)

## Given  $n$  vectors, each of  $d$  dimensions. What is the dimension of their span?

The span of a set of vectors is the set of all linear combinations of the vectors. E.g. for vectors $v_1, \dots, v_n$, the span is the set of all vectors of the form $a_1v_1 + \dots + a_nv_n$, where $a_1, \dots, a_n$ are any scalars.
1. First, [the dimensionality of a vector refers to the space of which the vector is a member](https://math.stackexchange.com/a/2452453/233623), so the vectors live in a space of dimension $d$. Their span is a subspace of that space, so its dimension is less than or equal to $d$.
2. Also, the dimension of the span equals the maximum number of linearly independent vectors among the $n$ vectors, so it is less than or equal to $n$ as well.

Combining the two, the dimension of the span of $n$ vectors, each of $d$ dimensions, is less than or equal to $\min(n,d)$. It equals the rank of the matrix whose columns are the $n$ vectors.
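
Two illustrative cases (vectors chosen here only as examples) show that the bound can be attained or strict. For $n=3$ vectors in $d=2$ dimensions, $[1,2]$, $[3,4]$, $[5,6]$: the first two are already independent, so the span is the whole plane and has dimension $2=\min(3,2)$. For $n=2$ vectors in $d=3$ dimensions, $[1,2,3]$ and $[2,4,6]$: the second is twice the first, so the span is a line and has dimension $1<\min(2,3)$.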

## Given two sets of vectors $A=a_1,a_2,...,a_n$ and $B=b_1,b_2,...,b_m$. How do you check that they share the same basis?

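1. First, [the dimensionality of a vector refers to the space of which the vector is a member](https://math.stackexchange.com/a/2452453/233623), so the vectors in $A$ and $B$ need to have the same dimension; otherwise it's impossible for $A$ and $B$ to share the same basis.
2. Secondly, a set of vectors can have many bases, so the real question is whether $A$ and $B$ span the same subspace. We can compute a basis for the span of $A$ first (e.g. stack the vectors of $A$ as columns of a matrix, solve $Ax=0$ to find any dependencies among them, and remove redundant vectors until only independent ones remain). Call this basis $C$. Then check that for each $b_i$ in $B$ there is a solution to the equation $Cx=b_i$, i.e. every vector in $B$ is a linear combination of the basis of $A$. This only shows that the span of $B$ is contained in the span of $A$, so repeat the check in the other direction (every $a_i$ is a linear combination of the vectors in $B$), or equivalently verify that $\operatorname{rank}(A) = \operatorname{rank}(B) = \operatorname{rank}([A\ B])$. If so, $A$ and $B$ span the same subspace, and any basis of one is a basis of the other.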
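
One way to sketch this check in code is through ranks: two sets span the same subspace exactly when each set has the same rank as the two sets combined. The helpers `rank` and `same_span` below are hypothetical names written for this example; they use exact `Fraction` arithmetic to avoid floating-point tolerance questions and assume all vectors have the same length.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from all rows below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def same_span(A, B):
    """True if the vectors in A and the vectors in B span the same subspace."""
    return rank(A) == rank(B) == rank(A + B)

# Illustrative sets (assumed for this example).
A = [[1, 0, 0], [0, 1, 0]]
B = [[1, 1, 0], [1, -1, 0], [2, 0, 0]]  # also spans the xy-plane
C = [[1, 0, 0], [0, 0, 1]]              # spans the xz-plane

print(same_span(A, B))  # True
print(same_span(A, C))  # False
```

Note that `A` and `C` have the same rank but different spans, which is why comparing ranks alone is not enough and the combined set has to be checked too.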

## What's a norm of vector $v$? What is $l^0$, $l^1$, $l^2$, $l^\infty$ norm?

Assume the vector $v$ has dimension $n$. The norm of a nonzero vector $v$ is a positive number $\|v\|$, which measures the "length" of the vector. Every norm must share two properties of the absolute value $|c|$ of a number (Linear Algebra and Learning from Data, I.11 [5]):
1. Rescaling: $\|cv\| = |c|\|v\|$
2. Triangle inequality: $\|v+w\| \le \|v\|+\|w\|$

Here are some norms:
1. [Zero norm, $l^0$ is the number of non-zero elements in the vector](https://www.quora.com/What-is-the-L0-norm-in-linear-algebra)[3]. It is not really a norm because it fails the rescaling property: multiplying $v$ by any $c \ne 0$ leaves the count unchanged, so $\|cv\|_0 \ne |c|\|v\|_0$ in general.

2. $l^1=\|v\|_1 = |v_1| + \dots + |v_n|$

3. $l^2=\text{Euclidean norm} = \|v\|_2 = \sqrt{|v_1|^2+\dots+|v_n|^2}$

4. $l^\infty = \text{max norm} = \|v\|_\infty = \text{maximum of } |v_1|, \dots, |v_n|$

5. $l^p=\|v\|_p=(|v_1|^p+\dots+|v_n|^p)^\frac{1}{p}$, for $p \ge 1$
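
The definitions above can be sketched directly in plain Python. The function names `l0`, `lp` and `linf` and the sample vector are assumptions made for this example.

```python
def l0(v):
    """Count of non-zero entries (not a true norm)."""
    return sum(1 for x in v if x != 0)

def lp(v, p):
    """l^p norm for p >= 1; p=1 and p=2 give the l^1 and l^2 norms."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def linf(v):
    """Max norm: the largest absolute entry."""
    return max(abs(x) for x in v)

v = [3, -4, 0]

print(l0(v))     # 2
print(lp(v, 1))  # 7.0
print(lp(v, 2))  # 5.0
print(linf(v))   # 4

# l0 ignores rescaling: doubling v leaves the count at 2,
# whereas a true norm such as l1 doubles.
doubled = [2 * x for x in v]
assert l0(doubled) == l0(v)
assert lp(doubled, 1) == 2 * lp(v, 1)
```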

## How do norm and metric differ? Given a norm, make a metric. Given a metric, can we make a norm?

While a metric provides us with a notion of the distance between points in a space, a norm gives us a notion of the length of an individual vector. A norm can only be defined on a vector space, while a metric can be defined on any set. (Introduction to Real Analysis, Metric and Normed Spaces, Christopher Heil).

A metric $d$ on a set $X$ needs to satisfy, for all $x, y, z$ in $X$:
1. $d(x,y)\ge 0$, and $d(x,y)=0 \Leftrightarrow x=y$
2. $d(x,y)=d(y,x)$
3. $d(x,y)\le d(x,z) + d(z,y)$

[It is easy to see that a norm induces a metric on a vector space $V$](https://www.quora.com/What-is-the-difference-between-a-metric-and-a-norm/answer/Caleb-Nastasi-1)[4], because length is the same as “distance from 0”. To check, simply replace the vector $v$ by $x-y$: $d(x,y)=\|x-y\|$ satisfies the conditions for a metric on $V$.

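The converse does not hold in general. A general set $X$ may not even have addition of elements, so a norm cannot be defined on it. Even on a vector space, a metric comes from a norm only if it is translation invariant, $d(x+z,y+z)=d(x,y)$, and homogeneous, $d(cx,cy)=|c|\,d(x,y)$; in that case $\|v\|=d(v,0)$ is a norm.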

To sum up: all norms induce metrics, and normed spaces (vector spaces with a norm) have a lot more structure than general metric spaces. Anything that holds in a metric space also holds in a normed space; metric spaces are more general.
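
Both directions can be illustrated with a short sketch. The helper `metric_from_norm` is a hypothetical name for this example, and the discrete metric (distance 1 between any two distinct points) is brought in here as a standard counterexample, not something the notes above discuss.

```python
import math

def l2_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

def metric_from_norm(norm):
    """Build the metric d(x, y) = norm(x - y) induced by a norm."""
    def d(x, y):
        return norm([a - b for a, b in zip(x, y)])
    return d

def discrete_metric(x, y):
    """d(x, y) = 0 if x == y else 1; a valid metric on any set."""
    return 0 if x == y else 1

d = metric_from_norm(l2_norm)
x, y, z = [0.0, 0.0], [3.0, 4.0], [3.0, 0.0]

assert d(x, x) == 0                   # distance to itself is zero
assert d(x, y) == d(y, x) == 5.0      # symmetry
assert d(x, y) <= d(x, z) + d(z, y)   # triangle inequality

# The discrete metric cannot come from a norm: a norm would need
# d(2v, 0) = 2 * d(v, 0), but here both distances are 1.
v, two_v, zero = [1.0, 0.0], [2.0, 0.0], [0.0, 0.0]
assert discrete_metric(two_v, zero) == discrete_metric(v, zero) == 1
```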

## References
1. Machine Learning Interviews Book, Chip Huyen
2. Github: Machine-Learning-Interview-FAQ
3. https://www.quora.com/What-is-the-L0-norm-in-linear-algebra
4. https://www.quora.com/What-is-the-difference-between-a-metric-and-a-norm/answer/Caleb-Nastasi-1
5. Linear Algebra and Learning from Data, Gilbert Strang