# Introduction

A vector $x$ is an ordered collection of numbers that defines a point relative to the origin. It can be visualized as an arrow starting at the origin and ending at that point.

The vector $x$ below is a vector in 2D space. It denotes the point $[3, -2]$: 3 units to the right of the origin $[0, 0]$ along the x-axis, and 2 units down along the y-axis.
\begin{align}
    x &\in \mathbb{R}^{2} \\
    x &= \begin{bmatrix} 3 \\ -2 \end{bmatrix}
\end{align}

This next vector $y$ is a vector in 3D space. It denotes the point $[-2, 7, 5]$: 2 units to the left of the origin along the x-axis, 7 units along the positive y-axis, and 5 units up along the positive z-axis.
\begin{align}
    y &\in \mathbb{R}^{3} \\
    y &= \begin{bmatrix} -2 \\ 7 \\ 5 \end{bmatrix}
\end{align}

So any vector $x$ is an ordered collection of $n$ numbers that defines a point in $n$-dimensional space $\mathbb{R}^{n}$.

Elements of a vector $x$ are referred to by indexing, counting from 1. For example, $x\left[ 2 \right]$ refers to the 2nd element of $x$, which is also written with a subscript as $x_{2}$. For the 3D vector $y$ above, $y\left[ 2 \right] = 7$.
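
In code, a vector is commonly stored as a list or array. The sketch below uses plain Python (an assumption, since the text does not prescribe a language) and a hypothetical helper `element` to bridge the 1-based indexing used here and Python's 0-based indexing.

```python
# The 3D vector from the example above, stored as a plain Python list.
vec3d = [-2, 7, 5]

def element(v, i):
    """Return the i-th element of v, counting from 1 as the text does.

    Hypothetical helper for illustration: Python lists count from 0,
    so the 1-based index i maps to position i - 1.
    """
    if not 1 <= i <= len(v):
        raise IndexError("index out of range for this vector")
    return v[i - 1]

second = element(vec3d, 2)
print(second)  # 7
```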

# Transpose

Vectors are often represented as column vectors. The transpose of a column vector is the row vector with the same elements, and vice versa. This distinction becomes important when matrix multiplication is involved.

\begin{align}
    x &= \begin{bmatrix} -2 \\ 7 \\ 5 \end{bmatrix} \\
    x^{T} &= \begin{bmatrix} -2 & 7 & 5 \end{bmatrix}
\end{align}

The transpose of a vector is commonly used to express the [dot product](#Dot-Product) of 2 vectors as $x^{T}y$. It is also commonly used to express the sum-of-squares of a vector $x$ as $x^{T}x$.
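
A minimal sketch of the row/column distinction, assuming a vector is stored as a list of rows in plain Python (the names `x_col`, `x_row`, and `transpose` are illustrative, not from the text). It transposes a column vector and then evaluates $x^{T}x$ as a row-times-column product.

```python
# A column vector stored as a list of one-element rows (3 rows, 1 column).
x_col = [[-2], [7], [5]]

def transpose(m):
    """Swap the rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

x_row = transpose(x_col)  # 1 row, 3 columns
print(x_row)              # [[-2, 7, 5]]

# x^T x: multiply the 1x3 row by the 3x1 column, which is the sum of squares.
sum_of_squares = sum(r * c[0] for r, c in zip(x_row[0], x_col))
print(sum_of_squares)     # 78
```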

# Magnitude, Direction, Unit and Normal Vectors

The magnitude of a vector measures its size, or "how big" it is: the length of the arrow from the origin to the point. The [$L^{2}$ norm](./norms.ipynb#L2) is used to compute the magnitude of the vector.

$$ \lVert x \rVert_{2} = \lVert x \rVert = \sqrt{\sum_{i} x_{i}^{2}} $$

A unit vector (sometimes loosely called a normal vector, more precisely a normalized vector) is a vector whose magnitude is $1$. To get such a vector, we can take any nonzero vector $x$ and divide it by its magnitude $\lVert x \rVert$.

$$ \hat{x} = \frac{x}{\lVert x \rVert} $$

A nonzero vector can be thought of as having 2 components: its magnitude and its direction. A unit vector is a convenient way to store the direction, since it only has to be scaled by a positive constant to get any other vector in the same direction. For example,

\begin{align}
    x &= \begin{bmatrix} 3 \\ 2 \end{bmatrix} \\
    \lVert x \rVert &= \sqrt{\sum_{i} x_{i}^{2}} \\
    \lVert x \rVert &= \sqrt{3^{2} + 2^{2}} = \sqrt{9 + 4} = \sqrt{13} \\
    \hat{x} &= \frac{x}{\lVert x \rVert} = \begin{bmatrix} \frac{3}{\sqrt{13}} \\ \frac{2}{\sqrt{13}} \end{bmatrix}
\end{align}

We can derive a vector $y$ that has a magnitude of $3$ with the same direction as $x$ by doing the following.
$$
    y = 3 \hat{x} = 3 \begin{bmatrix} \frac{3}{\sqrt{13}} \\ \frac{2}{\sqrt{13}} \end{bmatrix}
    = \begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}
$$

Let's compute $\lVert y \rVert$ to verify that $\hat{y} = \hat{x}$.
\begin{align}
    \hat{y} &= \frac{y}{\lVert y \rVert} =
        \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}
            {\sqrt{\left( \frac{9}{\sqrt{13}} \right) ^{2} + \left( \frac{6}{\sqrt{13}} \right) ^{2}}}
        = \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}
            {\sqrt{ \frac{81}{13} + \frac{36}{13} }}
        = \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}
            {\sqrt{ \frac{117}{13} }}
        = \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}
            {\sqrt{ \frac{9 \cdot 13}{13} }}
        = \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}
            {\sqrt{ 9 }}
        = \frac{\begin{bmatrix} \frac{9}{\sqrt{13}} \\ \frac{6}{\sqrt{13}} \end{bmatrix}}{3}
        = \begin{bmatrix} \frac{3}{\sqrt{13}} \\ \frac{2}{\sqrt{13}} \end{bmatrix}
        = \hat{x}
\end{align}
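
The same check can be done numerically. The following is a sketch in plain Python; the helper names `magnitude` and `normalize` are assumptions made for this example. Because it uses floating-point arithmetic, results are compared up to rounding rather than exactly.

```python
import math

def magnitude(v):
    """L2 norm: square root of the sum of squared elements."""
    return math.sqrt(sum(e * e for e in v))

def normalize(v):
    """Return the unit vector in the direction of v (v must be nonzero)."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return [e / m for e in v]

x = [3, 2]
x_hat = normalize(x)
y = [3 * e for e in x_hat]  # magnitude 3, same direction as x
y_hat = normalize(y)

print(magnitude(x))  # sqrt(13), roughly 3.6056
print(magnitude(y))  # 3, up to floating-point rounding
# y_hat matches x_hat element by element, up to rounding.
print(all(math.isclose(a, b) for a, b in zip(x_hat, y_hat)))
```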

# Span

The span of a vector is the set of all vectors that can be computed by scaling the original vector by a constant $\alpha$. The span $S_{x}$ of the vector $x$ is:
\begin{align}
    x &\in \mathbb{R}^{n} \\
    S_{x} &= \left\{ \alpha x \mid \alpha \in \mathbb{R} \right\}
\end{align}

The span of a nonzero vector is simply a line through the origin that goes in both directions indefinitely. Consequently, if another vector $y$ is simply $x$ scaled by a nonzero constant ($y = \alpha x$ with $\alpha \neq 0$), then the span of $y$ is the same as the span of $x$.
\begin{align}
    \alpha, \beta &\in \mathbb{R}, \;\; \alpha \neq 0 \\
    y &= \alpha x \\
    S_{y} &= \left\{ \beta y \mid \beta \in \mathbb{R} \right\}
        = \left\{ \beta \alpha x \mid \beta \in \mathbb{R} \right\}
\end{align}

In the above, $\beta \alpha$ takes every real value as $\beta$ varies (because $\alpha \neq 0$), so the span of $y$, $S_{y}$, is the same as the span of $x$, $S_{x}$.
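
To make this concrete, here is a hedged sketch of a membership test for the span of a single vector in plain Python. The function name `in_span` and the tolerance `tol` are assumptions for this example; the idea is to recover the only possible scale factor from one nonzero element and then check it against every element.

```python
import math

def in_span(y, x, tol=1e-9):
    """Return True if y = alpha * x for some real alpha."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    # The zero vector is in every span (take alpha = 0).
    if all(abs(e) <= tol for e in y):
        return True
    # Locate a nonzero element of x to recover the candidate alpha.
    pivot = next((i for i, e in enumerate(x) if abs(e) > tol), None)
    if pivot is None:
        return False  # x is the zero vector, so its span is only {0}
    alpha = y[pivot] / x[pivot]
    # y is in the span only if the same alpha works for every element.
    return all(math.isclose(alpha * xe, ye, abs_tol=tol) for xe, ye in zip(x, y))

print(in_span([6, 4], [3, 2]))      # True: alpha = 2
print(in_span([-1.5, -1], [3, 2]))  # True: alpha = -0.5
print(in_span([3, 3], [3, 2]))      # False: not on the same line
```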

# Linear Combinations

A linear combination $x$ of vectors $u$, $v$, and $w$ is the sum of these vectors after each has been scaled by a real coefficient $a$, $b$, or $c$:
\begin{align}
    a, b, c &\in \mathbb{R} \\
    u, v, w &\in \mathbb{R}^{n} \\
    x &= au + bv + cw
\end{align}
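
The definition above can be sketched in plain Python as follows. The helper `linear_combination` and the sample vectors are illustrative choices, not values from the text.

```python
def linear_combination(coefficients, vectors):
    """Return the sum of coefficients[k] * vectors[k], element by element."""
    if len(coefficients) != len(vectors):
        raise ValueError("need exactly one coefficient per vector")
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all vectors must have the same number of elements")
    return [sum(c * v[i] for c, v in zip(coefficients, vectors)) for i in range(n)]

u = [1, 0, 2]
v = [0, 1, -1]
w = [3, 3, 3]

# x = 2u - 1v + 1w
x = linear_combination([2, -1, 1], [u, v, w])
print(x)  # [5, 2, 8]
```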

# Span of a Set of Vectors

The span of a set of vectors $\left\{ x, y \right\}$ is the set $S$ of all vectors that can be computed as a linear combination of $x$ and $y$.
\begin{align}
    x, y &\in \mathbb{R}^{n} \\
    S &= \left\{ \alpha x + \beta y \mid \alpha, \beta \in \mathbb{R} \right\}
\end{align}

Intuitively, the span of a set of vectors gives a sense of how much of the vector space $\mathbb{R}^{n}$ these vectors can cover together.
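
For two vectors in 2D, one way to see whether a target vector lies in their span is to solve for the two coefficients directly. The sketch below uses Cramer's rule, a technique chosen for this example rather than one introduced in the text; the function name `coefficients_2d` is likewise hypothetical.

```python
def coefficients_2d(x, y, v, tol=1e-12):
    """Solve alpha * x + beta * y = v for 2D vectors x, y, and v.

    Uses Cramer's rule (an assumption of this sketch). Returns None when
    x and y lie on the same line, since then no unique solution exists.
    """
    det = x[0] * y[1] - x[1] * y[0]
    if abs(det) < tol:
        return None
    alpha = (v[0] * y[1] - v[1] * y[0]) / det
    beta = (x[0] * v[1] - x[1] * v[0]) / det
    return alpha, beta

# x and y point in different directions, so together they cover all of 2D.
print(coefficients_2d([1, 1], [1, -1], [3, -2]))  # (0.5, 2.5)

# y is a multiple of x, so together they still only cover a single line.
print(coefficients_2d([3, 2], [6, 4], [3, -2]))   # None
```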


# Basis Vectors

There are infinitely many 2D vectors in $\mathbb{R}^{2}$. However, every one of them can be represented as a linear combination of just 2 unit vectors pointing along the axes. So far, we have implicitly assumed the following basis vectors.
\begin{align}
    \alpha, \beta &\in \mathbb{R} \\
    x &\in \mathbb{R}^{2} \\
    \hat{i} &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\
    \hat{j} &= \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\
    x &= \begin{bmatrix} \alpha \\ \beta \end{bmatrix}
        = \hat{i} \alpha + \hat{j} \beta
        = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \alpha
            + \begin{bmatrix} 0 \\ 1 \end{bmatrix} \beta
\end{align}

$\hat{i}$ and $\hat{j}$ form a **basis** for the vector space $V = \mathbb{R}^{2}$. The span of the set $\left\{ \hat{i}, \hat{j} \right\}$ is all of $\mathbb{R}^{2}$, since every vector in $\mathbb{R}^{2}$ can be derived from linear combinations of $\hat{i}$ and $\hat{j}$. Furthermore, since $\hat{i}$ and $\hat{j}$ both have magnitudes of 1, they are referred to as unit basis vectors.
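
As a quick worked example, the 2D point $[3, -2]$ from the introduction can be written in this basis with $\alpha = 3$ and $\beta = -2$:
\begin{align}
    \begin{bmatrix} 3 \\ -2 \end{bmatrix}
        &= 3 \hat{i} - 2 \hat{j}
        = 3 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
            - 2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\end{align}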

# Linear Independence

Vectors $u$, $v$, and $w$ are linearly independent if none of them can be derived as a linear combination of the other 2. That is, no constants $a$, $b$, $c$, $d$, $e$, $f$ exist such that any of the following holds:
\begin{align}
    w &= au + bv \\
    v &= cu + dw \\
    u &= ev + fw \\
    a, b, c, d, e, f &\in \mathbb{R} \\
    u, v, w &\in \mathbb{R}^{n}
\end{align}
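
As a small illustration (the specific vectors here are chosen for this example and do not come from the text above), consider three vectors in $\mathbb{R}^{2}$:
\begin{align}
    u &= \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \;\;
    v = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \;\;
    w = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \\
    w &= 2u + 3v
\end{align}

Since $w$ is a linear combination of $u$ and $v$, the three vectors are not linearly independent. The pair $u$, $v$ on its own is linearly independent: any multiple $au = \begin{bmatrix} a \\ 0 \end{bmatrix}$ has a second element of 0 and so can never equal $v$, and likewise no multiple of $v$ equals $u$.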

# Dot Product

The dot product between two vectors $x$ and $y$ is the sum of the element-wise product of the two vectors. It also equals the product of their magnitudes and the cosine of the angle $\theta$ between them.

\begin{align}
    x \cdot y &= x^{T}y = \sum_{i} x_{i}y_{i} = \lVert x \rVert \lVert y \rVert \cos{\theta}
\end{align}

The dot product gives a measure of how similar the directions of the two vectors are. If $x \cdot y$ is close to $\lVert x \rVert \lVert y \rVert$, then $\cos{\theta}$ must be close to 1 and $\theta$ must be close to 0.

Similarly, if $x$ and $y$ point in exactly opposite directions, then $\cos{\theta} = -1$ and $x \cdot y = -\lVert x \rVert \lVert y \rVert < 0$.
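
A short sketch of both views of the dot product in plain Python. The helper names `dot` and `cosine` are assumptions for this example; `cosine` rearranges the identity above to recover $\cos{\theta}$ from the element-wise sum.

```python
import math

def dot(x, y):
    """Sum of the element-wise products of x and y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    return sum(a * b for a, b in zip(x, y))

def cosine(x, y):
    """cos(theta) = (x . y) / (|x| |y|), defined only for nonzero vectors."""
    norm_x = math.sqrt(dot(x, x))
    norm_y = math.sqrt(dot(y, y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("the angle is undefined for the zero vector")
    return dot(x, y) / (norm_x * norm_y)

x = [3, 2]
print(dot(x, [6, 4]))       # 26
print(cosine(x, [6, 4]))    # close to 1: same direction
print(cosine(x, [-3, -2]))  # close to -1: opposite direction
```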

# Orthogonal Vectors

Two vectors $x$ and $y$ are orthogonal if their dot product $x \cdot y = 0$. For nonzero vectors, this means $\cos{\theta} = 0$, so the two vectors form a right angle: $\theta = 90^{\circ}$.

# Orthonormal Vectors

Two vectors are orthonormal if they are orthogonal and both have magnitudes of 1.
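
Both definitions translate directly into checks on the dot product. The following is a sketch in plain Python; the function names and the tolerance `tol` are assumptions, with the tolerance there to absorb floating-point rounding.

```python
import math

def dot(x, y):
    """Sum of the element-wise products of x and y."""
    return sum(a * b for a, b in zip(x, y))

def is_orthogonal(x, y, tol=1e-9):
    """True if the dot product of x and y is zero, up to tol."""
    return abs(dot(x, y)) <= tol

def is_orthonormal(x, y, tol=1e-9):
    """True if x and y are orthogonal and each has magnitude 1.

    A vector has magnitude 1 exactly when its dot product with itself is 1.
    """
    return (
        is_orthogonal(x, y, tol)
        and math.isclose(dot(x, x), 1.0, abs_tol=tol)
        and math.isclose(dot(y, y), 1.0, abs_tol=tol)
    )

print(is_orthogonal([3, 2], [-2, 3]))   # True: 3 * -2 + 2 * 3 = 0
print(is_orthonormal([3, 2], [-2, 3]))  # False: magnitudes are not 1
print(is_orthonormal([1, 0], [0, 1]))   # True
```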