$$\newcommand{\F}{\mathbb{F}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\v}{\mathbf{v}}
\newcommand{\a}{\mathbf{a}}
\newcommand{\b}{\mathbf{b}}
\newcommand{\c}{\mathbf{c}}
\newcommand{\w}{\mathbf{w}}
\newcommand{\u}{\mathbf{u}}
\newcommand{\0}{\mathbf{0}}
\newcommand{\1}{\mathbf{1}}$$

The following notes draw on **Mike X Cohen: Linear Algebra: Theory, Intuition, Code, 2021.**, **Sheldon Axler: Linear Algebra Done Right, 2015.** and **Wikipedia** for intuitions, examples, formal definitions and theorems. Since these are the introductory chapters, I will breeze through them until vector spaces.

## Table of Contents

* [Learning Objectives](#1)
* [Field](#1)
    * [Definition (Field)](#Import_modules_offline)
    * [Examples of Fields](#Import_other_modules)
    * [Notation of Fields](#11)
    * [Summary of Fields](#11)
* [Vectors](#2)
    * [Geometric Definition (Vectors)](#11)
        * [Vector is Invariant under Coordinates](#11)
    * [Algebraic Definition (Vectors)](#11)
    * [Equality of Vectors](#11)
    * [Vector Orientation](#11)
        * [Example of Column and Row Vectors](#11)
    * [Transposed Vector](#11)
        * [Definition (Transposed Vector)](#11)
    * [Vector Addition and Subtraction](#11)
        * [Algebraic Definition](#11)
        * [Geometric Definition](#11)
        * [Vector Addition is Commutative](#11)

!!! summary "Learning Objectives"
    - Definition of a Field
    - Definition of a Vector
        - Vector Operations with both Algebraic and Geometric understanding.

## Vector Multiplications

This section introduces one of the most important ideas in Linear Algebra, the **Dot Product**. Since [Wikipedia](https://en.wikipedia.org/wiki/Dot_product)[^Dot_product] has a thorough introduction, we will copy some definitions from it.


[^Dot_product]: https://en.wikipedia.org/wiki/Dot_product

### DOT Product (Algebraic definition)

The dot product of two vectors $\color{red}{\a =  \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}^{\rm T}}$ and $\color{blue}{\b =  \begin{bmatrix} b_1 & b_2  & \dots & b_n \end{bmatrix}^{\rm T}}$ is defined as:

$$\mathbf{\color{red}\a}\cdot\mathbf{\color{blue}\b}=\sum_{i=1}^n {\color{red}a}_i{\color{blue}b}_i={\color{red}a}_1{\color{blue}b}_1+{\color{red}a}_2{\color{blue}b}_2+\cdots+{\color{red}a}_n{\color{blue}b}_n$$

where $\sum$ denotes summation and $n$ is the dimension of the vector space. Since **vector spaces** have not been introduced yet, we can simply think of it as the $n$-dimensional real space $\R^n$.

#### Example

For instance, in 3-dimensional space, the **dot product** of the column vectors $\begin{bmatrix}1 & 3 & -5\end{bmatrix}^{\rm T}$ and $\begin{bmatrix}4 & -2 & -1\end{bmatrix}^{\rm T}$ is:

$$
\begin{align}
\ [{\color{red}1, 3, -5}] \cdot  [{\color{blue}4, -2, -1}] &= ({\color{red}1} \times {\color{blue}4}) + ({\color{red}3}\times{\color{blue}-2}) + ({\color{red}-5}\times{\color{blue}-1}) \\
&= 4 - 6 + 5 \\
&= 3
\end{align}
$$
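
The worked example above can be reproduced in a few lines. This is a minimal sketch assuming NumPy is available; the variable names `dot_by_sum` and `dot_by_numpy` are illustrative and not from the source.

```python
import numpy as np

a = np.array([1, 3, -5])
b = np.array([4, -2, -1])

# Sum of element-wise products, exactly as in the algebraic definition.
dot_by_sum = sum(a_i * b_i for a_i, b_i in zip(a, b))

# NumPy's built-in dot product for 1-D arrays.
dot_by_numpy = np.dot(a, b)

print(dot_by_sum, dot_by_numpy)  # 3 3
```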

---

#### Vector as Matrices

We are a little ahead of the definition of matrices, but readers who are familiar with them, or who have worked with `numpy` before, know that a row vector of dimension $n$ can be interpreted as a $1 \times n$ matrix and a column vector of dimension $n$ as an $n \times 1$ matrix. With this interpretation, we can perform a so-called "matrix multiplication" of a row vector and a column vector, and the result is the dot product. We will go into the details when we get to matrices.

If vectors are identified with column matrices, the dot product can also be written as a matrix multiplication.

$$\mathbf{\color{red}a} \cdot \mathbf{\color{blue}b} = \mathbf{\color{red}a}^\mathsf T \mathbf{\color{blue}b}$$

Expressing the above example in this way, a 1 × 3 matrix **row vector** is multiplied by a 3 × 1 matrix **column vector** to get a 1 × 1 matrix that is identified with its unique entry:
$$
  \begin{bmatrix}
   \color{red}1 & \color{red}3 & \color{red}-5
  \end{bmatrix}
  \begin{bmatrix}
   \color{blue}4 \\ \color{blue}-2 \\ \color{blue}-1
  \end{bmatrix} = \color{purple}3
$$
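
As a rough illustration of the matrix view, the sketch below stores both vectors as $3 \times 1$ NumPy arrays and multiplies the transpose of the first by the second. The names `a_col`, `b_col` and `product` are assumptions made for this example.

```python
import numpy as np

a_col = np.array([[1], [3], [-5]])   # column vector, shape (3, 1)
b_col = np.array([[4], [-2], [-1]])  # column vector, shape (3, 1)

# (1, 3) @ (3, 1) gives a (1, 1) matrix holding the dot product.
product = a_col.T @ b_col

print(product.shape)   # (1, 1)
print(product.item())  # 3
```

The `.item()` call extracts the single entry, which mirrors identifying the $1 \times 1$ matrix with its unique entry.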

### Properties

Extracted from Wikipedia:

The dot product fulfills the following properties if **a**, **b**, and
**c** are real vectors and *r* is a scalar.

1.  **Commutative:**

    $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a} ,$
    which follows from the definition (*θ* is the angle between **a** and **b**): $\mathbf{a} \cdot \mathbf{b} = \left\| \mathbf{a} \right\| \left\| \mathbf{b} \right\| \cos \theta = \left\| \mathbf{b} \right\| \left\| \mathbf{a} \right\| \cos \theta = \mathbf{b} \cdot \mathbf{a} .$
    
2.  **Distributive over vector addition:**

    $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c} .$
    
3.  **Bilinear**:

    $\mathbf{a} \cdot ( r \mathbf{b} + \mathbf{c} ) = r ( \mathbf{a} \cdot \mathbf{b} ) + ( \mathbf{a} \cdot \mathbf{c} ) .$
    
4.  **Scalar multiplication:**

    $( c_1 \mathbf{a} ) \cdot ( c_2 \mathbf{b} ) = c_1 c_2 ( \mathbf{a} \cdot \mathbf{b} ) .$
    
5.  **Not associative** because the dot
    product between a scalar (**a ⋅ b**) and a vector (**c**) is not
    defined, which means that the expressions involved in the
    associative property, (**a ⋅ b**) ⋅ **c** or **a** ⋅ (**b ⋅ c**) are
    both ill-defined. Note however that the previously mentioned
    scalar multiplication property is sometimes called the "associative
    law for scalar and dot product" or one can say that "the dot
    product is associative with respect to scalar multiplication"
    because *c* (**a** ⋅ **b**) = (*c* **a**) ⋅ **b** = **a** ⋅ (*c*
    **b**).
    
6.  **Orthogonal:**

    Two non-zero vectors **a** and **b** are *orthogonal* if and only if $\a \cdot \b = 0$.
        
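7.  **No cancellation:**

    Unlike multiplication of ordinary numbers, where if $ab=ac$ then *b* always equals *c* unless *a* is zero, the dot product does not obey the cancellation law. If $\a \cdot \b = \a \cdot \c$ and $\a \neq \0$, the distributive law only gives $\a \cdot (\b - \c) = 0$, which means that $\a$ is orthogonal to $\b - \c$; this still allows $\b \neq \c$.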

8.  **Product Rule:**

    If **a** and **b** are (vector-valued) differentiable functions, then the derivative, denoted by a prime ', of $\a \cdot \b$ is given by the rule $(\a \cdot \b)' = \a' \cdot \b + \a \cdot \b'$.
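
Several of the properties listed above can be spot-checked numerically. The following is only a sanity check on hand-picked vectors, not a proof, and all vector values and variable names are chosen for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -5.0, 6.0])
c = np.array([-2.0, 0.5, 1.0])
r = 2.5

# Properties 1-3: commutative, distributive, bilinear.
commutative = np.isclose(a @ b, b @ a)
distributive = np.isclose(a @ (b + c), a @ b + a @ c)
bilinear = np.isclose(a @ (r * b + c), r * (a @ b) + a @ c)

# Property 7 (no cancellation): a2 . b2 == a2 . c2 although b2 != c2,
# because a2 is orthogonal to (b2 - c2).
a2 = np.array([1.0, 0.0])
b2 = np.array([2.0, 5.0])
c2 = np.array([2.0, -7.0])
same_dot = np.isclose(a2 @ b2, a2 @ c2)
different_vectors = not np.array_equal(b2, c2)

print(commutative, distributive, bilinear, same_dot, different_vectors)
```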

### Cauchy-Schwarz Inequality

Let $\v$ and $\w$ be two vectors in $\F^n$. Then the inequality

$$|\v^\top \w| \leq \Vert \v \Vert \Vert \w \Vert$$

holds.

---

> This inequality provides an **upper bound** for the dot product between two vectors; in other words, the absolute value of the dot product between two vectors cannot be larger than the product of the norms of the individual vectors. Note carefully that in order for the inequality to become an equality if and only if both vectors are the zero vector $\0$ or if one vector (either one) is scaled by the other vector $\v = \lambda \w$.  - **Mike X Cohen, Linear Algebra: Theory, Intuition, Code**

If you wonder why $\v = \lambda \w$ implies equality, substitute it into both sides: $$|\v^\top \w| = |\lambda \w^\top \w| = |\lambda| |\w^\top \w| = |\lambda| \|\w\|^2 = |\lambda| \|\w\| \|\w\| = \|\v\| \|\w\|$$
where we used the facts that $\w^\top \w = \|\w\|^2$ by definition and that $\|\v\| = \|\lambda \w\| = |\lambda| \|\w\|$.
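
A quick numerical check of the inequality and of its equality case might look like the sketch below. The seed, the vector length and the value of `lam` are arbitrary choices made for this example; a negative `lam` is used on purpose to show that equality does not require a positive scale factor.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
v = rng.normal(size=5)
w = rng.normal(size=5)

# General case: |v . w| <= ||v|| ||w||
lhs = abs(v @ w)
rhs = np.linalg.norm(v) * np.linalg.norm(w)
inequality_holds = lhs <= rhs + 1e-12  # small tolerance for rounding

# Equality case: one vector is a scalar multiple of the other.
lam = -3.0
v_scaled = lam * w
lhs_eq = abs(v_scaled @ w)
rhs_eq = np.linalg.norm(v_scaled) * np.linalg.norm(w)
equality_holds = np.isclose(lhs_eq, rhs_eq)

print(inequality_holds, equality_holds)
```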

The author decided to include this inequality here because it is used in many proofs. He then shows a use case in the Geometric Interpretation of the Dot Product.

### DOT Product (Geometric definition)

In Euclidean space, a Euclidean vector is a geometric object that possesses both a magnitude and a direction. A vector can be pictured as an arrow. Its magnitude is its length, and its direction is the direction to which the arrow points. The magnitude of a vector **a** is denoted by $\left\| \mathbf{a} \right\|$. The dot product of two Euclidean vectors **a** and **b** is defined by

$$\mathbf{a}\cdot\mathbf{b}=\|\mathbf{a}\|\ \|\mathbf{b}\|\cos\theta ,$$

where $\theta$ is the angle between $\a$ and $\b$.

<figure>
<img src='https://storage.googleapis.com/reighns/reighns_ml_projects/docs/linear_algebra/linear_algebra_theory_intuition_code_chap3_fig_3.1_scalar_projection_and_dot_product.PNG' width="500" height="350" align="center"/>
<figcaption align = "center"><b>Fig 3.11; Diagram of Scalar Projection and DOT Product.</b></figcaption>
</figure>

In particular, if the vectors $\a$ and $\b$ are orthogonal (i.e., their angle is $\frac{\pi}{2}$), then $\cos \frac \pi 2 = 0$, which implies that

$$\mathbf a \cdot \mathbf b = 0 .$$

At the other extreme, if they are codirectional, then the angle between them is zero with $\cos 0 = 1$ and

$$\mathbf a \cdot \mathbf b = \left\| \mathbf a \right\| \, \left\| \mathbf b \right\|$$

This implies that the dot product of a vector **a** with itself is

$$\mathbf a \cdot \mathbf a = \left\| \mathbf a \right\| ^2 ,$$

which gives

$$\left\| \mathbf a \right\| = \sqrt{\mathbf a \cdot \mathbf a} ,$$

the formula for the Euclidean length of the vector.
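
The length formula can be checked on a vector whose length is known in advance. This is a small sketch assuming NumPy; the 3-4-5 vector is chosen only because its length is easy to verify by hand.

```python
import numpy as np

a = np.array([3.0, 4.0])

# Length from the dot product of the vector with itself.
length_from_dot = np.sqrt(a @ a)

# NumPy's Euclidean norm, for comparison.
length_from_numpy = np.linalg.norm(a)

print(length_from_dot, length_from_numpy)  # 5.0 5.0
```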

#### Scalar projections

#TODO.

#### Sign of the DOT Product is determined by the Angle in between the two vectors

The geometric definition can be re-written as follows:

\begin{equation} \label{eq1}
\begin{split}
\mathbf{a}\cdot\mathbf{b} &=\|\mathbf{a}\|\ \|\mathbf{b}\|\cos\theta \implies \cos(\theta) = \frac{\a^\top \b}{\|\a\| \|\b\|} \implies \theta = \cos^{-1}\left(\frac{\a^\top \b}{\|\a\| \|\b\|}\right)
\end{split}
\end{equation}

which essentially means that one can find the angle between any two known nonzero vectors, in a space of any dimension.
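
The angle formula translates directly into code. The helper `angle_between` below is a hypothetical name introduced for this sketch; it assumes both inputs are nonzero, and it clips the cosine to $[-1, 1]$ as a precaution against floating-point rounding.

```python
import numpy as np


def angle_between(a: np.ndarray, b: np.ndarray) -> float:
    """Angle in radians between two nonzero 1-D vectors (illustrative helper)."""
    cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Rounding can push the ratio slightly outside [-1, 1], where arccos is undefined.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])
theta = angle_between(a, b)

print(np.degrees(theta))  # approximately 45 degrees
```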

---

The author pays particular attention to this part and explains how the **sign** of the dot product is determined solely by the angle between the two vectors. One can read about it in more detail on pages 51-52 of his book **Linear Algebra: Theory, Intuition, Code**. I will just give a summary:

!!! note
    By the definition $\mathbf{a}\cdot\mathbf{b} = \|\mathbf{a}\|\ \|\mathbf{b}\|\cos\theta$, we know that the sign (positive or negative) of the dot product $\a \cdot \b$ is determined solely by $\cos \theta$, since $\|\a\| \|\b\|$ is always positive for nonzero vectors. Thus if $0^\circ < \theta < 90^\circ$ then $\cos \theta > 0 \implies \|\mathbf{a}\|\ \|\mathbf{b}\|\cos\theta > 0 \implies \mathbf{a}\cdot\mathbf{b} > 0$. If $90^\circ < \theta < 180^\circ$ then $\cos \theta < 0 \implies \|\mathbf{a}\|\ \|\mathbf{b}\|\cos\theta < 0 \implies \mathbf{a}\cdot\mathbf{b} < 0$. The author also pinpoints that when $\theta$ is exactly $90^\circ$, the dot product is necessarily zero, implying the two vectors are orthogonal. This is one of the most important properties of the dot product and should be committed to memory!
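
The three cases in the note can be illustrated with one reference vector and three partners at known angles. The vectors and the dictionary name `signs` are assumptions made for this sketch.

```python
import numpy as np

reference = np.array([1.0, 0.0])
vectors = {
    "acute": np.array([1.0, 1.0]),    # 45 degrees from the reference
    "right": np.array([0.0, 2.0]),    # 90 degrees from the reference
    "obtuse": np.array([-1.0, 1.0]),  # 135 degrees from the reference
}

# The sign of each dot product depends only on the angle to the reference.
signs = {name: float(np.sign(reference @ vec)) for name, vec in vectors.items()}

print(signs)  # {'acute': 1.0, 'right': 0.0, 'obtuse': -1.0}
```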
    
<figure>
<img src='https://storage.googleapis.com/reighns/reighns_ml_projects/docs/linear_algebra/linear_algebra_theory_intuition_code_chap3_fig_3.2.PNG' width="500" height="350" align="center"/>
<figcaption align = "center"><b>Fig 3.2; Courtesy of Linear Algebra: Theory,<br> Intuition, Code by Mike X Cohen</b></figcaption>
</figure>


#### Application to the law of cosines {#application_to_the_law_of_cosines}

Consider a triangle with sides $a$, $b$ and $c$, as presented in figure 3.31, where $a$ and $b$ are separated by the angle *θ*. The **law of cosines** states that $$c^2 = a^2 + b^2 - 2ab\cos(\theta)$$

To derive it, let the vectors $\mathbf{a}$ and $\mathbf{b}$ lie along the sides of lengths $a = \|\mathbf{a}\|$ and $b = \|\mathbf{b}\|$, and let $\mathbf{c} = \mathbf{a} - \mathbf{b}$, so that $c = \|\mathbf{c}\|$. Then

$$\begin{align}
\mathbf{\color{orange}c} \cdot \mathbf{\color{orange}c}  & = ( \mathbf{\color{red}a} - \mathbf{\color{blue}b}) \cdot ( \mathbf{\color{red}a} - \mathbf{\color{blue}b} ) \\
 & = \mathbf{\color{red}a} \cdot \mathbf{\color{red}a} - \mathbf{\color{red}a} \cdot \mathbf{\color{blue}b} - \mathbf{\color{blue}b} \cdot \mathbf{\color{red}a} + \mathbf{\color{blue}b} \cdot \mathbf{\color{blue}b} \\
 & = \|\mathbf{\color{red}a}\|^2 - \mathbf{\color{red}a} \cdot \mathbf{\color{blue}b} - \mathbf{\color{red}a} \cdot \mathbf{\color{blue}b} + \|\mathbf{\color{blue}b}\|^2 \\
 & = \|\mathbf{\color{red}a}\|^2 - 2 \mathbf{\color{red}a} \cdot \mathbf{\color{blue}b} + \|\mathbf{\color{blue}b}\|^2 \\
\|\mathbf{\color{orange}c}\|^2 & = \|\mathbf{\color{red}a}\|^2 + \|\mathbf{\color{blue}b}\|^2 - 2 \|\mathbf{\color{red}a}\| \|\mathbf{\color{blue}b}\| \cos \mathbf{\color{purple}\theta}
\end{align}$$

which is the law of cosines.
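
The identity can also be checked numerically on a concrete triangle. In the sketch below the angle is computed independently with `np.arctan2` rather than from the dot product, so that the comparison is not circular; the two vectors are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([3.0, 0.0])
b = np.array([1.0, 2.0])
c = a - b  # third side of the triangle

norm_a, norm_b, norm_c = (np.linalg.norm(x) for x in (a, b, c))

# Angle between a and b, measured from their polar angles in the plane.
theta = np.arctan2(b[1], b[0]) - np.arctan2(a[1], a[0])

lhs = norm_c ** 2
rhs = norm_a ** 2 + norm_b ** 2 - 2 * norm_a * norm_b * np.cos(theta)

print(np.isclose(lhs, rhs))  # True
```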

<figure>
<img src='https://storage.googleapis.com/reighns/reighns_ml_projects/docs/linear_algebra/linear_algebra_theory_intuition_code_chap3_fig_3.31_law_of_cosine.PNG' width="500" height="350" align="center"/>
<figcaption align = "center"><b>Fig 3.31; Law of Cosine</b></figcaption>
</figure>


#### Proof of Algebraic and Geometric Equivalence of DOT Product

#TODO

### Exercise 1:

The code for the solution is presented below. Note that the number of elements in the weights vector must equal the number of vectors.

In [16]:
import numpy as np
from typing import List

# as col vector
v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
v2 = np.asarray([2, 4, 6, 8, 10]).reshape(-1, 1)
v3 = np.asarray([3, 6, 9, 12, 15]).reshape(-1, 1)

weights = [10, 20, 30]


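def linear_combination_vectors(
    weights: List[float], *args: np.ndarray
) -> np.ndarray:
    """Computes the linear combination of vectors.

    Args:
        weights (List[float]): The set of weights corresponding to each vector.
        *args (np.ndarray): The vectors to combine, all of the same shape.

    Returns:
        linear_weighted_sum (np.ndarray): The linear combination of vectors.

    Raises:
        ValueError: If the number of weights differs from the number of vectors.

    Examples:
        >>> v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
        >>> v2 = np.asarray([2, 4, 6, 8, 10]).reshape(-1, 1)
        >>> v3 = np.asarray([3, 6, 9, 12, 15]).reshape(-1, 1)
        >>> weights = [10, 20, 30]
        >>> linear_combination_vectors([10, 20, 30], v1, v2, v3)
    """
    # zip() silently stops at the shorter input, so check the lengths explicitly.
    if len(weights) != len(args):
        raise ValueError("Expected exactly one weight per vector.")

    # Start from a float zero vector with the same shape as the inputs.
    linear_weighted_sum = np.zeros(shape=args[0].shape)
    for weight, vec in zip(weights, args):
        linear_weighted_sum += weight * vec
    return linear_weighted_sum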

In [17]:
linear_combination_vectors(weights, v1, v2, v3)

array([[140.],
       [280.],
       [420.],
       [560.],
       [700.]])
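
The same linear combination can be written as a matrix-vector product by stacking the vectors as columns. This is an alternative sketch rather than the exercise's intended solution; the names `V` and `combination` are introduced here for illustration.

```python
import numpy as np

v1 = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
v2 = np.array([2, 4, 6, 8, 10]).reshape(-1, 1)
v3 = np.array([3, 6, 9, 12, 15]).reshape(-1, 1)
weights = np.array([10, 20, 30]).reshape(-1, 1)

V = np.hstack([v1, v2, v3])  # shape (5, 3), one vector per column
combination = V @ weights    # shape (5, 1), weighted sum of the columns

print(combination.ravel())  # [140 280 420 560 700]
```

Because `V` has one column per vector, the shape check in the matrix product enforces that there is exactly one weight per vector.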

### Exercise 2:

We want to compute the average of all elements in a vector $\v \in \R^n$, which is given by the formula $$\bar{\v} = \frac{v_1 + v_2 + \cdots + v_n}{n}$$

To make use of the dot product, we can define the all-ones vector $\1 \in \R^n$ and compute $\v^\top \1$, which returns the sum of all elements in $\v$ by the definition of the dot product. Lastly, we divide this sum by the total number of elements $n$.

In [18]:
def dot_product(v1: np.ndarray, v2: np.ndarray) -> float:
    """Computes the dot product of two vectors.

    Both vectors are flattened first, so row and column vectors are
    treated alike.

    Args:
        v1 (np.ndarray): The first vector.
        v2 (np.ndarray): The second vector.

    Returns:
        dot_product_v1_v2 (float): The dot product of two vectors.

    Examples:
        >>> v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
        >>> v2 = np.asarray([2, 4, 6, 8, 10]).reshape(-1, 1)
        >>> dot_product(v1, v2)
        110.0
    """

    dot_product_v1_v2 = 0.0
    # ravel() yields scalars, so the running sum stays a scalar instead of
    # becoming a 1-element array when the inputs are column vectors.
    for element_1, element_2 in zip(v1.ravel(), v2.ravel()):
        dot_product_v1_v2 += element_1 * element_2

    return float(dot_product_v1_v2)

In [20]:
dot_product(v1, v2) == np.dot(v1.T, v2) # same value as np.dot, but our version ignores the orientation of the vectors

array([[ True]])

In [21]:
# as col vector
v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
shape_v1 = v1.shape

ones = np.ones(shape=shape_v1)

total_sum = dot_product(v1, ones)

average = total_sum / v1.shape[0]

print(f"average is {average}")

average is 3.0


### Exercise 3:

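We assume the weighted mean is normalized, so that the weights of all the elements sum to 1. With weights $\w$ normalized in this way, the weighted mean is simply the dot product $\v^\top \w$, with no further division.

In [25]:
# as col vector
v1 = np.asarray([1, 2, 3, 4, 5]).reshape(-1, 1)
shape_v1 = v1.shape

random_weights = np.random.rand(*shape_v1)
# divide by the sum of the weights so that they add up to 1
normalized_random_weights = random_weights / random_weights.sum()

# the weights already sum to 1, so the dot product is the weighted average
weighted_average = dot_product(v1, normalized_random_weights)

# the weights are random and unseeded, so the printed value differs between
# runs; it always lies between 1 and 5, the smallest and largest entries of v1
print(f"weighted average is {weighted_average}")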
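
As a cross-check, a weighted mean with weights normalized to sum to 1, as this exercise assumes, can be compared against `np.average`, which accepts unnormalized weights and normalizes them internally. The seed and the names `raw_weights`, `weighted_mean` and `reference` are assumptions made for this sketch.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

rng = np.random.default_rng(42)  # arbitrary seed, for repeatability only
raw_weights = rng.random(v.shape[0])
weights = raw_weights / raw_weights.sum()  # normalized so the weights sum to 1

# With normalized weights the weighted mean is just a dot product.
weighted_mean = v @ weights

# np.average divides by the sum of the weights itself.
reference = np.average(v, weights=raw_weights)

print(np.isclose(weighted_mean, reference))  # True
```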
