# Vectors

**Definition:** A (*Euclidean*) *vector* is a finite ordered list of real numbers.

For the time being we will only work with Euclidean vectors, so for now we will drop the term Euclidean and just refer to these objects as vectors. Later on, we will be talking about other types of vectors and then the distinction will be useful again. And yes, infinite vectors can be considered too, but that topic falls outside the scope of what will be covered here.

We write vectors using rectangular brackets and organize their data into vertical columns (*column vectors*), unless told to do otherwise (we will work with *row vectors* later on). Typically, when typing we represent vectors with bolded lower-case letters from the end of the alphabet; when writing by hand we add a small arrow atop the letter. For example,

$$
\mathbf{v} = \begin{bmatrix}
                    1 \\
                    2 \\
                    3
             \end{bmatrix},
\mathbf{u} = \begin{bmatrix}
                    -1 \\
                    \hfill 0 \\
                    \hfill 1
                \end{bmatrix}
$$

are vectors with three entries, and if I were writing this by hand instead of typing it I would have written $\mathbf{v}$ as $\vec{v}$. The entries in a vector are called its *components*, and they are distinguished via subscripts. For example, in the vectors above, the third component of $\mathbf{v}$ is $v_3=3$, and the second component of $\mathbf{u}$ is $u_2=0$. The number of components of a vector is its *dimension*. Both of the vectors above are 3-dimensional.

Linear algebra originated over 200 years ago to address the need to describe geometric objects with both a direction and a magnitude. For example, a force applied to an object has both a magnitude (the amount of force applied) and a direction in which that force is applied. The term for such an object, 'vector', comes from Latin, where it means 'carrier.' A helpful way to think about a Euclidean vector originating at point A and terminating at point B is as the force needed to move from point A to point B.

#TODO illustration here - explanation of how we draw vectors in two and three dimensions

**Important:** When we represent a vector as an arrow, as we have done above, the arrow need not start at the origin; it can start anywhere. Thus, the representation of a vector as an arrow is not unique. However, the arrow's magnitude and direction are always the same.

**Important:** Mathematicians typically use 1-based indexing for vectors; that is, the first component of a vector $\mathbf{u}$ is $u_1$. In contrast, the convention in most programming languages (and in particular, the convention in the Python programming language that we are using here) is to use 0-based indexing, where the first component in a list or array is the $0^{th}$ component. Since this book is as much about programming as it is about math we will have to deal with this inconsistency frequently. Going forward, we will adopt the mathematical convention of using 1-based indexing unless we are actually writing code, when we will switch to 0-based indexing. 
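The indexing mismatch described above can be illustrated with a short sketch. It assumes the vector is stored as a plain Python list, and the helper `component` is a hypothetical name introduced here only for illustration.

```python
# The vector u = [-1, 0, 1] from earlier, stored as a plain Python list.
u = [-1, 0, 1]

# The mathematical component u_1 (1-based) lives at Python index 0 (0-based).
first = u[0]   # u_1
second = u[1]  # u_2
third = u[2]   # u_3


def component(vec, i):
    """Return the i-th component of vec using 1-based (mathematical) indexing.

    Hypothetical helper, shown only to make the index shift explicit.
    """
    if not 1 <= i <= len(vec):
        raise IndexError("component index must be between 1 and len(vec)")
    return vec[i - 1]


print(component(u, 2))  # u_2, which is 0
```

In practice most code simply uses the 0-based index directly; the helper only makes the off-by-one shift visible.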

## Vector Arithmetic

In general, arithmetic with vectors is only defined when the vectors have the same number of components. As long as that is the case, most arithmetic operations on vectors are straightforward, because most operations are done *componentwise*; that is, between the components of the vectors.

**Vector Addition:** Let $\mathbf{u}$ and $\mathbf{v}$ be vectors with $n$ components. Then $\mathbf{w} = \mathbf{u} + \mathbf{v}$ is the vector whose $i^{th}$ component is $w_i = u_i + v_i$.

**Example:**

$$
    \begin{bmatrix}
        \hfill 1 \\
        \hfill 2 \\
        -1
    \end{bmatrix} + 
    \begin{bmatrix}
        \hfill 2 \\
        -3 \\
        \hfill 0
    \end{bmatrix} = 
    \begin{bmatrix}
        \hfill 1 + 2 \\
        \hfill 2 - 3 \\
        -1 + 0
    \end{bmatrix} =
    \begin{bmatrix}
        \hfill 3 \\
        -1 \\
        -1
    \end{bmatrix}
$$

**Example:**

$$
    \begin{bmatrix}
        1 \\
        2
    \end{bmatrix} + 
    \begin{bmatrix}
        7 \\
        4 \\
        0
    \end{bmatrix}
$$

is undefined, because the vectors do not have the same number of components.
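As a minimal sketch of componentwise addition, the two examples above might be reproduced as follows. Vectors are assumed to be plain Python lists, and the function name `add` is an illustrative choice rather than a library function.

```python
def add(u, v):
    """Componentwise sum of two vectors stored as lists."""
    if len(u) != len(v):
        # Mirrors the text: addition is undefined for mismatched dimensions.
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]


print(add([1, 2, -1], [2, -3, 0]))  # [3, -1, -1]

try:
    add([1, 2], [7, 4, 0])
except ValueError as err:
    print("undefined:", err)
```

The explicit length check matters because `zip` on its own would silently stop at the shorter list instead of reporting the mismatch.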

This simple componentwise addition has an elegant visual interpretation illustrated below. Thinking of vectors as representing forces, we see here the interaction of multiple forces acting on a single point simultaneously.

#TODO illustration here

Subtraction is also done componentwise.

**Vector Subtraction:** Let $\mathbf{u}$ and $\mathbf{v}$ be vectors with $n$ components. Then $\mathbf{w} = \mathbf{u} - \mathbf{v}$ is the vector whose $i^{th}$ component is $w_i = u_i - v_i$.

**Example:**

$$
    \begin{bmatrix}
        \hfill 1 \\
        \hfill 2 \\
        -1
    \end{bmatrix} - 
    \begin{bmatrix}
        \hfill 2 \\
        -3 \\
        \hfill 0
    \end{bmatrix} = 
    \begin{bmatrix}
        1 - 2 \\
        2 - (-3) \\
        -1 - 0
    \end{bmatrix} =
    \begin{bmatrix}
        -1 \\
        \hfill 5 \\
        -1
    \end{bmatrix}
$$

**Scalar Multiplication:** Given a real number $s$ and a Euclidean vector $\mathbf{u}$, we define a multiplication between $s$ and $\mathbf{u}$ as follows: $s\cdot\mathbf{u}$ is the vector whose $i^{th}$ component is $s\cdot u_i$.

**Example:**

$$
    4 \cdot 
    \begin{bmatrix}
        \hfill 3 \\
        -1 \\
        \hfill 2
    \end{bmatrix} = 
    \begin{bmatrix}
        4 \cdot 3 \\
        4 \cdot (-1) \\
        4 \cdot 2
    \end{bmatrix} = 
    \begin{bmatrix}
        \hfill 12 \\
        -4 \\
        \hfill 8
    \end{bmatrix}
$$
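The scalar multiplication example above can be sketched in a few lines. The function name `scale` and the list representation are assumptions made for illustration.

```python
def scale(s, u):
    """Multiply every component of the vector u by the scalar s."""
    return [s * ui for ui in u]


print(scale(4, [3, -1, 2]))  # [12, -4, 8]
```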

This multiplication is called scalar multiplication because it *scales* the vector: that is, it changes the magnitude of the vector but leaves it pointing along the same line (a positive scalar preserves the direction, while a negative scalar reverses it). You can see this illustrated below. Because of this, we commonly refer to real numbers as *scalars* in linear algebra.

#TODO illustration here

Scalar multiplication and vector addition satisfy the following associative and distributive properties: if $s, t$ are scalars and $\mathbf{u}, \mathbf{v}, \mathbf{w}$ are vectors with the same number of components, then
- $s(\mathbf{u} + \mathbf{v}) = s\mathbf{u} + s\mathbf{v}$ (distributive property),
- $s(t\mathbf{u}) = (st)\mathbf{u}$ (associative property),
- $(s + t)\mathbf{u} = s\mathbf{u} + t\mathbf{u}$ (distributive property), and
- $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$ (associative property).

Pause for a moment and make up several examples to verify some of these properties.
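One way to carry out such a check is numerically. The sketch below tests the four properties on one arbitrarily chosen set of integer values; the helpers `add` and `scale` are illustrative names, and a passing check on one example is of course not a proof.

```python
def add(u, v):
    """Componentwise sum of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]


def scale(s, u):
    """Scalar multiple of a vector."""
    return [s * ui for ui in u]


# Arbitrary sample values (integers, so the comparisons are exact).
s, t = 2, -3
u, v, w = [1, 2, -1], [2, -3, 0], [4, 0, 5]

checks = {
    "s(u + v) = su + sv": scale(s, add(u, v)) == add(scale(s, u), scale(s, v)),
    "s(tu) = (st)u": scale(s, scale(t, u)) == scale(s * t, u),
    "(s + t)u = su + tu": scale(s + t, u) == add(scale(s, u), scale(t, u)),
    "u + (v + w) = (u + v) + w": add(u, add(v, w)) == add(add(u, v), w),
}

for name, holds in checks.items():
    print(name, holds)
```

Integers are used deliberately: with floating-point values, the two sides of an identity can differ by rounding error and would need an approximate comparison.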

Note that vector subtraction can be seen as a combination of vector addition and scalar multiplication with a negative scalar; for example,

$$
     2\cdot
     \begin{bmatrix}
         2 \\
         1 \\
         0
     \end{bmatrix} - 4
     \begin{bmatrix}
         \hfill 1 \\
         \hfill 1 \\
         -1
     \end{bmatrix} =
     \begin{bmatrix}
         4 \\
         2 \\
         0
     \end{bmatrix} + (-4)
     \begin{bmatrix}
         \hfill 1 \\
         \hfill 1 \\
         -1
     \end{bmatrix} =
     \begin{bmatrix}
         4 \\
         2 \\
         0
     \end{bmatrix} +
     \begin{bmatrix}
         -4 \\
         -4 \\
         \hfill 4
     \end{bmatrix} =
     \begin{bmatrix}
         \hfill 0 \\
         -2 \\
         \hfill 4
     \end{bmatrix}.
$$
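The calculation above can be checked with a short sketch that computes the result both ways: once by direct subtraction and once by adding a negative scalar multiple. The helper names `add`, `subtract`, and `scale` are assumptions introduced for illustration.

```python
def add(u, v):
    """Componentwise sum of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [ui + vi for ui, vi in zip(u, v)]


def subtract(u, v):
    """Componentwise difference of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [ui - vi for ui, vi in zip(u, v)]


def scale(s, u):
    """Scalar multiple of a vector."""
    return [s * ui for ui in u]


a = [2, 1, 0]
b = [1, 1, -1]

direct = subtract(scale(2, a), scale(4, b))   # 2a - 4b
combined = add(scale(2, a), scale(-4, b))     # 2a + (-4)b

print(direct)    # [0, -2, 4]
print(combined)  # [0, -2, 4]
```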

## The Dot Product

Multiplying vectors is more ambiguous than the preceding operations. There are a number of different ways that one might choose to define multiplication between vectors, including simply doing the multiplication componentwise to produce a new vector, analogous to what we did previously with addition and subtraction. Unfortunately, multiplying vectors componentwise to produce another vector doesn't have an intuitive interpretation or an obvious application, but interestingly enough, if we multiply the corresponding components of two vectors together and *add up* the resulting products, we get a scalar that has several very important uses. This operation is called the *dot product* or *scalar product* (the latter because the result of the operation is a scalar), and it is defined precisely below.

**Definition:** Let $\mathbf{u},\mathbf{v}$ be vectors with $n$ components. The *dot product* of $\mathbf{u}$ and $\mathbf{v}$ is the scalar value

$$
    \mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^n u_iv_i.
$$

**Important:** Just like the previous operations that we have discussed, the dot product of vectors of different dimension is not defined.

**Example:** 

$$
    \begin{bmatrix}
        \hfill 1 \\
        \hfill 2 \\
        -1
    \end{bmatrix} \cdot 
    \begin{bmatrix}
        \hfill 2 \\
        -3 \\
        \hfill 0
    \end{bmatrix} = 1\cdot 2 + 2\cdot (-3) + (-1) \cdot 0 = 2 - 6 + 0 = -4.
$$

**Example:** 

$$
    \begin{bmatrix}
        \hfill 1 \\
        -1 
    \end{bmatrix} \cdot 
    \begin{bmatrix}
        1 \\
        1
    \end{bmatrix} = 1\cdot 1+ (-1) \cdot 1 = 1 +  (-1) = 0.
$$

**Exercise:** Sketch the vectors above in standard position. Notice the angle between them.

The dot product has several nice properties. Given vectors $\mathbf{x}, \mathbf{y}, \mathbf{w}$, we can show that

- $\mathbf{x}\cdot\mathbf{y} = \mathbf{y}\cdot\mathbf{x}$ (commutativity)
- $(\mathbf{x} + \mathbf{y})\cdot\mathbf{w} = \mathbf{x}\cdot\mathbf{w} + \mathbf{y}\cdot\mathbf{w}$ (distributivity).
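The dot product definition, the two worked examples, and the two properties just listed can be sketched as follows. Vectors are assumed to be plain Python lists, `dot` is an illustrative helper name, and the vector `w` is an arbitrary sample chosen for the check.

```python
def dot(u, v):
    """Dot product: sum of the products of corresponding components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))


print(dot([1, 2, -1], [2, -3, 0]))  # -4
print(dot([1, -1], [1, 1]))         # 0

# Spot-check commutativity and distributivity on sample integer vectors.
x, y, w = [1, 2, -1], [2, -3, 0], [4, 0, 5]
x_plus_y = [xi + yi for xi, yi in zip(x, y)]

commutes = dot(x, y) == dot(y, x)
distributes = dot(x_plus_y, w) == dot(x, w) + dot(y, w)
print(commutes, distributes)
```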

**Example:** We can calculate the dot product of a vector with itself:

$$
    \begin{bmatrix}
        \hfill 2 \\
        -1 
    \end{bmatrix} \cdot 
    \begin{bmatrix}
        \hfill 2 \\
        -1
    \end{bmatrix} = 2\cdot 2+ (-1) \cdot (-1) = 2^2 +  (-1)^2 = 4 + 1 = 5.
$$

This is where we find our first use for the dot product. Notice that in the previous example, the calculation is reminiscent of the Pythagorean Theorem $a^2 + b^2 = c^2$. When we calculate the dot product of a vector with itself, we are using the Pythagorean Theorem to determine the squared magnitude, or length, of the vector. We can sketch this for the example above because it is two-dimensional, but the principle holds no matter how many dimensions we are working with.

#TODO illustration

**Definition:** Let $\mathbf{u}$ be a vector. The *norm* of $\mathbf{u}$ is written $||\mathbf{u}||$ and defined as the square root of the dot product of $\mathbf{u}$ with itself:

$$
    ||\mathbf{u}|| = \sqrt{\mathbf{u}\cdot\mathbf{u}} = \sqrt{\sum_{i=1}^n u_i^2}.
$$
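A minimal sketch of the norm, built directly on the dot product as in the definition above. The helper names `dot` and `norm` are illustrative, and the vector `[3, 4]` is an extra example chosen because its norm is a whole number.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Norm of u: the square root of the dot product of u with itself."""
    return math.sqrt(dot(u, u))


print(norm([3, 4]))   # 5.0
print(norm([2, -1]))  # the square root of 5, roughly 2.236
```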

Note that strictly from the definition we can see that the norm of a vector is never negative, and it equals zero only if every component of the vector is 0. We call a vector whose components are all 0 a *zero vector* or *trivial vector* and write it as $\mathbf{0}$.

**Definition:** A *unit vector* is a vector with norm 1.

Given a nonzero vector $\mathbf{x}$, it is easy to produce a unit vector that points in the same direction as $\mathbf{x}$: simply calculate $||\mathbf{x}||$ and multiply $\mathbf{x}$ by $1/||\mathbf{x}||$; that is, calculate 

$$
    \mathbf{u} = (1/||\mathbf{x}||)\cdot \mathbf{x} = \mathbf{x} / ||\mathbf{x}||.
$$

This is scalar multiplication, so we already know that this scales $\mathbf{x}$ without changing its direction, and you can check for small examples that the norm of the resulting vector $\mathbf{u}$ is 1.

**Example:** Let 
$$
    \mathbf{x} = \begin{bmatrix}
                    2 \\
                    1 \\
                    1
                 \end{bmatrix}.
$$
Then $||\mathbf{x}|| = \sqrt{2^2 + 1^2 + 1^2} = \sqrt{6}$. Let
$$
    \mathbf{u} = \frac{1}{\sqrt{6}}\cdot\begin{bmatrix}
                                    2 \\
                                    1 \\
                                    1
                                 \end{bmatrix} = 
                                 \begin{bmatrix}
                                    2/\sqrt{6} \\
                                    1/\sqrt{6} \\
                                    1/\sqrt{6}
                                 \end{bmatrix}.
$$
Now calculate $||\mathbf{u}|| = \sqrt{(2/\sqrt{6})^2 + (1/\sqrt{6})^2 + (1/\sqrt{6})^2} = \sqrt{4/6 + 1/6 + 1/6} = \sqrt{1} = 1$.
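The same normalization can be sketched in code. The function name `normalize` is an assumption for illustration, and because the arithmetic is floating-point, the norm of the result is compared to 1 approximately rather than exactly.

```python
import math


def norm(u):
    """Norm of a vector stored as a list."""
    return math.sqrt(sum(ui * ui for ui in u))


def normalize(x):
    """Return the unit vector pointing in the same direction as x."""
    length = norm(x)
    if length == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("cannot normalize the zero vector")
    return [xi / length for xi in x]


u = normalize([2, 1, 1])
print(u)
print(math.isclose(norm(u), 1.0))  # True
```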

The second place where we find use for the dot product is in determining the angle between two vectors. Consider the example above where the dot product of two vectors was zero. You probably noticed when you sketched them that they are perpendicular; that is, the angle between them is $90^{\circ}$ or $\pi/2$ radians. In fact, the following is true:

**Theorem:** Let $\mathbf{u}, \mathbf{v}$ be vectors. $\mathbf{u}\cdot\mathbf{v} = ||\mathbf{u}||||\mathbf{v}||\cos{\theta}$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.

There are a number of ways to prove this fact, generally all based on the Law of Cosines. A complete proof will be left to the exercises, but below we will give a proof in the two-dimensional case that relies on the fact that (regardless of dimension) the dot product is *invariant to rotation*; that is, the dot product between two vectors will be the same even if the vectors are rotated, as long as the vectors are rotated by the same amount. We will revisit and prove this very important fact later on, but you can easily test it yourself with a quick example or two. For instance, compare the dot products of the vectors

$$
    \mathbf{u} = \begin{bmatrix}
                    1 \\
                    1
                 \end{bmatrix},
    \mathbf{v} = \begin{bmatrix}
                    -1 \\
                    1 \\
                 \end{bmatrix}\text{ and }
    \mathbf{u}_{45} = \begin{bmatrix}
                        0 \\
                        \sqrt{2}
                      \end{bmatrix},
    \mathbf{v}_{45} = \begin{bmatrix}
                        -\sqrt{2} \\
                        0
                      \end{bmatrix},
$$

where the second pair of vectors are just the first vectors rotated counterclockwise by $45^{\circ}$ or $\pi/4$ radians.
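This comparison can also be run numerically. The sketch below rotates two-dimensional vectors with the standard counterclockwise rotation formula $(x, y) \mapsto (x\cos\alpha - y\sin\alpha,\ x\sin\alpha + y\cos\alpha)$, which this text has not introduced yet and is assumed here; the helper names are illustrative. Results are compared approximately because of floating-point rounding.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def rotate(vec, angle):
    """Rotate a 2-D vector counterclockwise by `angle` radians."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return [c * x - s * y, s * x + c * y]


u, v = [1, 1], [-1, 1]
u45, v45 = rotate(u, math.pi / 4), rotate(v, math.pi / 4)

print(u45, v45)  # approximately [0, sqrt(2)] and [-sqrt(2), 0]
print(math.isclose(dot(u, v), dot(u45, v45), abs_tol=1e-12))  # True

# A second pair, rotated by an arbitrary angle of 1 radian.
p, q = [2, 1], [1, 3]
same = math.isclose(dot(p, q), dot(rotate(p, 1.0), rotate(q, 1.0)))
print(same)  # True
```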

**Proof (in 2 dimensions):** Let $\mathbf{x}$ and $\mathbf{y}$ be nonzero vectors in two dimensions and let $\mathbf{u} = \mathbf{x} / ||\mathbf{x}||$ and $\mathbf{v} = \mathbf{y} / ||\mathbf{y}||$. Then $\mathbf{u}$ and $\mathbf{v}$ are unit vectors. Because the dot product is invariant to rotation, we can rotate $\mathbf{x}$ and $\mathbf{y}$ until $\mathbf{x}$ points along the positive $x$-axis without changing the value of $\mathbf{x}\cdot\mathbf{y}$. After this rotation,

$$
    \mathbf{u}=\begin{bmatrix}
                    1 \\
                    0
                \end{bmatrix}\text{ and }
    \mathbf{v}=\begin{bmatrix}
                    \cos{\theta} \\
                    \sin{\theta}
                \end{bmatrix},
$$

where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$ (equivalently, the angle between $\mathbf{u}$ and $\mathbf{v}$). We get the latter form for $\mathbf{v}$ because as a unit vector $\mathbf{v}$ terminates in a point on the unit circle when in standard position. Now a direct calculation shows that

$$
    \mathbf{u}\cdot\mathbf{v} = \cos{\theta},
$$

but

$$
    \mathbf{u}\cdot\mathbf{v} = \frac{\mathbf{x}}{||\mathbf{x}||}\cdot\frac{\mathbf{y}}{||\mathbf{y}||} = \frac{\mathbf{x}\cdot\mathbf{y}}{||\mathbf{x}||||\mathbf{y}||},
$$

so 

$$
    \mathbf{x}\cdot\mathbf{y} = ||\mathbf{x}||||\mathbf{y}||\cos{\theta}.
$$

$\blacksquare$

One immediate application of this theorem is that it gives us a way to quickly tell if two vectors meet at a right angle: if so, their dot product is 0 because $\cos(\pi/2)=0$. Similarly, when two vectors point in the same direction $\theta=0$, and thus their dot product will be the product of their norms, because $\cos(0) = 1$. More generally, the following inequality is often useful.
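Solving the theorem for $\theta$ gives $\theta = \arccos\left(\mathbf{u}\cdot\mathbf{v} / (||\mathbf{u}||||\mathbf{v}||)\right)$ for nonzero vectors. A minimal sketch of that calculation follows; the function name `angle_between` is an assumption, and the clamping step is a floating-point safeguard rather than part of the mathematics.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Norm of a vector."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in radians between two nonzero vectors."""
    nu, nv = norm(u), norm(v)
    if nu == 0 or nv == 0:
        raise ValueError("the angle is undefined for the zero vector")
    cos_theta = dot(u, v) / (nu * nv)
    # Rounding can push the ratio slightly outside [-1, 1]; clamp it
    # so that math.acos does not raise a domain error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


# Perpendicular vectors from the earlier example: about 90 degrees.
print(math.degrees(angle_between([1, -1], [1, 1])))

# Vectors pointing in the same direction: angle of about 0.
print(angle_between([1, 2], [2, 4]))
```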

**Theorem (Cauchy-Schwarz Inequality):** Let $\mathbf{u}$ and $\mathbf{v}$ be vectors. Then $|\mathbf{u}\cdot\mathbf{v}| \leq ||\mathbf{u}||||\mathbf{v}||$.

**Proof:** Since $|\cos\theta| \leq 1$, $|\mathbf{u}\cdot\mathbf{v}| = ||\mathbf{u}||||\mathbf{v}|||\cos\theta| \leq ||\mathbf{u}||||\mathbf{v}||$. $\blacksquare$
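As an informal sanity check (not a substitute for the proof), the inequality can be tested on randomly generated vectors. The dimension, value range, sample count, and tolerance below are arbitrary choices made for this sketch.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Norm of a vector."""
    return math.sqrt(dot(u, u))


random.seed(0)  # fixed seed so the run is repeatable
n = 5           # number of components in each vector

all_hold = True
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in range(n)]
    v = [random.uniform(-10, 10) for _ in range(n)]
    # Small tolerance to absorb floating-point rounding.
    if abs(dot(u, v)) > norm(u) * norm(v) + 1e-9:
        all_hold = False

print(all_hold)  # True

# Equality occurs when one vector is a scalar multiple of the other.
a, b = [1, 2, 2], [3, 6, 6]
print(abs(dot(a, b)), norm(a) * norm(b))  # 27 27.0
```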

An immediate corollary to the Cauchy-Schwarz inequality is the *Triangle Inequality* given below.

**Theorem (Triangle Inequality):** Let $\mathbf{u}$ and $\mathbf{v}$ be vectors. Then $||\mathbf{u} + \mathbf{v}|| \leq ||\mathbf{u}|| + ||\mathbf{v}||$.

**Proof:** First, note that $||\mathbf{u} + \mathbf{v}||^2 = (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v})$. Using the distributivity of the dot product, we can write this as
$$
    \mathbf{u}\cdot\mathbf{u} + \mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{u} + \mathbf{v}\cdot\mathbf{v},
$$

which equals

$$
    ||\mathbf{u}||^2 + 2(\mathbf{u}\cdot\mathbf{v}) + ||\mathbf{v}||^2.
$$

Now by the Cauchy-Schwarz Inequality, the above expression is less than or equal to 

$$
    ||\mathbf{u}||^2 + 2||\mathbf{u}||||\mathbf{v}|| + ||\mathbf{v}||^2 = (||\mathbf{u}||+||\mathbf{v}||)^2;
$$

that is,

$$
    ||\mathbf{u} + \mathbf{v}||^2 \leq (||\mathbf{u}||+||\mathbf{v}||)^2,
$$

and taking square roots on both sides leads to the result.
$\blacksquare$

With two-dimensional vectors sketched in the plane, the triangle inequality is obvious: it simply states that the length of the longest side of a triangle cannot be greater than the sum of the lengths of the other sides. It is important to keep in mind, here and going forward, that this result and the others that we've obtained so far hold in higher dimensions where we can't sketch out and easily visualize the behavior of the vectors we are working with. This is part of what makes linear algebra so powerful.
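To see the inequality at work where no sketch is possible, the following informal check tests it on random 10-dimensional vectors. The dimension, value range, sample count, and tolerance are arbitrary assumptions of this sketch, and a passing check is evidence rather than proof.

```python
import math
import random


def add(u, v):
    """Componentwise sum of two equal-length vectors."""
    return [ui + vi for ui, vi in zip(u, v)]


def norm(u):
    """Norm of a vector."""
    return math.sqrt(sum(ui * ui for ui in u))


# A small case that can be checked by hand: 5 <= 3 + 4.
print(norm(add([3, 0], [0, 4])), norm([3, 0]) + norm([0, 4]))  # 5.0 7.0

random.seed(1)  # fixed seed so the run is repeatable
n = 10          # ten components: far beyond what we can draw

triangle_holds = True
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in range(n)]
    v = [random.uniform(-10, 10) for _ in range(n)]
    # Small tolerance to absorb floating-point rounding.
    if norm(add(u, v)) > norm(u) + norm(v) + 1e-9:
        triangle_holds = False

print(triangle_holds)  # True
```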