# Math Camp!
## Linear Algebra & Computing

Computing:
- Use Python in a Jupyter notebook to do a little math and plot the results
- The goal is **not** to teach you how to program.

Linear Algebra:
- Fill gaps from your undergraduate education
- Things we will **not** cover because you will cover them in your courses:
  - Anything related to infinite dimensional vector spaces
  - **Numerical** linear algebra

# Vectors and Matrices
A vector is a list of numbers with a single index; a matrix is a doubly-indexed set of numbers.
Examples
\begin{equation}
\vec{v} = \left(\begin{array}{c}1\\0\\-1\end{array}\right),\quad \mathbf{A} = \left[\begin{array}{cc}1&1\\0&2\end{array}\right]
\end{equation}
The above vector $\vec{v}$ is a 'column' vector with three elements. We say it is $3\times 1$. The matrix has two rows and two columns, so it is $2\times 2$. A matrix with $m$ rows and $n$ columns is $m\times n$. A vector is a special case of a matrix. Here is an example of a $1\times3$ 'row' vector and a $3\times 2$ matrix

$$\vec{v} = (1,\,2,\,3),\;\;\mathbf{A} = \left[\begin{array}{cc}1&1\\0&2\\-1&-1\end{array}\right].$$

Indices are used to pick out individual elements of vectors or matrices. For example

$$\left(\vec{v}\right)_1 = 1,\;\;\left(\mathbf{A}\right)_{3,2} = -1.$$
The standard mathematical convention is for the indices of an $m\times n$ matrix to run from $1$ to $m$ for rows and from $1$ to $n$ for columns. Also if we have a matrix $\mathbf{A}$ it is common to denote the $i,j$ entry by $a_{ij}$.
If we are using specific numbers like $i=2$ and $j=5$ then we might use a comma $a_{2,5}$.

Python has lists and arrays that can act as vectors and matrices, but we will focus on the Numpy package's `ndarray` object.

In [1]:
import numpy as np
A = np.array([[1,1],[0,2],[-1,-1]]) # Double brackets show that it's a 
                                    # matrix
print(A.shape) # This array is 3 x 2; three rows, two columns
print(A)

(3, 2)
[[ 1  1]
 [ 0  2]
 [-1 -1]]


In [2]:
A[1,1] # A particular element is accessed using square brackets
       # Notice that this is not A_{1,1}! Indexing starts at 0, so
       # this is the (2,2) element of the matrix.

2

You can also pick out 'slices' of arrays as follows

In [3]:
print(A[0,:]) # First row of the matrix A

[1 1]
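
As a small sketch of the index shift between mathematical notation and Numpy, the helper `entry` below (a hypothetical name, not part of Numpy) takes 1-based indices and subtracts one before indexing. The column slice uses the same colon notation as the row slice above.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])

def entry(M, i, j):
    """Return the mathematical (i, j) entry of M, with i and j starting at 1."""
    return M[i - 1, j - 1]

print(entry(A, 3, 2))  # the (3,2) entry of A, which is -1
print(A[:, 1])         # second column of A, returned as a 1D array
```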


Matrices that are the same size can be added together element-by-element

$$\left[\begin{array}{cc}1&1\\0&2\\-1&-1\end{array}\right] + \left[\begin{array}{cc}0&1\\2&3\\2&1\end{array}\right] = \left[\begin{array}{cc}1&2\\2&5\\1&0\end{array}\right].$$

You can't add a scalar to a matrix or a vector to a matrix, because the sizes don't match up (a scalar is $1\times1$). But you can multiply a vector or a matrix by a scalar

$$2\times\left[\begin{array}{cc}1&1\\0&2\\-1&-1\end{array}\right] = \left[\begin{array}{cc}2&2\\0&4\\-2&-2\end{array}\right].$$

In [4]:
B = np.array([[0,1],[2,3],[2,1]])
A + B

array([[1, 2],
       [2, 5],
       [1, 0]])

In [5]:
C = np.array([[1,1],[2,3]]) # A 2x2 matrix
A+C # Can't add matrices that are not the same size

ValueError: operands could not be broadcast together with shapes (3,2) (2,2) 

In [6]:
A + 2 # Mathematically you can't add a scalar to a matrix, but in
      # Numpy this operation is defined so that the scalar is
      # added to each element of the matrix

array([[3, 3],
       [2, 4],
       [1, 1]])

In [7]:
2*A # scalar multiplication works as expected

array([[ 2,  2],
       [ 0,  4],
       [-2, -2]])

Matrix multiplication is not defined element-by-element. Instead, here is the definition

$$\left(\mathbf{A}\mathbf{B}\right)_{i,j} = \sum_{k=1}^pa_{ik}b_{kj}.$$

The definition only makes sense when $\mathbf{A}$ has $p$ columns and $\mathbf{B}$ has $p$ rows; otherwise, the matrices are 'not conformable' and the product is undefined. So you can't multiply a $2\times2$ matrix and a $3\times1$ vector.
If $\mathbf{A}$ is $m\times p$ and $\mathbf{B}$ is $p\times n$ then $\mathbf{AB}$ is $m\times n$.

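In general the order matters: matrix multiplication is not commutative. Sometimes $\mathbf{AB} = \mathbf{BA}$, but usually not, and $\mathbf{BA}$ need not even be defined when $\mathbf{AB}$ is.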

In Numpy the `*` does not correspond to matrix multiplication. For example

In [8]:
print(A*B) # Matrix multiplication is not defined for these 3x2
           # matrices, but in Numpy they are multiplied elementwise.

[[ 0  1]
 [ 0  6]
 [-2 -1]]


In [9]:
np.dot(A,B) # Numpy uses "dot" for matrix multiplication. 

ValueError: shapes (3,2) and (3,2) not aligned: 2 (dim 1) != 3 (dim 0)

In [10]:
print(np.dot(A,C)) # A is 3x2 and C is 2x2, so the operation works

[[ 3  4]
 [ 4  6]
 [-3 -4]]


In [11]:
print(A@C) # The @ symbol is a shorter way to do matrix multiplication

[[ 3  4]
 [ 4  6]
 [-3 -4]]
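
To illustrate the earlier remark that the order of multiplication matters, here is a minimal sketch with two $2\times2$ matrices. The names `P` and `Q` are chosen only for this example.

```python
import numpy as np

P = np.array([[1, 1], [0, 2]])
Q = np.array([[0, 1], [1, 0]])
I = np.eye(2, dtype=int)  # 2x2 identity matrix with integer entries

print(P @ Q)
print(Q @ P)
print(np.array_equal(P @ Q, Q @ P))  # False: PQ and QP differ
print(np.array_equal(P @ I, I @ P))  # True: this particular pair commutes
```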


Why does Numpy use the term 'dot'? If two vectors are the same length, we can define the 'dot product' as follows

$$\vec{x}\cdot\vec{y} = \sum_{i=1}^nx_iy_i.$$

For example,

$$\vec{x} = \left(\begin{array}{c}1\\0\\-1\end{array}\right),\;\;\vec{y} = \left(\begin{array}{c}a\\b\\c\end{array}\right),\;\;\vec{x}\cdot\vec{y} = a - c.$$

Now suppose that we have two matrices $\mathbf{A}$ and $\mathbf{B}$ that are conformable.
The $i^\text{th}$ row of $\mathbf{A}$ is, in Python notation, `A[i-1,:]`, while the $j^\text{th}$ column of $\mathbf{B}$ is `B[:,j-1]`. The dot product of these two vectors is precisely

$$\left(\mathbf{A}\mathbf{B}\right)_{i,j} = \sum_{k=1}^pa_{ik}b_{kj}.$$
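
The following sketch checks this for one entry, using the $3\times2$ matrix `A` and the $2\times2$ matrix `C` from the earlier cells. The choice $i=3$, $j=2$ is arbitrary.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])  # 3 x 2
C = np.array([[1, 1], [2, 3]])            # 2 x 2

i, j = 3, 2                   # mathematical (1-based) indices
row_i = A[i - 1, :]           # i-th row of A
col_j = C[:, j - 1]           # j-th column of C
entry = np.dot(row_i, col_j)  # dot product of two 1D arrays

print(entry)                  # -4
print((A @ C)[i - 1, j - 1])  # -4, the same entry taken from the full product
```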

The dot product is an example of an 'inner product.' We will study more general inner products later. We will now briefly look at an 'outer product' that ties back in to the definition of matrix multiplication.

First note that if $\vec{x}$ is an $m\times 1$ column vector and $\vec{y}$ is a $1\times n$ row vector, then their product is an $m\times n$ matrix:

$$\left(\vec{x}\,\vec{y}\right)_{ij} = x_iy_j.$$

For generic vectors we can define the outer product

$$\left(\vec{x}\otimes\vec{y}\right)_{ij} = x_iy_j.$$

Now suppose that we have our two conformable matrices $\mathbf{A}$ and $\mathbf{B}$. Let the $k^\text{th}$ column of $\mathbf{A}$ be denoted $\vec{a}_k = $ `A[:,k-1]` and let the $k^\text{th}$ row of $\mathbf{B}$ be denoted $\vec{b}_k = $ `B[k-1,:]`.
Then the product $\mathbf{AB}$ can also be written as a sum of outer products

$$\mathbf{AB} = \vec{a}_1\otimes\vec{b}_1 + \vec{a}_2\otimes\vec{b}_2 + \ldots + \vec{a}_p\otimes\vec{b}_p = \sum_{k=1}^p\vec{a}_k\otimes\vec{b}_k.$$
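
A quick numerical check of this identity, as a sketch: `np.outer` computes the outer product of two 1D arrays, and the loop index `k` runs from 0 because of Python's indexing. The matrices are the `A` and `C` from the earlier cells.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])  # 3 x 2, so p = 2
C = np.array([[1, 1], [2, 3]])            # 2 x 2

p = A.shape[1]  # number of columns of A = number of rows of C

# Sum over k of (k-th column of A) outer (k-th row of C)
total = sum(np.outer(A[:, k], C[k, :]) for k in range(p))

print(total)
print(np.array_equal(total, A @ C))  # True
```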

A particularly useful case of the foregoing is when $\mathbf{B}$ is just a column vector, $\mathbf{B} = \vec{x}$. In the notation of the previous slide, the rows of $\mathbf{B}$ are $\vec{b}_i$. Each 'row' of a column vector is a single entry, so here $\vec{b}_i = x_i$ and each outer product reduces to a column of $\mathbf{A}$ scaled by a number. So

$$\mathbf{A}\vec{x} = \vec{a}_1x_1+\cdots+\vec{a}_px_p$$

A matrix/vector product thus takes the entries of the vector, multiplies them by the columns of the matrix, and sums them all up.
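
For instance, the sketch below builds $\mathbf{A}\vec{x}$ by hand as a weighted sum of the columns of `A` and compares it with Numpy's result. The vector `x` is an arbitrary choice for illustration.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])  # 3 x 2
x = np.array([3, 2])                      # two entries, one per column of A

# x_1 times the first column plus x_2 times the second column
combo = x[0] * A[:, 0] + x[1] * A[:, 1]

print(combo.tolist())    # [5, 4, -5]
print((A @ x).tolist())  # [5, 4, -5]
```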

One final note about matrix multiplication. Suppose that $\mathbf{A}$ and $\mathbf{B}$ are conformable and write $\mathbf{B}$ in terms of its columns, now using $\vec{b}_j$ for the $j^\text{th}$ column rather than a row:

$$\mathbf{B} = [\vec{b}_1\,\cdots\,\vec{b}_n]$$

Then the columns of the matrix product are

$$\mathbf{AB} = [\mathbf{A}\vec{b}_1\,\cdots\,\mathbf{A}\vec{b}_n]$$
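
The column-by-column view can be checked with a short sketch. It multiplies `A` by each column of `C` separately and reassembles the results with `np.column_stack`, which places 1D arrays side by side as columns.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])  # 3 x 2
C = np.array([[1, 1], [2, 3]])            # 2 x 2

# A times each column of C, one column at a time
cols = [A @ C[:, j] for j in range(C.shape[1])]
AC = np.column_stack(cols)

print(AC)
print(np.array_equal(AC, A @ C))  # True
```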

At this point we have to deal with a convenient feature of Numpy: 1D arrays are not fixed as either 'row' or 'column' vectors. Instead, they are treated as whichever is needed for a particular operation. For example,

In [12]:
x = np.array([3,2,1]) # create a vector with entries 3,2,1.
print(x.shape) # Notice that it's not (3,1) or (1,3)
print(np.dot(x,A)) # Here x is treated as a row vector
A[0,:].shape # If you pull out the first row, it is treated as a
             # generic vector

(3,)
[2 6]


(2,)

In [13]:
x.shape = (3,1) # Force it to be a column vector
np.dot(x,A) # Now the dot product knows that it's a column and won't
            # allow the product

ValueError: shapes (3,1) and (3,2) not aligned: 1 (dim 1) != 3 (dim 0)

We've seen how to add and multiply matrices. Now we define another operation: the transpose. Suppose that you have an $m\times n$ matrix (or vector) $\mathbf{A}$. The transpose is defined by

$$\left(\mathbf{A}^T\right)_{ij} = \left(\mathbf{A}\right)_{ji}.$$

The transpose turns the rows of a matrix into the columns of its transpose. For example

$$\left[\begin{array}{cc}1&2\\3&4\end{array}\right]^T = \left[\begin{array}{cc}1&3\\2&4\end{array}\right].$$

Taking the transpose commutes with addition:

$$\left(\mathbf{A} + \mathbf{B}\right)^T = \mathbf{A}^T + \mathbf{B}^T$$
but it does not commute with multiplication. Instead

$$ \left(\mathbf{AB}\right)^T = \mathbf{B}^T\mathbf{A}^T.$$
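
The reversal of order can be seen in a small sketch, using the `.T` attribute that Numpy arrays provide for the transpose. With a $3\times2$ matrix `A` and a $2\times2$ matrix `C`, the product in the original order is not even conformable after transposing.

```python
import numpy as np

A = np.array([[1, 1], [0, 2], [-1, -1]])  # 3 x 2
C = np.array([[1, 1], [2, 3]])            # 2 x 2

lhs = (A @ C).T  # transpose of the product, 2 x 3
rhs = C.T @ A.T  # product of transposes in reversed order, 2 x 3
print(np.array_equal(lhs, rhs))  # True

# A.T is 2 x 3 and C.T is 2 x 2, so A.T @ C.T is not conformable
try:
    A.T @ C.T
    same_order_works = True
except ValueError:
    same_order_works = False
print(same_order_works)  # False
```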

We can relate the dot product and outer product to the transpose as follows. For column vectors $\vec{x}$ and $\vec{y}$

$$\vec{x}\cdot\vec{y} = \vec{x}^T\vec{y}\text{ (requires the vectors to be the same length)}$$

and

$$\vec{x}\otimes\vec{y} = \vec{x}\vec{y}^T \text{ (doesn't require the vectors to be the same length)}.$$
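
Here is a minimal sketch of both formulas with explicit column vectors, i.e. arrays of shape `(n, 1)`, using the `.T` attribute for the transpose. The numerical values are arbitrary. Note that $\vec{x}^T\vec{y}$ comes back as a $1\times1$ array rather than a plain number.

```python
import numpy as np

x = np.array([[1], [0], [-1]])  # 3 x 1 column vector
y = np.array([[2], [5], [7]])   # 3 x 1 column vector
z = np.array([[1], [2]])        # 2 x 1 column vector

inner = x.T @ y  # (1 x 3)(3 x 1) gives a 1 x 1 array
outer = x @ y.T  # (3 x 1)(1 x 3) gives a 3 x 3 array

print(inner)              # [[-5]]
print(outer.shape)        # (3, 3)
print((x @ z.T).shape)    # (3, 2): lengths need not match for the outer product
```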

A matrix that equals its transpose is called **symmetric**

$$\mathbf{A} = \mathbf{A}^T.$$

A matrix that equals minus the transpose is called **skew-symmetric** or **anti-symmetric**

$$\mathbf{A} = -\mathbf{A}^T.$$

Numpy provides the transpose of a matrix as `np.transpose(A)`; you can also use the shorthand `A.T`. Note that the transpose operation has no effect on 1D arrays in Numpy because they are not assumed to be either row or column vectors. If you force an array to have shape (1,n) then taking its transpose will result in an array of shape (n,1).
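
A short sketch of `.T` in use: it tests one symmetric and one skew-symmetric matrix against their definitions, then shows the behavior on 1D arrays. The matrices `S` and `K` are made up for this example.

```python
import numpy as np

S = np.array([[1, 2], [2, 3]])   # symmetric
K = np.array([[0, 1], [-1, 0]])  # skew-symmetric

print(np.array_equal(S, S.T))    # True:  S equals its transpose
print(np.array_equal(K, -K.T))   # True:  K equals minus its transpose

x = np.array([3, 2, 1])
print(x.T.shape)                 # (3,): transpose does nothing to a 1D array

row = x.reshape(1, 3)            # force a 1 x 3 row vector
print(row.T.shape)               # (3, 1): now the transpose is a column
```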

The **identity** matrix is denoted $\mathbf{I}$. It has entries

$$\left(\mathbf{I}\right)_{ij} := \delta_{ij} = \left\{\begin{array}{ll}1& i=j\\0&i\neq j\end{array}\right.$$

It is always a square matrix ($n\times n$). It has the property that $\mathbf{I}\vec{x} = \vec{x}$ and $\mathbf{IA} = \mathbf{AI} = \mathbf{A}$.

The symbol $\delta_{ij}$ is called the Kronecker delta.
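
In Numpy the identity matrix is available as `np.eye(n)`. The sketch below checks the properties above on an arbitrary $2\times2$ matrix `M` and compares the entries of the identity with the Kronecker delta. Passing `dtype=int` is a choice made here so that the entries print as integers; the default is floating point.

```python
import numpy as np

I2 = np.eye(2, dtype=int)       # 2 x 2 identity matrix
M = np.array([[1, 2], [3, 4]])  # arbitrary square matrix
x = np.array([5, -1])

print(np.array_equal(I2 @ M, M))  # True
print(np.array_equal(M @ I2, M))  # True
print(np.array_equal(I2 @ x, x))  # True

# Entry (i, j) of the identity is 1 when i == j and 0 otherwise
delta_ok = all(I2[i, j] == (1 if i == j else 0)
               for i in range(2) for j in range(2))
print(delta_ok)                   # True
```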

The identity matrix is an example of a **diagonal** matrix. A diagonal matrix has zeros except on the diagonal, i.e. where row index equals column index.

Upper triangular matrices have nonzero entries only on and above the diagonal; every entry below the diagonal is zero:

$$\left[\begin{array}{cccc}*&&&*\\0&\ddots&&\\&&\ddots&\\0&&0&*\end{array}\right]$$

Lower triangular matrices have nonzero entries only on and below the diagonal. The transpose of an upper triangular matrix is a lower triangular matrix.
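
Numpy has helpers for extracting these parts of a matrix: `np.triu` and `np.tril` zero out the entries below or above the diagonal, and `np.diag` pulls out the diagonal of a matrix (or builds a diagonal matrix from a 1D array). A brief sketch on an arbitrary $3\times3$ matrix:

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

U = np.triu(M)           # upper triangular part of M
L = np.tril(M)           # lower triangular part of M
D = np.diag(np.diag(M))  # diagonal matrix built from the diagonal of M

print(U)
print(L)
print(D)

# The transpose of an upper triangular matrix is lower triangular
print(np.array_equal(U.T, np.tril(U.T)))  # True
```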

# Vector Spaces

Consider the set of all possible real vectors of length $n$. This set is called $\mathbb{R}^n$, and is an example of a **vector space**. The mathematical definition of a vector space is somewhat technical, and we don't need the details. The important properties of a vector space from our perspective are
1. You can add vectors and the result is still a vector in the same set.
2. You can multiply vectors by any real number and the result is a vector in the same set.

If point 2 uses complex numbers then it's a complex vector space. $\mathbb{C}^n$ is an example of a complex vector space.

The elements of a vector space can be functions, e.g. polynomials.

# Subspaces

A **subspace** is a subset of a vector space that is closed under addition and scalar multiplication. For example, consider the space $\mathbb{R}^3$ of vectors $(x,y,z)$. The subset of vectors of the form $(x,y,0)$ forms a subspace, because if you multiply by a scalar the last element remains zero, and if you add two of these vectors the last element remains zero.

Technically $\mathbb{R}^2$ is not a subspace of $\mathbb{R}^3$, because elements of $\mathbb{R}^3$ are vectors of length 3 while elements of $\mathbb{R}^2$ are vectors of length 2: $\mathbb{R}^2$ is not a *subset* of $\mathbb{R}^3$, so it can't be a subspace.

As another example, if we consider the vector space of polynomials of degree $\le 2$, then the set of all polynomials of degree $\le 1$ is a subspace.

# Span

Consider a set of vectors $\vec{v}_1,\ldots,\vec{v}_k$ in some vector space.
A **linear combination** of these vectors is a sum of the form

$$x_1\vec{v}_1+x_2\vec{v}_2 + \ldots + x_k\vec{v}_k$$

where $x_1,\ldots,x_k$ are real or complex numbers.

The **span** of these vectors is the set of all possible linear combinations of these vectors.

For example, consider $\vec{v}_1 = (1,0,0)^T$ and $\vec{v}_2 = (0,1,0)^T$. The span of these vectors is the set of all vectors of the form

$$\vec{v} = (x,y,0)^T,\text{ for any }x\text{ and }y.$$

Clearly this is a subspace, and in general **the span of a set of vectors is a subspace**.

# Linear Independence

A set of vectors $\vec{v}_1,\ldots,\vec{v}_k$ is **linearly independent** when the only linear combination that yields zero has all zero coefficients, i.e.

$$x_1\vec{v}_1+x_2\vec{v}_2 + \ldots + x_k\vec{v}_k = \vec{0} \Leftrightarrow x_1=x_2=\ldots=x_k=0.$$

The symbol $\Leftrightarrow$ means *if and only if*, which is sometimes written 'iff'.

A more intuitive definition, though harder to use in proofs, is: a set of vectors is linearly *dependent* when at least one of the vectors can be written as a linear combination of the others, e.g. (after relabeling the vectors if necessary)

$$\vec{v}_1 = x_2\vec{v}_2 + \ldots + x_k\vec{v}_k.$$

It's not too hard to prove that these definitions are equivalent.

## Example
Consider the set of vectors
\begin{equation}
\vec{v}_1 = \left(\begin{array}{c}1\\0\\1\end{array}\right),\;\vec{v}_2 = \left(\begin{array}{c}0\\0\\1\end{array}\right),\;\;\vec{v}_3 = \left(\begin{array}{c}1\\1\\0\end{array}\right).
\end{equation}
Are they linearly independent? A linear combination of these vectors has the form
\begin{equation}
x_1\vec{v}_1 + x_2\vec{v}_2+x_3\vec{v}_3 = \left[\begin{array}{ccc}1&0&1\\0&0&1\\1&1&0\end{array}\right]\left(\begin{array}{c}x_1\\x_2\\x_3\end{array}\right) = \mathbf{A}\vec{x}.
\end{equation}
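The vectors are linearly independent exactly when the linear system $\mathbf{A}\vec{x} = \vec{0}$ has only the solution $\vec{x} = \vec{0}$. We will come back to linear systems in Lecture 2.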
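
In the meantime, a numerical check is possible, with the caveat that it relies on ideas not yet covered here. The sketch below stacks the three vectors as columns and asks Numpy for the rank of the resulting matrix via `np.linalg.matrix_rank`; the rank counts the linearly independent columns, so a rank of 3 indicates that these three vectors are independent. Treat the routine as a black box for now.

```python
import numpy as np

v1 = np.array([1, 0, 1])
v2 = np.array([0, 0, 1])
v3 = np.array([1, 1, 0])

A = np.column_stack([v1, v2, v3])  # the vectors become the columns of A
print(A)

rank = np.linalg.matrix_rank(A)    # number of linearly independent columns
print(rank == 3)                   # True: the three vectors are independent
```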

# Basis & Dimension

If a set of vectors $\vec{v}_1,\ldots,\vec{v}_k$ spans a vector (sub-)space and is linearly independent then
1. They are called a **basis** for the vector (sub-)space and
2. The vector (sub-)space has dimension $k$.

For example, the **standard basis** of $\mathbb{R}^n$ is the columns of the $n\times n$ identity matrix. The vectors in the standard basis are usually denoted $\vec{e}_i$. In $\mathbb{R}^3$ these vectors are

\begin{equation}
\vec{e}_1 = \left(\begin{array}{c}1\\0\\0\end{array}\right),\;\vec{e}_2 = \left(\begin{array}{c}0\\1\\0\end{array}\right),\;\;\vec{e}_3 = \left(\begin{array}{c}0\\0\\1\end{array}\right).
\end{equation}

It's easy to check that they are a basis.

# Coordinates
If $\vec{v}_1,\ldots,\vec{v}_k$ are a basis for some vector space, then any vector $\vec{b}$ in that space can be written as a linear combination of the vectors, and the coefficients in that linear combination are *unique*.

Proof: Suppose that $\vec{b}$ can be written in two ways, $\vec{b} = x_1\vec{v}_1+\ldots+x_k\vec{v}_k = y_1\vec{v}_1+\ldots+y_k\vec{v}_k$. Now subtract to get

$$(x_1-y_1)\vec{v}_1+\ldots+(x_k-y_k)\vec{v}_k = \vec{0}.$$

Because a basis is linearly independent, every coefficient in this sum must be zero, so $x_i=y_i$ for all $i$ (uniqueness).

The coefficients in the linear sum are called **coordinates** and choosing a basis for the space is sometimes called choosing a coordinate system.
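
As an illustration, the coordinates of a vector in a given basis can be found by solving the linear system $\mathbf{V}\vec{x} = \vec{b}$, where the columns of $\mathbf{V}$ are the basis vectors. The sketch below uses `np.linalg.solve` as a black box (linear systems are the subject of a later lecture) and assumes that the three vectors from the linear independence example form a basis of $\mathbb{R}^3$. The vector `b` is an arbitrary choice.

```python
import numpy as np

# Basis vectors, taken from the linear independence example
v1 = np.array([1, 0, 1])
v2 = np.array([0, 0, 1])
v3 = np.array([1, 1, 0])
V = np.column_stack([v1, v2, v3])

b = np.array([2, 1, 3])

x = np.linalg.solve(V, b)  # coordinates of b in the basis v1, v2, v3
print(x)

# Rebuild b from its coordinates as a linear combination of the basis
rebuilt = x[0] * v1 + x[1] * v2 + x[2] * v3
print(np.allclose(rebuilt, b))  # True
```

In the standard basis the coordinates of `b` are just its entries, which is why the distinction between a vector and its coordinates is easy to overlook.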