# Sources

- [Gilbert Strang’s Class - MIT Linear Algebra Fall 2011](https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/resource-index/)
 - Uses Introduction to Linear Algebra, 5th Edition
- [3 blue 1 brown vids - Essence of Linear Algebra](https://www.youtube.com/watch?v=LyGKycYT2v0&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=10)
- May test myself on [Khan](https://www.khanacademy.org/math/linear-algebra) but not planning to use his videos between the textbook and the above vids
- Going to distribute [Numpy](https://numpy.org/doc/stable/user/absolute_beginners.html) stuff as I go.  Taking notes separately on that.

# Introduction to Vectors (1)
Lectures:
* [The Geometry of Linear Equations](https://ocw.mit.edu/courses/18-06sc-linear-algebra-fall-2011/resources/the-geometry-of-linear-equations-1/) - Note, this also covers (2.1)
* [An Overview of Linear Algebra](https://ocw.mit.edu/courses/18-06sc-linear-algebra-fall-2011/resources/an-overview-of-linear-algebra-1/)

## Linear combinations (1.1)

A **linear combination** of vectors $v$ and $w$ has the form $cv + dw$, where $c$ and $d$ are scalars.


## Lengths and Dot Products (1.2)
The dot product of vectors $v = \begin{bmatrix}1 \\ 2\end{bmatrix}$ and $w = \begin{bmatrix}4 \\ 5\end{bmatrix}$ is $v \cdot w = (1)(4) + (2)(5) = 4 + 10 = 14$.

Some algebraic properties of the dot product:
1. Commutative Property: For any two vectors $u$ and $v$, $u \cdot v = v \cdot u$.
2. Scalar Multiplication Property: For any two vectors $u$ and $v$ and any real number $c$, $(cu) \cdot v = u \cdot (cv) = c(u \cdot v)$
3. Distributive Property: For any 3 vectors $u$, $v$, and $w$, $u \cdot (v+w) = u \cdot v + u \cdot w$.

When the dot product of two vectors is zero, they are perpendicular.  More generally, the angle $\theta$ between nonzero vectors $v$ and $w$ satisfies:
$$
\cos \theta = \frac{v \cdot w}{||v||\;||w||}
$$

The length $||v||$ of a vector is $\sqrt{v \cdot v}$. This follows from the Pythagorean theorem.

A **unit vector** is a vector with length 1. Divide any nonzero vector by its length to get a unit vector in the same direction.
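
As a quick check of these definitions, here is a minimal sketch in plain Python. The function names (`dot`, `length`, `unit`, `angle`) are my own choices for illustration; NumPy offers `np.dot` and `np.linalg.norm` for the same jobs.

```python
import math

def dot(v, w):
    """Dot product: multiply matching components and add them up."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(v, w))

def length(v):
    """Length ||v|| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def unit(v):
    """Divide a nonzero vector by its length to get a unit vector."""
    n = length(v)
    return [a / n for a in v]

def angle(v, w):
    """Angle between nonzero vectors, from cos(theta) = v.w / (||v|| ||w||)."""
    return math.acos(dot(v, w) / (length(v) * length(w)))

print(dot([1, 2], [4, 5]))       # 14, the example from these notes
print(length([3, 4]))            # 5.0
print(angle([1, 0], [0, 1]))     # pi/2, since the dot product is zero
```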


### Explanation of Angle Between Two Vectors

The unit vector that makes an angle $\theta$ with the x axis is $\begin{bmatrix}\cos \theta \\ \sin \theta\end{bmatrix}$. We can see this from the unit circle:

![image.png](images/unit-circle.png)

Let's get a geometric understanding for the rule
$$
\cos \theta = \frac{v \cdot w}{||v||\;||w||}
$$

Now suppose instead of forming $\theta$ with the x axis, we have two unit vectors, $U$ and $u$, and they are both rotated from the x axis:

![image.png](images/unit-vector-addition.png)

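$u \cdot U$ would then be $\cos{\alpha}\cos{\beta} + \sin{\alpha}\sin{\beta}$, where $\alpha$ and $\beta$ are the angles the two vectors make with the x axis. From the cosine angle subtraction rule in trigonometry, this is equal to $\cos(\beta - \alpha) = \cos\theta$, where $\theta = \beta - \alpha$ is the angle between the two vectors.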

So we have arrived at the preliminary rule that unit vectors $u$ and $U$ at angle $\theta$ have:

$$u \cdot U = \cos{\theta}$$

Combine this with our observation before that you can divide any vector by its length to get its unit vector, and we arrive at our **cosine formula** for any nonzero vectors $v$ and $w$ by just dividing by their lengths:

$$
\cos \theta = \frac{v \cdot w}{||v||\;||w||}
$$


### Schwarz and Triangle Inequalities

Because all cosines are between -1 and 1, it follows that the absolute value of the dot product, $|v \cdot w|$, cannot exceed the product of the lengths. This is the **Schwarz Inequality**:

$$|v \cdot w| \le ||v||\: ||w||$$

From the Schwarz Inequality [follows](https://math.stackexchange.com/a/91194) the **Triangle Inequality**:

$$||u + v|| \le ||u|| + ||v||$$
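
A short sketch of how that step can go (my own write-up of the standard argument, not taken from the linked answer): expand the squared length of $u + v$ using the distributive property, then bound the middle term with the Schwarz Inequality.

$$
\begin{aligned}
||u + v||^2 &= (u + v) \cdot (u + v) \\
&= ||u||^2 + 2\,(u \cdot v) + ||v||^2 \\
&\le ||u||^2 + 2\,||u||\,||v|| + ||v||^2 \\
&= (||u|| + ||v||)^2
\end{aligned}
$$

Taking square roots of both sides gives the Triangle Inequality.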



## Independence and Dependence

Vectors are **independent** if the only combination of them that gives the zero vector is the one with all coefficients equal to 0.  Vectors are **dependent** if some combination with at least one nonzero coefficient also gives the zero vector.

## Matrices (1.3)

A square matrix is **invertible** (aka **non-singular**) if it has independent (see definition above) column vectors, meaning $Ax = 0$ has only one solution, $x = 0$.

A matrix is **singular** if $Ax=0$ has nonzero solutions (and therefore infinitely many).  In that case $Ax=b$ has either infinitely many solutions or none at all.
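
To make the definitions concrete, here is a small sketch in plain Python. The matrix `S` is a made-up example with dependent columns, and `mat_vec` is just an illustrative helper name.

```python
def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Made-up example: the second column is twice the first, so the columns are dependent.
S = [[1, 2],
     [2, 4]]
print(mat_vec(S, [2, -1]))   # [0, 0]: a nonzero x with Sx = 0, so S is singular

# The coefficient matrix from section 2.1 of these notes; its columns are independent.
A = [[2, -1],
     [-1, 2]]
print(mat_vec(A, [2, -1]))   # [5, -4]: this x does not solve Ax = 0
```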


# Solving Linear Equations (2)

* [Elimination with Matrices](https://ocw.mit.edu/courses/18-06sc-linear-algebra-fall-2011/resources/elimination-with-matrices-1/) - Lecture covering 2.2 and 2.3

## Vectors and Linear Equations (2.1)

Geometrically, it's worth noting that the dot product of each row with $x$ gives the equation of a plane.
When the number of unknowns matches the number of equations, there is _usually_ one solution.

### Matrix, Row, and Column Pictures

Let's say we have $n$ equations and $n$ unknowns. There are three ways to view the system:
* Matrix Form
* Row Picture
* Column Picture

Let's look specifically at these two equations with two unknowns:
$$
2x - y = 0 \\
-x + 2y = 3
$$

In **matrix form**, the **coefficient matrix** multiplies the vector of unknowns, and the product equals the right hand side:
$$
\begin{bmatrix}
2 & -1 \\
-1 & 2
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix} =
\begin{bmatrix}
0 \\
3
\end{bmatrix}
$$

These three pieces are abstractly referred to as $Ax=b$: the coefficient matrix $A$, the vector of unknowns $x$, and the right hand side $b$. When we solve for $x$ using the inverse, we are abstractly computing $x = A^{-1}b$. Note that this only works when $A$ is invertible (defined in 1.3 above, and covered further in 2.5).

The **row picture** looks at one equation at a time. It's what we've seen before with systems of equations: looking for where the lines meet when we graph them.

The **column picture** would have us formulate the equations as combinations of the columns, so:

$$
x 
\begin{bmatrix}
2 \\
-1
\end{bmatrix}
+ 
y
\begin{bmatrix}
-1 \\
2
\end{bmatrix}
= 
\begin{bmatrix}
0 \\
3
\end{bmatrix}
$$

Geometrically, the column picture solves these linear equations through vector addition: we scale each column vector by some amount and add the results to produce the right hand side.
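
For this system the solution is $x = 1$, $y = 2$. A small sketch in plain Python checks it both ways; `linear_combination` is an illustrative helper name, not something from the textbook.

```python
def linear_combination(c, v, d, w):
    """Return the vector c*v + d*w."""
    return [c * vi + d * wi for vi, wi in zip(v, w)]

col1, col2 = [2, -1], [-1, 2]   # the two columns of the coefficient matrix
b = [0, 3]                      # the right hand side
x, y = 1, 2                     # the candidate solution

# Column picture: x copies of column 1 plus y copies of column 2 should give b.
print(linear_combination(x, col1, y, col2))   # [0, 3]

# Row picture: the same x and y satisfy each equation separately.
print(2 * x - y, -x + 2 * y)                  # 0 3
```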

### The Identity Matrix

Multiplying $Ix$ where $I$ is the identity matrix, you get back the x you started with, $Ix=x$.  An example 3x3 identity matrix:

$$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

## The Idea of Elimination (2.2)

**Elimination** is the systematic way of solving linear equations. Elimination proceeds by producing an **upper triangular system** from top to bottom, then solving with **back substitution** from the bottom up.

In the first part, where you're producing the upper triangular system, you subtract a multiple of the above equation from the equation below.  This **multiplier** ($l$) is the entry you want to eliminate divided by the **pivot** above it.  For example, if we have

$$
4x - 8y = 4 \\
3x + 2y = 11
$$

Our multiplier of the first equation would be $l=\frac{3}{4}$, and we'd then subtract that multiplied equation from the 2nd.  We'd then be left with the pivot of $8$ at the bottom right.  To solve $n$ equations we want $n$ pivots.  If there were a 3rd equation we'd use the $8$ pivot to determine our next multiplier and subtract, and so on.
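
Here is a minimal sketch of the two phases in plain Python, using `fractions.Fraction` so the arithmetic stays exact. The function name `solve_by_elimination` is my own, and the sketch assumes no zero pivot shows up (it does no row exchanges).

```python
from fractions import Fraction

def solve_by_elimination(A, b):
    """Solve Ax = b by forward elimination, then back substitution.

    Assumption: every pivot is nonzero, so no row exchanges are needed.
    """
    n = len(A)
    # Carry the right hand side along as an extra entry in each row.
    rows = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]

    # Forward elimination: clear out the entries below each pivot.
    for j in range(n):
        pivot = rows[j][j]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: a row exchange would be needed")
        for i in range(j + 1, n):
            l = rows[i][j] / pivot                      # the multiplier
            rows[i] = [ri - l * rj for ri, rj in zip(rows[i], rows[j])]

    # Back substitution: solve from the bottom equation up.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (rows[i][n] - known) / rows[i][i]
    return x

solution = solve_by_elimination([[4, -8], [3, 2]], [4, 11])
print([int(value) for value in solution])   # [3, 1], i.e. x = 3 and y = 1
```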

### The breakdown of elimination

It's possible for the process of elimination to fail along the way.  Specifically, we might reach a 0 pivot.  There are three possible outcomes:

* No solutions (e.g. $0y=8$).  Geometrically this would be parallel, non-intersecting lines.
* Infinitely many solutions (e.g. $0y=0$). Geometrically this would be two identical lines, which intersect at every point.
* A row exchange can rescue things, for example:

$$
0x + 2y = 4 \\
3x - 2y = 5
$$

Here we would just want to perform a **row exchange** to get a triangular system we could then back-substitute on.

Recall our terminology from earlier on: when we can complete elimination with a full set of nonzero pivots (possibly after row exchanges), we are dealing with a non-singular matrix, whereas the no-solution and infinite-solution cases are singular.

### Extending into 3+ equations

The process involves clearing out the column below each pivot, using multiples of that pivot's row, before moving on to the next pivot.


## Elimination Using Matrices (2.3)

**Elimination matrices** execute our elimination steps.  An elimination matrix $E_{ij}$ eliminates the entry in row $i$, column $j$ by multiplying the $j$th equation by $l_{ij}$ and subtracting it from the $i$th equation.  So for example $E_{21}$ would be the first elimination step, clearing out row 2, column 1.

We need a lot of these $E_{ij}$ matrices to complete elimination, which is why we'll later see they can be combined into one big matrix $E$.  The neatest way to do that is by combining all their inverses $(E_{ij})^{-1}$ into one overall matrix $L = E^{-1}$.  

The special property of $L$ is that all the multipliers $l_{ij}$ fall into place.  Those numbers are mixed up in $E$ (forward elimination from A to U).  Inverting puts the steps and their elimination matrices in the opposite order and prevents the mixup.

### The Matrix Form of One Elimination Step

Suppose we want to subtract two times row 1 from row 2.  The elimination matrix for this step would be:

$$
\begin{bmatrix}
1 & 0 & 0 \\
-2 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
$$

The first and third rows come from the identity matrix $I$. The $-2$ comes from the negative of the multiplier $l$ (2).
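
A small sketch of this step in plain Python. The helper names (`elimination_matrix`, `matmul`) and the matrix `A` are illustrative choices of mine; rows and columns are numbered from 1 to match the $E_{ij}$ notation.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def elimination_matrix(n, i, j, l):
    """Identity matrix with -l placed in row i, column j (1-based)."""
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i - 1][j - 1] = -l
    return E

# Illustrative matrix: row 2 starts with 4 and the first pivot is 2, so l = 2.
A = [[2, 4, -2],
     [4, 9, -3],
     [-2, -3, 7]]

E21 = elimination_matrix(3, 2, 1, 2)
print(E21)             # [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
print(matmul(E21, A))  # [[2, 4, -2], [0, 1, 1], [-2, -3, 7]]
```

Only row 2 changes: it becomes row 2 minus two times row 1, and its first entry is now 0.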

### Matrix Multiplication

Via [MathIsFun](https://www.mathsisfun.com/algebra/matrix-multiplying.html):

![image.svg](images/matrix-multiply.svg)

It works through the dot product of each row and column.

In order to multiply two matrices, the number of columns of A must equal the number of rows of B. The product
AB will have the same number of rows as the first matrix and the same number of columns as the second.

Algebraic rules for matrix multiplication:
* Associative Law is true: $A(BC) = (AB)C$
* Commutative Law is false: Often $AB \ne BA$

A note on matrix multiplication order.  When we multiply on the left side vs right side, it's the difference between acting on rows vs columns, which switches based on order.  Multiplying from the left, we're doing row operations.  Multiplying from the right, we're doing column operations.
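
The size rule and the left-versus-right distinction can be seen in a few lines of plain Python. `matmul` is an illustrative helper (NumPy's `@` operator does the same job), and the matrices are made-up examples.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows, checking that the sizes fit."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2],
     [3, 4]]
P = [[0, 1],
     [1, 0]]   # swaps two rows (or two columns, depending on the side)

print(matmul(P, A))   # [[3, 4], [1, 2]]: multiplying on the left swaps the rows of A
print(matmul(A, P))   # [[2, 1], [4, 3]]: multiplying on the right swaps the columns of A
```

The two products differ, which is also a concrete instance of $AB \ne BA$.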

3Blue1Brown emphasized:
- Viewing Matrices as transformation of space
- Matrix multiplication is just one transformation after another [this may belong in subsequent section]

### The Row Exchange Matrix
To exchange aka permute rows we use another matrix $P_{ij}$ called the **permutation matrix**.  For example, the permutation matrix $P_{23}$ exchanges rows 2 and 3:

$$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0
\end{bmatrix}
$$

Permutation matrices can also reorder several rows at once, not just swap one pair.  We'll see that soon.
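
A sketch of building $P_{23}$ by exchanging two rows of the identity, in plain Python. The helper names and the matrix `A` are illustrative assumptions; rows are numbered from 1 to match the notation.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def permutation_matrix(n, i, j):
    """Identity matrix with rows i and j exchanged (1-based)."""
    P = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    P[i - 1], P[j - 1] = P[j - 1], P[i - 1]
    return P

P23 = permutation_matrix(3, 2, 3)
A = [[1, 2],
     [3, 4],
     [5, 6]]

print(P23)             # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
print(matmul(P23, A))  # [[1, 2], [5, 6], [3, 4]]: rows 2 and 3 exchanged
```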

### The Augmented Matrix

We can augment the matrix $A$ in $Ax=b$ to include $b$ as an extra column, and allow it to change through the process of elimination.


## Rules for Matrix Operations (2.4)

- Lecture for 2.4 and 2.5: [Multiplication and Inverse Matrices](https://ocw.mit.edu/courses/18-06sc-linear-algebra-fall-2011/resources/multiplication-and-inverse-matrices/)

A matrix with $n$ columns can multiply a matrix with $n$ rows:
 
$$A_{m \times n}B_{n \times p} = C_{m \times p}$$


### Multiple ways to multiply matrices

1. We went over the typical dot product way of multiplying matrices above, where the entry in row $i$, and column $j$ of $AB$ is (row $i$ of $A$) $\cdot$ (column $j$ of $B$).

Terminology note: A row times a column (a dot product) is also called an **inner product**.  A column times a row is called an **outer product**.

Now let's talk about additional ways to multiply matrices.

2. Matrix $A$ times every column of $B$: $A\begin{bmatrix}b_1 \cdots b_p \end{bmatrix} = \begin{bmatrix}Ab_1 \cdots Ab_p \end{bmatrix}$.  Recall from the column picture that $Ab_j$ is a combination of the columns of $A$, so each column of $AB$ is a combination of the columns of $A$.

3. Every row of matrix $A$ times matrix $B$: 
$\begin{bmatrix} \text{row }i\text{ of }A\end{bmatrix}B = \begin{bmatrix}\text{row }i \text{ of }AB\end{bmatrix}.$

4. Multiply columns $1$ to $n$ of $A$ times rows $1$ to $n$ of $B$. Add those matrices. So for example:
$$
AB = \begin{bmatrix}a \\ c\end{bmatrix}\begin{bmatrix}E & F\end{bmatrix} + \begin{bmatrix}b \\ d\end{bmatrix}\begin{bmatrix}G & H\end{bmatrix}
$$

You'll find that it works out just like the other methods.
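
The four methods can be compared side by side. This is a sketch in plain Python with illustrative function names and made-up 2 by 2 matrices; each function follows one of the numbered descriptions above.

```python
def column(M, j):
    return [row[j] for row in M]

def by_dot_products(A, B):
    """Way 1: entry (i, j) is (row i of A) . (column j of B)."""
    return [[sum(a * b for a, b in zip(row, column(B, j)))
             for j in range(len(B[0]))] for row in A]

def by_columns(A, B):
    """Way 2: column j of AB is a combination of the columns of A."""
    cols = []
    for j in range(len(B[0])):
        combo = [0] * len(A)
        for k, weight in enumerate(column(B, j)):
            combo = [c + weight * a for c, a in zip(combo, column(A, k))]
        cols.append(combo)
    return [list(row) for row in zip(*cols)]   # turn the columns back into rows

def by_rows(A, B):
    """Way 3: row i of AB is a combination of the rows of B."""
    rows = []
    for row_of_A in A:
        combo = [0] * len(B[0])
        for weight, row_of_B in zip(row_of_A, B):
            combo = [c + weight * b for c, b in zip(combo, row_of_B)]
        rows.append(combo)
    return rows

def by_outer_products(A, B):
    """Way 4: add up (column k of A) times (row k of B)."""
    total = [[0] * len(B[0]) for _ in A]
    for k in range(len(B)):
        for i, a in enumerate(column(A, k)):
            for j, b in enumerate(B[k]):
                total[i][j] += a * b
    return total

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
for method in (by_dot_products, by_columns, by_rows, by_outer_products):
    print(method.__name__, method(A, B))   # each prints [[19, 22], [43, 50]]
```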

### Blocks

Matrices can be added and multiplied by **blocks**, so long as the block sizes correspond to the normal rules-- same size for addition, and columns of the first = rows of the second for multiplication.

Important: Cuts between columns of $A$ must match cuts between rows of $B$.

Matrix block multiplication example:

$A = \begin{bmatrix}A_1 & A_2\end{bmatrix}$ times $B = \begin{bmatrix}B_1 \\ B_2\end{bmatrix}$ is $A_{1}B_1 + A_{2}B_2$.
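
A quick numeric check of that formula in plain Python. The matrices and the place where they are cut are made-up for illustration: $A$ is cut after its second column, so $B$ must be cut after its second row.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def add(X, Y):
    """Add two matrices of the same size entry by entry."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [1, 1]]

# Cut A between columns 2 and 3, and B between rows 2 and 3 (the cuts must match).
A1, A2 = [row[:2] for row in A], [row[2:] for row in A]
B1, B2 = B[:2], B[2:]

blockwise = add(matmul(A1, B1), matmul(A2, B2))
print(blockwise)               # [[4, 5], [10, 11]]
print(blockwise == matmul(A, B))   # True
```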

The blocks must be equal across transposition, so for example you could have:
* Two square matrices split up with each corner a block
* Block columns


## Inverse Matrices (2.5)

If the square matrix $A$ has an inverse, then both $A^{-1}A = I$ and $AA^{-1} = I$.  Note that non-square matrices are not invertible.

Testing for invertibility:

- The _algorithm_ to test invertibility is elimination. $A$ must have $n$ (nonzero) pivots.
- The _algebra_ test for invertibility is the determinant of $A$. $\det A$ must not be $0$.
- The _equation_ that tests for invertibility is $Ax = 0$.  $x = 0$ must be the only solution.

A matrix cannot have more than one inverse.  If you found the left-inverse, it must be the same as the right-inverse.

### The Inverse of a Product AB

If $A$ and $B$ are invertible, then so is $AB$:

$$(AB)^{-1} = B^{-1}A^{-1}$$
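
The reversed order matters, and a small check in plain Python shows it. This sketch uses the standard 2 by 2 inverse formula (swap the diagonal, negate the off-diagonal, divide by the determinant), which is not derived in these notes; the matrices and helper names are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2 by 2 matrix via the determinant formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 2], [3, 4]]

left = inverse_2x2(matmul(A, B))
print(left == matmul(inverse_2x2(B), inverse_2x2(A)))   # True: (AB)^-1 = B^-1 A^-1
print(left == matmul(inverse_2x2(A), inverse_2x2(B)))   # False: the order matters
```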

### Gauss-Jordan Elimination
Gauss-Jordan eliminates $\begin{bmatrix}A & I\end{bmatrix}$ to $\begin{bmatrix}I & A^{-1}\end{bmatrix}$.

The Gauss-Jordan method is to begin with that augmented matrix, $\begin{bmatrix}A & I\end{bmatrix}$, and perform elimination until the left block is upper triangular.  Then, continue doing elimination upwards, so that you have only a diagonal of pivots on the left.  Finally, divide each row by its pivot to get **reduced row echelon form** ($R=I$) on the left hand side.  Then your inverse will be on the right hand side.
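
The three stages can be sketched in plain Python with exact fractions. The function name `gauss_jordan_inverse` is my own, and the sketch assumes every pivot is nonzero (it does not attempt row exchanges).

```python
from fractions import Fraction

def gauss_jordan_inverse(A):
    """Invert A by reducing [A I] to [I A^-1].

    Assumption: no zero pivots appear, so no row exchanges are needed.
    """
    n = len(A)
    # Build the augmented matrix [A I].
    M = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]

    # Stage 1: eliminate downwards until the left block is upper triangular.
    for j in range(n):
        if M[j][j] == 0:
            raise ValueError("zero pivot: needs a row exchange, or A is singular")
        for i in range(j + 1, n):
            l = M[i][j] / M[j][j]
            M[i] = [a - l * b for a, b in zip(M[i], M[j])]

    # Stage 2: eliminate upwards so only the diagonal of pivots remains on the left.
    for j in reversed(range(n)):
        for i in range(j):
            l = M[i][j] / M[j][j]
            M[i] = [a - l * b for a, b in zip(M[i], M[j])]

    # Stage 3: divide each row by its pivot, leaving I on the left.
    for i in range(n):
        pivot = M[i][i]
        M[i] = [a / pivot for a in M[i]]

    return [row[n:] for row in M]   # the right block is the inverse

inverse = gauss_jordan_inverse([[2, -1], [-1, 2]])
print([[str(entry) for entry in row] for row in inverse])
# [['2/3', '1/3'], ['1/3', '2/3']]
```

Multiplying this inverse by the right hand side $b = (0, 3)$ from section 2.1 gives $x = (1, 2)$, matching the solution of that system.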

This helps explain why the determinant can't be 0 for a matrix with an inverse: the determinant is, up to sign, the product of the pivots, and you have to divide by the pivots, which you can't do if one of them is 0.

**Diagonally dominant** matrices are invertible.  If the absolute value of each diagonal entry is larger than the sum of the absolute values of the other entries in its row, then the matrix is invertible.  This follows from the fact that the other row entries cannot add up to balance the diagonal entry.
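
The test itself is easy to state in code. A minimal sketch in plain Python, with an illustrative function name and made-up matrices; note it only checks the (strict) dominance condition, and a matrix that fails it may still be invertible.

```python
def is_diagonally_dominant(A):
    """True if each |diagonal entry| exceeds the sum of the other |entries| in its row."""
    return all(
        abs(row[i]) > sum(abs(a) for k, a in enumerate(row) if k != i)
        for i, row in enumerate(A)
    )

print(is_diagonally_dominant([[3, 1, 1],
                              [1, 3, 1],
                              [1, 1, 3]]))   # True: 3 > 1 + 1 in every row

print(is_diagonally_dominant([[1, 2],
                              [3, 4]]))      # False: 1 is not larger than 2
```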



## Elimination = Factorization: A = LU
## Transposes and Permutations
# Vector Spaces and Subspaces
## Spaces of Vectors
## The Nullspace of A: Solving Ax = 0 and Rx = 0 
## The Complete Solution to Ax = b
## Independence, Basis and Dimension
## Dimensions of the Four Subspaces
# Orthogonality 
## Orthogonality of the Four Subspaces 
## Projections 
## Least Squares Approximations 
## Orthonormal Bases and Gram-Schmidt 
# Determinants
## The Properties of Determinants 
## Permutations and Cofactors 
## Cramer’s Rule, Inverses, and Volumes 
# Eigenvalues and Eigenvectors 
## Introduction to Eigenvalues 
## Diagonalizing a Matrix 
## Systems of Differential Equations 
## Symmetric Matrices 
## Positive Definite Matrices 