---
title: "Matrices I"
format: 
  html:
    toc: true
    code-fold: false
    page-layout: full
    fig-cap-location: bottom
    number-sections: true
    number-depth: 2
jupyter: python3
---

# Motivation
Linear algebra appears almost everywhere in physics, so the matrix-related techniques developed below will be used repeatedly in later lectures. We will therefore spend considerable time on matrices and introduce several numerical techniques in detail.

## Examples from Physics
We discuss some elementary examples from undergraduate physics.

### Rotations in two dimensions
Consider a two-dimensional Cartesian coordinate system. A point $\boldsymbol{r} = (x,y)^T$ can be rotated counter-clockwise through an angle $\theta$ about the origin, producing a new point $\boldsymbol{r}' = (x',y')^T$. The two points' coordinates are related as follows:
$$
\begin{pmatrix}
\cos\theta & -\sin\theta \\
\sin\theta & \cos\theta
\end{pmatrix}
\begin{pmatrix}
x \\
y
\end{pmatrix}
=
\begin{pmatrix}
x' \\
y'
\end{pmatrix}
$$
The $2\times 2$ matrix appearing here is an example of a _rotation matrix_ in Euclidean space. If you know $\boldsymbol{r}'$ and wish to calculate $\boldsymbol{r}$, you need to solve this system of two linear equations. 
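
As a minimal sketch of the inverse problem just described, the snippet below builds the rotation matrix with NumPy, rotates a point, and then recovers the original point by solving the $2\times 2$ linear system. The angle, the point, and the helper name `rotation_matrix` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def rotation_matrix(theta):
    """Counter-clockwise rotation matrix for an angle theta (in radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

theta = np.pi / 6              # illustrative angle (assumption)
r = np.array([1.0, 2.0])       # illustrative point (assumption)

R = rotation_matrix(theta)
r_prime = R @ r                # forward problem: rotate r to get r'

# Inverse problem: given r', solve R r = r' for r
r_recovered = np.linalg.solve(R, r_prime)
print(r_recovered)
```

For a rotation matrix one could also multiply by the transpose, since its inverse equals its transpose; the generic solver is used here only because it mirrors the linear-system viewpoint of this section.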

### Electrostatic potentials
Assume you have $n$ electric charges $q_j$ (which are unknown) held at the positions $\boldsymbol{R}_j$ (which are known). Further assume that you have measured the electric potential $\phi(\boldsymbol{r}_i)$ at $n$ known positions $\boldsymbol{r}_i$. From the definition of the potential (as well as the fact that the potential obeys the principle of superposition), we see that:
$$
\phi(\boldsymbol{r}_i) = \sum_{j=0}^{n-1}\left(\frac{k q_j}{|\boldsymbol{r}_i - \boldsymbol{R}_j|}\right),
$$
where $i = 0,1,\dots,n-1$ and $k$ is the Coulomb constant.
If you assume you have four charges, the above relation turns into the following $4\times 4$ linear system of equations:
$$
\begin{pmatrix}
k/|\boldsymbol{r}_0 - \boldsymbol{R}_0| &k/|\boldsymbol{r}_0 - \boldsymbol{R}_1| &k/|\boldsymbol{r}_0 - \boldsymbol{R}_2| &k/|\boldsymbol{r}_0 - \boldsymbol{R}_3| \\
k/|\boldsymbol{r}_1 - \boldsymbol{R}_0| &k/|\boldsymbol{r}_1 - \boldsymbol{R}_1| &k/|\boldsymbol{r}_1 - \boldsymbol{R}_2| &k/|\boldsymbol{r}_1 - \boldsymbol{R}_3| \\
k/|\boldsymbol{r}_2 - \boldsymbol{R}_0| &k/|\boldsymbol{r}_2 - \boldsymbol{R}_1| &k/|\boldsymbol{r}_2 - \boldsymbol{R}_2| &k/|\boldsymbol{r}_2 - \boldsymbol{R}_3| \\
k/|\boldsymbol{r}_3 - \boldsymbol{R}_0| &k/|\boldsymbol{r}_3 - \boldsymbol{R}_1| &k/|\boldsymbol{r}_3 - \boldsymbol{R}_2| &k/|\boldsymbol{r}_3 - \boldsymbol{R}_3|
\end{pmatrix}
\begin{pmatrix}
q_0 \\ q_1 \\ q_2 \\ q_3
\end{pmatrix}
=
\begin{pmatrix}
\phi(\boldsymbol{r}_0) \\ \phi(\boldsymbol{r}_1) \\ \phi(\boldsymbol{r}_2) \\ \phi(\boldsymbol{r}_3)
\end{pmatrix}
$$
which needs to be solved for the 4 unknowns $q_0$, $q_1$, $q_2$ and $q_3$.
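
The following is a rough sketch of how such a system could be set up and solved with NumPy. The positions, the "true" charges used to generate synthetic potentials, and the choice of units with $k = 1$ are all assumptions made purely for illustration.

```python
import numpy as np

k = 1.0  # assumption: units chosen so that the Coulomb constant equals 1

# Hypothetical charge positions R_j (one per row)
R_pos = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Hypothetical measurement positions r_i (one per row)
r_pos = np.array([[2.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 2.0],
                  [2.0, 2.0, 2.0]])

# Made-up charges, used only to generate synthetic "measured" potentials
q_true = np.array([1.0, -2.0, 0.5, 3.0])

# Coefficient matrix: A[i, j] = k / |r_i - R_j|
dist = np.linalg.norm(r_pos[:, None, :] - R_pos[None, :, :], axis=-1)
A = k / dist

phi = A @ q_true               # synthetic potentials at the points r_i
q = np.linalg.solve(A, phi)    # solve the 4x4 system for the charges
print(q)
```

The broadcasting expression computes all pairwise distances at once; an explicit double loop over $i$ and $j$ would work equally well for a system this small.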

### Principal moments of inertia
In the study of the rotation of a rigid body about an arbitrary axis in three dimensions, you may have encountered the moment of inertia tensor:
$$
I_{\alpha \beta} = \int \rho(\boldsymbol{r}) \left(\delta_{\alpha \beta}r^2 - r_\alpha r_\beta\right)d^3 r,
$$
where $\rho(\boldsymbol{r})$ is the mass density, $r = |\boldsymbol{r}|$, $\alpha$ and $\beta$ denote Cartesian components, and $\delta_{\alpha \beta}$ is the Kronecker delta. 

The moment of inertia tensor is represented by a $3\times 3$ matrix: 
$$
\boldsymbol{I} = 
\begin{pmatrix}
I_{xx} & I_{xy} & I_{xz} \\
I_{yx} & I_{yy} & I_{yz} \\
I_{zx} & I_{zy} & I_{zz}
\end{pmatrix}.
$$
This is a symmetric matrix. It is possible to choose a coordinate system such that the off-diagonal elements vanish. 
The axes of this coordinate system are known as the _principal axes_ of the body at the origin. In this coordinate system the moment of inertia tensor is represented by a diagonal matrix, with diagonal elements $I_0$, $I_1$, and $I_2$, known as the _principal moments_. This is an instance of the "eigenvalue problem".
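
To make the diagonalization concrete, here is a small sketch that replaces the continuous density by a few point masses, so that the integral becomes a sum over masses, and then uses `np.linalg.eigh` to obtain the principal moments and axes. The masses, positions, and the helper name `inertia_tensor` are hypothetical.

```python
import numpy as np

# Hypothetical point masses and positions (a discrete stand-in for rho(r))
masses = np.array([1.0, 2.0, 1.5])
positions = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 1.0],
                      [1.0, 1.0, 0.0]])

def inertia_tensor(masses, positions):
    """Sum over point masses of m * (delta_ab r^2 - r_a r_b)."""
    I_tensor = np.zeros((3, 3))
    for m, r in zip(masses, positions):
        I_tensor += m * (np.dot(r, r) * np.eye(3) - np.outer(r, r))
    return I_tensor

I_tensor = inertia_tensor(masses, positions)

# eigh is intended for symmetric (Hermitian) matrices; it returns the
# eigenvalues in ascending order and orthonormal eigenvectors as columns
moments, axes = np.linalg.eigh(I_tensor)

# In the principal-axes frame the tensor becomes diagonal
I_diag = axes.T @ I_tensor @ axes
print(moments)
print(np.round(I_diag, 12))
```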

## The problems to be solved
First, we look at the problem where we have $n$ unknowns $x_i$, along with $n\times n$ coefficients $A_{ij}$ and $n$ constants $b_i$:
$$
\begin{pmatrix}
A_{00} & A_{01} & \dots & A_{0,n-1} \\
A_{10} & A_{11} & \dots & A_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n-1,0} & A_{n-1,1} & \dots & A_{n-1,n-1} 
\end{pmatrix}
\begin{pmatrix}
x_0 \\ x_1 \\ \vdots \\ x_{n-1}
\end{pmatrix}
= 
\begin{pmatrix}
b_0 \\ b_1 \\ \vdots \\ b_{n-1}
\end{pmatrix}
$$
where we used a comma to separate two indices when this was necessary to avoid confusion. 
These are $n$ equations linear in $n$ unknowns. 

In compact matrix form, this problem is written as 
$$
\boldsymbol{A}\boldsymbol{x} = \boldsymbol{b},
$$
where $\boldsymbol{A}$ is called the _coefficient matrix_. This is a problem that we will spend considerable time solving in this lecture. 
We will be doing this mainly by using the _augmented coefficient matrix_ which places together the elements of $\boldsymbol{A}$ and $\boldsymbol{b}$, i.e.:
$$
(\boldsymbol{A}|\boldsymbol{b})= \left(
\begin{matrix}
A_{00} & A_{01} & \dots & A_{0,n-1} \\
A_{10} & A_{11} & \dots & A_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n-1,0} & A_{n-1,1} & \dots & A_{n-1,n-1} 
\end{matrix}\right|
\left.
\begin{matrix}
b_0 \\ b_1 \\ \vdots \\b_{n-1}
\end{matrix}
\right).
$$
For now we assume the determinant of $\boldsymbol{A}$ satisfies $|\boldsymbol{A}| \neq 0$.

In a course on linear algebra you have seen examples of legitimate operations one can carry out while solving a system of linear equations. 
Such operations change the elements of $\boldsymbol{A}$ and $\boldsymbol{b}$, but leave the solution vector $\boldsymbol{x}$ unchanged. 
Specifically, we are allowed to carry out the following elementary row operations:

- _Scaling_: each row/equation may be multiplied by a nonzero constant (multiplies $|\boldsymbol{A}|$ by the same constant).
- _Pivoting_: two rows/equations may be interchanged (changes sign of $|\boldsymbol{A}|$).
- _Elimination_: a row/equation may be replaced by the sum of that row/equation and a multiple of any other row/equation (doesn't change $|\boldsymbol{A}|$).

Keep in mind that these are operations that are carried out on the augmented coefficient matrix $(\boldsymbol{A}|\boldsymbol{b})$.
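
The effect of the three row operations can be checked numerically. The sketch below applies each one to a small augmented matrix and compares the solution and the determinant before and after; the $3\times 3$ system and the helper name `split` are arbitrary choices made for illustration.

```python
import numpy as np

# Illustrative system (assumption); its determinant is -1, so it is non-singular
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

aug = np.column_stack([A, b])      # augmented matrix (A|b)
x = np.linalg.solve(A, b)
det0 = np.linalg.det(A)

def split(aug):
    """Separate an augmented matrix back into A and b."""
    return aug[:, :-1], aug[:, -1]

# Scaling: multiply row 0 by 3
scaled = aug.copy()
scaled[0] *= 3.0

# Pivoting: interchange rows 0 and 1
swapped = aug.copy()
swapped[[0, 1]] = swapped[[1, 0]]

# Elimination: subtract 0.5 times row 0 from row 1
elim = aug.copy()
elim[1] -= 0.5 * elim[0]

for name, m in [("scaling", scaled), ("pivoting", swapped), ("elimination", elim)]:
    A_new, b_new = split(m)
    same_solution = np.allclose(np.linalg.solve(A_new, b_new), x)
    print(name, same_solution, np.linalg.det(A_new) / det0)
```

The printed ratio shows how each operation rescales the determinant, while the solution stays the same in all three cases.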

Second, we wish to tackle the standard form of the matrix eigenvalue problem:
$$
\boldsymbol{A}\boldsymbol{v} = \lambda \boldsymbol{v}.
$${#eq-eigenvalue}
Here, both $\lambda$ and the column vector $\boldsymbol{v}$ are unknown. This $\lambda$ is called an _eigenvalue_ and $\boldsymbol{v}$ is called an _eigenvector_.

Let's sketch one possible approach to solving this problem. Moving everything to the left-hand side, we have
$$
(\boldsymbol{A} - \lambda \boldsymbol{I})\boldsymbol{v} = \boldsymbol{0},
$$
where $\boldsymbol{I}$ is the $n\times n$ identity matrix and $\boldsymbol{0}$ is an $n\times 1$ column vector made up of $0$s. 
It is easy to see that we are faced with a system of $n$ linear equations: the coefficient matrix here is $\boldsymbol{A} - \lambda \boldsymbol{I}$. 

The trivial solution is $\boldsymbol{v} = \boldsymbol{0}$. In order for a non-trivial solution to exist, we must have a vanishing determinant, $|\boldsymbol{A} - \lambda \boldsymbol{I}| = 0$.
In other words, the matrix $\boldsymbol{A} - \lambda \boldsymbol{I}$ is singular. Expanding the determinant gives us a polynomial equation, known as the _characteristic equation_:
$$
(-1)^n\lambda^n + c_{n-1} \lambda^{n-1} + \cdots + c_1 \lambda + c_0 = 0.
$$

Thus, an $n \times n$ matrix has at most $n$ distinct eigenvalues, which are the roots of the characteristic polynomial. When a root occurs twice, we say that root has multiplicity $2$. If a root occurs only once, in other words if it has multiplicity $1$, we are dealing with a _simple_ eigenvalue.
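
As a small worked case, for $n = 2$ the characteristic equation reduces to $\lambda^2 - \mathrm{tr}(\boldsymbol{A})\,\lambda + |\boldsymbol{A}| = 0$. The sketch below finds its roots for an arbitrarily chosen symmetric matrix and compares them with the eigenvalues returned by NumPy.

```python
import numpy as np

# Illustrative 2x2 matrix (assumption)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For n = 2: det(A - lambda I) = lambda^2 - tr(A) lambda + det(A)
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]

# Roots of the characteristic polynomial
roots = np.sort(np.roots(coeffs).real)

# Reference eigenvalues from NumPy's eigenvalue routine
reference = np.sort(np.linalg.eigvals(A).real)

print(roots)
print(reference)
```

This route is shown only to illustrate the definition. For larger matrices, finding polynomial roots tends to be numerically delicate, so eigenvalues are usually computed by other means.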

Having calculated the eigenvalues, one way to evaluate the eigenvectors is simply by using @eq-eigenvalue again. 

- Specifically, for a given/known eigenvalue, $\lambda_i$, one tries to solve the system of linear equations $(\boldsymbol{A} - \lambda_i\boldsymbol{I})\boldsymbol{v}_i = \boldsymbol{0}$ for $\boldsymbol{v}_i$. 
- For each value $\lambda_i$, we will not be able to determine unique values of $\boldsymbol{v}_i$, because any nonzero multiple of an eigenvector is again an eigenvector. We will therefore limit ourselves to computing the relative values of the components of $\boldsymbol{v}_i$. 
- In what follows we will use the notation $(v_j)_0$, $(v_j)_1$ etc. to denote the $n$ elements of the column vector $\boldsymbol{v}_j$.
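
One possible way to carry out this step numerically is sketched below. For a known simple eigenvalue, the null space of $\boldsymbol{A} - \lambda_i\boldsymbol{I}$ is read off from a singular value decomposition, and the result is rescaled so that its largest component equals 1, which fixes only the relative values. The matrix, the eigenvalues assumed known, and the helper name `eigenvector_for` are illustrative; the SVD route is a convenience chosen here, not necessarily the method developed later in these notes.

```python
import numpy as np

# Illustrative matrix (assumption); its eigenvalues are the roots
# of lambda^2 - 4 lambda + 3 = 0
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues = [1.0, 3.0]

def eigenvector_for(A, lam):
    """Return a vector spanning the null space of A - lam*I.

    Assumes lam is a simple eigenvalue, so the null space is one-dimensional.
    """
    M = A - lam * np.eye(A.shape[0])
    # The right singular vector belonging to the smallest singular value
    # (the last row of Vh) spans the null space of the singular matrix M
    _, _, Vh = np.linalg.svd(M)
    v = Vh[-1]
    # Only relative values are determined: scale the largest component to 1
    return v / v[np.argmax(np.abs(v))]

for lam in eigenvalues:
    v = eigenvector_for(A, lam)
    print(lam, v, np.allclose(A @ v, lam * v))
```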

# Error Analysis