In [1]:
import sympy
from sympy import Matrix, Rational, sqrt, symbols, zeros, simplify
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import interact, interactive, fixed, interact_manual
%matplotlib notebook


# Linear algebra

## Session 8: 

## Gerhard Jäger

### December 14, 2023

## Determinants

### Sympy

In [6]:
a,b,c,d, e, f, g, h, i = symbols('a b c d e f g h i')
A = Matrix([
    [a, b],
    [c,d]
])
A

Matrix([
[a, b],
[c, d]])

In [7]:
A.det()

a*d - b*c

In [8]:
A = Matrix([
    [a,b,c],
    [d,e,f],
    [g,h,i]
])
A

Matrix([
[a, b, c],
[d, e, f],
[g, h, i]])

In [9]:
l, u, _ = A.LUdecomposition()

In [11]:

sympy.simplify(u)


Matrix([
[a,         b,                                                           c],
[0, e - b*d/a,                                                   f - c*d/a],
[0,         0, (a*e*i - a*f*h - b*d*i + b*f*g + c*d*h - c*e*g)/(a*e - b*d)]])

In [12]:
A

Matrix([
[a, b, c],
[d, e, f],
[g, h, i]])

In [13]:
from math import prod
sympy.simplify(prod(u.diagonal()))

a*e*i - a*f*h - b*d*i + b*f*g + c*d*h - c*e*g

In [14]:
A.det()

a*e*i - a*f*h - b*d*i + b*f*g + c*d*h - c*e*g
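The cells above read the determinant off the diagonal of $U$. As a minimal sketch, the helper below (`det_via_lu` is a name chosen here, not part of sympy) also takes into account the row swaps that `LUdecomposition` reports as its third return value, assuming each reported swap flips the sign once:

```python
from math import prod
from sympy import Matrix

def det_via_lu(M):
    # LUdecomposition returns L, U and a list of row swaps (pairs of row indices).
    _, U, swaps = M.LUdecomposition()
    # Product of the pivots on the diagonal of U; each row swap flips the sign.
    return (-1) ** len(swaps) * prod(U.diagonal())

B = Matrix([[2, 1, 1], [4, 3, 3], [8, 7, 9]])   # no row swap needed
C = Matrix([[0, 2], [3, 4]])                    # zero pivot forces a row swap
print(det_via_lu(B), B.det())
print(det_via_lu(C), C.det())
```

For the symbolic matrix above no swap occurs, so the plain product of the diagonal is already the determinant.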

## Geometric interpretation

The absolute value of the determinant $|A|$ is the $n$-dimensional volume of the parallelepiped spanned by the column vectors of $A$.

Why?

It is fairly easy to see that this interpretation holds for the first two axioms:

- the parallelepiped corresponding to the identity matrix is the $n$-dimensional standard cube with length 1 along each edge
- swapping two columns does not change the parallelepiped



The third axiom is more complex. These pictures give an intuitive explanation in 2 dimensions:

<img src=_img/axiom3.svg>

There is no simple geometric interpretation of the sign of the determinant though.
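The volume interpretation can be checked numerically in 2 dimensions. The following sketch computes the area of the parallelogram spanned by two column vectors as base times height and compares it with the absolute value of the determinant. The matrix and the helper name `parallelogram_area` are chosen here purely for illustration.

```python
import numpy as np

def parallelogram_area(u, v):
    # Base times height: |u| times the length of the part of v orthogonal to u.
    u, v = np.asarray(u, float), np.asarray(v, float)
    base = np.linalg.norm(u)
    v_orth = v - (v @ u) / (u @ u) * u
    return base * np.linalg.norm(v_orth)

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
area = parallelogram_area(A[:, 0], A[:, 1])
print(area, abs(np.linalg.det(A)))
```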

## Determinants and permutations

Consider our standard $2\times 2$ matrix

$$
\begin{bmatrix}
a & b\\
c & d
\end{bmatrix}
$$

According to rule 3:

$$
\begin{aligned}
\begin{vmatrix}
a & b\\
c & d
\end{vmatrix} &=
\begin{vmatrix}
a & b\\
0 & d
\end{vmatrix} +
\begin{vmatrix}
0 & b\\
c & d
\end{vmatrix}\\
&=
\begin{vmatrix}
a & b\\
0 & 0
\end{vmatrix} +
\begin{vmatrix}
a & 0\\
0 & d
\end{vmatrix} +
\begin{vmatrix}
0 & b\\
c & 0
\end{vmatrix}+
\begin{vmatrix}
0 & 0\\
c & d
\end{vmatrix}\\
\end{aligned}
$$


The first and last summand each have a zero row. According to rules 6 and 10, their determinants are 0. So we get

$$
\begin{aligned}
\begin{vmatrix}
a & b\\
c & d
\end{vmatrix} &=
\begin{vmatrix}
a & 0\\
0 & d
\end{vmatrix} +
\begin{vmatrix}
0 & b\\
c & 0
\end{vmatrix}
\end{aligned}
$$


According to rule 3, it follows that

$$
\begin{aligned}
\begin{vmatrix}
a & b\\
c & d
\end{vmatrix} &=
a d\begin{vmatrix}
1 & 0\\
0 & 1
\end{vmatrix} +
b c \begin{vmatrix}
0 & 1\\
1 & 0
\end{vmatrix}
\end{aligned}
$$


Note the pattern:

- each summand is the determinant of a **permutation matrix**, 
- multiplied by the entries of the original matrix corresponding to the non-zero entries of the permutation matrix

(Reminder: a permutation matrix is a square matrix with exactly one 1 in each row and each column, and 0 everywhere else.)

According to rules 2 and 1:

$$
\begin{aligned}
\begin{vmatrix}
a & b\\
c & d
\end{vmatrix} &=
a d\begin{vmatrix}
1 & 0\\
0 & 1
\end{vmatrix} +
b c \begin{vmatrix}
0 & 1\\
1 & 0
\end{vmatrix}\\
&=
a d\begin{vmatrix}
1 & 0\\
0 & 1
\end{vmatrix} -
b c \begin{vmatrix}
1 & 0\\
0 & 1
\end{vmatrix}\\
&= ad - bc
\end{aligned}
$$


The same procedure works for a $3\times 3$ matrix:

$$
\begin{aligned}
\begin{vmatrix}
a & b & c\\
d & e & f\\
g & h & i
\end{vmatrix} 
&= 
\begin{vmatrix}
a &  & \\
 & e & \\
 &  & i
\end{vmatrix} +
\begin{vmatrix}
a &  & \\
 &  & f\\
 & h & 
\end{vmatrix} +
\begin{vmatrix}
 & b & \\
d &  & \\
 &  & i
\end{vmatrix} +
\begin{vmatrix}
 & b & \\
 &  & f\\
g &  & 
\end{vmatrix} +
\begin{vmatrix}
 &  & c\\
d &  & \\
 & h & 
\end{vmatrix} +
\begin{vmatrix}
 &  & c\\
 & e & \\
g &  & 
\end{vmatrix} \\
&=
\begin{vmatrix}
a &  & \\
 & e & \\
 &  & i
\end{vmatrix} -
\begin{vmatrix}
a &  & \\
 & f & \\
 &  & h
\end{vmatrix} -
\begin{vmatrix}
b &  & \\
 & d & \\
 &  & i
\end{vmatrix} -
\begin{vmatrix}
 b& & \\
 &  & f\\
 &g  & 
\end{vmatrix} -
\begin{vmatrix}
 & c & \\
d &  & \\
 &  & h
\end{vmatrix} -
\begin{vmatrix}
 &  c& \\
 & & e\\
g &  & 
\end{vmatrix} \\
&=
\begin{vmatrix}
a &  & \\
 & e & \\
 &  & i
\end{vmatrix} -
\begin{vmatrix}
a &  & \\
 & f & \\
 &  & h
\end{vmatrix} -
\begin{vmatrix}
b &  & \\
 & d & \\
 &  & i
\end{vmatrix} +
\begin{vmatrix}
 b& & \\
 & f & \\
 & & g
\end{vmatrix} +
\begin{vmatrix}
c &  & \\
 & d & \\
 &  & h
\end{vmatrix} +
\begin{vmatrix}
 c&  & \\
 & & e\\
& g & 
\end{vmatrix} \\
&=
\begin{vmatrix}
a &  & \\
 & e & \\
 &  & i
\end{vmatrix} -
\begin{vmatrix}
a &  & \\
 & f & \\
 &  & h
\end{vmatrix} -
\begin{vmatrix}
b &  & \\
 & d & \\
 &  & i
\end{vmatrix} +
\begin{vmatrix}
 b& & \\
 & f & \\
 & & g
\end{vmatrix} +
\begin{vmatrix}
c &  & \\
 & d & \\
 &  & h
\end{vmatrix} -
\begin{vmatrix}
 c&  & \\
 &e & \\
&  & g
\end{vmatrix} \\
&= aei + bfg + cdh - ceg - bdi - afh
\end{aligned}
$$



Let $\pi$ be a *permutation* of $1,\ldots, n$. This means $\pi$ is a **bijection** from $\{1,\ldots,n\}$ onto itself.

Each component of the formula above has the form

$$
\pm \prod_i a_{i,\pi(i)}
$$

for some permutation $\pi$.

We distinguish *even* and *odd* permutations:

**Definition**

*A permutation $\pi$ is ***even*** if and only if
$$
\|\{\langle i,j\rangle|i < j \wedge \pi(i) > \pi(j)\}\|
$$
is even. Otherwise it is ***odd***.*

The number in the definition counts the *inversions* of $\pi$. It equals the number of swaps of adjacent columns we have to perform to convert the corresponding permutation matrix to the identity matrix.
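As a small check of this definition, the sketch below counts the pairs $i<j$ with $\pi(i)>\pi(j)$ and compares the parity of that count with the determinant of the corresponding permutation matrix. The helper names `inversions` and `permutation_matrix` are made up for this example, and indices are 0-based as usual in Python.

```python
from itertools import permutations
from sympy import zeros

def inversions(pi):
    # Number of pairs i < j with pi[i] > pi[j].
    n = len(pi)
    return sum(1 for i in range(n) for j in range(i + 1, n) if pi[i] > pi[j])

def permutation_matrix(pi):
    # Row i has its single 1 in column pi[i].
    P = zeros(len(pi), len(pi))
    for i, col in enumerate(pi):
        P[i, col] = 1
    return P

for pi in permutations(range(3)):
    # An even count goes with determinant 1, an odd count with determinant -1.
    print(pi, inversions(pi), permutation_matrix(pi).det())
```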



**Definition**

*Let $\pi$ be a permutation.*

$$
\mathrm{sign}(\pi) \doteq \left\{\begin{aligned}1 & \mbox{ if }\pi\mbox{ is even}\\-1&\mbox{ else}\end{aligned}\right.
$$



This leads to the **Leibniz formula**:

$$
|A | = \sum_{\pi: \pi \mathrm{~a~permutation~over~}\{1,\ldots,n\}}\mathrm{sign}(\pi)\prod_ia_{i,\pi(i)}
$$

With $n!$ summands, this formula is much too unwieldy for actual computations, but it is useful for proving properties of the determinant.
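To see the formula at work, here is a direct (and deliberately naive) transcription into Python. The function name `leibniz_det` is an assumption of this sketch. The sign of each permutation is obtained from its inversion count, and the result is compared with sympy's own `det()`.

```python
from itertools import permutations
from math import prod
from sympy import Matrix, symbols, expand

def leibniz_det(M):
    n = M.rows
    total = 0
    for pi in permutations(range(n)):
        # Inversion count of pi determines the sign of the summand.
        inv = sum(pi[r] > pi[s] for r in range(n) for s in range(r + 1, n))
        total += (-1) ** inv * prod(M[r, pi[r]] for r in range(n))
    return total

a, b, c, d, e, f, g, h, i = symbols('a b c d e f g h i')
A = Matrix([[a, b, c], [d, e, f], [g, h, i]])
# The difference to sympy's determinant should expand to zero.
print(expand(leibniz_det(A) - A.det()))
```

The loop visits all $n!$ permutations, so this is only practical for very small matrices.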

## Cofactors

If we expand the Leibniz formula for $n=3$, we get

$$
\begin{aligned}
|A| &= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32}\\
&\quad -a_{21}a_{12}a_{33} + a_{21}a_{32}a_{13}\\
&\quad +a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13}\\
&= a_{11}(a_{22}a_{33} - a_{23}a_{32})\\
&\quad -a_{21}(a_{12}a_{33} - a_{32}a_{13})\\
&\quad +a_{31}(a_{12}a_{23} - a_{22}a_{13})
\end{aligned}
$$

Note that in the second equation, we have three products. Each consists of 
- the sign $(-1)^{1+i}$,
- an entry $a_{i1}$ from the first column, and
- the determinant of the matrix that results when we remove the first column and the $i$th row from $A$.

This generalizes to matrices of arbitrary size, and to arbitrary columns.

Let $M_{ij}$ be the matrix that results if we remove the $i$th row and the $j$th column from $A$. Then, for any fixed column $j$:

$$
|A| = \sum_i (-1)^{i+j} a_{ij} |M_{ij}|
$$

For brevity's sake, we define

$$
C_{ij} \doteq (-1)^{i+j} |M_{ij}|
$$

These quantities are called **cofactors**.

Then the above formula simplifies to the **Laplace expansion**

$$
|A| = \sum_i a_{ij}C_{ij}
$$
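The claim that the expansion works along any column can be tried out with sympy, whose `Matrix.cofactor(i, j)` method returns $C_{ij}$ (with 0-based indices). The example matrix and the helper name `laplace_along_column` are chosen here for illustration only.

```python
from sympy import Matrix

A = Matrix([[2, 0, 1],
            [1, 3, 2],
            [1, 1, 4]])

def laplace_along_column(M, j):
    # Sum of a_ij * C_ij over all rows i, for a fixed column j.
    return sum(M[i, j] * M.cofactor(i, j) for i in range(M.rows))

# Every column should give the same value as A.det().
print([laplace_along_column(A, j) for j in range(A.cols)], A.det())
```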

Note that all matrices $M_{ij}$ have size $(n-1)\times (n-1)$.

This leads to a **recursive definition** of the determinant:

- *If $n=1$, $|A| = a_{11}$.*

- *If $n>1$, $|A| = \sum_i (-1)^{i+1}a_{i1}|M_{i1}|$.*


Applying this definition to actual computations is not advisable, because it amounts to an application of the Leibniz formula, i.e., it is computationally costly.
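For illustration only, the recursive definition can be written down almost literally. The sketch below assumes sympy's `minor_submatrix(i, j)`, which removes row `i` and column `j`. Because Python indices start at 0, the sign $(-1)^{i+1}$ becomes `(-1) ** i`. The name `det_recursive` is made up for this example.

```python
from sympy import Matrix

def det_recursive(M):
    n = M.rows
    if n == 1:
        return M[0, 0]
    total = 0
    for i in range(n):
        minor = M.minor_submatrix(i, 0)   # remove row i and the first column
        total += (-1) ** i * M[i, 0] * det_recursive(minor)
    return total

A = Matrix([[1, 2, 0, 1],
            [0, 1, 3, 2],
            [2, 0, 1, 1],
            [1, 1, 1, 0]])
print(det_recursive(A), A.det())
```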

## Matrix powers and exponentials

So far we have studied basic operations such as addition and multiplication of matrices and vectors.

Higher operations are also defined, at least for square matrices:

**power**
- $A^0 = \mathbf I$
- $A^{n+1} = A A^n$

**exponential**
- $e^A = \sum_{k=0}^\infty \frac{1}{k!} A^k$

To compute them efficiently (and for many other applications), 
we need **eigenvectors** and **eigenvalues** of square matrices.
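Before turning to eigenvectors, here is a naive sketch that follows the two definitions literally. The function names and the cut-off `terms=30` for the infinite series are assumptions of this example, not an efficient or numerically robust method.

```python
import numpy as np

def matrix_power(A, n):
    # A^0 = I, A^(n+1) = A A^n
    result = np.eye(A.shape[0])
    for _ in range(n):
        result = A @ result
    return result

def matrix_exp(A, terms=30):
    # Partial sum of sum_k A^k / k!, truncated after `terms` summands.
    result = np.zeros(A.shape, dtype=float)
    term = np.eye(A.shape[0])          # A^0 / 0!
    for k in range(terms):
        result += term
        term = term @ A / (k + 1)      # A^(k+1) / (k+1)!
    return result

N = np.array([[0.0, 1.0], [0.0, 0.0]])   # N @ N = 0, so e^N = I + N
D = np.diag([1.0, 2.0])                   # e^D = diag(e, e^2)
print(matrix_exp(N))
print(matrix_exp(D))
```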

## Example

In [17]:
A = Matrix([
    [0.8, 0.3],
    [0.2, 0.7]
])
A

Matrix([
[0.8, 0.3],
[0.2, 0.7]])

In [18]:
A**2

Matrix([
[0.7, 0.45],
[0.3, 0.55]])

In [19]:
A**3

Matrix([
[0.65, 0.525],
[0.35, 0.475]])

In [20]:
A**10

Matrix([
[0.600390625, 0.5994140625],
[0.399609375, 0.4005859375]])

In [21]:
A**100

Matrix([
[0.600000000000002, 0.600000000000002],
[0.400000000000001, 0.400000000000001]])

In [22]:
A**1000000

Matrix([
[0.6, 0.6],
[0.4, 0.4]])

## Eigenvectors and eigenvalues

### Basic equation

$$
A\mathbf x = \lambda \mathbf x
$$

- a non-zero vector $\mathbf x$ satisfying this equation is called an **eigenvector** of $A$
- the corresponding scalar $\lambda$ is called an **eigenvalue** of $A$

In [69]:
A = np.array([
    [1, 1],
    [-1, 1]
])

In [70]:

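def g(A, theta):
    # Unit vector at angle theta and its image under A,
    # each sampled as 100 points on the segment starting at the origin.
    a, c = np.cos(theta), np.sin(theta)
    x = np.linspace(0, a, 100)
    y = np.linspace(0, c, 100)
    b, d = A @ np.array([a,c])
    z = np.linspace(0, b, 100)
    w = np.linspace(0, d, 100)
    return x, y, z, w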

fig, ax = plt.subplots(figsize=(6,6))
xmin, xmax, ymin, ymax = -2, 2, -2, 2
ax.set(xlim=(xmin-1, xmax+1), ylim=(ymin-1, ymax+1), aspect='equal')
ax.spines['bottom'].set_position('zero')
ax.spines['left'].set_position('zero')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

ax.set_xlabel('x', size=14, labelpad=-24, x=1.03)
ax.set_ylabel('y', size=14, labelpad=-21, y=1.02, rotation=0)
arrow_fmt = dict(markersize=4, color='black', clip_on=False)
ax.plot((1), (0), marker='>', transform=ax.get_yaxis_transform(), **arrow_fmt)
ax.plot((0), (1), marker='^', transform=ax.get_xaxis_transform(), **arrow_fmt)

x, y, z, w = g(A, 0)
line1, = ax.plot(x,y, color='red')
line2, = ax.plot(z,w, color='green')

angles = np.linspace(0, 2*np.pi, 100)
crc = np.array([np.cos(angles), np.sin(angles)])
elps = A @ crc
ax.plot(crc[0,:], crc[1,:], color='red')
ax.plot(elps[0,:], elps[1,:], color='green')

def update(theta = 0):
    x, y, z, w = g(A,theta)
    print("x = "+str(x[-1])+","+str(y[-1]))
    print("y = "+str(z[-1])+","+str(w[-1]))
    line1.set_data(x, y)
    line2.set_data(z, w)
    fig.canvas.draw_idle()
    
interact(update, theta = (0, 2*np.pi, 0.01));


### Example: How to find eigenvalues and eigenvectors

$$
\begin{aligned}
A &= \left[\begin{matrix}0.8 & 0.3\\0.2 & 0.7\end{matrix}\right]\\
A\mathbf x &= \lambda \mathbf x\\
&= \lambda \mathbf I \mathbf x\\
(A-\lambda \mathbf I)\mathbf x &= \mathbf 0 
\end{aligned}
$$

The important matrix now is

$$
A-\lambda \mathbf I = \begin{bmatrix}
0.8-\lambda & 0.3\\
0.2 & 0.7-\lambda
\end{bmatrix}
$$

We are looking for a value of $\lambda$ such that
$$
(A-\lambda \mathbf I)\mathbf x = \mathbf 0 
$$

with $\mathbf x \neq \mathbf 0$ (otherwise the equation would be trivial).

Since a non-zero vector is mapped to $\mathbf 0$, the matrix $(A-\lambda \mathbf I)$ is not invertible. Hence:

$$
|(A-\lambda \mathbf I)| = 0
$$

Using the formula for the determinant of a $2\times 2$ matrix:

$$
(0.8-\lambda)(0.7-\lambda) - 0.2\times 0.3 = 0
$$


Simplifying:

$$
\begin{aligned}
(0.8-\lambda)(0.7-\lambda) - 0.2\times 0.3 &= 0\\
\lambda ^2 - 1.5\lambda + 0.56 - 0.06 &= 0\\
\lambda^2 - 1.5\lambda + 0.5 &= 0
\end{aligned}
$$

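The solutions of a quadratic equation $\lambda^2 + p\lambda + q = 0$ are $\lambda_{1/2} = -\frac{p}{2} \pm \sqrt{\left(\frac{p}{2}\right)^2 - q}$ (https://en.wikipedia.org/wiki/Quadratic_equation). With $p = -\frac{3}{2}$ and $q = \frac{1}{2}$: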

$$
\begin{aligned}
\lambda_{1/2} &= \frac{3}{4} \pm \sqrt{\left(\frac{3}{4}\right)^2-\frac{1}{2}}\\
&= \frac{3}{4} \pm \sqrt{\frac{9-8}{16}}\\
&= \frac{3}{4} \pm \sqrt{\frac{1}{16}}\\
&= \frac{3}{4} \pm \frac{1}{4}\\
\lambda_1 &= 1\\
\lambda_2 &= 0.5
\end{aligned}
$$
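The same roots can be obtained numerically. For a $2\times 2$ matrix, $|A-\lambda\mathbf I| = \lambda^2 - \mathrm{tr}(A)\lambda + |A|$, so the sketch below passes these coefficients to `np.roots`. Using the trace here is a shortcut of this example, not a step taken in the derivation above.

```python
import numpy as np

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
# Coefficients of lambda^2 - trace(A) * lambda + det(A), highest power first.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
eigenvalues = np.roots(coeffs)
print(coeffs, eigenvalues)
```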

$\lambda_1$ and $\lambda_2$ are the eigenvalues of $A$. Now let's find the corresponding eigenvectors.

This amounts to finding the nullspace of $A-\lambda\mathbf I$:

- $\lambda_1$

$$
\begin{aligned}
(A - \lambda_1\mathbf I)\mathbf x_1 &= \mathbf 0\\
\begin{bmatrix}
-0.2 & 0.3\\
0.2 & -0.3
\end{bmatrix}\mathbf x_1 &= \mathbf 0\\
\begin{bmatrix}
-0.2 & 0.3\\
0 & 0
\end{bmatrix}\mathbf x_1 &= \mathbf 0\\
\begin{bmatrix}
1 & -1.5\\
0 & 0
\end{bmatrix}\mathbf x_1 &= \mathbf 0\\
\mathbf x_1 &= \begin{bmatrix}1.5\\1\end{bmatrix}
\end{aligned}
$$




- $\lambda_2$

$$
\begin{aligned}
(A - \lambda_2\mathbf I)\mathbf x_2 &= \mathbf 0\\
\begin{bmatrix}
0.3 & 0.3\\
0.2 & 0.2
\end{bmatrix}\mathbf x_2 &= \mathbf 0\\
\begin{bmatrix}
0.3 & 0.3\\
0 & 0
\end{bmatrix}\mathbf x_2 &= \mathbf 0\\
\begin{bmatrix}
1 & 1\\
0 & 0
\end{bmatrix}\mathbf x_2 &= \mathbf 0\\
\mathbf x_2 &= \begin{bmatrix}-1\\1\end{bmatrix}\\
\end{aligned}
$$

$\mathbf x_1$ is the eigenvector *corresponding to* $\lambda_1$.

$\mathbf x_2$ is the eigenvector *corresponding to* $\lambda_2$.

Any non-zero multiples of $\mathbf x_1, \mathbf x_2$ are also eigenvectors. It is common practice to use normalized eigenvectors, i.e., eigenvectors with length 1.

$$
\begin{aligned}
\mathbf v_1 &= \frac{\mathbf x_1}{||\mathbf x_1||}\\
&= \frac{1}{\sqrt{13}}\begin{bmatrix}3\\2\end{bmatrix}\\
\mathbf v_2 &= \frac{\mathbf x_2}{||\mathbf x_2||}\\
&= \frac{1}{\sqrt{2}}\begin{bmatrix}-1\\1\end{bmatrix}\\
\end{aligned}
$$
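The hand-computed eigenvectors can be verified directly against the basic equation $A\mathbf x = \lambda\mathbf x$. This is a minimal numerical check with numpy, and the variable names simply mirror the symbols used above.

```python
import numpy as np

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
x1, lambda1 = np.array([1.5, 1.0]), 1.0
x2, lambda2 = np.array([-1.0, 1.0]), 0.5

# Both comparisons should hold up to floating point error.
print(np.allclose(A @ x1, lambda1 * x1), np.allclose(A @ x2, lambda2 * x2))

# Normalize to length 1.
v1 = x1 / np.linalg.norm(x1)
v2 = x2 / np.linalg.norm(x2)
print(v1, v2)
```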

In [52]:
A = Matrix([
    [Rational(4,5), Rational(3,10)],
    [Rational(1,5), Rational(7,10)]
])
A

Matrix([
[4/5, 3/10],
[1/5, 7/10]])

In [56]:
e1, e2 = A.eigenvects()

In [57]:
lambda1, _, v1 = e1

In [58]:
lambda1

1/2

In [59]:
v1[0].normalized()

Matrix([
[-sqrt(2)/2],
[ sqrt(2)/2]])

In [60]:
lambda2, _, v2 = e2

In [61]:
lambda2

1

In [62]:
v2[0].normalized()

Matrix([
[3*sqrt(13)/13],
[2*sqrt(13)/13]])

In [63]:
A = np.array([
    [0.8, 0.3],
    [0.2, 0.7]
])
A

array([[0.8, 0.3],
       [0.2, 0.7]])

In [64]:
np.linalg.eig(A)

(array([1. , 0.5]),
 array([[ 0.83205029, -0.70710678],
        [ 0.5547002 ,  0.70710678]]))
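The eigenvalues and eigenvectors also explain the limit of $A^n$ seen in the earlier example. The sketch below relies on the identity $A^n = V \Lambda^n V^{-1}$, where the columns of $V$ are the eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues. This identity is assumed here and not derived in this section. Since $0.5^n \to 0$, only the eigenvalue 1 survives for large $n$.

```python
import numpy as np

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])
eigenvalues, V = np.linalg.eig(A)     # columns of V are eigenvectors

def power_via_eig(n):
    # A^n = V diag(lambda_i^n) V^{-1}; V is invertible for this matrix.
    return V @ np.diag(eigenvalues ** n) @ np.linalg.inv(V)

print(power_via_eig(10))
print(power_via_eig(100))
```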



### Procedure to find eigenvalues and eigenvectors:

1. Construct $A-\lambda \mathbf I$ with $\lambda$ as unknown.
2. Set $|A-\lambda \mathbf I| = 0$ and solve for $\lambda$. All solutions are eigenvalues.
3. For each solution for $\lambda$, find the nullspace of $A-\lambda \mathbf I$. Each non-zero vector in the nullspace is an eigenvector corresponding to this solution.
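The three steps can be sketched in sympy as follows. The function name `eigen_by_hand` and the symbol name `lambda` are choices made for this example. In practice `Matrix.eigenvects()` does the same job.

```python
from sympy import Matrix, Rational, eye, solve, symbols

def eigen_by_hand(M):
    lam = symbols('lambda')
    shifted = M - lam * eye(M.rows)             # step 1: A - lambda I
    eigenvalues = solve(shifted.det(), lam)     # step 2: |A - lambda I| = 0
    # step 3: a basis of the nullspace of A - lambda I for each eigenvalue
    return {ev: (M - ev * eye(M.rows)).nullspace() for ev in eigenvalues}

A = Matrix([[Rational(4, 5), Rational(3, 10)],
            [Rational(1, 5), Rational(7, 10)]])
result = eigen_by_hand(A)
for ev, basis in result.items():
    print(ev, [list(v) for v in basis])
```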

##### Example 2: Projection matrix

$$
A = \begin{bmatrix}
\frac{1}{2} & \frac{1}{2}\\
\frac{1}{2} & \frac{1}{2}
\end{bmatrix}
$$

- find eigenvalues

$$
\begin{aligned}
\begin{vmatrix}
\frac{1}{2}-\lambda & \frac{1}{2}\\
\frac{1}{2} & \frac{1}{2}-\lambda
\end{vmatrix} &= 0\\
(\frac{1}{2}-\lambda)^2 - \frac{1}{4} &= 0\\
\lambda^2 -\lambda &= 0\\
\lambda(\lambda-1) &= 0\\
\lambda_1 &= 1\\
\lambda_2 &= 0
\end{aligned}
$$

- find eigenvectors

    - $\lambda_1$: 
    $$
    \begin{aligned}
    \begin{bmatrix}
    -\frac{1}{2} & \frac{1}{2}\\
    \frac{1}{2} & -\frac{1}{2}
    \end{bmatrix}\mathbf x_1 &= \mathbf 0\\
    \mathbf x_1 &= \begin{bmatrix}1\\1\end{bmatrix}
    \end{aligned}
    $$
    
    - $\lambda_2$:
    $$
    \begin{aligned}
    \begin{bmatrix}
    \frac{1}{2} & \frac{1}{2}\\
    \frac{1}{2} & \frac{1}{2}
    \end{bmatrix}\mathbf x_2 &= \mathbf 0\\
    \mathbf x_2 &= \begin{bmatrix}-1\\1\end{bmatrix}
    \end{aligned}
    $$

##### Example 3: Reflection matrix

$$
A = \begin{bmatrix}
0 & 1\\
1 & 0
\end{bmatrix}
$$

- find eigenvalues

$$
\begin{aligned}
\begin{vmatrix}
-\lambda & 1\\
1 & -\lambda
\end{vmatrix} &= 0\\
\lambda^2-1 &= 0\\
\lambda^2 &= 1\\
\lambda_1 &= 1\\
\lambda_2 &= -1
\end{aligned}
$$

- find eigenvectors

    - $\lambda_1$: 
    $$
    \begin{aligned}
    \begin{bmatrix}
    -1 & 1\\
    1 & -1
    \end{bmatrix}\mathbf x_1 &= \mathbf 0\\
    \mathbf x_1 &= \begin{bmatrix}1\\1\end{bmatrix}
    \end{aligned}
    $$
    
    - $\lambda_2$:
    $$
    \begin{aligned}
    \begin{bmatrix}
    1 & 1\\
    1 & 1
    \end{bmatrix}\mathbf x_2 &= \mathbf 0\\
    \mathbf x_2 &= \begin{bmatrix}-1\\1\end{bmatrix}
    \end{aligned}
    $$
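Both examples can be checked numerically. The sketch below applies the projection matrix from Example 2 and the reflection matrix from Example 3 to the eigenvectors found above, and also confirms the defining properties $P^2 = P$ and $R^2 = \mathbf I$. The names `P` and `R` are introduced here for readability.

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.5, 0.5]])   # projection matrix from Example 2
R = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # reflection matrix from Example 3
x1 = np.array([1.0, 1.0])
x2 = np.array([-1.0, 1.0])

print(P @ x1, P @ x2)   # compare with 1 * x1 and 0 * x2
print(R @ x1, R @ x2)   # compare with 1 * x1 and -1 * x2
print(np.allclose(P @ P, P), np.allclose(R @ R, np.eye(2)))
```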