# Lab 5: Eigenvalues, eigenvectors, and linear multivariate solutions

How do we use Python to find eigenvalues and eigenvectors? 

Let's say we have a simple 2x2 matrix

In [None]:
from sympy import *
var('a,b,c,d')
M = Matrix([[a,b],
            [c,d]])
M

To find both the eigenvalues and eigenvectors, we can use SymPy's built-in `eigenvects` method

In [None]:
es = M.eigenvects()
es

This is a list of length 2, where the first item is

In [None]:
es0 = es[0]
es0

which is a "tuple" (essentially a list) of length 3. The first item is an eigenvalue

In [None]:
es0[0]

the second item is the "multiplicity" of that eigenvalue (an eigenvalue can be a repeated root of the characteristic polynomial, and the multiplicity is how many times it is repeated)

In [None]:
es0[1]

and the third item is a list of right eigenvectors (if the multiplicity is >1 then there can be more than one right eigenvector)

In [None]:
es0[2]

In this case we can extract the only right eigenvector as the first item in that list

In [None]:
es0[2][0]
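One way to sanity-check an extracted pair is to confirm that it satisfies the defining equation $\mathbf{M}\vec{v} = \lambda\vec{v}$. The sketch below does this for the first tuple. The numerical values substituted for `a`, `b`, `c`, and `d` are arbitrary assumptions, chosen only so the square root in the eigenvalue evaluates exactly.

```python
from sympy import Matrix, symbols, simplify, zeros

a, b, c, d = symbols('a b c d')
M = Matrix([[a, b],
            [c, d]])

# Unpack the first (eigenvalue, multiplicity, eigenvectors) tuple
eigenvalue, multiplicity, eigenvectors = M.eigenvects()[0]
v = eigenvectors[0]

# Arbitrary sample values (an assumption for illustration, not part of the lab)
sample = {a: 1, b: 2, c: 3, d: 2}

# A right eigenvector satisfies M*v = eigenvalue*v, so this residual should vanish
residual = (M * v - eigenvalue * v).subs(sample).applyfunc(simplify)
print(residual == zeros(2, 1))
```

The same check can be repeated for the second tuple by changing the index from `0` to `1`.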

The second item in the `eigenvects` list is another tuple, giving the second eigenvalue, its multiplicity, and the associated right eigenvector

In [None]:
es[1]
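Both eigenvalues of the general 2x2 matrix have multiplicity 1. To see what a multiplicity greater than 1 looks like, here is a small sketch using two example matrices that are not part of this lab: the identity matrix and a shear matrix. Both have the eigenvalue 1 repeated twice, but they differ in how many independent right eigenvectors they have.

```python
from sympy import Matrix, eye

# Identity matrix: eigenvalue 1 with multiplicity 2 and two independent eigenvectors
identity_es = eye(2).eigenvects()
print(identity_es)

# Shear matrix: eigenvalue 1 with multiplicity 2 but only one independent eigenvector
S = Matrix([[1, 1],
            [0, 1]])
shear_es = S.eigenvects()
print(shear_es)
```

This is why the third item of each tuple is a list: its length can be anywhere from 1 up to the multiplicity.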

We can also get the matrix of right eigenvectors ($\mathbf{A}$, one eigenvector per column) and the diagonal matrix of eigenvalues ($\mathbf{D}$) directly

In [None]:
A, D = M.diagonalize()
A, D
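These two matrices satisfy $\mathbf{M} = \mathbf{A}\mathbf{D}\mathbf{A}^{-1}$, which in turn gives $\mathbf{M}^t = \mathbf{A}\mathbf{D}^t\mathbf{A}^{-1}$. The sketch below checks both identities on a small numerical matrix. The matrix `M2` and the choice `t = 5` are assumptions made for illustration, with exact fractions used so the comparison is not affected by rounding.

```python
from sympy import Matrix, Rational

# Hypothetical 2x2 example (not from the lab), written with exact fractions
M2 = Matrix([[Rational(9, 10), Rational(2, 10)],
             [Rational(1, 10), Rational(8, 10)]])

A, D = M2.diagonalize()

# Recombining the eigenvector and eigenvalue matrices recovers the original matrix
print(A * D * A.inv() == M2)

# Because D is diagonal, D**t simply raises each eigenvalue to the power t
t = 5
print(A * D**t * A.inv() == M2**t)
```

Raising a diagonal matrix to a power is much cheaper than multiplying the full matrix by itself many times, which is what makes this form convenient when $t$ is large.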

## Question

Let's put this into practice by analyzing what is known as "Kimura's two-parameter model of mutation". Here we model how the numbers of adenines ($n_A$), guanines ($n_G$), cytosines ($n_C$), and thymines ($n_T$) in a sequence of DNA change over generations. We assume that each *transition* (from A to G, G to A, C to T, or T to C) happens with probability $\alpha$ per generation, while each *transversion* (any of the remaining mutations, which are generally less likely because they alter the ring structure of the DNA) happens with probability $\beta$ per generation. We assume no force other than mutation is acting (this is a "neutral" model, where there is no selection). The system of linear recursion equations is, in matrix form,

$$
\begin{pmatrix}
n_A(t+1) \\ n_G(t+1) \\ n_C(t+1) \\ n_T(t+1)
\end{pmatrix}
= 
\begin{pmatrix}
1-\alpha-2\beta & \alpha & \beta & \beta \\
\alpha & 1-\alpha-2\beta & \beta & \beta \\
\beta & \beta & 1-\alpha-2\beta & \alpha \\
\beta & \beta & \alpha & 1-\alpha-2\beta \\
\end{pmatrix}
\begin{pmatrix}
n_A(t) \\ n_G(t) \\ n_C(t) \\ n_T(t)
\end{pmatrix}
$$

Use the general solution, $\vec{n}(t) = \mathbf{A} \mathbf{D}^t \mathbf{A}^{-1} \vec{n}(0)$, to calculate the fraction of each nucleotide after 100 million generations, assuming $\alpha=10^{-7}$, $\beta=10^{-8}$, and $\vec{n}(0) = \begin{pmatrix} 0.1 \\ 0.25 \\ 0.25 \\ 0.4 \end{pmatrix}$. (Note that we switched from the *number* of each nucleotide to the *fraction* of each nucleotide, which is fine since the total number of nucleotides remains constant, meaning we can convert between numbers and fractions without loss of information.) Explain the correspondence of your answer with the right eigenvector associated with the leading eigenvalue.