**This notebook drafts and implements a state space model (SSM) in Python, step by step.**

Resources: https://arxiv.org/abs/2111.00396

An **SSM** is built on three variables that depend on time $t$:

- $x(t) \in \mathbb{C}^n$ represents the $n$ state variables,  
- $u(t) \in \mathbb{C}^m$ represents the $m$ inputs,  
- $y(t) \in \mathbb{C}^p$ represents the $p$ outputs.

It is parameterized by four learnable matrices: **A**, **B**, **C**, and **D**.

- $\mathbf{A} \in \mathbb{C}^{n \times n}$ is the state matrix (controlling the latent state $\mathbf{x}$),  
- $\mathbf{B} \in \mathbb{C}^{n \times m}$ is the control matrix,  
- $\mathbf{C} \in \mathbb{C}^{p \times n}$ is the output matrix,  
- $\mathbf{D} \in \mathbb{C}^{p \times m}$ is the feedthrough matrix (sometimes called the command matrix).  

Together, these definitions give the following system of equations:

$$
\begin{aligned}
x'(t) &= \mathbf{A}x(t) + \mathbf{B}u(t) \\
y(t) &= \mathbf{C}x(t) + \mathbf{D}u(t)
\end{aligned}
$$
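To make the shapes concrete, here is a minimal NumPy sketch that evaluates both equations at a single instant. The helper name `ssm_rhs`, the sizes `n, m, p = 4, 1, 1`, and the random real-valued matrices are assumptions made for illustration only (the definitions above allow complex entries).

```python
import numpy as np

def ssm_rhs(A, B, C, D, x, u):
    """Evaluate x'(t) = A x(t) + B u(t) and y(t) = C x(t) + D u(t) at one instant."""
    dx = A @ x + B @ u
    y = C @ x + D @ u
    return dx, y

# Hypothetical sizes and random real matrices, chosen only for illustration.
n, m, p = 4, 1, 1
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))  # state matrix, n x n
B = rng.standard_normal((n, m))  # control matrix, n x m
C = rng.standard_normal((p, n))  # output matrix, p x n
D = rng.standard_normal((p, m))  # feedthrough matrix, p x m

x = np.zeros(n)  # state at time t
u = np.ones(m)   # input at time t
dx, y = ssm_rhs(A, B, C, D, x, u)
print(dx.shape, y.shape)  # (4,) (1,)
```

With a zero state, the derivative reduces to $\mathbf{B}u(t)$ and the output to $\mathbf{D}u(t)$, which is a quick sanity check on the matrix shapes.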



$\mathbf{D}$ is a direct, instantaneous connection from the input $u(t)$ to the output $y(t)$. Because the term $\mathbf{D}u(t)$ acts as a skip connection and is easy to compute separately, the S4 paper omits it (equivalently, sets $\mathbf{D} = 0$), which leaves:

$$
\begin{aligned}
x'(t) &= \mathbf{A}x(t) + \mathbf{B}u(t) \\
y(t) &= \mathbf{C}x(t) 
\end{aligned}
$$

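The main challenge in implementing this model is discretization: the equations above are continuous in time, while the inputs we feed the model are discrete sequences.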
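One way to discretize, and the one used in the S4 paper, is the bilinear transform with step size $\Delta$: $\bar{\mathbf{A}} = (\mathbf{I} - \tfrac{\Delta}{2}\mathbf{A})^{-1}(\mathbf{I} + \tfrac{\Delta}{2}\mathbf{A})$, $\bar{\mathbf{B}} = (\mathbf{I} - \tfrac{\Delta}{2}\mathbf{A})^{-1}\Delta\mathbf{B}$, $\bar{\mathbf{C}} = \mathbf{C}$, giving the recurrence $x_k = \bar{\mathbf{A}}x_{k-1} + \bar{\mathbf{B}}u_k$, $y_k = \bar{\mathbf{C}}x_k$. The sketch below is a rough draft under those assumptions; the function names `discretize_bilinear` and `run_recurrence`, the step size, and the scalar test system are hypothetical choices for illustration, not a final implementation.

```python
import numpy as np

def discretize_bilinear(A, B, C, step):
    """Bilinear (Tustin) discretization of x' = A x + B u, y = C x."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - (step / 2.0) * A)
    A_bar = inv @ (I + (step / 2.0) * A)
    B_bar = inv @ (step * B)
    return A_bar, B_bar, C  # C is unchanged by the discretization

def run_recurrence(A_bar, B_bar, C_bar, u):
    """Unroll x_k = A_bar x_{k-1} + B_bar u_k, y_k = C_bar x_k from x = 0.

    u has shape (L, m); the result has shape (L, p).
    """
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar @ u_k
        ys.append(C_bar @ x)
    return np.array(ys)

# Hypothetical scalar system x' = -x + u, y = x, driven by a constant input of 1.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
u = np.ones((200, 1))

y = run_recurrence(*discretize_bilinear(A, B, C, step=0.1), u)
print(y[-1])  # approaches 1, the steady state of the continuous system
```

The scalar example is convenient because the continuous system settles at $x = 1$ for a constant input of 1, so the discrete recurrence can be checked against a known answer.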

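Also, following Albert Gu et al.'s HiPPO paper, $\mathbf{A}$ should be initialized with the HiPPO matrix rather than a random matrix; the S4 paper reports that this substantially improves accuracy on long sequences.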

https://arxiv.org/abs/2008.07669
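
As a starting point, here is a minimal sketch of the HiPPO matrix in the form given in the S4 paper: $\mathbf{A}_{nk} = -\sqrt{(2n+1)(2k+1)}$ for $n > k$, $-(n+1)$ for $n = k$, and $0$ for $n < k$. The function name `make_hippo` is a hypothetical choice for this notebook, and the construction assumes that specific (LegS) variant.

```python
import numpy as np

def make_hippo(n):
    """Build the n x n HiPPO matrix in the form stated in the S4 paper."""
    p = np.sqrt(1 + 2 * np.arange(n))       # sqrt(2k + 1) for k = 0..n-1
    outer = np.tril(p[:, None] * p[None, :])  # keep entries with n >= k
    # The diagonal of `outer` is 2n + 1; subtracting n leaves n + 1.
    A = outer - np.diag(np.arange(n))
    return -A

A_hippo = make_hippo(3)
print(A_hippo)
```

The result is lower triangular with a negative diagonal $-1, -2, \dots, -n$, so it can be dropped in wherever a random $\mathbf{A}$ would otherwise be used.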