GitHub - ckerce/hippo-s4-mamba-operator-dynamics: Investigating alternative methods for defining operator dynamics in s4/mamba; examining the effect on trainability.

Operator evolution in HiPPO / S4 / Mamba

There appears to be a discrepancy between how operators are defined in the papers (HiPPO, S4, and Mamba) and how they are implemented in the code. The code uses elementwise exponentiation, while the papers use formulas for the fundamental matrix solution (see, for example, (1)). Most methods for computing the fundamental matrix solution $exp(A)$ are undesirably expensive to differentiate, in both computation and memory; however, such implementations have previously been demonstrated in the numerical weather modeling community (see (2) and references therein).
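To make the distinction concrete, the sketch below compares elementwise exponentiation with a reference matrix exponential. It is an illustration, not code from this repository: `expm_taylor` is a hypothetical helper (scaled Taylor series plus repeated squaring), and the example matrices are arbitrary. It assumes the diagonal case stores $A$ as a vector of diagonal entries, in which case the two computations agree; for a dense $A$ they do not.

```python
import numpy as np

def expm_taylor(A, terms=20, squarings=8):
    """Reference matrix exponential (hypothetical helper):
    Taylor series of A / 2**squarings, followed by repeated squaring."""
    A = np.asarray(A, dtype=float)
    scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k          # scaled**k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result          # undo the scaling
    return result

dt = 0.1

# Diagonal A stored as a vector: elementwise exp equals the diagonal of exp(dt * A).
a_diag = np.array([-1.0, -2.0, -3.0])
elementwise_diag = np.exp(dt * a_diag)
full_diag = expm_taylor(dt * np.diag(a_diag))
diag_match = np.allclose(elementwise_diag, np.diag(full_diag))

# Dense A (a rotation generator): elementwise exp is not the state propagator.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
elementwise_dense = np.exp(dt * A)
propagator = expm_taylor(dt * A)          # equals [[cos dt, sin dt], [-sin dt, cos dt]]
dense_match = np.allclose(elementwise_dense, propagator)

print("diagonal case agrees:", diag_match)
print("dense case agrees:", dense_match)
```

Whether this difference matters in practice depends on whether the trained $A$ is constrained to stay diagonal, which is the question this repository investigates.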

An obvious first approach to investigating this possible discrepancy is to use the algebraic form of the Zassenhaus formula in conjunction with the existing S4/Mamba technique, which starts from a good initial approximation for the optimal state propagator, $exp(tA_{opt})$: $$exp(t(A + dA)) = exp(tA) * M. $$ Here $M$ is defined by matrix exponentials of (high-order) commutators of $A$ and $dA$, and is implemented as a trainable parameter initialized at $M = Id = eye()$.
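For reference, the leading factors of the Zassenhaus expansion give $$M = exp(t\,dA) * exp(-\frac{t^2}{2}[A, dA]) * \cdots,$$ where $[A, dA] = A\,dA - dA\,A$ and the omitted factors involve higher-order nested commutators. Setting $dA = 0$ gives $M = Id$, consistent with the initialization above. The sketch below checks this truncation numerically on random matrices; it is an illustration under stated assumptions, not the repository's training code. `expm_taylor`, `commutator`, and `zassenhaus_error` are hypothetical helpers, and the matrix sizes and scales are arbitrary.

```python
import numpy as np

def expm_taylor(A, terms=20, squarings=8):
    """Reference matrix exponential (hypothetical helper): scaled Taylor series plus squaring."""
    A = np.asarray(A, dtype=float)
    scaled = A / (2 ** squarings)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def commutator(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

rng = np.random.default_rng(0)
A = 0.5 * rng.standard_normal((4, 4))     # stand-in for the initial operator
dA = 0.1 * rng.standard_normal((4, 4))    # stand-in for a learned perturbation

def zassenhaus_error(t):
    """Error of exp(tA) @ M when M keeps only the first two Zassenhaus factors."""
    exact = expm_taylor(t * (A + dA))
    M = expm_taylor(t * dA) @ expm_taylor(-0.5 * t**2 * commutator(A, dA))
    return np.linalg.norm(exact - expm_taylor(t * A) @ M)

err_coarse = zassenhaus_error(0.1)
err_fine = zassenhaus_error(0.05)
print(err_coarse, err_fine)               # the error shrinks as t decreases
```

In a training setup, $M$ would instead be a free parameter multiplied onto a fixed $exp(tA)$; the check above only illustrates what such a parameter would need to represent.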

Other easy-to-implement options include the following, at progressively greater computational expense:

Use Runge-Kutta integration to get the initial $exp(tA)$, and potentially make a small number of steps part of the training loop.
Use Padé approximants to get the initial $exp(tA)$; see Golub and Van Loan (3).
Use other techniques from Moler and Van Loan's "Nineteen Dubious Ways" paper (the most recent update, (4)).
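The first two options above can be sketched as follows. This is a minimal illustration, assuming classical fourth-order Runge-Kutta applied to $d\Phi/dt = A\Phi$ with $\Phi(0) = Id$, and the lowest-order diagonal Padé approximant (the (1,1) case, which coincides with the bilinear transform). The function names `expm_rk4` and `expm_pade11`, the step count, and the test matrix are assumptions made for the example.

```python
import numpy as np

def expm_rk4(A, t, steps=4):
    """Approximate exp(tA) by integrating dPhi/dt = A Phi, Phi(0) = I, with classical RK4."""
    A = np.asarray(A, dtype=float)
    h = t / steps
    phi = np.eye(A.shape[0])
    for _ in range(steps):
        k1 = A @ phi
        k2 = A @ (phi + 0.5 * h * k1)
        k3 = A @ (phi + 0.5 * h * k2)
        k4 = A @ (phi + h * k3)
        phi = phi + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return phi

def expm_pade11(A, t):
    """(1,1) Pade approximant of exp(tA): (I - tA/2)^{-1} (I + tA/2)."""
    A = np.asarray(A, dtype=float)
    eye = np.eye(A.shape[0])
    return np.linalg.solve(eye - 0.5 * t * A, eye + 0.5 * t * A)

# Rotation generator, for which exp(tA) is known in closed form.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
t = 0.5
exact = np.array([[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]])

rk4_error = np.abs(expm_rk4(A, t) - exact).max()
pade_error = np.abs(expm_pade11(A, t) - exact).max()
print("RK4 error:", rk4_error)
print("Pade (1,1) error:", pade_error)
```

Both routines use only matrix products and a linear solve, so they are differentiable with standard automatic differentiation; higher-order Padé approximants with scaling and squaring trade more work for better accuracy.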

(1) On the exponential solution of differential equations for a linear operator; Wilhelm Magnus; Communications on Pure and Applied Mathematics, November 1954; https://doi.org/10.1002/cpa.3160070404

(2) Assimilation of angle of arrival measurements from an antenna of GPS receivers in the WRF model; F Vandenberghe, Clayton Kerce, Robert Bock; Assimilation of Remote Sensing and In Situ Data in Modern Numerical Weather and Environmental Prediction Models

(3) Matrix Computations; Golub and Van Loan

(4) Nineteen Dubious Ways to Compute the Exponential of a Matrix, 25 years later; Moler and Van Loan

