Pi edited this page Dec 27, 2023 · 7 revisions

Some of us have started digging into Mamba (and its predecessors and successors).

BeeGass has started a series of presentations on the Discord.

See https://github.com/sap-ient-ai/ssm

^ With fewer than 300 weights, it precompiles in 7 minutes on an A6000 and produces good output after 10 epochs (3 minutes) of training on TinyShakespeare.

That's pretty amazing.


The summary below was generated by GPT-4. I don't like it and will manually redo it at some point. (pi)

Mamba: Innovations and Historical Pathway in Sequence Modeling

Mamba represents a significant advancement in sequence modeling by utilizing structured state spaces, offering linear-time processing with strong performance on long sequences. Its development follows a historical pathway of innovations, each contributing to its current capabilities. Below is an organized overview of Mamba's development, key innovations, and useful resources.

Historical Development

The development of Mamba is built upon several key innovations in sequence modeling:

  • Based: Newer: TODO Research this: https://hazyresearch.stanford.edu/blog/2023-12-11-zoology2-based
  • Mamba (S6): Current version, emphasizing linear-time processing and efficiency.
  • S4 (Structured State Space Sequences): Predecessor to Mamba, focusing on structured state spaces for sequence modeling.
  • H3 (Hungry Hungry Hippos): A significant step towards language modeling with state space models.
  • HiPPO: Fundamental theory for continuous-time sequence modeling.
  • LMU (Legendre Memory Units): Improved efficiency and scaling compared to traditional models.
  • Voelker's Contributions: Early work on improving spiking dynamical networks and introducing higher-order synapses.
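The common thread through this lineage is the linear state-space recurrence: a hidden state updated by a fixed transition at each step, which is what makes linear-time processing possible. As a minimal sketch (the matrices and shapes here are illustrative, not any paper's exact parameterization):

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run a discrete linear state-space model over an input sequence.

    x_k = A @ x_{k-1} + B * u_k   (state update)
    y_k = C @ x_k                 (readout)

    Each step is O(state_size^2), so the whole scan is linear in
    sequence length -- the property S4 and Mamba build on.
    """
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)

# Tiny example: a 2-state decaying system driven by an impulse.
A = np.array([[0.9, 0.0],
              [0.1, 0.8]])
B = np.array([1.0, 0.0])
C = np.array([0.0, 1.0])
u = np.array([1.0, 0.0, 0.0, 0.0])
y = ssm_scan(A, B, C, u)   # impulse response, read off the second state
```

S4's contribution was showing how to structure A (via HiPPO) so this recurrence retains long-range information, and how to compute it efficiently.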

Key Innovations and Their Contributions

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

S4 and Its Relevance to Mamba

S4 stands as a direct precursor to Mamba, introducing the concept of structured state spaces. Mamba's key departure is selectivity: where S4 uses fixed, input-independent dynamics, Mamba makes the step size and the input/output projections functions of the current input, letting the model decide what to store or forget at each position.
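A toy sketch of that selectivity idea, with a diagonal A and scalar inputs (the projections `w_dt`, `W_B`, `W_C` are illustrative assumptions, not the paper's exact parameterization or its hardware-aware scan kernel):

```python
import numpy as np

def softplus(z):
    return np.log1p(np.exp(z))

def selective_scan(u, a, w_dt, W_B, W_C):
    """Toy selective SSM over a scalar input sequence.

    Unlike S4, the step size dt and the projections B, C depend on the
    current input, so the recurrence can modulate how strongly each
    token enters or leaves the state. Didactic only.
    """
    x = np.zeros(a.shape[0])
    ys = []
    for u_k in u:
        dt = softplus(w_dt * u_k)        # input-dependent step size
        A_bar = np.exp(dt * a)           # discretized (diagonal) A
        B_k = W_B * u_k                  # input-dependent input projection
        C_k = W_C * u_k                  # input-dependent readout
        x = A_bar * x + dt * B_k * u_k   # state update
        ys.append(float(C_k @ x))
    return np.array(ys)

rng = np.random.default_rng(0)
n = 4
a = -np.abs(rng.standard_normal(n))      # stable (decaying) dynamics
w_dt = 0.5
W_B = rng.standard_normal(n)
W_C = rng.standard_normal(n)
u = rng.standard_normal(8)
y = selective_scan(u, a, w_dt, W_B, W_C)
```

Because A_bar, B_k, and C_k now vary per step, the convolutional trick S4 uses no longer applies; Mamba instead relies on a parallel scan to keep training efficient.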

Additional Resources and Further Reading

Videos and Blogs

Research and Further Exploration

TODO

  • Review and summarize the key points from the "Beyond Attention" and "Mamba AI" videos for a deeper understanding of the current state and future potential of sequence modeling technologies.