This repo is an unofficial implementation of the Jamba model as introduced in Lieber et al. (2024). The official project webpage can be found here. This repo is developed mainly for didactic purposes, to spell out the details of how to hybridize SSMs with Transformers.
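For orientation, here is a minimal sketch of what such a hybrid layer stack can look like, assuming the `mamba-ssm` package for the SSM layer. The names (`HybridStack`, `AttentionLayer`, `MambaLayer`, `attn_every`) are illustrative only, not this repo's actual API; the one-attention-layer-in-eight default mirrors the 1:7 attention-to-Mamba ratio described in the paper.

```python
import torch
import torch.nn as nn
from mamba_ssm import Mamba  # assumes the `mamba-ssm` package (CUDA required)


class AttentionLayer(nn.Module):
    """Pre-norm self-attention layer (causal masking omitted for brevity)."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out


class MambaLayer(nn.Module):
    """Pre-norm Mamba (SSM) layer."""

    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.mamba = Mamba(d_model=d_model)

    def forward(self, x):
        return x + self.mamba(self.norm(x))


class HybridStack(nn.Module):
    """Interleaves one attention layer per `attn_every` layers; the rest are Mamba."""

    def __init__(self, d_model: int, n_layers: int = 8, attn_every: int = 8):
        super().__init__()
        self.layers = nn.ModuleList(
            [
                AttentionLayer(d_model) if i % attn_every == 0 else MambaLayer(d_model)
                for i in range(n_layers)
            ]
        )

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        for layer in self.layers:
            x = layer(x)
        return x
```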
- Put all the essential pieces together: Mamba, MoE (see the MoE sketch after this list).
- Add a functioning training script (Lightning).
- Show some results.
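For the MoE piece, the paper replaces the MLP in some layers with a routed mixture of experts (16 experts, top-2 routing). Below is a minimal sketch of such a top-k routed expert MLP; `MoEMLP` and its hyperparameters are illustrative assumptions, not necessarily what this repo ends up using.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEMLP(nn.Module):
    """Token-level top-k routing over a set of expert MLPs (illustrative sketch)."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            ]
        )

    def forward(self, x):
        # x: (batch, seq_len, d_model) -> route each token independently
        b, s, d = x.shape
        flat = x.reshape(-1, d)
        gate_logits = self.router(flat)                      # (tokens, n_experts)
        weights, idx = gate_logits.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(flat)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(flat[mask])
        return out.reshape(b, s, d)
```

A production implementation would also add a load-balancing auxiliary loss and batched expert dispatch; the double loop here is kept only for readability.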
@article{lieber2024jamba,
title={Jamba: A Hybrid Transformer-Mamba Language Model},
author={Lieber, Opher and Lenz, Barak and Bata, Hofit and Cohen, Gal and Osin, Jhonathan and Dalmedigos, Itay and Safahi, Erez and Meirom, Shaked and Belinkov, Yonatan and Shalev-Shwartz, Shai and others},
journal={arXiv preprint arXiv:2403.19887},
year={2024}
}