Jamba Model in Easy PyTorch

This repo is an unofficial implementation of the Jamba model introduced in Lieber et al. (2024). One can find the official project webpage here. This repo is developed mainly for didactic purposes, to spell out the details of how to hybridize SSMs with Transformers.

Usage
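The repo's public API is not documented yet, so here is a hedged, self-contained PyTorch sketch of the layer layout the Jamba paper describes: blocks that are mostly SSM (Mamba-style) layers with an occasional attention layer, and MoE replacing some MLPs. All class names, ratios, and parameters below are illustrative assumptions, and the `SSMBlock` uses a gated causal convolution as a stand-in for a real selective state-space scan.

```python
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Standard pre-norm multi-head self-attention with a residual connection."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out

class SSMBlock(nn.Module):
    """Stand-in for a Mamba layer: gated causal depthwise convolution.
    (A real implementation uses a selective state-space scan.)"""
    def __init__(self, d_model: int, kernel: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel, groups=d_model, padding=kernel - 1)
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, x):
        h = self.norm(x)
        # Causal conv: pad left, then truncate back to the sequence length.
        c = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + c * torch.sigmoid(self.gate(h))

class MoE(nn.Module):
    """Top-1 routed mixture of small MLP experts."""
    def __init__(self, d_model: int, n_experts: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):
        h = self.norm(x)
        idx = self.router(h).argmax(-1)  # (batch, seq) hard expert assignment
        out = torch.zeros_like(h)
        for i, expert in enumerate(self.experts):
            mask = idx == i
            if mask.any():
                out[mask] = expert(h[mask])
        return x + out

class JambaLikeBlock(nn.Module):
    """Hybrid stack: mostly SSM layers, occasional attention, MoE interleaved.
    The attn_every / moe_every ratios are illustrative, not the paper's exact config."""
    def __init__(self, d_model: int, n_layers: int = 8, attn_every: int = 8, moe_every: int = 2):
        super().__init__()
        layers = []
        for i in range(n_layers):
            layers.append(AttentionBlock(d_model) if i % attn_every == 0 else SSMBlock(d_model))
            if i % moe_every == 1:
                layers.append(MoE(d_model))
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 16, 32)            # (batch, seq, d_model)
y = JambaLikeBlock(d_model=32)(x)
print(y.shape)                        # torch.Size([2, 16, 32])
```

Every sub-block is residual and shape-preserving, so attention, SSM, and MoE layers can be freely interleaved in any ratio.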

Roadmap

  • Put all the essential pieces together: Mamba, MoE.
  • Add a functioning training script (Lightning).
  • Show some results.

Citations

```bibtex
@article{lieber2024jamba,
  title={Jamba: A Hybrid Transformer-Mamba Language Model},
  author={Lieber, Opher and Lenz, Barak and Bata, Hofit and Cohen, Gal and Osin, Jhonathan and Dalmedigos, Itay and Safahi, Erez and Meirom, Shaked and Belinkov, Yonatan and Shalev-Shwartz, Shai and others},
  journal={arXiv preprint arXiv:2403.19887},
  year={2024}
}
```
