Minimal reference implementation #10

EelcoHoogendoorn · 2023-12-05T20:03:34Z

Thanks so much for providing this code; looks very useful and reproducible.

As I understand it, the custom scan kernel matters a lot for performance, so it is great to see it here as well.

However, as a suggestion, I think it'd be super neat to also see a minimal Mamba reference implementation with minimal dependencies, simply for clarity of exposition; something that could be unit tested to behave the same as the custom kernel, at least on small datasets. Would that be a lot of work? Does it already exist somewhere? If a torch version exists, I'd be happy to port it to JAX as well.

tridao · 2023-12-05T20:13:00Z

There's a reference implementation of the selective scan in PyTorch here. That's the main primitive that requires CUDA.

EelcoHoogendoorn · 2023-12-05T20:35:18Z

Right; I did see there is a reference implementation; I just wondered how close we should consider it to being 'minimal'.

How close could Mamba get to this kind of minimalism?

tridao · 2023-12-06T08:33:12Z

The core is actually just a for loop; the code simplifies a lot if you only take the path where B/C are input-dependent.
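For readers following along, here is a rough sketch of what such a for loop might look like when only the input-dependent B/C path is kept. It is not the repository's reference code. The function name `selective_scan`, the argument names, the list-of-lists layout, and the simplified discretization (`exp(delta * A)` for the state transition, `delta * B * u` for the input term) are all assumptions made for illustration, and plain Python lists are used only to avoid dependencies.

```python
import math


def selective_scan(u, delta, A, B, C, D_skip=None):
    """Sequential selective scan over plain Python lists (illustrative only).

    Assumed shapes, with L = sequence length, D = channels, N = state size:
      u, delta: L x D;  A: D x N;  B, C: L x N;  D_skip: length D or None.
    B and C vary per time step, which is the "input-dependent" path.
    """
    seq_len, d_model = len(u), len(u[0])
    d_state = len(A[0])
    h = [[0.0] * d_state for _ in range(d_model)]  # hidden state, D x N
    ys = []
    for t in range(seq_len):
        y_t = []
        for d in range(d_model):
            acc = 0.0
            for n in range(d_state):
                # Assumed discretization: A_bar = exp(delta * A),
                # and the input term is approximated by delta * B * u.
                a_bar = math.exp(delta[t][d] * A[d][n])
                h[d][n] = a_bar * h[d][n] + delta[t][d] * B[t][n] * u[t][d]
                # Read the state out through the time-dependent C.
                acc += C[t][n] * h[d][n]
            if D_skip is not None:
                acc += D_skip[d] * u[t][d]  # optional skip connection
            y_t.append(acc)
        ys.append(y_t)
    return ys


# With A = 0 and delta = B = C = 1 the scan reduces to a running sum.
u = [[1.0], [2.0], [3.0]]
ones = [[1.0], [1.0], [1.0]]
print(selective_scan(u, ones, [[0.0]], ones, ones))  # [[1.0], [3.0], [6.0]]
```

The running-sum case is a handy sanity check: any faster kernel fed the same small inputs should reproduce it, which is the kind of unit test asked for at the top of the thread.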

johnma2006 · 2023-12-20T11:29:44Z

Hi all, I wrote a minimal implementation here: https://github.com/johnma2006/mamba-minimal/tree/master. Hope it helps!

EelcoHoogendoorn · 2023-12-21T06:12:58Z

> Hi all, I wrote a minimal implementation here: https://github.com/johnma2006/mamba-minimal/tree/master. Hope it helps!

Thanks, that looks really clean, and should be trivial to port to JAX. From what I understand, using JAX scan also isn't competitive for LLM-scale models, but my intuition is it'd be fine for some of the smaller stuff I'd want to try it on.
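To make the scan-based port a bit more concrete, here is a dependency-free sketch of the shape such a port might take. The `scan` helper below is a pure-Python stand-in, not JAX itself. It mirrors the step contract of `jax.lax.scan`, where the step function maps `(carry, x)` to `(new_carry, y)`. The names `scan` and `make_ssm_step`, the single-channel layout, and the simplified discretization are assumptions for illustration only.

```python
import math


def scan(step, init, xs):
    """Pure-Python stand-in for a scan combinator.

    Folds `step` over `xs`, threading a carry and collecting one output
    per element, in the style of (carry, x) -> (new_carry, y).
    """
    carry, ys = init, []
    for x in xs:
        carry, y = step(carry, x)
        ys.append(y)
    return carry, ys


def make_ssm_step(A):
    """Build the per-time-step update for one channel with N = len(A) states."""
    def step(h, x):
        u_t, delta_t, B_t, C_t = x  # per-step input, step size, and B/C rows
        # Assumed discretization: exp(delta * A) for the state, delta * B * u for input.
        h_new = [math.exp(delta_t * a) * h_n + delta_t * b * u_t
                 for a, h_n, b in zip(A, h, B_t)]
        y_t = sum(c * h_n for c, h_n in zip(C_t, h_new))
        return h_new, y_t
    return step


# Two time steps, two state dimensions, A = 0 so the state just accumulates.
xs = [
    (1.0, 1.0, [1.0, 0.5], [1.0, 1.0]),
    (2.0, 1.0, [1.0, 0.5], [1.0, 1.0]),
]
final_h, ys = scan(make_ssm_step([0.0, 0.0]), [0.0, 0.0], xs)
print(final_h, ys)  # [3.0, 1.5] [1.5, 4.5]
```

In an actual JAX port the tuples would presumably become stacked arrays and the helper would be replaced by `jax.lax.scan`; this sketch says nothing about how that would perform at scale.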
