
[QUESTION]Mamba-2-hybrid Weights #864

Closed
Mooler0410 opened this issue Jun 13, 2024 · 4 comments
@Mooler0410

Your question
[An Empirical Study of Mamba-based Language Models](https://github.com/NVIDIA/Megatron-LM/tree/ssm/examples/mamba)
Hi! I'm impressed by this work and can't wait to try the new mamba-2-hybrid. The paper mentions that the weights are released on Hugging Face, but I cannot find them. Have they been released? If so, where can I download them?

Thanks a lot for your folks' contribution to the community!

@ruipeterpan

I think the model weights are released here: https://huggingface.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c

@Mooler0410
Author

> I think the model weights are released here: https://huggingface.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c

Thanks! I've found it now. When this question was posted, the weights had not yet been made public.

Now I'm looking for the tokenizer 🤣. A tokenizer is required to run the example, but I cannot find one. Any idea where it is?

@ruipeterpan

I think the tokenizer path should point to the .model file in the Hugging Face repos. For example, I downloaded the mamba2-hybrid-8b-3t-4k repo from Hugging Face, and mamba2-hybrid-8b-3t-4k/mt_nlg_plus_multilingual_ja_zh_the_stack_frac_015_256k.model is the tokenizer. I'm running inference with run_text_gen_server_8b.sh, and the checkpoint/tokenizer paths are

CHECKPOINT_PATH="/workspace/checkpoints/mamba2-hybrid-8b-3t-4k/"
TOKENIZER_PATH="/workspace/checkpoints/mamba2-hybrid-8b-3t-4k/mt_nlg_plus_multilingual_ja_zh_the_stack_frac_015_256k.model"

respectively.
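For readers automating this step: since the tokenizer ships as the only `.model` (SentencePiece) file inside the checkpoint directory, a small helper can locate it instead of hard-coding the long filename. This is a minimal sketch under that assumption; the function name and directory layout are illustrative, not part of Megatron-LM.

```python
# Hypothetical helper: locate the SentencePiece tokenizer (.model file)
# inside a downloaded checkpoint directory. Assumes exactly the layout
# described above (tokenizer stored as a *.model file at the top level).
from pathlib import Path


def find_tokenizer(checkpoint_dir: str) -> str:
    """Return the path to the first *.model file under checkpoint_dir."""
    matches = sorted(Path(checkpoint_dir).glob("*.model"))
    if not matches:
        raise FileNotFoundError(f"no .model tokenizer found in {checkpoint_dir}")
    return str(matches[0])


if __name__ == "__main__":
    # Example path from the comment above; adjust to your download location.
    print(find_tokenizer("/workspace/checkpoints/mamba2-hybrid-8b-3t-4k/"))
```

The resulting path can then be exported as TOKENIZER_PATH before launching run_text_gen_server_8b.sh.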

@Mooler0410
Author

Wow, thank you so much for your guidance! I spent hours trying to find the tokenizer.

I've never used Megatron before 🙃. You really saved my life!!
