This is a nanoGPT-style implementation of a Llama 3 architecture model from scratch, using Apple's MLX Python library. Most components are implemented from scratch, with no `nn` module used. It will also work if you just swap in NumPy.
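Since `mlx.core` mirrors much of the NumPy API, the "swap in NumPy" claim can be illustrated with a minimal sketch. This is not code from the repo — `rms_norm` here is a hypothetical example of a Llama-style building block written against the shared array API:

```python
# Model code would normally import the array library under one alias, e.g.:
#   import mlx.core as mx      # accelerated on Apple silicon
# Swapping in NumPy is then a one-line change (illustrative sketch only):
import numpy as mx  # stands in for `import mlx.core as mx`

def rms_norm(x, weight, eps=1e-5):
    # RMSNorm as used in Llama-style models: scale by the reciprocal RMS.
    # Only ops common to NumPy and mlx.core are used (sqrt, mean, broadcasting).
    rms = mx.sqrt(mx.mean(x * x, axis=-1, keepdims=True) + eps)
    return weight * (x / rms)

x = mx.ones((2, 4))
w = mx.ones((4,))
print(rms_norm(x, w).shape)  # (2, 4)
```

The same function body runs under either import, which is what makes the backend swap a one-line change.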
- Install packages. All we need for the model itself is mlx:

  ```
  pip install mlx
  ```

  However, we will also need these packages for converting the PyTorch weights and loading the tokenizer:

  ```
  pip install numpy torch llama_models
  ```
- Download the model. I have only tested Llama3.2-1B. You can download it from https://www.llama.com/llama-downloads/
- Update the weight, param, and tokenizer paths in `main.py` to your download destination.
- Run the model:

  ```
  python main.py
  ```

  You can directly specify your prompt, temperature, top-k, etc. in `main.py`.
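Temperature and top-k sampling can be sketched in a few lines of NumPy. This is a generic illustration of the technique, not necessarily how `main.py` implements it; the function name and signature are assumptions:

```python
import numpy as np

def sample_top_k(logits, temperature=1.0, top_k=50, rng=None):
    # Keep only the top_k highest logits, rescale by temperature,
    # then sample from the resulting softmax distribution.
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    k = min(top_k, logits.size)
    top_idx = np.argpartition(logits, -k)[-k:]         # indices of the k largest logits
    scaled = logits[top_idx] / max(temperature, 1e-8)  # low temperature -> sharper
    probs = np.exp(scaled - scaled.max())              # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(top_idx, p=probs))

# With top_k=2, only the two largest logits (indices 1 and 3) can be sampled;
# a low temperature makes index 1 overwhelmingly likely.
token = sample_top_k([0.1, 3.0, 0.5, 2.0], temperature=0.1, top_k=2)
print(token)
```

Lower temperature sharpens the distribution toward the argmax, while smaller top-k restricts sampling to the most likely tokens; both trade diversity for determinism.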
`playground.ipynb` is just me studying and experimenting with the model components. It documents the learning journey, and I think it will be helpful for anyone trying to start something similar, so I committed it here.
## Architecture understanding
## Implementation references