
Llama3 from scratch with MLX

This is a nanoGPT-like implementation of the Llama3 architecture from scratch using Apple's mlx Python library. Most components are implemented from scratch, without the `nn` module. It will also work if you simply swap in numpy.
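To illustrate the from-scratch style (and the numpy-swap claim above), here is a minimal Llama-style RMSNorm written against numpy; in the repo the same arithmetic would run on mlx arrays instead. The function name and signature are illustrative, not taken from this repo's code.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Llama normalizes by the root-mean-square over the last axis,
    # then applies a learned per-channel scale (no bias, no mean-centering).
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight
```

Swapping `numpy` for `mlx.core` (as `mx`) leaves the body unchanged, since mlx mirrors numpy's array API for these operations.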

Setup

  1. Install packages.

All we need for the model itself is mlx:

pip install mlx

However, these packages are needed for converting the PyTorch weights and loading the tokenizer:

pip install numpy torch llama_models

  2. Download the model. Only Llama3.2-1B has been tested. You can download it from https://www.llama.com/llama-downloads/

  3. Update the weight, param, and tokenizer paths in main.py to your download destination.
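The paths to update would look something like the following sketch. These variable names and file names are hypothetical; check main.py and your actual download directory for the real ones.

```python
# Hypothetical path variables -- the actual names and layout in main.py may differ.
WEIGHT_PATH    = "/path/to/Llama3.2-1B/consolidated.00.pth"   # PyTorch checkpoint
PARAMS_PATH    = "/path/to/Llama3.2-1B/params.json"           # architecture params
TOKENIZER_PATH = "/path/to/Llama3.2-1B/tokenizer.model"       # tokenizer file
```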

Running

python main.py

You can specify your prompt, temperature, top-k, etc. directly in main.py.
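For reference, temperature and top-k control sampling roughly as in the sketch below: temperature rescales the logits, and top-k restricts sampling to the k most likely tokens. This is a generic illustration in numpy, not the repo's actual sampling code.

```python
import numpy as np

def sample_top_k(logits, temperature=0.8, top_k=40, rng=None):
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    top = np.argsort(logits)[-top_k:]            # indices of the k largest logits
    probs = np.exp(logits[top] - logits[top].max())  # stable softmax over the top-k
    probs /= probs.sum()
    return int(rng.choice(top, p=probs))
```

With `top_k=1` this degenerates to greedy decoding; larger values trade determinism for diversity.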

playground.ipynb is just me studying and experimenting with the model components. It loosely documents the learning journey, and I think it will be helpful for anyone trying to start something similar, so I committed it here.

Special thanks to

Architecture understanding

Implementation references
