`gpt2-mlx`

A re-implementation of GPT-2 in Apple's new machine learning framework, MLX

Run OpenAI's 1.5 billion parameter model or train custom GPT-style models from scratch, all on your Mac GPU!

GPT-2 XL 1.5B real-time text generation on M1 Pro 16GB

Install

Use a device with Apple silicon

$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

See the full GPT-2 neural network architecture, implemented in MLX, in transformer.py

Run

Download the pre-trained GPT-2 model weights from Hugging Face

Convert the PyTorch model weights to the MLX format

$ python convert_weights.py --weights_path="path/to/pytorch_model.bin" --model_name="gpt2-xl"

Generate text

$ python generate.py --model_name="gpt2-xl" --prompt="In a shocking finding, scientists discovered a herd of unicorns"

Train

With gpt2-mlx, it is possible to train a custom GPT-style model on your own data

First, gather your training data and save it as a text file, e.g. train.txt

Run the following script to pre-process and tokenize the text data into a format compatible with the model

$ python prepare_data.py --data_path="path/to/train.txt"
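As a rough illustration of what this kind of pre-processing step involves, the sketch below turns a text file into a flat array of token ids and saves it as a `.npy` file. It is not the code in prepare_data.py: the `encode` function is a hypothetical byte-level stand-in for GPT-2's byte-pair-encoding tokenizer, the `prepare` helper and file names are made up for the demo, and the `.npy` output is an assumption based on the `train.npy` path used in the training command.

```python
import os
import tempfile

import numpy as np


def encode(text: str) -> list[int]:
    """Hypothetical stand-in tokenizer: one token id per UTF-8 byte.

    GPT-2 itself uses a byte-pair-encoding vocabulary of 50,257 tokens.
    """
    return list(text.encode("utf-8"))


def prepare(text_path: str, out_path: str) -> int:
    """Tokenize a text file and save the ids as a 1-D .npy array."""
    with open(text_path, encoding="utf-8") as f:
        text = f.read()
    # uint16 holds ids up to 65,535, which is enough for a 50,257-token vocabulary
    tokens = np.array(encode(text), dtype=np.uint16)
    np.save(out_path, tokens)
    return len(tokens)


# Demo on a tiny temporary file
workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "train.txt")
npy_path = os.path.join(workdir, "train.npy")
with open(text_path, "w", encoding="utf-8") as f:
    f.write("Hello world, hello MLX.")

num_tokens = prepare(text_path, npy_path)
```

Storing the ids in a compact integer dtype keeps the file small, which matters when the training corpus is large.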

Train the model, natively on your device

$ python train.py --data_path="path/to/train.npy" --checkpoint_dir="path/to/save/checkpoints"

The training script reads one batch of data from disk at each training step, so the entire dataset never has to be loaded into memory
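One common way to get this behavior with a `.npy` file is memory mapping, sketched below. This is an assumption about the general technique, not a copy of train.py: the `get_batch` helper, its parameters, and the toy token file are all hypothetical. `np.load(..., mmap_mode="r")` leaves the array on disk, and only the slices taken for each batch are actually read.

```python
import os
import tempfile

import numpy as np


def get_batch(tokens, batch_size, context_len, rng):
    """Sample a batch of (input, target) windows from a 1-D token array."""
    # Each window needs context_len + 1 tokens: inputs plus the shifted targets
    starts = rng.integers(0, len(tokens) - context_len, size=batch_size)
    x = np.stack([tokens[s : s + context_len] for s in starts])
    y = np.stack([tokens[s + 1 : s + context_len + 1] for s in starts])
    return x.astype(np.int32), y.astype(np.int32)


# Toy dataset: token ids 0..999 written to a temporary .npy file
path = os.path.join(tempfile.mkdtemp(), "train.npy")
np.save(path, np.arange(1000, dtype=np.uint16))

# mmap_mode="r" maps the file instead of reading it all into memory
tokens = np.load(path, mmap_mode="r")
rng = np.random.default_rng(0)
x, y = get_batch(tokens, batch_size=4, context_len=8, rng=rng)
```

The targets `y` are the inputs `x` shifted by one position, which is the usual next-token prediction setup for GPT-style models.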

It also implements gradient accumulation, which lets you train larger models with larger effective batch sizes than would otherwise fit in memory
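The idea behind gradient accumulation can be shown with a toy example in plain Python. This is a minimal sketch, not the repository's implementation: the one-parameter model `y_hat = w * x`, the helper names, and the learning rate are all made up for illustration. Gradients from several small micro-batches are averaged before a single parameter update, which gives the same gradient as one large batch while only one micro-batch is processed at a time.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average gradients over equal-size micro-batches."""
    if len(xs) % micro_batch_size != 0:
        raise ValueError("batch size must be a multiple of micro_batch_size")
    num_micro = len(xs) // micro_batch_size
    total = 0.0
    for i in range(0, len(xs), micro_batch_size):
        g = grad_mse(w, xs[i : i + micro_batch_size], ys[i : i + micro_batch_size])
        # Scale each micro-batch gradient so the sum is the full-batch mean
        total += g / num_micro
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]  # data generated with a true weight of 2.0
w = 0.5

full = grad_mse(w, xs, ys)                               # one batch of 8
accum = accumulated_grad(w, xs, ys, micro_batch_size=2)  # four micro-batches of 2

# A single optimizer step (plain SGD here) uses the accumulated gradient
w_new = w - 0.01 * accum
```

Because the micro-batches are the same size, dividing each gradient by the number of micro-batches reproduces the full-batch mean. The trade-off is more forward and backward passes per optimizer step in exchange for lower peak memory.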

To generate text with your custom-trained model, run the same generate.py script, passing the checkpoint directory that contains your custom model definition and weights

$ python generate.py --checkpoint_dir="path/to/checkpoints" --prompt="Hello world,"


dx-dtran/gpt2-mlx
