Simplistic Flax implementation of GPT in one notebook.

codescv/flax-gpt

Introduction

FlaxGPT is a minimal Flax implementation of a GPT (decoder-only transformer) model. The code is kept to a minimum and lives in a single notebook, which makes it well suited to hacking and learning.
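For readers new to the term, the core of a decoder-only transformer is causal self-attention: each position may attend only to itself and earlier positions. The sketch below illustrates that idea in plain JAX. It is not code from the notebook; the function name `causal_self_attention` and the weight arguments `wq`, `wk`, `wv`, `wo` are hypothetical names chosen for this example.

```python
import jax
import jax.numpy as jnp


def causal_self_attention(x, wq, wk, wv, wo, num_heads):
    """Illustrative causal multi-head self-attention (hypothetical helper).

    x: (seq_len, d_model); wq, wk, wv, wo: (d_model, d_model).
    Assumes d_model is divisible by num_heads.
    """
    seq_len, d_model = x.shape
    head_dim = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, head_dim)
        return t.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split_heads(x @ wq), split_heads(x @ wk), split_heads(x @ wv)

    # Scaled dot-product scores, shape (num_heads, seq_len, seq_len).
    scores = q @ k.transpose(0, 2, 1) / jnp.sqrt(head_dim)

    # Lower-triangular mask: position i sees positions 0..i only.
    mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))
    scores = jnp.where(mask, scores, -jnp.inf)
    weights = jax.nn.softmax(scores, axis=-1)

    # Merge heads back to (seq_len, d_model) and apply the output projection.
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ wo
```

Because of the mask, changing a later token leaves the outputs at earlier positions unchanged, which is the property that makes left-to-right generation possible.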

Usage

Open the main flax_gpt Colab notebook and start hacking.

Feature Roadmap

  • Implement the GPT model using Flax
  • Load / Convert LLaMA2-7B checkpoint for prediction
  • Implement K-V cache in prediction
  • Pretraining (example)
  • Finetuning (example)
  • LoRA finetuning
  • Quantization
  • Distributed Training (TPUs)
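
As an illustration of the K-V cache item above, here is a minimal single-head sketch in plain JAX. It is an assumption about how such a cache could look, not the notebook's implementation; `init_cache`, `decode_step`, and the dictionary layout are hypothetical names introduced for this example. The idea is that keys and values for past tokens are stored once, so each new token only computes its own projections.

```python
import jax
import jax.numpy as jnp


def init_cache(max_len, d_head):
    # Pre-allocated key/value buffers plus the next write position.
    return {
        "k": jnp.zeros((max_len, d_head)),
        "v": jnp.zeros((max_len, d_head)),
        "pos": 0,
    }


def decode_step(cache, x_t, wq, wk, wv):
    """Attend from one new token x_t (d_model,) to all cached tokens.

    Returns the attention output (d_head,) and the updated cache.
    """
    pos = cache["pos"]
    q = x_t @ wq
    # Write this token's key and value into slot `pos`.
    k = cache["k"].at[pos].set(x_t @ wk)
    v = cache["v"].at[pos].set(x_t @ wv)

    scores = k @ q / jnp.sqrt(q.shape[-1])  # (max_len,)
    # Ignore slots that have not been filled yet.
    valid = jnp.arange(k.shape[0]) <= pos
    scores = jnp.where(valid, scores, -jnp.inf)
    weights = jax.nn.softmax(scores)
    return weights @ v, {"k": k, "v": v, "pos": pos + 1}
```

Calling `decode_step` once per token, in order, should reproduce the rows of full causal attention over the same sequence, while avoiding recomputation of earlier keys and values.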

Tutorials

Here are some tutorials on how I implemented GPT from scratch.
