GPT

This repository is a simple and clean GPT implementation in TensorFlow.

Dependencies

  • Python 3.9.16
  • TensorFlow 2.12.0
  • TensorFlow Text 2.12.1
  • TensorFlow Datasets 4.9.2
  • KerasNLP 0.5.2
  • Datasets 2.13.0

Usage

Train

The model is trained by default on the OpenWebText dataset. Use --model_dir=<model_dir> to specify the model directory name.

python train.py --model_dir=<model_dir> 

Some other options:

  • The model.py functions are compiled with XLA. To disable XLA, set jit_compile=False.
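As a rough illustration of what the XLA option controls, the sketch below toggles compilation through the `jit_compile` argument of `tf.function`, which is how TensorFlow exposes XLA. The `USE_XLA` switch and the `scaled_sum` function are hypothetical stand-ins and are not taken from model.py.

```python
import tensorflow as tf

# Hypothetical switch: set to False to skip XLA and use the regular graph path.
USE_XLA = True


@tf.function(jit_compile=USE_XLA)
def scaled_sum(x, y):
    # Toy stand-in for a model function; not code from model.py.
    return tf.reduce_sum(x * y) * 0.5


x = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([4.0, 5.0, 6.0])
result = float(scaled_sum(x, y))
print(result)
```

The result is the same with either setting; only the compilation path changes.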

Generate

Use --model_dir=<model_dir> and --context=<context> to specify the model directory name and context.

python generate.py --model_dir=<model_dir> --context=<context>
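A context that contains spaces generally has to be quoted in the shell so that it arrives as a single argument. The sketch below mimics the two documented flags with `argparse` to show how such a value is parsed. It is an assumption for illustration only: generate.py may define its flags with a different library, and `parse_args` and the sample values are hypothetical.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical parser mirroring the two documented flags.
    parser = argparse.ArgumentParser(description="Generate text from a trained model")
    parser.add_argument("--model_dir", required=True, help="model directory name")
    parser.add_argument("--context", required=True, help="prompt text to continue")
    return parser.parse_args(argv)


# Equivalent to: python generate.py --model_dir=gpt-mini --context="Hello, I am"
args = parse_args(["--model_dir=gpt-mini", "--context=Hello, I am"])
print(args.model_dir, repr(args.context))
```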

Pretrained GPT-Mini

To download and try the pretrained GPT-Mini, run demo.ipynb. To fine-tune GPT-Mini from the pretrained weights, modify demo.ipynb or create a separate notebook for fine-tuning.

Hyperparameter settings

Adjust hyperparameters in the config.py file.
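One way to keep hyperparameters in a single place is a small frozen dataclass, sketched below. Every field name and default value here is an illustrative placeholder and does not reflect the actual contents of config.py. The check in `__post_init__` assumes a standard multi-head attention layout in which the embedding width is split evenly across heads.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HParams:
    # Placeholder names and values for illustration; not the repo's settings.
    vocab_size: int = 50257
    seq_len: int = 256
    emb_dim: int = 256
    heads: int = 8
    depth: int = 6
    learning_rate: float = 3e-4

    def __post_init__(self):
        # Each attention head receives emb_dim // heads channels.
        if self.emb_dim % self.heads != 0:
            raise ValueError("emb_dim must be divisible by heads")


base = HParams()
# Derive a variant without mutating the base configuration.
wider = replace(base, emb_dim=512, heads=16)
print(wider)
```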

TensorBoard

Run TensorBoard:

tensorboard --logdir ./

Licence

MIT
