GPT-2 Reimplementation

📌 Project Overview

This repository contains a from-scratch reimplementation of OpenAI's GPT-2 model. The goal is to understand and reproduce its core components: tokenization, the Transformer architecture, training, and inference.

🧠 Understanding GPT-2

GPT-2 is an autoregressive, decoder-only Transformer model designed for text generation. Its core components (see the sketch after this list) are:

  • Multi-layer Transformer blocks
  • Self-attention for contextual word understanding
  • Layer normalization and residual connections
  • Token embeddings and positional encodings
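
The components above can be combined roughly as follows. This is a minimal PyTorch sketch of a pre-LayerNorm GPT-2-style block, not the exact code in this repository; hyperparameters such as `n_embd=768` and `n_head=12` are illustrative and correspond to the smallest GPT-2 configuration.

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        # nn.MultiheadAttention handles the per-head split and output projection.
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)

    def forward(self, x):
        T = x.size(1)
        # Boolean upper-triangular mask: True entries are blocked, so each
        # position can only attend to itself and earlier tokens.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out

class Block(nn.Module):
    def __init__(self, n_embd: int = 768, n_head: int = 12):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln2 = nn.LayerNorm(n_embd)
        # GPT-2's feed-forward network expands to 4x the embedding size and uses GELU.
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        # Residual connections around attention and MLP, with layer norm applied first.
        x = x + self.attn(self.ln1(x))
        x = x + self.mlp(self.ln2(x))
        return x

# Example: a batch of 2 sequences, 16 tokens each, embedding size 768.
x = torch.randn(2, 16, 768)
print(Block()(x).shape)  # torch.Size([2, 16, 768])
```

A full model stacks several of these blocks on top of the token and positional embeddings, followed by a final layer norm and a linear head that projects back to the vocabulary.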

Dataset

We use the FineWeb-Edu dataset from Hugging Face for training. It consists of high-quality web text filtered for educational content and is available on the Hugging Face Hub as HuggingFaceFW/fineweb-edu.
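
As a rough illustration of how the data might be pulled in, here is a hedged sketch that streams FineWeb-Edu from the Hub and tokenizes it with the GPT-2 BPE tokenizer via `tiktoken`. The subset name `sample-10BT` and the streaming setup are assumptions for this example, not necessarily what this repository's training scripts do.

```python
from itertools import islice

import tiktoken
from datasets import load_dataset

enc = tiktoken.get_encoding("gpt2")  # GPT-2's BPE tokenizer

# Streaming avoids downloading the full dataset up front.
# "sample-10BT" is one of the published FineWeb-Edu samples (assumed here).
ds = load_dataset("HuggingFaceFW/fineweb-edu", name="sample-10BT",
                  split="train", streaming=True)

for example in islice(ds, 3):
    tokens = enc.encode_ordinary(example["text"])
    print(len(tokens), "tokens:", example["text"][:60], "...")
```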

📚 References

πŸ† Contributors
