GPT: Transformer-based Language Model

Overview

This repository contains an implementation of a Transformer-based GPT model trained on the OpenWebText dataset. The model follows the architecture principles of the original GPT paper and is built using PyTorch. The goal of this project is to explore language modeling and text generation using modern deep learning techniques.

Features

Implements a Transformer-based architecture
Pretrained on OpenWebText dataset
Supports text generation and fine-tuning
Built with PyTorch for flexibility and scalability

Installation

To use this repository, first clone it and install the required dependencies:

git clone https://github.com/Executer13/gpt.git
cd gpt
pip install -r requirements.txt

Usage

Training the Model

To train the GPT model on your dataset, run:

python train.py --config configs/default.yaml

Modify configs/default.yaml to change model hyperparameters and training settings.

Generating Text

To generate text using a trained model, run:

python generate.py --model-checkpoint path/to/checkpoint.pth --prompt "Your text prompt here"

Fine-Tuning

Fine-tune the model on a custom dataset using:

python fine_tune.py --dataset path/to/dataset

Model Architecture

The model follows the standard Transformer-based GPT architecture, including:

Multi-head self-attention mechanism
Layer normalization
Position-wise feedforward network
Causal masking for autoregressive generation

Dataset

The model is trained on OpenWebText, an open-source dataset designed to replicate OpenAI’s WebText dataset. Users can replace this dataset with their own corpus for custom training.

Contributions

Contributions are welcome! Feel free to open issues or submit pull requests.

License

This project is licensed under the MIT License. See LICENSE for details.

Contact

For questions or discussions, open an issue or reach out via GitHub Discussions.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.ipynb_checkpoints		.ipynb_checkpoints
.gitignore		.gitignore
README.md		README.md
bigram.ipynb		bigram.ipynb
chatbot.py		chatbot.py
data-extract.py		data-extract.py
data-train.py		data-train.py
time-testing.py		time-testing.py
training.py		training.py
vocab_file		vocab_file

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GPT: Transformer-based Language Model

Overview

Features

Installation

Usage

Training the Model

Generating Text

Fine-Tuning

Model Architecture

Dataset

Contributions

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

GPT: Transformer-based Language Model

Overview

Features

Installation

Usage

Training the Model

Generating Text

Fine-Tuning

Model Architecture

Dataset

Contributions

License

Contact

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages