A PyTorch implementation of a causal, decoder-only transformer for training tiny language models. The repository includes training utilities, data processing, and an interactive web UI for interacting with trained models.
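For orientation, here is a minimal, self-contained sketch of the causal attention masking that decoder-only models rely on. It is illustrative only and independent of the actual code in `transformer/transformer.py`:

```python
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal mask.

    q, k, v: (batch, heads, seq_len, head_dim)
    """
    T = q.size(-2)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    # Upper-triangular mask: position t may only attend to positions <= t,
    # which is what makes the model autoregressive.
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=q.device), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```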
Below is a sample generation after ~10 hours of training on a single Quadro RTX 6000 using the TinyStoriesV2-GPT4 dataset:
Prompt: "Once upon a time there was a little girl named Lucy"
Output:
Once upon a time there was a little girl named Lucy. She was three years old and loved to explore the world around her. One day, Lucy was walking in the park when she saw a big, red ball. She wanted to play with it, so she ran over to it.
"What is this?" Lucy asked.
"It's a ball," said a friendly voice.\nLucy looked around and saw a little boy. He was wearing a blue shirt and had a big smile on his face.
"Can I play with you?" Lucy asked.\nThe little boy nodded and they started to play together. They had so much fun that they didn't notice the time passing by. Suddenly, the ball started to roll away. Lucy and the little boy ran after it, but it was too fast.
"Let's catch it!" Lucy said.
They ran and ran until they finally caught the ball.
"That was so much fun!" said the little boy.
"Yes, it was!" said Lucy.\nThey both smiled and hugged each other. Then they went back to playing with the ball, happy to have made a new friend.
A sample checkpoint is provided in `saved_checkpoints/`, and a file showing how the model's text generation capabilities progressed during training can be found in `data/`.
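If you want to poke at the sample checkpoint directly, a quick way is to inspect it before assuming anything about its layout (the internal structure is not documented here, so this snippet only looks at what `train.py` actually saved):

```python
import torch

# Load on CPU so no GPU is required just to inspect the file.
ckpt = torch.load("saved_checkpoints/sample_checkpoint.pt", map_location="cpu")
if isinstance(ckpt, dict):
    # Typically a mix of model weights, optimizer state, and training metadata.
    print(list(ckpt.keys()))
else:
    print(type(ckpt))
```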
```
.
├── config/              # Configuration files (Hydra)
├── data/                # Dataset directory
├── data_utils/          # Data processing utilities
│   ├── logger.py        # Weights & Biases logging
│   ├── tokenizer.py     # Custom tokenizer implementation
│   └── dataloader.py    # Data loading utilities
├── interactive_ui/      # Web UI components
│   ├── app.py           # Flask application
│   └── templates/       # HTML templates
├── saved_checkpoints/   # Model checkpoints
├── transformer/         # Core transformer implementation
│   └── transformer.py   # Transformer model architecture
└── train.py             # Main training script
```
- Clone the repository:
```bash
git clone https://github.com/Mr-vedant-gupta/TinyLM.git
cd TinyLM
```
- Install the package in development mode:
```bash
pip install -e .
```
- Download the training and validation datasets into the `data/` directory. Both datasets should be text files with training examples separated by the `data.EOT` token (set in `config/config.yaml`); see the data-format sketch after this list.
- Train the tokenizer and tokenize the datasets:
```bash
python data_utils/tokenizer.py
```
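To make the expected data layout concrete, here is a hypothetical sketch of splitting a dataset file into EOT-delimited examples. The token string and file name below are assumptions; substitute the `data.EOT` value from `config/config.yaml` and your actual file names:

```python
EOT = "<|endoftext|>"  # assumed value; use data.EOT from config/config.yaml

# "data/train.txt" is a placeholder for whatever you downloaded into data/.
with open("data/train.txt", encoding="utf-8") as f:
    raw = f.read()

# Each training example is one EOT-delimited chunk of plain text.
examples = [chunk.strip() for chunk in raw.split(EOT) if chunk.strip()]
print(f"{len(examples)} training examples")
```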
Train the model with Weights & Biases logging:
```bash
python train.py WandB.name=<training_run_name> train.debug=False
```
The training script uses Hydra for configuration management; training parameters can be modified in the `config/` directory.
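For readers unfamiliar with Hydra, this is a minimal sketch of what an entry point like `train.py` typically looks like. The `config_path`/`config_name` arguments are inferred from the `config/config.yaml` mentioned above, not copied from the repo's code:

```python
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="config", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Print the fully composed configuration tree.
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()
```

Any key that appears in the composed config can be overridden from the command line, which is exactly what `WandB.name=<training_run_name>` and `train.debug=False` do above.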
Launch the web interface to interact with a trained model:
```bash
python interactive_ui/app.py train.load_checkpoint=saved_checkpoints/sample_checkpoint.pt WandB.name=dummy
```