Theta AI

A GPT-2-based conversational AI training framework optimized for the NVIDIA RTX 3060 (12 GB VRAM). This repository contains everything needed to train, fine-tune, and run inference on Theta AI models.

Features

Optimized Training Pipeline: Gradient checkpointing, mixed precision (FP16), CPU offloading
Advanced Techniques: Curriculum learning, R-Drop regularization, EMA, label smoothing
RTX 3060 Optimized: Configured for 12GB VRAM with memory-efficient settings
Email Notifications: Real-time training alerts with GPU stats and metric monitoring
Multi-domain Training: Cybersecurity, programming, networking, data science, and more
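
To make one of the listed techniques concrete, here is a minimal sketch of an exponential moving average (EMA) over model weights. It is an illustration in plain Python, not the code in `src/training/`; the function name `ema_update` and the decay values are assumptions chosen for the example.

```python
# Minimal EMA sketch (illustrative only; not taken from src/training/).
# The "shadow" weights trail the live weights and smooth out noisy updates.

def ema_update(shadow, params, decay=0.999):
    """Return new shadow weights: decay * shadow + (1 - decay) * params."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]


weights = [0.0, 1.0]          # stand-in for model parameters
shadow = list(weights)        # the EMA copy starts equal to the weights

for step in range(3):
    # Stand-in for an optimizer step that moves every weight by +0.1.
    weights = [w + 0.1 for w in weights]
    shadow = ema_update(shadow, weights, decay=0.9)

print(weights, shadow)
```

Because the shadow copy only moves a fraction of the way toward the live weights at each step, it lags behind them; evaluating with the shadow weights is what typically gives EMA its stabilizing effect.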

Quick Start

Requirements

GPU: NVIDIA RTX 3060 12GB (or similar)
CPU: AMD Ryzen 5 5500 or equivalent
CUDA: 11.8+
Python: 3.8+
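
The requirements above can be checked with a short script before installing anything else. This is a sketch, assuming only the version numbers listed here; the helper names `python_ok` and `cuda_summary` are hypothetical and not part of the repository. The CUDA check only reports something useful once PyTorch is installed.

```python
import sys


def python_ok(version_info=None, minimum=(3, 8)):
    """True if the interpreter (or the given version tuple) is at least `minimum`."""
    v = sys.version_info if version_info is None else version_info
    return tuple(v[:2]) >= minimum


def cuda_summary():
    """Describe the first CUDA device, or explain why none is visible."""
    try:
        import torch  # optional: only present after the installation step
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but CUDA is not available"
    props = torch.cuda.get_device_properties(0)
    vram_gib = props.total_memory / 1024**3
    return f"{props.name}: {vram_gib:.1f} GiB VRAM, CUDA {torch.version.cuda}"


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print(cuda_summary())
```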

Installation

# Clone repository
git clone https://github.com/yourusername/theta-ai.git
cd theta-ai

# Install dependencies (CUDA)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

# Download NLTK data
python -c "import nltk; nltk.download('punkt'); nltk.download('wordnet'); nltk.download('stopwords')"

# Setup environment
cp .env.example .env
# Edit .env with your settings
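
If you want to confirm what ended up in `.env` after editing it, a small parser is enough. The sketch below uses only the standard library; the key names in the example (`SMTP_HOST`, `SMTP_PORT`) are hypothetical, so check `.env.example` for the real ones.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines, comments, and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


# Hypothetical contents; the real keys are defined in .env.example.
sample = "# mail settings\nSMTP_HOST=smtp.example.com\nSMTP_PORT = 587\nNAME='Theta AI'\n"
print(parse_env(sample))
```

To read the real file, pass `pathlib.Path(".env").read_text()` to `parse_env`.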

Download Datasets

# Human-like conversational data (28MB)
download_human_like_dpo.bat

# OpenAssistant dataset (6GB)
download_openassistant.bat

# OpenMath dataset (9GB, optional)
download_openmath_instruct.bat
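
The three downloads add up to roughly 15 GB, so it can be worth checking free disk space first. The sketch below takes the sizes from the comments above and treats 1 GB as 1000 MB; the dictionary keys and function names are assumptions for illustration, and processed data will likely need additional room.

```python
import shutil

# Download sizes in MB, as listed in this README (1 GB counted as 1000 MB).
DATASET_SIZES_MB = {
    "human_like_dpo": 28,
    "openassistant": 6_000,
    "openmath_instruct": 9_000,
}
OPTIONAL = {"openmath_instruct"}


def required_mb(include_optional=True):
    """Total download size in MB, with or without the optional dataset."""
    return sum(
        size
        for name, size in DATASET_SIZES_MB.items()
        if include_optional or name not in OPTIONAL
    )


def has_room(path=".", include_optional=True):
    """True if the filesystem holding `path` has enough free space for the downloads."""
    free_mb = shutil.disk_usage(path).free / 1_000_000
    return free_mb >= required_mb(include_optional)


if __name__ == "__main__":
    print("Need about", required_mb(), "MB; enough space:", has_room("."))
```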

Training

# Full training pipeline (overnight recommended)
train_overnight_enhanced.bat

Fine-tuning (After Initial Training)

If training stalls or validation loss plateaus, run targeted fine-tuning with reduced regularization:

# Run targeted fine-tuning (create a config file first, or reuse an existing one)
finetune_theta.bat

See the Training Pipeline guide in the documentation/ folder for details.
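
One way to decide that validation loss has plateaued is a patience check over the per-epoch losses. The sketch below is an assumption about how such a check could look, not the rule the training scripts apply; `has_plateaued`, `patience`, and `min_delta` are illustrative names and values.

```python
def has_plateaued(val_losses, patience=3, min_delta=1e-3):
    """True if the last `patience` epochs failed to beat the earlier best loss
    by at least `min_delta`."""
    if len(val_losses) <= patience:
        return False  # not enough history to judge
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_recent > best_before - min_delta


print(has_plateaued([3.0, 2.5, 2.2, 2.2, 2.21, 2.2]))  # flat for three epochs
print(has_plateaued([3.0, 2.5, 2.2, 2.0, 1.8, 1.6]))   # still improving
```

A larger `patience` tolerates longer flat stretches before fine-tuning is suggested; a larger `min_delta` demands a bigger improvement for an epoch to count as progress.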

Inference

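from src.model.theta_model import ThetaModel

# YYYYMMDD is a placeholder: replace it with the date suffix of your own model directory
model = ThetaModel.load("models/theta_enhanced_YYYYMMDD/theta_final")

# Generate a response of up to 200 tokens for a single prompt
response = model.generate("What is machine learning?", max_length=200)
print(response)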
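
Since the model directory name carries a date suffix, a small helper can pick the most recent run instead of hard-coding the date. This is a sketch that assumes directories are named `theta_enhanced_` followed by eight digits and contain a `theta_final` folder, as in the path above; the helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming scheme: theta_enhanced_YYYYMMDD
_RUN_PATTERN = re.compile(r"^theta_enhanced_(\d{8})$")


def latest_run(names):
    """Return the directory name with the latest date suffix, or None."""
    dated = [n for n in names if _RUN_PATTERN.match(n)]
    # Same prefix plus a fixed-width date, so string order is date order.
    return max(dated) if dated else None


def latest_model_path(models_dir="models"):
    """Return <models_dir>/<latest run>/theta_final, or None if no run exists."""
    root = Path(models_dir)
    if not root.is_dir():
        return None
    name = latest_run(p.name for p in root.iterdir() if p.is_dir())
    return root / name / "theta_final" if name else None


if __name__ == "__main__":
    print(latest_model_path())
```

The returned path can be converted with `str(...)` and passed to `ThetaModel.load`.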

Documentation

Full documentation is available in the documentation/ folder:

Guide	Description
Installation	Detailed setup instructions
Quick Start	Get training in 5 minutes
Training Pipeline	Complete training system guide
Datasets	Dataset formats and creation
Hyperparameters	All configuration options
RTX 3060 Optimizations	GPU-specific tuning
Email Notifications	Alert system setup
Architecture	System design overview
API Reference	Code documentation
Data Processing	Data preparation guide
Model Config	Model settings
Troubleshooting	Common issues & fixes

Project Structure

theta-ai/
├── src/
│   ├── model/              # Model architecture
│   ├── training/           # Training pipeline
│   ├── inference/          # Inference utilities
│   ├── data_processing/    # Dataset processing
│   └── utils/              # Email notifier, GPU info
├── Datasets/               # Training data (JSON)
├── models/                 # Saved checkpoints
├── documentation/          # Full documentation
├── train_overnight_enhanced.bat  # Main training script
├── prepare_data_for_training.py  # Data preparation
└── requirements.txt        # Dependencies

Key Files

File	Purpose
`train_overnight_enhanced.bat`	Main training orchestration
`finetune_theta.bat`	Targeted fine-tuning with reduced regularization
`prepare_data_for_training.py`	Data preparation pipeline
`src/training/train_enhanced.py`	Core training logic
`src/model/theta_model.py`	Model architecture
`src/utils/email_notifier.py`	Training notifications
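
For readers curious what a training notification might contain, the sketch below composes an alert with the standard library's `email.message.EmailMessage`. It is not the code in `src/utils/email_notifier.py`; the function name, subject format, addresses, and reported fields are all assumptions made for illustration.

```python
from email.message import EmailMessage


def build_alert(epoch, val_loss, gpu_mem_gb,
                sender="theta@example.com", recipient="you@example.com"):
    """Compose (but do not send) a plain-text training alert."""
    msg = EmailMessage()
    msg["Subject"] = f"[Theta AI] epoch {epoch}: val_loss={val_loss:.4f}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Epoch: {epoch}\n"
        f"Validation loss: {val_loss:.4f}\n"
        f"GPU memory in use: {gpu_mem_gb:.1f} GB\n"
    )
    return msg


alert = build_alert(epoch=3, val_loss=2.1234, gpu_mem_gb=9.5)
print(alert["Subject"])
```

Sending the message would go through `smtplib.SMTP(...).send_message(alert)` with the server settings from your `.env` file.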

Contributing

See CONTRIBUTING.md for guidelines.

Changelog

See CHANGELOG.md for version history.

License

This project is for educational and research purposes.
