FrostlineTech/Theta-AI

Theta AI

A GPT-2-based conversational AI training framework optimized for the NVIDIA RTX 3060 (12 GB VRAM). This repository contains everything needed to train, fine-tune, and run inference on Theta AI models.

Features

  • Optimized Training Pipeline: Gradient checkpointing, mixed precision (FP16), CPU offloading
  • Advanced Techniques: Curriculum learning, R-Drop regularization, EMA, label smoothing
  • RTX 3060 Optimized: Configured for 12GB VRAM with memory-efficient settings
  • Email Notifications: Real-time training alerts with GPU stats and metric monitoring
  • Multi-domain Training: Cybersecurity, programming, networking, data science, and more
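Two of the techniques listed above, label smoothing and EMA (exponential moving average of weights), are easy to illustrate without any framework. The following is a minimal sketch in plain Python, not the code in src/training/: the function names `label_smoothed_nll` and `ema_update` are hypothetical, and the default values (`smoothing=0.1`, `decay=0.999`) are illustrative rather than taken from this repository's configuration.

```python
import math


def label_smoothed_nll(logits, target, smoothing=0.1):
    """Cross-entropy of `logits` against a smoothed one-hot target.

    The target distribution puts (1 - smoothing) on the true class and
    spreads `smoothing` uniformly over all classes.
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    n = len(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = smoothing / n + ((1.0 - smoothing) if i == target else 0.0)
        loss -= q * lp
    return loss


def ema_update(shadow, weights, decay=0.999):
    """Move the shadow (averaged) weights one step toward the live weights."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


if __name__ == "__main__":
    # A confident, correct prediction is penalised slightly more once smoothed.
    print(label_smoothed_nll([4.0, 0.0, 0.0], target=0, smoothing=0.0))
    print(label_smoothed_nll([4.0, 0.0, 0.0], target=0, smoothing=0.1))
```

With `smoothing=0.0` the function reduces to ordinary negative log-likelihood, which makes it easy to compare the two settings on the same logits.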

Quick Start

Requirements

  • GPU: NVIDIA RTX 3060 12GB (or similar)
  • CPU: AMD Ryzen 5 5500 or equivalent
  • CUDA: 11.8+
  • Python: 3.8+
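Before installing, it can help to confirm that the machine meets these requirements. The sketch below is an assumption about how such a check might look, not a script shipped with the repository; `python_ok` and `cuda_summary` are hypothetical helper names. The GPU check relies on PyTorch and degrades to a plain message when PyTorch is not installed yet.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version from the Requirements list


def python_ok(version_info=None):
    """Return True if the interpreter is at least MIN_PYTHON."""
    version = tuple((version_info or sys.version_info)[:2])
    return version >= MIN_PYTHON


def cuda_summary():
    """Describe the first visible GPU, or explain why it cannot be checked."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed yet"
    if not torch.cuda.is_available():
        return "PyTorch is installed but no CUDA device is visible"
    props = torch.cuda.get_device_properties(0)
    vram_gib = props.total_memory / 1024**3
    return f"{props.name}, {vram_gib:.1f} GiB VRAM, CUDA {torch.version.cuda}"


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("GPU:", cuda_summary())
```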

Installation

# Clone repository
git clone https://github.com/FrostlineTech/Theta-AI.git
cd Theta-AI

# Install dependencies (CUDA)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

# Download NLTK data
python -c "import nltk; nltk.download('punkt'); nltk.download('wordnet'); nltk.download('stopwords')"

# Setup environment
cp .env.example .env
# Edit .env with your settings
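After editing .env, a quick check for empty values can catch mistakes before a long training run. The sketch below parses simple KEY=VALUE lines with the standard library only. The key names (`SMTP_HOST`, `SMTP_PORT`, `ALERT_RECIPIENT`) are hypothetical and used purely for illustration; the actual keys are whatever .env.example defines.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Hypothetical key names; check .env.example for the real ones.
example = """
# email alerts
SMTP_HOST=smtp.example.com
SMTP_PORT="587"
ALERT_RECIPIENT=
"""

settings = parse_env(example)
print(missing_keys(settings, ["SMTP_HOST", "SMTP_PORT", "ALERT_RECIPIENT"]))
# prints ['ALERT_RECIPIENT']
```

To check a real file, pass `open(".env").read()` to `parse_env` instead of the inline example.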

Download Datasets

# Human-like conversational data (28MB)
download_human_like_dpo.bat

# OpenAssistant dataset (6GB)
download_openassistant.bat

# OpenMath dataset (9GB, optional)
download_openmath_instruct.bat

Training

# Full training pipeline (overnight recommended)
train_overnight_enhanced.bat

Fine-tuning (After Initial Training)

If training stalls or validation loss plateaus, run targeted fine-tuning with reduced regularization:

# Run targeted fine-tuning (create a config file first, or reuse an existing one)
finetune_theta.bat

See Training Pipeline for details.
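One way to decide whether validation loss has plateaued is to compare the best recent value against the best earlier value. The sketch below is an illustrative heuristic, not the criterion used by the training scripts; `has_plateaued`, `patience`, and `min_delta` are hypothetical names with illustrative defaults.

```python
def has_plateaued(val_losses, patience=3, min_delta=0.01):
    """Return True if the last `patience` evaluations did not improve on
    the earlier best validation loss by at least `min_delta`."""
    if len(val_losses) <= patience:
        return False  # not enough history to judge
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return best_before - recent_best < min_delta


if __name__ == "__main__":
    still_improving = [3.0, 2.5, 2.2, 2.0, 1.9, 1.8]
    stalled = [3.0, 2.5, 2.2, 2.199, 2.201, 2.198]
    print(has_plateaued(still_improving))  # False
    print(has_plateaued(stalled))          # True
```

A larger `patience` makes the check less sensitive to noisy evaluations, at the cost of reacting later.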

Inference

from src.model.theta_model import ThetaModel

model = ThetaModel.load("models/theta_enhanced_YYYYMMDD/theta_final")
response = model.generate("What is machine learning?", max_length=200)
print(response)
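The path above contains a date placeholder (YYYYMMDD). As a convenience, the most recent run directory can be located programmatically. This is a minimal sketch that assumes run directories sit directly under models/ and are named theta_enhanced_ followed by eight digits, as the example path suggests; `latest_run` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path

# Matches directory names such as theta_enhanced_20240315.
RUN_DIR = re.compile(r"^theta_enhanced_(\d{8})$")


def latest_run(models_dir="models"):
    """Return the newest theta_enhanced_YYYYMMDD directory, or None."""
    root = Path(models_dir)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir() and RUN_DIR.match(p.name)]
    # YYYYMMDD sorts chronologically as a plain string.
    return max(runs, key=lambda p: p.name, default=None)


if __name__ == "__main__":
    run = latest_run()
    if run is None:
        print("No training runs found under models/")
    else:
        print(run / "theta_final")
```

The printed path can then be passed to `ThetaModel.load` in place of the hand-written one.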

Documentation

Full documentation is available in the documentation/ folder:

| Guide | Description |
| --- | --- |
| Installation | Detailed setup instructions |
| Quick Start | Get training in 5 minutes |
| Training Pipeline | Complete training system guide |
| Datasets | Dataset formats and creation |
| Hyperparameters | All configuration options |
| RTX 3060 Optimizations | GPU-specific tuning |
| Email Notifications | Alert system setup |
| Architecture | System design overview |
| API Reference | Code documentation |
| Data Processing | Data preparation guide |
| Model Config | Model settings |
| Troubleshooting | Common issues & fixes |

Project Structure

theta-ai/
├── src/
│   ├── model/              # Model architecture
│   ├── training/           # Training pipeline
│   ├── inference/          # Inference utilities
│   ├── data_processing/    # Dataset processing
│   └── utils/              # Email notifier, GPU info
├── Datasets/               # Training data (JSON)
├── models/                 # Saved checkpoints
├── documentation/          # Full documentation
├── train_overnight_enhanced.bat  # Main training script
├── prepare_data_for_training.py  # Data preparation
└── requirements.txt        # Dependencies

Key Files

| File | Purpose |
| --- | --- |
| train_overnight_enhanced.bat | Main training orchestration |
| finetune_theta.bat | Targeted fine-tuning with reduced regularization |
| prepare_data_for_training.py | Data preparation pipeline |
| src/training/train_enhanced.py | Core training logic |
| src/model/theta_model.py | Model architecture |
| src/utils/email_notifier.py | Training notifications |

Contributing

See CONTRIBUTING.md for guidelines.

Changelog

See CHANGELOG.md for version history.

License

This project is for educational and research purposes.
