This project is an ongoing exercise in building an LLM and its evaluation/red-teaming tooling from scratch, from autograd and tokenization to GPU kernels and deployment.
I started by writing my own autograd library, using Andrej Karpathy's micrograd as the foundation and extending it with tensor operations and support for SwiGLU and batch normalization.
I've added:
- Custom CUDA kernels for matmuls, optimized with shared-memory tiling, float4 vectorization, and double buffering
- Hand-written Triton matmul kernels (a minimal sketch follows this list)
- cuBLAS matmul wrapper
- Performance evaluation of all matmul implementations using NVIDIA Nsight
- A rough implementation of Meta's Chain of Continuous Thought (Coconut) paper (Hao et al., 2024)
- An eval framework with adversarial testing (ANLI) and broad-coverage benchmarking (MMLU)
- Inference speed-ups with TensorRT-LLM and vLLM
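For a flavor of the Triton work, here is a minimal tiled matmul kernel in the same spirit. This is an illustrative sketch only, not the tuned kernels in `centigrad/kernels/triton_kernels.py`, and the block sizes are placeholder values:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def matmul_kernel(a_ptr, b_ptr, c_ptr, M, N, K,
                  stride_am, stride_ak,
                  stride_bk, stride_bn,
                  stride_cm, stride_cn,
                  BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # Each program instance computes one BLOCK_M x BLOCK_N tile of C
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
    offs_k = tl.arange(0, BLOCK_K)
    a_ptrs = a_ptr + offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak
    b_ptrs = b_ptr + offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
    for k in range(0, K, BLOCK_K):
        # Masked loads handle edges that do not divide evenly into blocks
        a = tl.load(a_ptrs, mask=(offs_m[:, None] < M) & (offs_k[None, :] + k < K), other=0.0)
        b = tl.load(b_ptrs, mask=(offs_k[:, None] + k < K) & (offs_n[None, :] < N), other=0.0)
        acc += tl.dot(a, b)
        a_ptrs += BLOCK_K * stride_ak
        b_ptrs += BLOCK_K * stride_bk
    c_ptrs = c_ptr + offs_m[:, None] * stride_cm + offs_n[None, :] * stride_cn
    tl.store(c_ptrs, acc, mask=(offs_m[:, None] < M) & (offs_n[None, :] < N))

def triton_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    M, K = a.shape
    _, N = b.shape
    c = torch.empty((M, N), device=a.device, dtype=torch.float32)
    grid = (triton.cdiv(M, 64), triton.cdiv(N, 64))
    matmul_kernel[grid](a, b, c, M, N, K,
                        a.stride(0), a.stride(1),
                        b.stride(0), b.stride(1),
                        c.stride(0), c.stride(1),
                        BLOCK_M=64, BLOCK_N=64, BLOCK_K=32)
    return c
```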
The next steps are:
- From-scratch implementation of byte-pair encoding (BPE) for tokenization (a minimal sketch follows this list)
- Multi-GPU support
- Interpretability features (inspired by Li et al.'s recent work on geometric approaches to interpretability)
- Deployment on a home Kubernetes cluster
- Combining it all with the from-scratch LLM (currently the reasoning and eval/inference frameworks wrap existing models like Llama 3.3)
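The core of the planned BPE tokenizer is small. A minimal byte-level training loop, sketched under the assumption of the standard greedy-merge formulation (not the final implementation):

```python
from collections import Counter

def most_common_pair(ids):
    """Count adjacent token-id pairs and return the most frequent one."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` BPE merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))  # start from raw bytes: ids 0-255
    merges = {}
    for step in range(num_merges):
        if len(ids) < 2:
            break
        pair = most_common_pair(ids)
        merges[pair] = 256 + step  # new token ids begin after the byte range
        ids = merge(ids, pair, merges[pair])
    return merges
```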
Overall, this project is intended as a continuous learning exercise, and features are subject to change.
- Configure CUDA Environment
# Point Triton at the CUDA toolkit's ptxas
# (the toolkit version should match the CUDA 12.1 packages installed below)
export TRITON_PTXAS_PATH=/usr/local/cuda-12.1/bin/ptxas
echo 'export TRITON_PTXAS_PATH=/usr/local/cuda-12.1/bin/ptxas' >> ~/.bashrc
source ~/.bashrc
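To confirm the variable landed and points at a real binary, a quick check from Python (assumes the path set above):

```python
import os

# Confirm TRITON_PTXAS_PATH is set and resolves to an actual file
ptxas = os.environ.get("TRITON_PTXAS_PATH", "")
print("TRITON_PTXAS_PATH:", ptxas or "<unset>")
print("ptxas exists:", os.path.isfile(ptxas))
```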
- Install Miniconda
# Download Miniconda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Make installer executable
chmod +x Miniconda3-latest-Linux-x86_64.sh
# Run installer (accept license & defaults)
./Miniconda3-latest-Linux-x86_64.sh
# Source conda initialization script directly
source ~/miniconda3/etc/profile.d/conda.sh
# Initialize conda in your current shell
conda init
# Activate conda changes
source ~/.bashrc
# Verify installation
conda --version
- Create and Activate Conda Environment
conda create -n tango python=3.10
conda activate tango
- Install Required Packages
# Install PyTorch with CUDA support
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
# Install CuPy for CUDA operations
conda install -c conda-forge cupy cuda-version=12.1
# Install Triton for GPU kernel development
conda install -c conda-forge triton
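After installing, it is worth confirming that PyTorch and CuPy were both built against the same CUDA major version; a mismatch here is the usual cause of the "CUDA version mismatch" issue listed under Troubleshooting. A minimal check:

```python
import torch
import cupy

# Both should report a CUDA 12.x build
print("PyTorch CUDA build:", torch.version.cuda)                    # e.g. "12.1"
print("CuPy CUDA runtime:", cupy.cuda.runtime.runtimeGetVersion())  # e.g. 12010
```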
- Set Python Path
# Add project root to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:/workspace/Tango"
# For persistence, add to ~/.bashrc:
echo 'export PYTHONPATH="${PYTHONPATH}:/workspace/Tango"' >> ~/.bashrc
source ~/.bashrc
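Once the path is set, the project package should import from any directory. A quick check, using `centigrad` as shown in the project layout below:

```python
# With PYTHONPATH set, this works regardless of the current directory
import centigrad
print("centigrad loaded from:", centigrad.__file__)
```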
Tango/
├── centigrad/
│ ├── __init__.py
│ ├── kernels/
│ │ ├── __init__.py
│ │ ├── benchmark.py
│ │ ├── triton_kernels.py
│ │ ├── cuda_kernels.cu
│ │ └── cudnn_utils.cu
import torch
import triton
import cupy
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Triton version: {triton.__version__}")
print(f"CuPy version: {cupy.__version__}")
- PYTHONPATH not set correctly
  - Symptom: ModuleNotFoundError when importing project modules
  - Solution: Ensure PYTHONPATH includes the project root directory
- CUDA version mismatch
  - Symptom: Runtime errors related to CUDA
  - Solution: Ensure all CUDA-related packages (PyTorch, CuPy, Triton) use compatible versions
- Missing NVIDIA drivers
  - Symptom: CUDA not available
  - Solution: Install the appropriate NVIDIA driver for your GPU
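A one-shot diagnostic covering all three issues, sketched in Python:

```python
import os
import torch

# Check PYTHONPATH, CUDA availability, and driver/toolkit visibility in one pass
print("PYTHONPATH:", os.environ.get("PYTHONPATH", "<unset>"))
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("PyTorch CUDA build:", torch.version.cuda)
```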
- NVIDIA Nsight Systems: For system-wide performance analysis
- NVIDIA Nsight Compute: For detailed kernel analysis
- Triton Kernel Debugger: For debugging custom Triton kernels
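Alongside the Nsight tools, Triton ships a lightweight benchmarking helper that is handy for quick kernel comparisons from Python. A minimal sketch timing a plain `torch.matmul` as a baseline (sizes are placeholder values):

```python
import torch
from triton.testing import do_bench

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")

# do_bench warms up, runs the function repeatedly, and reports milliseconds
ms = do_bench(lambda: torch.matmul(a, b))
print(f"torch.matmul 1024x1024: {ms:.3f} ms")
```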