# 🧪 Transformer Builder - Advanced Testing Lab

Welcome! This notebook provides comprehensive testing and training capabilities for your custom transformer architecture.

**What's included:**
- ✅ **Tier 1:** Critical validation (shape, gradients, numerical stability)
- 🔬 **Tier 2:** Advanced analysis (attention patterns, robustness, profiling)
- 🚀 **Tier 3:** Training utilities (fine-tuning, hyperparameter sweeps, benchmarks)

**Quick Start:**
1. Click "Run all" (Runtime → Run all)
2. Review Tier 1 results (should complete in ~1 minute)
3. Explore Tier 2/3 sections as needed

**Source:** Generated from [Transformer Builder](https://transformer-builder.com)

---

## Setup: Install Dependencies

This may take 30-60 seconds on first run.

In [None]:
# Install core dependencies
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install -q transformers datasets evaluate accelerate
!pip install -q scipy matplotlib seaborn pandas tqdm
!pip install -q torchinfo  # For model summaries

print("✓ Core dependencies installed")

# Install Tier 2 dependencies (optional)
!pip install -q captum  # For attribution analysis
print("✓ Tier 2 dependencies installed")

# Install Tier 3 dependencies (optional)
!pip install -q optuna  # For hyperparameter optimization
print("✓ Tier 3 dependencies installed")

print("\n✅ All dependencies ready!")

## Load Custom Model from URL

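This cell reads the `gist_id` query parameter from the notebook URL (passed from Transformer Builder), downloads `model.py` and `config.json` from that GitHub Gist, and falls back to a built-in example model if no gist is available.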
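To make the parameter handling concrete, here is a minimal sketch of the same lookup done in plain Python on a URL string. The helper name `extract_gist_params` and the example URL are hypothetical; the notebook itself reads the parameters in the browser via JavaScript.

```python
from urllib.parse import urlparse, parse_qs

def extract_gist_params(url, default_name="CustomTransformer"):
    """Return the gist_id and model name encoded in a URL's query string."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first one if present
    gist_id = query.get("gist_id", [None])[0]
    name = query.get("name", [default_name])[0]
    return {"gist_id": gist_id, "name": name}

# Hypothetical URL, for illustration only
example = extract_gist_params("https://example.com/notebook.ipynb?gist_id=abc123&name=MyModel")
print(example)
```

When `gist_id` is absent the helper returns `None` for it, which mirrors the condition under which the cell below loads the example model.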

In [None]:
import urllib.request
import urllib.parse
import json
from google.colab import output

# Extract gist_id from URL query parameters
js_script = """
(() => {
  const params = new URLSearchParams(window.location.search);
  return {
    gist_id: params.get('gist_id'),
    model_name: params.get('name') || 'CustomTransformer'
  };
})()
"""

params = output.eval_js(js_script)
gist_id = params.get('gist_id')

if gist_id:
    try:
        print(f"📥 Loading model from GitHub Gist: {gist_id}")
        
        # Fetch gist data from GitHub API
        gist_url = f"https://api.github.com/gists/{gist_id}"
        with urllib.request.urlopen(gist_url) as response:
            gist_data = json.loads(response.read().decode('utf-8'))
        
        # Extract files from gist
        files = gist_data.get('files', {})
        
        if 'model.py' not in files or 'config.json' not in files:
            raise ValueError("Gist missing required files (model.py or config.json)")
        
        model_code = files['model.py']['content']
        config_json = files['config.json']['content']
        
        # Write files to disk
        with open('custom_transformer.py', 'w') as f:
            f.write(model_code)
        
        with open('config.json', 'w') as f:
            f.write(config_json)
        
        model_name = params.get('model_name') or 'CustomTransformer'
        # The instantiation cell looks up params['name'], so store the name under that key
        params['name'] = model_name
        
        print("✅ Model code loaded successfully!")
        print(f"✅ Model name: {model_name}")
        print(f"✅ Code size: {len(model_code):,} bytes")
        print(f"✅ Config size: {len(config_json):,} bytes")
        print(f"✅ Gist URL: {gist_data.get('html_url', 'N/A')}")
        
    except Exception as e:
        print(f"❌ Failed to load model from Gist: {e}")
        print("⚠️ Falling back to example model...")
        gist_id = None

if not gist_id:
    print("⚠️ No usable gist_id (missing from URL or failed to load)")
    print("Loading example model for demonstration...\n")
    
    # Fallback: Create example model
    example_code = """import torch
import torch.nn as nn

class ExampleTransformer(nn.Module):
    def __init__(self, vocab_size=50257, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.output_projection = nn.Linear(d_model, vocab_size)
    
    def forward(self, input_ids):
        x = self.embedding(input_ids)
        x = self.transformer(x)
        return self.output_projection(x)
"""
    
    with open('custom_transformer.py', 'w') as f:
        f.write(example_code)
    
    with open('config.json', 'w') as f:
        json.dump({
            "vocab_size": 50257,
            "d_model": 512,
            "nhead": 8,
            "num_layers": 6
        }, f)
    
    params = {'name': 'ExampleTransformer'}
    print("✅ Example model loaded")

## 📄 View Loaded Model Code

This cell displays the Python code that was loaded from your Transformer Builder export. You can review the architecture before running tests.

In [None]:
# Display the loaded model code for transparency
print("=" * 80)
print("📄 LOADED MODEL CODE (custom_transformer.py)")
print("=" * 80)
print()

with open('custom_transformer.py', 'r') as f:
    model_code_display = f.read()

# Use syntax highlighting
from IPython.display import Code
display(Code(model_code_display, language='python'))

print()
print("=" * 80)
print("📋 MODEL CONFIGURATION (config.json)")
print("=" * 80)
print()

with open('config.json', 'r') as f:
    config_display = json.load(f)

# Pretty print JSON
print(json.dumps(config_display, indent=2))
print()
print("✅ You can now proceed to run the model instantiation and tests below!")

## Dynamic Dependency Detection

Automatically detect and install any custom dependencies your model needs.
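The detection step can be tried in isolation on a source string. The sketch below is illustrative: the helper name `top_level_imports` and the sample source are assumptions, and it skips relative imports because they refer to local modules rather than pip packages.

```python
import ast

def top_level_imports(source):
    """Collect the top-level package name of every absolute import in source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found

# Hypothetical model source, for illustration only
sample = "import torch.nn as nn\nfrom einops import rearrange\nfrom .local import helper\n"
print(sorted(top_level_imports(sample)))  # → ['einops', 'torch']
```

Keep in mind that an import name is not always the pip package name (for example, `sklearn` is installed as `scikit-learn`), so installing by import name can fail or fetch an unrelated package.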

In [None]:
import ast
import subprocess
import sys

# Parse imports from generated code
with open('custom_transformer.py', 'r') as f:
    source_code = f.read()
    tree = ast.parse(source_code)

# Extract all imports
imports = set()
for node in ast.walk(tree):
    if isinstance(node, ast.Import):
        for alias in node.names:
            imports.add(alias.name.split('.')[0])
    elif isinstance(node, ast.ImportFrom):
        # Skip relative imports (level > 0); they are local modules, not pip packages
        if node.module and node.level == 0:
            imports.add(node.module.split('.')[0])

print(f"Detected imports: {', '.join(sorted(imports))}")

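# Standard library modules (don't need pip install).
# sys.stdlib_module_names exists on Python 3.10+; the short list is a fallback.
stdlib_modules = set(getattr(sys, 'stdlib_module_names', ())) | {
    'abc', 'collections', 'dataclasses', 'functools', 'json', 'math',
    'typing', 'warnings', 'os', 'sys', 're', 'time', 'copy'
}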

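# Already installed by the setup cell (numpy comes in as a dependency)
installed_modules = {
    'torch', 'torchvision', 'torchaudio', 'transformers', 'datasets',
    'evaluate', 'accelerate', 'numpy', 'scipy', 'matplotlib',
    'pandas', 'seaborn', 'tqdm', 'torchinfo', 'captum', 'optuna'
}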

# Find missing packages
missing = imports - stdlib_modules - installed_modules

if missing:
    print(f"\nInstalling additional dependencies: {', '.join(missing)}")
    for package in missing:
        try:
            subprocess.check_call(
                [sys.executable, '-m', 'pip', 'install', '-q', package],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL
            )
            print(f"  ✅ Installed {package}")
        except subprocess.CalledProcessError:
            print(f"  ⚠️ Failed to install {package} (may not be a pip package)")
else:
    print("\n✅ All dependencies already installed")

## Import and Instantiate Model

Load your custom transformer and prepare for testing.
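The cell below calls `model_class(**config_dict)`, which raises a `TypeError` if `config.json` contains a key the constructor does not accept. One way to guard against that, sketched here with a hypothetical helper `filter_config` and a stand-in class `TinyModel`, is to compare the config against the constructor signature first.

```python
import inspect

def filter_config(cls, config):
    """Split config into kwargs the constructor accepts and keys it would reject."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter means the constructor accepts any keyword
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(config), {}
    accepted = {k: v for k, v in config.items() if k in params and k != "self"}
    rejected = {k: v for k, v in config.items() if k not in accepted}
    return accepted, rejected

class TinyModel:  # hypothetical stand-in for an exported model class
    def __init__(self, vocab_size=100, d_model=16):
        self.vocab_size, self.d_model = vocab_size, d_model

accepted, rejected = filter_config(TinyModel, {"vocab_size": 50, "dropout": 0.1})
print(accepted, rejected)  # → {'vocab_size': 50} {'dropout': 0.1}
```

Reporting the rejected keys instead of silently dropping them makes a mismatch between the exported config and the model code easier to spot.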

In [None]:
import torch
import torch.nn as nn
from torchinfo import summary

# Import the custom model (runs the file in this notebook's global namespace)
with open('custom_transformer.py') as f:
    exec(f.read())

# Load config
with open('config.json') as f:
    config_dict = json.load(f)

# Find the model class
model_class = None
for name, obj in list(globals().items()):
    if isinstance(obj, type) and issubclass(obj, nn.Module) and obj is not nn.Module:
        if name == params['name']:
            model_class = obj
            break

if model_class is None:
    # Fallback: find any nn.Module subclass
    for name, obj in list(globals().items()):
        if isinstance(obj, type) and issubclass(obj, nn.Module) and obj is not nn.Module:
            model_class = obj
            print(f"⚠️ Using {name} (expected {params['name']})")
            break

if model_class:
    # Instantiate model
    try:
        model = model_class(**config_dict)
        model.eval()
        
        total_params = sum(p.numel() for p in model.parameters())
        trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
        
        print(f"✅ Model instantiated: {model_class.__name__}")
        print(f"✅ Total parameters: {total_params:,}")
        print(f"✅ Trainable parameters: {trainable_params:,}")
        
        # Move to GPU if available
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        model = model.to(device)
        print(f"✅ Device: {device}")
        
        # Display model summary
        print("\n--- Model Summary ---")
        try:
            # Create dummy input based on config
            vocab_size = config_dict.get('vocab_size', 50257)
            dummy_input = torch.randint(0, vocab_size, (1, 32)).to(device)
            summary(model, input_data=dummy_input, depth=3)
        except Exception as e:
            print(f"⚠️ Could not generate summary: {e}")
        
    except Exception as e:
        print(f"❌ Failed to instantiate model: {e}")
        raise
else:
    raise RuntimeError(f"Could not find model class '{params['name']}' in generated code")

# Create config object for test functions
class ModelConfig:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

config = ModelConfig(**config_dict)
print("\n✅ Ready for testing!")

---

# 🔍 Tier 1: Critical Validation

These tests verify your model is mathematically sound and ready for training.

**Estimated time:** ~1 minute

**What's tested:**
- ✅ Shape validation across edge cases
- ✅ Gradient flow (detect vanishing/exploding gradients)
- ✅ Numerical stability (NaN/Inf detection)
- ✅ Parameter initialization quality
- ✅ Memory footprint scaling
- ✅ Inference speed benchmarks
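
As an illustration of what a gradient-flow check looks for, the sketch below labels per-layer gradient norms. It is not the logic of `test_gradient_flow` in `test_functions.py`; the function name `classify_gradient_norms`, the thresholds, and the sample norms are all assumptions chosen for the example.

```python
import math

def classify_gradient_norms(grad_norms, vanish_below=1e-7, explode_above=1e3):
    """Label each named gradient norm as 'ok', 'vanishing', 'exploding', or 'non-finite'.

    The two thresholds are illustrative assumptions, not values from test_functions.py.
    """
    report = {}
    for name, norm in grad_norms.items():
        if not math.isfinite(norm):
            report[name] = "non-finite"
        elif norm < vanish_below:
            report[name] = "vanishing"
        elif norm > explode_above:
            report[name] = "exploding"
        else:
            report[name] = "ok"
    return report

# Hypothetical per-layer norms; in PyTorch they could be gathered after backward() with
# {n: p.grad.norm().item() for n, p in model.named_parameters() if p.grad is not None}
norms = {"embedding.weight": 0.42, "layer0.weight": 3e-9, "head.weight": float("nan")}
print(classify_gradient_norms(norms))
```

Checking for non-finite values first matters because comparisons against NaN are always false, so a NaN norm would otherwise be reported as "ok".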

In [None]:
# Import test utilities
!wget -q https://raw.githubusercontent.com/matt-hans/transformer-builder-colab-templates/main/utils/test_functions.py
from test_functions import (
    test_shape_robustness,
    test_gradient_flow,
    test_output_stability,
    test_parameter_initialization,
    test_memory_footprint,
    test_inference_speed
)

print("✅ Test functions loaded")

In [None]:
print("=" * 80)
print("TIER 1: CRITICAL VALIDATION")
print("=" * 80)
print()

# Test 1: Shape Robustness
print("Test 1/6: Shape Validation")
print("-" * 80)
shape_results = test_shape_robustness(model, config)
display(shape_results)
print()

# Test 2: Gradient Flow
print("Test 2/6: Gradient Flow Analysis")
print("-" * 80)
grad_results = test_gradient_flow(model, config)
display(grad_results)
print()

# Test 3: Output Stability
print("Test 3/6: Numerical Stability")
print("-" * 80)
stability_stats = test_output_stability(model, config, n_samples=100)
print()

# Test 4: Parameter Initialization
print("Test 4/6: Parameter Initialization")
print("-" * 80)
param_results = test_parameter_initialization(model)
display(param_results)
print()

# Test 5: Memory Footprint
print("Test 5/6: Memory Footprint Analysis")
print("-" * 80)
memory_results = test_memory_footprint(model, config)
display(memory_results)
print()

# Test 6: Inference Speed
print("Test 6/6: Inference Speed Benchmark")
print("-" * 80)
speed_stats = test_inference_speed(model, config, n_trials=50)
print()

print("=" * 80)
print("✅ TIER 1 VALIDATION COMPLETE")
print("=" * 80)
print()
print("Tier 1 tests finished. Review the results above for failures or warnings before moving on.")
print()
print("Next steps:")
print("• Scroll down for Tier 2 (Advanced Analysis)")
print("• Or jump to Tier 3 (Training & Fine-Tuning)")