# 🚀 CSCG-Torch

This notebook demonstrates how to use CSCG-Torch for GPU-accelerated sequence generation and for training a cloned hidden Markov model (CHMM).

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/your-repo/cscg-torch/blob/main/examples/CSCG_Torch_Colab_Demo.ipynb)

## Features
- 🚀 GPU-accelerated sequence generation
- 🧠 Tensor Core optimizations for V100/A100 GPUs
- ⚡ Fast CHMM training with mixed precision
- 📊 Easy visualization and analysis

## 📦 Installation

First, install CSCG-Torch and confirm that the package imports.

In [None]:
# Install CSCG-Torch
!git clone https://github.com/your-repo/cscg-torch.git
%cd cscg-torch
!pip install -e .

# Check installation
import cscg_torch
print(f"✅ CSCG-Torch v{cscg_torch.__version__} installed successfully!")
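
The version check above relies on the package exposing a `__version__` attribute. As a hedged alternative, the sketch below uses only the standard library to report whether a module is importable and which version is recorded in its metadata. `describe_install` is a hypothetical helper, not part of CSCG-Torch, and the distribution name `cscg-torch` is an assumption.

```python
from importlib import metadata, util


def describe_install(module_name, dist_name=None):
    """Return a one-line status string for a top-level module.

    Hypothetical helper for illustration; not part of CSCG-Torch.
    """
    if util.find_spec(module_name) is None:
        return f"{module_name}: not installed"
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under that name
        # (for example a standard-library module).
        version = "unknown"
    return f"{module_name}: installed (version {version})"


# "cscg-torch" as the distribution name is an assumption.
print(describe_install("cscg_torch", dist_name="cscg-torch"))
```

After an editable install in a running kernel, a restart may be needed before the module becomes importable.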

## 🔍 GPU Detection and Optimization

In [None]:
# Check GPU information
device = cscg_torch.detect_optimal_device()
gpu_info = cscg_torch.get_gpu_info(device)

print(f"🖥️  Device: {gpu_info['name']}")
print(f"💾 Memory: {gpu_info.get('memory_gb', 'N/A')} GB")
print(f"⚡ Optimizations: {', '.join(gpu_info['optimizations'])}")

# Get optimal settings for this GPU
gpu_settings = cscg_torch.optimize_for_gpu(device)
print(f"\n🚀 Optimal chunk size: {gpu_settings['chunk_size']:,}")
print(f"🧠 Tensor Cores: {gpu_settings['tensor_cores']}")
print(f"🔢 Mixed Precision: {gpu_settings['mixed_precision']}")
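
The notebook does not show how `detect_optimal_device` or `optimize_for_gpu` decide what a GPU supports. One plausible basis is the CUDA compute capability, which plain PyTorch exposes. The sketch below is an assumption about how such a check could look, not the library's implementation; `describe_capability` is a hypothetical helper.

```python
def describe_capability(major, minor):
    """Map a CUDA compute capability to coarse hardware features."""
    return {
        "compute_capability": f"{major}.{minor}",
        "tensor_cores": major >= 7,  # Volta (7.0, e.g. V100) and newer
        "tf32": major >= 8,          # Ampere (8.0, e.g. A100) and newer
        "bf16": major >= 8,
    }


try:
    import torch
except ImportError:  # keep the sketch usable without PyTorch
    torch = None

if torch is not None and torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name, f"{props.total_memory / 1024**3:.1f} GiB")
    print(describe_capability(props.major, props.minor))
else:
    print("No CUDA device visible; falling back to CPU.")
```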

## 📊 GPU Performance Benchmark

In [None]:
# Benchmark GPU performance
print("🏃‍♂️ Running GPU benchmark...")
benchmark_results = cscg_torch.benchmark_device(device)

print(f"⚡ Performance: {benchmark_results['gflops']:.1f} GFLOPS")
if 'memory_bandwidth_gb_s' in benchmark_results:
    print(f"💾 Memory Bandwidth: {benchmark_results['memory_bandwidth_gb_s']:.1f} GB/s")

# Memory information
mem_info = cscg_torch.get_memory_info(device)
print(f"\n💾 Available Memory: {mem_info['available_gb']:.1f} GB")
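
The notebook does not say how `benchmark_device` arrives at its GFLOPS figure. A common approach, assumed here, is to time a dense matrix multiply and divide the roughly 2n³ floating-point operations by the elapsed time. The sketch below shows that arithmetic with a pure-Python matmul so it runs anywhere; all helper names are hypothetical.

```python
import random
import time


def matmul_gflops(n, seconds):
    """GFLOPS for an (n x n) @ (n x n) product that took `seconds`."""
    return 2 * n**3 / seconds / 1e9


def best_time(fn, repeats=3):
    """Smallest wall-clock time over several runs of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def naive_matmul(a, b):
    """Pure-Python matrix product, used only to have something to time."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


n = 32
rng = random.Random(0)
a = [[rng.random() for _ in range(n)] for _ in range(n)]
b = [[rng.random() for _ in range(n)] for _ in range(n)]
elapsed = best_time(lambda: naive_matmul(a, b))
print(f"{matmul_gflops(n, elapsed):.4f} GFLOPS (pure Python, n={n})")
```

When timing GPU code this way, call `torch.cuda.synchronize()` before reading the clock, because CUDA kernels are launched asynchronously.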

## 🏠 Load Room Data and Create Environment

In [None]:
# Load pre-generated room data
print("🏠 Loading room data...")
available_rooms = cscg_torch.get_available_rooms()
print(f"Available rooms: {list(available_rooms.keys())}")

# Load a 20x20 room for demonstration
room_data = cscg_torch.load_room_data("room_20x20")
print(f"Room shape: {room_data.shape}")

# Get room information
room_info = cscg_torch.room_info(room_data)
print(f"Total cells: {room_info['total_cells']}")
print(f"Free cells: {room_info['free_cells']}")
print(f"Unique observations: {room_info['unique_observations']}")

# Create room adapter
adapter = cscg_torch.create_room_adapter(room_data, adapter_type="torch", seed=42)
print("✅ Room adapter created successfully!")
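
To make the three `room_info` fields concrete, here is a minimal sketch that computes them for a toy grid. It assumes a room is a 2D grid of integer observations with `-1` marking walls; the notebook does not state the encoding, and `summarize_room` is a hypothetical stand-in rather than the library function.

```python
def summarize_room(room, wall=-1):
    """Count cells, free cells, and distinct observations in a grid room."""
    cells = [value for row in room for value in row]
    free = [value for value in cells if value != wall]
    return {
        "total_cells": len(cells),
        "free_cells": len(free),
        "unique_observations": len(set(free)),
    }


# Toy 2x3 room; -1 is assumed to mean "wall".
toy_room = [
    [0, 1, -1],
    [1, 2, 0],
]
print(summarize_room(toy_room))
```

Several free cells share an observation here, which is the aliasing that a CHMM resolves by assigning multiple clones to each observation.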

## 🎨 Visualize Room Layout

In [None]:
# Plot the room layout
import matplotlib.pyplot as plt

fig = cscg_torch.plot_room_layout(
    room_data, 
    title="20x20 Room Navigation Environment",
    colormap='tab20'
)
plt.show()

print("🎨 Room layout visualization complete!")

## ⚡ GPU-Accelerated Sequence Generation

In [None]:
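import time

import torch

# Test different sequence lengths to see GPU acceleration
test_lengths = [10_000, 50_000, 100_000]
generation_times = []
generation_rates = []

print("🚀 Testing GPU sequence generation performance...\n")

for seq_len in test_lengths:
    print(f"Generating {seq_len:,} steps...")

    # CUDA work is asynchronous, so synchronize before reading the clock
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start_time = time.perf_counter()
    x_seq, a_seq = adapter.generate_sequence_gpu(seq_len, device=device)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    end_time = time.perf_counter()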
    
    generation_time = end_time - start_time
    rate = seq_len / generation_time
    
    generation_times.append(generation_time)
    generation_rates.append(rate)
    
    print(f"  ✅ Generated in {generation_time:.2f}s ({rate:,.0f} steps/sec)")
    print(f"  📊 Obs range: [{x_seq.min()}, {x_seq.max()}], Action range: [{a_seq.min()}, {a_seq.max()}]\n")

# x_seq and a_seq still hold the last (largest) sequence from the loop; reuse it for training
print(f"Using {len(x_seq):,} step sequence for CHMM training...")
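
For intuition about what the generated `x_seq` and `a_seq` contain, the sketch below runs a seeded random walk on a toy room on the CPU. The action encoding, the wall value `-1`, and the choice to record the observation after each move are all assumptions; the notebook does not document how `generate_sequence_gpu` does this.

```python
import random

# Assumed action encoding: 0=left, 1=right, 2=up, 3=down.
MOVES = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}


def random_walk(room, length, seed=0, wall=-1):
    """Return (observations, actions) for a random walk in a grid room."""
    rng = random.Random(seed)
    n_rows, n_cols = len(room), len(room[0])
    free_cells = [
        (r, c) for r in range(n_rows) for c in range(n_cols) if room[r][c] != wall
    ]
    r, c = rng.choice(free_cells)
    observations, actions = [], []
    for _ in range(length):
        action = rng.randrange(4)
        dr, dc = MOVES[action]
        nr, nc = r + dr, c + dc
        # Stay in place when the move would leave the grid or hit a wall.
        if 0 <= nr < n_rows and 0 <= nc < n_cols and room[nr][nc] != wall:
            r, c = nr, nc
        observations.append(room[r][c])
        actions.append(action)
    return observations, actions


toy_room = [
    [0, 1, -1],
    [1, 2, 0],
]
obs, acts = random_walk(toy_room, length=10, seed=42)
print(obs)
print(acts)
```

Each step depends on the previous position, so a GPU version cannot simply parallelize over time steps. Pre-sampling all actions in one batch is one possible source of speedup, though the notebook does not say which approach the library takes.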

## 📈 GPU Performance Visualization

In [None]:
# Plot GPU performance scaling
fig = cscg_torch.plot_gpu_performance(
    test_lengths, 
    generation_times,
    device_name=gpu_info['name']
)
plt.show()

print(f"📊 Average generation rate: {sum(generation_rates)/len(generation_rates):,.0f} steps/second")
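
The average printed above is the mean of the per-run rates, which weights a 10,000-step run the same as a 100,000-step run. An aggregate rate (total steps over total time) can differ noticeably. The sketch below contrasts the two; the timings are made up purely for illustration and are not measurements.

```python
def mean_rate(lengths, times):
    """Unweighted mean of per-run rates (what the cell above prints)."""
    rates = [n / t for n, t in zip(lengths, times)]
    return sum(rates) / len(rates)


def aggregate_rate(lengths, times):
    """Total steps divided by total time."""
    return sum(lengths) / sum(times)


lengths = [10_000, 50_000, 100_000]
times = [0.5, 1.0, 4.0]  # made-up timings, for illustration only

print(f"mean of rates:  {mean_rate(lengths, times):,.0f} steps/sec")
print(f"aggregate rate: {aggregate_rate(lengths, times):,.0f} steps/sec")
```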

## 🔍 Sequence Analysis

In [None]:
# Analyze the generated sequences
fig = cscg_torch.plot_sequence_statistics(x_seq, a_seq)
plt.show()

print(f"📈 Sequence statistics plotted for {len(x_seq):,} steps")
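
Alongside the plots, a couple of numeric summaries are easy to compute by hand. The sketch below counts symbol frequencies and computes the empirical entropy of a sequence in bits. It uses toy data and standard-library calls only, and it is not a description of what `plot_sequence_statistics` computes.

```python
import math
from collections import Counter


def empirical_entropy_bits(sequence):
    """Shannon entropy (bits) of the symbol frequencies in a sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequences, for illustration only.
toy_obs = [0, 0, 1, 1, 2, 2, 3, 3]
toy_actions = [0, 1, 2, 3, 0, 1, 2, 3]

print("Observation counts:", Counter(toy_obs).most_common())
print("Action counts:", Counter(toy_actions).most_common())
print(f"Observation entropy: {empirical_entropy_bits(toy_obs):.3f} bits")
```

The observation entropy is the bits-per-step cost of a memoryless model that only knows symbol frequencies, so it gives a rough baseline against which a trained model's BPS can be compared.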

## 🧠 CHMM Training with GPU Optimization

In [None]:
# Setup model parameters
n_clones_per_obs = 150  # Adjust based on GPU memory
n_clones = cscg_torch.get_room_n_clones(n_clones_per_obs=n_clones_per_obs, device=device)

print(f"🧠 Training CHMM with {n_clones.sum().item()} total states...")
print(f"📊 Sequence length: {len(x_seq):,} steps")
print(f"🔧 Using {gpu_info['name']} with {', '.join(gpu_info['optimizations'])}")

# Train the model with GPU optimization
start_time = time.time()

model, progression = cscg_torch.train_chmm(
    n_clones=n_clones,
    x=x_seq,
    a=a_seq,
    device=device,
    method='em_T',
    n_iter=50,
    enable_mixed_precision=gpu_settings['mixed_precision'],
    learn_E=True,
    early_stopping=True,
    seed=42
)

training_time = time.time() - start_time

print(f"\n✅ Training completed in {training_time:.2f} seconds!")
print(f"📈 Final BPS: {progression[-1]:.4f}")
print(f"📊 Total improvement: {progression[0] - progression[-1]:.4f} BPS")
print(f"🎯 Convergence: {len(progression)} iterations")
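
The comment "Adjust based on GPU memory" can be made more concrete with a back-of-the-envelope estimate. The sketch below assumes the model stores a dense transition tensor of shape `(n_actions, n_states, n_states)`, which the notebook does not confirm, and uses a hypothetical room with 16 distinct observations and 4 actions.

```python
def total_states(n_clones_per_obs, n_observations):
    """Total hidden states when every observation gets the same clone count."""
    return n_clones_per_obs * n_observations


def transition_tensor_bytes(n_states, n_actions, bytes_per_value=4):
    """Bytes for a dense (n_actions, n_states, n_states) tensor."""
    return n_actions * n_states * n_states * bytes_per_value


# Hypothetical room: 16 distinct observations, 4 actions, float32 values.
n_states = total_states(150, 16)
size = transition_tensor_bytes(n_states, n_actions=4)
print(f"{n_states} states -> {size / 1024**3:.3f} GiB per transition tensor")
```

Under this assumption memory grows quadratically with `n_clones_per_obs`, and EM training typically needs additional buffers (counts, forward messages) on top of the tensor itself, so treat the estimate as a lower bound.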

## 📊 Training Results Visualization

In [None]:
# Plot training progression
fig = cscg_torch.plot_training_progression(
    progression,
    title=f"CHMM Training on {gpu_info['name']}",
    show_improvement=True
)
plt.show()

print(f"📈 Training progression plotted ({len(progression)} iterations)")
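
The `early_stopping=True` option suggests that training halts once the BPS curve flattens, though the exact criterion is not shown in this notebook. A minimal sketch of one such rule, assuming a fixed tolerance on the per-iteration improvement, is given below; `first_converged_iteration` is a hypothetical helper.

```python
def first_converged_iteration(progression, tol=1e-3):
    """Index of the first iteration whose improvement falls below tol.

    Returns None if every iteration improved by at least tol.
    """
    for i in range(1, len(progression)):
        if progression[i - 1] - progression[i] < tol:
            return i
    return None


# Made-up BPS curve, for illustration only.
toy_progression = [3.0, 2.0, 1.5, 1.499, 1.4989]
print(first_converged_iteration(toy_progression, tol=0.01))
```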

## 🎯 Model Evaluation and Analysis

In [None]:
# Evaluate model performance
import torch

# Convert sequences to tensors for evaluation (as_tensor avoids a copy warning if they are already tensors)
x_tensor = torch.as_tensor(x_seq, device=device, dtype=torch.int64)
a_tensor = torch.as_tensor(a_seq, device=device, dtype=torch.int64)

# Calculate bits per step
final_bps = model.bps(x_tensor, a_tensor, reduce=True)
print(f"🎯 Final Bits Per Step: {final_bps:.4f}")

# Decode optimal state sequence
print("🔍 Decoding optimal state sequence...")
neg_log_lik, states = model.decode(x_tensor, a_tensor)
print(f"📊 MAP negative log-likelihood: {neg_log_lik:.4f}")
print(f"🧠 Decoded {len(states)} states")

# Model statistics
unique_states = torch.unique(states)
print(f"\n📈 Model Statistics:")
print(f"  • Total possible states: {n_clones.sum().item()}")
print(f"  • Unique states used: {len(unique_states)}")
print(f"  • State utilization: {len(unique_states)/n_clones.sum().item()*100:.1f}%")
print(f"  • Relative BPS reduction: {(progression[0]-progression[-1])/progression[0]*100:.1f}%")
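
The notebook reports bits per step (BPS) without defining it. Assuming it means the average negative log2 probability the model assigns to each observation, the arithmetic can be sketched as follows. Both helpers are hypothetical and operate on plain Python floats, not on the model.

```python
import math


def bits_per_step(step_probabilities):
    """Average of -log2(p_t) over the per-step predictive probabilities."""
    return -sum(math.log2(p) for p in step_probabilities) / len(step_probabilities)


def relative_improvement(first_bps, last_bps):
    """Percentage reduction in BPS from the first to the last iteration."""
    return (first_bps - last_bps) / first_bps * 100


# Toy probabilities: a step predicted with p=0.5 costs 1 bit, p=0.25 costs 2 bits.
print(bits_per_step([0.5, 0.25]))
print(relative_improvement(4.0, 1.0))
```

Lower BPS means the model predicts the sequence better; a value of 0 would correspond to predicting every observation with certainty.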

## 💾 Memory Usage Analysis

In [None]:
# Check final memory usage
final_mem_info = cscg_torch.get_memory_info(device)

print("💾 Final Memory Usage:")
print(f"  • Used: {final_mem_info['used_gb']:.1f} GB")
print(f"  • Available: {final_mem_info['available_gb']:.1f} GB")
print(f"  • Cached: {final_mem_info['cached_gb']:.1f} GB")

# Performance summary
print(f"\n🚀 Performance Summary:")
print(f"  • GPU: {gpu_info['name']}")
print(f"  • Sequence generation: {generation_rates[-1]:,.0f} steps/sec")
print(f"  • Training time: {training_time:.1f} seconds")
print(f"  • Model performance: {final_bps:.4f} BPS")
print(f"  • Memory in use at end of run: {final_mem_info['used_gb']:.1f} GB")
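
A snapshot taken after training shows current usage, not the peak. PyTorch tracks the peak separately, and the sketch below reads it with standard `torch.cuda` calls. `cuda_memory_report` is a hypothetical helper, and how its fields map to the keys returned by `cscg_torch.get_memory_info` is not specified in this notebook.

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3


def cuda_memory_report(device_index=0):
    """Return CUDA memory figures in GiB, or None without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    return {
        "allocated_gib": bytes_to_gib(torch.cuda.memory_allocated(device_index)),
        "peak_allocated_gib": bytes_to_gib(torch.cuda.max_memory_allocated(device_index)),
        "reserved_gib": bytes_to_gib(torch.cuda.memory_reserved(device_index)),
        "free_gib": bytes_to_gib(free_bytes),
        "total_gib": bytes_to_gib(total_bytes),
    }


print(cuda_memory_report())
```

Calling `torch.cuda.reset_peak_memory_stats()` before training makes the peak figure reflect the training run alone.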

## 🎉 Conclusion

You've successfully:

✅ **Installed CSCG-Torch** with GPU optimization  
✅ **Detected and optimized** for your specific GPU  
✅ **Generated sequences** at high speed using GPU acceleration  
✅ **Trained a CHMM** with mixed precision and Tensor Cores where your GPU supports them  
✅ **Visualized results** and analyzed performance  

### Next Steps

- 🔬 **Experiment** with different room sizes and parameters
- 📊 **Scale up** to larger sequences (1M+ steps) 
- 🧠 **Try different models** and training methods
- 🎯 **Build custom environments** for your own research

### Resources

- 📖 [Documentation](https://github.com/your-repo/cscg-torch)
- 🐛 [Report Issues](https://github.com/your-repo/cscg-torch/issues)
- 💬 [Discussions](https://github.com/your-repo/cscg-torch/discussions)

Happy researching! 🚀