# LayerLens Quick Start on Google Colab

This notebook demonstrates how to use LayerLens on Google Colab with GPU acceleration.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ErenAta16/LayerLens/blob/main/notebooks/colab_quick_start.ipynb)

## What You'll Learn
- Install LayerLens on Colab
- Verify GPU availability
- Run a simple BERT profiling example
- Generate an optimization manifest

In [None]:
# Install LayerLens
!git clone https://github.com/ErenAta16/LayerLens.git
%cd LayerLens
!pip install -e ".[demo]" -q

print("✅ LayerLens installed successfully!")

In [None]:
# Verify installation and GPU
import torch
import layerlens

# Report "unknown" rather than a guessed version if the package does not expose __version__
print(f"LayerLens version: {getattr(layerlens, '__version__', 'unknown')}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠️ No GPU detected. Using CPU (slower performance).")

In [None]:
from pathlib import Path
from layerlens.pipeline import run_pipeline
from layerlens.config import ProfilingConfig, OptimizationConfig, LatencyProfile
from layerlens.models import ModelSpec, LayerSpec

# Define a simple model (BERT-base structure)
model_spec = ModelSpec(
    model_name="bert-base-example",
    total_params=110_000_000,
    layers=[
        LayerSpec(
            name=f"encoder.layer.{i}",
            hidden_size=768,
            layer_type="transformer",
            supports_attention=True
        )
        for i in range(12)
    ]
)

# Configure profiling
profiling_cfg = ProfilingConfig(
    metric_weights={
        "gradient_energy": 0.4,
        "fisher": 0.4,
        "proxy_eval": 0.2
    }
)

# Configure optimization with GPU latency profile
latency_profile = LatencyProfile(
    device_type="gpu",
    model_family="llm",
    batch_size=4,
    sequence_length=512
)

optimization_cfg = OptimizationConfig(
    max_trainable_params=50_000,
    max_flops=1e9,
    max_vram_gb=15.0,  # stay within the memory of a typical Colab GPU; adjust to the value reported above
    latency_target_ms=100.0,
    latency_profile=latency_profile
)

# Create a synthetic activation cache (in real use, compute these statistics from the model)
activation_cache = {
    f"encoder.layer.{i}": {
        "grad_norm": 0.5 + i * 0.1,
        "fisher_trace": 0.3 + i * 0.05,
        "proxy_gain": 0.1 + i * 0.02
    }
    for i in range(12)
}

# Run pipeline
output_dir = Path("./output")
print("Running LayerLens pipeline...")
manifest_path = run_pipeline(
    model_spec=model_spec,
    profiling_cfg=profiling_cfg,
    optimization_cfg=optimization_cfg,
    activation_cache=activation_cache,
    output_dir=output_dir
)

print("\n✅ Optimization complete!")
print(f"📄 Manifest saved to: {manifest_path}")
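
To build intuition for how `metric_weights` and the activation cache might interact, here is a minimal sketch that combines the three cached statistics into a single per-layer score. It is not LayerLens's actual scoring code: the mapping from metric names to cache keys (`METRIC_TO_CACHE_KEY`), the min-max normalization, and the helper names are all assumptions made for illustration.

```python
# Hypothetical mapping from metric names to activation-cache keys (an assumption,
# not taken from the LayerLens source).
METRIC_TO_CACHE_KEY = {
    "gradient_energy": "grad_norm",
    "fisher": "fisher_trace",
    "proxy_eval": "proxy_gain",
}


def normalize(values):
    """Min-max scale a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(cache, weights):
    """Return {layer_name: weighted sum of normalized statistics}."""
    layers = list(cache)
    scores = {name: 0.0 for name in layers}
    for metric, weight in weights.items():
        key = METRIC_TO_CACHE_KEY[metric]
        normed = normalize([cache[name][key] for name in layers])
        for name, value in zip(layers, normed):
            scores[name] += weight * value
    return scores


# Same synthetic statistics and weights as in the cell above.
cache = {
    f"encoder.layer.{i}": {
        "grad_norm": 0.5 + i * 0.1,
        "fisher_trace": 0.3 + i * 0.05,
        "proxy_gain": 0.1 + i * 0.02,
    }
    for i in range(12)
}
weights = {"gradient_energy": 0.4, "fisher": 0.4, "proxy_eval": 0.2}

scores = weighted_scores(cache, weights)
ranked = sorted(scores, key=scores.get, reverse=True)
print("Highest-scoring layers:", ranked[:3])
```

Because every synthetic statistic grows with the layer index, this sketch ranks the deepest layers highest. Real activation statistics would generally not be monotonic like this.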

In [None]:
import json

# Load and display the manifest
with open(manifest_path, 'r') as f:
    manifest = json.load(f)

print("=" * 60)
print("LAYERLENS OPTIMIZATION RESULTS")
print("=" * 60)

allocations = manifest['allocations']
print(f"\nTotal layers: {len(allocations)}")

# Summary statistics
lora_layers = sum(1 for a in allocations if a['method'] == 'lora')
adapter_layers = sum(1 for a in allocations if a['method'] == 'adapter')
prefix_layers = sum(1 for a in allocations if a['method'] == 'prefix')
none_layers = sum(1 for a in allocations if a['method'] == 'none')

print("\nMethod Distribution:")
print(f"  LoRA: {lora_layers} layers")
print(f"  Adapter: {adapter_layers} layers")
print(f"  Prefix: {prefix_layers} layers")
print(f"  None: {none_layers} layers")

# Show top 5 layers by utility
print("\nTop 5 Layers by Utility:")
sorted_allocs = sorted(allocations, key=lambda x: x['utility'], reverse=True)
for i, alloc in enumerate(sorted_allocs[:5], 1):
    print(f"  {i}. {alloc['layer']}: {alloc['method']} (rank={alloc['rank']}, utility={alloc['utility']:.4f})")

print("\n" + "=" * 60)
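
The summary above counts four hard-coded method names. If a manifest contains other method names, a `collections.Counter` will pick them up without further changes. The sketch below assumes each allocation is a dict with the `layer`, `method`, `rank`, and `utility` keys used in the cell above. The `sample_allocations` data and both helper functions are hypothetical and exist only so the example runs on its own.

```python
from collections import Counter

# Hypothetical allocations for illustration only; a real manifest comes from run_pipeline.
sample_allocations = [
    {"layer": "encoder.layer.11", "method": "lora", "rank": 8, "utility": 0.91},
    {"layer": "encoder.layer.10", "method": "lora", "rank": 4, "utility": 0.84},
    {"layer": "encoder.layer.9", "method": "adapter", "rank": 0, "utility": 0.62},
    {"layer": "encoder.layer.0", "method": "none", "rank": 0, "utility": 0.05},
]


def method_distribution(allocations):
    """Count how many layers were assigned to each method."""
    return Counter(a["method"] for a in allocations)


def top_layers(allocations, k=5):
    """Return the k allocations with the highest utility."""
    return sorted(allocations, key=lambda a: a["utility"], reverse=True)[:k]


for method, count in method_distribution(sample_allocations).most_common():
    print(f"  {method}: {count} layers")

for alloc in top_layers(sample_allocations, k=2):
    print(f"  {alloc['layer']}: {alloc['method']} (utility={alloc['utility']:.4f})")
```

To apply it to the real results, pass `manifest['allocations']` in place of `sample_allocations`.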

## 🎉 Success!

You've successfully run LayerLens on Google Colab!

### Next Steps
1. **Try with a real model**: See `demos/demo_bert.py` for loading actual BERT models
2. **Experiment with configurations**: Adjust `max_trainable_params`, `latency_target_ms`, etc.
3. **Explore the YOLO demo**: Check `demos/demo_yolo.py` for vision model optimization
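
When adjusting `max_trainable_params`, it helps to know roughly how many parameters one low-rank update costs. A standard LoRA update on a `d_in x d_out` weight matrix at rank `r` adds `r * (d_in + d_out)` trainable parameters. The sketch below applies that formula to the 768-wide layers and the 50,000-parameter budget used in this notebook. How LayerLens itself counts parameters against the budget is not shown here, so treat the helper functions as illustrative assumptions.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters added by one LoRA update: A is (d_in x r), B is (r x d_out)."""
    return rank * (d_in + d_out)


def max_adapted_matrices(budget, hidden_size, rank):
    """How many square hidden_size x hidden_size matrices fit in the budget at this rank."""
    return budget // lora_params(hidden_size, hidden_size, rank)


hidden_size = 768   # matches the LayerSpec entries in this notebook
budget = 50_000     # matches max_trainable_params in this notebook

for rank in (1, 2, 4, 8):
    per_matrix = lora_params(hidden_size, hidden_size, rank)
    count = max_adapted_matrices(budget, hidden_size, rank)
    print(f"rank={rank}: {per_matrix} params per matrix, up to {count} matrices")
```

Under this formula, doubling the rank halves the number of matrices that fit in a fixed budget, which is the trade-off to keep in mind when experimenting.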

### Resources
- 📖 [Documentation](https://github.com/ErenAta16/LayerLens/tree/main/docs)
- 🐛 [Troubleshooting](https://github.com/ErenAta16/LayerLens/blob/main/COLAB_TROUBLESHOOT.md)
- 💬 [Issues](https://github.com/ErenAta16/LayerLens/issues)