# 🎨 Glimpse3D Diffusion Enhancement Module

This notebook tests the `ai_modules/diffusion/` module for enhancing 3D rendered views using **SDXL Lightning + ControlNet Depth**.

## 🖥️ VS Code Colab Extension Setup

**You're using this notebook with the VS Code Colab Extension!** Here's what you need to know:

### How It Works
1. **Kernel runs on Colab servers** (with GPU) - NOT your local machine
2. **Local files are NOT automatically available** - You need to either:
   - Clone the repo in the Colab runtime (recommended)
   - Upload files manually via right-click → "Upload to Colab Session"
   - Mount Google Drive for persistent storage
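
Because the kernel and the editor live on different machines, it can help to confirm where a cell actually executes. The following is a minimal sketch, assuming that the `google.colab` package is normally importable only on a Colab runtime; the helper name `running_on_colab` is hypothetical and not part of Glimpse3D.

```python
import importlib.util
import os


def running_on_colab() -> bool:
    """Best-effort check: google.colab is normally importable only on Colab."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent "google" package is missing or malformed.
        return False


where = "Colab runtime" if running_on_colab() else "local machine"
print(f"Kernel is running on: {where}")
print(f"Working directory: {os.getcwd()}")
```

If this reports the local machine, the kernel selection in VS Code is probably still pointing at a local interpreter.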

### To Connect:
1. Click `Select Kernel` → `Colab` → `New Colab Server`
2. Choose runtime type (T4 GPU recommended)
3. Run the setup cells below

---

## Features
- **SDXL Lightning**: 4-step inference using UNet checkpoints (recommended by ByteDance)
- **ControlNet Depth**: Structure preservation using depth from `midas_depth` module
- **T4 GPU Optimized**: Memory optimizations for 15GB VRAM

## Pipeline Role
```
SyncDreamer (16 views) → 3DGS → Render → [This Module] → Refined Views → Back to 3DGS
```

## 1️⃣ Setup & Installation

**Important**: Since the Colab runtime is remote, we need to:
1. Clone the Glimpse3D repository into the Colab environment
2. Install dependencies on the remote runtime
3. Set up caching to avoid re-downloading models
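
After installing, it may be worth checking that the remote runtime really has the minimum versions this notebook asks for. The sketch below uses only the standard library; the helper names are hypothetical, and the version comparison is deliberately crude (it looks at leading digits only), so `packaging.version` would be a more robust choice where it is available.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.25.1' or '4.36.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # Stop at the first non-numeric component (e.g. 'dev0').
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(package: str, minimum: str) -> bool:
    """Return True if `package` is installed at `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


# Minimums taken from the install cell in this notebook.
MINIMUMS = {"diffusers": "0.25.0", "transformers": "4.36.0", "accelerate": "0.25.0"}
for name, minimum in MINIMUMS.items():
    status = "ok" if meets_minimum(name, minimum) else "missing or too old"
    print(f"{name}>={minimum}: {status}")
```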

In [None]:
# Check GPU availability
!nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv

In [None]:
# Install dependencies (run once)
# Quote the version specifiers: an unquoted ">=" is treated by the shell as output redirection
!pip install -q "diffusers>=0.25.0" "transformers>=4.36.0" "accelerate>=0.25.0"
!pip install -q xformers  # Memory-efficient attention
!pip install -q huggingface_hub safetensors
!pip install -q timm scipy  # For MiDaS depth

print("✅ Dependencies installed!")

In [None]:
# Clone Glimpse3D repository into Colab runtime
# NOTE: This clones to the COLAB SERVER, not your local machine!
import os
import sys

# Define paths
COLAB_ROOT = "/content"
REPO_PATH = f"{COLAB_ROOT}/Glimpse-3D"

# Clone if not exists
if not os.path.exists(REPO_PATH):
    print("📥 Cloning Glimpse3D repository to Colab runtime...")
    !git clone https://github.com/varunaditya27/Glimpse3D.git {REPO_PATH}
    print("✅ Repository cloned!")
else:
    print("✅ Repository already exists in Colab runtime")
    # Optionally pull latest changes
    # !cd {REPO_PATH} && git pull

# Add to Python path
if REPO_PATH not in sys.path:
    sys.path.insert(0, REPO_PATH)

# Change working directory
os.chdir(REPO_PATH)
print(f"📂 Working directory: {os.getcwd()}")
print(f"📂 Files in ai_modules/: {os.listdir('ai_modules') if os.path.exists('ai_modules') else 'NOT FOUND'}")

In [None]:
# Configure caching and environment
# This persists models across sessions if using Google Drive
import os

# Option 1: Use Colab's /content directory (lost on disconnect)
os.environ["HF_HOME"] = "/content/hf_cache"
# Legacy variable kept for older transformers releases; newer ones read HF_HOME
os.environ["TRANSFORMERS_CACHE"] = "/content/hf_cache"
os.makedirs("/content/hf_cache", exist_ok=True)

# Option 2: Mount Google Drive for persistent cache (recommended for large models)
# Uncomment below to use Google Drive:
# from google.colab import drive
# drive.mount('/content/drive')
# os.environ["HF_HOME"] = "/content/drive/MyDrive/hf_cache"
# os.makedirs(os.environ["HF_HOME"], exist_ok=True)

print(f"✅ Cache directory: {os.environ['HF_HOME']}")
print("💡 Tip: Mount Google Drive (uncomment above) to persist models across sessions")

## 2️⃣ Test Module Imports

In [None]:
# Test imports from diffusion module
try:
    from ai_modules.diffusion import (
        EnhanceService,
        EnhanceConfig,
        enhance_view,
        MemoryConfig,
        get_memory_status,
        print_memory_report,
        PromptBuilder,
    )
    print("✅ Diffusion module imported successfully!")
except ImportError as e:
    print(f"❌ Import error: {e}")

In [None]:
# Test imports from midas_depth module
try:
    from ai_modules.midas_depth import (
        estimate_depth,
        estimate_depth_confidence,
        save_depth_visualization,
        DepthEstimator,
    )
    print("✅ MiDaS depth module imported successfully!")
except ImportError as e:
    print(f"❌ Import error: {e}")

In [None]:
# Check GPU memory before loading models
print_memory_report()

## 3️⃣ Get Test Image

**Option A**: Download a sample image from the web  
**Option B**: Upload your own local image using VS Code Colab extension:
  - Right-click your local file in VS Code Explorer
  - Select "Upload to Colab Session"
  - File will appear in `/content/`

**Option C**: Mount Google Drive and use images from there

In [None]:
import os
import urllib.request
from PIL import Image
import matplotlib.pyplot as plt

# Create test directory in Colab runtime
os.makedirs('/content/test_images', exist_ok=True)
os.makedirs('/content/outputs', exist_ok=True)

# ============================================================
# CHOOSE YOUR IMAGE SOURCE:
# ============================================================

# Option A: Download sample image from web
test_image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
test_image_path = "/content/test_images/test_render.png"

if not os.path.exists(test_image_path):
    print("📥 Downloading sample test image...")
    urllib.request.urlretrieve(test_image_url, test_image_path)
    print(f"✅ Downloaded to: {test_image_path}")
else:
    print(f"✅ Using existing: {test_image_path}")

# Option B: Use uploaded file (via VS Code right-click → "Upload to Colab Session")
# Uncomment and modify path:
# test_image_path = "/content/your_uploaded_image.png"

# Option C: Use file from mounted Google Drive
# test_image_path = "/content/drive/MyDrive/your_image.png"

# ============================================================

# Display test image
print(f"\n📂 Test image path: {test_image_path}")
test_img = Image.open(test_image_path)
plt.figure(figsize=(6, 6))
plt.imshow(test_img)
plt.title(f"Test Input Image\nSize: {test_img.size}")
plt.axis('off')
plt.show()

## 4️⃣ Test MiDaS Depth Estimation

First, let's verify the `midas_depth` module integration works.

In [None]:
import numpy as np

# Estimate depth using midas_depth module
print("🔍 Estimating depth...")
depth_map = estimate_depth(test_image_path, model_type="MiDaS_small")

print(f"✅ Depth map shape: {depth_map.shape}")
print(f"   Depth range: [{depth_map.min():.4f}, {depth_map.max():.4f}]")

# Visualize depth
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

axes[0].imshow(test_img)
axes[0].set_title("Input Image")
axes[0].axis('off')

axes[1].imshow(depth_map, cmap='magma')
axes[1].set_title("Depth Map (MiDaS)")
axes[1].axis('off')

plt.tight_layout()
plt.show()

In [None]:
# Test depth confidence estimation
rgb_array = np.array(test_img)
confidence = estimate_depth_confidence(depth_map, rgb_array)

print(f"✅ Confidence map shape: {confidence.shape}")
print(f"   Confidence range: [{confidence.min():.4f}, {confidence.max():.4f}]")

# Visualize confidence
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

axes[0].imshow(test_img)
axes[0].set_title("Input")
axes[0].axis('off')

axes[1].imshow(depth_map, cmap='magma')
axes[1].set_title("Depth")
axes[1].axis('off')

axes[2].imshow(confidence, cmap='viridis')
axes[2].set_title("Confidence (bright=reliable)")
axes[2].axis('off')

plt.tight_layout()
plt.show()

## 5️⃣ Load Enhancement Service

Now let's load the SDXL Lightning + ControlNet pipeline.

**⚠️ This downloads ~10GB of models on first run!**
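
Given the size of the download, a quick free-space check before loading can save a failed run. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the 10 GiB requirement simply mirrors the rough figure quoted above.

```python
import shutil
from pathlib import Path


def nearest_existing_dir(path: str) -> Path:
    """Walk up from `path` until something that exists is found."""
    candidate = Path(path).resolve()
    while not candidate.exists():
        candidate = candidate.parent
    return candidate


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has `required_gb` GiB free."""
    free_gb = shutil.disk_usage(nearest_existing_dir(path)).free / 1024**3
    print(f"Free space near {path}: {free_gb:.1f} GiB (need {required_gb:.1f} GiB)")
    return free_gb >= required_gb


# "/content/hf_cache" is the cache path configured earlier in this notebook.
if not has_free_space("/content/hf_cache", required_gb=10):
    print("Not enough room for the model download; free up space or use Drive.")
```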

In [None]:
# Create configuration for T4 GPU
config = EnhanceConfig.for_t4_gpu()

print("Enhancement Configuration:")
print(f"  Device: {config.device}")
print(f"  Lightning steps: {config.lightning_steps}")
print(f"  ControlNet: {config.use_controlnet}")
print(f"  Strength: {config.strength}")
print(f"  Memory optimization: {config.optimize_memory}")

In [None]:
%%time
# Load the enhancement service (downloads models automatically)
# This may take 5-10 minutes on first run

service = EnhanceService(config=config)
service.load()

print("\n" + "="*50)
print("✅ Enhancement service loaded!")
print("="*50)

In [None]:
# Check memory after loading
print_memory_report()

## 6️⃣ Test Enhancement

Let's enhance our test image with different settings.

In [None]:
%%time
# Basic enhancement with auto-depth
enhanced = service.enhance(
    image=test_image_path,
    prompt="high quality 3D render, detailed texture, photorealistic, studio lighting",
    seed=42  # For reproducibility
)

print("✅ Enhancement complete!")
print(f"   Output size: {enhanced.size}")

In [None]:
# Compare original vs enhanced
fig, axes = plt.subplots(1, 2, figsize=(14, 7))

axes[0].imshow(test_img)
axes[0].set_title("Original", fontsize=14)
axes[0].axis('off')

axes[1].imshow(enhanced)
axes[1].set_title("Enhanced (SDXL Lightning + ControlNet)", fontsize=14)
axes[1].axis('off')

plt.tight_layout()
plt.show()

## 7️⃣ Test with Pre-computed Depth

Using depth from `midas_depth` module for better control.
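
How `EnhanceService` converts a raw depth array into a ControlNet conditioning image is not shown in this notebook. As an illustration only, one common approach is to min-max normalize the depth map and replicate it across three channels; the function name `depth_to_control_image` below is hypothetical.

```python
import numpy as np


def depth_to_control_image(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize an HxW depth map into an HxWx3 uint8 image."""
    depth = depth.astype(np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    if hi - lo < 1e-8:
        # A constant depth map carries no structure; return all zeros.
        norm = np.zeros_like(depth)
    else:
        norm = (depth - lo) / (hi - lo)
    gray = np.round(norm * 255.0).astype(np.uint8)
    # ControlNet depth inputs are usually 3-channel, so repeat the gray plane.
    return np.stack([gray, gray, gray], axis=-1)


demo_depth = np.array([[0.0, 5.0], [10.0, 2.5]])
control = depth_to_control_image(demo_depth)
print(control.shape, control.dtype)
```

Passing the pre-computed map also means the same depth can be reused across several enhancement runs with different prompts or strengths.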

In [None]:
%%time
# Enhancement with pre-computed depth (skips auto-depth)
enhanced_with_depth = service.enhance(
    image=test_image_path,
    depth_map=depth_map,  # From midas_depth
    prompt="photorealistic 3D model, detailed surface texture, professional rendering",
    controlnet_scale=0.6,  # Stronger structure preservation
    strength=0.7,  # Less change from original
    seed=42
)

print("✅ Enhancement with pre-computed depth complete!")

In [None]:
# Compare all versions
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

axes[0].imshow(test_img)
axes[0].set_title("Original", fontsize=12)
axes[0].axis('off')

axes[1].imshow(enhanced)
axes[1].set_title("Enhanced (auto-depth)", fontsize=12)
axes[1].axis('off')

axes[2].imshow(enhanced_with_depth)
axes[2].set_title("Enhanced (pre-computed depth, stronger control)", fontsize=12)
axes[2].axis('off')

plt.tight_layout()
plt.show()

## 8️⃣ Test Confidence-Weighted Blending

Blend enhanced and original based on depth confidence.
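
The exact blending rule inside `enhance_with_depth_confidence` is not shown here. The sketch below illustrates the general idea under an assumed rule: pixels at or below the threshold keep the original, and the enhanced image is mixed in linearly as confidence rises toward 1. The function `blend_by_confidence` is hypothetical.

```python
import numpy as np


def blend_by_confidence(original, enhanced, confidence, threshold=0.5):
    """Blend two HxWx3 uint8 images using an HxW confidence map in [0, 1]."""
    conf = np.clip(confidence.astype(np.float32), 0.0, 1.0)
    # Assumed ramp: weight 0 at/below the threshold, weight 1 at confidence 1.
    weight = np.clip((conf - threshold) / max(1.0 - threshold, 1e-6), 0.0, 1.0)
    weight = weight[..., None]  # Broadcast over the color channels.
    mixed = weight * enhanced.astype(np.float32) \
        + (1.0 - weight) * original.astype(np.float32)
    return np.round(mixed).astype(np.uint8)


orig_px = np.zeros((1, 3, 3), dtype=np.uint8)
enh_px = np.full((1, 3, 3), 200, dtype=np.uint8)
conf_px = np.array([[0.2, 0.75, 1.0]])
blended = blend_by_confidence(orig_px, enh_px, conf_px, threshold=0.5)
print(blended[0, :, 0])
```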

In [None]:
%%time
# Enhancement with confidence-weighted blending
# Preserves original in low-confidence (uncertain depth) regions
enhanced_blended = service.enhance_with_depth_confidence(
    image=test_image_path,
    prompt="high quality 3D render, detailed texture",
    blend_with_original=True,
    confidence_threshold=0.5,
    seed=42
)

print("✅ Confidence-weighted enhancement complete!")

In [None]:
# Compare blended vs full enhancement
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

axes[0].imshow(test_img)
axes[0].set_title("Original", fontsize=12)
axes[0].axis('off')

axes[1].imshow(enhanced)
axes[1].set_title("Full Enhancement", fontsize=12)
axes[1].axis('off')

axes[2].imshow(enhanced_blended)
axes[2].set_title("Confidence-Weighted Blend", fontsize=12)
axes[2].axis('off')

plt.tight_layout()
plt.show()

## 9️⃣ Test Prompt Templates

In [None]:
# List available templates
builder = PromptBuilder(template="default")
print("Available prompt templates:")
for template in builder.list_templates():
    info = builder.get_template_info(template)
    print(f"  • {template}: {info['base_prompt'][:50]}...")

In [None]:
# Build prompt using template
builder = PromptBuilder(template="photorealistic")
prompt, negative = builder.build(
    subject="a detailed 3D model",
    extra_modifiers=["soft shadows", "ambient occlusion"]
)

print("Generated Prompt:")
print(f"  Positive: {prompt}")
print(f"  Negative: {negative}")

## 🔟 Test Batch Enhancement

Simulate enhancing multiple views (as in the pipeline).
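
For readers unfamiliar with the progress-callback pattern used in the batch cell, here is a minimal stand-alone sketch of a sequential batch loop. It is not the implementation of `enhance_batch`; `enhance_views` and the stand-in `enhance_one` callable are hypothetical, and only the `(current, total)` callback shape is taken from this notebook.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def enhance_views(
    views: Iterable[T],
    enhance_one: Callable[[T, Optional[int]], T],
    progress_callback: Optional[Callable[[int, int], None]] = None,
    seed: Optional[int] = None,
) -> List[T]:
    """Apply `enhance_one` to each view in order, reporting progress."""
    views = list(views)
    results = []
    for index, view in enumerate(views, start=1):
        if progress_callback is not None:
            progress_callback(index, len(views))
        # The same seed is reused for every view in this sketch.
        results.append(enhance_one(view, seed))
    return results


# Stand-in "enhancer" that only tags the input, so the loop runs without a GPU.
calls = []
out = enhance_views(
    ["view_a", "view_b"],
    enhance_one=lambda view, seed: f"{view}-enhanced",
    progress_callback=lambda current, total: calls.append((current, total)),
    seed=42,
)
print(out)
```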

In [None]:
# Create multiple test images (simulating rendered views)
from PIL import ImageEnhance

test_images = []
for i in range(4):
    # Create variations to simulate different views
    img = test_img.copy()
    enhancer = ImageEnhance.Brightness(img)
    img = enhancer.enhance(0.9 + i * 0.1)  # Vary brightness
    test_images.append(img)

print(f"Created {len(test_images)} test images")

In [None]:
%%time
# Batch enhancement with progress callback
def progress_callback(current, total):
    print(f"  Processing view {current}/{total}...")

print("Starting batch enhancement...")
enhanced_batch = service.enhance_batch(
    images=test_images,
    prompt="high quality 3D render, detailed texture",
    progress_callback=progress_callback,
    seed=42
)

print(f"\n✅ Batch enhancement complete! {len(enhanced_batch)} images processed.")

In [None]:
# Display batch results
fig, axes = plt.subplots(2, 4, figsize=(16, 8))

for i in range(4):
    axes[0, i].imshow(test_images[i])
    axes[0, i].set_title(f"Original {i+1}")
    axes[0, i].axis('off')
    
    axes[1, i].imshow(enhanced_batch[i])
    axes[1, i].set_title(f"Enhanced {i+1}")
    axes[1, i].axis('off')

plt.suptitle("Batch Enhancement Results", fontsize=14)
plt.tight_layout()
plt.show()

## 1️⃣1️⃣ Save Results

In [None]:
# Save enhanced images to Colab runtime
output_dir = "/content/outputs"
os.makedirs(output_dir, exist_ok=True)

enhanced.save(f"{output_dir}/enhanced_basic.png")
enhanced_with_depth.save(f"{output_dir}/enhanced_with_depth.png")
enhanced_blended.save(f"{output_dir}/enhanced_blended.png")

for i, img in enumerate(enhanced_batch):
    img.save(f"{output_dir}/enhanced_batch_{i+1}.png")

# Save comparison
from ai_modules.diffusion.image_utils import save_comparison
save_comparison(
    original=test_img,
    enhanced=enhanced,
    output_path=f"{output_dir}/comparison.png",
    depth=depth_map
)

print(f"✅ Results saved to {output_dir}/")
!ls -la {output_dir}

# ============================================================
# DOWNLOAD RESULTS TO LOCAL MACHINE:
# ============================================================
print("\n" + "="*60)
print("📥 TO DOWNLOAD RESULTS TO YOUR LOCAL MACHINE:")
print("="*60)
print("Option 1: In VS Code, use Command Palette → 'Colab: Download File'")
print("Option 2: Copy to Google Drive:")
print(f"          !cp -r {output_dir}/* /content/drive/MyDrive/glimpse3d_outputs/")
print("Option 3: Use the code cell below to create a zip file")
print("="*60)

In [None]:
# Optional: Create a zip file for easy download
import shutil

zip_path = "/content/enhanced_outputs"
shutil.make_archive(zip_path, 'zip', output_dir)
print(f"✅ Created: {zip_path}.zip")
print("📥 Download via VS Code: Right-click the file in Colab Files panel")

# Optional: Copy to Google Drive (uncomment if Drive is mounted)
# !cp {zip_path}.zip /content/drive/MyDrive/
# print("✅ Copied to Google Drive!")

## 1️⃣2️⃣ Cleanup

In [None]:
# Unload models to free GPU memory
service.unload()

# Check memory after unload
print_memory_report()

## 📋 Summary

### VS Code Colab Extension Notes:
- ✅ **Kernel runs on Colab servers** with GPU access
- ✅ **Clone repo to runtime** - Required since local files aren't synced
- ✅ **Upload files manually** - Right-click → "Upload to Colab Session"
- ✅ **Mount Google Drive** - For persistent storage across sessions
- ✅ **Download results** - Via Command Palette or copy to Drive

### What We Tested:
1. ✅ Module imports
2. ✅ MiDaS depth estimation integration
3. ✅ Depth confidence estimation
4. ✅ SDXL Lightning + ControlNet enhancement
5. ✅ Enhancement with pre-computed depth
6. ✅ Confidence-weighted blending
7. ✅ Prompt templates
8. ✅ Batch enhancement

### Performance on T4 GPU:
- Model loading: ~3-5 minutes (first time)
- Per-image enhancement: ~8-10 seconds
- VRAM usage: ~12GB
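
As a rough worked example of what these figures mean for the pipeline, the arithmetic below estimates the time to enhance all 16 SyncDreamer views. It assumes views are processed one after another at the per-image rate quoted above and ignores model loading; `batch_time_range` is a hypothetical helper.

```python
def batch_time_range(num_views: int, low_s: float = 8.0, high_s: float = 10.0):
    """Return the (low, high) estimate in seconds for sequential processing."""
    return num_views * low_s, num_views * high_s


low, high = batch_time_range(16)
print(f"16 views: about {low:.0f}-{high:.0f} s "
      f"({low / 60:.1f}-{high / 60:.1f} min)")
```

That works out to roughly 128 to 160 seconds per refinement pass on a T4, before any back-projection cost.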

### Pipeline Integration:
```python
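from ai_modules.midas_depth import estimate_depth
from ai_modules.diffusion import EnhanceConfig, EnhanceService

# Load the service once, outside the loop
service = EnhanceService(config=EnhanceConfig.for_t4_gpu())
service.load()

# In the refinement loop (rendered_views comes from the 3DGS renderer):
for view in rendered_views:
    depth = estimate_depth(view)
    enhanced = service.enhance(view, depth_map=depth)
    # Back-project enhanced view to 3DGS...

service.unload()  # Free GPU memory when done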
```

### Troubleshooting:
- **"Module not found"**: Make sure you ran the clone cell first
- **"File not found"**: Upload file via right-click or check path
- **OOM errors**: Restart runtime and use `config.optimize_memory = True`
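
When chasing OOM errors, it can help to turn the `nvidia-smi` CSV output from the first setup cell into numbers. The sketch below parses that CSV format with the standard library; the sample text is illustrative rather than captured from a real run, and `parse_gpu_memory` is a hypothetical helper.

```python
import csv
import io

# Illustrative sample of:
#   nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
SAMPLE = """name, memory.total [MiB], memory.free [MiB]
Tesla T4, 15360 MiB, 15101 MiB
"""


def parse_gpu_memory(csv_text: str) -> list:
    """Parse nvidia-smi CSV output into a list of per-GPU dictionaries."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [row for row in reader if row]
    gpus = []
    for name, total, free in rows[1:]:  # rows[0] is the header line.
        gpus.append({
            "name": name,
            "total_mib": int(total.split()[0]),  # "15360 MiB" -> 15360
            "free_mib": int(free.split()[0]),
        })
    return gpus


for gpu in parse_gpu_memory(SAMPLE):
    used = gpu["total_mib"] - gpu["free_mib"]
    print(f"{gpu['name']}: {used} MiB used of {gpu['total_mib']} MiB")
```

On a live runtime, the same function could be fed the `stdout` of a `subprocess.run` call to `nvidia-smi` instead of the sample string.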