# MegaSAM Pipeline - Google Colab

This notebook runs the MegaSAM pipeline for camera tracking and depth estimation.

**Requirements:** GPU runtime (Runtime > Change runtime type > T4 GPU)

In [None]:
# Check GPU availability
!nvidia-smi

## 1. Clone Repository and Initialize Submodules

In [None]:
# Clone the repository
!git clone https://github.com/JonnyShiUW/cse455-mega-sam-impl.git
%cd cse455-mega-sam-impl

# Initialize submodules
!git submodule update --init --recursive

In [None]:
# Enter the implementation directory; all later paths are relative to it
%cd implementation

## 2. Install Dependencies

In [None]:
# Reuse Colab's pre-installed PyTorch (already compatible with the runtime)
# and install only the additional dependencies
!pip install opencv-python-headless tqdm imageio einops scipy matplotlib
!pip install timm ninja numpy==1.26.3 huggingface-hub kornia
!pip install torch-scatter -f https://data.pyg.org/whl/torch-$(python -c "import torch; print(torch.__version__.split('+')[0])")+cu$(python -c "import torch; print(torch.version.cuda.replace('.',''))").html
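The `torch-scatter` command above assembles a wheel index URL from the installed PyTorch and CUDA versions. As a rough sketch of that string logic, the hypothetical helper `pyg_wheel_index` below mirrors it in Python. The function name and the `cpu` fallback (for builds where `torch.version.cuda` is `None`) are assumptions added for illustration; they are not part of this notebook.

```python
from typing import Optional


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the wheel index URL the same way the shell one-liner does."""
    # "2.5.1+cu121" -> "2.5.1": drop the local build tag after '+'
    base = torch_version.split("+")[0]
    if cuda_version is None:
        # Assumption: CPU-only builds use the "+cpu" suffix
        suffix = "cpu"
    else:
        # "12.1" -> "cu121"
        suffix = "cu" + cuda_version.replace(".", "")
    return f"https://data.pyg.org/whl/torch-{base}+{suffix}.html"


if __name__ == "__main__":
    print(pyg_wheel_index("2.5.1+cu121", "12.1"))
```

In the notebook you would pass `torch.__version__` and `torch.version.cuda`. Printing the URL before installing makes it easy to see which index pip is being pointed at if the install fails.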

In [None]:
# Install xformers (compatible with Colab's PyTorch version)
!pip install xformers

In [None]:
# Install UniDepth
!pip install unidepth

## 3. Compile DROID-SLAM Extensions

In [None]:
# Build and install the DROID-SLAM extensions, then return to the parent directory
%cd DROID-SLAM
!python setup.py install
%cd ..

## 4. Download Model Checkpoints

In [None]:
# Create checkpoint directories
!mkdir -p mega-sam/Depth-Anything/checkpoints

# Download DepthAnything checkpoint (~1.2GB)
# The success message is printed only if wget exits without error
!wget -O mega-sam/Depth-Anything/checkpoints/depth_anything_vitl14.pth \
    "https://huggingface.co/spaces/LiheYoung/Depth-Anything/resolve/main/checkpoints/depth_anything_vitl14.pth" \
    && echo "DepthAnything checkpoint downloaded!"

In [None]:
# Download RAFT checkpoint (~78MB)
# The success message is printed only if wget exits without error
!wget -O mega-sam/cvd_opt/raft-things.pth \
    "https://www.dropbox.com/s/4j4z58wuv8o0mfz/raft-things.pth?dl=1" \
    && echo "RAFT checkpoint downloaded!"

In [None]:
# Verify checkpoints
!ls -lh mega-sam/Depth-Anything/checkpoints/
!ls -lh mega-sam/cvd_opt/raft-things.pth
!ls -lh mega-sam/checkpoints/
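The `ls` listing above relies on reading file sizes by eye. A minimal sketch of the same check in Python follows. The helper name `find_bad_checkpoints` and the minimum sizes are assumptions chosen only to catch empty or truncated downloads; the two paths are the ones used in the download cells.

```python
from pathlib import Path
from typing import Dict, List

# Paths come from the download cells above; the minimum sizes are loose,
# assumed thresholds, not the exact sizes of the real checkpoint files.
EXPECTED_CHECKPOINTS: Dict[str, int] = {
    "mega-sam/Depth-Anything/checkpoints/depth_anything_vitl14.pth": 100 * 1024 * 1024,
    "mega-sam/cvd_opt/raft-things.pth": 1 * 1024 * 1024,
}


def find_bad_checkpoints(expected: Dict[str, int], root: str = ".") -> List[str]:
    """Return the relative paths that are missing or below their minimum size."""
    bad = []
    for rel_path, min_bytes in expected.items():
        path = Path(root) / rel_path
        if not path.is_file() or path.stat().st_size < min_bytes:
            bad.append(rel_path)
    return bad


if __name__ == "__main__":
    problems = find_bad_checkpoints(EXPECTED_CHECKPOINTS)
    if problems:
        print("Missing or truncated checkpoints:", problems)
    else:
        print("All checkpoints look plausible.")
```

A size threshold does not prove a file is valid, but it does catch the common case where a failed download leaves a zero-byte file or a small HTML error page at the target path.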

## 5. Upload Your Input Frames

Place your input JPEG frames in a `test_video` folder inside the current directory, or use the sample frames if the repository includes them.

In [None]:
# Check if test_video frames exist
!ls test_video/ | head -10
!ls test_video/*.jpg 2>/dev/null | wc -l
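The shell check above can also be done in Python, which gives a sorted list that is easier to inspect. The sketch below uses a hypothetical helper, `list_jpeg_frames`. Unlike the `*.jpg` glob, it also accepts `.jpeg` and upper-case extensions; that is an assumption about what you might upload, and the pipeline itself may only read `.jpg` files.

```python
from pathlib import Path
from typing import List


def list_jpeg_frames(frame_dir: str) -> List[Path]:
    """Return the .jpg/.jpeg files in frame_dir, sorted by file name."""
    folder = Path(frame_dir)
    if not folder.is_dir():
        # A missing folder is reported as "no frames" rather than an error
        return []
    frames = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    ]
    return sorted(frames, key=lambda p: p.name)


if __name__ == "__main__":
    frames = list_jpeg_frames("test_video")
    print(f"Found {len(frames)} JPEG frames")
    for frame in frames[:10]:
        print(" ", frame.name)
```

Sorting is by plain file name, so frames named with zero-padded indices (for example `00001.jpg`) list in temporal order, while unpadded names such as `10.jpg` and `2.jpg` do not.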

In [None]:
# To copy frames from Google Drive instead, uncomment the lines below:
# from google.colab import drive
# drive.mount('/content/drive')
# !cp -r /content/drive/MyDrive/your_frames_folder ./test_video

## 6. Run MegaSAM Pipeline

In [None]:
# Verify setup before running
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'N/A'}")

In [None]:
# Run the pipeline!
!python main.py

## 7. View and Download Results

In [None]:
# Check outputs
!ls -lh outputs_cvd/
!ls -lh reconstructions/

In [None]:
# Load and inspect final output
import numpy as np

data = np.load("outputs_cvd/marching_sgd_cvd_hr.npz")
print("Output contents:")
for key in data.files:
    print(f"  {key}: shape={data[key].shape}, dtype={data[key].dtype}")

In [None]:
# Visualize a sample depth map
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# RGB frame
axes[0].imshow(data['images'][0])
axes[0].set_title('Input Frame 0')
axes[0].axis('off')

# Depth map
axes[1].imshow(data['depths'][0], cmap='turbo')
axes[1].set_title('Estimated Depth 0')
axes[1].axis('off')

plt.tight_layout()
plt.show()

In [None]:
# Download results to your local machine
from google.colab import files

# Download the final output
files.download('outputs_cvd/marching_sgd_cvd_hr.npz')

In [None]:
# Or save to Google Drive
# from google.colab import drive
# drive.mount('/content/drive')
# !mkdir -p /content/drive/MyDrive/megasam_results
# !cp -r outputs_cvd /content/drive/MyDrive/megasam_results/