# NETHERGAZE: Markerless Augmented Reality Pipeline

**Author:** Milan Savard  
**Course:** CS366 F25 Final Personal Project  
**Date:** November 2025

---

## Project Overview

NETHERGAZE is a markerless augmented reality (AR) system built using computer vision techniques. Unlike traditional AR systems that rely on fiducial markers (like ArUco or AprilTags), NETHERGAZE uses **natural feature tracking** to estimate camera pose and overlay virtual content.

### Key Technologies
- **OpenCV** for image processing and computer vision
- **ORB (Oriented FAST and Rotated BRIEF)** for feature detection and description
- **Lucas-Kanade Optical Flow** for frame-to-frame feature tracking
- **Essential Matrix + RANSAC** for robust pose estimation
- **Temporal filtering** for smooth AR overlays
- **Sparse SLAM** for 3D mapping and loop closure
- **Depth-aware rendering** for AR occlusion handling


## Table of Contents

1. [Architecture Overview](#1-architecture-overview)
2. [Implementation Progress](#2-implementation-progress)
3. [Component Deep Dive](#3-component-deep-dive)
4. [Demo: Running the Pipeline](#4-demo-running-the-pipeline)
5. [Configuration Options](#5-configuration-options)
6. [Project Highlights](#6-project-highlights)
7. [Demo Screenshots](#7-demo-screenshots)
8. [References](#8-references)


---

## 1. Architecture Overview

The NETHERGAZE pipeline consists of six main stages, with loop closure and occlusion handling as auxiliary branches off the mapping and overlay stages:

```
┌─────────────┐    ┌──────────────┐    ┌────────────────┐    ┌────────────┐    ┌─────────────┐    ┌─────────┐
│   Capture   │ -> │   Tracking   │ -> │ Pose Estimation│ -> │  Mapping   │ -> │   Overlay   │ -> │ Display │
│  (video.py) │    │ (feature.py) │    │   (pose.py)    │    │(mapping.py)│    │ (overlay.py)│    │ (ui.py) │
└─────────────┘    └──────────────┘    └────────────────┘    └────────────┘    └─────────────┘    └─────────┘
                                                                   │                  │
                                                                   v                  v
                                                            ┌─────────────┐    ┌──────────────┐
                                                            │    Loop     │    │  Occlusion   │
                                                            │   Closure   │    │(occlusion.py)│
                                                            └─────────────┘    └──────────────┘
```

### Module Responsibilities

| Module | File | Purpose |
|--------|------|---------|  
| **VideoProcessor** | `src/video.py` | Camera capture, preprocessing, backend selection |
| **FeatureTracker** | `src/tracking/feature.py` | ORB detection, optical flow, keyframe management |
| **PoseEstimator** | `src/pose.py` | Essential matrix, pose recovery, temporal filtering |
| **SparseMap** | `src/mapping.py` | 3D point cloud, keyframe management, loop closure |
| **OcclusionHandler** | `src/occlusion.py` | Depth estimation, occlusion masks, depth-aware rendering |
| **OverlayRenderer** | `src/overlay.py` | 2D/3D overlay rendering, blending |
| **UserInterface** | `src/ui.py` | Display window, keyboard controls |
| **NETHERGAZEApp** | `src/main.py` | Pipeline orchestration, CLI interface |


---

## 2. Implementation Progress

### ✅ All Components Complete

| Component | Status | Description |
|-----------|--------|-------------|
| Video Capture | ✅ Complete | Multi-backend support (AVFoundation, Continuity Camera) |
| Feature Tracking | ✅ Complete | ORB + optical flow with keyframe management |
| Pose Estimation | ✅ Complete | Essential matrix decomposition with temporal smoothing |
| Overlay Rendering | ✅ Complete | 2D primitives + 3D wireframes (cube, pyramid, chair, etc.) |
| World Anchoring | ✅ Complete | Objects stay fixed in 3D space as camera moves |
| User Interface | ✅ Complete | OpenCV window with keyboard controls and pose stats |
| Pipeline Orchestration | ✅ Complete | CLI-driven main application |
| Configuration System | ✅ Complete | Built-in default calibration and parameters, overridable via JSON config |
| Integration Tests | ✅ Complete | Synthetic video testing, benchmarks, detector comparison |
| SLAM/Mapping | ✅ Complete | Sparse 3D mapping with keyframes, triangulation, loop closure |
| Occlusion Handling | ✅ Complete | Depth estimation, occlusion masks, depth-aware rendering |


---

## 3. Component Deep Dive

Let's explore each component in detail.


In [None]:
# Setup: Add src to path for imports
import sys
import os

# Navigate to project root
notebook_dir = os.getcwd()
project_root = os.path.dirname(notebook_dir) if 'notebooks' in notebook_dir else notebook_dir
src_path = os.path.join(project_root, 'src')
sys.path.insert(0, src_path)

print(f"Project root: {project_root}")
print(f"Source path: {src_path}")


Project root: /Users/milansavard/Desktop/GitHub/ComputerVision/NETHERGAZE
Source path: /Users/milansavard/Desktop/GitHub/ComputerVision/NETHERGAZE/src


### 3.1 Feature Tracking

The `FeatureTracker` class implements a hybrid approach:
1. **Detection**: ORB features are detected when tracking is lost or the tracked feature count drops below `reacquire_threshold`
2. **Tracking**: Lucas-Kanade optical flow tracks features frame-to-frame (faster than re-detection)
3. **Keyframes**: Best frames are stored for re-localization when tracking fails
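
The interplay of these three steps can be sketched as a per-frame decision. This is a minimal illustration, not the actual `FeatureTracker` code: `TrackerState` and `step` are hypothetical names, the parameter names mirror `TrackingConfiguration`, and storing a keyframe at a fixed interval is a simplifying assumption (the real tracker may also score frame quality before keeping a frame).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TrackerState:
    """Hypothetical bookkeeping for the detect/track/keyframe loop."""
    reacquire_threshold: int = 300
    keyframe_interval: int = 15
    max_keyframes: int = 5
    frame_index: int = 0
    keyframes: deque = field(default_factory=deque)


def step(state: TrackerState, tracked_count: int) -> str:
    """Decide what to do on this frame given how many features survived tracking."""
    # Re-run the (slower) detector only when too few features remain.
    action = "detect" if tracked_count < state.reacquire_threshold else "track"

    # Assumption: keep a keyframe at a fixed interval, dropping the oldest one.
    if state.frame_index % state.keyframe_interval == 0:
        state.keyframes.append(state.frame_index)
        if len(state.keyframes) > state.max_keyframes:
            state.keyframes.popleft()

    state.frame_index += 1
    return action


state = TrackerState()
actions = [step(state, count) for count in (0, 500, 250)]
print(actions)  # ['detect', 'track', 'detect']
```

Raising `reacquire_threshold` makes the `detect` branch fire more often, which trades speed for a denser feature set.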


In [10]:
from tracking.feature import FeatureTracker, TrackingConfiguration

# View default configuration
default_config = TrackingConfiguration()
print("Feature Tracking Configuration:")
print(f"  Method: {default_config.method}")
print(f"  Max Features: {default_config.max_features}")
print(f"  Use Optical Flow: {default_config.use_optical_flow}")
print(f"  Optical Flow Window: {default_config.optical_flow_win_size}")
print(f"  Reacquire Threshold: {default_config.reacquire_threshold}")
print(f"  Keyframe Interval: {default_config.keyframe_interval}")
print(f"  Max Keyframes: {default_config.max_keyframes}")


Feature Tracking Configuration:
  Method: orb
  Max Features: 1000
  Use Optical Flow: True
  Optical Flow Window: 21
  Reacquire Threshold: 300
  Keyframe Interval: 15
  Max Keyframes: 5


### 3.2 Pose Estimation

The `PoseEstimator` class recovers camera motion from 2D-2D correspondences:

1. **Essential Matrix**: Computed from matched points using RANSAC
2. **Pose Recovery**: Decompose E into rotation (R) and translation (t)
3. **Temporal Filtering**: EMA smoothing + outlier rejection for stable overlays
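
The geometry behind steps 1 and 2 can be checked on synthetic data. The sketch below is not taken from `src/pose.py`: it builds the essential matrix E = [t]x R from a known motion (the helper `skew` and all values are made up for illustration) and confirms the epipolar constraint x2ᵀ E x1 = 0 that RANSAC tests candidate matrices against.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x so that skew(t) @ v == np.cross(t, v)."""
    return np.array([
        [0.0, -t[2], t[1]],
        [t[2], 0.0, -t[0]],
        [-t[1], t[0], 0.0],
    ])


# Synthetic ground-truth motion: small rotation about the y-axis plus a translation.
angle = 0.1
R = np.array([
    [np.cos(angle), 0.0, np.sin(angle)],
    [0.0, 1.0, 0.0],
    [-np.sin(angle), 0.0, np.cos(angle)],
])
t = np.array([0.2, 0.0, 0.05])

# Random 3D points in front of camera 1, then expressed in the camera 2 frame.
rng = np.random.default_rng(0)
X1 = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))
X2 = X1 @ R.T + t

# Normalized image coordinates (pixel coordinates with K removed).
x1 = X1 / X1[:, 2:3]
x2 = X2 / X2[:, 2:3]

E = skew(t) @ R
residuals = np.einsum("ni,ij,nj->n", x2, E, x1)  # x2^T E x1 for every match
singular_values = np.linalg.svd(E, compute_uv=False)

print(np.abs(residuals).max())  # numerically zero for exact correspondences
print(singular_values)          # two equal values (|t|) and one zero
```

In a real pipeline the direction is reversed: OpenCV's `cv2.findEssentialMat` estimates E from noisy matches with RANSAC and `cv2.recoverPose` decomposes it into R and t. Note that the translation recovered from an essential matrix is only defined up to scale.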


In [11]:
from pose import PoseEstimator, PoseFilterConfig

# View pose filter configuration
filter_config = PoseFilterConfig()
print("Pose Filter Configuration:")
print(f"  Enable Smoothing: {filter_config.enable_smoothing}")
print(f"  Smoothing Alpha (EMA): {filter_config.smoothing_alpha}")
print(f"  Enable Outlier Rejection: {filter_config.enable_outlier_rejection}")
print(f"  Max Translation Jump: {filter_config.max_translation_jump} m")
print(f"  Max Rotation Jump: {filter_config.max_rotation_jump} rad")
print(f"  Min Inliers Threshold: {filter_config.min_inliers_threshold}")


Pose Filter Configuration:
  Enable Smoothing: True
  Smoothing Alpha (EMA): 0.3
  Enable Outlier Rejection: True
  Max Translation Jump: 0.5 m
  Max Rotation Jump: 0.5 rad
  Min Inliers Threshold: 10
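
To show how the filter settings printed above interact, here is a minimal sketch of EMA smoothing with jump rejection, applied to translation only. `filter_translation` is a hypothetical helper, not the `PoseEstimator` API, and it assumes that `smoothing_alpha` is the weight given to the new measurement; rotation smoothing would need extra care (for example interpolating on quaternions) and is left out.

```python
import numpy as np


def filter_translation(previous, measurement, alpha=0.3, max_jump=0.5):
    """Smooth one translation sample; hold the previous estimate on a large jump."""
    measurement = np.asarray(measurement, dtype=float)
    if previous is None:
        return measurement  # first sample initializes the filter
    if np.linalg.norm(measurement - previous) > max_jump:
        return previous  # treat as an outlier and keep the last good estimate
    # Assumed EMA convention: alpha is the weight of the new measurement.
    return alpha * measurement + (1.0 - alpha) * previous


samples = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0], [0.2, 0.0, 0.0]]
estimate = None
history = []
for sample in samples:
    estimate = filter_translation(estimate, sample)
    history.append(estimate)

print(history[-1])  # the 5.0 spike is rejected; the estimate ends at x = 0.081
```

A smaller alpha gives steadier overlays but makes them lag behind fast camera motion.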


### 3.3 Overlay Rendering

The `OverlayRenderer` supports both 2D screen-space and 3D world-space overlays:

**2D Overlays:**
- Text labels
- Rectangles, circles, lines
- Polygons

**3D Wireframe Objects:**
- Cube, pyramid, grid
- Coordinate axes (RGB = XYZ)
- Custom wireframes


In [12]:
from overlay import OverlayRenderer, OverlayConfiguration

# View overlay configuration
overlay_config = OverlayConfiguration()
print("Overlay Configuration:")
print(f"  Enable 2D Overlays: {overlay_config.enable_2d_overlays}")
print(f"  Enable 3D Overlays: {overlay_config.enable_3d_overlays}")
print(f"  Default 3D Color: {overlay_config.default_3d_color} (BGR)")
print(f"  Blend Alpha: {overlay_config.blend_alpha}")
print(f"  Antialiasing: {overlay_config.antialiasing}")


Overlay Configuration:
  Enable 2D Overlays: True
  Enable 3D Overlays: True
  Default 3D Color: (0, 255, 255) (BGR)
  Blend Alpha: 0.7
  Antialiasing: True
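
A wireframe object reduces to a list of 3D vertices plus the index pairs that form its edges. The sketch below shows one way to generate a cube; `cube_wireframe` is a hypothetical helper and may not match how `OverlayRenderer` stores its shapes.

```python
from itertools import combinations, product


def cube_wireframe(size=1.0):
    """Return (vertices, edges) for an axis-aligned cube centered at the origin."""
    half = size / 2.0
    vertices = [tuple(half * s for s in signs) for signs in product((-1, 1), repeat=3)]
    # Two corners share an edge when they differ along exactly one axis.
    edges = [
        (i, j)
        for i, j in combinations(range(len(vertices)), 2)
        if sum(a != b for a, b in zip(vertices[i], vertices[j])) == 1
    ]
    return vertices, edges


vertices, edges = cube_wireframe(size=0.1)
print(len(vertices), len(edges))  # 8 12
```

To draw it, each edge's two endpoints would be projected into the image and joined with a line.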


### 3.4 Camera Calibration

The system uses a camera intrinsic matrix (K) for projection:

```
K = | fx   0  cx |
    |  0  fy  cy |
    |  0   0   1 |
```

Where:
- `fx, fy` = focal lengths in pixels
- `cx, cy` = principal point in pixels (typically near the image center)


In [13]:
from utils import get_config
import numpy as np

# Load default calibration
config = get_config()
calib = config['calibration']

K = np.array(calib['camera_matrix'])
dist = np.array(calib['dist_coeffs'])

print("Default Camera Matrix (K):")
print(K)
print(f"\nFocal Length: fx={K[0,0]:.1f}, fy={K[1,1]:.1f} pixels")
print(f"Principal Point: cx={K[0,2]:.1f}, cy={K[1,2]:.1f} pixels")
print(f"\nDistortion Coefficients: {dist}")


Default Camera Matrix (K):
[[800.   0. 320.]
 [  0. 800. 240.]
 [  0.   0.   1.]]

Focal Length: fx=800.0, fy=800.0 pixels
Principal Point: cx=320.0, cy=240.0 pixels

Distortion Coefficients: [0. 0. 0. 0. 0.]
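
As a worked example of what K does, the sketch below projects camera-frame 3D points to pixels using the default values printed above. `project` is a hypothetical helper written for this illustration, and it ignores lens distortion, which matches the all-zero default coefficients.

```python
import numpy as np

# Default intrinsics printed above (fx = fy = 800, principal point at 320, 240).
K = np.array([
    [800.0, 0.0, 320.0],
    [0.0, 800.0, 240.0],
    [0.0, 0.0, 1.0],
])


def project(points_cam, K):
    """Project Nx3 camera-frame points (Z > 0) to Nx2 pixel coordinates."""
    points_cam = np.asarray(points_cam, dtype=float)
    homogeneous = points_cam @ K.T  # rows are (fx*X + cx*Z, fy*Y + cy*Z, Z)
    return homogeneous[:, :2] / homogeneous[:, 2:3]


pixels = project([[0.0, 0.0, 2.0], [0.1, -0.05, 2.0]], K)
print(pixels)  # [[320. 240.], [360. 220.]]
```

A point on the optical axis lands on the principal point, and an offset of 0.1 units at depth 2 moves it by 800 * 0.1 / 2 = 40 pixels.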


---

## 4. Demo: Running the Pipeline

### Command Line Usage

```bash
cd NETHERGAZE

# Run the World-Anchored AR Demo
python3 src/main.py                    # Default settings
python3 src/main.py --fast             # High performance mode (60fps)
python3 src/main.py --verbose          # Debug logging

# Or run directly from examples
python3 examples/demo_anchored_objects.py
```

### How to Use the AR Demo

1. **Point camera** at a textured surface (book, poster, keyboard)
2. **Wait for green "TRACKING"** indicator in the corner
3. **Press SPACE** to set the anchor point (world origin)
4. **Press 1-5** to place 3D objects in the scene
5. **Move camera around** - objects stay fixed in 3D space!

### Keyboard Controls

| Key | Action |
|-----|--------|
| `SPACE` | Set anchor point (world origin) |
| `1` | Place wireframe cube |
| `2` | Place pyramid |
| `3` | Place RGB coordinate axes |
| `4` | Place solid box |
| `5` | Place 3D chair |
| `C` | Clear all placed objects |
| `G` | Toggle ground grid |
| `M` | Toggle feature markers |
| `R` | Reset anchor |
| `H` | Toggle help overlay |
| `Q` / `ESC` | Quit |


In [14]:
# Display the main.py CLI help
import subprocess
result = subprocess.run(
    ['python3', '../src/main.py', '--help'],
    capture_output=True,
    text=True
)
print(result.stdout)


usage: main.py [-h] [--fast] [--verbose]

NETHERGAZE - Markerless Augmented Reality with World-Anchored Objects

options:
  -h, --help     show this help message and exit
  --fast, -f     High performance mode (60fps, optimized settings)
  --verbose, -V  Enable verbose/debug logging

Examples:
  python main.py                # Run the AR demo
  python main.py --fast         # High performance mode (60fps)
  python main.py --verbose      # Enable debug logging

Controls (in demo):
  SPACE  - Set anchor point (world origin)
  1-5    - Place objects (1=cube, 2=pyramid, 3=axes, 4=box, 5=chair)
  C      - Clear all objects
  G      - Toggle ground grid
  M      - Toggle feature markers
  R      - Reset anchor
  H      - Toggle help overlay
  Q      - Quit
        



---

## 5. Configuration Options

All parameters can be customized via a JSON config file or by editing the defaults in `src/utils.py`.


In [15]:
import json
from utils import get_config

# Get full configuration
config = get_config()

# Pretty print
print("Full Configuration:")
print(json.dumps(config, indent=2, default=str))


Full Configuration:
{
  "camera_id": 0,
  "video_width": 640,
  "video_height": 480,
  "video_fps": 30,
  "camera_backend_priority": null,
  "camera_init_attempts": 10,
  "enable_preprocessing": true,
  "blur_kernel": [
    5,
    5
  ],
  "contrast_alpha": 1.0,
  "brightness_beta": 0,
  "axis_length": 0.05,
  "feature_tracking": {
    "method": "orb",
    "max_features": 1000,
    "quality_level": 0.01,
    "min_distance": 7.0,
    "fast_threshold": 20,
    "orb_scale_factor": 1.2,
    "orb_nlevels": 8,
    "akaze_threshold": 0.001,
    "use_optical_flow": true,
    "optical_flow_win_size": 21,
    "optical_flow_max_level": 3,
    "optical_flow_criteria_eps": 0.03,
    "optical_flow_criteria_count": 30,
    "adaptive_optical_flow": true,
    "reacquire_threshold": 200,
    "keyframe_interval": 15,
    "max_keyframes": 6,
    "min_keyframe_features": 160,
    "keyframe_quality_threshold": 0.5,
    "matcher_type": "bf_hamming",
    "match_ratio_threshold": 0.75,
    "use_grid_detection"

### Key Configuration Sections

| Section | Description |
|---------|-------------|
| `feature_tracking` | ORB detection and optical flow parameters |
| `calibration` | Camera intrinsic matrix and distortion coefficients |
| `pose_filter` | Temporal smoothing and outlier rejection settings |
| `overlay` | 2D/3D rendering options |
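
Because the configuration is nested, a JSON override only needs to list the keys it changes. The sketch below shows one way such an override could be merged over the defaults; `merge_config` is a hypothetical helper and is not claimed to be how `get_config` works internally.

```python
import copy
import json


def merge_config(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied, recursing into nested dicts."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


defaults = {"video_fps": 30, "feature_tracking": {"max_features": 1000, "fast_threshold": 20}}
overrides = json.loads('{"feature_tracking": {"fast_threshold": 10}}')

config = merge_config(defaults, overrides)
print(config)
# {'video_fps': 30, 'feature_tracking': {'max_features': 1000, 'fast_threshold': 10}}
```

Only `fast_threshold` changes; sibling keys such as `max_features` keep their defaults.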


---

## 6. Project Highlights

### Key Features Implemented

#### 1. World-Anchored AR Objects
Fully implemented in `examples/demo_anchored_objects.py`:
- Objects stay fixed in 3D world space as camera moves
- Anchor point system for world origin
- Multiple object types: cube, pyramid, axes, box, chair
- Real-time pose statistics display (FPS, tracking rate, pose rate)

```bash
# Run the world-anchored AR demo
python3 src/main.py                        # Main entry point
python3 examples/demo_anchored_objects.py  # Direct demo
```
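
The idea behind world anchoring can be sketched with 4x4 rigid transforms. This is an illustration under stated assumptions, not the demo's actual code: poses are world-to-camera transforms, the anchor frame is taken to be the camera frame at the moment SPACE is pressed, and `make_pose` and `anchor_to_camera` are hypothetical helpers.

```python
import numpy as np


def make_pose(R, t):
    """Build a 4x4 world-to-camera transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def anchor_to_camera(T_cam_world, T_anchor_world, points_anchor):
    """Map Nx3 points from the anchor frame into the current camera frame."""
    T_cam_anchor = T_cam_world @ np.linalg.inv(T_anchor_world)
    homogeneous = np.hstack([points_anchor, np.ones((len(points_anchor), 1))])
    return (homogeneous @ T_cam_anchor.T)[:, :3]


# Assumption: the anchor frame is the camera frame when the anchor was set.
T_anchor_world = make_pose(np.eye(3), np.array([0.0, 0.0, 0.0]))
cube_center = np.array([[0.0, 0.0, 1.0]])  # one unit in front of the anchor camera

# The camera then moves 0.1 units along +x, so world points shift by -0.1 in its frame.
T_cam_world = make_pose(np.eye(3), np.array([-0.1, 0.0, 0.0]))
print(anchor_to_camera(T_cam_world, T_anchor_world, cube_center))  # [[-0.1  0.   1. ]]
```

The object's coordinates in the anchor frame never change; only the relative transform is recomputed each frame, which is what keeps the object visually fixed.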

#### 2. Integration Tests
Comprehensive test suite in `tests/test_integration.py`:
- Synthetic video generation (checkerboard, feature-rich sequences)
- Pipeline metrics collection (tracking rate, pose rate, FPS)
- Detector comparison benchmarks (ORB, AKAZE, BRISK)
- Video file playback for reproducible testing
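
For a sense of what synthetic video generation involves, here is a minimal sketch that produces a sliding checkerboard sequence with NumPy. `checkerboard_frame` is a hypothetical helper; the generators in `tests/test_integration.py` may differ.

```python
import numpy as np


def checkerboard_frame(height=480, width=640, square=40, shift=0):
    """Return a grayscale uint8 checkerboard, shifted horizontally by `shift` pixels."""
    rows = np.arange(height)[:, None] // square
    cols = (np.arange(width)[None, :] + shift) // square
    return (((rows + cols) % 2) * 255).astype(np.uint8)


# A short synthetic "video": the pattern slides 2 pixels per frame.
frames = [checkerboard_frame(shift=2 * i) for i in range(5)]
print(frames[0].shape, frames[0].dtype)  # (480, 640) uint8
```

Because the motion between frames is known exactly, tracking output on such a sequence can be compared against ground truth.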

#### 3. SLAM/Mapping Integration
Sparse SLAM in `src/mapping.py`:
- `SparseMap` class with 3D point cloud from triangulated features
- Keyframe management with covisibility graph
- Loop closure detection with geometric verification
- Map persistence (save/load to JSON)
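
Triangulation is what turns matched 2D features from two keyframes into map points. The sketch below uses the standard linear (DLT) method on synthetic data; `triangulate` is a hypothetical helper, the camera placement is made up, and `SparseMap` may use a different routine such as OpenCV's `cv2.triangulatePoints`.

```python
import numpy as np


def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two 3x4 projection matrices."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]


K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                  # first keyframe at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])  # second keyframe, shifted in x

X_true = np.array([0.3, -0.1, 4.0])
x1 = P1 @ np.append(X_true, 1.0)
x2 = P2 @ np.append(X_true, 1.0)
x1, x2 = x1[:2] / x1[2], x2[:2] / x2[2]

X_est = triangulate(P1, P2, x1, x2)
print(X_est)  # close to [0.3, -0.1, 4.0]
```

With noisy matches the result degrades quickly when the baseline between the two keyframes is small.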

#### 4. Occlusion Handling
Depth-aware rendering in `src/occlusion.py`:
- `DepthEstimator` - sparse depth from features + ground plane assumption
- `OcclusionHandler` - mask generation and compositing
- `DepthAwareOverlayRenderer` - proper AR occlusion
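
The core of depth-aware compositing is a per-pixel depth test. The sketch below is an assumption-labeled illustration rather than the `OcclusionHandler` implementation: `composite_with_depth` is a hypothetical helper, and pixels without a depth estimate are treated as infinitely far away.

```python
import numpy as np


def composite_with_depth(frame, overlay, overlay_mask, overlay_depth, scene_depth):
    """Draw overlay pixels only where the virtual object is closer than the scene."""
    visible = overlay_mask & (overlay_depth < scene_depth)
    result = frame.copy()
    result[visible] = overlay[visible]
    return result, visible


# Tiny 1x4 "image": the real scene is 2.0 units away except one near pixel at 0.5.
frame = np.zeros((1, 4, 3), dtype=np.uint8)
overlay = np.full((1, 4, 3), 255, dtype=np.uint8)
overlay_mask = np.array([[False, True, True, True]])
overlay_depth = np.full((1, 4), 1.0)
scene_depth = np.array([[2.0, 2.0, 0.5, np.inf]])  # inf = no depth estimate (assumption)

result, visible = composite_with_depth(frame, overlay, overlay_mask, overlay_depth, scene_depth)
print(visible)  # [[False  True False  True]]
```

The third pixel stays hidden because the real surface (0.5) is in front of the virtual object (1.0).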


---

## 7. Demo Screenshots

### World-Anchored AR Demo

The following screenshots demonstrate NETHERGAZE in action:

#### Cube Placement

<img src="images/demo_cube.png" alt="AR Cube Demo" width="800"/>

*A wireframe cube placed in the scene. Notice the tracking indicators: green feature points, FPS counter, and pose statistics in the corners.*

#### UI Elements Visible in Demo
- **Top-left**: Help overlay with keyboard controls
- **Top-right**: Tracking status (TRACKING/SEARCHING) and anchor status with object count
- **Bottom-left**: Real-time statistics (FPS: 30.3, Features: 334, Track: 99%, Pose: 93%)
- **Green dots**: ORB features being tracked across the scene (~300+ features for stable pose estimation)



The first attempt did not anchor very well, so I modified the following parameters to increase stability:

- **max_features**: increased from 2000 to 3000 to detect more features
- **fast_threshold**: decreased from 20 to 10 for more sensitive corner detection
- **quality_level**: decreased from 0.01 to 0.005 to accept weaker features
- **min_distance**: decreased from 7.0 to 5.0 to allow features closer together
- **reacquire_threshold**: increased from 300 to 500 to re-detect features sooner
- **keyframe_interval**: decreased from 10 to 8 for more frequent keyframe updates
- **orb_nlevels**: increased from 8 to 12 for more scale pyramid levels

<img src="images/demo_objects.png" alt="Multiple Objects Demo" width="800"/>

---

## 8. References

### Papers and Books
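1. Hartley, R. and Zisserman, A. *Multiple View Geometry in Computer Vision*, 2nd ed., Cambridge University Press, 2004 (essential matrix, pose recovery)
2. Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. "ORB: An efficient alternative to SIFT or SURF", ICCV 2011
3. Lucas, B. D. and Kanade, T. "An Iterative Image Registration Technique with an Application to Stereo Vision", IJCAI 1981 (Lucas-Kanade optical flow)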

### OpenCV Documentation
- Feature Detection: https://docs.opencv.org/4.x/db/d27/tutorial_py_table_of_contents_feature2d.html
- Camera Calibration: https://docs.opencv.org/4.x/dc/dbb/tutorial_py_calibration.html
- Pose Estimation: https://docs.opencv.org/4.x/d7/d53/tutorial_py_pose.html

### Project Files
- PROGRESS.md - Detailed implementation progress
- TESTING.md - Testing procedures
- TROUBLESHOOTING.md - Common issues and solutions
- docs/design_overview.md - Architecture documentation


---

## Summary

NETHERGAZE is a fully functional markerless AR pipeline with:

✅ **Real-time feature tracking** using ORB + optical flow  
✅ **Robust pose estimation** with temporal filtering  
✅ **World-anchored 3D objects** that stay fixed in space  
✅ **Multiple 3D object types** (cube, pyramid, axes, box, chair)  
✅ **Live pose statistics** (FPS, tracking rate, pose rate, feature count)  
✅ **Integration tests** for stability verification  
✅ **Sparse SLAM/mapping** with loop closure detection  
✅ **Occlusion handling** with depth-aware rendering  


---


