Robust, Explainable Cross-View 3D Pose Tracking for Exercise Form Feedback

CS 585 — Final Project

Sean Tomany, Jonah Rothman, Jigar Kanakhara, Bhavya Bavishi, Harsha Basavaraj Beth

Overview

Multi-view 3D pose tracking system that reconstructs 3D human joint positions from calibrated cameras using only classical methods, with no neural networks as core contributions. We fork the Varun-Tandon14 reimplementation of Chen et al. (2020) and add four key upgrades:

  1. IRLS Robust Triangulation — Iteratively Reweighted Least Squares with Huber/Tukey loss on reprojection residuals, incorporating per-joint confidence scores as prior weights
  2. Uncertainty-Aware Matching Affinity — Mahalanobis-like cost that maps detection confidence to per-joint variances, so high-confidence joints dominate the matching
  3. Kalman Filter Temporal Smoothing — Constant-velocity Kalman filter per joint with measurement covariances from IRLS residuals
  4. Skeleton Visualization + Exercise Feedback — Bone connections, joint angle computation (knee flexion, hip hinge, trunk tilt), and threshold-based form deviation flags

Additionally includes a single-camera exercise feedback demo using MediaPipe for real-time pose estimation on video files or webcam.

Results

All configurations reach essentially 100% PCP on clean Campus data (0.999 to 1.000). The improvements show under stress:

Method               Clean   Outlier 10%   Outlier 20%   Occlusion 20%   Limb Drop 20%
Baseline             1.000   0.663         0.429         0.254           0.753
IRLS (Huber)         1.000   0.701         0.475         0.758           0.982
IRLS + Uncertainty   1.000   0.701         0.475         0.756           0.984
Full (+ Kalman)      0.999   0.724         0.497         0.827           0.989

Average PCP across all bone groups. Full per-bone breakdowns in results/full_comparison.json.

Key takeaways:

  • IRLS raises PCP under 20% occlusion from 25% to 76% by down-weighting outlier cameras
  • Kalman smoothing adds temporal consistency and lifts PCP under 20% occlusion further (76% → 83%)
  • With IRLS, PCP under 20% limb drop rises from 75% to 98%
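
For readers unfamiliar with the metric, here is a minimal sketch of how PCP (Percentage of Correct Parts) is commonly computed for 3D poses: a bone counts as correct when the mean error of its two endpoints is at most half the ground-truth bone length. The function names, array shapes, and the alpha=0.5 threshold are assumptions for illustration; the repository's evaluate.py may differ in detail.

```python
import numpy as np

def pcp_correct(pred_a, pred_b, gt_a, gt_b, alpha=0.5):
    """True if a bone is 'correct': mean endpoint error <= alpha * bone length."""
    pred_a, pred_b, gt_a, gt_b = (
        np.asarray(p, dtype=float) for p in (pred_a, pred_b, gt_a, gt_b)
    )
    endpoint_err = 0.5 * (
        np.linalg.norm(pred_a - gt_a) + np.linalg.norm(pred_b - gt_b)
    )
    bone_len = np.linalg.norm(gt_a - gt_b)
    return bool(endpoint_err <= alpha * bone_len)

def pcp_score(pred, gt, bones, alpha=0.5):
    """Fraction of correct bones.

    pred, gt: arrays of shape (frames, joints, 3).
    bones: list of (joint_a, joint_b) index pairs (hypothetical skeleton layout).
    """
    hits = [
        pcp_correct(p[a], p[b], g[a], g[b], alpha)
        for p, g in zip(pred, gt)
        for a, b in bones
    ]
    return float(np.mean(hits))
```

Averaging this score per bone group and then across groups would give a single number per method and perturbation, comparable to the table entries.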

Setup

Prerequisites

  • Python 3.10+
  • macOS / Linux (Windows should work but paths untested)

Installation

git clone https://github.com/seantomany/585Project.git
cd 585Project
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Dataset

Download the Campus dataset and extract it:

  1. Download from Google Drive
  2. Extract into dataset/Campus_Seq1/

The directory should contain:

dataset/Campus_Seq1/
├── calibration.json
├── annotation_2d.json
├── annotation_3d.json
├── detection.json
└── frames/
    ├── Camera0/
    ├── Camera1/
    └── Camera2/
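
Before running the pipeline, it can help to confirm that the extracted dataset matches the layout above. The following is a small sketch of such a check; the helper name missing_dataset_entries is hypothetical and not part of the repository.

```python
from pathlib import Path

# File and folder names taken from the directory listing above.
EXPECTED_FILES = [
    "calibration.json",
    "annotation_2d.json",
    "annotation_3d.json",
    "detection.json",
]
EXPECTED_CAMERAS = ["Camera0", "Camera1", "Camera2"]

def missing_dataset_entries(root):
    """Return a list of expected files/folders that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [
        f"frames/{cam}/"
        for cam in EXPECTED_CAMERAS
        if not (root / "frames" / cam).is_dir()
    ]
    return missing
```

Calling missing_dataset_entries("dataset/Campus_Seq1") from the repository root should return an empty list when the dataset is extracted correctly.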

Usage

Run the full pipeline

# Baseline (linear triangulation, uniform affinity)
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method baseline

# IRLS robust triangulation (Huber loss)
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method irls_huber

# IRLS + uncertainty-aware affinity
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method irls_uncertainty

# Full pipeline (IRLS + uncertainty + Kalman filter)
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method full

# Run ALL methods
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method all

Robustness stress tests

# Run all methods under all perturbations
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method all --robustness all

# Single perturbation
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method full --robustness outlier_10pct

Available perturbations: outlier_5pct, outlier_10pct, outlier_20pct, occlusion_10pct, occlusion_20pct, limb_drop_20pct, time_delay

Visualization

# Save a 3D skeleton plot
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method full --visualize

# Save 3D animation
python -m src.run_pipeline --dataset dataset/Campus_Seq1 --method full --animate

Single-camera exercise feedback demo

Uses MediaPipe Pose to analyze exercise form from a video file or webcam:

# Process a video file (model downloads automatically on first run)
python -m src.demo_single_cam --input my_squat.mp4 --output results/annotated.mp4

# Live webcam
python -m src.demo_single_cam --input webcam --display

# Both save and display
python -m src.demo_single_cam --input my_squat.mp4 --output results/annotated.mp4 --display

The demo overlays skeleton bone connections, joint angle readouts (knee flexion, hip hinge, trunk tilt), and real-time form deviation warnings.
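
The joint angle readouts and threshold-based flags can be illustrated with a short sketch. It assumes 2D image coordinates with y pointing down; the function names and any angle limits passed to form_flags are hypothetical and are not taken from src/visualization.py.

```python
import numpy as np

def joint_angle(a, b, c):
    """Interior angle at point b (degrees) between segments b->a and b->c."""
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    v1, v2 = a - b, c - b
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    if denom == 0.0:
        return float("nan")  # degenerate: two of the points coincide
    cos = np.clip(np.dot(v1, v2) / denom, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)))

def knee_flexion(hip, knee, ankle):
    """0 degrees for a straight leg, growing as the knee bends."""
    return 180.0 - joint_angle(hip, knee, ankle)

def trunk_tilt(shoulder_mid, hip_mid, up=(0.0, -1.0)):
    """Angle between the hip->shoulder line and vertical (image y points down)."""
    hip_mid = np.asarray(hip_mid, dtype=float)
    return joint_angle(shoulder_mid, hip_mid, hip_mid + np.asarray(up, dtype=float))

def form_flags(angles, limits):
    """Names of angles that fall outside their (low, high) limits."""
    return [
        name
        for name, value in angles.items()
        if name in limits and not (limits[name][0] <= value <= limits[name][1])
    ]
```

For example, form_flags({"trunk_tilt": 50.0}, {"trunk_tilt": (0.0, 45.0)}) would flag the trunk tilt as out of range under those illustrative limits.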

Project Structure

585Project/
├── src/                          # Modular pipeline code
│   ├── config.py                 # All constants and hyperparameters
│   ├── triangulation.py          # Linear baseline + IRLS robust triangulation
│   ├── affinity.py               # Uniform + uncertainty-aware matching affinity
│   ├── tracking.py               # Main pipeline loop (iterative camera processing)
│   ├── kalman.py                 # Constant-velocity Kalman filter per joint
│   ├── helpers.py                # Velocity estimation, utility functions
│   ├── visualization.py          # Skeleton rendering, joint angles, form flags
│   ├── evaluation.py             # PCP evaluation + tracking metrics (MOTA)
│   ├── robustness.py             # Outlier injection, occlusion sim, time delay
│   ├── run_pipeline.py           # CLI entry point for the full pipeline
│   └── demo_single_cam.py        # Single-camera exercise feedback (MediaPipe)
├── dataset/
│   └── crossview_dataset/        # Camera, calibration, and data loading code
│       ├── calib/
│       │   ├── camera.py         # Camera class (projection, backprojection)
│       │   └── calibration.py    # Calibration (triangulation, epipolar geometry)
│       ├── data_utils.py         # FrameLoader, Pose2DLoader, Pose3DLoader
│       └── visualization/        # Original OpenCV/vispy visualization
├── bip_solver.py                 # GLPK graph partitioning for clustering
├── evaluate.py                   # Official PCP evaluation script (longcw)
├── display.py                    # Official visualization script (longcw)
├── cross_view_tracking_for_3d_pose_estimation.ipynb  # Original baseline notebook
├── results/                      # Output: comparison tables, plots, videos
│   ├── full_comparison.json      # PCP results across all methods/perturbations
│   ├── summary.json              # Latest run summary
│   ├── squat_feedback.mp4        # Exercise feedback demo output
│   └── *.png                     # Skeleton visualization frames
└── requirements.txt

Method Details

IRLS Triangulation (src/triangulation.py)

Replaces the baseline SVD-on-DLT triangulation with iteratively reweighted least squares:

  1. Initialize with standard linear triangulation
  2. Compute per-camera reprojection residuals
  3. Assign robust weights via Huber or Tukey bisquare loss function
  4. Incorporate detection confidence scores as prior weights: w̃ ∝ (s + ε) × w_irls
  5. Re-solve the weighted system
  6. Repeat until convergence

Every decision is explainable — for any joint, you can inspect which cameras were trusted vs. down-weighted and the exact residual that triggered it.
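
The steps above can be sketched for a single joint as follows. This is a minimal illustration with Huber weights only, assuming 3x4 projection matrices and pixel-unit residuals; the function names, the delta threshold, and the convergence settings are assumptions and do not necessarily match src/triangulation.py.

```python
import numpy as np

def triangulate_dlt(projections, points_2d, weights=None):
    """Weighted linear (DLT) triangulation of one joint from N views."""
    n = len(projections)
    weights = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    rows = []
    for P, (u, v), w in zip(projections, points_2d, weights):
        s = np.sqrt(w)  # least-squares weights enter the rows as sqrt(w)
        rows.append(s * (u * P[2] - P[0]))
        rows.append(s * (v * P[2] - P[1]))
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]  # right singular vector of the smallest singular value
    return X[:3] / X[3]

def reprojection_residuals(projections, points_2d, X):
    """Per-camera pixel distance between the observation and reprojected X."""
    Xh = np.append(X, 1.0)
    res = []
    for P, uv in zip(projections, points_2d):
        x = P @ Xh
        res.append(np.linalg.norm(x[:2] / x[2] - uv))
    return np.array(res)

def huber_weights(r, delta):
    """IRLS weights for the Huber loss: 1 inside delta, delta/r outside."""
    return np.where(r > delta, delta / np.maximum(r, 1e-12), 1.0)

def triangulate_irls(projections, points_2d, scores,
                     delta=5.0, eps=1e-3, max_iter=20, tol=1e-6):
    """Return (X, weights, residuals) so each camera's trust can be inspected."""
    points_2d = np.asarray(points_2d, dtype=float)
    scores = np.asarray(scores, dtype=float)
    X = triangulate_dlt(projections, points_2d)       # step 1: linear init
    weights = np.full(len(projections), 1.0 / len(projections))
    for _ in range(max_iter):
        r = reprojection_residuals(projections, points_2d, X)  # step 2
        weights = (scores + eps) * huber_weights(r, delta)     # steps 3-4
        weights = weights / weights.sum()
        X_new = triangulate_dlt(projections, points_2d, weights)  # step 5
        shift = np.linalg.norm(X_new - X)
        X = X_new
        if shift < tol:                                        # step 6
            break
    return X, weights, reprojection_residuals(projections, points_2d, X)
```

Returning the final weights and residuals alongside the 3D point is what makes the result inspectable: a camera with a small weight and a large residual is the one that was down-weighted. A Tukey bisquare variant would swap only the weight function.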

Uncertainty-Aware Affinity (src/affinity.py)

Maps detection confidence scores to per-joint variances and uses Mahalanobis-like distances:

  • Variance mapping: σ²_k = 1/(s_k + ε) — low confidence → high variance
  • Matching cost: C(i,j) = Σ_k ||u_ik - u_jk||² / (σ²_ik + σ²_jk), where i and j index the two poses being compared and k indexes joints
  • High-confidence joints dominate; uncertain joints (occluded wrists, elbows) contribute less
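
A small sketch of the variance mapping and matching cost described above. It assumes both poses are already expressed in the same image plane (for example, a tracked 3D pose reprojected into the camera of the detection) and that scores lie in [0, 1]; the function name and eps value are illustrative, not taken from src/affinity.py.

```python
import numpy as np

def uncertainty_affinity_cost(joints_i, scores_i, joints_j, scores_j, eps=1e-3):
    """Mahalanobis-like cost between two 2D poses.

    joints_*: (K, 2) pixel coordinates; scores_*: (K,) detection confidences.
    """
    joints_i = np.asarray(joints_i, dtype=float)
    joints_j = np.asarray(joints_j, dtype=float)
    # Variance mapping: low confidence -> high variance.
    var_i = 1.0 / (np.asarray(scores_i, dtype=float) + eps)
    var_j = 1.0 / (np.asarray(scores_j, dtype=float) + eps)
    sq_dist = np.sum((joints_i - joints_j) ** 2, axis=1)
    # Each joint's squared distance is divided by the summed variances,
    # so disagreement on uncertain joints costs less.
    return float(np.sum(sq_dist / (var_i + var_j)))
```

With this form, the same 5-pixel disagreement costs roughly ten times less when both detections of that joint have confidence 0.1 instead of 1.0.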

Kalman Filter (src/kalman.py)

Constant-velocity Kalman filter per 3D joint:

  • State: [x, y, z, vx, vy, vz]
  • Measurement noise scaled by IRLS residual magnitude (noisy triangulations trusted less)
  • Replaces the baseline two-point velocity difference
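
The filter described above can be sketched per joint as follows. The class name, the noise parameters, and in particular the mapping from IRLS residual to measurement variance (base_var + scale * residual²) are assumptions chosen for illustration; src/kalman.py may use a different scaling.

```python
import numpy as np

class JointKalman:
    """Constant-velocity Kalman filter for one 3D joint."""

    def __init__(self, xyz, dt=1.0, process_var=1e-2, init_var=1.0):
        # State: [x, y, z, vx, vy, vz], starting at rest.
        self.x = np.concatenate([np.asarray(xyz, dtype=float), np.zeros(3)])
        self.P = np.eye(6) * init_var
        self.F = np.eye(6)
        self.F[:3, 3:] = dt * np.eye(3)      # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.Q = process_var * np.eye(6)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3].copy()

    def update(self, z, residual, base_var=1e-2, scale=1e-2):
        # Larger triangulation residual -> larger measurement noise,
        # so noisy triangulations pull the state less.
        R = (base_var + scale * residual ** 2) * np.eye(3)
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3].copy()
```

A typical per-frame loop would call predict() once, then update() with the triangulated position and its residual when a measurement is available; when a joint is missing, the prediction alone can stand in for that frame.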

Robustness Framework (src/robustness.py)

Controlled perturbations applied to 2D detection data:

  • Outlier noise: Gaussian pixel perturbations on random fraction of joints
  • Occlusion simulation: Zero out random joints and set their score to 0
  • Limb drop: Drop entire limb groups (arm/leg) with configurable probability
  • Time delay: Shift per-camera frame data to simulate unsynchronized streams
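
The four perturbations can be sketched as small functions operating on one pose's 2D joints and scores. The function names, parameters, and the limb group indices passed by the caller are hypothetical; the actual implementation in src/robustness.py may differ.

```python
import numpy as np

def add_outliers(joints, fraction, sigma_px, rng):
    """Add Gaussian pixel noise to a random fraction of joints. joints: (J, 2)."""
    out = np.asarray(joints, dtype=float).copy()
    mask = rng.random(len(out)) < fraction
    out[mask] += rng.normal(0.0, sigma_px, size=(int(mask.sum()), 2))
    return out, mask

def occlude(joints, scores, fraction, rng):
    """Zero out a random fraction of joints and set their score to 0."""
    out_j = np.asarray(joints, dtype=float).copy()
    out_s = np.asarray(scores, dtype=float).copy()
    mask = rng.random(len(out_j)) < fraction
    out_j[mask] = 0.0
    out_s[mask] = 0.0
    return out_j, out_s, mask

def drop_limb(joints, scores, limb_groups, prob, rng):
    """Drop each limb group (dict of name -> joint indices) with probability prob."""
    out_j = np.asarray(joints, dtype=float).copy()
    out_s = np.asarray(scores, dtype=float).copy()
    for idx in limb_groups.values():
        if rng.random() < prob:
            out_j[idx] = 0.0
            out_s[idx] = 0.0
    return out_j, out_s

def delay_camera(frames, delay):
    """Shift one camera's per-frame data by `delay` frames, repeating the first."""
    frames = list(frames)
    delay = max(0, min(delay, len(frames)))
    return frames[:1] * delay + frames[:len(frames) - delay]
```

Passing a seeded generator such as np.random.default_rng(0) keeps the perturbations reproducible across methods, so every configuration is evaluated on the same corrupted detections.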

Dependencies

numpy, scipy, opencv-python, matplotlib, pandas, tqdm, cvxopt, prettytable, ipywidgets, ipympl, jupyter

For the single-camera demo: mediapipe (installed separately: pip install mediapipe)

Acknowledgments
