CyrusH7/PriMo

PriMo: Prior-Guided Mobile Reconstruction Beyond Texture

PriMo is a prior-guided mobile reconstruction project built on top of MP-SfM. It targets a practical failure mode of mobile SfM: camera misregistration in low-texture, weak-texture, and repetitive-texture scenes.

Compared with the original MP-SfM codebase, PriMo mainly adds mobile prior pose integration, prior-pose-driven pair selection, prior pose refinement during registration, and optional PromptDA depth prompting for ARKit-like captures.

PriMo Pipeline

✨ Overview

PriMo focuses on making MP-SfM work better for mobile captures where image evidence alone is not reliable enough. In the current codebase, this is done by injecting prior information into the original pipeline rather than replacing it.

The repository currently provides:

  • ingestion of ARKit / mobile prior poses and intrinsics into the reconstruction workflow,
  • a more flexible pair selection strategy based on prior pose geometry instead of relying only on NetVLAD retrieval,
  • a prior pose refinement framework for both initialization and incremental registration,
  • optional integration of Prompt Depth Anything using low-resolution sensor depth as prompt.

In our current experiments, these changes reduce MP-SfM's camera misregistration in low-texture and weak-texture scenes.

🚀 Highlights

  • ARKit prior pose and intrinsics integration. PriMo reads mobile pose and camera metadata and injects them into the MP-SfM registration pipeline.
  • Prior-pose-based pair selection. PriMo adds a retrieval alternative that selects image pairs directly from prior pose geometry, replacing the original NetVLAD-only strategy when prior poses are available.
  • Prior pose refinement framework. PriMo refines prior-guided poses during initialization and next-view registration instead of treating the raw priors as fixed inputs.
  • Effective mitigation of weak-texture misregistration. The main practical gain of PriMo is improved stability and robustness in low-texture and weak-texture scenes where the original MP-SfM registration is more brittle.
  • Optional PromptDA depth prompting. PriMo integrates PromptDA so low-resolution ARKit depth can be used as a metric depth prompt.
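
As a rough illustration of what selecting pairs from prior pose geometry can look like, the sketch below keeps pairs whose prior camera centers are close and whose viewing directions roughly agree. This is not PriMo's actual implementation: the function name, the thresholds, and the input layout (dictionaries of camera centers and unit viewing directions) are assumptions made for the example.

```python
import math


def select_pairs_from_prior_poses(centers, view_dirs, max_dist=2.0,
                                  max_angle_deg=45.0, max_pairs_per_image=10):
    """Hypothetical pair selection from prior poses (not PriMo's real code).

    centers:   {image_name: (x, y, z)} prior camera centers in world units
    view_dirs: {image_name: (dx, dy, dz)} unit viewing directions in world frame
    """
    names = sorted(centers)
    cos_min = math.cos(math.radians(max_angle_deg))
    pairs = set()
    for a in names:
        candidates = []
        for b in names:
            if a == b:
                continue
            dist = math.dist(centers[a], centers[b])
            cos_angle = sum(x * y for x, y in zip(view_dirs[a], view_dirs[b]))
            # Keep neighbors that are nearby and look in a similar direction.
            if dist <= max_dist and cos_angle >= cos_min:
                candidates.append((dist, b))
        # Keep only the closest few neighbors for each image.
        for _, b in sorted(candidates)[:max_pairs_per_image]:
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


centers = {"a.jpg": (0.0, 0.0, 0.0), "b.jpg": (0.5, 0.0, 0.0),
           "c.jpg": (10.0, 0.0, 0.0), "d.jpg": (0.2, 0.0, 0.0)}
view_dirs = {"a.jpg": (0.0, 0.0, 1.0), "b.jpg": (0.0, 0.0, 1.0),
             "c.jpg": (0.0, 0.0, 1.0), "d.jpg": (0.0, 0.0, -1.0)}
pairs = select_pairs_from_prior_poses(centers, view_dirs)
```

In this toy input, c.jpg is too far away and d.jpg faces the opposite direction, so only the pair (a.jpg, b.jpg) survives. Because the selection uses geometry only, it does not depend on image appearance, which is the property that matters in weak-texture scenes.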

Setup

PriMo follows the same core environment stack as MP-SfM: build pyceres and pycolmap from source, then install the Python package and its dependencies. The provided Dockerfile automates these lower-level system and compilation steps for containerized use.

1. Clone the repository and submodules

git clone --recursive https://github.com/Cyrus-Hao/PriMo
cd PriMo

If you already cloned the repository without submodules, run:

git submodule update --init --recursive

For public release, all third-party projects, including third_party/PromptDA, are expected to be fetched through git submodules.

2. Install dependencies

After building pyceres and pycolmap from source, install the Python dependencies:

pip install -r requirements.txt
python -m pip install -e .

Optional

  • For faster inference with transformer-based models, install xformers.
  • For faster MASt3R matching, compile the CUDA kernels for RoPE:

DIR=$PWD
cd third_party/mast3r/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd "$DIR"

  • For DepthPro, the current repository follows the MP-SfM setup and keeps ml-depth-pro as a separate third-party dependency:

DIR=$PWD
cd third_party/ml-depth-pro/
pip install -e . --no-deps
cd "$DIR"

  • PromptDA is integrated slightly differently from the other depth models: PriMo imports it directly from third_party/PromptDA via sys.path, so running PriMo does not require an extra pip install -e third_party/PromptDA step. As long as the submodule exists and the checkpoint path is configured, PromptDA works inside PriMo.
  • If you want to run the official PromptDA standalone scripts outside PriMo, you can still install it manually:

DIR=$PWD
cd third_party/PromptDA/
pip install -e . --no-deps
cd "$DIR"

Docker

This repository already provides a Dockerfile, but it is closer to an MP-SfM-style base environment than a fully pre-baked PriMo runtime image.

Build the image locally:

docker build -t primo:latest .

Run it with the repository mounted:

docker run --gpus all -it --rm \
  --shm-size=8g \
  -v "$(pwd)":/workspace/PriMo \
  -w /workspace/PriMo primo:latest

Inside the container, finish by installing the package:

pip install -e .

Optional steps inside Docker are the same as above:

# optional MASt3R speed-up
DIR=$PWD
cd third_party/mast3r/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd "$DIR"

# optional DepthPro install
cd third_party/ml-depth-pro/
pip install -e . --no-deps
cd "$DIR"

# optional PromptDA standalone install
cd third_party/PromptDA/
pip install -e . --no-deps
cd "$DIR"

3. Prepare mobile pose and intrinsics files

If you want to use prior poses, the current PriMo workflow expects a camera_poses.yaml file and a matching intrinsics.yaml file.

PriMo includes camera_processor.py to convert mobile capture metadata into these PriMo-compatible YAML files. In the current project, this script is tailored to Stray Scanner exports and converts the ARKit-style files:

  • camera_matrix.csv
  • odometry.csv

into:

  • camera_poses.yaml
  • intrinsics.yaml

python camera_processor.py \
    --camera-matrix local/example/sofa/camera_matrix.csv \
    --odometry local/example/sofa/odometry.csv \
    --images-dir local/example/images \
    --output-dir local/example \
    --sample-interval 1

This writes:

  • local/example/camera_poses.yaml
  • local/example/intrinsics.yaml

In other words, camera_processor.py is the bridge from Stray Scanner's ARKit camera matrix and odometry exports to the prior-pose format used by PriMo.
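
To make the conversion more concrete, here is a minimal sketch of turning odometry rows into 4x4 pose matrices. It is not the code in camera_processor.py. It assumes that odometry.csv has the columns timestamp, frame, x, y, z, qx, qy, qz, qw and that each row describes a camera-to-world pose; the function names are hypothetical, and the YAML layout PriMo writes is deliberately not reproduced here.

```python
import csv
import io
import math


def quat_to_rotation(qx, qy, qz, qw):
    """Convert a quaternion (x, y, z, w order) into a 3x3 rotation matrix."""
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    qx, qy, qz, qw = qx / norm, qy / norm, qz / norm, qw / norm
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def pose_from_row(row):
    """Build a 4x4 pose from one odometry row (column names are assumed)."""
    R = quat_to_rotation(*(float(row[k]) for k in ("qx", "qy", "qz", "qw")))
    t = [float(row[k]) for k in ("x", "y", "z")]
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]


def load_poses(csv_file):
    """Map each frame id to its 4x4 pose matrix."""
    # skipinitialspace tolerates headers written as "timestamp, frame, x, ...".
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    return {row["frame"]: pose_from_row(row) for row in reader}


ODOMETRY = (
    "timestamp, frame, x, y, z, qx, qy, qz, qw\n"
    "0.0, 000000, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 1.0\n"
)
poses = load_poses(io.StringIO(ODOMETRY))
```

With the identity quaternion in the sample row, the resulting pose has an identity rotation and the translation (1, 2, 3). Check the column names and the pose convention against your own export before relying on a sketch like this.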

4. Run reconstruction

python reconstruct.py \
    --conf sp-lg_m3dv2 \
    --data_dir local/example \
    --intrinsics_pth local/example/intrinsics.yaml \
    --images_dir local/example/images \
    --cache_dir local/example/cache_dir \
    --pose_config_path local/example/camera_poses.yaml \
    --extract \
    --verbose 0

Outputs are written under local/example/sfm_outputs, while extracted priors and matches are cached under local/example/cache_dir.

What PriMo Changes over MP-SfM

PriMo is not a full rewrite of MP-SfM. Instead, it introduces a small number of focused changes around registration and mobile priors:

  1. Inject prior poses and intrinsics from mobile capture metadata.
  2. Select matching pairs from prior poses through pairs_from_prior_pose(...), instead of relying only on NetVLAD retrieval.
  3. Refine prior-guided initialization using normalized correspondences, an essential matrix induced by the prior relative pose, and Levenberg-Marquardt optimization over a 6-DoF update.
  4. Refine incremental prior poses with pose-only optimization from 2D-3D correspondences collected from already registered views.
  5. Optionally improve monocular depth priors with PromptDA using ARKit-like low-resolution depth prompts.
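
The epipolar constraint behind item 3 can be sketched as follows. Given a prior relative pose (R, t) with the convention X2 = R X1 + t, the induced essential matrix is E = [t]x R, and a normalized correspondence (x1, x2) consistent with that pose satisfies x2^T E x1 = 0. The helper names below are hypothetical, and the Levenberg-Marquardt refinement over the 6-DoF update is not shown; this only illustrates the residual such a refinement could minimize.

```python
def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def essential_from_relative_pose(R, t):
    """E = [t]_x R for the convention X2 = R @ X1 + t."""
    return matmul3(skew(t), R)


def epipolar_residual(E, x1, x2):
    """Algebraic residual x2^T E x1 for normalized homogeneous points."""
    Ex1 = [sum(E[i][k] * x1[k] for k in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))


# Toy prior: no rotation, unit translation along x.
R_prior = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_prior = (1.0, 0.0, 0.0)
E = essential_from_relative_pose(R_prior, t_prior)

# The 3D point (0, 0, 5) in camera 1 is (1, 0, 5) in camera 2.
x1 = (0.0, 0.0, 1.0)
x2_consistent = (0.2, 0.0, 1.0)
x2_shifted = (0.2, 0.1, 1.0)  # same match with a vertical error
res_consistent = epipolar_residual(E, x1, x2_consistent)
res_shifted = epipolar_residual(E, x1, x2_shifted)
```

The consistent match gives a residual of zero, while the shifted one does not. The scale of t does not change which matches satisfy the constraint, which is why only the direction of the prior translation matters at this stage.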

Prompted Depth with PromptDA

PriMo also supports Prompt Depth Anything for mobile captures with rough sensor depth, such as ARKit or Stray Scanner depth maps.

In the current project, the PromptDA integration is intended for Stray Scanner's exported ARKit depth maps. The expected setup is:

  • RGB images in local/example/images
  • aligned ARKit depth prompt maps in a directory such as .../depth
  • one prompt depth file per RGB frame, typically stored as .png
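
Before running the pipeline, it can help to check that every RGB frame has a prompt depth file. The sketch below assumes that prompt files share the image's filename stem (for example 000000.jpg and 000000.png); that naming rule and the function name are assumptions for illustration, so adapt them to how your export and PriMo's PromptDA wrapper actually match files.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_missing_prompts(images_dir, prompt_depth_dir, prompt_ext=".png"):
    """Return image names with no same-stem prompt depth file (assumed rule)."""
    images = sorted(
        p for p in Path(images_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    prompt_dir = Path(prompt_depth_dir)
    return [p.name for p in images
            if not (prompt_dir / (p.stem + prompt_ext)).exists()]
```

Calling find_missing_prompts("local/example/images", "/absolute/path/to/arkit_depth_png") returns an empty list when the two directories line up under this assumed naming rule.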

Configure configs/defaults/promptda.yaml with:

extractors:
  promptda:
    pretrained_path: /absolute/path/to/prompt_depth_anything_vitl.ckpt
    prompt_depth_dir: /absolute/path/to/arkit_depth_png

Then run:

python reconstruct.py \
    --conf defaults/promptda \
    --data_dir local/example \
    --intrinsics_pth local/example/intrinsics.yaml \
    --images_dir local/example/images \
    --cache_dir local/example/cache_dir \
    --pose_config_path local/example/camera_poses.yaml \
    --extract \
    --verbose 0

By default, PromptDA depth prompts are expected as .png files and interpreted as millimeter depth, which is converted internally to meters.
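
The unit conversion itself is simple. As a minimal sketch, assuming the prompt has already been decoded into rows of integer millimeter values (PNG decoding is not shown, and the function name is hypothetical):

```python
def prompt_depth_mm_to_m(depth_mm_rows):
    """Convert a 2D grid of millimeter depth values to meters."""
    return [[value / 1000.0 for value in row] for row in depth_mm_rows]


depth_m = prompt_depth_mm_to_m([[1500, 0], [250, 3000]])
```

Here a raw value of 1500 becomes 1.5 m. In a real pipeline this would normally be a single array operation on the decoded image rather than a nested list comprehension.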

So in practice:

  • camera_processor.py handles Stray Scanner -> ARKit prior pose / intrinsics
  • PromptDA handles Stray Scanner -> ARKit depth prompt maps

These two inputs complement each other in the current PriMo workflow.

Repository Structure

  • reconstruct.py: main entry point for reconstruction experiments.
  • camera_processor.py: converts mobile metadata into camera_poses.yaml and intrinsics.yaml.
  • configs/: high-level reconstruction presets and per-estimator defaults.
  • PriMo/sfm/mapper/registration.py: prior-pose-guided registration and prior pose refinement logic.
  • PriMo/extraction/pairs/prior_pose.py: prior-pose-based pair selection that can replace the original NetVLAD retrieval path.
  • PriMo/extraction/imagewise/geometry/models/depth/promptda.py: PromptDA wrapper used inside PriMo.
  • third_party/: external projects vendored as submodules.

In particular, the PromptDA integration is source-tree based: PriMo loads third_party/PromptDA directly, similar in spirit to the other vendored third-party geometry models, but without requiring a mandatory package installation step for the PriMo pipeline itself.
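
A source-tree import of this kind can be sketched as below. This is a generic illustration of the sys.path mechanism, not the code in PriMo's PromptDA wrapper; the helper name is hypothetical, and the package name in the final comment is an assumption about the submodule's layout.

```python
import sys
from pathlib import Path


def add_source_tree_to_path(source_dir):
    """Make a source checkout importable without installing it."""
    source_dir = Path(source_dir).resolve()
    if not source_dir.is_dir():
        raise FileNotFoundError(
            f"{source_dir} not found; did you fetch the git submodules?"
        )
    path_str = str(source_dir)
    # Insert only once so repeated calls do not grow sys.path.
    if path_str not in sys.path:
        sys.path.insert(0, path_str)
    return path_str


# Usage (assumed package name, shown as a comment only):
# add_source_tree_to_path("third_party/PromptDA")
# import promptda
```

The trade-off of this approach is that the dependency is not tracked by pip, so a missing or outdated submodule only shows up at import time. Failing early with a clear message, as above, makes that case easier to diagnose.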

Current Focus

PriMo is especially aimed at improving reconstruction robustness in cases where camera registration is unreliable because of:

  • weak visual texture,
  • repeated patterns or symmetric structure,
  • low-overlap mobile trajectories,
  • noisy mobile depth and pose priors.

The current codebase should be considered an actively evolving research release built around a specific goal: making MP-SfM more reliable for difficult mobile captures, especially those that suffer from low- or weak-texture misregistration.

🙏 Acknowledgements

PriMo is built on top of the excellent MP-SfM codebase. We especially thank the MP-SfM authors for making their implementation available and for providing the foundation that this project extends.

We also gratefully acknowledge the open-source projects used in this repository and in its integrated priors and matchers.

📸 Running on Your Own Capture

You can use the Stray Scanner app to capture your own data. This requires an iPhone 12 Pro or later Pro model, or a 2020 iPad Pro or later iPad Pro model.

In the current PriMo workflow, Stray Scanner is especially useful because:

  • its ARKit pose export can be converted by camera_processor.py into PriMo prior poses and intrinsics,
  • its ARKit depth export can be used as PromptDA depth prompts.
