PriMo is a prior-guided mobile reconstruction project built on top of MP-SfM. It targets a practical failure mode of mobile SfM: camera misregistration in low-texture, weak-texture, and repetitive-texture scenes.
Compared with the original MP-SfM codebase, PriMo mainly adds mobile prior pose integration, prior-pose-driven pair selection, prior pose refinement during registration, and optional PromptDA depth prompting for ARKit-like captures.
PriMo focuses on making MP-SfM work better for mobile captures where image evidence alone is not reliable enough. In the current codebase, this is done by injecting prior information into the original pipeline rather than replacing it.
The repository currently provides:
- ingestion of ARKit / mobile prior poses and intrinsics into the reconstruction workflow,
- a more flexible pair selection strategy based on prior pose geometry instead of relying only on NetVLAD retrieval,
- a prior pose refine framework for both initialization and incremental registration,
- optional integration of Prompt Depth Anything using low-resolution sensor depth as prompt.
In our experiments so far, these changes reduce MP-SfM misregistration in low-texture and weak-texture scenes.
- ARKit prior pose and intrinsics integration. PriMo reads mobile pose and camera metadata and injects them into the MP-SfM registration pipeline.
- Prior-pose-based pair selection. PriMo adds a retrieval alternative that selects image pairs directly from prior pose geometry, replacing the original NetVLAD-only strategy when prior poses are available.
- Prior pose refine framework. PriMo refines prior-guided poses during initialization and next-view registration instead of using the raw priors as fixed inputs.
- Effective mitigation of weak-texture misregistration. The main practical gain of PriMo is improved stability and robustness in low-texture and weak-texture scenes where the original MP-SfM registration is more brittle.
- Optional PromptDA depth prompting. PriMo integrates PromptDA so low-resolution ARKit depth can be used as a metric depth prompt.
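To make the idea of selecting pairs from prior pose geometry concrete, here is a minimal sketch. It pairs two images when their prior camera centers are close and their viewing directions are similar. The function name `select_pairs_from_poses`, the input layout, and both thresholds are assumptions made for illustration; this is not the repository's implementation.

```python
import math
from itertools import combinations


def select_pairs_from_poses(poses, max_dist=1.5, max_angle_deg=45.0):
    """Pick image pairs from prior poses (hypothetical helper).

    poses: dict mapping image name -> (center, forward), where center is the
    camera position in world coordinates and forward is a unit viewing
    direction in the same frame.
    """
    cos_min = math.cos(math.radians(max_angle_deg))
    pairs = []
    for (a, (ca, fa)), (b, (cb, fb)) in combinations(sorted(poses.items()), 2):
        dist = math.dist(ca, cb)                         # baseline length
        cos_angle = sum(x * y for x, y in zip(fa, fb))   # view similarity
        if dist <= max_dist and cos_angle >= cos_min:
            pairs.append((a, b))
    return pairs


example_poses = {
    "a": ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    "b": ((0.5, 0.0, 0.0), (0.0, 0.0, 1.0)),   # close, same direction
    "c": ((0.5, 0.0, 0.0), (0.0, 0.0, -1.0)),  # close, opposite direction
    "d": ((10.0, 0.0, 0.0), (0.0, 0.0, 1.0)),  # far away
}
example_pairs = select_pairs_from_poses(example_poses)
```

Unlike appearance-based retrieval, a rule of this kind does not depend on image content, which is why it can help in repetitive or texture-poor scenes.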
PriMo follows the same core environment stack as MP-SfM: build pyceres and pycolmap from source, then install the Python package and its dependencies. The provided Dockerfile automates these lower-level system and compilation steps for containerized use.
```bash
git clone --recursive https://github.com/Cyrus-Hao/PriMo
cd PriMo
```
If you already cloned the repository without submodules, run:
```bash
git submodule update --init --recursive
```
For public release, all third-party projects, including `third_party/PromptDA`, are expected to be fetched through git submodules.
After building pyceres and pycolmap from source, install the Python dependencies:
```bash
pip install -r requirements.txt
python -m pip install -e .
```
Optional
- For faster inference with transformer-based models, install xformers.
- For faster MASt3R matching, compile the CUDA kernels for RoPE:
```bash
DIR=$PWD
cd third_party/mast3r/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd $DIR
```
- For `DepthPro`, the current repository follows the MP-SfM setup and keeps `ml-depth-pro` as a separate third-party dependency:
```bash
DIR=$PWD
cd third_party/ml-depth-pro/
pip install -e . --no-deps
cd $DIR
```
`PromptDA` is integrated slightly differently from the other depth models: PriMo imports it directly from `third_party/PromptDA` via `sys.path`, so running PriMo does not require an extra `pip install -e third_party/PromptDA` step. As long as the submodule exists and the checkpoint path is configured, PromptDA works inside PriMo.
- If you want to run the official PromptDA standalone scripts outside PriMo, you can still install it manually:
```bash
DIR=$PWD
cd third_party/PromptDA/
pip install -e . --no-deps
cd $DIR
```
Docker
This repository already provides a Dockerfile, but it is closer to an MP-SfM-style base environment than a fully pre-baked PriMo runtime image.
Build the image locally:
```bash
docker build -t primo:latest .
```
Run it with the repository mounted:
```bash
docker run --gpus all -it --rm \
  --shm-size=8g \
  -v $(pwd):/workspace/PriMo \
  -w /workspace/PriMo primo:latest
```
Inside the container, finish by installing the package:
```bash
pip install -e .
```
Optional steps inside Docker are the same as above:
```bash
# optional MASt3R speed up
DIR=$PWD
cd third_party/mast3r/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd $DIR
# optional DepthPro install
cd third_party/ml-depth-pro/
pip install -e . --no-deps
cd $DIR
# optional PromptDA standalone install
cd third_party/PromptDA/
pip install -e . --no-deps
cd $DIR
```
If you want to use prior poses, the current PriMo workflow expects a camera_poses.yaml file and matching intrinsics.yaml.
PriMo includes camera_processor.py to convert mobile capture metadata into these PriMo-compatible YAML files. In the current project, this script is specifically tailored to Stray Scanner exports, and is meant to process the ARKit-style:
- `camera_matrix.csv`
- `odometry.csv`

into:
- `camera_poses.yaml`
- `intrinsics.yaml`
```bash
python camera_processor.py \
  --camera-matrix local/example/sofa/camera_matrix.csv \
  --odometry local/example/sofa/odometry.csv \
  --images-dir local/example/images \
  --output-dir local/example \
  --sample-interval 1
```
This writes:
- `local/example/camera_poses.yaml`
- `local/example/intrinsics.yaml`
In other words, camera_processor.py is the bridge from Stray Scanner's ARKit prior pose matrix / odometry export to the prior-pose format used by PriMo.
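For readers unfamiliar with this kind of export, the sketch below shows how odometry rows could be turned into 4x4 pose matrices. It assumes each row carries a position `x, y, z` and a unit quaternion `qx, qy, qz, qw` next to a `frame` column; that column layout, the function names, and the handling of `sample_interval` are assumptions for illustration and do not describe the actual logic or output format of `camera_processor.py`. Coordinate-convention changes between ARKit and the reconstruction pipeline are deliberately left out.

```python
import csv
import io
import math


def quat_to_rotmat(qx, qy, qz, qw):
    """Convert a quaternion (x, y, z, w) to a 3x3 rotation matrix."""
    n = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    qx, qy, qz, qw = qx / n, qy / n, qz / n, qw / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def odometry_to_poses(csv_file, sample_interval=1):
    """Read odometry rows and return a list of (frame, 4x4 matrix).

    The column names used here are an assumed layout, not a documented one.
    """
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    poses = []
    for i, row in enumerate(reader):
        if i % sample_interval:
            continue  # keep every sample_interval-th row
        R = quat_to_rotmat(*(float(row[k]) for k in ("qx", "qy", "qz", "qw")))
        t = [float(row[k]) for k in ("x", "y", "z")]
        T = [R[r] + [t[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
        poses.append((row["frame"], T))
    return poses


sample_csv = (
    "timestamp, frame, x, y, z, qx, qy, qz, qw\n"
    "0.0, 000000, 1.0, 2.0, 3.0, 0, 0, 0, 1\n"
    "0.1, 000001, 0, 0, 0, 0, 0, 0.70710678, 0.70710678\n"
)
sample_poses = odometry_to_poses(io.StringIO(sample_csv))
```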
```bash
python reconstruct.py \
  --conf sp-lg_m3dv2 \
  --data_dir local/example \
  --intrinsics_pth local/example/intrinsics.yaml \
  --images_dir local/example/images \
  --cache_dir local/example/cache_dir \
  --pose_config_path local/example/camera_poses.yaml \
  --extract \
  --verbose 0
```
Outputs are written under `local/example/sfm_outputs`, while extracted priors and matches are cached under `local/example/cache_dir`.
PriMo is not a full rewrite of MP-SfM. Instead, it introduces a small number of focused changes around registration and mobile priors:
- Inject prior poses and intrinsics from mobile capture metadata.
- Select matching pairs from prior poses through `pairs_from_prior_pose(...)`, instead of relying only on NetVLAD retrieval.
- Refine prior-guided initialization using normalized correspondences, an essential matrix induced by the prior relative pose, and Levenberg-Marquardt optimization over a 6-DoF update.
- Refine incremental prior poses with pose-only optimization from 2D-3D correspondences collected from already registered views.
- Optionally improve monocular depth priors with PromptDA using ARKit-like low-resolution depth prompts.
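The initialization step in the list above relies on the essential matrix induced by a relative pose. As a minimal sketch of that relationship, the code below builds E = [t]x R from a relative pose (x2 = R x1 + t) and evaluates the algebraic epipolar residual x2^T E x1 on normalized image coordinates. The function names are hypothetical, and the Levenberg-Marquardt refinement itself is not shown; a residual of this form is only one plausible quantity such an optimizer might minimize, not necessarily the one used in PriMo.

```python
def skew(t):
    """Cross-product matrix [t]x of a 3-vector."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def essential_from_relative_pose(R, t):
    """E = [t]x R for a relative pose that maps camera-1 points to camera 2."""
    return matmul3(skew(t), R)


def epipolar_residual(E, x1, x2):
    """Algebraic residual x2^T E x1 for normalized coordinates (x, y)."""
    p1 = (x1[0], x1[1], 1.0)
    p2 = (x2[0], x2[1], 1.0)
    Ep1 = [sum(E[r][c] * p1[c] for c in range(3)) for r in range(3)]
    return sum(p2[r] * Ep1[r] for r in range(3))


# Pure sideways translation: a 3D point (0.5, 0.2, 2.0) seen in camera 1
# becomes (1.5, 0.2, 2.0) in camera 2.
R_prior = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_prior = (1.0, 0.0, 0.0)
E_prior = essential_from_relative_pose(R_prior, t_prior)
consistent = epipolar_residual(E_prior, (0.25, 0.1), (0.75, 0.1))
inconsistent = epipolar_residual(E_prior, (0.25, 0.1), (0.75, 0.2))
```

A correspondence that agrees with the prior pose gives a residual near zero, while a mismatched one does not, so the residual indicates how well a candidate pose explains the matches.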
PriMo also supports Prompt Depth Anything for mobile captures with rough sensor depth, such as ARKit or Stray Scanner depth maps.
In the current project, the PromptDA integration is intended for Stray Scanner's exported ARKit depth maps. The expected setup is:
- RGB images in `local/example/images`
- aligned ARKit depth prompt maps in a directory such as `.../depth`
- one prompt depth file per RGB frame, typically stored as `.png`
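Before running the pipeline, it can be useful to check that every RGB frame has a prompt depth file. The sketch below does this under the assumption that an image and its depth prompt share the same file stem (for example `000012.jpg` and `000012.png`); that naming rule and the helper name `match_depth_prompts` are illustrative assumptions, not something the PriMo code is known to enforce.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def match_depth_prompts(images_dir, depth_dir, depth_ext=".png"):
    """Pair each RGB frame with a depth prompt of the same stem.

    Returns (matched, missing): a dict image name -> depth path, and a list
    of image names that have no depth prompt.
    """
    images = sorted(
        p for p in Path(images_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    matched, missing = {}, []
    for img in images:
        depth = Path(depth_dir) / (img.stem + depth_ext)
        if depth.is_file():
            matched[img.name] = depth
        else:
            missing.append(img.name)
    return matched, missing
```

A non-empty `missing` list is a sign that frames were subsampled or renamed differently for the two directories.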
Configure configs/defaults/promptda.yaml with:
```yaml
extractors:
  promptda:
    pretrained_path: /absolute/path/to/prompt_depth_anything_vitl.ckpt
    prompt_depth_dir: /absolute/path/to/arkit_depth_png
```
Then run:
```bash
python reconstruct.py \
  --conf defaults/promptda \
  --data_dir local/example \
  --intrinsics_pth local/example/intrinsics.yaml \
  --images_dir local/example/images \
  --cache_dir local/example/cache_dir \
  --pose_config_path local/example/camera_poses.yaml \
  --extract \
  --verbose 0
```
By default, PromptDA depth prompts are expected as `.png` files and interpreted as millimeter depth, which is converted internally to meters.
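The millimeter-to-meter conversion amounts to a division by 1000. The sketch below applies it to a small grid of integer depth values and marks zero readings as missing; treating `0` as "no measurement" and the helper name `depth_mm_to_m` are assumptions for illustration, not a description of PriMo's internal handling.

```python
def depth_mm_to_m(depth_mm_rows, invalid_value=0):
    """Convert integer millimeter depths to meters.

    depth_mm_rows: nested list of integer depth values in millimeters.
    Values equal to invalid_value are returned as None (assumed convention).
    """
    return [
        [None if v == invalid_value else v / 1000.0 for v in row]
        for row in depth_mm_rows
    ]


depth_m = depth_mm_to_m([[0, 1500], [250, 2000]])
```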
So in practice:
- `camera_processor.py` handles Stray Scanner -> ARKit prior pose / intrinsics
- `PromptDA` handles Stray Scanner -> ARKit depth prompt maps
These two inputs complement each other in the current PriMo workflow.
- `reconstruct.py`: main entry point for reconstruction experiments.
- `camera_processor.py`: converts mobile metadata into `camera_poses.yaml` and `intrinsics.yaml`.
- `configs/`: high-level reconstruction presets and per-estimator defaults.
- `PriMo/sfm/mapper/registration.py`: prior-pose-guided registration and prior pose refinement logic.
- `PriMo/extraction/pairs/prior_pose.py`: prior-pose-based pair selection that can replace the original NetVLAD retrieval path.
- `PriMo/extraction/imagewise/geometry/models/depth/promptda.py`: PromptDA wrapper used inside PriMo.
- `third_party/`: external projects vendored as submodules.
In particular, the PromptDA integration is source-tree based: PriMo loads third_party/PromptDA directly, similar in spirit to the other vendored third-party geometry models, but without requiring a mandatory package installation step for the PriMo pipeline itself.
PriMo is especially aimed at improving reconstruction robustness in cases where camera registration is unreliable because of:
- weak visual texture,
- repeated patterns or symmetric structure,
- low-overlap mobile trajectories,
- noisy mobile depth and pose priors.
The current codebase should be considered an actively evolving research release built around a specific goal: making MP-SfM more reliable for difficult mobile captures, especially those that suffer from low- or weak-texture misregistration.
PriMo is built on top of the excellent MP-SfM codebase. We especially thank the MP-SfM authors for making their implementation available and for providing the foundation that this project extends.
We also gratefully acknowledge the open-source projects used in this repository or its integrated priors and matchers:
- DUSt3R
- MASt3R
- Metric3D
- Depth Pro
- Depth-Anything-V2
- Prompt Depth Anything
- DSINE
- LightGlue
- SuperPoint
- NetVLAD
We especially thank MP-SfM, since PriMo directly extends its reconstruction framework rather than starting from scratch.
You can use the Stray Scanner app to capture your own data. This requires a LiDAR-equipped device: an iPhone 12 Pro or later Pro model, or a 2020 iPad Pro or later Pro model.
In the current PriMo workflow, Stray Scanner is especially useful because:
- its ARKit pose export can be converted by `camera_processor.py` into PriMo prior poses and intrinsics,
- its ARKit depth export can be used as PromptDA depth prompts.
