Point a phone at a real object, get a clean, correctly-scaled, CAD-ready .glb mesh — reconstructed on-device, with the depth network running on the Snapdragon Hexagon NPU via ExecuTorch.
Built for the Qualcomm × Meta ExecuTorch Hackathon (Jun 27–28, 2026).
Faithful reconstruction (not generative): TSDF fusion of ARCore-aligned monocular depth maps.
RGB + ARCore pose + raw depth + intrinsics
│
├─► Depth-Anything-3 Small ......... dense RELATIVE depth (affine-invariant)
│ (DA-V2 Small = fallback) NPU-optimized for SM8750 (~43 ms/frame)
├─► affine scale/shift solver ...... fit metric ≈ s·pred + t vs ARCore sparse
│ metric depth → real meters (solve s AND t)
├─► TSDF fusion (Open3D) ........... average many views → one clean surface
├─► marching cubes ................. raw .glb mesh
└─► import_and_clean.py (Blender) .. watertight + normals + m→mm → CAD-ready .glb
Two depth sources by design: the monocular depth net gives a dense, smooth depth in unknown units; ARCore gives sparse-but-metric depth. The affine solver marries them so the mesh is both dense and correctly sized. See roadmap.md for the full execution plan, decision log, and risk register.
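The scale/shift fit described above can be sketched as an ordinary least-squares problem over the pixels where sparse metric depth is available. This is a minimal illustration, not the project's solver: the function name `fit_scale_shift` and the sample values are hypothetical, and it assumes `pred` and `metric` are expressed in the same domain (both depth-like). If the network output is disparity, the fit would be done against inverse metric depth instead. A robust variant (outlier rejection) is not shown.

```python
def fit_scale_shift(pred, metric):
    """Closed-form least-squares fit of metric ≈ s * pred + t.

    pred, metric: equal-length sequences of paired samples, taken at pixels
    where the sparse metric depth is valid.
    """
    n = len(pred)
    if n != len(metric) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_p = sum(pred) / n
    mean_m = sum(metric) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0.0:
        raise ValueError("predictions are constant; scale is unobservable")
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(pred, metric))
    s = cov / var_p                # slope: relative units -> meters
    t = mean_m - s * mean_p        # offset: shift of the relative depth
    return s, t


# Synthetic check: relative depths generated with a known scale and shift.
pred_samples = [0.10, 0.25, 0.40, 0.55, 0.80]
metric_samples = [2.0 * p + 0.3 for p in pred_samples]
s, t = fit_scale_shift(pred_samples, metric_samples)

# Once s and t are known, every dense pixel can be mapped to meters.
dense_metric = [s * p + t for p in (0.2, 0.6)]
```

Solving for both `s` and `t` matters because an affine-invariant prediction can be offset as well as scaled; fitting scale alone would bias every depth by the unmodelled shift.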
Depth net: Depth-Anything-3 Small (Apache-2.0, 24.7 M params, 518×518) is the host default and the on-device target — Qualcomm AI Hub already exports it NPU-optimized for the S25 SoC (SM8750) at ~43 ms/frame float on the Hexagon NPU. DA-V2 Small stays as the documented fallback. Generate disparities with depth_anything_v3.py; build the ExecuTorch .pte with export_da3_executorch.py (pip install -r requirements-da3.txt first). See LIVE_MESH_PLAN.md §5.0a for the research log.
pip install -r requirements-da3.txt
python3 depth_anything_v3.py --frames <dataset>/frames --output <dataset>/disparities
python3 export_da3_executorch.py --backend qnn --soc SM8750 -o da3_small_sm8750.pte

Galaxy S25 / S25+ / S25 Ultra → Snapdragon 8 Elite = SM8750 (Adreno 830 + Hexagon NPU).
⚠️ The S25 FE is Exynos, not Snapdragon, so the QNN lane (QnnPartitioner/QnnQuantizer/SM8750) does not target it. On an FE unit, use ExecuTorch's Samsung ENN backend, or fall back to a CPU (XNNPACK) .pte.
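The device-to-backend rule above can be written down as a small lookup so that an export or deploy script does not pick the QNN lane for the wrong phone. This is only a sketch: `DEVICE_TARGETS`, `CPU_FALLBACK` and `pick_target` are hypothetical names, and the backend labels are illustrative strings, not confirmed values of the `--backend` flag.

```python
# Hypothetical device table built from the notes above. The backend labels are
# illustrative names for this sketch, not flags of export_da3_executorch.py.
DEVICE_TARGETS = {
    "Galaxy S25": {"soc": "SM8750", "backend": "qnn"},
    "Galaxy S25+": {"soc": "SM8750", "backend": "qnn"},
    "Galaxy S25 Ultra": {"soc": "SM8750", "backend": "qnn"},
    "Galaxy S25 FE": {"soc": "Exynos", "backend": "enn"},
}

# Anything not in the table gets the portable CPU lane.
CPU_FALLBACK = {"soc": None, "backend": "xnnpack"}


def pick_target(device_name):
    """Return the SoC and backend lane for a device, defaulting to CPU."""
    return DEVICE_TARGETS.get(device_name, CPU_FALLBACK)


ultra = pick_target("Galaxy S25 Ultra")
fe = pick_target("Galaxy S25 FE")
```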
| File | What |
|---|---|
| roadmap.md | Full execution roadmap: DAG, 6 phases, 5 decisions, risk register, module cards |
| depth_anything_v3.py | DA3-Small host inference → pipeline disparities (drop-in for depth_model.py) |
| export_da3_executorch.py | DA3-Small → ExecuTorch .pte (XNNPACK CPU / QNN SM8750 NPU) |
| depth_model.py | DA-V2 Small host inference (fallback depth net) |
| requirements-da3.txt | Heavy DA3/torch/ExecuTorch extras (kept out of core requirements.txt) |
| import_and_clean.py | Host-side mesh cleanup: raw TSDF mesh → watertight, scaled, CAD-ready .glb |
| test_harness.sh | Offline smoke test (generates a sphere fixture, runs the script, validates) |
| orientation_test.sh | Distinct-dims box test; verifies the pipeline is orientation-preserving |
import_and_clean.py takes the reconstruction's output mesh (.glb, also .obj/.ply/.stl) and produces a clean mesh: import → join parts → voxel-remesh to watertight → consistent normals → scale m→mm → export. It writes both a .glb (for viewers/web) and an .stl beside it; CAD tools (Fusion 360, FreeCAD) can't read .glb, so the .stl is the actual CAD handoff. It exits non-zero on failure so the pipeline can gate on it.
Verify CAD-importability with cad_check.py (imports the STL through the OpenCASCADE kernel — FreeCAD's kernel — and reports imports-as-mesh-body and solid-convertible; needs cadquery-ocp).
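To gate a pipeline step on the cleanup script's exit status, a thin wrapper around the Blender invocation is usually enough. The sketch below is an assumption-laden illustration: `build_clean_command` and `run_cleanup` are hypothetical helpers, the flags are the ones listed in the script's options, and passing options after the two positional paths is assumed rather than confirmed.

```python
import subprocess


def build_clean_command(src, dst, scale=1000.0, voxel=0.005, remesh=True,
                        blender="blender"):
    """Build the argv list for a headless import_and_clean.py run."""
    cmd = [blender, "--background", "--python", "import_and_clean.py", "--",
           src, dst, "--scale", str(scale)]
    if remesh:
        cmd += ["--voxel", str(voxel)]   # voxel remesh size in meters
    else:
        cmd.append("--no-remesh")        # keep raw topology
    return cmd


def run_cleanup(src, dst, **options):
    """Run the cleanup step and report success as a boolean.

    Raises FileNotFoundError if the blender executable is not on PATH.
    """
    result = subprocess.run(build_clean_command(src, dst, **options))
    return result.returncode == 0


# Inspect the command without launching Blender.
default_cmd = build_clean_command("input.glb", "output.glb")
raw_cmd = build_clean_command("input.glb", "output.glb", remesh=False)
```

A caller would then stop the pipeline when `run_cleanup(...)` returns `False` instead of passing a broken mesh downstream.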
blender --background --python import_and_clean.py -- input.glb output.glb
# options:
# --scale FLOAT uniform scale (default 1000 = m→mm for CAD)
# --voxel FLOAT voxel remesh size in meters (default 0.005); smaller = finer
# --no-remesh keep raw topology (normals still fixed)
# --rotate-x DEG optional frame correction (default 0 — usually unneeded)
# --rotate-z DEG

Orientation note: Blender's glTF importer and exporter apply inverse +Y-up↔+Z-up conversions, so import → clean → export is orientation-preserving with no manual rotation (verified by orientation_test.sh). Use --rotate-x/--rotate-z only if a mesh comes in tipped.
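Why the round trip preserves orientation can be seen from the axis mappings themselves. The sketch below assumes the usual +Y-up to +Z-up convention, (x, y, z) → (x, -z, y), and its inverse; the function names are hypothetical and this is an illustration of the idea, not Blender's importer code.

```python
def gltf_to_blender(v):
    """+Y-up (glTF) to +Z-up (Blender): (x, y, z) -> (x, -z, y)."""
    x, y, z = v
    return (x, -z, y)


def blender_to_gltf(v):
    """+Z-up (Blender) to +Y-up (glTF): (x, y, z) -> (x, z, -y)."""
    x, y, z = v
    return (x, z, -y)


# A box with three distinct extents makes any axis swap visible.
box_dims = (0.10, 0.20, 0.30)
in_blender = gltf_to_blender(box_dims)
round_trip = blender_to_gltf(in_blender)
```

Because the second mapping is the exact inverse of the first, `round_trip` equals `box_dims`; a box with distinct dimensions is a useful fixture precisely because a swapped or flipped axis would change the tuple.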
Contract (input from the float pipeline): meters (1.0 = 1 m, ARCore world), glTF +Y-up. The script logs imported dims (pre-transform, m) every run — the fastest check that units are right (a coffee mug ≈ 0.10).
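The units check implied by the contract can be automated with a plausibility window on the imported bounding box. This is a hedged sketch: `check_units` and its thresholds (`min_m`, `max_m`) are hypothetical values chosen for hand-held objects, not anything the script is known to enforce.

```python
def check_units(dims_m, min_m=0.01, max_m=2.0, scale=1000.0):
    """Sanity-check bounding-box extents that are supposed to be in meters.

    Returns (ok, dims_mm): ok is False when the largest extent falls outside
    the plausible window, which usually means the mesh is not in meters.
    """
    largest = max(dims_m)
    ok = min_m <= largest <= max_m
    dims_mm = tuple(d * scale for d in dims_m)   # m -> mm for CAD
    return ok, dims_mm


# Mug-sized object in meters: passes the check.
mug_ok, mug_mm = check_units((0.08, 0.10, 0.08))

# The same mug exported in millimeters by mistake: flagged.
wrong_ok, _ = check_units((80.0, 100.0, 80.0))
```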
./test_harness.sh # end-to-end smoke test
./orientation_test.sh # axis-swap / orientation check

Both are offline (they generate their own fixtures) and exit non-zero on failure. Verified on Blender 5.1.
Host-side (~80% of de-risking runs before the phone is unboxed): float pipeline + quantization in progress; mesh cleanup + validation complete and tested. Phone integration is the final phase, not the first.