( 💻 means code available)
- Accurate multiple view 3d reconstruction using patch-based stereo for large-scale scenes, TIP 2013
- Patchmatch based joint view selection and depthmap estimation, CVPR 2014 [video]
- 💻 Massively parallel multiview stereopsis by surface normal diffusion, ICCV 2015 [code] [project]
- 💻 Pixelwise view selection for unstructured multi-view stereo, ECCV 2016 [code] [project]
- 💻 Multi-Scale Geometric Consistency Guided Multi-View Stereo, CVPR 2019 [ACMH] [ACMM]
- Tapa-mvs: Textureless-aware patchmatch multi-view stereo, ICCV 2019
- Plane completion and filtering for multi-view stereo reconstruction, GCPR 2019
- 💻 Planar prior assisted patchmatch multi-view stereo, AAAI 2020 [ACMP]
- 💻 Multi-Scale Geometric Consistency Guided and Planar Prior Assisted Multi-View Stereo, T-PAMI 2022 [ACMMP]
- Academic Software
- Commercial Software
- DJI-Terra (大疆智图)
- Smart3D
- ContextCapture
- Visualization (point cloud & triangle mesh)
📃 PatchMatch for MVS (from Section 2 of TAPA-MVS paper)
The seminal PatchMatch paper by Barnes et al. proposed a general method to efficiently compute an approximate nearest-neighbor function defining the pixel-wise correspondence among patches of two images. The idea is to use a collaborative search which exploits local coherency. PatchMatch initializes each pixel of an image with a random guess about the location of its nearest neighbor in the second image. Then each pixel propagates its estimate to the neighboring pixels and, among these estimates, the most likely one is assigned to the pixel itself. As a result, the best estimates spread across the entire image.
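A minimal NumPy sketch of this random-initialization / propagation / random-search loop (the SSD patch cost, patch size, and scan order below are illustrative simplifications, not the exact choices of Barnes et al.):

```python
import numpy as np

def patch_cost(a, b, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patches anchored at (ay, ax) and (by, bx)."""
    return np.sum((a[ay:ay + p, ax:ax + p] - b[by:by + p, bx:bx + p]) ** 2)

def patchmatch_nnf(a, b, p=3, iters=4, seed=0):
    """Approximate nearest-neighbor field from image `a` to image `b` (match coordinates per pixel)."""
    a, b = a.astype(float), b.astype(float)
    rng = np.random.default_rng(seed)
    H, W = a.shape[0] - p + 1, a.shape[1] - p + 1        # valid patch anchors in a
    Hb, Wb = b.shape[0] - p + 1, b.shape[1] - p + 1      # valid patch anchors in b
    # 1) random initialization: every pixel guesses an arbitrary match in b
    match = np.stack([rng.integers(0, Hb, (H, W)), rng.integers(0, Wb, (H, W))], axis=-1)
    cost = np.array([[patch_cost(a, b, y, x, *match[y, x], p) for x in range(W)] for y in range(H)])

    def try_candidate(y, x, cy, cx):
        cy, cx = int(np.clip(cy, 0, Hb - 1)), int(np.clip(cx, 0, Wb - 1))
        c = patch_cost(a, b, y, x, cy, cx, p)
        if c < cost[y, x]:
            match[y, x], cost[y, x] = (cy, cx), c

    for it in range(iters):
        step = 1 if it % 2 == 0 else -1                  # alternate the scan direction each iteration
        for y in range(H)[::step]:
            for x in range(W)[::step]:
                # 2) propagation: reuse the displacement of the already-visited neighbors
                for ny, nx in ((y - step, x), (y, x - step)):
                    if 0 <= ny < H and 0 <= nx < W:
                        dy, dx = match[ny, nx, 0] - ny, match[ny, nx, 1] - nx
                        try_candidate(y, x, y + dy, x + dx)
                # 3) random search around the current best match, radius halved each time
                r = max(Hb, Wb)
                while r >= 1:
                    try_candidate(y, x,
                                  match[y, x, 0] + rng.integers(-r, r + 1),
                                  match[y, x, 1] + rng.integers(-r, r + 1))
                    r //= 2
    return match
```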
Bleyer et al. re-framed this method in the stereo matching realm. Indeed, for each image patch, stereo matching looks in the second image for the corresponding patch, i.e., the nearest neighbor in the sense of photometric consistency. To improve robustness, the matching function is not limited to fixed-size square windows: they extend PatchMatch to estimate a pixel-wise plane orientation, which defines the matching procedure over slanted support windows. Heise et al. integrated PatchMatch for stereo into a variational formulation to regularize the estimate with quadratic relaxation. This approach produces smoother depth estimates while preserving edge discontinuities.
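The key ingredient of PatchMatch Stereo is that each pixel stores a plane f = (a, b, c), and every pixel (u, v) inside the support window is matched at its own plane-induced disparity d = a*u + b*v + c instead of a single fronto-parallel disparity. A hedged sketch (the plane conversion follows Bleyer et al.'s parameterization; the plain absolute-difference cost and bilinear sampling are simplifications):

```python
import numpy as np

def plane_from_disparity_normal(u, v, disp, nx, ny, nz):
    """Convert a disparity and unit normal at pixel (u, v) into plane coefficients (a, b, c)
    such that d(u', v') = a*u' + b*v' + c, as in PatchMatch Stereo."""
    a = -nx / nz
    b = -ny / nz
    c = (nx * u + ny * v + nz * disp) / nz
    return a, b, c

def slanted_window_cost(left, right, u, v, plane, radius=3):
    """Photometric cost of a slanted support window: every pixel in the window is
    matched at its own plane-induced disparity (simple absolute-difference cost)."""
    a, b, c = plane
    h, w = left.shape
    cost = 0.0
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            uu, vv = u + du, v + dv
            if not (0 <= vv < h and 0 <= uu < w):
                continue
            d = a * uu + b * vv + c                 # per-pixel disparity from the plane
            ur = uu - d                             # corresponding column in the right image
            u0 = int(np.floor(ur))
            if not (0 <= u0 < w - 1):
                cost += 1e3                         # penalize matches falling outside the image
                continue
            t = ur - u0                             # bilinear (here: linear) interpolation weight
            r_val = (1 - t) * right[vv, u0] + t * right[vv, u0 + 1]
            cost += abs(float(left[vv, uu]) - r_val)
    return cost
```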
The previous works successfully applied the PatchMatch idea to the pair-wise stereo matching problem. The natural extension to Multi-View Stereo was proposed by Shen. Here the author selects a subset of camera pairs depending on the number of shared points computed by Structure from Motion and their mutual parallax angle. Then he estimates a depth map for each selected camera pair through a simplified version of the method of Bleyer et al. The algorithm refines the depth maps by enforcing consistency among multiple views, and finally merges them into a point cloud.
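A hedged sketch of this style of view selection: candidate source views are ranked by the number of SfM points they share with the reference view, keeping only those whose mean parallax angle falls in a reasonable range (the data layout, thresholds, and number of retained views are illustrative, not Shen's exact values):

```python
import numpy as np

def parallax_deg(point, c_ref, c_src):
    """Angle (degrees) at a 3D point between the rays to the two camera centers."""
    r1, r2 = c_ref - point, c_src - point
    cosang = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def select_source_views(ref_id, cameras, tracks, min_shared=30,
                        min_angle=5.0, max_angle=45.0, num_views=4):
    """cameras: {cam_id: camera center (3,)}; tracks: {point_id: (xyz (3,), set of observing cam_ids)}.
    Returns the ids of the best source views for the reference camera."""
    scores = []
    for src_id, c_src in cameras.items():
        if src_id == ref_id:
            continue
        shared = [xyz for xyz, obs in tracks.values() if ref_id in obs and src_id in obs]
        if len(shared) < min_shared:
            continue
        mean_angle = np.mean([parallax_deg(np.asarray(p), cameras[ref_id], c_src) for p in shared])
        if min_angle <= mean_angle <= max_angle:
            scores.append((len(shared), src_id))      # more shared SfM points -> better candidate
    return [sid for _, sid in sorted(scores, reverse=True)[:num_views]]
```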
A different multi-view approach by Galliani et al. modifies the PatchMatch propagation scheme so that the computation can better exploit GPU parallelization. Differently from Shen, they aggregate, for each reference camera, a set of matching costs computed from different source images. One of the major drawbacks of these approaches is that depth estimation and camera pair selection are decoupled. Xu and Tao recently proposed an attempt to overcome this issue: they extended the approach with a more efficient propagation pattern and, in particular, an optimization procedure that jointly considers all the views and all the depth hypotheses.
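A hedged sketch of the per-reference-camera cost aggregation mentioned above: one depth hypothesis is scored against every source image and only the K cheapest costs are averaged, so occluded or badly matching views do not spoil the score (the NCC-based cost and the best-K rule are in the spirit of Gipuma; K is an illustrative parameter):

```python
import numpy as np

def ncc_cost(patch_ref, patch_src, eps=1e-6):
    """Matching cost in [0, 2]: one minus the normalized cross-correlation of two patches."""
    a = patch_ref - patch_ref.mean()
    b = patch_src - patch_src.mean()
    ncc = np.sum(a * b) / (np.sqrt(np.sum(a * a) * np.sum(b * b)) + eps)
    return 1.0 - ncc

def aggregated_cost(patch_ref, patches_src, k=3):
    """Aggregate the photo-consistency of one hypothesis over several source views:
    keep only the K cheapest per-view costs before averaging."""
    costs = sorted(ncc_cost(patch_ref, p) for p in patches_src)
    return float(np.mean(costs[:k]))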
Rather than considering the whole set of images to compute the matching costs, Zheng et al. proposed an elegant method to deal with view selection. They designed a robust method framing the joint depth estimation and pixel-wise view selection problem within a variational approximation framework. Following a generalized Expectation-Maximization paradigm, they alternate between depth updates with a PatchMatch propagation scheme, keeping the view selection fixed, and pixel-wise view inference with the forward-backward algorithm, keeping the depth fixed.
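A hedged, much-simplified sketch of the view-inference half of that alternation: for each source view, a two-state (selected / occluded) chain along an image row is smoothed with the forward-backward algorithm, giving per-pixel view-selection probabilities; the subsequent depth update would then weight each view's matching cost by these probabilities. The emission model, transition probability, and binary-state simplification below are illustrative, not Zheng et al.'s exact formulation:

```python
import numpy as np

def view_selection_posteriors(costs, sigma=0.6, p_stay=0.999):
    """Per-pixel probability that each source view is 'selected' (unoccluded) along one image row.

    costs: (num_views, width) photometric costs of the current depth estimates
    against each source view. For every view, a two-state chain (selected / occluded)
    is smoothed with the forward-backward algorithm.
    """
    costs = np.asarray(costs, dtype=float)
    num_views, width = costs.shape
    # emission likelihoods: a selected view should produce a low cost,
    # an occluded view produces costs roughly uniformly
    lik_sel = np.exp(-costs ** 2 / (2 * sigma ** 2))
    lik_occ = np.full(costs.shape, 0.5)
    trans = np.array([[p_stay, 1 - p_stay],              # state 0: selected, state 1: occluded
                      [1 - p_stay, p_stay]])
    post = np.zeros(costs.shape)
    for v in range(num_views):
        emis = np.stack([lik_sel[v], lik_occ[v]])        # (2, width)
        # forward pass
        alpha = np.zeros((2, width))
        alpha[:, 0] = 0.5 * emis[:, 0]
        alpha[:, 0] /= alpha[:, 0].sum()
        for x in range(1, width):
            alpha[:, x] = emis[:, x] * (trans.T @ alpha[:, x - 1])
            alpha[:, x] /= alpha[:, x].sum()
        # backward pass
        beta = np.ones((2, width))
        for x in range(width - 2, -1, -1):
            beta[:, x] = trans @ (emis[:, x + 1] * beta[:, x + 1])
            beta[:, x] /= beta[:, x].sum()
        gamma = alpha * beta                             # unnormalized state posteriors
        post[v] = gamma[0] / gamma.sum(axis=0)           # P(view v selected) per pixel
    return post
```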
Schönberger et al. extended this method to jointly estimate per-pixel depths and normals, so that, differently from Zheng et al., knowledge of the normals enables slanted support windows and avoids the fronto-parallel assumption. They also add view-dependent priors to select the views that are more likely to induce robust matching cost computation.
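With a per-pixel plane hypothesis (depth plus normal), the reference support window is warped into a source view by the plane-induced homography H = K_src (R + t n^T / d) K_ref^{-1}, where the plane is n^T X = d in the reference frame and x_src = R x_ref + t. A hedged sketch (the conventions and variable names are assumptions, not COLMAP's exact implementation):

```python
import numpy as np

def plane_induced_homography(K_ref, K_src, R, t, n, depth, pixel):
    """Homography mapping the reference support window onto a source view for a
    plane hypothesis (unit normal n, z-depth) at `pixel` = (u, v).
    Convention: x_src = R @ x_ref + t; the plane is n . X = d_plane in the reference frame."""
    # back-project the pixel to the 3D point lying on the hypothesized plane (z-depth parameterization)
    X_ref = depth * np.linalg.inv(K_ref) @ np.array([pixel[0], pixel[1], 1.0])
    d_plane = float(n @ X_ref)                       # plane offset such that X_ref satisfies n . X = d
    H = K_src @ (R + np.outer(t, n) / d_plane) @ np.linalg.inv(K_ref)
    return H / H[2, 2]

def warp(H, u, v):
    """Apply the homography to a reference pixel (u, v)."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

Comparing the reference window against the source window sampled at the warped coordinates gives the photo-consistency score of the (depth, normal) hypothesis; the per-pixel normal is what removes the fronto-parallel bias.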
Contributions to this repo are welcome!