Home
We present a new, massively parallel method for high-quality multiview matching.
Source code for the paper:
S. Galliani, K. Lasinger and K. Schindler, Massively Parallel Multiview Stereopsis by Surface Normal Diffusion (supplementary material), ICCV 2015
IMPORTANT: If you use this software please cite the following in any resulting publication:
@InProceedings{Galliani_2015_ICCV,
author = {Galliani, Silvano and Lasinger, Katrin and Schindler, Konrad},
title = {Massively Parallel Multiview Stereopsis by Surface Normal Diffusion},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}
Requirements:
- CUDA >= 6.0
- NVIDIA video card with compute capability of at least 3.0, see https://en.wikipedia.org/wiki/CUDA#GPUs_supported
- OpenCV >= 2.4
- CMake
Tested configurations:
- Ubuntu GNU/Linux 14.04 with NVIDIA GTX 980
- Ubuntu GNU/Linux 14.04 (use their repository for the CUDA SDK and NVIDIA drivers)
- Windows with Visual Studio 2012/2013 (it works directly with CMake)
Tested GPUs:
- GT720 (PASS, but 30x slower than Titan X)
- GTX980 (PASS)
- Titan X (PASS)
- Quadro K4000 (PASS)
Use CMake on both Windows and Linux. On Linux it is as easy as:
cmake .
make
Gipuma itself is only a matcher. It computes a depth map with respect to the specified reference camera.
For each camera, gipuma computes a noisy depth map. The final fusion of the depth maps is obtained with fusibile.
Use a point cloud visualizer to open the resulting ply file. Meshlab is probably the best choice because it can render point clouds with normals.
Inside the gipuma directory, first download the Middlebury data:
./scripts/download-middlebury.sh
Then run the model you prefer (see the list of available scripts inside the scripts folder). For example, for the dino sparse ring:
./scripts/dinoSparseRing.sh
It will fuse the depth maps, ignoring points that project onto image pixels with an intensity lower than 15 (on a scale of 0-255). The result should match the Middlebury benchmark submission (excluding Poisson reconstruction).
http://roboimagedata.compute.dtu.dk/?page_id=24
If you have not downloaded it yet, you can get the sample dataset (warning: 6 GB of data):
./scripts/download-dtu.sh
Then to reconstruct image 1 using fast settings:
./scripts/dtu_fast.sh 1
or, for the accurate version:
./scripts/dtu_accurate.sh 1
To reconstruct other images, copy the corresponding folder into ./data/dtu/SampleSet/MVS\ Data/Rectified/
TODO
The minimum information Gipuma needs is the camera information and an image list. Gipuma relies on known camera information, which you can provide in 3 different ways:
Parameter Name | Syntax | Comment |
---|---|---|
pmvs_folder | -pmvs_folder <folder> | The easiest way is to point gipuma to the output of VisualSFM for pmvs. Images are taken from <pmvs>/visualize and cameras from <pmvs>/txt/. Additionally, 3D points are read from <pmvs>/bundle.rd.out |
krt_file | -krt_file <file> | Camera information is read from a file in the format specified by the Middlebury benchmark: http://vision.middlebury.edu/mview/data/ |
p_folder | -p_folder <folder> | Expects a folder of text files, one per image, each containing the 3x4 P matrix on 3 lines and named like the corresponding image with ".P" appended |
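To make the p_folder layout concrete, here is a minimal sketch that turns one line of a Middlebury-style parameter file (image name followed by K, R and t, where the projection matrix is P = K [R | t]) into a ".P" file. The function names, the space-separated number format and the sample camera are assumptions made for this example; they are not taken from the gipuma sources, so check the output against what gipuma actually reads.

```python
import os
import tempfile


def projection_from_krt(values):
    """Build the 3x4 matrix P = K [R | t] from 21 numbers (K, R, t, row-major)."""
    if len(values) != 21:
        raise ValueError("expected 9 (K) + 9 (R) + 3 (t) = 21 numbers")
    K = [values[0:3], values[3:6], values[6:9]]
    R = [values[9:12], values[12:15], values[15:18]]
    t = values[18:21]
    # [R | t] is the 3x4 matrix with t as its last column.
    Rt = [R[i] + [t[i]] for i in range(3)]
    return [[sum(K[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]


def write_p_file(krt_line, p_folder):
    """Write <p_folder>/<image name>.P with the P matrix on 3 lines."""
    name, *numbers = krt_line.split()
    P = projection_from_krt([float(x) for x in numbers])
    path = os.path.join(p_folder, name + ".P")
    with open(path, "w") as f:
        for row in P:
            # Assumption: the values on a row are separated by single spaces.
            f.write(" ".join(repr(v) for v in row) + "\n")
    return path


# Hypothetical camera: focal length 1000 px, principal point (320, 240),
# identity rotation, translation (0, 0, 5).
sample_line = ("image1.jpg 1000 0 320 0 1000 240 0 0 1 "
               "1 0 0 0 1 0 0 0 1 0 0 5")
out_dir = tempfile.mkdtemp()
p_path = write_p_file(sample_line, out_dir)
with open(p_path) as f:
    p_lines = f.read().splitlines()
print(os.path.basename(p_path))
print("\n".join(p_lines))
```

The file name keeps the image extension (image1.jpg becomes image1.jpg.P), following the "appended" wording in the table.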
If no pmvs folder is specified, provide a list of image filenames together with an image folder. For example:
./gipuma image1.jpg image2.jpg -img_folder images/ -krt_file Temple.txt
gipuma comes with a good-enough default parameter set, but some settings can be tuned to obtain the best results:
Parameter Name | Syntax and default Value | Comment |
---|---|---|
camera_idx | --camera_idx=00 | Sets the reference camera. The resulting depth map is computed with respect to this camera |
blocksize | --blocksize=19 | An important value that sets the patch size used for cost computation. The best value depends strongly on the image resolution and the object size. Suggested values range from 9 for Middlebury-size images (640x480) to 25 for the DTU dataset (1600x1200) |
iterations | --iterations=8 | Controls the amount of normal diffusion used for the reconstruction. A value larger than 8 rarely improves the reconstruction. Reducing it shortens the runtime at the expense of reconstruction quality |
min_angle and max_angle | --min_angle=5 --max_angle=45 | The reference camera is matched against the other cameras whose principal ray forms an intersection angle with its own within the specified range (in degrees). For datasets with many images a range of 10-30 degrees is suggested. For datasets with a sparse set of images (such as Middlebury SparseRing) a bigger range is needed |
max_views | --max_views=9 | If more than max_views cameras survive the angle selection, a random subset of max_views cameras is used. |
depth_min and depth_max | --depth_min=-1 --depth_max=-1 | These values set the minimum and maximum depth for the reconstruction in world coordinates. If they are not set, they are computed as the minimum and maximum range over all cameras from the 3D points inside the specified pmvs directory. If only a list of P matrices is given, they are computed from the minimum and maximum depth obtained when setting the viewing angle to the minimum and maximum specified |
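The interplay of min_angle, max_angle and max_views can be illustrated with a short sketch. It is not gipuma's implementation: the function names, the inclusive angle bounds and the ring of cameras are assumptions chosen for the example, and each principal ray is simply given as a 3D direction vector.

```python
import math
import random


def angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def select_views(ref_ray, other_rays, min_angle=5.0, max_angle=45.0,
                 max_views=9, rng=random):
    """Return indices of cameras whose principal ray lies in the angle range.

    If more than max_views cameras qualify, a random subset is kept.
    Assumption: both bounds are inclusive.
    """
    candidates = [i for i, ray in enumerate(other_rays)
                  if min_angle <= angle_deg(ref_ray, ray) <= max_angle]
    if len(candidates) > max_views:
        candidates = sorted(rng.sample(candidates, max_views))
    return candidates


def ray_at(deg):
    """Hypothetical principal ray of a camera rotated by deg about the y axis."""
    a = math.radians(deg)
    return (math.sin(a), 0.0, math.cos(a))


reference = ray_at(0)
others = [ray_at(d) for d in range(4, 64, 4)]  # 4, 8, ..., 60 degrees
selected = select_views(reference, others, rng=random.Random(0))
all_in_range = select_views(reference, others, max_views=20)
print(selected)
print(all_in_range)
```

In this example ten cameras fall inside the default 5-45 degree range, so with max_views=9 one of them is dropped at random. Narrowing the range to 10-30 degrees, as suggested for dense datasets, would leave fewer but better-conditioned candidates.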
The following parameters can be tweaked at will but in our experience they do not affect the overall reconstruction TODO
There are no bugs, just features! :P Please use the GitHub ticketing system. This way other users can benefit from the tips and solutions posted there.
Feel free to open a pull request or edit this wiki! :)
I'm quoting here part of an e-mail exchange I had where I describe some tips to get a better reconstruction:
First of all, you can try increasing the blocksize. It usually produces better normals and often a better reconstruction too, at the cost of higher runtime and sometimes stronger fattening effects around thin structures.
Then you can try to make the fusion stage more conservative by reducing disp_thresh. Keep normal_thresh and num_consistent fixed, since they are already quite big in my experience; you might even increase them.
The final suggestion is an undocumented tip. During the plane refinement stage you can try to decrease the disparity threshold and the disparity steps. Modify this line https://github.com/kysucix/gipuma/blob/master/gipuma.cu#L959 to:
for ( float deltaZ = maxdisp; deltaZ >= 0.1f; deltaZ = deltaZ / 4.0f ) {
or, more aggressively:
for ( float deltaZ = maxdisp; deltaZ >= 0.01f; deltaZ = deltaZ / 2.0f ) {
I will probably make this configurable at some point, so that this dirty hack is no longer needed. The drawback is a longer runtime, but with a higher chance of jumping out of local minima.
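To get a feeling for the runtime cost of the two loop variants above, the following sketch simply lists the deltaZ values each one visits. It mirrors only the loop header quoted in the tip; the helper name and the value of maxdisp are hypothetical, and the loop body in gipuma.cu is not modelled.

```python
def refinement_steps(maxdisp, min_delta, divisor):
    """List the deltaZ values visited by a loop of the form
    for (deltaZ = maxdisp; deltaZ >= min_delta; deltaZ = deltaZ / divisor).
    """
    if divisor <= 1.0:
        raise ValueError("divisor must be > 1, otherwise the loop never ends")
    steps = []
    delta = float(maxdisp)
    while delta >= min_delta:
        steps.append(delta)
        delta /= divisor
    return steps


maxdisp = 64.0  # hypothetical value, chosen only for illustration
moderate = refinement_steps(maxdisp, 0.1, 4.0)
aggressive = refinement_steps(maxdisp, 0.01, 2.0)
print(len(moderate), moderate)
print(len(aggressive), aggressive)
```

With this maxdisp the first variant performs 5 refinement steps and the aggressive one 13, which is where the longer runtime comes from: a smaller divisor and a lower stopping threshold both add steps.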
Sometimes it worked better to use "good" instead of "best_n" in cost_comb, so just give it a try. With "good", it is not the best n costs that are aggregated, but all the costs whose pairwise error is within good_factor (default 1.5) times the best cost. It is a more data-driven approach, which sometimes gave better results.
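The difference between the two cost_comb modes can be sketched as follows. This is an illustration of the idea described above, not gipuma's code: the function names are hypothetical, averaging is assumed as the aggregation step, and the pairwise costs are made up and assumed to be non-negative.

```python
def combine_best_n(costs, n):
    """Average the n lowest pairwise costs ("best_n" style)."""
    if not costs or n < 1:
        raise ValueError("need at least one cost and n >= 1")
    best = sorted(costs)[:n]
    return sum(best) / len(best)


def combine_good(costs, good_factor=1.5):
    """Average every cost within good_factor times the best cost ("good" style).

    Assumption: costs are non-negative, so the best cost always qualifies.
    """
    if not costs:
        raise ValueError("need at least one cost")
    threshold = good_factor * min(costs)
    good = [c for c in costs if c <= threshold]
    return sum(good) / len(good)


# Hypothetical pairwise costs between the reference view and five other views.
pairwise_costs = [0.25, 0.5, 0.375, 1.0, 2.0]
best_3 = combine_best_n(pairwise_costs, 3)
good = combine_good(pairwise_costs)
print(best_3, good)
```

Here "best_n" with n = 3 always uses three views, even if the third is clearly worse, while "good" keeps only the two views whose cost is within 1.5 times the best one. The number of views used by "good" therefore adapts to the data.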