
CT-MVSNet: Curvature-Guided Multi-View Stereo with Transformers

Recently, the proliferation of dynamic scale convolution modules has simplified feature correspondence across multiple views. Concurrently, Transformers have proven effective in enhancing multi-view stereo (MVS) reconstruction by facilitating feature interactions across views. In this paper, we present CT-MVSNet, based on an in-depth study of feature extraction and matching in MVS. By exploring inter-view relationships and measuring the receptive field size and feature information on the image surface through surface curvature, our method adapts to various candidate scales of curvature. Consequently, this module outperforms existing networks in adaptively extracting more detailed features for precise cost computation. Furthermore, to better identify inter-view similarity relationships, we introduce a Transformer-based feature matching module. Leveraging Transformer principles, we align features from multiple source views with those of the reference view, enhancing the accuracy of feature matching. Additionally, guided by the proposed curvature-guided dynamic scale convolution and Transformer-based feature matching, we introduce a feature-matching similarity measurement module that tightly integrates curvature and inter-view similarity measurement, leading to improved reconstruction accuracy. Our approach demonstrates advanced performance on the DTU dataset and the Tanks and Temples benchmark. Details are described in our paper:

CT-MVSNet: Curvature-Guided Multi-View Stereo with Transformers

Licheng Sun, Liang Wang

CT-MVSNet is more robust in challenging regions and generates more accurate depth maps. The resulting point clouds are more complete, with finer details.
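To make the Transformer-based feature matching described above more concrete, here is a minimal PyTorch sketch of cross-view attention that aligns source-view features to the reference view. It is an illustration only: the module name CrossViewAttention, the single-head formulation, and the feature dimensions are our assumptions, not the implementation in this repository.

# Minimal sketch (NOT the repository implementation): cross-view attention that
# aligns source-view features to reference-view features, in the spirit of the
# Transformer-based feature matching module described above.
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.q = nn.Linear(dim, dim)   # queries from the reference view
        self.k = nn.Linear(dim, dim)   # keys from a source view
        self.v = nn.Linear(dim, dim)   # values from a source view
        self.scale = dim ** -0.5

    def forward(self, ref_feat, src_feat):
        # ref_feat, src_feat: (B, C, H, W) feature maps from the backbone
        B, C, H, W = ref_feat.shape
        q = self.q(ref_feat.flatten(2).transpose(1, 2))   # (B, HW, C)
        k = self.k(src_feat.flatten(2).transpose(1, 2))   # (B, HW, C)
        v = self.v(src_feat.flatten(2).transpose(1, 2))   # (B, HW, C)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        out = attn @ v                                     # source features aligned to the reference
        return out.transpose(1, 2).reshape(B, C, H, W)

# Usage: align one source view to the reference view before cost-volume construction.
ref = torch.randn(1, 32, 64, 80)
src = torch.randn(1, 32, 64, 80)
aligned = CrossViewAttention(dim=32)(ref, src)
print(aligned.shape)  # torch.Size([1, 32, 64, 80])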

If you find any errors in our code, please feel free to open an issue.

⚙ Setup

1. Recommended environment

  • PyTorch 1.9.1
  • Python 3.7

2. DTU Dataset

Training Data. We adopt the full-resolution ground-truth depth provided in CasMVSNet or MVSNet. Download the DTU training data and Depth_raw. Unzip them and put Depth_raw into the dtu_training folder. The structure should look like:

dtu_training                          
       ├── Cameras                
       ├── Depths   
       ├── Depths_raw
       └── Rectified

Testing Data. Download the DTU testing data and unzip it. The structure should look like:

dtu_testing                          
       ├── Cameras                
       ├── scan1   
       ├── scan2
       ├── ...

3. Tanks and Temples Dataset

Testing Data. Download Tanks and Temples and unzip it. We adopt the camera parameters of the short-depth-range version (included in the download), so you should manually replace the cams folder in the intermediate folder with its short-depth-range version. The structure should look like:

tanksandtemples                          
       ├── advanced                 
       │   ├── Auditorium       
       │   ├── ...  
       └── intermediate
           ├── Family       
           ├── ...          

📊 Testing

1. Download models

Put the downloaded model under <your model path>.

2. DTU testing

Fusibile installation. Since we adopt Gipuma to filter and fuse the point clouds on the DTU dataset, you need to install Fusibile first. Download fusibile to <your fusibile path> and execute the following commands:

cd <your fusibile path>
cmake .
make

If nothing goes wrong, you will get an executable named fusibile. Most build errors are caused by a mismatched GPU compute capability; in that case, adjust the CUDA architecture flags in fusibile's CMakeLists.txt to match your GPU.

Point generation. To reproduce the results from our paper, first specify datapath as <your dtu_testing path>, outdir as <your output save path>, resume as <your model path>, and fusibile_exe_path as <your fusibile path>/fusibile in the shell file ./scripts/dtu_test.sh, and then run:

bash ./scripts/dtu_test.sh

Note that we use the CT-MVSNet_dtu checkpoint when testing on DTU.

Point testing. Move the point clouds generated for each scene into a folder dtu_points and rename them to the mvsnet001_l3.ply format (the three middle digits are the scan number); a renaming sketch is given below. Then specify dataPath, plyPath and resultsPath in ./dtu_eval/BaseEvalMain_web.m and ./dtu_eval/ComputeStat_web.m. Finally, run ./dtu_eval/BaseEvalMain_web.m in MATLAB to evaluate the DTU point clouds scene by scene, and then run ./dtu_eval/ComputeStat_web.m to get the average metrics for the entire dataset.
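The following sketch collects the fused point clouds into dtu_points with the expected mvsnetXXX_l3.ply naming. The assumed output layout (one or more .ply files under <outdir>/scanXX, as produced by fusibile) and the choice to copy the last .ply per scan are assumptions; adapt them to your actual fusion output.

# Sketch only: gather fused point clouds into dtu_points/ with the
# mvsnetXXX_l3.ply naming expected by the MATLAB evaluation scripts.
# The layout "<outdir>/scanXX/**/*.ply" is an assumption about the
# Gipuma/fusibile output; adapt it to your setup.
import glob
import os
import re
import shutil

outdir = "<your output save path>"      # same outdir as in dtu_test.sh
dst_dir = "dtu_points"
os.makedirs(dst_dir, exist_ok=True)

for scan_dir in sorted(glob.glob(os.path.join(outdir, "scan*"))):
    match = re.search(r"scan(\d+)", os.path.basename(scan_dir))
    if match is None:
        continue
    scan_id = int(match.group(1))
    plys = sorted(glob.glob(os.path.join(scan_dir, "**", "*.ply"), recursive=True))
    if not plys:
        continue
    dst = os.path.join(dst_dir, f"mvsnet{scan_id:03d}_l3.ply")
    shutil.copyfile(plys[-1], dst)      # keep the last (usually final) ply per scan
    print(f"{plys[-1]} -> {dst}")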

🖼 Visualization

To visualize the depth map in pfm format, run:

python main.py --vis --depth_path <your depth path> --depth_img_save_dir <your depth image save directory>

The visualized depth map will be saved as <your depth image save directory>/depth.png. Point clouds can be visualized with existing software such as MeshLab.
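If you prefer to inspect a depth map without going through main.py, the standalone sketch below reads a .pfm file and saves it as a color-coded PNG. The PFM reader here is a generic one written for illustration, not the repository's own I/O code.

# Standalone sketch (not the repository's I/O code): read a .pfm depth map and
# save it as a color-coded PNG using matplotlib.
import re
import numpy as np
import matplotlib.pyplot as plt

def read_pfm(path):
    with open(path, "rb") as f:
        header = f.readline().decode().rstrip()
        channels = 3 if header == "PF" else 1
        width, height = map(int, re.findall(r"\d+", f.readline().decode()))
        scale = float(f.readline().decode().rstrip())
        endian = "<" if scale < 0 else ">"          # negative scale => little-endian
        data = np.fromfile(f, endian + "f")
        data = data.reshape(height, width, channels)
        return np.flipud(data).squeeze()            # PFM stores rows bottom-to-top

depth = read_pfm("<your depth path>")               # path to a .pfm depth map
plt.imsave("depth.png", depth, cmap="viridis")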

⏳ Training

DTU training

To train the model from scratch on DTU, specify the datapath and log_dir in ./scripts/dtu_train.sh first and then run:

bash ./scripts/dtu_train.sh

By default, we employ DistributedDataParallel to train our model; you can also train on a single GPU, as sketched below.
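As a rough illustration of the two modes (not this repository's training loop), the sketch below wraps a stand-in model in DistributedDataParallel when launched with a distributed launcher such as torchrun, and falls back to plain single-GPU training otherwise.

# Minimal sketch of the two training modes mentioned above (illustration only,
# not this repository's training script). Multi-GPU launch, e.g.:
#   torchrun --nproc_per_node=<num_gpus> train_sketch.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def build_model():
    return torch.nn.Linear(8, 1)  # stand-in for the MVS network

if "RANK" in os.environ:          # launched by torchrun: use DistributedDataParallel
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = DDP(build_model().cuda(), device_ids=[local_rank])
else:                             # plain "python train_sketch.py": single-GPU fallback
    model = build_model().cuda()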

👩‍ Acknowledgements

Thanks to MVSNet, MVSNet_pytorch and CasMVSNet.
