
DeepSimNets

Official repository for the DeepSim-Nets: Deep Similarity Networks for Stereo Image Matching paper 📄, accepted at the CVPR 2023 EarthVision Workshop.

The paper code is divided into two parts:

  • Training and testing of the classifiers' performance: the code in this repo covers it!
  • Inference: the code is integrated into MicMac and written in C++ (using the Torch C++ API)

Overall training pipeline

[Figure: epipolar pair, our MS-AFF, PSMNet, and Normalized Cross Correlation]

We propose to learn dense similarities by training three multi-scale architectures on wide image tiles. To enable robust self-supervised contrastive learning, we develop a sample mining strategy. Our main idea is to rely on wider support regions to obtain pixel-level similarity-aware embeddings. The whole set of pixel embeddings of a reference image is then matched to the corresponding embeddings at once. Our approach alleviates the distinctiveness shortcomings of block matching by exploiting wider image context. We thereby obtain quite distinctive similarity measures that outperform standard hand-crafted correlation (NCC) and deep learning patch-based approaches (MC-CNN). Compared to end-to-end methods, our DeepSim-Nets are highly versatile and readily suited for standard multi-resolution and large-scale stereo matching pipelines.
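To make the contrastive objective concrete, below is a minimal PyTorch sketch of a dense hinge loss over pixel embeddings. The function name, the fixed-offset negative mining, and the margin value are illustrative assumptions, not the exact formulation from the paper:

import torch
import torch.nn.functional as F

def pixelwise_contrastive_loss(feat_left, feat_right, disparity, valid_mask,
                               margin=0.3, neg_offset=4):
    """Hinge contrastive loss over dense pixel embeddings (sketch).

    feat_left, feat_right: (B, C, H, W) embedding maps.
    disparity: (B, H, W) ground-truth disparity (left -> right).
    valid_mask: (B, H, W) bool, True where disparity is defined and non-occluded.
    neg_offset: horizontal shift (pixels) used to sample a hard negative.
    """
    B, C, H, W = feat_left.shape
    xs = torch.arange(W, device=feat_left.device).view(1, 1, W).expand(B, H, W)
    # Positive: right-image pixel at x - d; negative: same row, shifted by neg_offset.
    x_pos = (xs - disparity).round().long().clamp(0, W - 1)
    x_neg = (x_pos + neg_offset).clamp(0, W - 1)

    fl = F.normalize(feat_left, dim=1)
    fr = F.normalize(feat_right, dim=1)
    idx_pos = x_pos.unsqueeze(1).expand(B, C, H, W)
    idx_neg = x_neg.unsqueeze(1).expand(B, C, H, W)
    sim_pos = (fl * fr.gather(3, idx_pos)).sum(1)  # (B, H, W) cosine similarities
    sim_neg = (fl * fr.gather(3, idx_neg)).sum(1)

    # Push positive similarities above negatives by at least the margin.
    loss = F.relu(margin - sim_pos + sim_neg)
    return loss[valid_mask].mean()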

Multi-Scale Attentional Feature Fusion (MS-AFF)

We additionally propose a lightweight architecture dubbed MS-AFF, whose inputs are 4 multi-scale (multi-resolution) tiles, as highlighted below. The generated multi-scale features are iteratively fused using an attention mechanism adapted from Attentional Feature Fusion. Here is the architecture together with the multi-scale attention module.
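For intuition, here is a minimal PyTorch sketch of an AFF-style fusion block and an iterative pyramid fusion loop, assuming the same channel count at every scale; it approximates the idea rather than reproducing the exact MS-AFF architecture:

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionalFusion(nn.Module):
    """AFF-style fusion of two same-shape feature maps via channel attention."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = channels // reduction
        # Local (per-pixel) channel context.
        self.local_att = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1), nn.BatchNorm2d(channels))
        # Global (pooled) channel context.
        self.global_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, 1), nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1), nn.BatchNorm2d(channels))

    def forward(self, x, y):
        s = x + y
        w = torch.sigmoid(self.local_att(s) + self.global_att(s))  # fusion weights
        return w * x + (1.0 - w) * y

def fuse_pyramid(feats, fusions):
    """Iteratively fuse multi-scale maps (coarsest first) up to the finest scale."""
    out = feats[0]
    for f, fuse in zip(feats[1:], fusions):
        out = F.interpolate(out, size=f.shape[-2:], mode="bilinear",
                            align_corners=False)
        out = fuse(out, f)
    return out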

Training

DeepSim-Nets are trained on aerial data from the Dublin dataset using 4 GPUs. The following summarizes the training environment:

  • Ubuntu 18.04.6 LTS/CentOS Linux release 7.9.2009
  • Python 3.9.12
  • PyTorch 1.11.0
  • pytorch_lightning 1.6.3
  • CUDA 10.2, 11.2 and 11.4
  • NVIDIA V100 32G/ NVIDIA A100 40G
  • 64G RAM

Dataset structure:

To train DeepSim-Nets in general, datasets should provide the following elementary batch composition (a minimal dataset sketch follows the list):

  • Left image tile
  • Right image tile
  • Ground truth densified disparity map
  • Occlusion mask
  • Definition mask: disparity maps sometimes contain NaN values where no information is available; this must be taken into account to define the region of interest.
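A minimal sketch of such a dataset, assuming tiles and maps stored as .npy files (the field names are hypothetical):

import numpy as np
import torch
from torch.utils.data import Dataset

class StereoTileDataset(Dataset):
    """Hypothetical dataset yielding the five elements listed above."""
    def __init__(self, samples):
        self.samples = samples  # list of dicts of .npy paths, one per tile pair

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        s = self.samples[i]
        disparity = np.load(s["disparity"])       # densified ground-truth disparity
        occlusion = np.load(s["occlusion"]) > 0   # True where occluded
        definition = ~np.isnan(disparity)         # ROI: False where disparity is NaN
        disparity = np.nan_to_num(disparity, nan=0.0)
        return {
            "left": torch.from_numpy(np.load(s["left"])).float(),
            "right": torch.from_numpy(np.load(s["right"])).float(),
            "disparity": torch.from_numpy(disparity).float(),
            "occlusion": torch.from_numpy(occlusion),
            "definition": torch.from_numpy(definition),
        }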

The following is an example of what the aforementioned image tiles should look like:

The project follows the structure described below:

├─ configs                 # Configuration files for training
├─ datasets                # Dataset classes
├─ models                  # Model architectures
├─ trained_models          # Holds some model checkpoints
├─ utils                   # Scripts for logging
├─ Trainer.py              # Main script for training DeepSim-Nets
└─ Tester.py               # Main script for testing DeepSim-Nets classification accuracy (joint probabilities, AUC, etc.)

To train DeepSim-Nets, run:

python3 Trainer.py -h
usage: Trainer.py [-h] --config_file CONFIG_FILE --model MODEL --checkpoint CHECKPOINT

optional arguments:
  -h, --help            show this help message and exit
  --config_file CONFIG_FILE, -cfg CONFIG_FILE
                        Path to the yaml config file
  --model MODEL, -mdl MODEL
                        Model name to train, possible names are: 'MS-AFF', 'U-Net32', 'U-Net_Attention'
  --checkpoint CHECKPOINT, -ckpt CHECKPOINT
                        Model checkpoint to load
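For example, to train MS-AFF (the config file and checkpoint paths below are placeholders; adapt them to your setup):

python3 Trainer.py -cfg configs/msaff.yaml -mdl MS-AFF -ckpt trained_models/msaff.ckpt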

Evaluation

To evaluate our classifiers' performance, we estimate joint distributions of the similarity scores of matching and non-matching pixel pairs on test data. These metrics are obtained by running the testing script.

python3 Tester.py -h
usage: Tester.py [-h] --model MODEL --checkpoint CHECKPOINT --output_folder OUTPUT_FOLDER

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL, -mdl MODEL
                        Model name to test, possible names are: 'MS-AFF', 'U-Net32', 'U-Net_Attention'
  --checkpoint CHECKPOINT, -ckpt CHECKPOINT
                        Model checkpoint to load
  --output_folder OUTPUT_FOLDER, -o OUTPUT_FOLDER
                        output folder to store results
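As an illustration of how such metrics can be derived, here is a minimal sketch computing the AUC and a joint probability of correct classification from similarity scores; the helper name and the JP definition used here are assumptions, not necessarily what Tester.py implements:

import numpy as np
from sklearn.metrics import roc_auc_score

def classification_metrics(sim_pos, sim_neg, threshold=0.5):
    """sim_pos/sim_neg: similarity scores of matching / non-matching pairs."""
    scores = np.concatenate([sim_pos, sim_neg])
    labels = np.concatenate([np.ones_like(sim_pos), np.zeros_like(sim_neg)])
    auc = roc_auc_score(labels, scores)
    # Joint probability of correct classification, assuming balanced classes.
    jp = 0.5 * ((sim_pos >= threshold).mean() + (sim_neg < threshold).mean())
    return auc, jp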

Inference

Models

After training, the models are scripted and arranged so that similarities can be computed in two ways (a short sketch follows the list):

  • normalized dot product between embeddings: this relies on the feature extractor's output feature maps.
  • learned similarity function from the MLP decision network (feature extractor + MLP).
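A minimal PyTorch sketch of both options, assuming the right-image features have already been warped to the candidate positions and that the MLP maps concatenated per-pixel embeddings to a single logit (both assumptions, not the repository's exact interfaces):

import torch
import torch.nn.functional as F

def cosine_similarity_map(feat_left, feat_right_warped):
    """Option 1: normalized dot product between (B, C, H, W) embedding maps."""
    fl = F.normalize(feat_left, dim=1)
    fr = F.normalize(feat_right_warped, dim=1)
    return (fl * fr).sum(dim=1)  # (B, H, W), values in [-1, 1]

def mlp_similarity_map(mlp, feat_left, feat_right_warped):
    """Option 2: learned similarity from the MLP decision network."""
    B, C, H, W = feat_left.shape
    pairs = torch.cat([feat_left, feat_right_warped], dim=1)  # (B, 2C, H, W)
    pairs = pairs.permute(0, 2, 3, 1).reshape(-1, 2 * C)      # one row per pixel
    return torch.sigmoid(mlp(pairs)).view(B, H, W)            # match probability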
| Model name | Dataset | Joint Probability (JP) | Size 💾 | Download 👇 |
|---|---|---|---|---|
| MS-AFF feature | Dublin/Vaihingen/Enschede | -- | 4 M | link |
| MS-AFF decision (MLP) | Dublin/Vaihingen/Enschede | 89.6 | 1.4 M | link |
| Unet32 | Dublin/Vaihingen/Enschede | -- | 31.4 M | link |
| Unet32 decision (MLP) | Dublin/Vaihingen/Enschede | 88.6 | 1.4 M | link |
| Unet Attention | Dublin/Vaihingen/Enschede | -- | 38.1 M | link |
| Unet Attention decision (MLP) | Dublin/Vaihingen/Enschede | 88.9 | 1.4 M | link |
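Since the checkpoints are TorchScript files, they can be loaded without the model classes. A hypothetical example (file names and the single-channel 256x256 input shape are assumptions):

import torch

extractor = torch.jit.load("feature_extractor.pt").eval()
decision = torch.jit.load("mlp_decision.pt").eval()
with torch.no_grad():
    embeddings = extractor(torch.rand(1, 1, 256, 256))  # one grayscale tile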

Inference requires an SGM implementation for cost volume regularization. Our similarity models are scripted (*.pt files) and fed to our C++ implementation in the MicMac photogrammetry software. The main C++ production code is located at MMVII/src/LearningMatching. Our approach is embedded into the MicMac multi-resolution image matching pipeline and can be parametrized using a MicMac-compliant xml file. The figure below illustrates the image matching workflow.
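The actual inference code is C++ under MicMac; the following Python sketch only illustrates how a similarity cost volume can be built from embeddings before SGM regularization (shapes and the cosine scoring are assumptions):

import torch
import torch.nn.functional as F

def cosine_cost_volume(feat_left, feat_right, max_disp):
    """Build a (B, max_disp, H, W) similarity volume from normalized embeddings;
    an SGM-style regularizer then selects the disparity per pixel."""
    fl = F.normalize(feat_left, dim=1)
    fr = F.normalize(feat_right, dim=1)
    B, C, H, W = fl.shape
    vol = fl.new_zeros(B, max_disp, H, W)
    for d in range(max_disp):
        if d == 0:
            vol[:, d] = (fl * fr).sum(1)
        else:
            # Compare left pixels at x with right pixels at x - d.
            vol[:, d, :, d:] = (fl[..., d:] * fr[..., :-d]).sum(1)
    return vol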

To reproduce the reported results, we provide an epipolar pair consisting of high resolution aerial images (GSD = 6 cm). To run our code, we recommend following the steps below:

Docker Image

Pull the DeepSim-Nets docker image:

 docker pull dali1210/micmac_deepsimnets:latest

Paths to the models (feature extractor + MLP)

We provide MicMac .xml configuration files that should be edited according to the models' locations. More specifically, the tag FileModeleParams should contain the path to both the feature extractor and the MLP scripted models (*.pt).

 
 <EtapeMEC>
    <DeZoom> 4 </DeZoom> <!-- DeepSim-Nets runs at zoom 4 -->
    <CorrelAdHoc>
        <SzBlocAH> 40000000 </SzBlocAH>
        <TypeCAH>
            <ScoreLearnedMMVII>
                <FileModeleCost> MVCNNCorrel</FileModeleCost>
                <FileModeleParams>./MODEL_AERIAL_MSNET_DECISION/.*.pt</FileModeleParams>
                <FileModeleArch>UnetMLPMatcher</FileModeleArch>
            </ScoreLearnedMMVII>
        </TypeCAH>
    </CorrelAdHoc>
</EtapeMEC>

 

Steps to run dense matching with DeepSim-Nets

#1. Download the scripted models
#2. Gather each model's feature extractor and decision (MLP) under the same folder
#3. Update the model paths inside <FileModeleParams> as explained above
#4. Run the docker image
docker run --gpus all --network=host --privileged --shm-size 25G -v path_to_images_folder:/process -it dali1210/micmac_deepsimnets:latest
#5. Go to the images folder
cd /process
#6. Run MicMac with the appropriate xml file (example configuration files are provided)
mm3d MICMAC XML_CONFIGURATION_FILE.xml +Im1=Epip1.tif +Im2=Epip2.tif +DirMEC=TEST_DEEPSIM_NETS +ZReg=0.002 +IncPix=100
#7. The output disparity maps follow the MicMac naming conventions

Contact information

Please contact us at mohamed.ali-chebbi@ign.fr or med.chebbi.mac@gmail.com.
