
For those who would like a quick start with inference

Requirements

  • tested with python 2.7.15 and 3.6.8
  • tested with pytorch 0.4.0, 0.4.1 and 1.0.0
  • a few packages need to be installed, for example texttable and scikit-image

Prepare your data

  • {dataset}
    • {pair_1}
      • im0.png
      • im1.png
    • {pair_2}
      • im0.png
      • im1.png
    • ...
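A small helper can arrange rectified stereo pairs into the layout above. The function name and pair names here are hypothetical; only the `{dataset}/{pair}/im0.png`, `im1.png` naming comes from the docs:

```python
import os
import shutil

def prepare_pair(dataset_dir, pair_name, left_path, right_path):
    """Copy a rectified stereo pair into the {dataset}/{pair}/im{0,1}.png layout."""
    pair_dir = os.path.join(dataset_dir, pair_name)
    os.makedirs(pair_dir, exist_ok=True)
    # im0.png is the reference (left) image, im1.png the target (right) image
    shutil.copy(left_path, os.path.join(pair_dir, "im0.png"))
    shutil.copy(right_path, os.path.join(pair_dir, "im1.png"))
```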

Execute command

CUDA_VISIBLE_DEVICES=0 python submission.py --datapath {dataset} --outdir {output} --loadmodel ./final-768px.pth --testres 1 --clean 0.8 --max_disp 512

Check output

  • {output}
    • {pair_1}
      • img_reference.png
      • img_target.png
      • disp.npy
      • uncertainty.npy
      • disp.jpg
      • disp.cbar.jpg
      • disp_min_max.txt
      • disp.mask.jpg
      • uncertainty.jpg
      • time.txt
    • {pair_2}
      • img_reference.png
      • img_target.png
      • disp.npy
      • uncertainty.npy
      • disp.jpg
      • disp.cbar.jpg
      • disp_min_max.txt
      • disp.mask.jpg
      • uncertainty.jpg
      • time.txt
    • ...
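The per-pair `disp.npy` and `uncertainty.npy` outputs can be post-processed directly. A minimal sketch, assuming the convention implied by the docs (clean=0 removes every pixel, so a pixel survives only when its uncertainty is below the threshold); `load_cleaned_disparity` is a hypothetical helper, not part of the repo:

```python
import os
import numpy as np

def load_cleaned_disparity(pair_dir, clean=0.8):
    """Load disp.npy / uncertainty.npy and invalidate unreliable pixels.

    Assumption: higher uncertainty means less confident, and pixels with
    uncertainty >= clean are discarded (consistent with clean=0 removing all).
    """
    disp = np.load(os.path.join(pair_dir, "disp.npy")).copy()
    unc = np.load(os.path.join(pair_dir, "uncertainty.npy"))
    disp[unc >= clean] = np.inf  # mark discarded pixels as invalid
    return disp
```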



Hierarchical Deep Stereo Matching on High Resolution Images

Architecture:

Qualitative results on Middlebury (refer to project webpage for more results)

Performance on Middlebury benchmark (y-axis: the lower the better)

Requirements

  • tested with python 2.7.15 and 3.6.8
  • tested with pytorch 0.4.0, 0.4.1 and 1.0.0
  • a few packages need to be installed, for example texttable

Weights

Download

Data

train/val

test

High-res-real-stereo (HR-RS): coming soon

Train

  1. Download and extract the training data into folder /d/. Training data include the Middlebury training set, HR-VS, KITTI-12/15 and SceneFlow.
  2. Run
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --maxdisp 384 --batchsize 24 --database /d/ --logname log1 --savemodel /somewhere/  --epochs 10
  3. Evaluate on the Middlebury additional images and the KITTI validation set. After 10 epochs, the average error on the Middlebury additional images at half resolution should be around 4.6 (excluding Shopvac).

Inference

Example:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-mbtest/   --outdir ./mboutput --loadmodel ./weights/final-768px.pth  --testres 1 --clean 0.8 --max_disp -1

Evaluation:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-HRRS/   --outdir ./output --loadmodel ./weights/final-768px.pth  --testres 0.5
python eval_disp.py --indir ./output --gtdir ./data-HRRS/
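As a rough illustration of what such an evaluation computes (this is a sketch, not the exact metric in eval_disp.py), the mean absolute disparity error over pixels with valid ground truth:

```python
import numpy as np

def avg_disp_error(pred, gt, max_disp=np.inf):
    """Mean absolute disparity error over pixels with valid ground truth.

    Pixels where the ground truth is missing (non-finite or <= 0) or beyond
    max_disp are excluded, as is common in stereo benchmarks.
    """
    valid = np.isfinite(gt) & (gt > 0) & (gt < max_disp) & np.isfinite(pred)
    return float(np.abs(pred[valid] - gt[valid]).mean())
```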

And use cvkit to visualize in 3D.

Example outputs

left image

3D projection

disparity map

uncertainty map (brighter → higher uncertainty)

Parameters

  • testres: resolution scale of the input images; 1 is full resolution, 0.5 is half resolution, and so on
  • max_disp: maximum disparity range to search
  • clean: uncertainty threshold for cleaning; clean=0 removes all pixels
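Note that disparities are measured in pixels at the processed resolution, so values obtained at a testres below 1 are smaller than their full-resolution counterparts. If you need to compare against full-resolution ground truth yourself, a minimal sketch of the value rescaling (assuming the saved disparities have not already been rescaled by the pipeline):

```python
import numpy as np

def rescale_disparity(disp, testres):
    """Scale disparity values measured at resolution `testres` back to full resolution.

    A feature shifted by d pixels at scale testres corresponds to a shift of
    d / testres pixels in the full-resolution image.
    """
    return np.asarray(disp) / testres
```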

Citation

@InProceedings{yang2019hsm,
author = {Yang, Gengshan and Manela, Joshua and Happold, Michael and Ramanan, Deva},
title = {Hierarchical Deep Stereo Matching on High-Resolution Images},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}

Acknowledgement

Part of the code is borrowed from MiddEval-SDK, PSMNet, FlowNetPytorch and pytorch-semseg. Thanks to SorcererX for fixing version compatibility issues.
