Omni-AnyNet

Omni-AnyNet [1] is a network for omnidirectional stereo vision. It extends the network AnyNet [2] by incorporating OmniGlasses, a set of carefully designed look up tables. The repository was originally forked from: https://github.com/mileyan/AnyNet

The current version of OmniGlasses allows to process a stereo pair of images underlying the equiangular camera model. Furthermore, it is currently required, that the cameras are arranged in a canonical stereo setup. Hence, the cameras' x-axes (left-right) are collinear, and y-axes (top-down) are parallel.

If you use this work for an own publication, we kindly ask you to cite [1] and [2] (see Section References)

The following submodules are used by default: Omni_lut / OmniGlasses: OmniGlasses of Omni-AnyNet map_processing: Helper scripts for processing omnidirectional images and maps (see .gitmodules for repository locations)

Preparations

First, please download the sceneflow model from the AnyNet webpage to checkpoint/sceneflow/sceneflow.tar. Then use the Bash script prepare_machine.sh that helps you to create an appropriate Conda environment and that compiles necessary modules. It was tested on Ubuntu 22.04.

For a test run (run.sh), you need to download a checkpoint from ~~here~~ (coming soon). We trained and evaluated Omni-AnyNet on a converted dataset of THEOStereo [3]. This converted dataset can be downloaded from ~~here~~ (coming soon). If you use this dataset in your work, please cite [1] and [3]. Thank you.

Training / Inference / Evaluation

Scripts:

Script name	Purpose
`create_dataset.sh`	leftover from original AnyNet
`evaluate.sh`	evaluates a certain checkpoint / epoch
`evaluate_current_version.sh`	helper script to select parameters for evaluate.sh
`prepare_machine.sh`	generates Conda environment and compiles SPN
`run.sh`	inference and time measurements
`start_finetune.sh`	starts training / finetuning

Most important environment variables (see each Bash script for default values):

Variable	Description	Used in
`BASELINE`	baseline / distance between cameras in stereo setup	`start_finetune.sh`
`BETA`	loss function: Beta parameter of Smooth-L1-Loss	`start_finetune.sh, evaluate.sh`
`BSIZE`	batch size (inference)	`run.sh`
`CHKPNT`	path to initial checkpoint for training / finetuning	`start_finetune.sh, evaluate.sh`
`CONDA_ENV_NAME`	name of Conda environment	`run.sh, start_finetune.sh, evaluate.sh`
`CUDA_DEVICE_ORDER`	oder of CUDA devices (usually `PCI_BUS_ID`)	`run.sh, start_finetune.sh, evaluate.sh`
`CUDA_VISIBLE_DEVICES`	indices of CUDA devices to use (e.g. 0,1 for GPU 0 and 1)	`run.sh, start_finetune.sh, evaluate.sh`
`DATASET`	path to the dataset for inference or training	`run.sh, start_finetune.sh, evaluate.sh`
`EPOCHS`	maximum number of epochs for training	`start_finetune.sh, evaluate.sh`
`EVAL_BSIZE`	batch size for evaluation	`evaluate.sh`
`FAILMAIL`	send mail to this address if something failed	`evaluate.sh, start_finetune.sh`
`FOV_DEG`	field of view in deg. (where FOV relates to the full height)	`start_finetune.sh`
`INFERENCE_CHKPNT`	path to checkpoint for inference or training	`run.sh`
`LR`	learning rate	`start_finetune.sh, evaluate.sh`
`MASK_FULL_RES`	path to full resolution mask	`run.sh, start_finetune.sh, evaluate.sh`
`MASK_LUT_DIR`	path to the directory storing masks and OmniGlasses-LUTs	`run.sh, start_finetune.sh, evaluate.sh`
`MAX_DISP`	maximum disparity index neglecting residuals	`run.sh, start_finetune.sh, evaluate.sh`
`NUM_LOADER_WORKER`	number of dataloader workers (processes)	`run.sh, start_finetune.sh, evaluate.sh`
`SAVE_PATH`	directory for output files	`run.sh, start_finetune.sh, evaluate.sh`
`SPN_START`	index of epoch in which SPN module should be started (run.sh: SPN always active)	`run.sh, start_finetune.sh, evaluate.sh`
`TEST_BSIZE`	batch size for testing and validation (training pipeline)	`start_finetune.sh`
`TRAIN_BSIZE`	batch size for training	`start_finetune.sh`

Attention: The baseline is hardcoded in evaluate.sh

Scientific article

The article can be found here.

References

[1] J. B. Seuffert, A. C. Perez Grassi, H. Ahmed, R. Seidel, and G. Hirtz, “OmniGlasses: an optical aid for stereo vision CNNs to enable omnidirectional image processing,” Machine Vision and Applications, vol. 35, no. 3, pp. 58–72, Apr. 2024, doi: 10.1007/s00138-024-01534-2.

[2] Y. Wang et al., “Anytime Stereo Image Depth Estimation on Mobile Devices,” in 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada: IEEE, 2019, pp. 5893–5900. doi: 10.1109/ICRA.2019.8794003.

[3] J. B. Seuffert, A. C. Perez Grassi, T. Scheck, and G. Hirtz, “A Study on the Influence of Omnidirectional Distortion on CNN-based Stereo Vision,” in Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2021, Volume 5: VISAPP, Online Conference: SciTePress, Feb. 2021, pp. 809–816. doi: 10.5220/0010324808090816.

BibTex

@article{seuffert_omniglasses_2024,
    title = {{OmniGlasses}: an optical aid for stereo vision {CNNs} to enable omnidirectional image processing},
    volume = {35},
    issn = {0932-8092, 1432-1769},
    shorttitle = {{OmniGlasses}},
    url = {https://link.springer.com/10.1007/s00138-024-01534-2},
    doi = {10.1007/s00138-024-01534-2},
    number = {3},
    urldate = {2024-05-07},
    journal = {Machine Vision and Applications},
    author = {Seuffert, Julian B. and Perez Grassi, Ana C. and Ahmed, Hamza and Seidel, Roman and Hirtz, Gangolf},
    month = apr,
    year = {2024},
    pages = {58--72}
}

@inproceedings{wang_anytime_2019,
    author={Wang, Yan and Lai, Zihang and Huang, Gao and Wang, Brian H. and van der Maaten, Laurens and Campbell, Mark and Weinberger, Kilian Q.},
    booktitle={2019 International Conference on Robotics and Automation (ICRA)}, 
    title={Anytime Stereo Image Depth Estimation on Mobile Devices}, 
    year={2019},
    pages={5893-5900},
    doi={10.1109/ICRA.2019.8794003}
}

@inproceedings{seuffert_study_2021,
    title={A {Study} on the {Influence} of {Omnidirectional} {Distortion} on {CNN}-based {Stereo} {Vision}},
    isbn={978-989-758-488-6},
    doi={10.5220/0010324808090816},
    booktitle={Proceedings of the 16th {International} {Joint} {Conference} on {Computer} {Vision}, {Imaging} and {Computer} {Graphics} {Theory} and {Applications}, {VISIGRAPP} 2021, {Volume} 5: {VISAPP}},
    publisher={SciTePress},
    author={Seuffert, Julian Bruno and Perez Grassi, Ana Cecilia and Scheck, Tobias and Hirtz, Gangolf},
    month=feb,
    year={2021},
    pages={809--816}
}

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
Omni_lut @ 7a64d8c		Omni_lut @ 7a64d8c
checkpoint		checkpoint
dataloader		dataloader
docs		docs
figures		figures
map_processing @ a5232c2		map_processing @ a5232c2
models		models
utils		utils
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
Loading_multiple_luts.py		Loading_multiple_luts.py
README.md		README.md
create_dataset.sh		create_dataset.sh
evaluate.sh		evaluate.sh
evaluate_current_version.sh		evaluate_current_version.sh
finetune.py		finetune.py
inference.py		inference.py
look_up_lut.py		look_up_lut.py
main.py		main.py
prepare_machine.sh		prepare_machine.sh
rounding_off_tensors.py		rounding_off_tensors.py
run.sh		run.sh
start_finetune.sh		start_finetune.sh
tiffviewer.py		tiffviewer.py

License

hamza9305/Omni-AnyNet

Folders and files

Latest commit

History

Repository files navigation

Omni-AnyNet

Preparations

Training / Inference / Evaluation

Scientific article

References

BibTex

About

Resources

License

Stars

Watchers

Forks

Languages