This differentiable PyTorch operator translates projection images into epipolar line images of another view.
When applied to entire images, epipolar consistency maps emerge. These maps can be used as a geometry-informed prior during model training. Applications include reducing false positives in segmentation tasks and improving the segmentation of partially occluded objects using the second view.
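To illustrate one way such a prior could enter a pipeline (this is only a sketch, not the training scheme from the paper), a predicted probability map can be attenuated wherever the consistency map derived from the other view offers no support. The arrays and the simple multiplicative gating rule below are made up for illustration:

```python
import numpy as np

# Toy prediction and consistency map (values in [0, 1]); in practice the
# consistency map would come from translating the second view with the layer.
pred = np.array([[0.9, 0.8],
                 [0.7, 0.1]])
consistency = np.array([[1.0, 0.0],
                        [0.5, 1.0]])

# Suppress predictions that are inconsistent with the second view.
gated = pred * consistency
print(gated)  # -> [[0.9  0.  ]
              #     [0.35 0.1 ]]
```

A differentiable variant of this gating (or a consistency-based loss term) could be used during training instead of hard post-processing.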
For more information on the application of this operator, please refer to our MICCAI 2023 paper. If this work is useful for your application, please cite it as:
@InProceedings{10.1007/978-3-031-43898-1_6,
author="Rohleder, Maximilian and Pradel, Charlotte and Wagner, Fabian and Thies, Mareike and Maul, Noah and Denzinger, Felix and Maier, Andreas and Kreher, Bjoern",
editor="Greenspan, Hayit and Madabhushi, Anant and Mousavi, Parvin and Salcudean, Septimiu and Duncan, James and Syeda-Mahmood, Tanveer and Taylor, Russell",
title="Enabling Geometry Aware Learning Through Differentiable Epipolar View Translation",
booktitle="Medical Image Computing and Computer Assisted Intervention -- MICCAI 2023",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="57--65",
isbn="978-3-031-43898-1"
}
After installation, run the main.py script to get this image:
In deep learning models, images are often downsampled. The pre-calculated fundamental matrices, however, assume a fixed image size (e.g. (976, 976)). To enable dynamic downsampling without having to instantiate a new layer per resolution, we introduced the `downsampled_factor` parameter. For example, if you downsample an image by a factor of two so that your tensor's spatial dimensions are (488, 488), you can account for this by setting this option.
```python
fume3d = Fume3dLayer()
factor = torch.tensor([downsample_factor], dtype=torch.float64, device='cuda', requires_grad=False)
CM1 = fume3d(view2_bin, F12, F21, downsampled_factor=factor)
CM2 = fume3d(view1_bin, F21, F12, downsampled_factor=factor)
```
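The scalar factor can also be derived directly from the tensor shapes. A minimal helper for this (hypothetical, not part of the package; the layer itself expects the factor wrapped in a tensor, as shown above) might look like:

```python
def infer_downsample_factor(original_hw, current_hw):
    """Compute the uniform downsampling factor between two spatial shapes."""
    fh = original_hw[0] / current_hw[0]
    fw = original_hw[1] / current_hw[1]
    if fh != fw:
        raise ValueError("anisotropic downsampling is not supported here")
    return fh

print(infer_downsample_factor((976, 976), (488, 488)))  # -> 2.0
```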
Furthermore, the projection matrices need to be defined such that they map onto the center of the detector. For example, if the detector has shape (976, 976), the projection matrices need to be compensated like this:
```python
import numpy as np

c = (976 / 2) - 0.5  # center of the detector in pixels
to_center = np.array([[1, 0, -c],
                      [0, 1, -c],
                      [0, 0, 1]])
P1 = to_center @ P1  # shift the principal point to the detector center
```
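As a quick sanity check (illustration only), the shift matrix should map the detector-center pixel to the origin in homogeneous coordinates:

```python
import numpy as np

c = (976 / 2) - 0.5  # 487.5
to_center = np.array([[1, 0, -c],
                      [0, 1, -c],
                      [0, 0, 1]])

# A homogeneous pixel coordinate at the detector center maps to (0, 0).
center_px = np.array([c, c, 1.0])
print(to_center @ center_px)  # -> [0. 0. 1.]
```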
This enables the user to get downsampled images like this:
Padding and downsampling (needed in many CNN architectures) are also supported:
First, install a suitable PyTorch version (https://pytorch.org/get-started/locally/). It comes with the libraries necessary to build the CUDA sources in this implementation. To install the latest version right from this repository, run:
```shell
git clone https://github.com/maxrohleder/FUME.git
cd FUME
pip install -e .
```
To test the installation, run the main file (e.g. `python main.py`). You should get an image similar to the one above. The layers have been tested on Windows and Linux using Python 3.10 and the CUDA 11.3 build of PyTorch (which comes with cuDNN 8.0).
This repository is divided into two sections. The Python part defines the PyTorch interface, implementing an `nn.Module` and an `autograd.Function`; it can be found in `fume_layer.py`. The underlying implementation of the image translation layer is found in the `cuda` folder. The `.cpp` files take care of framework-related functions, whilst the actual image transformations are implemented in `image_translation_kernel.cu`. The header file is only needed to build the test script in `main.cpp` using the `CMakeLists.txt`.
The file `main.cpp` tests the image translation functionality without the Python frontend. It is not needed to build the FUME layers; the only purpose of this separate C++ executable is to verify functionality during development.
To set up your machine for development, follow these steps:
- Download and unzip libtorch (I used libtorch_linux_cuda11.3).
- Copy the path to the folder (e.g. `/usr/include/libtorch`) and add it to the included directories in `CMakeLists.txt`.
- Add the path to your Python include directories. You can find out where that is by running `import sysconfig; print(sysconfig.get_paths()['include'])` in your preferred Python env.
- Make sure the CUDA build tools are installed correctly. Verify by running `nvcc --version` (tested with 11.3). Also install cuDNN (tested with 7.6).