This repository contains an unofficial implementation of DocUNet: Document Image Unwarping via a Stacked U-Net. We extend this work by:
- predicting the inverted vector fields directly, which saves computation time during inference (see the sketch after this list)
- adding more network architectures to choose from, ranging from UNet to DeepLabv3+ with different backbones
- adding a second loss function (MS-SSIM / SSIM) to measure the similarity between the unwarped image and the target image
- achieving real-time inference speed (300 ms) on CPU for DeepLabv3+ with a MobileNetV2 backbone
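Since the network predicts the inverted (backward) vector field, the unwarped image can be produced with a single sampling pass at inference time. The snippet below is a minimal sketch of that step; the function name, the `(B, 2, H, W)` pixel-displacement layout, and the interpolation settings are assumptions for illustration and may differ from what this repository actually uses.

```python
import torch
import torch.nn.functional as F

def unwarp(image, backward_flow):
    """Apply a predicted backward (inverted) vector field to a warped image.

    image:         (B, C, H, W) warped input image
    backward_flow: (B, 2, H, W) per-pixel displacements, in pixels, telling each
                   output location where to sample from in the input (assumed layout)
    """
    b, _, h, w = image.shape
    # Base sampling grid in pixel coordinates
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=image.dtype, device=image.device),
        torch.arange(w, dtype=image.dtype, device=image.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + backward_flow[:, 0]
    grid_y = ys.unsqueeze(0) + backward_flow[:, 1]
    # Normalize to [-1, 1] as expected by grid_sample
    grid_x = 2.0 * grid_x / (w - 1) - 1.0
    grid_y = 2.0 * grid_y / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2), (x, y) order
    return F.grid_sample(image, grid, mode="bilinear", align_corners=True)
```

The unwarped output of such a pass is what the MS-SSIM / SSIM term can be compared against the target image, in addition to the pixel-wise loss on the vector field itself.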
Unfortunately, I am not allowed to make the dataset public. However, I created a very small toy dataset to give you an idea of how the network input should look. You can find this here. The idea is to create a 2D vector field to deform a flat input image. The deformed image is used as the network input and the vector field is the network target; a minimal sketch of how such a pair can be generated is shown below.
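This sketch assumes an `(H, W, C)` NumPy image and uses SciPy for the resampling; the field strength and smoothing values are arbitrary illustrations, not the parameters used to build the toy dataset.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def make_sample(flat_image, rng=None):
    """Create one (input, target) pair from a flat document image."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = flat_image.shape[:2]

    # Smooth random displacement field: one (dy, dx) pair per pixel
    field = rng.normal(scale=10.0, size=(2, h, w))
    field = gaussian_filter(field, sigma=(0, 16, 16))

    # Sample the flat image at the displaced coordinates to deform it
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + field[0], xs + field[1]])
    warped = np.stack(
        [map_coordinates(flat_image[..., c], coords, order=1, mode="nearest")
         for c in range(flat_image.shape[2])],
        axis=-1,
    )
    return warped, field  # network input, network target
```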
- Check the available parser options.
- Download the toy dataset.
- Set the path to your dataset in the available parser options.
- Create the environment from the conda file: `conda env create -f environment.yml`
- Activate the conda environment: `conda activate unwarping_assignment`
- Train the networks using the provided scripts: 1, 2. The trained model is saved to the directory given by the `save_dir` command line argument.
- Run the inference script on your own image set. The `inference_dir` command line argument should be used to provide the relative path to the folder containing the images to be unwarped. Illustrative invocations are sketched below.
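A minimal pair of example invocations, assuming the training and inference entry points are called `train.py` and `inference.py` and expose `save_dir` / `inference_dir` as `--`-style flags; the script names and flag syntax are placeholders, so check the linked scripts for the actual entry points and the full option list:

```bash
# train and store checkpoints under ./checkpoints (hypothetical script name)
python train.py --save_dir checkpoints/

# unwarp every image found in ./samples (hypothetical script name)
python inference.py --inference_dir samples/
```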