Document Image Unwarping

This repository contains an unofficial implementation of DocUNet: Document Image Unwarping via a Stacked U-Net. We extend this work by:

  • predicting the inverse vector field directly, which saves computation time during inference
  • supporting additional network architectures, from U-Net to DeepLabv3+ with different backbones
  • adding a second loss term (MS-SSIM / SSIM) that measures the similarity between the unwarped and target images (see the sketch after this list)
  • achieving real-time inference speed (~300 ms) on CPU for DeepLabv3+ with a MobileNetV2 backbone
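As a rough illustration of how the combined objective could be wired up, the sketch below adds an MS-SSIM term (via the third-party pytorch_msssim package) to an L1 loss on the predicted vector field. The function name, tensor shapes, and weighting are assumptions for illustration, not the exact setup used in this repository:

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # third-party package, assumed installed

def combined_loss(pred_field, gt_field, warped_img, target_img, alpha=0.5):
    """L1 on the predicted vector field plus (1 - MS-SSIM) on the unwarped image.

    pred_field / gt_field: (N, H, W, 2) sampling grids in [-1, 1].
    warped_img: the distorted network input; target_img: the flat document.
    Names, shapes and the weighting `alpha` are illustrative assumptions.
    """
    # Unwarp the input with the predicted inverse vector field.
    unwarped = F.grid_sample(warped_img, pred_field, align_corners=True)
    l1 = F.l1_loss(pred_field, gt_field)
    ssim_term = 1.0 - ms_ssim(unwarped, target_img, data_range=1.0)
    return l1 + alpha * ssim_term
```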

Training dataset

Unfortunately, I am not allowed to make the dataset public. However, I created a very small toy dataset to give you an idea of what the network input should look like. You can find it here. The idea is to generate a 2D vector field and use it to deform a flat input image: the deformed image is the network input and the vector field is the network target.
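A minimal sketch of how such a training pair could be generated with plain PyTorch; the random low-resolution offsets and their smoothing below are illustrative assumptions, not the exact recipe behind the toy dataset:

```python
import torch
import torch.nn.functional as F

def make_sample(flat_img, max_shift=0.05):
    """Create one (input, target) pair from a flat document image.

    flat_img: (1, C, H, W) tensor in [0, 1]. Returns the deformed image
    (network input) and the sampling grid that produced it (network target).
    """
    _, _, h, w = flat_img.shape
    # Identity sampling grid in normalized [-1, 1] coordinates, shape (1, H, W, 2).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    identity = torch.stack((xs, ys), dim=-1).unsqueeze(0)
    # Low-resolution random offsets, upsampled to obtain a smooth 2D vector field.
    coarse = (torch.rand(1, 2, 4, 4) - 0.5) * 2 * max_shift
    offsets = F.interpolate(coarse, size=(h, w), mode="bilinear", align_corners=True)
    grid = identity + offsets.permute(0, 2, 3, 1)
    # Deform the flat image: this becomes the network input, `grid` the target.
    warped = F.grid_sample(flat_img, grid, align_corners=True)
    return warped, grid
```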

Training on your dataset

  1. Check the available parser options.
  2. Download the toy dataset.
  3. Set the path to your dataset in the available parser options.
  4. Create the environment from the conda file: conda env create -f environment.yml
  5. Activate the conda environment: conda activate unwarping_assignment
  6. Train the networks using the provided scripts: 1, 2. The trained model is saved to the directory given by the save_dir command line argument.
  7. Run the inference script on your data. Use the inference_dir command line argument to provide the relative path to the folder containing the images to be unwarped (a rough sketch follows this list).
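As a rough illustration of step 7, the snippet below applies a saved model to a single image. The checkpoint path, the way the model is loaded, the output shape, and the preprocessing are all assumptions about this repository's code; refer to the actual inference script for the real entry point:

```python
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image

# Hypothetical paths; they correspond to the save_dir / inference_dir arguments.
checkpoint_path = "save_dir/model.pth"
image_path = "inference_dir/warped_page.png"

model = torch.load(checkpoint_path, map_location="cpu")  # assumes a full-model checkpoint
model.eval()

img = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)  # (1, 3, H, W)
with torch.no_grad():
    # The network is assumed to output the inverse sampling grid, (1, H, W, 2) in [-1, 1].
    pred_grid = model(img)
    unwarped = F.grid_sample(img, pred_grid, align_corners=True)

to_pil_image(unwarped.squeeze(0).clamp(0, 1)).save("unwarped_page.png")
```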

Sample results
