FelixHertlein/inv3d-generator


Inv3D Dataset Generation

Create your own version of the Inv3D dataset!

This repository contains the dataset generation code of our paper which has been accepted at the International Conference on Document Analysis and Recognition (ICDAR) 2023.

For more details, see our project page.

Installation

Download external resources

For the dataset generation, you need the following resources:

  • Warped paper meshes (download samples or full dataset)
  • Company logos (download here)
  • HDR environment maps (download samples or full dataset [currently offline, see issue])
  • Fonts (already in this repository)
  • Document templates (already in this repository)

All resources must be placed in the corresponding asset folder within the top-level directory "assets". Alternatively, the assets can be provided through Docker mounts, which avoids copying them into the container.
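The mount-based variant could look roughly like the following sketch. It wraps the documented `docker run` call in a hypothetical helper function, `generate_with_mounted_assets`, which is not part of this repository. The container path `/usr/inv3d/assets` is an assumption inferred from the documented output path `/usr/inv3d/out`; check the Dockerfile before relying on it.

```shell
# ASSUMPTION: the container reads its assets from /usr/inv3d/assets.
# This path is inferred from the documented output path /usr/inv3d/out
# and is not confirmed by the README. Override it if the Dockerfile differs.
INV3D_CONTAINER_ASSETS="${INV3D_CONTAINER_ASSETS:-/usr/inv3d/assets}"

# Hypothetical helper: run the generator with a host asset directory
# bind-mounted into the container instead of copied into the image.
# Usage: generate_with_mounted_assets HOST_ASSET_DIR [generator arguments...]
generate_with_mounted_assets() {
    # Bind mounts require an absolute host path, so resolve it first.
    host_assets=$(cd "$1" && pwd) || return 1
    shift
    docker run \
        --cpus=8 -it \
        --init \
        --mount "type=bind,source=$host_assets,target=$INV3D_CONTAINER_ASSETS" \
        --mount source=inv3d-volume,target=/usr/inv3d/out \
        inv3d-generator \
        "$@"
}
```

A call such as `generate_with_mounted_assets ./assets --num_workers 4 --num_samples 10 default` passes the remaining arguments to the generator unchanged. Note that mounting over the whole asset directory hides any fonts and templates already present at that path in the image, so the host directory would need to contain them as well; mounting individual asset subfolders is an alternative.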

Build docker image

docker build -t inv3d-generator PATH_TO_REPOSITORY

Getting started

Start dataset generation

docker run \
--cpus=8 -it \
--init \
--mount source=inv3d-volume,target=/usr/inv3d/out \
inv3d-generator \
--num_workers 4 \
--num_samples 10 \
default \
--resolution_rendering 448 \
--seed 42 \
--document_dpi 150

Resume dataset generation

docker run \
--cpus=8 -it \
--init \
--mount source=inv3d-volume,target=/usr/inv3d/out \
--entrypoint python \
inv3d-generator \
-u src/resume.py --num_workers 4
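Both commands above write to the named Docker volume `inv3d-volume`, which is not directly visible on the host file system. One common way to copy the generated files out is a short-lived helper container, sketched below. The function name `export_inv3d_volume`, the `alpine` helper image, and the in-container paths `/data` and `/export` are illustrative choices, not part of this repository.

```shell
# Hypothetical helper: copy the contents of the named volume "inv3d-volume"
# into a host directory.
# ASSUMPTION: "alpine" serves only as a small image that provides cp;
# any image with cp would work.
# Usage: export_inv3d_volume HOST_TARGET_DIR
export_inv3d_volume() {
    # Create the target directory and resolve it to an absolute path,
    # because bind mounts require absolute host paths.
    mkdir -p "$1" || return 1
    host_out=$(cd "$1" && pwd) || return 1
    docker run --rm \
        --mount source=inv3d-volume,target=/data,readonly \
        --mount "type=bind,source=$host_out,target=/export" \
        alpine \
        cp -a /data/. /export/
}
```

For example, `export_inv3d_volume ./inv3d-out` would place a copy of the volume contents in `./inv3d-out`. The volume itself is mounted read-only, so an interrupted copy cannot damage a dataset that is still being generated or resumed.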

Sample Files

| Name | Resolution | Dtype | Value range | Description |
| --- | --- | --- | --- | --- |
| flat_document.png | 1755x1240x3 (150 dpi) | uint8 | 0-255 | Complete document in perfect condition. |
| flat_information_delta.png | 1755x1240x3 (150 dpi) | uint8 | 0-255 | Displays all texts which represent invoice data. |
| flat_template.png | 1755x1240x3 (150 dpi) | uint8 | 0-255 | Empty invoice template. |
| flat_text_mask.png | 1755x1240x3 (150 dpi) | uint8 | 0-255 | All texts shown in the given document. |
| warped_angle.png | 448x448x2 | float32 | -π to π | Angle rotation of x- and y-axis induced by the warping. |
| warped_albedo.png | 448x448x3 | uint8 | 0-255 | Albedo map. |
| warped_BM.npz | 448x448x2 | float32 | 0-1 | Backward mapping. Defines for each pixel the relative pixel shift from warped to normalized image. |
| warped_curvature.npz | 448x448x1 | float32 | 0 to inf | Pixel-wise curvature of the warped document. |
| warped_depth.npz | 448x448x3 | float32 | 0 to inf | Depth per pixel between camera and document. |
| warped_document.png | 448x448x3 | uint8 | 0-255 | Warped document image. |
| warped_normal.npz | 448x448x3 | float32 | -inf to inf | Normals of the warped document. |
| warped_recon.png | 448x448x3 | uint8 | 0-255 | Warped document using a chess texture. |
| warped_text_mask.npz | 448x448x1 | bool | True/False | Boolean mask indicating text pixels. |
| warped_UV.npz | 448x448x3 | float32 | 0-1 | Warped texture coordinates. |
| warped_WC.npz | 448x448x3 | float32 | -inf to inf | Coordinates in 3D space. |
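After a long or resumed generation run, it can be useful to check that a sample contains every file from the table above. The sketch below does this with a hypothetical helper, `check_sample`, which is not part of this repository. It assumes that each sample is stored in its own directory containing exactly these file names; the README does not document the output layout, so adjust the path handling to what the generator actually writes.

```shell
# File names taken from the "Sample Files" table.
INV3D_SAMPLE_FILES="flat_document.png flat_information_delta.png
flat_template.png flat_text_mask.png warped_angle.png warped_albedo.png
warped_BM.npz warped_curvature.npz warped_depth.npz warped_document.png
warped_normal.npz warped_recon.png warped_text_mask.npz warped_UV.npz
warped_WC.npz"

# Hypothetical helper: print every expected file that is missing from the
# given sample directory. Returns 0 if the sample is complete, 1 otherwise.
# ASSUMPTION: one directory per sample, holding the files listed above.
check_sample() {
    sample_dir=$1
    status=0
    for name in $INV3D_SAMPLE_FILES; do
        if [ ! -f "$sample_dir/$name" ]; then
            printf 'missing: %s\n' "$sample_dir/$name"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_sample path/to/sample` prints one line per missing file and sets a non-zero exit status if anything is absent, so it can be used in a loop over all sample directories.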

Citation

If you use the code of our paper for scientific research, please consider citing:

@article{hertlein2023inv3d,
	title        = {Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping},
	author       = {Hertlein, Felix and Naumann, Alexander and Philipp, Patrick},
	year         = 2023,
	journal      = {International Journal on Document Analysis and Recognition (IJDAR)},
	publisher    = {Springer},
	pages        = {1--12}
}

Acknowledgements

This work is based on the dataset generation of Doc3D.

Affiliations

FZI Logo

License

This project is licensed under MIT unless another license is specified in a given subfolder.

About

Code to generate the Inv3D dataset from our paper "Inv3D: a high-resolution 3D invoice dataset for template-guided single-image document unwarping" (ICDAR) 2023.
