KKroliKK/document-cropper

Document-Cropper

Python document cropper which can be applied to images.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Algorithm Description
  5. Acknowledgments

About The Project

There are several articles and GitHub repos dedicated to document segmentation; however, I didn't find one that worked right out of the box, so I created this one. It can be used to preprocess document images for further text recognition or to save them in a proper format.

The description below explains the whole cropping process. Perhaps it will give you an idea of how to make it work better.

(back to top)

Built With

(back to top)

Installation

This package can be installed via pip:

pip install document-cropper

(back to top)

Usage

import document_cropper as dc

To crop an image, use:

dc.crop_image("path_to_image.jpg", "name_for_the_result.jpg")

If you want to continue processing the photo and save the cropping result to a variable:

cropped = dc.crop_image("path_to_image.jpg")

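You can also provide the input as an np.ndarray:

from skimage import io

# Load the image into an array before passing it to the cropper
image = io.imread("path_to_image.jpg")
cropped = dc.crop_image(image=image)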

If you want to see all stages of processing you can use:

# Save the stages as an image
dc.crop_image_pipeline("path_to_image.jpg", "name_for_the_result.jpg")

# Show the stages as a matplotlib figure
dc.crop_image_pipeline("path_to_image.jpg")

The main.py file contains examples of using these methods.

There are also demonstration() methods in this repository, which can be uncommented in main.py. They show how the code in each file works, to make it easier to figure out what the methods are doing. These demonstration() methods are present only in this repository and do not come with the package when it is installed via pip.

# segmentation.demonstration()
# corner_detection.demonstration()
# image_cropper.demonstration()
# cropper_pipeline.demonstration()

(back to top)

Algorithm Description

This implementation was based on this Inovex article. Code from the article didn't work out of the box, so I have reworked part of the code and implemented my own corner detection algorithm.

The algorithm consists of several steps: 1) preprocessing; 2) corner detection; 3) cropping.
I chose the best-performing methods and hyperparameters by testing each of them on a dataset of 200 photos.

  1. Convert the initial RGB image into a monochrome one.
    Documents are usually white and stand out strongly in the photo, so we can use contrast filters for our needs. Such filters work well with monochrome images.

  2. Apply a Gaussian filter
    The Gaussian filter blurs the image, thereby removing some artifacts. Tests showed that we get better segmentation results with this filter.

  3. Thresholding
    At this step I apply thresholding as the first binarization step. The article did not describe the use of a thresholding method, so I tested all the thresholding methods available in skimage and chose the best one. It turned out to be Otsu thresholding with a disk size of 8 pixels.

  4. Document selection
    This is the most important step of the algorithm. Mistakes at this step cause the whole cropping process to fail.
    After Otsu thresholding we get a picture with several white zones, one of which is our document. At this step I try to keep only the document's white zone. skimage has a method that clusters the pixels of disjoint white zones. The biggest cluster is our document, so I keep it and remove all other regions.

    This part of the algorithm should be improved. There are two cases in which the extraction works incorrectly. The first is when a white background region is connected to the document's region. The second is when a background region is bigger than the document. Examples of these problems are shown below:


    These issues should be handled somehow in the future.

  5. Fill holes
    At this step we remove the holes left by text.
    Two different binary methods for removing holes, from skimage and scipy, were used. I tried different hyperparameter values and found the best ones. I also tested the order in which these methods are applied.

    There is another issue. In the original article, this binary closing step was performed before extracting the document region (the previous step). Swapping the order of these steps gave a significant increase in quality: if binary closing is performed before the previous step, it glues the background to the document.

  6. Corner detection
    Now we have a segmentation mask of the document and can find the document's corners. For this purpose I extract a one-pixel-wide edge from the obtained mask. There is a problem with edge selection if the document extends beyond the image (as in the example below). In that case the edge extraction method loses some of the edges, so I pad the mask with False values before edge extraction. This solved the problem.

    Now we have the edge pixels. Some of them belong to the sides of the mask, others to its corners. Consider a side pixel: in its neighborhood, about half of the pixels are white and half are black. A corner pixel, in contrast, has far fewer than half white neighbors. In this way we can guess which pixels are more likely to be corners.

    Finally, we have to decide which 4 pixels to take as the corners. I go through the obtained list of candidate pixels and, for each corner of the image, select the candidate closest to it.

    The algorithm described above is entirely my own idea. In other articles and repos, the authors use the Hough line transform: they look for straight lines in the segmentation mask, find their intersections, and finally choose the corners from these intersections. I tested some variants of this approach and they did not give good results.

  7. Cutting out
    Once the corner coordinates are found, we can finally cut the document out of the image and rescale it to the correct shape.
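
The document selection, hole filling and corner detection steps above can be illustrated with a small sketch on a synthetic mask. This is not the package's code: it uses scipy.ndimage instead of skimage, and the helper names largest_region and find_corners, the window radius of 3 pixels and the 0.4 white-fraction threshold are all assumptions made for the example.

```python
import numpy as np
from scipy import ndimage


def largest_region(mask):
    """Keep only the largest connected white region of a boolean mask."""
    labels, count = ndimage.label(mask)
    if count == 0:
        return np.zeros_like(mask, dtype=bool)
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0  # label 0 is the black background, ignore it
    return labels == sizes.argmax()


def find_corners(mask, radius=3, max_white_fraction=0.4):
    """Return four (row, col) corner guesses for a filled document mask.

    radius and max_white_fraction are illustrative values, not the
    hyperparameters used by the package.
    """
    pad = radius + 1
    # Pad with False so a document touching the image border keeps its edge.
    padded = np.pad(mask, pad, constant_values=False)
    # One-pixel edge: mask pixels that disappear after a single erosion.
    edge = padded & ~ndimage.binary_erosion(padded)
    # Fraction of white pixels in the square window around every pixel.
    window = 2 * radius + 1
    white_fraction = ndimage.uniform_filter(padded.astype(float), size=window)
    # Side pixels see roughly half white; corner pixels see far less.
    candidates = np.argwhere(edge & (white_fraction < max_white_fraction)) - pad
    if len(candidates) == 0:
        raise ValueError("no corner candidates found")
    height, width = mask.shape
    image_corners = np.array(
        [[0, 0], [0, width - 1], [height - 1, width - 1], [height - 1, 0]]
    )
    # For each image corner, take the closest candidate pixel.
    distances = np.linalg.norm(
        candidates[None, :, :] - image_corners[:, None, :], axis=2
    )
    return candidates[distances.argmin(axis=1)]


# Synthetic thresholding result: a document, a text hole and a stray blob.
mask = np.zeros((60, 80), dtype=bool)
mask[10:50, 20:70] = True    # the document
mask[25:30, 30:60] = False   # a hole left by text
mask[2:6, 2:6] = True        # a small white background region

# Select the document first, then fill holes, as in steps 4 and 5.
document = ndimage.binary_fill_holes(largest_region(mask))
corners = find_corners(document)
print(corners)
```

The corners come back in the order top-left, top-right, bottom-right, bottom-left. On real photos the document edges are rarely axis-aligned, so the window radius and threshold would need tuning.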

(back to top)

Acknowledgments

(back to top)
