# How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes

Building a document scanner with OpenCV can be accomplished in just three simple steps:

**Step 1**: Detect edges.

**Step 2**: Use the edges in the image to find the contour (outline) representing the piece of paper being scanned.

**Step 3**: Apply a perspective transform to obtain the top-down view of the document.

Really. That’s it.

Only three steps and you’re on your way to submitting your own document scanning app to the App Store.

Sound interesting?

Read on and unlock the secrets to building a mobile scanner app of your own.

OpenCV and Python versions:

This example will run on Python 2.7/3+ and OpenCV 2.4/3+.

https://youtu.be/yRer1GC2298

<video src="videocourse/Building a Kick-Ass Document Scanner using Computer Vision, OpenCV, and Python.mp4" width="900px" height="580px" controls="controls"></video>

Last week I gave you a special treat: my very own ```transform.py``` module that I use in all my computer vision and image processing projects. [You can read more about this module here](https://pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/).

Whenever you need to perform a 4 point perspective transform, you should be using this module.

And you guessed it, we’ll be using it to build our very own document scanner.

So let’s get down to business.

Open up your favorite Python IDE (I like Sublime Text 2), create a new file, name it scan.py, and let’s get started.

In [14]:
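import sys, os
# '__file__' is a string literal here (a notebook has no __file__), so
# dirname() returns '' and abspath('') resolves to the current working directory
BASE_DIR = os.path.abspath(os.path.dirname('__file__'))
# make the local pyimagesearch package importable
sys.path.append(BASE_DIR)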

# import the necessary packages
from pyimagesearch.transform import four_point_transform
from skimage.filters import threshold_local
import numpy as np
import argparse
import cv2
import imutils

The import statements above pull in the Python packages that we’ll need.

We’ll start by importing our ```four_point_transform```  function which I discussed last week.

We’ll also be using the ```imutils``` module, which contains convenience functions for resizing, rotating, and cropping images. You can read more about ```imutils``` in [this post](https://pyimagesearch.com/2015/02/02/just-open-sourced-personal-imutils-package-series-opencv-convenience-functions/). To install ```imutils```, simply run:

> $ pip install --upgrade imutils

Next up, let’s import the ```threshold_local```  function from scikit-image. This function will help us obtain the “black and white” feel to our scanned image.

Note (15 January 2018): The ```threshold_adaptive```  function has been deprecated. This post has been updated to make use of ```threshold_local``` .

Lastly, we’ll use ```NumPy``` for numerical processing, ```argparse```  for parsing command line arguments, and ```cv2```  for our OpenCV bindings.




In [15]:
'''
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required = True,
	help = "Path to the image to be scanned")
args = vars(ap.parse_args())
'''


The block above handles parsing our command line arguments (it is wrapped in a triple-quoted string here so that it does not execute inside the notebook). We need only a single switch, ```--image```, which is the path to the image that contains the document we want to scan.
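Because a notebook has no command line, one option is to keep the parser but hand `parse_args` an explicit list of arguments. The sketch below assumes a hypothetical wrapper named `parse_scan_args`; the parser itself is the same one shown above.

```python
import argparse


def parse_scan_args(argv=None):
    # argv=None makes argparse read sys.argv (normal script usage);
    # passing a list lets the same parser run inside a notebook.
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
                    help="Path to the image to be scanned")
    return vars(ap.parse_args(argv))


# Notebook-style call with an explicit argument list.
args = parse_scan_args(["--image", "images/receipt.jpg"])
```

The result is a plain dictionary, so the rest of the script can keep reading `args["image"]` either way.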

Now that we have the path to our image, we can move on to Step 1: Edge Detection.



In [16]:
# in the notebook, set the image path directly instead of parsing the command line
args = {"image": "images/receipt.jpg"}

# Step 1: Edge Detection


In [17]:
# load the image and compute the ratio of the old height
# to the new height, clone it, and resize it
image = cv2.imread(args["image"])
ratio = image.shape[0] / 500.0
orig = image.copy()
image = imutils.resize(image, height = 500)
# convert the image to grayscale, blur it, and find edges
# in the image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(gray, 75, 200)
# show the original image and the edge detected image
print("STEP 1: Edge Detection")
cv2.imshow("Image", image)
cv2.imshow("Edged", edged)
cv2.waitKey(0)
cv2.destroyAllWindows()

STEP 1: Edge Detection


First, we load our image off disk with `cv2.imread`.

In order to speed up image processing, as well as make our edge detection step more accurate, we resize our scanned image to have a height of 500 pixels using `imutils.resize`.

We also take special care to keep track of the `ratio` of the original height of the image to the new height. This will allow us to perform the scan on the original image rather than the resized image.
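To make the role of the ratio concrete, here is a small worked example. The image height and corner coordinates are made up for illustration, and `scale_to_original` is a hypothetical helper. Because the resize preserves the aspect ratio, the same factor applies to both x and y.

```python
def scale_to_original(points, original_height, resized_height=500):
    # same ratio as in the scanner: original height / resized height
    ratio = original_height / float(resized_height)
    return [(x * ratio, y * ratio) for (x, y) in points]


# Corners found on the 500-pixel-high working copy (made-up values).
corners_small = [(50, 40), (300, 45), (310, 480), (45, 470)]

# Map them back onto an original image that is 2000 pixels high.
corners_full = scale_to_original(corners_small, original_height=2000)
```

With an original height of 2000 pixels the ratio is 4.0, so the corner at (50, 40) on the working copy sits at (200.0, 160.0) on the original.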

From there, we convert the image from BGR (the channel order OpenCV uses when loading images) to grayscale, perform Gaussian blurring to remove high frequency noise (aiding in contour detection in Step 2), and perform Canny edge detection.

The output of Step 1 is then displayed by the two `cv2.imshow` calls.

![](images/receipt-edge-detected.jpg)

# Step 2: Finding Contours

In [20]:
# find the contours in the edged image, keeping only the
# largest ones, and initialize the screen contour
cnts = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:5]
# loop over the contours
screenCnt = None
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)
	# if our approximated contour has four points, then we
	# can assume that we have found our screen
	if len(approx) == 4:
		screenCnt = approx
		break
# show the contour (outline) of the piece of paper
print("STEP 2: Find contours of paper")
#print(screenCnt,type(screenCnt))
if screenCnt is not None:
	cv2.drawContours(image, [screenCnt], -1, (0, 255, 0), 2)
cv2.imshow("Outline", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

STEP 2: Find contours of paper


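Note: writing ```if screenCnt:``` raises an error, because a NumPy array with more than one element has no single truth value:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Since ```screenCnt``` starts out as ```None```, the safe test is ```if screenCnt is not None:```.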
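The behaviour is easy to reproduce in isolation. In this sketch the contour values are made up and `contour_found` is a hypothetical helper; it shows that the plain truth test fails on an array while an explicit `None` check handles both outcomes.

```python
import numpy as np


def contour_found(screen_cnt):
    # Explicit None check: works for both None and NumPy arrays.
    return screen_cnt is not None


# A made-up four-point contour in the shape OpenCV returns: (4, 1, 2).
quad = np.array([[[60, 40]], [[340, 70]], [[320, 260]], [[40, 230]]])

try:
    bool(quad)  # what "if screenCnt:" does under the hood
    ambiguous = False
except ValueError:
    ambiguous = True
```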

# Step 3: Apply a Perspective Transform & Threshold


In [21]:
# apply the four point transform to obtain a top-down
# view of the original image
warped = four_point_transform(orig, screenCnt.reshape(4, 2) * ratio)
# convert the warped image to grayscale, then threshold it
# to give it that 'black and white' paper effect
warped = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
T = threshold_local(warped, 11, offset = 10, method = "gaussian")
warped = (warped > T).astype("uint8") * 255
# show the original and scanned images
print("STEP 3: Apply perspective transform")
cv2.imshow("Original", imutils.resize(orig, height = 650))
cv2.imshow("Scanned", imutils.resize(warped, height = 650))
cv2.waitKey(0)

STEP 3: Apply perspective transform

