<h1 align="center">Guide for detecting areas and text in an image</h1> 

Important things for the protocol:

1. Circles in the paper have to be really separate. It will be easy split both circles.
2. Unncessary letters and lines for the processing have to be in light color. It will help in umbralization process. 
3. It is important to use thick marks for relevant lines
4. Stickers can't cut the lines because after it will not be possible separete this section
5. Script works better with uppercase letters while they are aligned (we tryed with numbers and lowercase letter)

<img src= "sources/processing_detect_text.png" />

In [47]:
import numpy as np
import cv2 
import matplotlib.pyplot as plt
%matplotlib qt5

In [151]:
# Open image 
path = r'scanner/'
pathSave = r'./areas/'
pathNumbers = r'./numbers/'
pathTexts = r'./texts/'
imageName = '33.jpg'
image = cv2.imread(path + imageName) 
plt.imshow(image)

<matplotlib.image.AxesImage at 0x2a294acc278>

In [152]:
# Split images in the same page

middle = int(len(image)/2) 

image1 = image[:middle]
plt.imshow(image1)

image2 = image[middle:]

plt.imshow(image1)



<matplotlib.image.AxesImage at 0x2a294ace588>

In [154]:
# Select image (image1 or image 2) for processing and binarization 

setImage = image1
threshold = 170
# 170
grayImage = cv2.cvtColor(setImage, cv2.COLOR_BGR2GRAY)
_, binaryImage = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
plt.imshow(binaryImage)



<matplotlib.image.AxesImage at 0x2a294ace4e0>

In [158]:
# Binarization for getting numbers

threshold = 55
# 45
_, number = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
kernel = np.ones((3, 3), np.uint8)
number = cv2.dilate(~number, kernel, iterations=4)

plt.imshow(number)

<matplotlib.image.AxesImage at 0x2a294a88710>

In [159]:
# Delete number in the image 

result = ~binaryImage - (number)
plt.imshow(result)


<matplotlib.image.AxesImage at 0x2a294ab7d30>

In [160]:
# Get exteral contour and separate main circle 

r,c = np.shape(grayImage)
ext = np.zeros((r,c), np.dtype('uint8'))
    
contour,_ = cv2.findContours(result, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
ext = cv2.drawContours(ext, contour, -1, 255, -1)
plt.imshow(ext)

<matplotlib.image.AxesImage at 0x2a294acc7b8>

In [161]:
# delete small points in the image
kernel = np.ones((2, 2), np.uint8)
extErode= cv2.erode(ext, kernel, iterations=5)
extDilate = cv2.dilate(extErode, kernel, iterations=5)
plt.imshow(extDilate)

<matplotlib.image.AxesImage at 0x2a294ae2400>

In [162]:
# find where the circle is and make a cropped region
points = np.argwhere(extDilate==255) # find where the black pixels are
points = np.fliplr(points) # store them in x,y coordinates instead of row,col indices
x, y, w, h = cv2.boundingRect(points) # create a rectangle around those points
x, y, w, h = x-10, y-10, w+20, h+20 # make the box a little bigger
crop = ext[y:y+h, x:x+w] # create a cropped region of the gray image
plt.imshow(crop)

<matplotlib.image.AxesImage at 0x2a294aed0b8>

In [163]:
# Mask with external contour 
resultCropped = result[y:y+h, x:x+w]
imaIn = ~result[y:y+h, x:x+w]*ext[y:y+h, x:x+w]
kernel = np.ones((3, 3), np.uint8)
splitArea = cv2.dilate(~imaIn, kernel, iterations=3)
plt.imshow(splitArea)

<matplotlib.image.AxesImage at 0x2a294acefd0>

In [164]:
# Identify secction and put specific lables for each one

sections, labels = cv2.connectedComponents(imaIn)
print('Number of sections: ' + str(sections-1))
plt.imshow(labels)

Number of sections: 2


<matplotlib.image.AxesImage at 0x2a294ab7cf8>

In [165]:
# Get each section, make a mask with original image and save it
for i in range(sections):
    area = np.sum(labels==i)
    if (area > 200):
        print ('Area ', str(area))
        section =  labels.copy()
        section[section != i] = 0
        section[section == i] = 255
        newSection = section.astype(np.uint8)
        newImage = cv2.bitwise_and(setImage[y:y+h, x:x+w], setImage[y:y+h, x:x+w], mask=newSection)
        cv2.imwrite( pathSave + str(area) +'.png', newImage)

print('Images saved in folder')



Area  365636
Area  3109991
Images saved in folder


<h3 align="center">Secction for text processing - One image</h3> 

In this part is posible to load an image and get the text

In [166]:
# Load letter image  
imageName = '3109991.png'
image = cv2.imread(pathSave + imageName) 
plt.imshow(image)

<matplotlib.image.AxesImage at 0x2a2a629c6a0>

In [134]:
# Separate sections

threshold = 200
grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, binaryImage = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
kernel = np.ones((2, 2), np.uint8)
splitArea = cv2.dilate(~binaryImage, kernel, iterations=2)
plt.imshow(splitArea)

sections, labels = cv2.connectedComponents(splitArea)
print('Number of sections: ' + str(sections-1))
plt.imshow(labels)



Number of sections: 194


<matplotlib.image.AxesImage at 0x2a294989eb8>

In [135]:
# Get only the number

for i in range(sections):
    area = np.sum(labels==i)
    
    if (area > 200 and area < 10000):
        print ('Area ', str(area))
        section =  labels.copy()
        section[section != i] = 0
        section[section == i] = 255
        number = section.astype(np.uint8)
        cv2.imwrite( pathNumbers + imageName[:-4] + '.png', ~number)

plt.imshow(binaryImage)

Area  271
Area  499
Area  453
Area  564
Area  428
Area  919
Area  390
Area  233
Area  307
Area  412
Area  373
Area  741
Area  5307
Area  4306
Area  5197
Area  4252
Area  4453
Area  4526
Area  5280
Area  5332
Area  5249
Area  2709
Area  297
Area  9425
Area  7331
Area  9442
Area  9460
Area  7168
Area  7048
Area  9418
Area  7306
Area  9416
Area  4508
Area  8222
Area  7403
Area  8197
Area  7331
Area  8197
Area  7122
Area  8193
Area  7409
Area  8113
Area  4441


<matplotlib.image.AxesImage at 0x2a2949ae160>

In [168]:
# Use tesseract library for getting text 

import os

pathNumbers = './areas/'
commandLine = 'tesseract ' + pathNumbers + imageName[:-4] + '.png' + ' ' + pathTexts + imageName[:-4] + ' -l eng --psm 6'
os.system(commandLine)
f = open(pathTexts + imageName[:-4] + '.txt', "r")
text = f.read()
print('Area ' + text + 'tiene ' + imageName[:-4])

Area A1 A2 A3 A4 AS |
F1 F2 F3 F4 F5
C1 C2 C3 C4 Cs
tiene 3109991


<h3 align="center">Secction for text processing - One folder with images</h3> 

In this part is posible to detect the text in a group of image 

In [101]:
import os

files = os.listdir(pathSave)
print (files)


['118411.png', '13672478.png', '2272593.png', '239763.png', '248540.png', '301128.png', '541542.png']


In [102]:
for imageName in files:
    print ('Processing: ' + imageName)
    if int(imageName[:-4]) < 13000000:
        image = cv2.imread(pathSave + imageName) 
        threshold = 200
        grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        _, binaryImage = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
        kernel = np.ones((2, 2), np.uint8)
        splitArea = cv2.dilate(~binaryImage, kernel, iterations=2)
        sections, labels = cv2.connectedComponents(splitArea)
        for i in range(sections):
            area = np.sum(labels==i)
            if (area > 400 and area < 10000):
                section =  labels.copy()
                section[section != i] = 0
                section[section == i] = 255
                number = section.astype(np.uint8)
                kernel = np.ones((2, 2), np.uint8)
                numberErode= cv2.erode(number, kernel, iterations=1)
                cv2.imwrite( pathNumbers + imageName[:-4] + '.png', ~numberErode)
                #cv2.imwrite( pathNumbers + imageName[:-4] + '.png', ~number)
                commandLine = 'tesseract ' + pathNumbers + imageName[:-4] + '.png' + ' ' + pathTexts + imageName[:-4] + ' -l eng --psm 6'
                os.system(commandLine)
                f = open(pathTexts + imageName[:-4] + '.txt', "r")
                text = f.read()
                print('Area: ' + text + 'Pixeles: ' + imageName[:-4] +'\n')


Processing: 118411.png
Area: 4
Pixeles: 118411

Processing: 13672478.png
Processing: 2272593.png
Processing: 239763.png
Area: 1
Pixeles: 239763

Processing: 248540.png
Area: 2
Pixeles: 248540

Processing: 301128.png
Area: 3
Pixeles: 301128

Processing: 541542.png
Area: o
Pixeles: 541542

Area: â€”â€”
Pixeles: 541542



Useful information for tesserat commands:

tesseract --help-psm
Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
       bypassing hacks that are Tesseract-specific.
 
 tesseract numero2.png outputbase -l eng --psm 6