<h1 align="center">Guide for detecting areas and text in an image</h1> 

Important points for the protocol:

1. The circles on the paper must be clearly separated, so that they are easy to split.
2. Letters and lines that are not needed for processing must be drawn in a light color. This helps the thresholding step.
3. Use thick marks for the relevant lines.
4. Stickers must not cut across the lines; otherwise the affected section can no longer be separated.
5. The script works best with aligned uppercase letters (we also tried numbers and lowercase letters).
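
Point 2 can be illustrated with a small synthetic example. This is only a sketch in plain NumPy rather than OpenCV, and the gray levels (30 for a thick dark line, 200 for light auxiliary letters) are assumed values chosen for illustration. The threshold of 170 is the one used later in this guide.

```python
import numpy as np

# Synthetic grayscale page with a white background (255).
page = np.full((5, 8), 255, dtype=np.uint8)
page[2, :] = 30      # thick, dark relevant line (assumed gray level)
page[0, 1:4] = 200   # light auxiliary letters (assumed gray level)

threshold = 170
# Same rule as cv2.THRESH_BINARY: pixels above the threshold become 255, the rest 0.
binary = np.where(page > threshold, 255, 0).astype(np.uint8)

# Rows that still contain dark pixels (value 0) after binarization.
dark_rows = np.unique(np.argwhere(binary == 0)[:, 0])
print(dark_rows)
```

Only the dark line survives; the light letters end up on the white side of the threshold and disappear from the binary image.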

<img src= "sources/processing_detect_text.png" />

In [1]:
import numpy as np
import cv2 
import matplotlib.pyplot as plt
%matplotlib qt5

In [14]:
# Open image 
participant = 'S1'
path = './' + participant + r'/Scanner/'
pathSave = './' + participant + r'/Areas/'
pathLetters = './' + participant + r'/Letters/'
pathTexts = './' + participant + r'/Texts/'
imageName = 'A1.jpg'
names = ['Left', 'Right', 'Up', 'Down', 'Straight']
print(path)
image = cv2.imread(path + imageName) 
plt.imshow(image)

./S1/Scanner/


<matplotlib.image.AxesImage at 0x25d16931cc0>
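
The cells in this guide read from `Scanner` and write to `Areas`, `Letters` and `Texts` inside the participant folder. `cv2.imread` returns `None` for a missing file and `cv2.imwrite` returns `False` instead of raising when the target folder does not exist, so it can help to create the output folders first. The sketch below is one possible way to do that; `prepare_folders` is a hypothetical helper, not part of the original notebook.

```python
import tempfile
from pathlib import Path

def prepare_folders(root, participant):
    """Return the participant's folders, creating the output ones if missing."""
    base = Path(root) / participant
    folders = {name: base / name for name in ('Scanner', 'Areas', 'Letters', 'Texts')}
    # Scanner holds the input scans, so it is not created here.
    for name in ('Areas', 'Letters', 'Texts'):
        folders[name].mkdir(parents=True, exist_ok=True)
    return folders

# Demonstration in a temporary directory, so nothing is written next to the notebook.
with tempfile.TemporaryDirectory() as tmp:
    demo = prepare_folders(tmp, 'S1')
    created = sorted(p.name for p in (Path(tmp) / 'S1').iterdir())
    print(created)
```

In the notebook itself the call would be `prepare_folders('.', 'S1')`, run before the cells that save images.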

In [19]:
# Split images in the same page

middle = int(len(image)/2) 

image1 = image[:middle]
image2 = image[middle:]

# Select the half to process (image1 or image2) and binarize it

setImage = image2
threshold = 170
# 170
grayImage = cv2.cvtColor(setImage, cv2.COLOR_BGR2GRAY)
_, binaryImage = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)

# Binarization for getting numbers

threshold = 40
# 45
_, number = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
kernel = np.ones((3, 3), np.uint8)
number = cv2.dilate(~number, kernel, iterations=4)

# Remove the numbers from the image
# (uint8 subtraction wraps around: a pixel set only in `number` becomes 1, not 0)

result = ~binaryImage - (number)
plt.imshow(result)

# Get the external contour and isolate the main circle

r,c = np.shape(grayImage)
ext = np.zeros((r,c), np.dtype('uint8'))
    
contour,_ = cv2.findContours(result, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
ext = cv2.drawContours(ext, contour, -1, 255, -1)

# delete small points in the image
kernel = np.ones((2, 2), np.uint8)
extErode= cv2.erode(ext, kernel, iterations=5)
extDilate = cv2.dilate(extErode, kernel, iterations=5)

# Find where the circle is and define a cropped region
points = np.argwhere(extDilate==255) # find where the white pixels (value 255) are
points = np.fliplr(points).astype(np.int32) # store them as x,y instead of row,col; cv2.boundingRect expects int32 or float32
x, y, w, h = cv2.boundingRect(points) # create a rectangle around those points
x, y, w, h = max(x-10, 0), max(y-10, 0), w+20, h+20 # make the box a little bigger without leaving the image

# Crop the image
imaIn = cv2.bitwise_and(~result[y:y+h, x:x+w], ext[y:y+h, x:x+w]) # bitwise AND avoids uint8 overflow
kernel = np.ones((3, 3), np.uint8)
splitArea = cv2.dilate(~imaIn, kernel, iterations=3)
plt.imshow(splitArea)

# Identify the sections and assign a label to each one

sections, labels = cv2.connectedComponents(imaIn)
print('Number of sections: ' + str(sections-1))

# Extract each section, mask the original image with it, and save it
# (label 0 is the background and is processed like any other label)
for i in range(sections):
    area = np.sum(labels==i)
    if (area > 200):
        print ('Area ', str(area))
        section =  labels.copy()
        section[section != i] = 0
        section[section == i] = 255
        newSection = section.astype(np.uint8)
        newImage = cv2.bitwise_and(setImage[y:y+h, x:x+w], setImage[y:y+h, x:x+w], mask=newSection)
        cv2.imwrite( pathSave + names[1] + '_' + str(area) +'.png', newImage)

print('Images saved in folder')


Number of sections: 24
Area  1424288
Area  3134509
Area  560764
Area  200351
Area  174386
Images saved in folder
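
The cell above combines `uint8` masks with `-` and `*`. NumPy performs these operations modulo 256, which is easy to overlook. The sketch below shows the effect on four representative pixel pairs and two overflow-free alternatives in plain NumPy; in OpenCV, `cv2.subtract` and `cv2.bitwise_and` behave like the last two lines.

```python
import numpy as np

a = np.array([255, 255, 0, 0], dtype=np.uint8)
b = np.array([255, 0, 255, 0], dtype=np.uint8)

wrapped_diff = a - b   # 0 - 255 wraps around to 1
wrapped_prod = a * b   # 255 * 255 = 65025, which is 1 modulo 256

# Saturating subtraction: widen, subtract, clip back to 0..255.
saturated_diff = np.clip(a.astype(np.int16) - b.astype(np.int16), 0, 255).astype(np.uint8)
# Combine two masks without overflow.
both_set = np.bitwise_and(a, b)

print(wrapped_diff, wrapped_prod, saturated_diff, both_set)
```

Because `findContours` and `connectedComponents` treat every non-zero pixel as foreground, a wrapped value of 1 still counts as part of the mask.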


<h3 align="center">Section for text processing - One folder with images</h3> 

This part detects the text in a group of images.

In [20]:
import os

files = os.listdir(pathSave)
print (files)


['Left_1401482.png', 'Left_3335282.png', 'Left_354293.png', 'Left_431340.png', 'Right_1424288.png', 'Right_174386.png', 'Right_200351.png', 'Right_3134509.png', 'Right_560764.png']
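
The file names encode the direction and the pixel count of each section (`<direction>_<pixels>.png`). A minimal sketch for recovering both values is shown below; `parse_section_name` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path

def parse_section_name(file_name):
    """Split a name such as 'Left_354293.png' into ('Left', 354293)."""
    stem = Path(file_name).stem              # drop the extension
    direction, pixels = stem.rsplit('_', 1)  # split at the last underscore
    return direction, int(pixels)

example_files = ['Left_1401482.png', 'Left_354293.png', 'Right_174386.png']
# Sort the sections from smallest to largest area.
by_area = sorted(example_files, key=lambda name: parse_section_name(name)[1])
print(by_area)
```

Sorting by the numeric area avoids the alphabetical order returned by `os.listdir`, where `Left_3335282.png` comes before `Left_354293.png`.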


In [42]:
for imageName in files:
    print ('Processing: ' + imageName)
    
    image = cv2.imread(pathSave + imageName) 
    threshold = 150
    grayImage = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binaryImage = cv2.threshold(grayImage, threshold, 255, cv2.THRESH_BINARY)
    kernel = np.ones((2, 2), np.uint8)
    splitArea = cv2.dilate(~binaryImage, kernel, iterations=1)
    
    r,c = np.shape(grayImage)
    ext = np.zeros((r,c), np.dtype('uint8'))

    contour,_ = cv2.findContours(~splitArea, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    ext = cv2.drawContours(ext, contour, -1, 255, -1)

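    # Keep only the dark pixels inside the section (bitwise AND avoids uint8 overflow)
    imaIn = cv2.bitwise_and(splitArea, ext)

    # Save the text as black on a white background
    cv2.imwrite(pathLetters + imageName[:-4] + '.png', ~imaIn)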




Processing: Left_1401482.png
Processing: Left_3335282.png
Processing: Left_354293.png
Processing: Left_431340.png
Processing: Right_1424288.png
Processing: Right_174386.png
Processing: Right_200351.png
Processing: Right_3134509.png
Processing: Right_560764.png


In [53]:
import os

files = os.listdir(pathLetters)
print (files)
imageName = 'Left_354293.png'

['Left_1401482.png', 'Left_3335282.png', 'Left_354293.png', 'Left_431340.png', 'Right_1424288.png', 'Right_174386.png', 'Right_200351.png', 'Right_3134509.png', 'Right_560764.png']


In [54]:
import subprocess

for file in files:
    baseName = file[:-4]
    # Tesseract writes its result to <output base>.txt
    subprocess.run(['tesseract', pathLetters + file, pathTexts + baseName, '-l', 'eng', '--psm', '6'], check=True)
    with open(pathTexts + baseName + '.txt', 'r', encoding='utf-8') as f:
        text = f.read()
    direction, pixels = baseName.split('_')
    print('Direction: ' + direction + '\n' + 'Area: ' + text + 'Pixeles: ' + pixels + '\n')

Direction: Left
Area: Pixeles: 1401482

Direction: Left
Area: 1"
‘
. *
:
Pixeles: 3335282

Direction: Left
Area: FA
Pixeles: 354293

Direction: Left
Area: F3
Pixeles: 431340

Direction: Right
Area: Pixeles: 1424288

Direction: Right
Area: FS
Pixeles: 174386

Direction: Right
Area: F2
Pixeles: 200351

Direction: Right
Area: Pixeles: 3134509

Direction: Right
Area: F3
Pixeles: 560764
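
Several sections above return empty text or stray punctuation. One way to separate usable labels from noise is to accept only lines made of uppercase letters and digits. The sketch below assumes that the labels follow this pattern; both the pattern and the helper name `clean_label` are assumptions made for illustration.

```python
import re

def clean_label(text):
    """Return the first line made only of uppercase letters and digits, or None."""
    for line in text.splitlines():
        candidate = line.strip()
        if re.fullmatch(r'[A-Z0-9]+', candidate):
            return candidate
    return None

print(clean_label('F3\n'))               # a clean reading
print(clean_label('1"\n‘\n. *\n:\n'))    # only noise
print(clean_label(''))                   # empty OCR result
```

This filter only rejects obvious noise; it cannot tell whether a well-formed reading contains a misread character.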



Useful information for tesseract commands:

tesseract --help-psm
Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
       bypassing hacks that are Tesseract-specific.
 
Example command:
tesseract numero2.png outputbase -l eng --psm 6
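
The same command can be built as an argument list, which avoids quoting problems when paths contain spaces. This is a sketch under the assumption that the `tesseract` executable is on the `PATH`; `tesseract_command` and `run_tesseract` are hypothetical helper names, and only the options shown in this guide (`-l` and `--psm`) are used.

```python
import subprocess

def tesseract_command(image_path, output_base, lang='eng', psm=6):
    """Build the argument list for one tesseract call."""
    if not 0 <= psm <= 13:
        raise ValueError('psm must be between 0 and 13')
    return ['tesseract', str(image_path), str(output_base), '-l', lang, '--psm', str(psm)]

def run_tesseract(image_path, output_base, lang='eng', psm=6):
    """Run tesseract; the recognized text is written to <output_base>.txt."""
    subprocess.run(tesseract_command(image_path, output_base, lang, psm), check=True)

command = tesseract_command('numero2.png', 'outputbase')
print(' '.join(command))
```

For a section that holds a single short label, trying `psm=7` (single text line) or `psm=8` (single word) is a one-argument change.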

In [None]:
# Optional crop around the detected region; uses `ext` and `splitArea` from the text-processing loop
points = np.argwhere(ext==255) # find where the white pixels (value 255) are
points = np.fliplr(points).astype(np.int32) # store them as x,y instead of row,col; cv2.boundingRect expects int32 or float32
x, y, w, h = cv2.boundingRect(points) # create a rectangle around those points
x, y, w, h = max(x-10, 0), max(y-10, 0), w+20, h+20 # make the box a little bigger without leaving the image
crop = ext[y:y+h, x:x+w] # create a cropped region of the mask
imaIn = cv2.bitwise_and(splitArea[y:y+h, x:x+w], ext[y:y+h, x:x+w])
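
The cropping step pads the bounding box by 10 pixels on each side. When the region touches the image border, the padded start index can become negative, and a negative start in a NumPy slice counts from the end of the array, which gives an empty or wrong crop. The sketch below computes a padded box that is clamped to the image; it uses plain NumPy, and `padded_bounding_box` is a hypothetical helper name.

```python
import numpy as np

def padded_bounding_box(mask, pad=10):
    """Return (x, y, w, h) around the non-zero pixels of mask, padded and clamped."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None  # nothing to crop
    height, width = mask.shape
    y0 = max(int(rows.min()) - pad, 0)
    y1 = min(int(rows.max()) + 1 + pad, height)
    x0 = max(int(cols.min()) - pad, 0)
    x1 = min(int(cols.max()) + 1 + pad, width)
    return x0, y0, x1 - x0, y1 - y0

mask = np.zeros((50, 60), dtype=np.uint8)
mask[5:20, 30:40] = 255   # region close to the top border
box = padded_bounding_box(mask)
print(box)
```

The returned values can be used directly as `mask[y:y+h, x:x+w]`.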