<h1> Extracting text from a card </h1>

An extremely important point to keep in mind: because we use fixed regions of interest (ROIs) to cut parts out of the image, every card must be photographed at **exactly** the same angle and the same coordinates.

In [1]:
# cv2 is the import name of the opencv-python package, used for the computer-vision steps
import cv2
# Pillow is used to display and edit images (showing them, rotating them, ...)
from PIL import Image
# pytesseract extracts text from images
import pytesseract
# sys.path[0] gives the base folder used to build file paths
import sys
# numpy handles the image arrays
import numpy as np

In [2]:
# Note for anyone else running this code: Tesseract-OCR must be installed separately,
# and this path must point to the tesseract executable on your machine
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
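
The hard-coded path above only fits a default Windows install. A minimal sketch of a more portable lookup, assuming Tesseract is either on the PATH or in the Windows location used above. find_tesseract is a hypothetical helper, not part of pytesseract:

```python
import os
import shutil

# Default Windows install location used in the cell above
WINDOWS_DEFAULT = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

def find_tesseract(fallback=WINDOWS_DEFAULT):
    """Return a path to the tesseract executable, or None if none is found.

    Hypothetical helper: looks on the PATH first, then tries the fallback.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if fallback and os.path.isfile(fallback):
        return fallback
    return None

tesseract_path = find_tesseract()
if tesseract_path is None:
    print("Tesseract was not found; install it or set the path by hand")
else:
    print("Using Tesseract at", tesseract_path)
    # pytesseract.pytesseract.tesseract_cmd = tesseract_path
```

The last line is left commented out so the sketch runs without pytesseract installed.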


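1. Loading an image with OpenCV. The image becomes a NumPy array in which every pixel is a triple of values from 0 to 255, one per color channel, saying how much of each color that pixel uses. Note that OpenCV stores the channels in [b, g, r] order, not [r, g, b]. To display an array as a picture, we use Pillow.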

In [3]:
# original image; passing 0 (cv2.IMREAD_GRAYSCALE) as a second argument would load it in grayscale
image = cv2.imread("1.jpg")
image

array([[[233, 217, 200],
        [233, 217, 200],
        [233, 217, 200],
        ...,
        [167, 170, 168],
        [170, 173, 171],
        [173, 176, 174]],

       [[236, 220, 203],
        [236, 220, 203],
        [237, 221, 204],
        ...,
        [167, 170, 168],
        [171, 174, 172],
        [174, 177, 175]],

       [[230, 214, 197],
        [230, 214, 197],
        [230, 214, 197],
        ...,
        [168, 171, 169],
        [172, 175, 173],
        [176, 179, 177]],

       ...,

       [[167, 191, 209],
        [167, 191, 209],
        [167, 191, 209],
        ...,
        [170, 200, 225],
        [170, 200, 225],
        [170, 200, 225]],

       [[166, 190, 208],
        [167, 191, 209],
        [168, 192, 210],
        ...,
        [170, 200, 225],
        [170, 200, 225],
        [170, 200, 225]],

       [[165, 189, 207],
        [168, 192, 210],
        [169, 193, 211],
        ...,
        [170, 200, 225],
        [170, 200, 225],
        [170, 200, 225]]], dtype=uint8)

In [4]:
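# Pillow's Image turns the array into a viewable picture instead of a grid of numbers.
# OpenCV arrays are BGR and Pillow expects RGB, so convert first to keep the colors right
Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)).show()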
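
The channel order matters whenever an OpenCV array is handed to Pillow: OpenCV uses B, G, R while Pillow assumes R, G, B, so without a conversion red and blue appear swapped. A small illustration on a made-up two-pixel array (not the card photo), using only NumPy:

```python
import numpy as np

# A 1x2 test image in OpenCV's BGR order: one pure-blue and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps B and R, giving the RGB order Pillow expects
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel, now written as [0 0 255]
print(rgb[0, 1])  # the red pixel, now written as [255 0 0]
```

cv2.cvtColor with cv2.COLOR_BGR2RGB performs the same reordering.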

2. Binarization. Photos will differ in shading and lighting, and we want them to be as similar to each other as possible, so a threshold is needed. Without this contrast step, the text extraction would likely be far less accurate than it is after thresholding. [Check out the docs!](https://docs.opencv.org/4.x/d7/d4d/tutorial_py_thresholding.html)

The best way is to convert the image to grayscale first, and then apply the threshold.

In [5]:
def grayscale(image):
    # cv2.COLOR_BGR2GRAY converts a BGR image to grayscale
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return gray_image

In [6]:
gray_image = grayscale(image)
# check the new grayscale image
Image.fromarray(gray_image).show()

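# imwrite saves the image to disk; we don't need the file, it just helps with testing.
# It returns True on success and False if the file could not be written
# (for example when the temp folder does not exist)
imag_path = sys.path[0]+"/temp/card-02-grey.jpg"
cv2.imwrite(imag_path, gray_image)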

False
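
cv2.imwrite does not create missing folders; it simply returns False when it cannot write the file. A small sketch of preparing the output path first, using only the standard library. output_path is a hypothetical helper, and the folder name temp follows the cell above:

```python
import tempfile
from pathlib import Path

def output_path(base_dir, file_name, folder="temp"):
    """Create base_dir/folder if needed and return the full file path as a string."""
    target_dir = Path(base_dir) / folder
    target_dir.mkdir(parents=True, exist_ok=True)
    return str(target_dir / file_name)

# Demonstrated in a throwaway directory; in the notebook base_dir would be sys.path[0]
with tempfile.TemporaryDirectory() as base:
    path = output_path(base, "card-02-grey.jpg")
    print(Path(path).parent.is_dir())  # the temp folder now exists
    # cv2.imwrite(path, gray_image) could then be called with this path
```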

In [7]:
# the image is now grayscale, so each pixel is a single value in the range 0 (completely black) to 255 (completely white)
gray_image

array([[214, 214, 214, ..., 169, 172, 175],
       [217, 217, 218, ..., 169, 173, 176],
       [211, 211, 211, ..., 170, 174, 178],
       ...,
       [194, 194, 194, ..., 204, 204, 204],
       [193, 194, 195, ..., 204, 204, 204],
       [192, 195, 196, ..., 204, 204, 204]], dtype=uint8)
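
To see where these values come from: OpenCV's BGR-to-gray conversion is a weighted sum, Y = 0.299 R + 0.587 G + 0.114 B. A quick check on the first pixel of the color array printed earlier, [233, 217, 200] in B, G, R order. to_gray is a name made up for this illustration:

```python
def to_gray(b, g, r):
    """Weighted sum used for BGR -> gray, rounded to the nearest integer."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# First pixel of the color image, in OpenCV's B, G, R order
print(to_gray(233, 217, 200))  # 214, the first value of gray_image
```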

In [8]:
# applying the threshold: we chose THRESH_BINARY because we want each pixel to be either white or black:
# pixels above 130 become 255 (white), all others become 0 (black).
# A pure black & white image is also easier for pytesseract to work with
thresh, gray_thresh_image = cv2.threshold(gray_image, 130, 255, cv2.THRESH_BINARY)
Image.fromarray(gray_thresh_image).show() # a saved file can also be shown with Image.open(sys.path[0]+"/temp/card-01-grey-thresh.jpg").show()


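# imwrite saves the image to disk; we don't need the file, it just helps with testing.
# It returns True on success and False if the file could not be written
imag_path = sys.path[0]+"/temp/card-02-grey-thresh.jpg"
cv2.imwrite(imag_path, gray_thresh_image)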

False

In [9]:
gray_thresh_image = np.asarray(gray_thresh_image)

gray_thresh_image

array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       ...,
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)
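
What THRESH_BINARY does can be written out in NumPy: every pixel strictly greater than the threshold becomes the maximum value, and everything else becomes 0. A minimal sketch on a tiny made-up array. binary_threshold is a hypothetical name, not an OpenCV function:

```python
import numpy as np

def binary_threshold(gray, thresh=130, max_value=255):
    """NumPy equivalent of cv2.THRESH_BINARY for a uint8 grayscale array."""
    return np.where(gray > thresh, max_value, 0).astype(np.uint8)

sample = np.array([[214, 130, 90],
                   [131, 0, 255]], dtype=np.uint8)
print(binary_threshold(sample))
```

A value equal to the threshold (130 here) ends up black, since the comparison is strict.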

3. ROI (region of interest). We need to separate the [card name], [card image] and [card type] from the picture before extracting the text. The previous steps prepare the image for text extraction; extracting the card image needs a different kind of preprocessing before "cutting" out its region of interest.
As noted at the start, all cards must be in the same position and at the same angle every time a picture is taken. This avoids extra preprocessing and keeps the ROIs accurate.

In [10]:
num_rows, num_cols = gray_thresh_image.shape
print("number of rows ",num_rows, " and columns ",num_cols)
# the number of rows and columns matter for the region extraction below. The ROI values will change once the proper
# values from the Arduino come into play

number of rows  1536  and columns  2048


In [11]:
# each ROI is given as [start, end] pixel ranges for rows and for columns
rows_cardName = [350, 1250]
columns_cardName = [310, 400]
image_ROI_cardName = gray_thresh_image[rows_cardName[0]:rows_cardName[1], columns_cardName[0]:columns_cardName[1]]
# now image_ROI_cardName contains the card name region of the card
Image.fromarray(image_ROI_cardName).show()

In [12]:
# each ROI is given as [start, end] pixel ranges for rows and for columns
rows_cardType = [350, 1250]
columns_cardType = [1050, 1150]

image_ROI_cardType = gray_thresh_image[rows_cardType[0]:rows_cardType[1], columns_cardType[0]:columns_cardType[1]]
# now image_ROI_cardType contains the card type region of the card
# Image.fromarray(image_ROI_cardType).show()
# rotate the region 90 degrees clockwise before showing it
Image.fromarray(cv2.rotate(image_ROI_cardType,cv2.ROTATE_90_CLOCKWISE)).show()
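
Since both regions are cut and rotated the same way, the coordinates can live in one place. A sketch assuming the ROI values from the cells above and an image of the size printed earlier (1536 x 2048). ROIS and crop_roi are hypothetical names, and np.rot90 with k=-1 stands in for cv2.rotate with cv2.ROTATE_90_CLOCKWISE:

```python
import numpy as np

# (row_start, row_end, col_start, col_end) for each region, taken from the cells above
ROIS = {
    "cardName": (350, 1250, 310, 400),
    "cardType": (350, 1250, 1050, 1150),
}

def crop_roi(image, name, rotate=True):
    """Cut the named region out of image and optionally rotate it 90 degrees clockwise."""
    r0, r1, c0, c1 = ROIS[name]
    rows, cols = image.shape[:2]
    if not (0 <= r0 < r1 <= rows and 0 <= c0 < c1 <= cols):
        raise ValueError(f"ROI {name!r} does not fit in a {rows}x{cols} image")
    region = image[r0:r1, c0:c1]
    return np.rot90(region, k=-1) if rotate else region

# Blank stand-in for gray_thresh_image, same size as the photo above
blank = np.full((1536, 2048), 255, dtype=np.uint8)
print(crop_roi(blank, "cardName", rotate=False).shape)  # (900, 90)
print(crop_roi(blank, "cardName").shape)                # (90, 900)
```

The bounds check is there because NumPy slicing silently truncates out-of-range regions, which would hide a photo taken at the wrong resolution.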

4. Extracting the text from [card type] & [card name]. The regions of interest have been identified, and we will now try to extract text from them. The best way is to write a function that is called every time we need an extraction. The pytesseract page (https://pypi.org/project/pytesseract/) has important documentation, including how to pass the OCR engine mode and page segmentation mode; in this case OEM 3 (the default engine mode) with PSM 3 (fully automatic page segmentation) works pretty well.


In [13]:
def extract_text(rows, columns) -> str:
    # print(rows[0], rows[1])
    # print(columns[0], columns[1])
    custom_oem_psm_config = r'--oem 3 --psm 3'
    # first we cut out the region of interest
    ROI_image = gray_thresh_image[rows[0]:rows[1], columns[0]:columns[1]]
    # then we rotate the image
    ROI_image = cv2.rotate(ROI_image, cv2.ROTATE_90_CLOCKWISE)
    # showing it just for testing
    Image.fromarray(ROI_image).show()
    # and we extract the text from it
    text = pytesseract.image_to_string(ROI_image, lang="eng", config=custom_oem_psm_config)
    return text
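
The config string passed to pytesseract is simply Tesseract's command-line flags. Tesseract accepts OCR engine modes 0 to 3 and page segmentation modes 0 to 13. For a crop that holds a single line of text, PSM 7 (treat the image as a single text line) may be worth trying instead of PSM 3, though that has not been tested on these cards. tesseract_config below is a hypothetical helper that only builds and validates the string:

```python
def tesseract_config(oem=3, psm=3):
    """Build a '--oem N --psm M' string, rejecting values outside Tesseract's documented ranges."""
    if oem not in range(0, 4):
        raise ValueError("oem must be between 0 and 3")
    if psm not in range(0, 14):
        raise ValueError("psm must be between 0 and 13")
    return f"--oem {oem} --psm {psm}"

print(tesseract_config())       # --oem 3 --psm 3
print(tesseract_config(psm=7))  # --oem 3 --psm 7
```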

In [14]:
# the string library provides the set of all ASCII letters
import string 

def clean_text(text:str) -> str:
    # ascii_letters contains all the letters of the English alphabet in lower and upper case
    included = string.ascii_letters
    # collect the kept characters in a list, because strings are immutable in Python
    new_str = []
    for char in text: # e.g. h, o, l, a, !
        if char in included or char == ' ' or char == '—':
            # append the character to the list if it is allowed
            new_str.append(char)
    # join the characters back into a string and remove leading/trailing spaces
    new_text = ''.join(new_str)
    return new_text.strip()
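
Note that strip() only removes spaces at the two ends, and that newlines are dropped rather than turned into spaces, so words on separate lines get glued together. A possible variant, sketched with the standard library only, that treats any whitespace as a word break and collapses repeated spaces. clean_text_collapsed and its extra parameter are made up for this example:

```python
import string

def clean_text_collapsed(text: str, extra: str = "\u2014") -> str:
    """Keep ASCII letters and the characters in extra; collapse whitespace to single spaces.

    The default extra is the long dash (U+2014) that clean_text above also keeps.
    """
    allowed = set(string.ascii_letters) | set(extra)
    kept = []
    for ch in text:
        if ch in allowed:
            kept.append(ch)
        elif ch.isspace():
            # newlines, tabs and form feeds all count as a word break
            kept.append(' ')
        # anything else (digits, punctuation) is dropped, as in clean_text
    # split() with no arguments discards empty pieces, which collapses runs of spaces
    return ' '.join(''.join(kept).split())

print(clean_text_collapsed("Lightning\nStrike  !!\n"))  # Lightning Strike
```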

In [15]:
text_cardName = clean_text(extract_text(rows_cardName, columns_cardName))
text_cardType = clean_text(extract_text(rows_cardType, columns_cardType))
print("the text from the top part (card name) is", text_cardName , " and from the middle part (card type) is", text_cardType)

the text from the top part (card name) is Lightning Strike  and from the middle part (card type) is Instant
