<a href="https://colab.research.google.com/github/rahiakela/computer-vision-research-and-practice/blob/main/opencv-projects-and-guide/ocr-with-opencv-and-tesseract/07_improving_results_with_tesseract_options.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##Improving OCR Results with Tesseract Options

In [None]:
%%shell

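# install the Tesseract OCR engine (-y skips the interactive confirmation prompt)
sudo apt install -y tesseract-ocr
# pytesseract is the Python wrapper used throughout this notebook
pip install pytesseract
pip install Pillow==9.0.0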

After the installation finishes, restart the Colab runtime so that the newly installed packages are picked up.

In [1]:
import cv2
import pytesseract
import csv
import numpy as np


from matplotlib import pyplot as plt
from google.colab.patches import cv2_imshow

%matplotlib inline

In [2]:
# point pytesseract at the Tesseract binary installed by apt
pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'

Let's download images.

In [None]:
%%shell

wget https://github.com/rahiakela/computer-vision-research-and-practice/raw/main/opencv-projects-and-guide/ocr-with-opencv-and-tesseract/images/text-orient-1.png
wget https://github.com/rahiakela/computer-vision-research-and-practice/raw/main/opencv-projects-and-guide/ocr-with-opencv-and-tesseract/images/text-orient-2.png

In [3]:
!tesseract --help-psm

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
       bypassing hacks that are Tesseract-specific.
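
The mode number from this list is passed to Tesseract through the `--psm` option, which pytesseract forwards via its `config` argument. As a minimal sketch, the helper below builds such a config string and rejects values outside the range listed above. `build_config` and its `extra` parameter are hypothetical names introduced for illustration; they are not part of pytesseract, and the valid range may differ in other Tesseract versions.

```python
# Hypothetical helper (not part of pytesseract): builds the string that is
# passed as `config=` from a validated page segmentation mode.
PSM_MIN, PSM_MAX = 0, 13  # range printed by `tesseract --help-psm` above


def build_config(psm=None, extra=""):
    """Return a Tesseract config string such as "--psm 6"."""
    parts = []
    if psm is not None:
        is_int = isinstance(psm, int) and not isinstance(psm, bool)
        if not is_int or not PSM_MIN <= psm <= PSM_MAX:
            raise ValueError(
                f"psm must be an integer in [{PSM_MIN}, {PSM_MAX}], got {psm!r}"
            )
        parts.append(f"--psm {psm}")
    if extra:
        # any additional options are appended unchanged
        parts.append(extra.strip())
    return " ".join(parts)


print(build_config(psm=6))   # --psm 6
print(repr(build_config()))  # ''
```

An empty string leaves Tesseract on its default mode, so the result can be passed as `config` whether or not a mode was chosen.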


##PSM 0

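Orientation and script detection (OSD) examines the input image, but instead of returning the
actual OCR’d text, it returns two pieces of information, each with a confidence value:

* How the page is oriented, in degrees, along with the rotation needed to correct it
* The detected script (writing system), such as Latin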

In [4]:
def text_orientation(img_path, options):
  image = cv2.imread(img_path)

  # OpenCV loads images as BGR, while pytesseract assumes RGB channel ordering
  image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

  # determine the text orientation
  results = pytesseract.image_to_osd(image_rgb, output_type=pytesseract.Output.DICT, config=options)

  print(f"Page number: {results['page_num']}")
  print(f"Orientation: {results['orientation']}")
  print(f"Rotate: {results['rotate']}")
  print(f"Orientation confidence: {results['orientation_conf']}")
  print(f"Script: {results['script']}")
  print(f"Script confidence: {results['script_conf']}")

In [5]:
text_orientation("text-orient-1.png", options="--psm 0")

Page number: 0
Orientation: 0
Rotate: 0
Orientation confidence: 4.51
Script: Latin
Script confidence: 4.58


In [None]:
text_orientation("text-orient-2.png", options="--psm 0")

Page number: 0
Orientation: 90
Rotate: 270
Orientation confidence: 3.7
Script: Latin
Script confidence: 8.15
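
The OSD result can be used to straighten a page before running OCR on it. The sketch below assumes that the `rotate` value is the clockwise angle needed to make the text upright, and it skips the correction when the orientation confidence is low. The function names and the `min_confidence` threshold of 2.0 are hypothetical choices for illustration, not values taken from Tesseract.

```python
def rotation_to_apply(osd, min_confidence=2.0):
    """Return the clockwise rotation (0, 90, 180 or 270) suggested by OSD.

    Returns 0 when the orientation confidence is below `min_confidence`
    (an assumed threshold) or when the angle is not a quarter turn.
    """
    angle = int(osd.get("rotate", 0)) % 360
    confidence = float(osd.get("orientation_conf", 0.0))
    if confidence < min_confidence or angle not in (90, 180, 270):
        return 0
    return angle


def correct_orientation(image, osd, min_confidence=2.0):
    """Rotate an OpenCV image by the angle suggested in the OSD dictionary."""
    import cv2  # imported lazily so rotation_to_apply works without OpenCV

    codes = {
        90: cv2.ROTATE_90_CLOCKWISE,
        180: cv2.ROTATE_180,
        270: cv2.ROTATE_90_COUNTERCLOCKWISE,
    }
    angle = rotation_to_apply(osd, min_confidence)
    return image if angle == 0 else cv2.rotate(image, codes[angle])


# values taken from the text-orient-2.png output above
sample = {"orientation": 90, "rotate": 270, "orientation_conf": 3.7}
print(rotation_to_apply(sample))  # 270
```

In the notebook, `correct_orientation` would take the image loaded with `cv2.imread` and the dictionary returned by `pytesseract.image_to_osd`.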


##PSM 1

In [6]:
def psm_options(img_path, options=None):
  image = cv2.imread(img_path)

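  # OpenCV loads images as BGR, while pytesseract assumes RGB channel ordering
  image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

  # OCR the image with the supplied Tesseract options (for example "--psm 4")
  results = pytesseract.image_to_string(image_rgb, config=options or "")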
  return results

In [None]:
results = psm_options("text-orient-1.png", options="--psm 1")
results

"In the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

In [None]:
results = psm_options("text-orient-2.png", options="--psm 1")
results

" \n\nIn the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

##PSM 3

In [None]:
results = psm_options("text-orient-1.png", options="--psm 3")
results

"In the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

In [None]:
results = psm_options("text-orient-2.png", options="--psm 3")
results

" \n\nIn the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

PSM 3 is Tesseract's default mode, so omitting the --psm option gives the same results.
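
When comparing the output of two runs, such as an explicit `--psm 3` run and a default run, it can help to normalize the returned strings first. The raw results end with a form feed character (`\x0c`) and may start with stray whitespace. The sketch below uses a hypothetical `normalize_ocr_text` helper, and the two sample strings are shortened excerpts of the outputs shown in this notebook.

```python
import re


def normalize_ocr_text(text):
    """Drop form feeds, trailing spaces and surplus blank lines from OCR output."""
    text = text.replace("\x0c", "")
    lines = [line.rstrip() for line in text.splitlines()]
    # collapse runs of blank lines into a single blank line
    cleaned = re.sub(r"\n{2,}", "\n\n", "\n".join(lines))
    return cleaned.strip()


# shortened excerpts of the text-orient-1.png and text-orient-2.png results
upright = "and building image search engines.\n\nFrom there, we'll implement\n\x0c"
rotated = " \n\nand building image search engines.\n\nFrom there, we'll implement\n\x0c"

print(normalize_ocr_text(upright) == normalize_ocr_text(rotated))  # True
```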

In [None]:
results = psm_options("text-orient-1.png")
results

"In the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

In [None]:
results = psm_options("text-orient-2.png")
results

" \n\nIn the first part of this tutorial, we'll discuss how\nautoencoders can be used used for image retrieval\nand building image search engines.\n\nFrom there, we'll implement a convolutional autoencoder\nthat we'll then train on our image dataset.\n\x0c"

##PSM 4

In [9]:
results = psm_options("whole_foods-2.png")
results

'WHOLE\nFOODS\n[Mm AR K E T)\n\nWHOLE FOODS MARKET - WESTPORT,CT 06880\n399 POST RD WEST ~ (203) 227-6858\n\n365\n365\n366\n365\n\nee\n\nwee TAX\n\nBACON LS\nBACON LS\nBACON LS\nBACON LS\nBROTH CHIC\n\nFLOUR ALMOND\n\nCHKN BRST BNLSS SK\n\nHEAVY CREAM\n\nBALSMC REDUCT\n\nBEEF GRND 85/1§\n\nJUICE COF CASHEW C\n\nDOCS PINT ORGANIC\n\nHNY ALMOND BUTTER\n-00 BAL\n\nNP 4.99\nNP 4.99\nNP 4.99\nNP 4,99\nNP 2.19\nNP wi .99\nNP 18.80\nNP 3.39\nNP 6.49\nNP ,B.04\nnp £8.99\nNP "14.49\nNP 9.99\n\n101.33\n\nTAAAAHNAAAAAATAT\n\n»\n\x0c'

In [10]:
results = psm_options("whole_foods-2.png", options="--psm 4")
results

'WHOLE\nFOODS\n[Mm AR K E T)\n\nWHOLE FOODS MARKET - WESTPORT,CT 06880\n399 POST RD WEST ~ (203) 227-6858\n\n365 BACON LS NP 4.99\n\n365 BACON LS NP 4.99\n\n366 BACON LS NP 4.99\n\n365 BACON LS NP 4.99\nBROTH CHIC NP 2.19\n\nFLOUR ALMOND NP wil .99\n\nCHKN BRST BNLSS SK NP 18.80\nHEAVY CREAM NP 3.39\n\nBALSMC REDUCT NP 6.49\n\nBEEF GRND 85/1§ NP 5.04\nJUICE COF CASHEW C NP 8.99\nDOCS PINT ORGANIC NP °14.49\nHNY ALMOND BUTTER NP 9.99\nween TAX -00 BAL - 101.33\n\nF\nF\n\x0c'

##PSM 5

In [11]:
results = psm_options("whole_foods-3.png")
results

'Se\n\nWHOLE\nFOODS\ncee eam\n\nWHOLE FOODS MARKET - WESTPORT, CT\n399 POST RD WEST - (203) 227-6858\n\n365 BACON LS NP\n366 BACON LS NP\n365 BACON LS NP\n365 BACON LS NP\n\nBROTH CHIC NP\n\n06880\n\n4.99\n4.99\n4.99\n4.99\n2.19\n\nFLOUR ALMOND NP oi1.99\n\nCHKN BRST BNLSS SK NP\nHEAVY CREAM NP\n\nBALSMC REDUCT NP\n\nBEEF GRND 85/1§ NP\n\nJUICE COF CASHEW C NP\n\n’ DOCS PINT ORGANIC NP\nHNY ALMOND BUTTER NP\n\nune TAX 00 BAL ~ 101.33\n\n18.80\n\n6.49\n5.04\n8.99\n14.49\n9.99\n\ni\n\x0c'

In [13]:
results = psm_options("whole_foods-3.png", options="--psm 5")
results

'Cex ae\n\nWHOLE FOODS MARKET - WESTPORT,CT 06880\n\n399 POST RD WEST - (203) 227-6858\n* 365 BACON LS NP 4.99 F\n* 365 BACON LS NP 4.99 F\n* 365 BACON LS NP 4.99 F*\n* 365 BACON LS NP 4.99 F\n* BROTH CHIC NP 2.19 F\n* FLOUR ALMOND NP i1.99 F\n. CHKN BRST BNLSS SK NP 18.80 F\n* HEAVY CREAM NP 3.39 F\n* BALSMC REDUCT NP 6.49 F\n* BEEF GRND 85/1§ NP 6.04 F\n. JUICE COF CASHEW C NP fs 99 F\na DOCS PINT ORGANIC NP °14.49 F\n* HNY ALMOND BUTTER NP 9.99 F\n\nune TAX 00 BAL ~ 101.33\n\x0c'

##PSM 6

In [15]:
results = psm_options("sherlock_holmes.png")
print(results)

CHAPTER ONE

wee
Mr. Sherlock Holmes

 

M r, Sherlock Holmes, who was usually very late in the

sions when he

 

ings, save upon those not infrequent oc
was up all night, was seated at the breakfast
the hearth-rug and picked up the stick which ou
n the night before. It was

~aded, of the sort which is known as a “Penang lawyer.

 

ble. I stood upon
visitor had left

    

  

behind fine, thick piece of wood,

 

 

bulbous-h

 

Just under the head was a broad silver band nearly an inch across.
“To James Mortimer, M.R.C.S., from his friends of the C.C.H.,”

 

 

      

    

 

    

was engraved upon it, with the date “1884.” It was just such
stick as the old-fashioned family practitioner used to c: dig-
nified, solid, and reassuring,

“Well, Watson, what do you ike of i?”

Holmes was sitting with his back to me, and I had given him no

 

 

sign of my occupation.
“How did you know what I was doing? I believe you have eyes in
the back of your head.”

“T have, at least, a well-p

In [17]:
results = psm_options("sherlock_holmes.png", options="--psm 6")
print(results)

CHAPTER ONE
wee
Mr. Sherlock Holmes
M r, Sherlock Holmes, who was usually very late in the morn-
ings, save upon those not infrequent occasions when he

was up all night, was seated at the breakfast table. I stood upon
the hearth-rug and picked up the stick which our visitor had left
behind him the night before. It was a fine, thick piece of wood,
bulbous-headed, of the sort which is known as a “Penang lawyer.”
Just under the head was a broad silver band nearly an inch across.
“To James Mortimer, M.R.C.S., from his friends of the C.C.H.,”
was engraved upon it, with the date “1884.” It was just such a
stick as the old-fashioned family practitioner used to carry—dig-
nified, solid, and reassuring,

“Well, Watson, what do you make of it?”

Holmes was sitting with his back to me, and I had given him no
sign of my occupation.

“How did you know what I was doing? I believe you have eyes in
the back of your head.”

“T have, at least, a well-polished, silver-plated coffee-pot in front
of me,” 

##PSM 7

In [18]:
results = psm_options("license_plate-1.png")
print(results)

MHO4DW8351



In [20]:
results = psm_options("license_plate-1.png", options="--psm 7")
print(results)

MHO4DW8351



##PSM 8

In [23]:
results = psm_options("designer.png")
print(results)




In [26]:
results = psm_options("designer.png", options="--psm 8")
print(results)

a



##PSM 11

In [27]:
results = psm_options("website_menu.png")
print(results)

How Do | Get Started?

Deep Learning

Face Applications

Optical Character Recognition (OCR)
Object Detection

Object Tracking

Instance Segmentation and Semantic

Segmentation

Embedded and loT Computer Vision
Computer Vision on the Raspberry Pi
Medical Computer Vision

Working with Video

Image Search Engines

Interviews, Case Studies, and Success Stories

My Books and Courses



In [30]:
results = psm_options("website_menu.png", options="--psm 11")
print(results)

How Do | Get Started?

Deep Learning

Face Applications

Optical Character Recognition (OCR)

bject Det n

Object Tracking

Instance Segmentation and Semantic

Segmentation

Embedded and loT Computer Vision

Computer Vision on the Raspberry Pi

Medical Computer Vision

Working with Video

Image Search Engines

Interviews, Case Studies, and Success Stories

My Books and Courses



##PSM 12

In [31]:
results = psm_options("website_menu.png", options="--psm 12")
print(results)

How Do | Get Started?

Deep Learning

Face Applications

Optical Character Recognition (OCR)
Object Detection

Object Tracking

Instance Segmentation and Semantic

Segmentation

Embedded and loT Computer Vision
Computer Vision on the Raspberry Pi
Medical Computer Vision

Working with Video

Image Search Engines

Interviews, Case Studies, and Success Stories

My Books and Courses

