Easy Yolo OCR

Run text detection only in the selected area

This repository combines yolov8 & yolov5 with EasyOCR.

Introduction

A conventional OCR (optical character recognition) pipeline first detects text regions with a Text Detection model and then reads the text with a Text Recognition model. This approach works well when you want to recognize all of the text in a document or image.

However, if you only need the characters in a specific area of an image or document, such a pipeline also detects regions you do not need. Detection takes longer as a result, and the output is inconvenient to post-process.

To address this issue and cater to those who want to detect only specific patterns or regions of text in various images, we propose Easy Yolo OCR.

Easy Yolo OCR replaces the Text Detection model used for text region detection with a general-purpose Object Detection model. You train your own custom detection model so that only the regions you want are detected, in the format you want.

The Object Detection model uses yolov8 & yolov5, which are widely used for real-time object detection. The OCR process is modeled on EasyOCR, and the Text Recognition model is trained using the deep-text-recognition-benchmark by Clova AI Research.
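The flow described above (detect labeled regions, crop them, recognize each crop) can be sketched as follows. This is only an illustration of the idea, not the repository's code: `run_region_ocr`, `crop`, and the `detect`/`recognize` callables are hypothetical names, and the image is a plain nested list so the sketch runs without any dependencies.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]            # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]    # (x1, y1, x2, y2) in pixels, x2/y2 exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of `image`."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def run_region_ocr(
    image: Image,
    detect: Callable[[Image], List[Tuple[str, Box]]],
    recognize: Callable[[Image], str],
) -> List[Tuple[str, str]]:
    """Recognize text only inside the regions returned by the detector."""
    results = []
    for label, box in detect(image):
        results.append((label, recognize(crop(image, box))))
    return results


if __name__ == "__main__":
    # Stand-ins for a trained object detector and a text recognizer.
    def fake_detect(image):
        return [("price", (1, 0, 3, 2))]

    def fake_recognize(patch):
        return f"{len(patch)}x{len(patch[0])}"

    demo = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
    print(run_region_ocr(demo, fake_detect, fake_recognize))  # [('price', '2x2')]
```

Because the detector returns a class label with each box, every recognized string is already tagged with the field it belongs to, which is what makes the result easier to post-process than full-page OCR output.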

Existing OCR process

Easy Yolo OCR process

Installation

$ git clone https://github.com/aqntks/Easy-Yolo-OCR
$ cd Easy-Yolo-OCR
$ pip install -r requirements.txt

OCR

$ python main.py --gpu 0 --lang en/ko
$ python main.py --gpu 0 --lang en
$ python main.py --gpu -1 --lang ko         # --gpu -1 : cpu mode

Prepare Training Data

$ cd yolov5

1. Data location

Place the image files (jpg, png, etc.) and their label files (txt) in the "yolov5/dataset/custom_data" folder.

yolov5/dataset
└── custom_data
    ├── image1.jpg
    ├── image1.txt
    ├── image2.jpg
    ├── image2.txt
    ├── image3.jpg
    └── image3.txt
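Since every image is expected to sit next to a label file with the same base name, a quick check before training can catch missing labels. The helper below is a minimal sketch: `find_unlabeled_images` and the list of image extensions are assumptions for illustration, not part of the repository.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption; adjust as needed).
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(data_dir):
    """Return the names of images that have no matching .txt label file."""
    data_dir = Path(data_dir)
    missing = []
    for path in sorted(data_dir.iterdir()):
        is_image = path.suffix.lower() in IMAGE_EXTS
        if is_image and not path.with_suffix(".txt").exists():
            missing.append(path.name)
    return missing
```

For example, calling `find_unlabeled_images("dataset/custom_data")` from inside the yolov5 folder returns an empty list when every image has its label file.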

2. Configure the label file (image.txt)

(ClassIndex)  (BoxCenterX[value 0-1])  (BoxCenterY[value 0-1])  (BoxWidth[value 0-1])  (BoxHeight[value 0-1])
(ClassIndex)  (BoxCenterX[value 0-1])  (BoxCenterY[value 0-1])  (BoxWidth[value 0-1])  (BoxHeight[value 0-1])
(ClassIndex)  (BoxCenterX[value 0-1])  (BoxCenterY[value 0-1])  (BoxWidth[value 0-1])  (BoxHeight[value 0-1])

# ex) image1.txt 

0 0.6659722222222223 0.11302083333333333 0.4013888888888889 0.06770833333333333
9 0.48333333333333334 0.12552083333333333 0.025 0.036458333333333336
3 0.5145833333333334 0.1265625 0.02638888888888889 0.036458333333333336
1 0.5479166666666667 0.125 0.0375 0.0375
4 0.5798611111111112 0.125 0.029166666666666667 0.03333333333333333
8 0.6145833333333334 0.12447916666666667 0.03194444444444445 0.03854166666666667
4 0.6479166666666667 0.12395833333333334 0.03194444444444445 0.041666666666666664
2 0.68125 0.12447916666666667 0.03194444444444445 0.03229166666666666
3 0.7145833333333333 0.12395833333333334 0.03194444444444445 0.03125
7 0.7465277777777778 0.12552083333333333 0.029166666666666667 0.034375
0 0.78125 0.12239583333333333 0.03194444444444445 0.03229166666666666
8 0.8104166666666667 0.125 0.029166666666666667 0.0375
0 0.8423611111111111 0.12343749999999999 0.034722222222222224 0.036458333333333336
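Each label line stores the box center and size as fractions of the image width and height. The sketch below converts between that format and pixel corner coordinates; the function names `yolo_to_pixel_box` and `pixel_box_to_yolo` are hypothetical, and the 100x200 image size in the demo is made up for illustration.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert 'class cx cy w h' (normalized) to (class, (x1, y1, x2, y2)) in pixels."""
    parts = line.split()
    cls = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return cls, (round(x1), round(y1), round(x2), round(y2))


def pixel_box_to_yolo(cls, box, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) to a normalized label line."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    print(yolo_to_pixel_box("0 0.5 0.5 0.2 0.4", 100, 200))   # (0, (40, 60, 60, 140))
    print(pixel_box_to_yolo(0, (40, 60, 60, 140), 100, 200))  # 0 0.500000 0.500000 0.200000 0.400000
```

Normalizing by the image size keeps the labels valid when the image is resized for training, which is why the file stores fractions rather than pixels.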

3. Create train, valid, test files

yolov5/dataset/custom_train.txt
yolov5/dataset/custom_valid.txt
yolov5/dataset/custom_train_test.txt (optional)

# ex) custom_train.txt

dataset/custom_data/image001.jpg
dataset/custom_data/image002.jpg
dataset/custom_data/image003.jpg
dataset/custom_data/image004.jpg
              .
              .
              .

# ex) custom_valid.txt

dataset/custom_data/image101.jpg
dataset/custom_data/image102.jpg
dataset/custom_data/image103.jpg
dataset/custom_data/image104.jpg
              .
              .
              .

# ex) custom_train_test.txt (optional)

dataset/custom_data/image151.jpg
dataset/custom_data/image152.jpg
dataset/custom_data/image153.jpg
dataset/custom_data/image154.jpg
              .
              .
              .
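The list files can be written by hand, but for larger datasets a small script is convenient. The sketch below shuffles the images in the data folder and writes custom_train.txt and custom_valid.txt in the path style shown above. The function name `write_split_files`, the 80/20 split, and the fixed seed are assumptions, not something the repository prescribes.

```python
import random
from pathlib import Path

# Extensions treated as images in this sketch (an assumption).
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def write_split_files(dataset_dir, data_subdir="custom_data", valid_ratio=0.2, seed=0):
    """Write custom_train.txt / custom_valid.txt listing images, one path per line."""
    dataset_dir = Path(dataset_dir)
    images = sorted(
        p for p in (dataset_dir / data_subdir).iterdir()
        if p.suffix.lower() in IMAGE_EXTS
    )
    random.Random(seed).shuffle(images)  # reproducible shuffle

    n_valid = max(1, round(len(images) * valid_ratio)) if images else 0
    splits = {
        "custom_valid.txt": images[:n_valid],
        "custom_train.txt": images[n_valid:],
    }
    for name, paths in splits.items():
        # Paths are written relative to the parent of the dataset folder,
        # e.g. dataset/custom_data/image001.jpg
        lines = [f"{dataset_dir.name}/{data_subdir}/{p.name}" for p in sorted(paths)]
        (dataset_dir / name).write_text("\n".join(lines) + "\n")
    return {name: len(paths) for name, paths in splits.items()}
```

Run from inside the yolov5 folder, `write_split_files("dataset")` returns the number of images placed in each list, so the split can be verified at a glance.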

4. Write custom.yaml

Create the data/custom.yaml file with the following content:

# custom.yaml

path: ./dataset
train: custom_train.txt
val:  custom_valid.txt
test:  custom_train_test.txt  # (optional)

nc: 10  # number of classes
names: ['title', 'name', 'personal_id', 'text_box_1', 'text_box_2', 'price', 'address', 'age', 'date', 'count']  # class names
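The class index at the start of each label line refers to a position in `names`, so `nc` must equal the number of names and every index must be below `nc`. The check below is a minimal sketch of that relationship; `decode_label_classes` is a hypothetical helper, and the label lines in the demo are taken from the image1.txt example.

```python
def decode_label_classes(names, nc, label_lines):
    """Map each label line's class index to its class name, validating the range."""
    if len(names) != nc:
        raise ValueError(f"nc is {nc} but {len(names)} class names were given")
    decoded = []
    for line in label_lines:
        if not line.strip():
            continue  # skip blank lines
        idx = int(line.split()[0])
        if not 0 <= idx < nc:
            raise ValueError(f"class index {idx} is outside 0..{nc - 1}")
        decoded.append(names[idx])
    return decoded


if __name__ == "__main__":
    names = ['title', 'name', 'personal_id', 'text_box_1', 'text_box_2',
             'price', 'address', 'age', 'date', 'count']
    lines = [
        "0 0.6659722222222223 0.11302083333333333 0.4013888888888889 0.06770833333333333",
        "9 0.48333333333333334 0.12552083333333333 0.025 0.036458333333333336",
        "3 0.5145833333333334 0.1265625 0.02638888888888889 0.036458333333333336",
    ]
    print(decode_label_classes(names, 10, lines))  # ['title', 'count', 'text_box_1']
```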

Train Detection Model

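The rows under the command list alternative example values for --weights, --img, --batch-size and --epochs.

$ python train.py --data data/custom.yaml --weights yolov5s.pt --img 640 --batch-size 64 --epochs 300
                                                    yolov5m.pt       960              40          100
                                                    yolov5l.pt       480              24           50
                                                    yolov5x.pt       320              16           30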

Setting Config

# config.yaml

images: image                                # folder of images to run detection on

detection: weights/example.pt                # trained detection model weights
detection-size: 640                          # detection input image size
detection-confidence: 0.25                   # detection confidence threshold
detection-iou: 0.45                          # detection IoU threshold
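To make the two thresholds concrete, the sketch below shows how a confidence threshold and an IoU threshold are typically applied to raw detections through non-maximum suppression. It is a generic, class-agnostic illustration in plain Python, not the code path this project or yolov5 actually runs; `iou` and `filter_detections` are hypothetical names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, conf_thres=0.25, iou_thres=0.45):
    """Drop low-confidence boxes, then suppress boxes that overlap a stronger one.

    `detections` is a list of (box, score) pairs.
    """
    candidates = sorted(
        (d for d in detections if d[1] > conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(box, kept_box) <= iou_thres for kept_box, _ in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    dets = [
        ((0, 0, 10, 10), 0.9),
        ((1, 1, 11, 11), 0.8),    # overlaps the first box (IoU about 0.68), suppressed
        ((20, 20, 30, 30), 0.6),
        ((0, 0, 5, 5), 0.1),      # below the confidence threshold, dropped
    ]
    print(filter_detections(dets))  # [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.6)]
```

Raising the confidence threshold trades missed regions for fewer false detections, while lowering the IoU threshold merges near-duplicate boxes more aggressively.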
