crosswords_opencv

This repository is an experiment in reading a crossword grid into a machine-readable structure, using OpenCV and Tesseract's OCR capabilities.

Current process

Getting an input image

The input is an image of a crossword grid, similar to the one in the samples/ directory.

Manual pass -- Waifu2X

If the quality of the input image is too poor, it is usually a good idea to run it through Waifu2X to upscale it and reduce the noise.

Setting the grid size

The grid size is currently set manually in the code. For now, the algorithm has to be told what the grid looks like.

For the sample image, the grid is 14 by 18.
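As a rough illustration of why the grid size matters, here is a minimal sketch of how a known size could be used to locate each cell in the image. The names GRID_COLS, GRID_ROWS and cell_bounds are hypothetical and not taken from main.py. The sketch assumes the image is cropped tightly to the grid, that all cells have the same size, and that "14 by 18" means 14 columns by 18 rows (which matches the 18 lines of 14 characters in the displayed result).

```python
# Hypothetical names, not taken from main.py.
GRID_COLS = 14  # cells per row in the sample grid (assumed reading of "14 by 18")
GRID_ROWS = 18  # number of rows in the sample grid


def cell_bounds(image_width, image_height, row, col,
                rows=GRID_ROWS, cols=GRID_COLS):
    """Return (left, top, right, bottom) pixel bounds of one cell.

    Assumes the image is cropped tightly to the grid and that all
    cells have the same size.
    """
    left = col * image_width // cols
    right = (col + 1) * image_width // cols
    top = row * image_height // rows
    bottom = (row + 1) * image_height // rows
    return left, top, right, bottom


if __name__ == "__main__":
    # Bounds of the top-left cell in a 1400x1800 pixel image.
    print(cell_bounds(1400, 1800, row=0, col=0))
```

Integer division keeps the bounds on whole pixels, and computing each edge from the full image size avoids accumulating rounding errors across the grid.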

Calling the script

./main.py ./samples/waifu2xed.png

The algorithm builds an internal representation of the grid by recording, for each cell, whether it is a black cell or contains a letter.
If the cell contains a letter, the algorithm crops the letter to its bounding box and stores it in the grid.
From the reconstructed grid, the algorithm then creates a list of all the horizontal and vertical words.
From these words (sequences of cells), the algorithm creates temporary "word images".
Each word image is sent to Tesseract for OCR.
The returned text is split into individual characters, which are stored in the grid.
Because horizontal and vertical words share cells, the OCR results can be cross-checked.
The grid is then displayed:
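The word-listing step described above can be sketched as follows. This is only an illustration, not the code from main.py: the function name find_words, the use of "-" as the black-cell marker (borrowed from the printed output) and the minimum word length of 2 are all assumptions.

```python
BLACK = "-"  # black-cell marker, as in the printed grid (assumption)


def find_words(grid, min_length=2):
    """Return every word as a list of (row, col) cell coordinates.

    grid is a list of equal-length rows; a word is a maximal run of
    non-black cells along a row or a column.
    """
    rows, cols = len(grid), len(grid[0])
    # Horizontal lines first, then vertical ones.
    lines = [[(r, c) for c in range(cols)] for r in range(rows)]
    lines += [[(r, c) for r in range(rows)] for c in range(cols)]

    words = []
    for line in lines:
        run = []
        for r, c in line:
            if grid[r][c] == BLACK:
                # A black cell ends the current run.
                if len(run) >= min_length:
                    words.append(run)
                run = []
            else:
                run.append((r, c))
        # Flush the run that reaches the edge of the grid.
        if len(run) >= min_length:
            words.append(run)
    return words


if __name__ == "__main__":
    sample = ["-A-", "ASS", "-SE"]
    for word in find_words(sample):
        print("".join(sample[r][c] for r, c in word))
```

Returning coordinates rather than characters keeps the link between a word and its cells, which is what makes it possible to write OCR results back into the grid later.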

-A-S-U-S-A-D-P
ASSESSEURS-EVE
-SECOURS-SAPER
PICOLER-PINARD
-DON-LAPIDER-R
LUNDI-TIQUETEE
-IDENTIQUE-EN-
ETE-TOQUE-EMOI
-ESTOQUE-AVENT
OS-ELUE-AMINCI
--OTEE-BRITTEN
GARER-PRETE-SE
-CG-ABRUTIES-R
CHAMBOULEE-POA
-EN-LUNES-PIP?
FRIPERIE-EIDER
-OSA-DESESPERE
ENERVER-STERES

There are still some bugs: the ? characters come from uncertainties in the OCR.
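One plausible way to cross-check the OCR and end up with ? markers is sketched below. This is an assumption about how such a check could work, not a description of main.py: the function cross_check, its input format and the rule "mark a cell as ? when its readings disagree" are all hypothetical.

```python
from collections import defaultdict

UNCERTAIN = "?"  # marker for cells whose readings disagree (assumption)


def cross_check(word_readings):
    """Merge per-word OCR text into one character per cell.

    word_readings is an iterable of (cells, text) pairs, where cells is
    a list of (row, col) coordinates and text is the OCR output for
    that word.
    """
    votes = defaultdict(set)
    for cells, text in word_readings:
        if len(text) != len(cells):
            # The OCR dropped or added a character, so this reading
            # cannot be aligned with the cells; skip it.
            continue
        for cell, char in zip(cells, text):
            votes[cell].add(char)
    # Keep a character only when every reading of the cell agrees.
    return {
        cell: next(iter(chars)) if len(chars) == 1 else UNCERTAIN
        for cell, chars in votes.items()
    }


if __name__ == "__main__":
    readings = [
        ([(0, 0), (0, 1)], "AB"),  # horizontal word
        ([(0, 0), (1, 0)], "AC"),  # vertical word sharing cell (0, 0)
        ([(0, 1), (1, 1)], "XD"),  # vertical word disagreeing on cell (0, 1)
    ]
    print(cross_check(readings))
```

A cell that appears in only one aligned reading is accepted as-is here; a stricter variant could also mark such cells as uncertain.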
