A simple implementation in C++ and Python.
-
The files of interest are detect.cpp, classify.py, and test.py. The flow is a little unusual: control jumps back and forth between detect.cpp and test.py.
-
The entire pipeline runs in detect.cpp, except for the classification step, which lives in classify.py and test.py.
-
I have already trained an SVC classifier on training images from the ICDAR dataset; it is stored in svc.pkl. It classifies HOG (Histogram of Oriented Gradients) descriptors of candidate regions to decide whether they contain text.
-
The OpenCV HOG descriptor is used with the parameters below:
HOGDescriptor hog( Size(dim, dim), Size(4, 4), Size(2, 2), Size(2, 2), 2 );
Alternatively, you can write your own classifier and change test.py so that it properly classifies the text regions passed to it.
- OpenCV 2.4
- scikit-learn
- CMake
- Detects potential text regions using Maximally Stable Extremal Regions (MSER)
- Filters the regions using the classifier
- Removes redundant boxes (one inside the other)
- Combines nearby boxes
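The last two post-processing steps can be sketched in plain Python. This is a minimal sketch, not the repo's actual C++ implementation; boxes are `(x, y, w, h)` tuples and the helper names and `gap` parameter are hypothetical:

```python
def contains(outer, inner):
    """True if box `inner` lies entirely inside box `outer` (boxes are (x, y, w, h))."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

def remove_redundant(boxes):
    """Drop any box that is fully contained inside another box."""
    return [b for b in boxes
            if not any(a != b and contains(a, b) for a in boxes)]

def merge_nearby(boxes, gap=5):
    """Greedily merge boxes whose rectangles, expanded by `gap` pixels, overlap."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                ax, ay, aw, ah = boxes[i]
                bx, by, bw, bh = boxes[j]
                # Expand each box by `gap` and test for overlap.
                if (ax - gap < bx + bw and bx - gap < ax + aw and
                        ay - gap < by + bh and by - gap < ay + ah):
                    # Replace the pair with their bounding rectangle.
                    x, y = min(ax, bx), min(ay, by)
                    w = max(ax + aw, bx + bw) - x
                    h = max(ay + ah, by + bh) - y
                    boxes[j] = (x, y, w, h)
                    del boxes[i]
                    merged = True
                    break
            if merged:
                break
    return boxes
```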
You can send the detected text boxes to any text-recognition library, such as Tesseract, to extract the text inside them.
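Handing the boxes to an OCR engine amounts to cropping each detected rectangle out of the image. A minimal NumPy sketch (the `crop_boxes` helper is hypothetical; the OCR call itself, e.g. `pytesseract.image_to_string`, is left out):

```python
import numpy as np

def crop_boxes(image, boxes):
    """Crop each detected (x, y, w, h) box out of `image` (an H x W numpy array).
    Each crop can then be passed to an OCR engine such as Tesseract."""
    crops = []
    h_img, w_img = image.shape[:2]
    for x, y, w, h in boxes:
        # Clamp the box to the image bounds before slicing.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(w_img, x + w), min(h_img, y + h)
        crops.append(image[y0:y1, x0:x1])
    return crops
```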
-
You need your own text dataset. Go to the ICDAR website and download their latest training dataset. It comes with a text file that lists, for each image, the corner points of the rectangles where text is present. A handy function, getPoints, is included in detect.cpp; it accepts a line of that text file and returns a Point variable.
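For reference, a Python analogue of that parsing step might look like the sketch below. It assumes a comma-separated `x1,y1,x2,y2,...` line, which is how some ICDAR releases store the rectangles; check your download's format and the actual getPoints in detect.cpp, since field layouts vary between ICDAR editions:

```python
def get_points(line):
    """Parse one ground-truth line of the form "x1,y1,x2,y2,..." into a list of
    (x, y) corner tuples. Hypothetical Python analogue of getPoints in detect.cpp."""
    values = [int(v) for v in line.strip().split(",") if v.strip().lstrip("-").isdigit()]
    # Pair consecutive values into (x, y) points.
    return list(zip(values[0::2], values[1::2]))
```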
-
Now that you have the text regions, train the classifier on them using any feature descriptor. I used HOG.
-
These are the positive regions. For negative regions, take the same dataset and pick random rectangles from the images as negative samples. Make sure your data is not skewed: keep the positive and negative counts roughly balanced.
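Sampling random negative rectangles while avoiding the labelled text regions can be sketched like this. The helper names, fixed sample size, and seeded RNG are all assumptions for the example, not part of the repo:

```python
import random

def overlaps(a, b):
    """True if (x, y, w, h) boxes a and b intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def sample_negatives(img_w, img_h, positives, n, size=32, rng=None):
    """Pick n random size x size rectangles that avoid every positive text box,
    for use as negative training samples."""
    rng = rng or random.Random(0)  # seeded for reproducibility in this sketch
    negatives = []
    while len(negatives) < n:
        x = rng.randint(0, img_w - size)
        y = rng.randint(0, img_h - size)
        box = (x, y, size, size)
        # Keep only rectangles that do not touch any labelled text region.
        if not any(overlaps(box, p) for p in positives):
            negatives.append(box)
    return negatives
```

Note this rejection-sampling loop assumes the positives do not cover most of the image; otherwise it may take a long time to find enough clear rectangles.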
-
Now that you have features (positive and negative), train any classifier on them and save it as a pickle.
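The train-and-pickle step might look like the sketch below. The synthetic feature vectors are stand-ins for your real HOG features (load those from positive.txt and negative.txt instead); the file name svc.pkl matches what this repo expects:

```python
import pickle

import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for real HOG feature vectors: positives cluster around
# 1.0, negatives around 0.0. Replace with your positive.txt / negative.txt data.
rng = np.random.RandomState(0)
pos = rng.normal(1.0, 0.1, size=(100, 36))
neg = rng.normal(0.0, 0.1, size=(100, 36))
X = np.vstack([pos, neg])
y = np.array([1] * 100 + [0] * 100)

clf = SVC(kernel="linear")
clf.fit(X, y)

# Save the trained classifier so classify.py / test.py can load it.
with open("svc.pkl", "wb") as f:
    pickle.dump(clf, f)

# Loading it back works the same way:
with open("svc.pkl", "rb") as f:
    loaded = pickle.load(f)
```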
-
To extract positive and negative features, two functions are included: readFilesPostive() and readFilesNegative(). They accept the location of the images, compute HOG features on them, and save all the features in two files, positive.txt and negative.txt. You will need to adapt them to your data.