A synthetic data generator for text recognition
What is it for?
Generating text image samples to train OCR software. Now supports non-Latin text! For a more thorough tutorial, see the official documentation.
What do I need to make it work?
I use Archlinux so I cannot tell if it works on Windows yet.
- Python 3.X
- OpenCV 4 (works with 3.2, probably works with 2.4)
- Pillow
- NumPy
- Requests
- BeautifulSoup
- tqdm
Alternatively, install all the dependencies at once with
pip install -r requirements.txt
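To confirm the dependencies are visible to your interpreter before running the tool, a quick check along these lines may help. It is only a sketch and not part of the project: `missing_packages` is a hypothetical helper, and the module names are the usual import names for the packages listed above (OpenCV imports as `cv2`, Pillow as `PIL`, BeautifulSoup as `bs4`).

```python
import importlib.util

# Usual import names for the listed packages. Several differ from the
# package name: OpenCV -> cv2, Pillow -> PIL, BeautifulSoup -> bs4.
REQUIRED_MODULES = {
    "OpenCV": "cv2",
    "Pillow": "PIL",
    "NumPy": "numpy",
    "Requests": "requests",
    "BeautifulSoup": "bs4",
    "tqdm": "tqdm",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the modules can be located; it does not verify versions.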
- Change the text orientation using the
- Change the space width using the
- Specify text color range using
-tc '#000000,#FFFFFF', please note that the quotes are necessary
- Explicit alignment when using
-al with fixed width (0: Left, 1: Center, 2: Right)
- Fixed width using
- Add support for Simplified and Traditional Chinese
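The quotes around the `-tc` value matter because most shells treat a word starting with `#` as the beginning of a comment. As for how a range such as '#000000,#FFFFFF' could be turned into a concrete color, here is a minimal sketch. It assumes each RGB channel is drawn independently between the two bounds, which may differ from what the tool actually does; `parse_hex_color` and `random_color_in_range` are hypothetical names.

```python
import random


def parse_hex_color(value):
    """Convert '#RRGGBB' into an (r, g, b) tuple of ints in 0..255."""
    value = value.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def random_color_in_range(color_range, rng=random):
    """Pick a random color between the two bounds, channel by channel.

    Assumption: channels are sampled independently and uniformly.
    """
    low_text, high_text = color_range.split(",")
    low, high = parse_hex_color(low_text), parse_hex_color(high_text)
    channels = [rng.randint(min(a, b), max(a, b)) for a, b in zip(low, high)]
    return "#{:02X}{:02X}{:02X}".format(*channels)


if __name__ == "__main__":
    print(random_color_in_range("#000000,#FFFFFF"))
```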
How does it work?
python run.py -w 5 -f 64
You get 1000 randomly generated images with random text on them like:
What if you want random skewing? Add -k and -rk:
python run.py -w 5 -f 64 -k 5 -rk
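One plausible reading of these two flags, sketched below, is that -k sets an angle in degrees and -rk randomizes it within plus or minus that value. This is an assumption rather than the tool's documented behavior (check python run.py -h), and `pick_skew_angle` is a hypothetical helper.

```python
import random


def pick_skew_angle(max_angle, randomize, rng=random):
    """Return the rotation angle in degrees to apply to one sample."""
    if randomize:
        # Assumed behavior: draw uniformly from [-max_angle, +max_angle].
        return rng.uniform(-max_angle, max_angle)
    # Without randomization, every sample gets the same fixed angle.
    return float(max_angle)


if __name__ == "__main__":
    print([round(pick_skew_angle(5, True), 2) for _ in range(3)])
```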
But scanned documents usually aren't that clear, are they? Add
-rbl to get Gaussian blur on the generated image with a user-defined radius (here 0, 1, 2, 4):
Maybe you want another background? Add
-b to define one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2) or picture (3).
When using the picture background (3), a picture from the pictures/ folder is randomly selected and the text is written on it.
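The random selection for the picture background can be sketched as follows. This is an illustration rather than the project's code: `pick_background_picture` is a hypothetical helper, and the list of accepted extensions is an assumption.

```python
import os
import random

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed set of accepted formats


def pick_background_picture(folder="pictures", rng=random):
    """Return the path of a randomly chosen image file in `folder`."""
    candidates = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    if not candidates:
        raise FileNotFoundError(f"no pictures found in {folder!r}")
    return os.path.join(folder, rng.choice(candidates))
```

Raising an error on an empty folder makes a misconfigured pictures/ directory fail early instead of producing samples with no background.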
Or maybe you are working on an OCR for handwritten text? Add
It uses a TensorFlow model trained using this excellent project by Grzego.
The project does not require TensorFlow to run if you aren't using this feature.
You can also add distortion to the generated text with
The text is chosen at random in a dictionary file (that can be found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
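The text selection and file naming described above can be sketched like this. The function names and the `word_count` parameter are hypothetical, and the sketch assumes a dictionary file with one word per line; only the [text]_[index].jpg naming pattern comes from the description.

```python
import random


def load_words(dict_path):
    """Read a dictionary file, assuming one word per line."""
    with open(dict_path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def make_sample_text(words, word_count, rng=random):
    """Join `word_count` randomly chosen words with single spaces."""
    return " ".join(rng.choice(words) for _ in range(word_count))


def sample_filename(text, index, extension="jpg"):
    """Build the output name following the [text]_[index].jpg pattern."""
    return f"{text}_{index}.{extension}"


if __name__ == "__main__":
    words = ["alpha", "beta", "gamma"]  # stand-in for a real dictionary file
    text = make_sample_text(words, 2)
    print(sample_filename(text, 0))
```

Because the label is embedded in the filename, a training script can recover it by splitting the name on the last underscore.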
There are a lot of parameters that you can tune to get the results you want, so I recommend checking out
python run.py -h for more information.
How to create images with Chinese (both simplified and traditional) text
It is simple! Just do
python run.py -l cn -c 1000 -w 5!
Unfortunately I do not speak Chinese so you may have to edit
texts/cn.txt to include some meaningful words instead of random glyphs.
Here are examples of what I could make with it:
Can I add my own font?
Yes, the script picks a font at random from the fonts directory.
| fonts/latin | English, French, Spanish, German |
Simply add / remove fonts until you get the desired output.
If you want to add a new non-Latin language, the amount of work is minimal.
- Create a new folder with your language's two-letter code
- Add a .ttf font in it
- Edit run.py to add an if statement in
- Add a text file in dicts with the same two-letter code
- Run the tool as you normally would but add -l with your two-letter code
It only supports .ttf for now.
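Before running the tool with a new language, the steps above can be sanity-checked with a small script. This is a sketch under stated assumptions: it supposes the font folder lives at fonts/<code>/ and the text file at dicts/<code>.txt, and `check_language_setup` is a hypothetical helper that is not part of the project.

```python
import os


def check_language_setup(code, fonts_root="fonts", dicts_root="dicts"):
    """Return a list of problems found for the given language code.

    Assumed layout: fonts/<code>/*.ttf and dicts/<code>.txt.
    """
    problems = []
    font_dir = os.path.join(fonts_root, code)
    if not os.path.isdir(font_dir):
        problems.append(f"missing font folder: {font_dir}")
    elif not any(n.lower().endswith(".ttf") for n in os.listdir(font_dir)):
        problems.append(f"no .ttf font in: {font_dir}")
    dict_file = os.path.join(dicts_root, code + ".txt")
    if not os.path.isfile(dict_file):
        problems.append(f"missing dictionary file: {dict_file}")
    return problems


if __name__ == "__main__":
    for problem in check_language_setup("cn"):
        print(problem)
```

An empty list means the files are in place; the if statement in run.py still has to be added by hand.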
- Intel Core i7-4710HQ @ 2.50GHz + SSD (-c 1000 -w 1)
  - -t 1: 363 img/s
  - -t 2: 694 img/s
  - -t 4: 1300 img/s
  - -t 8: 1500 img/s
- AMD Ryzen 7 1700 @ 4.0GHz + SSD (-c 1000 -w 1)
  - -t 1: 558 img/s
  - -t 2: 1045 img/s
  - -t 4: 2107 img/s
  - -t 8: 3297 img/s
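To see how well throughput scales with the thread count, the figures above can be turned into speedup and parallel efficiency relative to the single-thread run. The snippet below is just arithmetic on the published numbers; `scaling` is a hypothetical helper.

```python
# Throughput in images per second, keyed by thread count (-t).
BENCHMARKS = {
    "Intel Core i7-4710HQ": {1: 363, 2: 694, 4: 1300, 8: 1500},
    "AMD Ryzen 7 1700": {1: 558, 2: 1045, 4: 2107, 8: 3297},
}


def scaling(results):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    base = results[1]
    return {
        threads: (rate / base, rate / base / threads)
        for threads, rate in results.items()
    }


if __name__ == "__main__":
    for cpu, results in BENCHMARKS.items():
        print(cpu)
        for threads, (speedup, efficiency) in scaling(results).items():
            print(f"  -t {threads}: {speedup:.2f}x speedup, "
                  f"{efficiency:.0%} efficiency")
```

On these figures, eight threads give roughly a 4.1x speedup on the Intel machine and about 5.9x on the Ryzen.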
- Create an issue describing the feature you'll be working on
- Code said feature
- Create a pull request
Feature request & issues
If anything is missing, unclear, or simply not working, open an issue on the repository.
What is left to do?
- Better background generation
- Better handwritten text generation
- More customization parameters (mostly regarding background)