
Open-DataScience/Shotor

 
 


Shotor

Word Level OCR Dataset for Persian Language

Shotor ("camel" in Persian) is a free synthetic dataset for word-level OCR.
The current version contains 120,000 images and their corresponding words.
Note: To train a robust model, apply augmentations such as scaling, translation, and additive noise to the images.
For an example of using the Shotor dataset, see this notebook:
A simple word level OCR for Persian Language using PyTorch and OpenCV
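
The augmentations mentioned in the note (scaling, translation, additive noise) can be sketched roughly as follows. This is only an illustration, not the code used to build or train on Shotor: the function name `augment_word_image`, its parameters, and their default values are hypothetical, and the image is assumed to be a 2-D grayscale `uint8` array with a light background. The sample image is a synthetic placeholder, not a file from the dataset.

```python
import numpy as np


def augment_word_image(image, rng, max_scale_delta=0.1, max_shift=3,
                       noise_std=8.0, background=255):
    """Randomly scale, translate, and add Gaussian noise to a grayscale image.

    Hypothetical helper: assumes `image` is a 2-D uint8 array.
    """
    height, width = image.shape

    # Draw a random scale factor and a random integer shift in x and y.
    scale = rng.uniform(1.0 - max_scale_delta, 1.0 + max_scale_delta)
    shift_x, shift_y = rng.integers(-max_shift, max_shift + 1, size=2)

    # Inverse mapping with nearest-neighbour sampling: for every output pixel,
    # look up the source pixel, scaling about the image centre.
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    src_y = np.rint((ys - cy - shift_y) / scale + cy).astype(int)
    src_x = np.rint((xs - cx - shift_x) / scale + cx).astype(int)

    # Pixels that map outside the source image are filled with the background.
    inside = (src_y >= 0) & (src_y < height) & (src_x >= 0) & (src_x < width)
    warped = np.full((height, width), background, dtype=np.float32)
    warped[inside] = image[src_y[inside], src_x[inside]]

    # Additive Gaussian noise, clipped back to the valid uint8 range.
    noisy = warped + rng.normal(0.0, noise_std, size=warped.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


# Placeholder image: a dark rectangle standing in for a rendered word.
rng = np.random.default_rng(0)
sample = np.full((32, 96), 255, dtype=np.uint8)
sample[10:22, 20:76] = 0
augmented = augment_word_image(sample, rng)
```

Calling the function with a fresh random draw each time an image is loaded yields a different variant per epoch. If OpenCV is available, `cv2.warpAffine` could replace the manual index arithmetic for the scaling and translation step.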

I used these resources to create the word lists:

The images have been generated using multiple fonts:

Created by: Amirabbas Asadi (amir137825@gmail.com)
