a CNN using caffe to identify sinhala characters

sinhala-ocr

this is nowhere near a complete OCR system. the project contains clean, generated test images of sinhala characters and a test CNN (convolutional neural network), based on lenet, that is trained on this dataset to identify images of sinhala characters.

the network is implemented and trained with caffe. do anything you want with the provided dataset.

directory structure

.
├── commands.sh                 # contains useful caffe commands
├── imgs                        # dataset
│   ├── adjusted                # training images with label files (TRAIN & TEST phase)
│   ├── labels_unicode.txt
│   └── unicode                 # a new dataset to evaluate the trained model
├── lenet.prototxt              # TEST model
├── lenet_solver.prototxt       # caffe solver
├── lenet_train_test.prototxt   # TRAIN model
├── map.txt                     # mapping of labels / unicode char / other sinhala fonts
├── model                       # output models / solverstate directory
├── node_modules
├── README.md
├── test.py                     # python script to evaluate the trained model
└── textgen.js                  # script to generate sinhala chars

generating database files

training

once you have defined the network in lenet_train_test.prototxt and configured the solver in lenet_solver.prototxt, start training with the commands listed in the commands.sh file.
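
the contents of commands.sh are not reproduced here, so the following is only a hedged sketch. the standard caffe command-line tool starts training with `caffe train --solver=...` and can resume from a `.solverstate` snapshot. the python helper below builds that command. the function names are hypothetical, and it assumes the `caffe` binary is on your PATH.

```python
import subprocess


def build_train_command(solver="lenet_solver.prototxt", snapshot=None, gpu=None):
    """Build the argument list for the caffe command-line trainer."""
    cmd = ["caffe", "train", "--solver=" + solver]
    if snapshot is not None:
        # resume training from a saved .solverstate file
        cmd.append("--snapshot=" + snapshot)
    if gpu is not None:
        # run on the given GPU id
        cmd.append("--gpu=" + str(gpu))
    return cmd


def train(**kwargs):
    """Run the training command; raises CalledProcessError if caffe fails."""
    subprocess.run(build_train_command(**kwargs), check=True)
```

calling `train()` with no arguments uses lenet_solver.prototxt from the current directory. pass `snapshot=` with the path to one of your own solverstate files in the model directory to continue an interrupted run.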

evaluating

use test.py to evaluate the trained model on your own new images.
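
test.py itself is not shown here, so this is not its actual code. as a minimal sketch of what evaluating a single image with pycaffe can look like, assuming the input blob in lenet.prototxt is named "data", the output blob is named "prob", and the images are single-channel:

```python
def top_k(probs, k=3):
    """Return the indices of the k largest probabilities, best first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]


def classify(image_path, weights_path, model_path="lenet.prototxt", k=3):
    """Run one grayscale image through the network and return top-k labels."""
    import caffe  # imported here so top_k works without caffe installed

    net = caffe.Net(model_path, weights_path, caffe.TEST)
    # load_image returns an H x W x 1 float array in [0, 1] when color=False
    image = caffe.io.load_image(image_path, color=False)
    transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
    transformer.set_transpose("data", (2, 0, 1))  # H x W x C -> C x H x W
    net.blobs["data"].data[...] = transformer.preprocess("data", image)
    output = net.forward()
    # assumption: the network's output blob is named "prob"
    probs = output["prob"][0].flatten().tolist()
    return [(label, probs[label]) for label in top_k(probs, k)]
```

`classify` returns (label index, probability) pairs. the label indices can then be looked up in map.txt to get the matching unicode character. `weights_path` should point to a .caffemodel file written to the model directory during training.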

generate sinhala chars

installation

sudo apt-get install libcairo2-dev libjpeg8-dev libpango1.0-dev libgif-dev build-essential g++

running the script

all the fonts mentioned in the code must be installed on your system. make sure the font names don't contain any spaces or escape characters.
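
as a small, hedged aid for the font name rule, the python sketch below flags names that would break it. it interprets "escape characters" as backslashes and non-printable characters, which is an assumption. the function name and the font names in the usage note are illustrative only, not taken from textgen.js.

```python
def problem_font_names(font_names):
    """Return the names that contain whitespace, a backslash,
    or a non-printable character."""
    bad = []
    for name in font_names:
        if any(ch.isspace() or ch == "\\" or not ch.isprintable() for ch in name):
            bad.append(name)
    return bad
```

for example, `problem_font_names(["FMAbhaya", "Iskoola Pota"])` returns `["Iskoola Pota"]` because of the space in the second name.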

npm install
node textgen.js

the generated images are saved in the imgs/ directory.
