Handwritten Grapheme Classification In Bengali Language Using MobileNet

This repository is the implementation of the paper cited below.

The Bengali language comprises numerous graphemes, which are the smallest functional units in a writing system. Detecting these graphemes is crucial for developing an OCR application.

Idea

OCR applications mostly run on embedded devices, so we use MobileNet, a class of efficient models for mobile and embedded vision applications. Specifically, we use MobileNetV2. Since each grapheme contains three components, this is a multilabel classification problem. As a result, we modified the softmax layer so that the model predicts all three components.
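The three-component prediction can be pictured as three independent softmax heads over one shared output vector, trained with the sum of the per-head cross-entropy losses. The snippet below is a framework-free sketch of that idea, not the code in models.py. The head names and class counts (168 grapheme roots, 11 vowel diacritics, 7 consonant diacritics, as in the Kaggle Bengali.AI labels) are assumptions.

```python
import math

# Assumed head sizes; adjust to the label counts of the dataset actually used.
HEAD_SIZES = {"grapheme_root": 168, "vowel_diacritic": 11, "consonant_diacritic": 7}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(flat_logits, head_sizes=HEAD_SIZES):
    """Slice one flat logit vector into one logit list per head."""
    heads, start = {}, 0
    for name, size in head_sizes.items():
        heads[name] = flat_logits[start:start + size]
        start += size
    if start != len(flat_logits):
        raise ValueError(f"expected {start} logits, got {len(flat_logits)}")
    return heads


def multi_head_loss(flat_logits, targets, head_sizes=HEAD_SIZES):
    """Sum of the per-head cross-entropy losses for a single sample."""
    loss = 0.0
    for name, logits in split_heads(flat_logits, head_sizes).items():
        probs = softmax(logits)
        loss -= math.log(probs[targets[name]])
    return loss


def predict(flat_logits, head_sizes=HEAD_SIZES):
    """Arg-max class index for each head."""
    heads = split_heads(flat_logits, head_sizes)
    return {name: max(range(len(l)), key=l.__getitem__) for name, l in heads.items()}
```

In a deep learning framework the same structure would usually be three linear layers on top of the MobileNetV2 features, each followed by its own cross-entropy loss.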

Dataset

We used this dataset, which is also available on Kaggle. After downloading it, change $PATH$ to the dataset directory. Then, from inside the data directory, run the following commands in order to preprocess the data:

python create_image_pickles.py
python create_folds.py
python create_chunk.py
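The scripts themselves are not reproduced here. As an illustration of what a fold-creation step typically does, the sketch below assigns each sample a fold id while spreading every label evenly across folds. It is an assumption about the general approach, not the actual contents of create_folds.py. The function name, the default of 5 folds, and the seed are hypothetical.

```python
import random
from collections import defaultdict


def assign_folds(labels, n_folds=5, seed=42):
    """Return one fold id per sample, spreading each label evenly across folds."""
    rng = random.Random(seed)

    # Group sample indices by their label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [None] * len(labels)
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal the shuffled samples of this label round-robin over the folds.
        for position, index in enumerate(indices):
            folds[index] = (position + offset) % n_folds
        # Rotate the starting fold so rare labels do not all land in fold 0.
        offset += len(indices)
    return folds
```

For the three-component labels of this task, a multilabel stratification method would be a closer fit than this single-label version.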

Training

Training the model requires specifying the training folds and the validation folds (--training_folds and --validation_folds). In addition, BATCH_SIZE, IMAGE_WIDTH, IMAGE_LENGTH, and EPOCHS can also be specified. Command for training:

python main.py --mode train  --training_folds ($Num1$, $Num2$, $Num3$, $Num4$) --validation_folds ($Num4$,)
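How main.py reads the fold tuples is not shown here. One way such arguments could be parsed is sketched below with argparse and ast.literal_eval. The helper names parse_folds and build_parser are hypothetical, and the real script may handle these flags differently.

```python
import argparse
import ast


def parse_folds(text):
    """Turn a string such as "(0, 1, 2, 3)" or "(4,)" into a tuple of ints."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        raise argparse.ArgumentTypeError(f"cannot parse fold ids from {text!r}")
    if isinstance(value, int):
        value = (value,)  # allow a bare fold id such as "4"
    if not isinstance(value, (tuple, list)) or not all(isinstance(v, int) for v in value):
        raise argparse.ArgumentTypeError(f"expected a tuple of fold ids, got {text!r}")
    return tuple(value)


def build_parser():
    """Hypothetical command-line interface mirroring the flags used above."""
    parser = argparse.ArgumentParser(description="Sketch of a train/test CLI")
    parser.add_argument("--mode", choices=["train", "test"], required=True)
    parser.add_argument("--training_folds", type=parse_folds)
    parser.add_argument("--validation_folds", type=parse_folds)
    return parser
```

Note that in a POSIX shell, unquoted parentheses are interpreted by the shell, so a tuple argument generally needs quotes, for example --training_folds "(0, 1, 2, 3)".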

Testing

Command for testing:

python main.py --mode test

Citation

If you find this codebase useful, please cite our paper:

@article{taif2024Grap,
  title={Handwritten Grapheme Classification in Bengali Language Using MobileNet},
  author={Taif Al Musabe},
  journal={techRxiv preprint techrxiv.170422019.94163857},
  year={2024}
}

Acknowledgement

We referred to a tutorial from Abhishek Thakur's YouTube channel.

License

Our code is BSD-3 licensed. See LICENSE.txt for details.
