kanji-classifier

An OCR application that classifies almost 3000 Japanese kanji. Run print_all_characters.py to print the full list of supported characters.
Deployed at kanji.al3xbro.me.

Dependencies:

TensorFlow, Keras: 2.12
NumPy, OpenCV

Note: Follow the guide at https://www.tensorflow.org/guide/gpu to train on your GPU.
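
Once the GPU setup is done, a quick check like the one below can confirm that TensorFlow actually sees the device. This is a minimal sketch and not part of the repository: list_gpus is a hypothetical helper name, while tf.config.list_physical_devices is the standard TensorFlow 2.x call.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow.

    Returns None when TensorFlow is not installed, so the check
    can be run before the dependencies are set up.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus is None:
        print("TensorFlow is not installed.")
    elif gpus:
        print("GPUs visible to TensorFlow:", gpus)
    else:
        print("No GPU detected; training will run on the CPU.")
```

An empty list means TensorFlow is installed but will fall back to the CPU, which usually points to a driver or CUDA version mismatch.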

Performance:

92% accuracy on both the training and validation sets.
0.18 training loss and 0.21 validation loss.
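
To make these numbers concrete, the sketch below computes accuracy and loss on toy data. It assumes the reported loss is categorical cross-entropy, the usual choice for a multi-class classifier; the loss function is not stated here, so treat that as an assumption. The function names and the three-class toy data are illustrative only.

```python
import math


def accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = sum(
        1
        for probs, label in zip(probabilities, labels)
        if max(range(len(probs)), key=probs.__getitem__) == label
    )
    return correct / len(labels)


def cross_entropy(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = -sum(
        math.log(max(probs[label], eps))
        for probs, label in zip(probabilities, labels)
    )
    return total / len(labels)


# Toy data: 3 samples, 3 classes (the real model has almost 3000 classes).
toy_probs = [
    [0.8, 0.1, 0.1],  # true class 0, predicted correctly
    [0.2, 0.7, 0.1],  # true class 1, predicted correctly
    [0.5, 0.3, 0.2],  # true class 2, predicted as class 0
]
toy_labels = [0, 1, 2]

print("accuracy:", accuracy(toy_probs, toy_labels))
print("loss:", cross_entropy(toy_probs, toy_labels))
```

Under the same assumption, a validation loss of 0.21 means the model assigns the correct character a geometric-mean probability of about exp(-0.21) ≈ 0.81.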

To Train:

Download an image dataset of your choice.
Modify the config.py file to contain the correct paths.
Run the image_preprocessing.py script to process images.
Run the delete_hiragana.py script to remove hiragana from the dataset.
Run the model_training.py script to train your model. Training uses data augmentation to help the model generalize.
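
The hiragana-removal step above can be pictured as a filter over class labels. The sketch below is not the contents of delete_hiragana.py; it only illustrates one way to tell hiragana from kanji, using the Unicode Hiragana block (U+3040 to U+309F). The names is_hiragana and split_labels and the sample labels are hypothetical.

```python
def is_hiragana(char):
    """True if char is a single character in the Unicode Hiragana block."""
    return len(char) == 1 and "\u3040" <= char <= "\u309f"


def split_labels(labels):
    """Separate class labels into (kept, removed), where removed holds the hiragana."""
    kept = [label for label in labels if not is_hiragana(label)]
    removed = [label for label in labels if is_hiragana(label)]
    return kept, removed


# Hypothetical class labels as they might appear after image extraction.
sample_labels = ["あ", "漢", "の", "字", "日"]
kept, removed = split_labels(sample_labels)
print("kept:", kept)
print("removed:", removed)
```

A real script would then delete or skip the image folders or files belonging to the removed labels; how those are laid out on disk depends on the extractor and on the paths set in config.py.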

Testing:

Run the predicting.py script to test your model.
Try writing kanji in your own handwriting and testing your model on that. Have fun!
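
When testing on your own handwriting, it helps to look at more than the single best guess, since similar-looking kanji often end up close together. The sketch below shows how a probability vector from a classifier can be turned into a ranked list of candidates. It is an illustration only and does not reproduce predicting.py; top_k, the five-character label list, and the probabilities are all hypothetical.

```python
def top_k(probabilities, characters, k=3):
    """Return the k most probable (character, probability) pairs, best first."""
    ranked = sorted(
        zip(characters, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 5-class output; the real model has almost 3000 classes.
characters = ["日", "月", "火", "水", "木"]
probabilities = [0.05, 0.10, 0.60, 0.05, 0.20]

for char, prob in top_k(probabilities, characters):
    print(f"{char}: {prob:.2f}")
```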

Resources:

The ETL8G and ETL9G datasets from etlcdb were used for training and validation.
Images were extracted from these datasets with etlcdb-image-extractor. Thank you!
