language_script_identification_from_images

What does this application do?

Given an image of text written in some language, can computer vision tools identify the script of that language? We explore this question by extracting features from the images, building a bag-of-visual-words model, and training classifiers on the resulting representation.

What is the language used?

This project is implemented in MATLAB.

Where to start?

The entry point of the project is the manager.m file.

Dataset

The dataset is stored in the finalDataset folder. The file createDataset > generateMatFile.m generates .mat files for the entire dataset in the dataMatlabFormat folder. The filterDataset.m file trims the dataset. To change the languages the model is trained on, select a subset of languages in manager.m by changing the number of languages and the language names.
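
As a rough illustration of selecting a subset of languages, the sketch below filters a label vector down to the chosen languages and renumbers the labels. All variable names and the placeholder language names (langA, langB, ...) are assumptions made for this example; the actual names used in manager.m and filterDataset.m may differ.

```matlab
% Hypothetical names throughout; not taken from manager.m.
allLanguages = {'langA', 'langB', 'langC', 'langD', 'langE'};
allLabels    = [1 1 2 3 3 4 5 5 2];      % one label per image, indexing allLanguages

selected     = {'langA', 'langC', 'langE'};   % languages to train on
numLanguages = numel(selected);

% Positions of the selected languages in the full language list.
[~, keepIdx] = ismember(selected, allLanguages);

% Keep only the images whose label belongs to a selected language.
keepMask     = ismember(allLabels, keepIdx);
subsetLabels = allLabels(keepMask);

% Renumber the surviving labels to 1..numLanguages.
[~, subsetLabels] = ismember(subsetLabels, keepIdx);
```

The same keepMask could be used to index the matching image or feature arrays so that labels and data stay aligned.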

Feature extraction:

This project extracts SIFT features from the image files. The code for this is in the sift folder.
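
For readers who want to see what SIFT extraction produces, here is a minimal sketch that does not use the code in the sift folder. It assumes MATLAB R2021b or later with the Computer Vision Toolbox, which provides detectSIFTFeatures and extractFeatures. The synthetic stroke image is an assumption standing in for a real text image.

```matlab
% Assumption: Computer Vision Toolbox (R2021b+), not the project's sift folder.
% Build a small synthetic grayscale image with a few stroke-like bars.
I = zeros(128, 128, 'single');
I(20:40, 20:100)  = 1;
I(60:110, 50:60)  = 1;
I(80:90, 20:110)  = 1;

% Detect SIFT keypoints and compute one descriptor per keypoint.
points = detectSIFTFeatures(I);
[features, validPoints] = extractFeatures(I, points);

% Each row of features is the descriptor of one keypoint.
fprintf('%d keypoints, %d-dimensional descriptors\n', ...
        size(features, 1), size(features, 2));
```

With a real dataset, I would be loaded with imread and converted to grayscale before detection.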

Making Bag of words model:

The bag-of-words model is created in the clusterFeatures > getClusterFeatures.m file.
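
The general idea of a bag-of-visual-words model can be sketched as follows: cluster the pooled descriptors into a vocabulary, then describe each image by a histogram of its nearest visual words. This is not the code from getClusterFeatures.m. It assumes the Statistics and Machine Learning Toolbox (kmeans, knnsearch), uses random data in place of real SIFT descriptors, and the vocabulary size of 10 is an arbitrary choice for the example.

```matlab
rng(0);                                   % reproducible toy data
numImages   = 4;
descriptors = cell(numImages, 1);
for i = 1:numImages
    descriptors{i} = rand(50, 128);       % stand-in for 50 SIFT descriptors per image
end

vocabSize = 10;                           % number of visual words (assumed value)

% Pool descriptors from all images and cluster them; each centroid is a visual word.
allDesc    = vertcat(descriptors{:});
[~, vocab] = kmeans(allDesc, vocabSize, 'MaxIter', 200);

% Represent each image as a normalized histogram of visual-word occurrences.
bow = zeros(numImages, vocabSize);
for i = 1:numImages
    wordIdx   = knnsearch(vocab, descriptors{i});        % nearest word per descriptor
    counts    = accumarray(wordIdx, 1, [vocabSize 1])';   % occurrences of each word
    bow(i, :) = counts / sum(counts);                     % L1 normalization
end
```

Each row of bow is a fixed-length feature vector, so images with different numbers of keypoints can be fed to the same classifier.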

Classification:

This project uses two classifiers: a linear classifier and a random forest classifier.
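
A minimal sketch of training both kinds of classifier on bag-of-words style vectors is shown below. It is not the project's implementation: the repository ships libsvm-3.21 and a linearClassifier folder, whereas this example assumes the Statistics and Machine Learning Toolbox (fitcecoc, templateSVM, TreeBagger) and uses synthetic data. The tree count of 50 and the train/test split are arbitrary choices.

```matlab
rng(1);                                        % reproducible toy data
numClasses = 3; perClass = 30; dim = 10;
X = []; Y = [];
for c = 1:numClasses
    X = [X; rand(perClass, dim) + c];          % toy feature vectors, shifted per class
    Y = [Y; c * ones(perClass, 1)];            % numeric class labels
end

% Random train/test split.
idx      = randperm(size(X, 1));
trainIdx = idx(1:60);
testIdx  = idx(61:end);

% Multiclass linear classifier: one linear SVM per class pair (assumed setup).
linearModel = fitcecoc(X(trainIdx, :), Y(trainIdx), ...
                       'Learners', templateSVM('KernelFunction', 'linear'));
linearPred  = predict(linearModel, X(testIdx, :));
linearAcc   = mean(linearPred == Y(testIdx));

% Random forest with 50 trees; TreeBagger returns labels as a cell array of char.
forest     = TreeBagger(50, X(trainIdx, :), Y(trainIdx), 'Method', 'classification');
forestPred = str2double(predict(forest, X(testIdx, :)));
forestAcc  = mean(forestPred == Y(testIdx));

fprintf('Linear accuracy: %.2f, random forest accuracy: %.2f\n', linearAcc, forestAcc);
```

In practice X would hold the bag-of-words histograms and Y the language labels of the training images.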
