Skip to content
No description, website, or topics provided.
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
Language_Classification
README.md

README.md

Hinglish_English_Classification

Language Identification for Code-Mixed Language. It classifies hinglish and english words using RNN Created a Deep Learning Model on Keras for identifying language. The pretrained model works for Hindi and English Code-mixed language. But the model can be modified for other languages as well.

Accuracy of the model is about 90%

To Run the code please visit my Azure Notebooks.

This project is the part of a Bigger project of Roman to devnagri Conversion.

Code-Mixed

Code-mixing is the mixing of two or more languages or language varieties in speech. Some people use the terms "code-mixing" and "code-switching" interchangeably, especially in studies of syntax, morphology, and other formal aspects of language. For Example: "ye repository bhasha phechanne ke liye banai gai hai." Which actually translates to: "this repository is made for language identification."

Pretrained Model

Pretrained Weight file(.h5) and Model File(.json) can be found in the folder. To use it, just download the .h5 file and .json file and Give the appropriate path in the program.

Dataset

There are two files, one is '.csv' (Created) and another is '.tsv' (Orignal). The '.csv' file is created upon '.tsv' file by us.

Libraries

  • Python v3.6
  • Keras v2.1.5
You can’t perform that action at this time.