kranjan94/language-identification

Language Identification

This IPython notebook builds and trains a deep model in Keras to predict the language of a line of text. For now, the scope is restricted to programming languages, since their syntax is much stricter than that of natural languages.

Data

For programming languages, we use large codebases for each language. Namely,

To add a language or change its configuration, you only need to modify the Config cell in the notebook. You may use any suitably large codebase for each language (on the order of tens of thousands of lines, depending on language complexity). Set the CODE_DIRS dictionary to point at the root directories of these repos. The notebook will retrieve all of the desired code from the repos and use it as data.
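As a rough illustration of the configuration described above, the sketch below shows what a `CODE_DIRS` dictionary and the file-gathering step might look like. Only the name `CODE_DIRS` comes from this README. The directory paths, the `EXTENSIONS` mapping, and the `collect_lines` helper are hypothetical and are not taken from the notebook.

```python
import os

# CODE_DIRS is named in the README; the paths below are placeholders.
CODE_DIRS = {
    "python": "repos/some-python-project",
    "c": "repos/some-c-project",
}

# Hypothetical: which file extensions count as source for each language.
EXTENSIONS = {
    "python": (".py",),
    "c": (".c", ".h"),
}


def collect_lines(code_dirs, extensions):
    """Walk each repo root and return a list of (line, language) pairs.

    Blank lines are skipped; files are read leniently so that stray
    non-UTF-8 bytes do not abort the scan.
    """
    samples = []
    for language, root in code_dirs.items():
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                # str.endswith accepts a tuple of suffixes.
                if not name.endswith(extensions[language]):
                    continue
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for line in handle:
                        stripped = line.strip()
                        if stripped:
                            samples.append((stripped, language))
    return samples


if __name__ == "__main__":
    data = collect_lines(CODE_DIRS, EXTENSIONS)
    print(f"collected {len(data)} labelled lines")
```

Under these assumptions, adding a language would mean adding one entry to `CODE_DIRS` and one to `EXTENSIONS`. The notebook's actual Config cell may be organised differently.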
