Classify source code using a Naive Bayes text classifier
angeloskath Initial commit
It also contains the datasets so it is kinda huge
Latest commit 34cef70 Nov 13, 2013


LanguageDetector

LanguageDetector is an implementation of sourceclassifier in PHP using NlpTools.

LanguageDetector detects the programming language of a piece of source code using a Naive Bayes model. The provided pre-trained model recognizes C, C#, C++, Clojure, Go, Haskell, Java, JavaScript, MATLAB, Pascal, Perl, PHP, Python, Ruby, Scala, and Visual Basic.

You can read a blog post about it.
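
To give a feel for what a Naive Bayes model does with source code, here is a minimal, self-contained sketch. It is not LanguageDetector's or NlpTools' actual implementation: the `tokenize`, `train`, and `classify` functions, the regex tokenizer, the add-one smoothing, and the two tiny training samples are all assumptions made for illustration. The idea is to count how often each token appears per language, then pick the language under which the tokens of a new snippet are most probable.

```php
<?php
// Toy multinomial Naive Bayes over source-code tokens.
// Illustrative only: NOT the code used by LanguageDetector or NlpTools.

// Hypothetical tokenizer: words/identifiers and single punctuation characters.
function tokenize(string $code): array
{
    preg_match_all('/\w+|[^\w\s]/', $code, $matches);
    return $matches[0];
}

// Count tokens per language from [language, code] pairs.
function train(array $samples): array
{
    $model = ['counts' => [], 'totals' => [], 'docs' => [], 'vocab' => []];
    foreach ($samples as [$lang, $code]) {
        $model['docs'][$lang] = ($model['docs'][$lang] ?? 0) + 1;
        foreach (tokenize($code) as $token) {
            $model['counts'][$lang][$token] = ($model['counts'][$lang][$token] ?? 0) + 1;
            $model['totals'][$lang] = ($model['totals'][$lang] ?? 0) + 1;
            $model['vocab'][$token] = true;
        }
    }
    return $model;
}

// Return the language maximizing log P(lang) + sum of log P(token | lang).
function classify(array $model, string $code): string
{
    $vocabSize = count($model['vocab']);
    $docCount = array_sum($model['docs']);
    $tokens = tokenize($code);
    $best = '';
    $bestScore = -INF;
    foreach ($model['docs'] as $lang => $n) {
        $score = log($n / $docCount);
        foreach ($tokens as $token) {
            // Add-one (Laplace) smoothing so unseen tokens do not zero out the score.
            $count = $model['counts'][$lang][$token] ?? 0;
            $score += log(($count + 1) / ($model['totals'][$lang] + $vocabSize));
        }
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $lang;
        }
    }
    return $best;
}

// Made-up training data, far too small for real use.
$model = train([
    ['C', "#include <stdio.h>\nint main() { printf(\"hi\"); return 0; }"],
    ['Python', "def greet():\n    print(\"hi\")\ngreet()"],
]);

echo classify($model, "int add(int a, int b) { return a + b; }"), PHP_EOL; // C
echo classify($model, "def add(a, b):\n    return a + b"), PHP_EOL;       // Python
```

A real model is trained on many files per language, which is why the repository ships its datasets alongside the pre-trained model.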

Usage

<?php

// Load Composer's autoloader so the LanguageDetector class can be found.
require "vendor/autoload.php";

$detector = LanguageDetector::loadFromFile("model");

$lang = $detector->classify(<<<CODE
#include <stdio.h>

int main() {
	printf("Hello world");
}
CODE
);

echo $lang, PHP_EOL; // C

$lang = $detector->classify(<<<CODE
def hello():
	print "Hello world"
hello()
CODE
);

echo $lang, PHP_EOL; // Python