sklearn-porter

Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It is intended for resource-constrained embedded systems and performance-critical applications.

Machine learning algorithms

| Algorithm | C | Java* | JavaScript | Go | PHP | Ruby |
|---|---|---|---|---|---|---|
| Classification | | | | | | |
| sklearn.svm.SVC | ✓ | ✓ | ✓ | | ✓ | ✓ |
| sklearn.svm.NuSVC | ✓ | ✓ | ✓ | | ✓ | ✓ |
| sklearn.svm.LinearSVC | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| sklearn.tree.DecisionTreeClassifier | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| sklearn.ensemble.RandomForestClassifier | ✓ | ✓ | ✓ | | ✓ | ✓ |
| sklearn.ensemble.ExtraTreesClassifier | ✓ | ✓ | ✓ | | ✓ | ✓ |
| sklearn.ensemble.AdaBoostClassifier | ✓ | ✓ | ✓ | | | |
| sklearn.neighbors.KNeighborsClassifier | | ✓ | ✓ | | | |
| sklearn.neural_network.MLPClassifier | | ✓ | ✓ | | | |
| sklearn.naive_bayes.GaussianNB | | ✓ | ✓ | | | |
| sklearn.naive_bayes.BernoulliNB | | ✓ | ✓ | | | |
| Regression | | | | | | |
| sklearn.neural_network.MLPRegressor | | | ✓ | | | |

✓ = is full-featured, ○ = has minor exceptions, * = default language

Installation

pip install sklearn-porter

If you want the latest bleeding-edge changes, you can install the module from the master (development) branch:

pip uninstall -y sklearn-porter
pip install --no-cache-dir https://github.com/nok/sklearn-porter/zipball/master

Minimum requirements

- python>=2.7.3
- scikit-learn>=0.14.1

If you want to transpile a multilayer perceptron, you have to upgrade the scikit-learn package:

- python>=2.7.3
- scikit-learn>=0.18.0

Usage

Export

The following example shows how you can port a decision tree model to Java:

from sklearn.datasets import load_iris
from sklearn import tree
from sklearn_porter import Porter

# Load data and train the classifier:
samples = load_iris()
X, y = samples.data, samples.target
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

# Export:
porter = Porter(clf, language='java')
output = porter.export()
print(output)

The exported result matches the official human-readable version of the decision tree.
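
To illustrate what "matching the human-readable version" of a tree means, here is a minimal sketch that turns a small tree into nested Java if/else statements. It is not the library's implementation: the `TOY_TREE` dictionary layout, its thresholds, and the helpers `to_java` and `predict` are all assumptions made up for this example.

```python
# Hypothetical toy tree: internal nodes test one feature against a threshold,
# leaves carry a class index. This is NOT sklearn-porter's internal format,
# and the thresholds are made up for illustration.
TOY_TREE = {
    "feature": 2, "threshold": 2.45,
    "left": {"label": 0},
    "right": {
        "feature": 3, "threshold": 1.75,
        "left": {"label": 1},
        "right": {"label": 2},
    },
}


def to_java(node, depth=1):
    """Render a tree node as nested Java if/else statements."""
    pad = "    " * depth
    if "label" in node:
        return "{}return {};\n".format(pad, node["label"])
    return (
        "{}if (features[{}] <= {}) {{\n".format(pad, node["feature"], node["threshold"])
        + to_java(node["left"], depth + 1)
        + "{}}} else {{\n".format(pad)
        + to_java(node["right"], depth + 1)
        + "{}}}\n".format(pad)
    )


def predict(node, features):
    """Evaluate the same tree in Python, for comparison with the generated code."""
    while "label" not in node:
        branch = "left" if features[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


java_source = "static int predict(double[] features) {\n" + to_java(TOY_TREE) + "}\n"
print(java_source)
```

Each internal node becomes one `if`, and each leaf becomes one `return`, so the generated source can be read side by side with the tree it came from.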

Prediction

Run the prediction(s) in the target programming language directly:

# ...
porter = Porter(clf, language='java')

# Prediction(s):
Y_preds = porter.predict(X)
y_pred = porter.predict(X[0])
y_pred = porter.predict([1., 2., 3., 4.])

Accuracy

Always compute the accuracy between the original and the ported estimator:

# ...
porter = Porter(clf, language='java')

# Accuracy:
accuracy = porter.predict_test(X)
print(accuracy) # 1.0
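
One plausible reading of this score is the fraction of samples on which the original and the ported estimator predict the same class. The sketch below computes that fraction for two plain lists of predictions. The helper name `agreement` and the sample predictions are assumptions for illustration, and the library may compute its score differently.

```python
def agreement(original_preds, ported_preds):
    """Return the fraction of samples on which both prediction lists agree."""
    if len(original_preds) != len(ported_preds):
        raise ValueError("prediction lists must have the same length")
    if len(original_preds) == 0:
        raise ValueError("need at least one prediction")
    matches = sum(1 for a, b in zip(original_preds, ported_preds) if a == b)
    return matches / float(len(original_preds))


# Made-up predictions for illustration only.
print(agreement([0, 1, 2, 2], [0, 1, 2, 2]))  # 1.0
print(agreement([0, 1, 2, 2], [0, 1, 1, 2]))  # 0.75
```

A value below 1.0 means the ported code disagrees with the original estimator on at least one sample, which is worth investigating before deployment.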

Command-line interface

This example shows how you can port a model from the command line. First, save the trained estimator to a pickle file:

# ...

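# Save the trained estimator (older scikit-learn releases also
# bundle joblib as sklearn.externals.joblib):
import joblib
joblib.dump(clf, 'estimator.pkl')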

After that, the model can be transpiled with the following command (long and short option forms):

python -m sklearn_porter --input <PICKLE_FILE> [--output <DEST_DIR>] [--pipe] [--c] [--java] [--js] [--go] [--php] [--ruby]
python -m sklearn_porter -i <PICKLE_FILE> [-o <DEST_DIR>] [-p] [--c] [--java] [--js] [--go] [--php] [--ruby]

For instance, the following command transpiles the estimator to JavaScript:

python -m sklearn_porter -i estimator.pkl --js

The target programming language is selected with the corresponding flag:

python -m sklearn_porter -i estimator.pkl --c
python -m sklearn_porter -i estimator.pkl --go
python -m sklearn_porter -i estimator.pkl --php
python -m sklearn_porter -i estimator.pkl --java
python -m sklearn_porter -i estimator.pkl --ruby
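
When several target languages are needed, the commands above can be assembled from a script. The following sketch only builds the argument lists, using the flags documented in the synopsis. The helper `build_command` is hypothetical and not part of sklearn-porter.

```python
import sys

# Language flags exactly as listed in the usage synopsis.
LANGUAGE_FLAGS = ("c", "java", "js", "go", "php", "ruby")


def build_command(pickle_file, language, output_dir=None):
    """Return the argv list for one transpile run (hypothetical helper)."""
    if language not in LANGUAGE_FLAGS:
        raise ValueError("unsupported language flag: {}".format(language))
    command = [sys.executable, "-m", "sklearn_porter", "-i", pickle_file]
    if output_dir is not None:
        command += ["-o", output_dir]
    command.append("--" + language)
    return command


for language in ("c", "java", "js"):
    # Only prints the command; pass the list to subprocess.check_call to run it.
    print(" ".join(build_command("estimator.pkl", language)))
```

Building a list instead of a shell string avoids quoting problems when the pickle path contains spaces.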

The --pipe parameter writes the transpiled estimator to standard output, so it can be redirected or processed further:

python -m sklearn_porter -i estimator.pkl --js --pipe > estimator.js

For instance, the generated JavaScript code can be minified with UglifyJS:

python -m sklearn_porter -i estimator.pkl --js --pipe | uglifyjs --compress -o estimator.min.js

The --help parameter shows further information:

python -m sklearn_porter --help
python -m sklearn_porter -h

Development

Environment

Install the required environment modules by executing the script environment.sh:

bash ./recipes/environment.sh

conda env create -c conda-forge -n sklearn-porter python=2 -f environment.yml
source activate sklearn-porter

The following compilers or interpreters are required to cover all tests:

- GCC (>=4.2)
- Java (>=1.6)
- PHP (>=7)
- Ruby (>=2.4.1)
- Go (>=1.7.4)
- Node.js (>=6)
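
Before running the full test suite, it can help to check which of these tools are on the PATH. The sketch below does that with `shutil.which` (Python 3.3 or newer). The executable names in `REQUIRED_TOOLS` are assumptions about a typical installation, and the script does not verify the minimum versions.

```python
import shutil

# Assumed executable names; adjust them if your system uses different ones.
REQUIRED_TOOLS = {
    "GCC": "gcc",
    "Java": "java",
    "PHP": "php",
    "Ruby": "ruby",
    "Go": "go",
    "Node.js": "node",
}


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its path on PATH, or None if it is missing."""
    return {name: shutil.which(exe) for name, exe in tools.items()}


for name, path in sorted(find_tools().items()):
    print("{:8} {}".format(name, path or "NOT FOUND"))
```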

Testing

The tests cover module functions as well as matching predictions of transpiled models. Run all tests by executing the script test.sh:

bash ./recipes/test.sh

python -m unittest discover -vp '*Test.py'

The test file names follow the pattern '[Algorithm][Language]Test.py', so you can run a subset of the tests by narrowing the pattern:

python -m unittest discover -vp 'RandomForest*Test.py'
python -m unittest discover -vp '*JavaTest.py'
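
The -p option takes a shell-style pattern of the kind Python's `fnmatch` module implements, so a pattern can be previewed before running the tests. The file names below are hypothetical examples that follow the naming scheme, not a listing of the actual test directory.

```python
from fnmatch import fnmatch

# Hypothetical file names that follow '[Algorithm][Language]Test.py'.
test_files = [
    "RandomForestClassifierJavaTest.py",
    "RandomForestClassifierJSTest.py",
    "SVCJavaTest.py",
    "SVCRubyTest.py",
]


def select(pattern, names=test_files):
    """Return the names that match a shell-style pattern."""
    return [name for name in names if fnmatch(name, pattern)]


print(select("RandomForest*Test.py"))
print(select("*JavaTest.py"))
```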

While you are developing new features or fixes, you can reduce the test duration by setting the number of model tests:

N_RANDOM_FEATURE_SETS=15 N_EXISTING_FEATURE_SETS=30 python -m unittest discover -vp '*Test.py'
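
The inline `VAR=value command` form works in POSIX shells only. As a portable alternative, the same run can be prepared from Python, as in this sketch. The helper `reduced_test_run` is hypothetical; the variable names and the unittest command are taken from the line above.

```python
import os
import sys


def reduced_test_run(n_random=15, n_existing=30, pattern="*Test.py"):
    """Return (command, env) for a shortened test run (hypothetical helper)."""
    env = dict(os.environ)
    # Environment values must be strings.
    env["N_RANDOM_FEATURE_SETS"] = str(n_random)
    env["N_EXISTING_FEATURE_SETS"] = str(n_existing)
    command = [sys.executable, "-m", "unittest", "discover", "-vp", pattern]
    return command, env


command, env = reduced_test_run()
# Only prints the command; use subprocess.call(command, env=env) to run it.
print(" ".join(command))
```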

Quality

It's highly recommended to ensure the code quality. For that I use Pylint, which you can run by executing the script lint.sh:

bash ./recipes/lint.sh

find ./sklearn_porter -name '*.py' -exec pylint {} \;

License

The module is Open Source Software released under the MIT license.

Questions?

Don't be shy and feel free to contact me on Twitter or Gitter.
