Python machine learning package providing simple interoperability between ML.NET and scikit-learn components.


nimbusml is a Python module that provides experimental Python bindings for ML.NET.

ML.NET was originally developed in Microsoft Research and is used across many product groups at Microsoft, such as Windows, Bing, PowerPoint, and Excel. nimbusml was built so that data science teams more familiar with Python can take advantage of ML.NET's functionality and performance.

This package enables training ML.NET pipelines or integrating ML.NET components directly into scikit-learn pipelines (it supports numpy.ndarray, scipy.sparse.csr_matrix, and pandas.DataFrame as inputs).

Documentation can be found here and additional notebook samples can be found here.


nimbusml runs on Windows, Linux, and macOS.

nimbusml requires a 64-bit build of Python 2.7, 3.5, or 3.6. Python 3.7 is not yet supported.
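
Before installing, it can help to confirm that the active interpreter matches the requirements above. The following is a minimal sketch using only the standard library; the function name interpreter_problems and the SUPPORTED_VERSIONS constant are illustrative and are not part of nimbusml.

```python
import struct
import sys

# Versions listed as supported in this README (3.7 is not yet supported).
SUPPORTED_VERSIONS = {(2, 7), (3, 5), (3, 6)}


def interpreter_problems(version_info=None, pointer_bits=None):
    """Return a list of reasons the interpreter misses the stated requirements."""
    if version_info is None:
        version_info = sys.version_info
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit build of Python.
        pointer_bits = struct.calcsize("P") * 8

    problems = []
    major_minor = (version_info[0], version_info[1])
    if major_minor not in SUPPORTED_VERSIONS:
        problems.append("Python %d.%d is not in the supported list" % major_minor)
    if pointer_bits != 64:
        problems.append("%d-bit interpreter found, 64-bit required" % pointer_bits)
    return problems


if __name__ == "__main__":
    issues = interpreter_problems()
    print("OK" if not issues else "; ".join(issues))
```

An empty list means the interpreter looks suitable. The arguments exist only so the check can be exercised with other values, for example interpreter_problems((3, 7, 0), 64).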

Install nimbusml using pip with:

pip install nimbusml

nimbusml has been reported to work on Windows 10, macOS 10.13, Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, CentOS 7, and RHEL 7.


Here is an example of how to train a model to predict sentiment from text samples (based on this ML.NET example). The full code for this example is here.

from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.ensemble import FastTreesBinaryClassifier
from nimbusml.feature_extraction.text import NGramFeaturizer

train_file = get_dataset('gen_twittertrain').as_filepath()
test_file = get_dataset('gen_twittertest').as_filepath()

train_data = FileDataStream.read_csv(train_file, sep='\t')
test_data = FileDataStream.read_csv(test_file, sep='\t')

pipeline = Pipeline([ # nimbusml pipeline
    NGramFeaturizer(columns={'Features': ['Text']}),
    FastTreesBinaryClassifier(feature=['Features'], label='Label')
])

# fit and predict
pipeline.fit(train_data)
results = pipeline.predict(test_data)

Instead of creating a nimbusml pipeline, you can also integrate components into scikit-learn pipelines:

from sklearn.pipeline import Pipeline
from nimbusml.datasets import get_dataset
from nimbusml.ensemble import FastTreesBinaryClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd

train_file = get_dataset('gen_twittertrain').as_filepath()
test_file = get_dataset('gen_twittertest').as_filepath()

train_data = pd.read_csv(train_file, sep='\t')
test_data = pd.read_csv(test_file, sep='\t')

pipeline = Pipeline([ # sklearn pipeline
    ('tfidf', TfidfVectorizer()), # sklearn transform
    ('clf', FastTreesBinaryClassifier()) # nimbusml learner
])

# fit and predict
pipeline.fit(train_data["Text"], train_data["Label"])
results = pipeline.predict(test_data["Text"])
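
A natural next step after predicting is to compare the predictions with the known labels. The sketch below is a plain-Python accuracy helper, not a nimbusml or scikit-learn function. It assumes that results is a one-dimensional sequence of predicted labels and that the test data has a "Label" column; neither assumption is confirmed by this README.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    y_true = list(y_true)
    y_pred = list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty sequence")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / float(len(y_true))


# Hypothetical usage with the objects from the example above, assuming
# test_data has a "Label" column:
#   print(accuracy(test_data["Label"], results))

# Toy data so the sketch runs on its own: 3 of 4 predictions match.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # prints 0.75
```

For anything beyond a quick check, the metrics in sklearn.metrics (for example accuracy_score) are the more usual choice.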

Many additional examples and tutorials can be found in the documentation.


To build nimbusml from source please visit our developer guide.


The contributions guide can be found here. Given the experimental nature of this project, support will be provided on a best-effort basis. We suggest opening an issue for discussion before starting a PR with big changes.


If you have an idea for a new feature or encounter a problem, please open an issue in this repository or ask your question on Stack Overflow.


NimbusML is licensed under the MIT license.