Commit
* initial upload * dev branch workflow * Update README.md * starting to setup coverage * flake err cleanup * deleted more unused code * can't find a good githubactions coverage * can't find a good githubactions coverage * bug fixes * consolidating tests * XGB Regressor is failing * commiting lgbm regressor tests * using params * fixing lgbm max_depth bug * better test output. TODO: fix the max_depth for lgbm and xgb to not fall through to None, need to compute * adding failure case test. TODO: why does RF not have extra_config in regressor * pinning to xgboost .90 for now * refactoring tree's extra_config for xgb and lgbm * fixing reversed param * adding gbdt test file * refactoring beam params functions * making all beam params as numpy * increasing coverege by shifting label starts and by deleting unused model.infer_initial_types() * extra config for rf reg * flake8 * more error testing * using onnxconverter types instead of copypaste * more consolidation * more test coverage * first step in refactor * cleaning up batch params * adding beam++ to node size 1 test * there is a bug, documenting * renaming trees to match paper * test * adding precommit hooks * README.md * readme update * commit hooks * Fixing badge link to be relative * notebook for demo * notebook for demo * notebook params change * reveriting 2c95f48 and reopening issue #9; this solution is too clunky * bumping pyt req * Fix pytorch requirements * Fix to brackets for alpha in xgboost * Few minor fixes to comments in tests * Removed unecessary regression tests * Add binary classification tests for gemm, tree_trav and perf_tree_trav * Fixes to whitespaces * updating readme * filling out contrib section * expanding readme example so that (1) it actually runs (2) it actually does a thing * cleaning notebook example * Fix to typo and update to the requirements * Fix to flake8 errors * readme changes from this morning * changes based on feedback * Few edits to contributing * Few edits in the README file 
* fixing mailto: syntax * Remove initial_types from the converter API * Rename Skl2PyTorch container into HBPyTorch * Add convert_xgboost and convert_lightgbm API * Fix to spacing * remove pandas check (for the moment) * fix import * Fix readme to use the new API * removed common directory * add some documentation * renamed few things * code refactoring for trees * refactor lightgbm and xgboost by moving stuff into gbdt_commons * done with a pass on gbdt after moving everything to _gbdt_common * final refactoring of gbdt classes * rename random forest stuff into decision tree * major refactoring for tree implementations * some renaming here and there * minor fix * Add test to validate that issue #7 is closed. * import container stuff from onnx-common * fix the parser to use the topology in onnx-common * remove unnecessary files * address first chunk of Karla's comments * fix typo in calibration * Another round of comments addressed * fix typo * these two lines seem unnecessary * moving notebooks from broken branch * adding notebooks with new API changes * removing comment * removed few unnecessary code and edited some documentation * Update CONTRIBUTING.md * remove . from git clone * Final pass over non-converters files documentation / API * add constants for converters * simplify a bit the API by using extra_config for optional parameters * Update CONTRIBUTING.md * done with documentation over public classes , methods * add contants and extra config management * addressing Karla's comments * pip install pdoc; pdoc --html hummingbird * pdoc3, using overrides to get extra doc if we want it * add few tests to check that we actually pick the correct implementation * Update README.md * Reformat doc * add HB logo to readme file * Add HB logo in doc * add assertion on model being not None Co-authored-by: Karla Saur <karla.saur@microsoft.com> Co-authored-by: Matteo Interlandi <mainterl@microsoft.com>
1 parent fb4e437 commit 290a4bd
Showing 55 changed files with 10,267 additions and 14 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[run] | ||
branch = True | ||
source = hummingbird | ||
omit = | ||
    *tests* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
[flake8] | ||
ignore = E203, E266, E501, W503, F403, F401, C901 | ||
max-line-length = 127 | ||
max-complexity = 10 | ||
select = B,C,E,F,W,T4,B9 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# This workflow will install Python dependencies, run tests and lint with a single version of Python | ||
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions | ||
|
||
name: Python application | ||
|
||
on: | ||
  push: | ||
    branches: | ||
      - master | ||
      - develop | ||
|
||
  pull_request: | ||
    branches: | ||
      - master | ||
      - develop | ||
|
||
jobs: | ||
  build: | ||
|
||
    runs-on: ubuntu-latest | ||
|
||
    steps: | ||
    - uses: actions/checkout@v2 | ||
    - name: Set up Python 3.7 | ||
      uses: actions/setup-python@v1 | ||
      with: | ||
        python-version: 3.7 | ||
    - name: Install dependencies | ||
      run: | | ||
        python -m pip install --upgrade pip | ||
        pip install -r requirements.txt | ||
    - name: Lint with flake8 | ||
      run: | | ||
        pip install flake8 | ||
        # stop the build if there are Python syntax errors or undefined names | ||
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics | ||
        # The GitHub editor is 127 chars wide | ||
        flake8 . --count --max-complexity=10 --max-line-length=127 --statistics | ||
    - name: Test with pytest | ||
      run: | | ||
        pip install -r requirements.txt && pip install -e . && pip install pytest | ||
        pytest | ||
    - name: Coverage | ||
      run: | | ||
        pip install -r requirements.txt && pip install -e . && pip install coverage | ||
        coverage run -m pytest tests | ||
        MINIMUM=70 | ||
        SCORE=$(coverage report -m | tail -n 1 | awk '{print $NF}' | rev | cut -c2- | rev) | ||
        if [ $SCORE -ge $MINIMUM ]; then echo "COVERAGE ($SCORE) OK"; else echo "WARNING: Coverage is $SCORE but should be at least $MINIMUM"; fi |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
pip-wheel-metadata/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env | ||
env/ | ||
venv | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
|
||
# project specific | ||
.vscode/* | ||
configs/db/*.config | ||
configs/github/*.token | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
repos: | ||
  - repo: https://github.com/psf/black | ||
    rev: stable | ||
    hooks: | ||
      - id: black | ||
        language_version: python3.6 | ||
  - repo: https://github.com/pre-commit/pre-commit-hooks | ||
    rev: v1.2.3 | ||
    hooks: | ||
      - id: flake8 | ||
      - id: check-added-large-files | ||
      - id: check-ast | ||
      - id: check-byte-order-marker | ||
      - id: check-merge-conflict | ||
      - id: detect-private-key | ||
      - id: trailing-whitespace | ||
      - id: no-commit-to-branch |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# Contributing | ||
|
||
## Welcome | ||
|
||
If you are here, it means you are interested in helping us out. A hearty welcome and thank you! There are many ways you can contribute to Hummingbird: | ||
|
||
* Offer PRs to fix bugs or implement new features; | ||
* Give us feedback and bug reports regarding the software or the documentation; | ||
* Improve our examples and documentation. | ||
|
||
This project welcomes contributions and suggestions. | ||
|
||
## Getting Started | ||
|
||
Please join the community on Gitter *gitter badge*. Also please make sure to take a look at the project [roadmap](wiki/Roadmap-for-Upcoming-Features-and-Support). | ||
|
||
|
||
### Pull requests | ||
If you are new to GitHub, [here](https://help.github.com/categories/collaborating-with-issues-and-pull-requests/) is a detailed help source on getting involved with development on GitHub. | ||
|
||
As a first time contributor, you will be invited to sign the Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us | ||
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com. You will only need to do this once across all repos using our CLA. | ||
|
||
Your pull request needs to reference a filed issue. Please fill in the template that is populated for the pull request. Only pull requests addressing small typos can have no issues associated with them. | ||
|
||
All commits in a pull request will be [squashed](https://github.blog/2016-04-01-squash-your-commits/) to a single commit with the original creator as author. | ||
|
||
### Code of Conduct | ||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). | ||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or | ||
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. | ||
|
||
## Developing | ||
The simplest setup is: | ||
``` | ||
mkdir hummingbird | ||
cd hummingbird | ||
git clone https://github.com/microsoft/hummingbird.git . | ||
pip install -e . | ||
``` | ||
|
||
#### Pre-commit | ||
This project uses [pre-commit](https://pre-commit.com/) hooks. Run `pip install pre-commit` if you don't already have it on your machine. Afterward, run `pre-commit install` to install pre-commit into your git hooks. | ||
|
||
Before you commit, you can run the hooks manually with `pre-commit run --all-files`; you should see output such as: | ||
|
||
``` | ||
black............................Passed | ||
Flake8...........................Passed | ||
... | ||
Don't commit to branch...........Passed | ||
``` | ||
|
||
If you have installed your pre-commit hooks successfully, you should see something like this if you try to commit something non-conformant: | ||
``` | ||
$ git commit -m "testing" | ||
black............................Failed | ||
- hook id: black | ||
- files were modified by this hook | ||
reformatted hummingbird/convert.py | ||
All done! | ||
1 file reformatted. | ||
``` | ||
|
||
#### Formatting | ||
We generally follow all PEP 8 checks, except that the maximum line length is 127 characters. | ||
|
||
To do a quick check-up before commit, try: | ||
``` | ||
flake8 . --count --max-complexity=10 --max-line-length=127 --statistics | ||
``` | ||
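
As a rough illustration of the line-length rule only, the following Python sketch lists lines that exceed 127 characters. The function name `long_lines` and the constant `MAX_LINE_LENGTH` are hypothetical names chosen for this example; the sketch does not replace `flake8`, which checks far more than line length.

```python
MAX_LINE_LENGTH = 127  # same limit as the flake8 command above


def long_lines(source: str, limit: int = MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than `limit` characters."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


if __name__ == "__main__":
    # Hypothetical input: one short line, one line that is too long.
    sample = "x = 1\n" + "y" * 128
    for number, length in long_lines(sample):
        print(f"line {number}: {length} characters (limit {MAX_LINE_LENGTH})")
```

A line of exactly 127 characters passes; only longer lines are reported.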
|
||
#### Coverage | ||
|
||
For coverage, we use [coverage.py](https://coverage.readthedocs.io/en/coverage-5.0.4/) in our GitHub Actions. Run `pip install coverage` if you don't already have it. Any code you commit should not significantly reduce coverage. | ||
|
||
We strive to keep our test coverage at or above 70%. To run all unit tests under coverage: | ||
``` | ||
coverage run -m pytest tests | ||
``` |
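
The 70% threshold can also be checked locally. The sketch below is a hedged Python take on the shell check in the workflow file: it reads the percentage from the `TOTAL` line of `coverage report` output and compares it with the minimum. The function names `total_coverage` and `check` are hypothetical, and the sample report text is made up for illustration.

```python
import re

MINIMUM = 70  # same threshold as MINIMUM in the workflow's Coverage step


def total_coverage(report: str) -> int:
    """Return the integer percentage found on the TOTAL line of a coverage report."""
    for line in reversed(report.strip().splitlines()):
        if line.startswith("TOTAL"):
            match = re.search(r"(\d+)%", line)
            if match:
                return int(match.group(1))
    raise ValueError("no TOTAL line with a percentage found")


def check(report: str, minimum: int = MINIMUM) -> str:
    """Build the same kind of message the workflow step echoes."""
    score = total_coverage(report)
    if score >= minimum:
        return f"COVERAGE ({score}) OK"
    return f"WARNING: Coverage is {score} but should be at least {minimum}"


if __name__ == "__main__":
    # Made-up report text; in practice, pass in the output of `coverage report -m`.
    sample = (
        "Name                     Stmts   Miss Branch BrPart  Cover\n"
        "----------------------------------------------------------\n"
        "hummingbird/convert.py     100     20     30      5    78%\n"
        "----------------------------------------------------------\n"
        "TOTAL                      100     20     30      5    78%\n"
    )
    print(check(sample))
```

Like the workflow step, this only reports the result; it does not fail the build when coverage is below the minimum.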
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,75 @@ | ||
|
||
# Contributing | ||
|
||
This project welcomes contributions and suggestions. Most contributions require you to agree to a | ||
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us | ||
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com. | ||
|
||
When you submit a pull request, a CLA bot will automatically determine whether you need to provide | ||
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions | ||
provided by the bot. You will only need to do this once across all repos using our CLA. | ||
|
||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). | ||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or | ||
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments. | ||
[![](https://i.imgur.com/0pp9lMS.png?1)](https://github.com/microsoft/hummingbird/) | ||
|
||
# Hummingbird | ||
|
||
![](https://github.com/microsoft/hummingbird/workflows/Python%20application/badge.svg?branch=develop) | ||
|
||
## Introduction | ||
*Hummingbird* converts trained traditional Machine Learning models into [PyTorch](https://pytorch.org/). Once in the PyTorch format, <!--you can further convert to [ONNX](https://github.com/onnx/onnx) or [TorchScript](https://pytorch.org/docs/stable/jit.html), and --> you can run the models on GPU for high-performance native scoring. For full details, see [our paper](https://scnakandala.github.io/papers/TR_2020_Hummingbird.pdf). | ||
|
||
Currently we support [these](https://github.com/microsoft/hummingbird/blob/develop/hummingbird/_supported_operators.py#L26) tree-based classifiers and regressors. These models include | ||
[scikit-learn](https://scikit-learn.org/stable/) models such as Decision Trees and Random Forest, and also [LightGBM](https://github.com/Microsoft/LightGBM) and [XGBoost](https://github.com/dmlc/xgboost) Classifiers/Regressors. | ||
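
As a rough illustration of what expressing a tree model as tensor operations can look like, here is a minimal pure-Python sketch that evaluates a tiny decision tree with matrix multiplications, in the spirit of the `gemm` strategy named in this commit's test list. The toy tree, the matrices `A` through `E`, and the helper functions are all hypothetical and chosen for this example. This is not Hummingbird's implementation, which works on PyTorch tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical toy tree with two internal nodes and three leaves:
#   node0: x[0] < 0.5 ? node1 : leaf2
#   node1: x[1] < 0.5 ? leaf0 : leaf1
A = [[1, 0], [0, 1]]          # features x internal nodes: which feature each node tests
B = [0.5, 0.5]                # threshold of each internal node
C = [[1, 1, -1], [1, -1, 0]]  # internal nodes x leaves: 1 = leaf in left subtree, -1 = right
D = [2, 1, 0]                 # per leaf: number of ancestors reached through a left edge
E = [[1, 0], [0, 1], [0, 1]]  # leaves x classes: class label stored at each leaf


def predict_gemm(X):
    """Predict classes using only matrix products and elementwise comparisons."""
    # Evaluate every node condition for every sample at once.
    T = [[1 if v < t else 0 for v, t in zip(row, B)] for row in matmul(X, A)]
    # A leaf is selected when its left-edge count matches D.
    leaves = [[1 if v == d else 0 for v, d in zip(row, D)] for row in matmul(T, C)]
    # Map the selected leaf to its class scores and take the best class.
    scores = matmul(leaves, E)
    return [row.index(max(row)) for row in scores]


def predict_traverse(X):
    """Reference: ordinary if/else traversal of the same toy tree."""
    return [(0 if x[1] < 0.5 else 1) if x[0] < 0.5 else 1 for x in X]


if __name__ == "__main__":
    X = [[0.2, 0.2], [0.2, 0.8], [0.8, 0.2], [0.8, 0.8]]
    print(predict_gemm(X), predict_traverse(X))
```

The point of this formulation is that every sample and every node is evaluated by the same dense operations, which is the kind of work GPUs handle well.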
|
||
## Installation | ||
|
||
This was tested on Python 3.7 on a Linux machine. | ||
``` | ||
mkdir hummingbird | ||
cd hummingbird | ||
git clone https://github.com/microsoft/hummingbird.git . | ||
python setup.py install | ||
``` | ||
|
||
## Examples | ||
|
||
See the [notebooks](notebooks) section for examples that demonstrate use and speedups. | ||
|
||
In general, the syntax is very similar to [skl2onnx](https://github.com/onnx/sklearn-onnx), as Hummingbird started as a fork of that project. | ||
|
||
```python | ||
import torch | ||
import numpy as np | ||
import lightgbm as lgb | ||
from hummingbird import convert_lightgbm | ||
|
||
# Create some random data for binary classification | ||
num_classes = 2 | ||
X = np.array(np.random.rand(100000, 28), dtype=np.float32) | ||
y = np.random.randint(num_classes, size=100000) | ||
|
||
# Create and train a model (LightGBM in this case) | ||
model = lgb.LGBMClassifier() | ||
model.fit(X, y) | ||
|
||
# Use Hummingbird to convert the model to pytorch | ||
pytorch_model = convert_lightgbm(model) | ||
|
||
# Run Hummingbird on CPU | ||
pytorch_model.to('cpu') | ||
hb_cpu = pytorch_model(torch.from_numpy(X)) | ||
|
||
# Run Hummingbird on GPU (skipped when CUDA is not available) | ||
if torch.cuda.is_available(): | ||
    pytorch_model.to('cuda') | ||
    hb_gpu = pytorch_model(torch.from_numpy(X).to('cuda')) | ||
``` | ||
|
||
# Contributing | ||
|
||
We welcome contributions! Please see the guide on [Contributing](CONTRIBUTING.md). | ||
|
||
Also, see our [roadmap](wiki/Roadmap-for-Upcoming-Features-and-Support) of planned features. | ||
|
||
# Community | ||
|
||
Join our community! *gitter badge here* | ||
|
||
For more formal enquiries, you can [contact us](mailto:hummingbird-dev@microsoft.com). | ||
|
||
# Authors | ||
|
||
* Supun Nakandala | ||
* Matteo Interlandi | ||
* Karla Saur | ||
|
||
# License | ||
[MIT License](LICENSE) |