Dev (#31)

* Ebnt 142 warnings (#16)
* EBNT-142 eliminate Tensorflow deprecation warnings
* EBNT-142 replace Tensorflow with CatBoost in requirements_tests
* EBNT-142 eliminate fake warnings from check_pypi_module tests
* EBNT-142 eliminate version incompatibility warnings from client tests
* EBNT-142 tests and fixes for (multi-)model providers/loaders
* fix for coveralls badge
* EBNT-142 minor review improvement
* EBNT-142 move TF warnings workarounds from ebonite to ebonite.ext.tensorflow
* changelog v0.3.2
* Bump version: 0.3.1 → 0.3.2
* EBNT-142 move TF warnings workarounds from main code to tests (#18)
* Ebnt 146 docs (#20)
* EBNT-146 docs
* EBNT-146 fix isort
* EBNT-146 fix import
* EBNT-146 fix import
* Ebnt 177 which docker (#22)
* update docs (#21)
* EBNT-177 Eliminate "which docker" check failing on Windows hosts
* EBNT-177 fix for coverage 5.0.0
* EBNT-177 freeze coverage version as new major release is incompatible with coveralls
* Ebnt 178 flask redirect (#23)
* EBNT-178 redirect from / to Swagger API documentation in Flask
* EBNT-178 upgrade pyjackson
* Ebnt 157 predict proba (#25)
* EBNT-157 support for predict_proba in runtime
* EBNT-157 support for predict_proba in catboost
* EBNT-157 support for predict_proba in sklearn
* EBNT-157 predict_proba in examples
* EBNT-157 freeze scipy to 1.3.3 as 1.4.0 crashes pytest
* EBNT-157 Support for custom methods in model wrappers with predict_proba PoC
* EBNT-157 review fixes
* Ebnt 180 numpy shapes (#26)
* EBNT-180 First dimension of numpy arrays / torch tensors (batch size) should not be fixed in schemas
* EBNT-181 HTTP client/server error handling via dataset type-driven validation (#27)
* Ebnt 187 torch jit (#28)
* EBNT-187 Support for Pytorch JIT (TorchScript) models
* Ebnt 179 188 better documentation (#29)
* EBNT-179 EBNT-188 improve readme
* update scikit-learn to 0.22 to prevent logreg solver warnings
* prevent coverage 5.x installation in coveralls tox block
* Bump tensorflow from 1.14.0 to 1.15.0 (#24)
  Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 1.14.0 to 1.15.0.
  - [Release notes](https://github.com/tensorflow/tensorflow/releases)
  - [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
  - [Commits](tensorflow/tensorflow@v1.14.0...v1.15.0)
  Signed-off-by: dependabot[bot] <support@github.com>
* fix scipy==1.3.3
  Co-authored-by: Mikhail Sveshnikov <mike0sv@gmail.com>
* Fix/dataset type tests (#30)
* add some DatasetType tests
* remove redundant
* fix isort
* add changelog

Co-authored-by: Ivan Andrianov <53702184+i-a-andrianov@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
zyfra · Jan 10, 2020 · afb34bc · afb34bc
1 parent ceb511c
commit afb34bc
Showing 58 changed files with 907 additions and 360 deletions.
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -4,6 +4,19 @@ Changelog
 Current release candidate
 -------------------------
 
+0.3.3 (2020-01-10)
+------------------
+
+* Eliminated tensorflow warnings. Added more tests for providers/loaders. Fixed bugs in multi-model provider/builder.
+* Improved documentation
+* Eliminated the useless "which docker" check, which fails on Windows hosts
+* Added a redirect from / to the Swagger API docs in the Flask server
+* Added support for the predict_proba method in ML models
+* The first dimension size of numpy arrays and torch tensors is no longer fixed in schemas
+* Added support for PyTorch JIT (TorchScript) models
+* Bump tensorflow from 1.14.0 to 1.15.0
+* Added more tests
+
 0.3.2 (2019-12-04)
 ------------------
 

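The changelog entry about not fixing the first dimension of numpy arrays and torch tensors can be illustrated with a minimal sketch. It assumes a schema stores an expected shape in which `None` in the first position means "any batch size"; `shape_matches` is a hypothetical helper written for this note, not part of Ebonite's API.

```python
from typing import Optional, Sequence, Tuple


def shape_matches(expected: Sequence[Optional[int]], actual: Tuple[int, ...]) -> bool:
    """Return True if `actual` fits `expected`, where None matches any size."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


# Shape taken from a sample batch; the batch dimension is left open in the schema.
sample_shape = (2, 3)
schema_shape = (None,) + sample_shape[1:]

same_batch = shape_matches(schema_shape, (2, 3))      # same batch size as the sample
larger_batch = shape_matches(schema_shape, (100, 3))  # different batch size
wrong_features = shape_matches(schema_shape, (100, 4))  # feature dimension differs
print(same_batch, larger_batch, wrong_features)
```

With the first dimension left open, a schema built from a two-row sample still accepts a request with any number of rows, while a mismatch in the remaining dimensions is rejected.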
diff --git a/README.rst b/README.rst
@@ -81,25 +81,49 @@ Installation
 Quickstart
 =============
 
-First, create a Ebonite client.
+Before starting with Ebonite, prepare your model.
+This could be a model from your favorite library (list of supported libraries is presented above) or
+a custom Python function working with typical machine learning data structures.
+
+.. code-block:: python
+
+  import numpy as np
+  def clf(data):
+    return (np.sum(data, axis=-1) > 1).astype(np.int32)
+
+Moreover, your custom function could wrap a model from some library.
+This gives you flexibility to use not only pure ML models but rule-based ones (e.g., as a service stub at project start)
+and hybrid (ML with pre/postprocessing) ones which are often applied to solve real world problems.
+
+Once the model is prepared, create an Ebonite client.
 
 .. code-block:: python
 
   from ebonite import Ebonite
   ebnt = Ebonite.local()
 
-Second, create a task and push your model object with some sample data.
+Then create a task and push your model object with some sample data.
+Ebonite needs the sample data to determine the structure of your model's inputs and outputs.
 
 .. code-block:: python
 
    task = ebnt.get_or_create_task('my_project', 'my_task')
-   model = task.create_and_push_model(clf, test_x, 'my_sklearn_clf')
+   model = task.create_and_push_model(clf, test_x, 'my_clf')
 
-You are awesome! Now you can load you model from this repo and do other wonderful stuff with it, for
-example create a docker image.
+You are awesome! Now your model is safely persisted in a repository.
 
-Check out examples and documentation to learn more.
+Later, in another Python process, you can load your model from this repository and do some wonderful stuff with it,
+e.g., create a Docker image named `my_service` with an HTTP service wrapping your model.
 
+.. code-block:: python
+
+  from ebonite import Ebonite
+  ebnt = Ebonite.local()
+  task = ebnt.get_or_create_task('my_project', 'my_task')
+  model = ebnt.get_model('my_clf', task)
+  ebnt.build_service('my_service', model)
+
+Check out the examples (in the `examples` directory) and the documentation to learn more.
 
 Documentation
 =============
@@ -108,7 +132,9 @@ Documentation
 Supported libraries and repositories
 ====================================
 
-* Machine Learning
+* Models
+
+  * your arbitrary Python function
 
   * scikit-learn
 
@@ -122,15 +148,19 @@ Supported libraries and repositories
 
   * CatBoost
 
-* Data
+* Model input / output data
 
   * NumPy
 
   * pandas
 
   * images
 
-* Repositories
+* Model repositories
+
+  * in-memory
+
+  * local filesystem
 
   * SQLAlchemy
 

diff --git a/examples/custom_code/client.py b/examples/custom_code/client.py
@@ -12,7 +12,7 @@ def main():
     payload = {'vector': {'values': [{'value': value}]}}
     r = requests.post('http://localhost:9000/predict', json=payload)
     r.raise_for_status()
-    print(r.json()['data'][0])
+    print(r.json()['data']['values'][0]['value'])
 
 
 if __name__ == '__main__':

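The hunk above changes how the client reads a prediction: the `data` field of the response now mirrors the structured request payload instead of being a plain list. Below is a minimal sketch of reading such a response, assuming the JSON shape implied by the diff. `extract_values` is a hypothetical helper, and the sample dictionary is hand-written for illustration rather than captured from a running server.

```python
from typing import Any, Dict, List


def extract_values(response_json: Dict[str, Any]) -> List[Any]:
    """Collect every 'value' from a response shaped like
    {'data': {'values': [{'value': ...}, ...]}} (shape assumed from the diff)."""
    return [item['value'] for item in response_json['data']['values']]


# Hand-written stand-in for r.json(); not real server output.
fake_response = {'data': {'values': [{'value': 3}, {'value': 5}]}}
values = extract_values(fake_response)
print(values[0])
```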
diff --git a/examples/custom_code/model_run.py b/examples/custom_code/model_run.py
@@ -1,6 +1,6 @@
 """This module loads created ebonite model and then creates and runs docker container made from this model"""
 
-import ebonite.runtime.server.http
+import ebonite
 from ebonite.build import build_model_docker, run_docker_img
 from ebonite.build.builder.base import use_local_installation
 from ebonite.core.objects.core import Model, Task

diff --git a/examples/extensions/myext/extension_source.py b/examples/extensions/myext/extension_source.py
@@ -1,4 +1,5 @@
 import contextlib
+import typing
 
 from ebonite.core.analyzer.model import ModelHook
 from ebonite.core.objects.artifacts import Blobs
@@ -11,14 +12,18 @@ class MyModelWrapper(ModelWrapper):
 
     @contextlib.contextmanager
     @ModelWrapper.with_model
-    def dump(self) -> FilesContextManager:
+    def _dump(self) -> FilesContextManager:
         yield Blobs({})
 
-    def load(self, path):
+    def _load(self, path):
         return self.bind_model('ahaha')
 
-    @ModelWrapper.with_model
-    def predict(self, data):
+    def _exposed_methods_mapping(self) -> typing.Dict[str, typing.Optional[str]]:
+        return {
+            'predict': '_predict'
+        }
+
+    def _predict(self, data):
         return data + 1
 
 
@@ -29,5 +34,5 @@ def can_process(self, obj) -> bool:
     def must_process(self, obj) -> bool:
         return True
 
-    def process(self, obj) -> ModelWrapper:
-        return MyModelWrapper().bind_model(obj)
+    def process(self, obj, **kwargs) -> ModelWrapper:
+        return MyModelWrapper().bind_model(obj, **kwargs)
diff --git a/examples/remote_example/model_run.py b/examples/remote_example/model_run.py
@@ -1,5 +1,5 @@
 from ebonite import Ebonite
-from ebonite.runtime.debug import run_test_model_server
+from ebonite.runtime.helpers import run_model_server
 
 
 def main():
@@ -8,7 +8,7 @@ def main():
     model = ebnt.get_model('add_one_model', 'my_task', 'my_project')
 
     # run flask service with this model
-    run_test_model_server(model)
+    run_model_server(model)
     # now you can use client.py to call this service or go to http://localhost:9000/apidocs to view swagger ui
 
 

diff --git a/examples/sklearn_model/client.py b/examples/sklearn_model/client.py
@@ -23,6 +23,10 @@ def main():
     r.raise_for_status()
     print(r.json()['data'][0])
 
+    r = requests.post('http://localhost:9000/predict_proba', json=payload)
+    r.raise_for_status()
+    print(r.json()['data'][0])
+
 
 if __name__ == '__main__':
     main()
diff --git a/examples/sklearn_model/model_train.py b/examples/sklearn_model/model_train.py
@@ -2,16 +2,16 @@
 from typing import Tuple
 
 import pandas as pd
-from sklearn.linear_model import LinearRegression
+from sklearn.linear_model import LogisticRegression
 
 import ebonite
 
 
-def train_model() -> Tuple[LinearRegression, pd.DataFrame]:
+def train_model() -> Tuple[LogisticRegression, pd.DataFrame]:
     """This function simulates general data scientist's work.
     It produces trained model and data sample for this model."""
-    reg = LinearRegression()
-    data = pd.DataFrame([[1, 1], [2, 1]], columns=['a', 'b'])
+    reg = LogisticRegression()
+    data = pd.DataFrame([[1, 0], [0, 1]], columns=['a', 'b'])
     reg.fit(data, [1, 0])
     return reg, data
 

diff --git a/requirements.txt b/requirements.txt
@@ -4,7 +4,7 @@ requests==2.22.0
 dill==0.3.1.1
 GitPython==3.0.3
 isort==4.3.21
-pyjackson==0.0.21
+pyjackson==0.0.23
 everett==1.0.2
 Jinja2==2.10.1
 # click

diff --git a/src/ebonite/build/builder/docker_builder.py b/src/ebonite/build/builder/docker_builder.py
@@ -1,5 +1,4 @@
 import os
-import subprocess
 import tempfile
 from contextlib import contextmanager
 from threading import Lock
@@ -27,11 +26,10 @@ def is_docker_running():
     :return: true or false
     """
     try:
-        subprocess.check_output('which docker', shell=True)
         with create_docker_client() as client:
             client.images.list()
         return True
-    except (subprocess.CalledProcessError, ImportError, requests.exceptions.ConnectionError):
+    except (ImportError, requests.exceptions.ConnectionError, docker.errors.DockerException):
         return False
 
 

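The hunk above drops the shell call to `which docker` and relies on the Docker client alone, which also works on hosts without a `which` command. A rough standalone equivalent is sketched below. It assumes the third-party Docker SDK for Python and uses `docker.from_env()` in place of Ebonite's own `create_docker_client()` helper, so it is an approximation and not the project's implementation; `is_docker_available` is a name chosen for this sketch.

```python
def is_docker_available() -> bool:
    """Return True only if the Docker SDK is importable and the daemon answers."""
    try:
        import docker  # third-party Docker SDK for Python; may be absent
    except ImportError:
        return False
    try:
        client = docker.from_env()
        try:
            return bool(client.ping())
        finally:
            client.close()
    except (docker.errors.DockerException, OSError):
        # OSError also covers requests' connection errors, which derive from IOError
        return False


available = is_docker_available()
print(available)
```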
diff --git a/src/ebonite/core/analyzer/base.py b/src/ebonite/core/analyzer/base.py
@@ -37,11 +37,12 @@ def must_process(self, obj) -> bool:
         pass
 
     @abstractmethod
-    def process(self, obj):
+    def process(self, obj, **kwargs):
         """
         Analyzes obj and returns result. Result type is determined by specific Hook class sub-hierarchy
 
         :param obj: object to analyze
+        :param kwargs: additional information to be used for analysis
         :return: analysis result
         """
         pass
@@ -148,18 +149,19 @@ class Analyzer:
         hooks: List[hook_type] = []
 
         @classmethod
-        def analyze(cls, obj) -> return_type:
+        def analyze(cls, obj, **kwargs) -> return_type:
             f"""
             Run {hook_type.__name__} hooks to analyze obj
 
             :param obj: objects to analyze
+            :param kwargs: additional information to be used for analysis
             :return: Instance of {return_type.__name__}
             """
             hooks = []
             for hook in cls.hooks:
                 if hook.must_process(obj):
                     logger.debug('processing class %s with %s', type(obj).__name__, hook.__class__.__name__)
-                    return hook.process(obj)
+                    return hook.process(obj, **kwargs)
                 elif hook.can_process(obj):
                     hooks.append(hook)
 
@@ -170,7 +172,7 @@ def analyze(cls, obj) -> return_type:
             elif len(hooks) > 1:
                 raise ValueError(f'Multiple suitable hooks for object {obj} ({hooks})')
 
-            return hooks[0].process(obj)
+            return hooks[0].process(obj, **kwargs)
 
     Analyzer.__name__ = '{}Analyzer'.format(hook_type.__name__)
     setattr(hook_type, ANALYZER_FIELD, Analyzer)

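The dispatch logic touched above (a hook that `must_process` wins immediately, otherwise exactly one hook that `can_process` is required, and `**kwargs` are forwarded to `process`) can be sketched without Ebonite. The hook classes below are toy examples, and the error raised when no hook matches is an assumption, since that branch is not visible in the hunk.

```python
from typing import Any, List


class StrHook:
    """Toy hook: may handle strings, never insists."""

    def can_process(self, obj: Any) -> bool:
        return isinstance(obj, str)

    def must_process(self, obj: Any) -> bool:
        return False

    def process(self, obj: Any, **kwargs: Any) -> tuple:
        return ('str', obj, kwargs)


class ListHook:
    """Toy hook: insists on handling lists."""

    def can_process(self, obj: Any) -> bool:
        return isinstance(obj, list)

    def must_process(self, obj: Any) -> bool:
        return isinstance(obj, list)

    def process(self, obj: Any, **kwargs: Any) -> tuple:
        return ('list', obj, kwargs)


def analyze(hooks: List[Any], obj: Any, **kwargs: Any) -> Any:
    candidates = []
    for hook in hooks:
        if hook.must_process(obj):
            return hook.process(obj, **kwargs)  # mandatory hook short-circuits
        if hook.can_process(obj):
            candidates.append(hook)
    if not candidates:
        # assumed behaviour; the corresponding branch is elided in the diff
        raise ValueError(f'No suitable hook for object {obj!r}')
    if len(candidates) > 1:
        raise ValueError(f'Multiple suitable hooks for object {obj!r}')
    return candidates[0].process(obj, **kwargs)


hooks = [StrHook(), ListHook()]
print(analyze(hooks, 'abc', tag='x'))
print(analyze(hooks, [1, 2]))
```

Forwarding `**kwargs` lets a caller pass extra context to whichever hook ends up handling the object, without the dispatcher needing to know what that context means.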
diff --git a/src/ebonite/core/analyzer/dataset.py b/src/ebonite/core/analyzer/dataset.py
@@ -11,7 +11,14 @@ class DatasetHook(Hook):
     Analysis result is an instance of :class:`~ebonite.core.objects.DatasetType`
     """
     @abstractmethod
-    def process(self, obj) -> DatasetType:
+    def process(self, obj, **kwargs) -> DatasetType:
+        """
+        Analyzes obj and returns result. Result type is determined by specific Hook class sub-hierarchy
+
+        :param obj: object to analyze
+        :param kwargs: additional information to be used for analysis
+        :return: analysis result
+        """
         pass
 
 
@@ -29,7 +36,7 @@ def can_process(self, obj):
     def must_process(self, obj):
         return False
 
-    def process(self, obj) -> DatasetType:
+    def process(self, obj, **kwargs) -> DatasetType:
         return PrimitiveDatasetType(type(obj).__name__)
 
 
@@ -43,7 +50,7 @@ def can_process(self, obj) -> bool:
     def must_process(self, obj) -> bool:
         return False
 
-    def process(self, obj) -> DatasetType:
+    def process(self, obj, **kwargs) -> DatasetType:
         return ListDatasetType([DatasetAnalyzer.analyze(o) for o in obj])
 
 
@@ -57,7 +64,7 @@ def can_process(self, obj) -> bool:
     def must_process(self, obj) -> bool:
         return False
 
-    def process(self, obj) -> DatasetType:
+    def process(self, obj, **kwargs) -> DatasetType:
         try:
             items = {k: DatasetAnalyzer.analyze(o) for k, o in obj.items()}
         except ValueError:
@@ -69,7 +76,7 @@ class FilelikeDatasetHook(DatasetHook):
     """
     Hook for file-like objects
     """
-    def process(self, obj) -> DatasetType:
+    def process(self, obj, **kwargs) -> DatasetType:
         return FilelikeDatasetType()
 
     def can_process(self, obj) -> bool:

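The list and dict hooks above analyze containers by recursively analyzing their elements. A minimal sketch of that recursion, using plain Python structures in place of Ebonite's `DatasetType` classes (the `describe` function and its output format are made up for illustration):

```python
from typing import Any


def describe(obj: Any) -> Any:
    """Recursively describe the structure of nested lists and dicts."""
    if isinstance(obj, dict):
        return {'dict': {key: describe(value) for key, value in obj.items()}}
    if isinstance(obj, list):
        return {'list': [describe(item) for item in obj]}
    # anything else is treated as a primitive and described by its type name
    return type(obj).__name__


description = describe({'a': [1, 2.0], 'b': 'x'})
print(description)
```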
diff --git a/src/ebonite/core/analyzer/model.py b/src/ebonite/core/analyzer/model.py
@@ -13,7 +13,7 @@ class ModelHook(Hook):
     """
 
     @abstractmethod
-    def process(self, obj) -> ModelWrapper:
+    def process(self, obj, **kwargs) -> ModelWrapper:
         pass
 
     def get_requirements(self, obj) -> Requirements:
@@ -27,8 +27,8 @@ class CallableMethodModelHook(ModelHook):
     """
     Hook for processing functions
     """
-    def process(self, obj) -> ModelWrapper:
-        return CallableMethodModelWrapper().bind_model(obj)
+    def process(self, obj, **kwargs) -> ModelWrapper:
+        return CallableMethodModelWrapper().bind_model(obj, **kwargs)
 
     def can_process(self, obj) -> bool:
         return callable(obj)