
Develop #127

Merged
merged 23 commits on May 7, 2020
Commits (23)
9f5ef37
Update example on how to use Dask to do grid search
dnouri Aug 29, 2019
a2b5760
Fix Dask hyperparam search example to work with latest sklearn&dask
dnouri Sep 13, 2019
a69e18a
Merge pull request #117 from ottogroup/bugfix/docs-parallelism
alattner Sep 15, 2019
fd9d57c
Merge pull request #116 from ottogroup/features/docs-dask-hyperparameter
alattner Sep 15, 2019
e3e931c
Back merge master
alattner Sep 30, 2019
336466b
Bumped version
alattner Sep 30, 2019
2a03acb
Remove julia and rpy2 from docs extra requirements
dnouri Nov 5, 2019
28fa35d
Merge pull request #120 from ottogroup/bugfix/travis-docs-require
dnouri Nov 5, 2019
60a7ded
Proposal implementation for handling model attachments
dnouri Oct 21, 2019
9207271
Examples on how to use Keras and XGBoost with Palladium
dnouri Nov 6, 2019
22939c5
In configuration, use exclamation mark '!' instead of '__factory__'
dnouri Nov 6, 2019
6a363c8
Merge pull request #122 from ottogroup/feature/config-exclamation-mar…
alattner Nov 20, 2019
1be16ce
Merge pull request #121 from ottogroup/feature/keras-and-xgboost-exam…
alattner Nov 20, 2019
a4efa8a
Merge pull request #119 from ottogroup/feature/model-attachments
alattner May 6, 2020
ed6842d
avoid loading stale metadata in S3 persister
yversley-ottogroup May 6, 2020
312d341
Merge pull request #125 from yv/issue/fix-invalid-json
alattner May 6, 2020
41fd2a2
Added path to error message when no active model can be found
alattner May 6, 2020
f29a3ce
Bump psutil version to 5.7.0 to fix security vulnerability
alattner May 6, 2020
0dfcb12
Fixed no active model exists test
alattner May 6, 2020
e235059
Updated dependencies
alattner May 6, 2020
304dad3
Fixed flaky tests
alattner May 6, 2020
8ddf698
Bumped version
alattner May 7, 2020
99bf061
Merge pull request #126 from ottogroup/feature/update-dependencies
alattner May 7, 2020
2 changes: 1 addition & 1 deletion .travis.yml
@@ -8,7 +8,7 @@ matrix:
dist: xenial
sudo: true
env:
- TRAVIS=yes
- TRAVIS=yes AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test
before_install:
- pip install -U pip && pip --version
- wget https://repo.anaconda.com/miniconda/Miniconda3-4.6.14-Linux-x86_64.sh -O miniconda.sh
17 changes: 17 additions & 0 deletions CHANGES.txt
@@ -1,3 +1,20 @@
v1.2.3 - 2020-05-07
===================

- Updated requirements in order to use newer versions of dependencies
(also fixing potential security vulnerabilities in dependencies)

- Added support for handling model attachments

- Exclamation mark `!` can now be used instead of `__factory__` in
configuration files

v1.2.2.1 - 2019-09-30
=====================

- Added AWS S3 persister


v1.2.2 - 2019-08-15
===================

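
To make the changelog entry about the ``!`` shorthand concrete, here is a minimal sketch; the ``factory_of`` helper is hypothetical and added only for illustration. Both spellings name the factory to instantiate, and the remaining keys become its keyword arguments:

```python
# Palladium config sketch: the '!' key replaces '__factory__'.  Both
# name the factory dotted path; the other keys are keyword arguments.
old_style = {
    'dataset_loader_train': {
        '__factory__': 'palladium.dataset.CSV',
        'path': 'iris.data',
    },
}
new_style = {
    'dataset_loader_train': {
        '!': 'palladium.dataset.CSV',
        'path': 'iris.data',
    },
}

def factory_of(entry):
    """Return the factory dotted name, accepting either spelling."""
    return entry.get('!', entry.get('__factory__'))

assert factory_of(old_style['dataset_loader_train']) == 'palladium.dataset.CSV'
assert factory_of(new_style['dataset_loader_train']) == 'palladium.dataset.CSV'
```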
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.2.2
1.2.3
14 changes: 7 additions & 7 deletions docs/user/R.rst
@@ -42,7 +42,7 @@ above:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.R.DatasetLoader',
'!': 'palladium.R.DatasetLoader',
'scriptname': 'iris.R',
'funcname': 'dataset',
},
@@ -54,7 +54,7 @@ R classification models are configured very similarly, using
.. code-block:: python

'model': {
'__factory__': 'palladium.R.ClassificationModel',
'!': 'palladium.R.ClassificationModel',
'scriptname': 'iris.R',
'funcname': 'train.randomForest',
'encode_labels': True,
@@ -99,7 +99,7 @@ Here's how this would look like:
.. code-block:: python

'model': {
'__factory__': 'palladium.R.RegressionModel',
'!': 'palladium.R.RegressionModel',
'scriptname': 'tooth.R',
'funcname': 'train.randomForest',
},
@@ -110,7 +110,7 @@ classification above:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.R.DatasetLoader',
'!': 'palladium.R.DatasetLoader',
'scriptname': 'tooth.R',
'funcname': 'dataset',
},
@@ -129,13 +129,13 @@ including :class:`~palladium.R.Rpy2Transform` in a
.. code-block:: python

'model': {
'__factory__': 'sklearn.pipeline.Pipeline',
'!': 'sklearn.pipeline.Pipeline',
'steps': [
['rpy2', {
'__factory__': 'palladium.R.Rpy2Transform',
'!': 'palladium.R.Rpy2Transform',
}],
['regressor', {
'__factory__': 'palladium.R.RegressionModel',
'!': 'palladium.R.RegressionModel',
'scriptname': 'tooth.R',
'funcname': 'train.randomForest',
}],
10 changes: 5 additions & 5 deletions docs/user/configuration.rst
@@ -31,7 +31,7 @@ you to pass in things like database credentials from the environment:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.SQL',
'!': 'palladium.dataset.SQL',
'url': 'mysql://{}:{}@localhost/test?encoding=utf8'.format(
environ['DB_USER'], environ['DB_PASS'],
),
@@ -46,7 +46,7 @@ folder as the configuration:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': '{}/data.csv'.format(here),
}

@@ -80,15 +80,15 @@ file:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': '{}/train.csv'.format(here),
'many': '...',
'more': {'...'},
'entries': ['...'],
}

'dataset_loader_test': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': '{}/test.csv'.format(here),
'many': '...',
'more': {'...'},
@@ -100,7 +100,7 @@ With ``__copy__``, you can reduce this down to:
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': '{}/train.csv'.format(here),
'many': '...',
'more': {'...'},
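
As a rough illustration of the ``__copy__`` mechanism in the hunk above, the ``resolve_copy`` helper below is hypothetical and only approximates the documented behavior (copy another config section, then override individual keys):

```python
# Assumed semantics of '__copy__' (illustrative only): the entry copies
# another config section, and any additional keys override the copied
# values -- so dataset_loader_test need only restate what differs.
config = {
    'dataset_loader_train': {
        '!': 'palladium.dataset.CSV',
        'path': 'train.csv',
        'many': '...',
    },
}

def resolve_copy(config, source_key, **overrides):
    entry = dict(config[source_key])  # shallow copy of the referenced section
    entry.update(overrides)           # keys given alongside __copy__ win
    return entry

test_loader = resolve_copy(config, 'dataset_loader_train', path='test.csv')
assert test_loader['path'] == 'test.csv'
assert test_loader['!'] == 'palladium.dataset.CSV'
```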
2 changes: 1 addition & 1 deletion docs/user/deployment.rst
@@ -421,7 +421,7 @@ startup:
.. code-block:: python

'oauth_init_app': {
'__factory__': 'myoauth.oauth.init_app',
'!': 'myoauth.oauth.init_app',
'app': 'palladium.server.app',
},

37 changes: 18 additions & 19 deletions docs/user/faq.rst
@@ -156,7 +156,7 @@ passed at runtime.
'C': [0.1, 0.3, 1.0],
},
'cv': {
'__factory__': 'palladium.util.Partial',
'!': 'palladium.util.Partial',
'func': 'sklearn.cross_validation.StratifiedKFold',
'random_state': 0,
},
@@ -177,16 +177,16 @@ classifier:
.. code-block:: python

'grid_search': {
'__factory__': 'skopt.BayesSearchCV',
'!': 'skopt.BayesSearchCV',
'estimator': {'__copy__': 'model'},
'n_iter': 16,
'search_spaces': {
'C': {
'__factory__': 'skopt.space.Real',
'!': 'skopt.space.Real',
'low': 1e-6, 'high': 1e+1, 'prior': 'log-uniform',
},
'degree': {
'__factory__': 'skopt.space.Integer',
'!': 'skopt.space.Integer',
'low': 1, 'high': 20,
},
},
@@ -208,35 +208,34 @@ grid search:

.. code-block:: python

{
'grid_search': {
'__factory__': 'palladium.fit.with_parallel_backend',
'!': 'palladium.fit.with_parallel_backend',
'estimator': {
'__factory__': 'sklearn.model_selection.GridSearchCV',
'!': 'sklearn.model_selection.GridSearchCV',
'estimator': {'__copy__': 'model'},
'param_grid': {
'C': [0.1, 0.3, 1.0],
},
'n_jobs': -1,
'param_grid': {'__copy__': 'grid_search.param_grid'},
'scoring': {'__copy__': 'scoring'},
},
'backend': 'dask.distributed',
'scheduler_host': '127.0.0.1:8786',
'backend': 'dask',
},

'_init_distributed': {
'__factory__': 'palladium.util.resolve_dotted_name',
'dotted_name': 'distributed.joblib.joblib',
'_init_client': {
'!': 'dask.distributed.Client',
'address': '127.0.0.1:8786',
},
}

To start up the Dask scheduler and workers you can follow the
dask.distributed documentation. Here's an example that runs three
workers locally:
For details on how to set up Dask workers and a scheduler, please
consult the `Dask docs <https://docs.dask.org>`_. But here's how you
would start up a scheduler and three workers locally:

.. code-block:: bash

$ dask-scheduler
Scheduler started at 127.0.0.1:8786

$ dask-worker 127.0.0.1:8786
$ dask-worker 127.0.0.1:8786 # start each in a new terminal
$ dask-worker 127.0.0.1:8786
$ dask-worker 127.0.0.1:8786
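
The ``with_parallel_backend`` wrapper in the config above can be pictured roughly as follows. This is a pure-Python sketch with assumed names (``WithParallelBackend`` and the stand-in ``parallel_backend`` context manager are illustrative only); Palladium's actual implementation delegates to joblib's backend machinery:

```python
from contextlib import contextmanager

entered = []

@contextmanager
def parallel_backend(name):
    # Stand-in for joblib.parallel_backend('dask'): record the
    # activation instead of actually switching joblib backends.
    entered.append(name)
    try:
        yield
    finally:
        entered.remove(name)

class WithParallelBackend:
    """Run the wrapped estimator's fit inside a parallel-backend context."""
    def __init__(self, estimator, backend):
        self.estimator = estimator
        self.backend = backend

    def fit(self, X, y):
        with parallel_backend(self.backend):
            return self.estimator.fit(X, y)

class DummySearch:
    """Minimal stand-in for an estimator such as GridSearchCV."""
    def fit(self, X, y):
        # A real search would fan its candidate fits out to the
        # currently active backend (here: the Dask cluster).
        self.backend_seen_ = list(entered)
        return self

wrapped = WithParallelBackend(DummySearch(), 'dask')
assert wrapped.fit([[0.0]], [0]).backend_seen_ == ['dask']
```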

2 changes: 1 addition & 1 deletion docs/user/julia.rst
@@ -44,7 +44,7 @@ configuration in that example defines the model to be of type
.. code-block:: python

'model': {
'__factory__': 'palladium.julia.ClassificationModel',
'!': 'palladium.julia.ClassificationModel',
'fit_func': 'SVM.svm',
'predict_func': 'SVM.predict',
}
32 changes: 16 additions & 16 deletions docs/user/tutorial.rst
@@ -149,7 +149,7 @@ defines the type of dataset loader we want to use. That is
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',

The rest of what is inside the ``dataset_loader_train`` are the
keyword arguments that are used to initialize the
@@ -159,7 +159,7 @@
.. code-block:: python

'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': 'iris.data',
'names': [
'sepal length',
@@ -232,7 +232,7 @@ scikit-learn:
.. code-block:: python

'model': {
'__factory__': 'sklearn.linear_model.LogisticRegression',
'!': 'sklearn.linear_model.LogisticRegression',
'C': 0.3,
},

@@ -367,10 +367,10 @@ part of the configuration:
.. code-block:: python

'model_persister': {
'__factory__': 'palladium.persistence.CachedUpdatePersister',
'!': 'palladium.persistence.CachedUpdatePersister',
'update_cache_rrule': {'freq': 'HOURLY'},
'impl': {
'__factory__': 'palladium.persistence.Database',
'!': 'palladium.persistence.Database',
'url': 'sqlite:///iris-model.db',
},
},
@@ -407,9 +407,9 @@ model's version:
.. code-block:: python

'model_persister': {
'__factory__': 'palladium.persistence.CachedUpdatePersister',
'!': 'palladium.persistence.CachedUpdatePersister',
'impl': {
'__factory__': 'palladium.persistence.File',
'!': 'palladium.persistence.File',
'path': 'model-{version}.pickle',
},
},
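
As a hypothetical illustration of the ``path`` template in the ``File`` persister config above (the actual ``palladium.persistence.File`` formatting may differ in its details), the ``{version}`` placeholder is filled in per model:

```python
# Illustrative only: how a file-based persister could expand the
# 'path' template from the config above for each model version.
path_template = 'model-{version}.pickle'

def model_path(version):
    return path_template.format(version=version)

assert model_path(1) == 'model-1.pickle'
assert model_path(42) == 'model-42.pickle'
```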
@@ -420,9 +420,9 @@ models, you can use the RestPersister:
.. code-block:: python

'model_persister': {
'__factory__': 'palladium.persistence.CachedUpdatePersister',
'!': 'palladium.persistence.CachedUpdatePersister',
'impl': {
'__factory__': 'palladium.persistence.Rest',
'!': 'palladium.persistence.Rest',
'url': 'http://localhost:8081/artifactory/modelz/{version}',
'auth': ('username', 'passw0rd'),
},
@@ -440,7 +440,7 @@ endpoint. Let us take a look at how it is configured:
.. code-block:: python

'predict_service': {
'__factory__': 'palladium.server.PredictService',
'!': 'palladium.server.PredictService',
'mapping': [
('sepal length', 'float'),
('sepal width', 'float'),
Expand All @@ -450,7 +450,7 @@ endpoint. Let us take a look at how it is configured:
}
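
The ``mapping`` list in the config above can be sketched as follows — how request parameters might be read and coerced to their declared types. The ``coerce`` helper is hypothetical, not Palladium's API:

```python
# Sketch (assumed semantics): 'mapping' tells the predict endpoint
# which request parameters to read and which type to coerce them to.
mapping = [
    ('sepal length', 'float'),
    ('sepal width', 'float'),
]
casts = {'float': float, 'int': int, 'str': str}

def coerce(params, mapping):
    """Coerce raw request parameters according to the mapping."""
    return [casts[typ](params[name]) for name, typ in mapping]

request_args = {'sepal length': '5.2', 'sepal width': '3.5'}
assert coerce(request_args, mapping) == [5.2, 3.5]
```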

Again, the specific implementation of the ``predict_service`` that we
use is specified through the ``__factory__`` setting.
use is specified through the ``!`` setting.

The ``mapping`` defines which request parameters are to be expected.
In this example, we expect a ``float`` number for each of ``sepal
@@ -522,7 +522,7 @@ different entry points:
.. code-block:: python

'predict_service1': {
'__factory__': 'mypackage.server.PredictService',
'!': 'mypackage.server.PredictService',
'mapping': [
('sepal length', 'float'),
('sepal width', 'float'),
Expand All @@ -533,7 +533,7 @@ different entry points:
'decorator_list_name': 'predict_decorators',
}
'predict_service2': {
'__factory__': 'mypackage.server.PredictServiceID',
'!': 'mypackage.server.PredictServiceID',
'mapping': [
('id', 'int'),
],
@@ -590,7 +590,7 @@ entry in ``config.py`` to look like this:
.. code-block:: python

'model': {
'__factory__': 'iris.model',
'!': 'iris.model',
'clf__C': 0.3,
},

@@ -609,7 +609,7 @@ configuration file, e.g.:
.. code-block:: python

'model': {
'__factory__': 'sklearn.pipeline.Pipeline',
'steps': [['clf', {'__factory__': 'sklearn.linear_model.LinearRegression'}],
'!': 'sklearn.pipeline.Pipeline',
'steps': [['clf', {'!': 'sklearn.linear_model.LinearRegression'}],
],
},
8 changes: 4 additions & 4 deletions docs/user/web-service.rst
@@ -31,7 +31,7 @@ configuration from the :ref:`tutorial`:
.. code-block:: python

'predict_service': {
'__factory__': 'palladium.server.PredictService',
'!': 'palladium.server.PredictService',
'mapping': [
('sepal length', 'float'),
('sepal width', 'float'),
@@ -106,7 +106,7 @@ there's a list of predictions that's returned:

Should a different output format be desired than the one implemented
by :class:`~palladium.interfaces.PredictService`, it is possible to use a
different class altogether by setting an appropriate ``__factory__``
different class altogether by setting an appropriate ``!``
(though that class will likely derive from
:class:`~palladium.interfaces.PredictService` for reasons of convenience).

@@ -271,13 +271,13 @@ endpoints is this:

'flask_add_url_rules': [
{
'__factory__': 'palladium.server.add_url_rule',
'!': 'palladium.server.add_url_rule',
'rule': '/fit',
'view_func': 'palladium.server.fit',
'methods': ['POST'],
},
{
'__factory__': 'palladium.server.add_url_rule',
'!': 'palladium.server.add_url_rule',
'rule': '/update-model-cache',
'view_func': 'palladium.server.update_model_cache',
'methods': ['POST'],
2 changes: 1 addition & 1 deletion examples/R/config-iris-dataset-from-python.py
@@ -4,7 +4,7 @@

{
'dataset_loader_train': {
'__factory__': 'palladium.dataset.CSV',
'!': 'palladium.dataset.CSV',
'path': 'iris.data',
'names': [
'sepal length',