[Runtimes] Support code archives, enhance run cli and add CI/CD support and docs (#888)
yaronha committed May 19, 2021
1 parent 17fd21e commit a38b6af
Showing 20 changed files with 1,387 additions and 251 deletions.
Binary file added docs/_static/images/git-pipeline.png
90 changes: 90 additions & 0 deletions docs/ci-pipeline.md
@@ -0,0 +1,90 @@
# Integrating with CI Pipelines

Users may want to run their ML pipelines using CI frameworks like GitHub Actions, GitLab CI/CD, etc.
MLRun supports simple and native integration with CI systems. In the following example we combine
local code (from the repository) with MLRun marketplace functions to build an automated ML pipeline which:

* runs data preparation
* trains a model
* tests the trained model
* deploys the model into a cluster
* tests the deployed model

The pipeline uses the `RunNotifications` class to report tracking information to the Git dashboard (as PR comments) and/or to Slack.
Note that the same pipeline script can be executed locally (just comment out the `notifier.git_comment()` line or place it under an `if` condition).
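The `if` condition can key off environment variables that the CI runners set automatically (`GITHUB_ACTIONS` and `GITLAB_CI` are real variables set by GitHub Actions and GitLab CI, respectively). The `in_ci` helper below is our own sketch, not an MLRun API:

```python
import os

def in_ci(env=None) -> bool:
    """Return True when running under GitHub Actions or GitLab CI.
    Both runners set these variables automatically."""
    env = os.environ if env is None else env
    return env.get("GITHUB_ACTIONS") == "true" or env.get("GITLAB_CI") == "true"

# guard the Git notification so the same script also runs locally:
# if in_ci():
#     notifier.git_comment()
```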

```python
# MLRun CI Example
# ================
# this code can run in the IDE or inside a CI/CD script (GitHub Actions or GitLab CI/CD)
# and requires setting the following env vars (can be done in the CI system):
#
# MLRUN_DBPATH - URL of the MLRun cluster
# V3IO_USERNAME - username in the remote Iguazio cluster
# V3IO_ACCESS_KEY - access key to the remote Iguazio cluster
# GIT_TOKEN or GITHUB_TOKEN - GitHub/GitLab API token (set automatically in GitHub Actions)
# SLACK_WEBHOOK - optional, Slack webhook URL when using Slack notifications
#

import json
from mlrun.utils import RunNotifications
import mlrun
from mlrun.platforms import auto_mount

project = "ci"
mlrun.set_environment(project=project)

# create notification object (console, Git, Slack as outputs) and push start message
notifier = RunNotifications(with_slack=True).print()
# use the following line only when running inside Github actions or Gitlab CI
notifier.git_comment()

notifier.push_start_message(project)

# define and run a local data prep function
data_prep_func = mlrun.code_to_function("prep-data", filename="../scratch/prep_data.py", kind="job",
image="mlrun/mlrun", handler="prep_data").apply(auto_mount())

# Set the source-data URL
source_url = 'https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv'
prep_data_run = data_prep_func.run(name='prep_data', inputs={'source_url': source_url})

# train the model using a library (hub://) function and the generated data
train = mlrun.import_function('hub://sklearn_classifier').apply(auto_mount())
train_run = train.run(name='train',
inputs={'dataset': prep_data_run.outputs['cleaned_data']},
params={'model_pkg_class': 'sklearn.linear_model.LogisticRegression',
'label_column': 'label'})

# test the model using a library (hub://) function and the generated model
test = mlrun.import_function('hub://test_classifier').apply(auto_mount())
test_run = test.run(name="test",
params={"label_column": "label"},
inputs={"models_path": train_run.outputs['model'],
"test_set": train_run.outputs['test_set']})

# push results via notification to Git, Slack, ..
notifier.push_run_results([prep_data_run, train_run, test_run])

# Create model serving function using the new model
serve = mlrun.import_function('hub://v2_model_server').apply(auto_mount())
model_name = 'iris'
serve.add_model(model_name, model_path=train_run.outputs['model'])
addr = serve.deploy()

notifier.push(f"model {model_name} is deployed at {addr}")

# test the model serving function
inputs = [[5.1, 3.5, 1.4, 0.2],
[7.7, 3.8, 6.7, 2.2]]
my_data = json.dumps({'inputs': inputs})
serve.invoke(f'v2/models/{model_name}/infer', my_data)

notifier.push(f"model {model_name} test passed Ok")
```
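To wire a script like the one above into CI, a minimal GitHub Actions workflow could look like the following sketch. The file paths (`.github/workflows/ml-pipeline.yml`, `pipeline.py`), the Python version, and the secret names are assumptions for illustration, not part of this commit (only `GITHUB_TOKEN` is provided automatically by GitHub Actions):

```yaml
# .github/workflows/ml-pipeline.yml -- hypothetical workflow sketch
name: ml-pipeline
on: [pull_request]

jobs:
  run-pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - run: pip install mlrun
      - name: run the CI pipeline script
        env:
          MLRUN_DBPATH: ${{ secrets.MLRUN_DBPATH }}
          V3IO_USERNAME: ${{ secrets.V3IO_USERNAME }}
          V3IO_ACCESS_KEY: ${{ secrets.V3IO_ACCESS_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
        run: python pipeline.py
```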

**The results will appear in the CI system in the following way:**

<img src="./_static/images/git-pipeline.png" alt="git-pipeline" width="800"/><br>


2 changes: 1 addition & 1 deletion docs/hyper-params.ipynb
@@ -18,7 +18,7 @@
"\n",
"MLRun iterations can be viewed as child runs under the main task/run, each child run will get a set of parameters which will be computed/selected from the input hyper parameters based on the chosen strategy (Grid, List, Random or Custom).\n",
"\n",
"The hyper parameters and options are specified in the `task` or the `function.run()` command through the `hyperparams` (for hyper param values) and `hyper_param_options` (for {py:class}`~mlrun.model.HyperParamOptions`) properties, see examples below. hyper parameters can also be loaded directly from a CSV or Json file (by setting the `param_file` hyper option).\n",
"The hyper parameters and options are specified in the `task` or the {py:meth}`~mlrun.runtimes.BaseRuntime.run` command through the `hyperparams` (for hyper param values) and `hyper_param_options` (for {py:class}`~mlrun.model.HyperParamOptions`) properties, see examples below. hyper parameters can also be loaded directly from a CSV or Json file (by setting the `param_file` hyper option).\n",
"\n",
"The hyper params are specified as a struct of `key: list` values for example: `{\"p1\": [1,2,3], \"p2\": [10,20]}`, the values can be of any type (int, string, float, ..), the list are used to compute the parameter combinations using one of the following strategies: \n",
"1. Grid Search (`grid`) - running all the parameter combinations\n",
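The grid strategy described in the hyper-params passage above simply enumerates the cartesian product of the parameter lists. As an illustrative sketch in plain Python (not MLRun internals):

```python
import itertools

# the hyper params struct from the text: key -> list of candidate values
hyperparams = {"p1": [1, 2, 3], "p2": [10, 20]}

# grid search runs every combination of the lists (3 * 2 = 6 runs)
keys = list(hyperparams)
grid = [dict(zip(keys, combo))
        for combo in itertools.product(*hyperparams.values())]

print(len(grid))   # 6
print(grid[0])     # {'p1': 1, 'p2': 10}
```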
22 changes: 9 additions & 13 deletions docs/index.rst
@@ -59,11 +59,18 @@ Table Of Content

.. toctree::
:maxdepth: 1
:caption: ML Pipelines:
:caption: Functions and ML Pipelines:

job-submission-and-tracking
runtimes/functions
hyper-params
projects
ci-pipeline
load-from-marketplace

.. toctree::
:maxdepth: 1
:caption: Online Pipelines & Serving:

serving/index
model_monitoring/model-monitoring-deployment

@@ -77,17 +84,6 @@ Table Of Content
feature-store/basic-demo
feature-store/end-to-end-demo/index

.. toctree::
:maxdepth: 1
:caption: Serverless Runtimes:

runtimes/functions
runtimes/mlrun_jobs
runtimes/dask-overview
runtimes/horovod
runtimes/spark-operator
load-from-marketplace

.. toctree::
:maxdepth: 1
:caption: Artifact Management:

0 comments on commit a38b6af
