Commit 73602bd

Configuration for environment customization (#206)
1 parent 1dbb72c commit 73602bd

7 files changed: +146 -12 lines changed

.pipelines/azdo-abtest-pipeline.yml

+7 -1
@@ -1,4 +1,10 @@
 # Pipeline for the canary deployment workflow.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -31,7 +37,7 @@ stages:
     timeoutInMinutes: 0
     pool:
       vmImage: 'ubuntu-latest'
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     steps:
     - task: AzureCLI@1
       inputs:
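The pattern this commit applies to each pipeline file can be sketched as follows: the image is declared once under `resources.containers` with the alias `mlops`, and each job refers to that alias instead of repeating the image name. This is a minimal sketch, not a file from the repository; the job name and the script step are hypothetical placeholders.

```yaml
# Declare the job container once, under an alias.
resources:
  containers:
  - container: mlops
    image: mcr.microsoft.com/mlops/python:latest

jobs:
- job: Example_Job              # hypothetical job name
  pool:
    vmImage: 'ubuntu-latest'
  container: mlops              # refers to the alias declared above
  steps:
  - script: python --version    # hypothetical step; runs inside the container
```

With this layout, switching to a different image means editing only the `image:` line in the `resources` block.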

.pipelines/azdo-pr-build-train.yml

+8 -2
@@ -1,4 +1,10 @@
 # Pipeline to run basic code quality tests as part of pull requests to the master branch.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 trigger: none
 pr:
   branches:
@@ -8,11 +14,11 @@ pr:
 pool:
   vmImage: 'ubuntu-latest'
 
-container: mcr.microsoft.com/mlops/python:latest
+container: mlops
 
 variables:
 - template: diabetes_regression-variables.yml
 - group: devopsforai-aml-vg
 
 steps:
-- template: azdo-base-pipeline.yml
+- template: azdo-base-pipeline.yml

.pipelines/diabetes_regression-ci-build-train.yml

+12 -6
@@ -1,4 +1,10 @@
 # Continuous Integration (CI) pipeline that orchestrates the training, evaluation, registration, deployment, and testing of the diabetes_regression model.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -25,7 +31,7 @@ stages:
   jobs:
   - job: "Model_CI_Pipeline"
     displayName: "Model CI Pipeline"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: azdo-base-pipeline.yml
@@ -48,7 +54,7 @@ stages:
   - job: "Get_Pipeline_ID"
     condition: and(succeeded(), eq(coalesce(variables['auto-trigger-training'], 'true'), 'true'))
     displayName: "Get Pipeline ID for execution"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - task: AzureCLI@1
@@ -84,7 +90,7 @@ stages:
     dependsOn: "Run_ML_Pipeline"
     condition: always()
     displayName: "Determine if evaluation succeeded and new model is registered"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -96,7 +102,7 @@ stages:
   jobs:
   - job: "Deploy_ACI"
     displayName: "Deploy to ACI"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -129,7 +135,7 @@ stages:
   jobs:
   - job: "Deploy_AKS"
     displayName: "Deploy to AKS"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -163,7 +169,7 @@ stages:
   jobs:
   - job: "Deploy_Webapp"
     displayName: "Deploy to Webapp"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml

.pipelines/diabetes_regression-ci-image.yml

+7 -1
@@ -1,4 +1,10 @@
 # Builds the container image that is used by other pipelines for scoring.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -16,7 +22,7 @@ trigger:
 pool:
   vmImage: 'ubuntu-latest'
 
-container: mcr.microsoft.com/mlops/python:latest
+container: mlops
 
 variables:
 - group: devopsforai-aml-vg

bootstrap/README.md

+10
@@ -2,6 +2,8 @@
 
 To use this existing project structure and scripts for your new ML project, you can quickly get started from the existing repository, bootstrap and create a template that works for your ML project. Bootstrapping will prepare a similar directory structure for your project which includes renaming files and folders, deleting and cleaning up some directories and fixing imports and absolute path based on your project name. This will enable reusing various resources like pre-built pipelines and scripts for your new project.
 
+## Generating the project structure
+
 To bootstrap from the existing MLOpsPython repository clone this repository, ensure Python is installed locally, and run bootstrap.py script as below
 
 `python bootstrap.py --d [dirpath] --n [projectname]`
@@ -11,3 +13,11 @@ Where `[dirpath]` is the absolute path to the root of your directory where MLOps
 The script renames folders, files and files' content from the base project name `diabetes` to your project name. However, you might need to manually rename variables defined in a variable group and their values.
 
 [This article](https://docs.microsoft.com/azure/machine-learning/tutorial-convert-ml-experiment-to-production#use-your-own-model-with-mlopspython-code-template) will also assist to use this code template for your own ML project.
+
+## Customizing the CI and AML environments
+
+In your project you will want to customize your own Docker image and Conda environment to use only the dependencies and tools required for your use case. This requires you to edit the following environment definition files:
+- The Azure ML training and scoring Conda environment defined in [conda_dependencies.yml](diabetes_regression/conda_dependencies.yml).
+- The CI Docker image and Conda environment used by the Azure DevOps build agent. See [instructions for customizing the Azure DevOps job container](../docs/custom_container.md).
+
+You will want to synchronize dependency versions as appropriate between both environment definitions (for example, ML libraries used both in training and in unit tests).

docs/custom_container.md

+99
@@ -0,0 +1,99 @@
+# Customizing the Azure DevOps job container
+
+The model training and deployment pipeline uses a Docker container
+on the Azure Pipelines agents to provide a reproducible environment
+to run test and deployment code.
+The image of the container
+`mcr.microsoft.com/mlops/python:latest` is built with this
+[Dockerfile](../environment_setup/Dockerfile).
+
+In your project you will want to build your own
+Docker image that only contains the dependencies and tools required for your
+use case. This image will most likely be smaller and therefore faster, and it
+will be fully maintained by your team.
+
+## Provision an Azure Container Registry
+
+An Azure Container Registry is deployed alongside your Azure ML Workspace to manage models.
+You can use that registry instance to store your MLOps container image as well, or
+provision a separate instance.
+
+## Create a Registry Service Connection
+
+[Create a service connection](https://docs.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints?view=azure-devops&tabs=yaml#sep-docreg) to your Azure Container Registry:
+- As *Connection type*, select *Docker Registry*
+- As *Registry type*, select *Azure Container Registry*
+- As *Azure container registry*, select your Container registry instance
+- As *Service connection name*, enter `acrconnection`
+
+## Update the environment definition
+
+Modify the [Dockerfile](../environment_setup/Dockerfile) and/or the
+[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) CI Conda
+environment definition to tailor your environment.
+Conda provides a [reusable environment for training and deployment with Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments).
+The Conda environment used for CI should use the same package versions as the Conda environment
+used for the Azure ML training and scoring environments (defined in [conda_dependencies.yml](../diabetes_regression/conda_dependencies.yml)).
+This enables you to run unit and integration tests using the exact same dependencies as used in the ML pipeline.
+
+If a package is available in a Conda package repository, then we recommend that
+you use the Conda installation rather than the pip installation. Conda packages
+typically come with prebuilt binaries that make installation more reliable.
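The Conda-first recommendation above might look like the following in a Conda environment file. This is an illustrative sketch only: the environment name, package selection, and version numbers are assumptions and are not taken from the repository's `ci_dependencies.yml`.

```yaml
# Hypothetical CI environment definition; names and versions are illustrative.
name: example_ci_env
dependencies:
  - python=3.7.5
  - scikit-learn=0.22.1   # installed from Conda: ships prebuilt binaries
  - pip
  - pip:
    - azureml-sdk==1.1.5  # assumed here to be installed with pip
```

Pinning the same versions in both the CI and the training environment files keeps unit tests and the ML pipeline on identical dependencies.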
+
+## Create a container build pipeline
+
+In your [Azure DevOps](https://dev.azure.com) project create a new build
+pipeline referring to the
+[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml)
+pipeline definition in your forked repository.
+
+Edit the [environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) file
+and replace the string `'public/mlops/python'` with a name suitable to describe your environment,
+e.g. `'mlops/diabetes_regression'`.
+
+Save and run the pipeline. This will build and push a container image to your Azure Container Registry with
+the name you have just edited. The next step is to modify the build pipeline to run the CI job on a container
+run from that image.
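For orientation, a container build pipeline of this kind could be sketched with the Azure Pipelines `Docker@2` task as below. This is not the content of `environment_setup/docker-image-pipeline.yml`, which this commit does not show; the build context, tag, and display name are assumptions.

```yaml
# Hypothetical sketch of a build-and-push pipeline; not the repository's file.
pool:
  vmImage: 'ubuntu-latest'

steps:
- task: Docker@2
  displayName: Build and push the CI image   # assumed display name
  inputs:
    command: buildAndPush
    containerRegistry: acrconnection          # service connection created earlier
    repository: mlops/diabetes_regression     # the image name you chose
    Dockerfile: environment_setup/Dockerfile
    buildContext: .                           # assumption: repository root
    tags: latest                              # assumption: a single tag
```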
+
+## Modify the model pipeline
+
+Modify the model pipeline file [diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml) by replacing this section:
+
+```
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+```
+
+with (using the image name previously defined):
+
+```
+resources:
+  containers:
+  - container: mlops
+    image: mlops/diabetes_regression
+    endpoint: acrconnection
+```
+
+Run the pipeline and verify that it used your container.
+
+## Addressing conflicting dependencies
+
+Especially when working in a team, it's possible for environment changes across branches to interfere with one another.
+
+For example, if the master branch is using scikit-learn and you create a branch to use TensorFlow instead, and you
+decide to remove scikit-learn from the
+[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) Conda environment definition
+and run the [docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) Docker image pipeline,
+then the master branch will stop building.
+
+You could leave scikit-learn in addition to TensorFlow in the environment, but that is not ideal, as you would have to take an extra step to remove scikit-learn after merging your branch to master.
+
+A better approach would be to use a distinct name for your modified environment, such as `mlops/diabetes_regression/tensorflow`.
+By changing the name of the image in your branch in both the container build pipeline
+[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml)
+and the model pipeline file
+[diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml),
+and running both pipelines in sequence on your branch,
+you avoid any branch conflicts, and the name does not have to be changed after merging to master.
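On the TensorFlow branch from the example above, the model pipeline side of this approach might look like the sketch below. It assumes the `acrconnection` service connection name from the earlier section; the container build pipeline on the same branch would need the same image name.

```yaml
# Sketch of the resources block on the branch that uses the modified environment.
resources:
  containers:
  - container: mlops
    image: mlops/diabetes_regression/tensorflow   # branch-specific image name
    endpoint: acrconnection
```

Because jobs refer only to the `mlops` alias, no other line in the pipeline has to change between branches.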

docs/getting_started.md

+3 -2
@@ -157,7 +157,7 @@ performs linting, unit testing and publishes a training pipeline.
 ### Set up the Pipeline
 
 In your [Azure DevOps](https://dev.azure.com) project create and run a new build
-pipeline referring to the [diabetes_regression-ci-build-train.yml](./.pipelines/azdo-ci-build-train.yml)
+pipeline referring to the [diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml)
 pipeline definition in your forked repository:
 
 ![configure ci build pipeline](./images/ci-build-pipeline-configure.png)
@@ -193,7 +193,7 @@ specified). Example ML pipelines using R have a single step to train a model. Th
 
 * The third stage of the pipeline, **Deploy to ACI**, deploys the model to the QA environment in [Azure Container Instances](https://azure.microsoft.com/en-us/services/container-instances/). It then runs a *smoke test* to validate the deployment, i.e. sends a sample query to the scoring web service and verifies that it returns a response in the expected format.
 
-The pipeline uses a Docker container on the Azure Pipelines agents to accomplish the pipeline steps. The image of the container ***mcr.microsoft.com/mlops/python:latest*** is built with this [Dockerfile](./environment_setup/Dockerfile) and it has all necessary dependencies installed for the purposes of this repository. This image serves as an example of using a custom Docker image that provides a pre-baked environment. This environment is guaranteed to be the same on any building agent, VM or local machine. In your project you will want to build your own Docker image that only contains the dependencies and tools required for your use case. This image will be more likely smaller and therefore faster, and it will be totally maintained by your team.
+The pipeline uses a Docker container on the Azure Pipelines agents to accomplish the pipeline steps. The image of the container ***mcr.microsoft.com/mlops/python:latest*** is built with this [Dockerfile](../environment_setup/Dockerfile) and it has all necessary dependencies installed for the purposes of this repository. This image serves as an example of using a custom Docker image that provides a pre-baked environment. This environment is guaranteed to be the same on any building agent, VM or local machine. In your project you will want to build your own Docker image that only contains the dependencies and tools required for your use case. This image will be more likely smaller and therefore faster, and it will be totally maintained by your team.
 
 Wait until the pipeline finishes and verify that there is a new model in the **ML Workspace**:
 
@@ -261,6 +261,7 @@ Make sure your webapp has the credentials to pull the image from the Azure Conta
 * The provided pipeline definition YAML file is a sample starting point, which you should tailor to your processes and environment.
 * You should edit the pipeline definition to remove unused stages. For example, if you are deploying to ACI and AKS, you should delete the unused `Deploy_Webapp` stage.
 * You may wish to enable [manual approvals](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/approvals) before the deployment stages.
+* You may want to use [Azure DevOps self-hosted agents](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#install) to speed up your ML pipeline execution. The Docker container image for the ML pipeline is sizable, and having it cached on the agent between runs can trim several minutes from your runs.
 * You can install additional Conda or pip packages by modifying the YAML environment configurations under the `diabetes_regression` directory. Make sure to use fixed version numbers for all packages to ensure reproducibility, and use the same versions across environments.
 * You can explore aspects of model observability in the solution, such as:
   * **Logging**: navigate to the Application Insights instance linked to the Azure ML Portal,
