Merged
128 commits
df678e5
Pipeline to use container
dtzar Jun 26, 2019
9cbcf82
Add Dockerfile
dtzar Jun 26, 2019
c93bea3
add latest tag
dtzar Jun 26, 2019
d14c4b1
Add docker image build pipeline
dtzar Jun 26, 2019
79de271
Merge pull request #36 from dtzar/dockerize
eedorenko Jul 29, 2019
634cdfa
new dockerfile
Jul 29, 2019
70c444d
added .env.example. Started replacing parameters
tarockey Jul 29, 2019
e229097
Pipeline building a docker image
Jul 30, 2019
176319a
Update docker-image-pipeline.yml
eedorenko Jul 30, 2019
6d87040
Pipeline building an image
Jul 30, 2019
feceb38
Resolving conflicts
Jul 30, 2019
53c6127
resolving conflicts
Jul 30, 2019
a871665
Additional env updates - no deployment tested
tarockey Jul 30, 2019
23f17d8
iac pipelines
Jul 30, 2019
ce8b8d7
Add IaC Pipelines #40
dtzar Jul 31, 2019
791e1b5
URL to the image on Dockerhub
Jul 31, 2019
e22f243
Merge pull request #38 from microsoft/eedorenko/mlops-docker-image
eedorenko Jul 31, 2019
b327ac4
intial repo
sudivate Aug 1, 2019
f025b11
merged from upgrade
sudivate Aug 1, 2019
34e6af7
first yry
Aug 1, 2019
792611f
Update build-train.yml for Azure Pipelines
eedorenko Aug 1, 2019
18bbb53
Update build-train.yml for Azure Pipelines
eedorenko Aug 1, 2019
d99d873
fix
Aug 1, 2019
b1d0fa2
Merge branch 'eedorenko/code-structure-from-pepsi' of https://github.…
Aug 1, 2019
370ea66
fix
Aug 1, 2019
4b60faa
looks like working
Aug 2, 2019
21dfdb0
added static path to workspace from config
tarockey Aug 5, 2019
42c724e
fix
Aug 5, 2019
cd874fe
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
851d653
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
ab40567
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
5db8b1d
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
8b26198
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
c9c10d2
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
61854b1
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
aff1b46
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
d2334a5
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
5c8ca99
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
1740c25
Update build-train.yml for Azure Pipelines
eedorenko Aug 5, 2019
0365a75
src dirs to variables
Aug 5, 2019
551879e
nbase pipeline template
Aug 5, 2019
8c90b8a
working folder for linting
Aug 5, 2019
f25addf
correct image name
Aug 6, 2019
141257e
added flake to requirements
Aug 6, 2019
d29242e
flake troubleshooting
Aug 6, 2019
bd31cb7
flake troubleshooting
Aug 6, 2019
d1c494a
flake troubleshooting
Aug 6, 2019
54dcf26
flake troubleshooting
Aug 6, 2019
7e6a2a0
cleaning up
Aug 6, 2019
5333249
cleaning
Aug 6, 2019
4bf1bc6
old pipeline doesn't use container
Aug 6, 2019
e12735e
Dummy code in old pipeline
Aug 6, 2019
9b201ac
Merge pull request #42 from microsoft/eedorenko/code-structure-from-p…
eedorenko Aug 6, 2019
7d26ead
removed config.
tarockey Aug 6, 2019
83ace0f
merged conflicts - fixed reqs. removed cust name
tarockey Aug 6, 2019
f2c72e1
Merge pull request #43 from dtzar/tarockey/dotenv
eedorenko Aug 6, 2019
c75b8fe
Enable and fix linting (#44)
eedorenko Aug 7, 2019
80e063e
Add & enable unit tests (#45)
eedorenko Aug 7, 2019
b0d53b7
Use base pipeline template (#46)
eedorenko Aug 8, 2019
01bd306
publishoing artifacts
Aug 8, 2019
8973c17
Merge branch 'upgrade' into eedorenko/retrain-deploy-model
Aug 8, 2019
fea510a
linting
Aug 8, 2019
c91f2c6
fix
Aug 8, 2019
db72b47
fix
Aug 8, 2019
9d2ecc5
json to artifacts
Aug 9, 2019
de3f1a4
Update iac-create-environment.yml for Azure Pipelines
eedorenko Aug 9, 2019
d42acaa
score files to the artifact
Aug 9, 2019
ea22125
scoring files to the artifacts
Aug 9, 2019
ba54996
fixing artifacts path
Aug 9, 2019
ba32d11
artifacts path
Aug 9, 2019
1b9be89
artifacts folders
Aug 9, 2019
bbf2a2f
fixing scoring issues
Aug 9, 2019
0b68535
linting score
Aug 9, 2019
2bb53b5
linting score
Aug 9, 2019
a51e5a1
scoring dependencies
Aug 10, 2019
808104a
folders refactoring
Aug 12, 2019
d49770e
folders refactoring
Aug 12, 2019
91981ff
connection name
Aug 12, 2019
46389e6
arm template fix
Aug 12, 2019
03beb1f
base pipeline got renamed
Aug 12, 2019
9df780c
unit test fix
Aug 12, 2019
82b7eb5
fix
Aug 12, 2019
d13f470
fix
Aug 12, 2019
0abdb5d
getting started
Aug 12, 2019
a460ab0
removed garbage
Aug 12, 2019
c298a61
removed extra line from gitignore
Aug 12, 2019
6c4bb9d
Merge pull request #47 from microsoft/eedorenko/retrain-deploy-model
eedorenko Aug 12, 2019
033dbbf
getting started update
Aug 13, 2019
2afe237
update IaC pipelines
Aug 13, 2019
5a1e994
document update progress
Aug 13, 2019
3459d7f
document update progress
Aug 13, 2019
67fd322
update document progress
Aug 13, 2019
d75895c
adjust image size
Aug 13, 2019
03b94ca
document update progress
Aug 13, 2019
9c43038
azure-cli to requirements
Aug 13, 2019
5f300f4
document update progress
Aug 13, 2019
e9fb924
document update progress
Aug 13, 2019
4eca7b1
document update progress
Aug 14, 2019
3c129b8
Update getting_started.md
eedorenko Aug 14, 2019
5c7ebe4
Update getting_started.md
eedorenko Aug 14, 2019
8fc0ba1
Update getting_started.md
eedorenko Aug 14, 2019
5a05f00
Update getting_started.md
eedorenko Aug 14, 2019
8b5b76a
Update getting_started.md
eedorenko Aug 14, 2019
a50b0d4
Update getting_started.md
eedorenko Aug 14, 2019
bd2a43b
Update getting_started.md
eedorenko Aug 14, 2019
f930580
Update getting_started.md
eedorenko Aug 14, 2019
e433ca8
Update getting_started.md
eedorenko Aug 14, 2019
a4f44bb
Update getting_started.md
eedorenko Aug 14, 2019
d4b343e
readme update
Aug 14, 2019
62377c5
Merge branch 'eedorenko/documentation-update' of https://github.com/m…
Aug 14, 2019
d421847
Update README.md
eedorenko Aug 14, 2019
5897a31
readme update
Aug 14, 2019
569b4ed
Merge branch 'eedorenko/documentation-update' of https://github.com/m…
Aug 14, 2019
3f6f056
update document progress
Aug 14, 2019
2b122c5
update documentation progress
Aug 15, 2019
4d34ace
azure-cli library update
Aug 15, 2019
71208d6
docker image update
Aug 15, 2019
9736e0a
docker image update
Aug 15, 2019
7b7e382
liniting
Aug 15, 2019
5294441
Update getting_started.md
eedorenko Aug 16, 2019
73db451
image update
Aug 16, 2019
0c3688d
Merge branch 'eedorenko/documentation-update' of https://github.com/m…
Aug 16, 2019
c8c2824
minor add and typo fix
dtzar Aug 16, 2019
48718f3
duplicate file paths and `code` snippet for file paths
eedorenko Aug 16, 2019
2d4bdee
Model Deploy tasks parameters in tables
eedorenko Aug 16, 2019
c0860e5
tables with task parameters uopdate
eedorenko Aug 16, 2019
32a6a91
Merge pull request #48 from microsoft/eedorenko/documentation-update
eedorenko Aug 16, 2019
f6f4427
Eedorenko/yaml header (#49)
eedorenko Aug 16, 2019
36 changes: 36 additions & 0 deletions .env.example
@@ -0,0 +1,36 @@
# Azure Subscription Variables
WORKSPACE_NAME = ''
RESOURCE_GROUP = ''
SUBSCRIPTION_ID = ''
LOCATION = ''
TENANT_ID = ''

# Azure ML Workspace Variables
EXPERIMENT_NAME = ''
SCRIPT_FOLDER = './'
BLOB_STORE_NAME = ''
# Remote VM Config
REMOTE_VM_NAME = ''
REMOTE_VM_USERNAME = ''
REMOTE_VM_PASSWORD = ''
REMOTE_VM_IP = ''
# AML Compute Cluster Config
AML_CLUSTER_NAME = ''
AML_CLUSTER_VM_SIZE = ''
AML_CLUSTER_MAX_NODES = ''
AML_CLUSTER_MIN_NODES = ''
AML_CLUSTER_PRIORITY = 'lowpriority'
# Training Config
MODEL_NAME = ''
# AML Pipeline Config
TRAINING_PIPELINE_NAME = ''
PIPELINE_CONDA_PATH = 'aml_config/conda_dependencies.yml'
MODEL_PATH = ''
# Image config
IMAGE_NAME = ''
IMAGE_DESCRIPTION = ''
IMAGE_VERSION = ''
# ACI Config
ACI_CPU_CORES = ''
ACI_MEM_GB = ''
ACI_DESCRIPTION = ''
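
A minimal sketch of how a script might read these settings, assuming the file keeps the `KEY = 'value'` shape shown above. `parse_env_text` is a hypothetical helper written for illustration, not code from this PR; the repository may well rely on a library such as python-dotenv instead.

```python
def parse_env_text(text):
    """Parse KEY = 'value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        settings[key.strip()] = value
    return settings


# A few lines copied from .env.example above.
sample = """
# AML Compute Cluster Config
AML_CLUSTER_NAME = ''
AML_CLUSTER_PRIORITY = 'lowpriority'
PIPELINE_CONDA_PATH = 'aml_config/conda_dependencies.yml'
"""
settings = parse_env_text(sample)

# Names that still have an empty value and need filling in before a run.
missing = sorted(name for name, value in settings.items() if not value)
print(missing)
```

Listing the blank entries up front is one way to fail early, before any Azure call is attempted with an empty workspace or cluster name.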
2 changes: 2 additions & 0 deletions .gitignore
@@ -103,3 +103,5 @@ venv.bak/

# mypy
.mypy_cache/

.DS_Store
26 changes: 26 additions & 0 deletions .pipelines/azdo-base-pipeline.yml
@@ -0,0 +1,26 @@
# this pipeline should be ignored for now
parameters:
  pipelineType: 'training'

steps:
- script: |
   flake8 --output-file=$(Build.BinariesDirectory)/lint-testresults.xml --format junit-xml
  workingDirectory: '$(Build.SourcesDirectory)'
  displayName: 'Run code quality tests'
  enabled: 'true'

- script: |
   pytest --junitxml=$(Build.BinariesDirectory)/unit-testresults.xml $(Build.SourcesDirectory)/tests/unit
  displayName: 'Run unit tests'
  enabled: 'true'
  env:
    SP_APP_SECRET: '$(SP_APP_SECRET)'

- task: PublishTestResults@2
  condition: succeededOrFailed()
  inputs:
    testResultsFiles: '$(Build.BinariesDirectory)/*-testresults.xml'
    testRunTitle: 'Linting & Unit tests'
    failTaskOnFailedTests: true
  displayName: 'Publish linting and unit test results'
  enabled: 'true'
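
Both the flake8 and pytest steps above write JUnit-style XML, which the publish step then collects. As a rough illustration of what those reports carry, the sketch below counts failing test cases in one such document. `count_failures` and the sample report (including its file names) are made up for this example; the real reports may contain more attributes and nesting.

```python
import xml.etree.ElementTree as ET


def count_failures(junit_xml):
    """Return (total, failed) test case counts for a JUnit-style XML string."""
    root = ET.fromstring(junit_xml)
    # The root may be <testsuites> or a single <testsuite>; iter() covers both.
    cases = list(root.iter("testcase"))
    failed = sum(
        1
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    )
    return len(cases), failed


# Hypothetical lint report with one clean file and one violation.
sample_report = """<testsuite name="flake8" tests="2">
  <testcase name="ml_service/util.py"/>
  <testcase name="code/scoring/score.py">
    <failure message="E501 line too long"/>
  </testcase>
</testsuite>"""

total, failed = count_failures(sample_report)
print(f"{failed} of {total} test cases failed")
```

With `failTaskOnFailedTests: true`, any non-zero failure count in the published results is what turns the publish task red.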
45 changes: 45 additions & 0 deletions .pipelines/azdo-ci-build-train.yml
@@ -0,0 +1,45 @@
pr: none
trigger:
  branches:
    include:
    - master

pool:
  vmImage: 'ubuntu-latest'

container: mcr.microsoft.com/mlops/python:latest


variables:
- group: devopsforai-aml-vg


steps:
- template: azdo-base-pipeline.yml

- bash: |
    # Invoke the Python building and publishing a training pipeline
    python3 $(Build.SourcesDirectory)/ml_service/pipelines/build_train_pipeline.py
  failOnStderr: 'false'
  env:
    SP_APP_SECRET: '$(SP_APP_SECRET)'
  displayName: 'Train model using AML with Remote Compute'
  enabled: 'true'

- task: CopyFiles@2
  displayName: 'Copy Files to: $(Build.ArtifactStagingDirectory)'
  inputs:
    SourceFolder: '$(Build.SourcesDirectory)'
    TargetFolder: '$(Build.ArtifactStagingDirectory)'
    Contents: |
      ml_service/pipelines/?(run_train_pipeline.py|*.json)
      code/scoring/**


- task: PublishBuildArtifacts@1
  displayName: 'Publish Artifact'
  inputs:
    ArtifactName: 'mlops-pipelines'
    publishLocation: 'container'
    pathtoPublish: '$(Build.ArtifactStagingDirectory)'
    TargetPath: '$(Build.ArtifactStagingDirectory)'
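
The `Contents` patterns in the copy step decide which files end up in the `mlops-pipelines` artifact. The sketch below approximates that selection in plain Python to make the intent easier to see. It is not the task's own matcher: `is_staged` is a hypothetical helper, edge cases such as dotfiles may be treated differently by the real task, and `settings.json` and `score.py` are invented file names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PIPELINES_DIR = PurePosixPath("ml_service/pipelines")


def is_staged(relative_path):
    """Approximate the two Contents patterns for a repo-relative path."""
    path = PurePosixPath(relative_path)
    if path.parent == PIPELINES_DIR:
        # ?(run_train_pipeline.py|*.json): that one script, or any JSON file,
        # directly inside ml_service/pipelines (no subfolders).
        return path.name == "run_train_pipeline.py" or fnmatchcase(path.name, "*.json")
    # code/scoring/**: everything below code/scoring, at any depth.
    return len(path.parts) > 2 and path.parts[:2] == ("code", "scoring")


candidates = [
    "ml_service/pipelines/run_train_pipeline.py",
    "ml_service/pipelines/build_train_pipeline.py",
    "ml_service/pipelines/settings.json",
    "code/scoring/score.py",
    "README.md",
]
staged = [p for p in candidates if is_staged(p)]
print(staged)
```

Note that `build_train_pipeline.py` is left out on purpose by the pattern: it already ran during the build, and only the script that triggers the published pipeline needs to travel with the artifact.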
18 changes: 18 additions & 0 deletions .pipelines/azdo-pr-build-train.yml
@@ -0,0 +1,18 @@
trigger: none
pr:
  branches:
    include:
    - master

pool:
  vmImage: 'ubuntu-latest'

container: mcr.microsoft.com/mlops/python:latest


variables:
- group: devopsforai-aml-vg


steps:
- template: azdo-base-pipeline.yml
32 changes: 14 additions & 18 deletions README.md
@@ -1,9 +1,18 @@
---
page_type: sample
languages:
- python
products:
- azure
- azure-machine-learning-service
- azure-devops
---

# MLOps with Azure ML


[![Build Status](https://dev.azure.com/customai/DevopsForAI-AML/_apis/build/status/Microsoft.MLOpsPython?branchName=master)](https://dev.azure.com/customai/DevopsForAI-AML/_build/latest?definitionId=25&branchName=master)

### Author: Praneet Solanki | Richin Jain

MLOps will help you understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project. We will use the Azure DevOps Project for build and release/deployment pipelines, along with Azure ML services for the model retraining pipeline, model management, and operationalization.

@@ -25,20 +34,15 @@ To deploy this solution in your subscription, follow the manual instructions in

This reference architecture shows how to implement continuous integration (CI), continuous delivery (CD), and a retraining pipeline for an AI application using Azure DevOps and Azure Machine Learning. The solution is built on the scikit-learn diabetes dataset but can be easily adapted for any AI scenario and for other popular build systems such as Jenkins and Travis.

![Architecture](/docs/images/Architecture_DevOps_AI.png)
![Architecture](/docs/images/main-flow.png)


## Architecture Flow

### Train Model
1. A Data Scientist writes/updates the code and pushes it to the git repo. This triggers the Azure DevOps build pipeline (continuous integration).
2. Once the Azure DevOps build pipeline is triggered, it runs the following types of tasks:
- Run for new code: Every time new code is committed to the repo, the build pipeline performs data sanity tests and unit tests on the new code.
- One-time run: These tasks run only the first time the build pipeline runs. They programmatically create an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace), provision [Azure ML Compute](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute) (used for model training compute), and publish an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines). This published Azure ML pipeline is the model training/retraining pipeline.

> Note: The Publish Azure ML pipeline task currently runs for every code change

3. The Azure ML Retraining pipeline is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline run on the Azure ML Compute created earlier. The tasks in this pipeline are:
2. Once the Azure DevOps build pipeline is triggered, it runs code quality checks, data sanity tests, and unit tests, then builds an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) and publishes it in an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace).
3. The [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline run on Azure ML Compute. The tasks in this pipeline are:

- The **Train Model** task executes the model training script on Azure ML Compute. It outputs a [model](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#model) file, which is stored in the [run history](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#run).

@@ -50,16 +54,8 @@ This reference architecture shows how to implement continuous integration (CI),

Once you have registered your ML model, you can use Azure ML + Azure DevOps to deploy it.

The **Package Model** task packages the new model along with the scoring file and its python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service).

The **Deploy Model** task handles deploying your Azure ML model to the cloud (ACI or AKS).
This pipeline deploys the model scoring image into Staging/QA and PROD environments.

In the Staging/QA environment, one task creates an [Azure Container Instance](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview) and deploys the scoring image as a [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) on it.

The second task invokes the web service by calling its REST endpoint with dummy data.
The [Azure DevOps release pipeline](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/?view=azure-devops) packages the new model along with the scoring file and its Python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as a [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) across QA and Prod environments. The QA environment runs on top of [Azure Container Instances (ACI)](https://azure.microsoft.com/en-us/services/container-instances/) and the Prod environment is built with [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes).

5. The deployment in production is a [gated release](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/approvals/gates?view=azure-devops). This means that once the model web service deployment in the Staging/QA environment is successful, a notification is sent to approvers to manually review and approve the release. Once the release is approved, the model scoring web service is deployed to [Azure Kubernetes Service(AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes) and the deployment is tested.
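
The QA-then-Prod flow described above can be sketched as plain control flow. This is only an illustration of the ordering (deploy to QA, check it, wait for approval, deploy to Prod); `run_release` and its callable parameters are hypothetical and make no Azure calls.

```python
def run_release(deploy, smoke_test, approved):
    """Walk a scoring image through QA and, if the gate opens, Prod.

    deploy(env) and smoke_test(env) are caller-supplied callables, and
    approved() stands in for the manual approval step. Returns the list
    of environments that received the image.
    """
    deployed = []
    deploy("QA")
    deployed.append("QA")
    if not smoke_test("QA"):
        return deployed  # QA check failed, so approval is never requested
    if not approved():
        return deployed  # gate stays closed, Prod is untouched
    deploy("Prod")
    deployed.append("Prod")
    return deployed


log = []
result = run_release(
    deploy=lambda env: log.append(f"deploy:{env}"),
    smoke_test=lambda env: True,
    approved=lambda: False,  # approver has not signed off yet
)
print(result, log)
```

Keeping the approval check after the QA smoke test mirrors the text: approvers are only involved once the QA deployment has succeeded.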

### Repo Details

50 changes: 0 additions & 50 deletions aml_config/conda_dependencies.yml

This file was deleted.

6 changes: 0 additions & 6 deletions aml_config/config.json

This file was deleted.

15 changes: 0 additions & 15 deletions aml_config/security_config.json

This file was deleted.

64 changes: 0 additions & 64 deletions aml_service/00-WorkSpace.py

This file was deleted.

44 changes: 0 additions & 44 deletions aml_service/01-Experiment.py

This file was deleted.
