This guide was inspired by the Azure MLOps (v2) solution accelerator. Its goal is to help you understand every step involved in building the foundation of an ML environment with MLOps.
Check the MLOps (v2) solution accelerator repository for more information:
Azure MLOps (v2) solution accelerator
In the top-right corner of the page, click Fork.
Select an owner for the forked repository, optionally add a description of your fork, and click Create fork.
If you need help setting this up, check the link below:
Open a New Terminal
Use the root folder for this demo
az login
IMPORTANT! Update the $resource_sufix and $subscriptionId parameters before setting the environment variables (that is, before running env.ps1).
. .\scripts\env.ps1
Check at least one of the variables to make sure the environment variables are set
Write-Output $resource_sufix
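Printing one variable only proves that one was set. The sketch below checks every variable this guide reads after env.ps1 has been dot-sourced. Get-MissingVariable is a hypothetical helper introduced here, not part of the repository, and the list of names is taken from the commands in this guide rather than from env.ps1 itself.

```powershell
# Variables referenced by the commands in this guide.
$required = @(
    'resource_sufix', 'subscriptionId', 'resource_region',
    'resource_group_ml', 'resource_group_stg',
    'workspace01', 'workspace02', 'workspace03',
    'storage_name', 'managed_identity_mlgroup'
)

# Hypothetical helper: emits the name of each variable that is unset or blank.
function Get-MissingVariable {
    param([string[]]$Name)
    foreach ($n in $Name) {
        $value = Get-Variable -Name $n -ValueOnly -ErrorAction SilentlyContinue
        if ([string]::IsNullOrWhiteSpace($value)) { $n }
    }
}

$missing = @(Get-MissingVariable -Name $required)
if ($missing.Count -gt 0) {
    Write-Warning "Not set: $($missing -join ', ')"
} else {
    Write-Output 'All required variables are set.'
}
```

If the warning lists any names, rerun `. .\scripts\env.ps1` from the root folder in the same terminal session.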
Set the default subscription id
az account set --subscription $subscriptionId
az group create -l $resource_region -n $resource_group_ml
01 - Create Dev Workspace
az ml workspace create --resource-group $resource_group_ml --name $workspace01 --location $resource_region --display-name "Dev Workspace"
02 - Create Test Workspace
az ml workspace create --resource-group $resource_group_ml --name $workspace02 --location $resource_region --display-name "Test Workspace"
03 - Create Prod Workspace
az ml workspace create --resource-group $resource_group_ml --name $workspace03 --location $resource_region --display-name "Prod Workspace"
You should see this in your RG after this step
Create the resource group for the storage account
az group create -l $resource_region -n $resource_group_stg
Create a storage account
az storage account create --name $storage_name --resource-group $resource_group_stg --location $resource_region --sku Standard_ZRS --kind StorageV2 --enable-hierarchical-namespace true
Important: Storage account names must be globally unique. Make sure to use a different suffix for each new demo.
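Besides being unique, a storage account name must be 3 to 24 characters long and contain only lowercase letters and digits. The sketch below checks that format locally before you call Azure. Test-StorageAccountName is a hypothetical helper introduced here for illustration.

```powershell
# Hypothetical helper: returns $true when the name satisfies the
# storage account naming rules (3-24 chars, lowercase letters and digits).
function Test-StorageAccountName {
    param([string]$Name)
    return $Name -cmatch '^[a-z0-9]{3,24}$'
}

if (-not (Test-StorageAccountName $storage_name)) {
    Write-Warning "'$storage_name' is not a valid storage account name."
}

# To also check global availability, the Azure CLI offers:
# az storage account check-name --name $storage_name --query nameAvailable -o tsv
```

The local check only covers the format; availability can only be confirmed by Azure.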
Run the command below. It stores the ID of the managed identity in the $managed_identity_id variable.
$managed_identity_id=$(az identity create -n $managed_identity_mlgroup --query id -o tsv -g $resource_group_ml)
Dev:
az ml compute create -f ./compute/computedev.yml --workspace-name $workspace01 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id
Test:
az ml compute create -f ./compute/computetest.yml --workspace-name $workspace02 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id
Prod:
az ml compute create -f ./compute/computeprod.yml --workspace-name $workspace03 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id
Grant access on the Storage Account you will use for the demo:
$storage_acc_id=$(az storage account show --name $storage_name --resource-group $resource_group_stg --query id -o tsv)
$managed_identity_principal_id=$(az identity show --name $managed_identity_mlgroup --resource-group $resource_group_ml --query principalId -o tsv)
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $managed_identity_principal_id --scope $storage_acc_id
Grant access to the AML Workspaces managed identities:
$workspace01spID=$(az resource list -n $workspace01 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)
$workspace02spID=$(az resource list -n $workspace02 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)
$workspace03spID=$(az resource list -n $workspace03 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace01spID --scope $storage_acc_id
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace02spID --scope $storage_acc_id
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace03spID --scope $storage_acc_id
Also grant access to your own user ID
$selfid=$(az ad signed-in-user show --query id -o tsv)
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $selfid --scope $storage_acc_id
Storage Access Control screenshot
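To confirm the assignments from the command line rather than the portal, one option is to list them at the storage account scope. This is a minimal sketch that assumes $storage_acc_id is still set from the earlier step; the output column names (principalId, role) are chosen here for readability.

```powershell
# List the role assignments made directly on the demo storage account.
# After the steps above you should see a "Storage Blob Data Owner" row for
# the user-assigned identity, each workspace identity, and your own user.
az role assignment list `
    --scope $storage_acc_id `
    --query "[].{principalId:principalId, role:roleDefinitionName}" `
    --output table
```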
az storage container create --name mlopsdemodev --account-name $storage_name --resource-group $resource_group_stg
az storage container create --name mlopsdemotest --account-name $storage_name --resource-group $resource_group_stg
az storage container create --name mlopsdemoprod --account-name $storage_name --resource-group $resource_group_stg
Upload the CSV file used by the batch deployment, along with the sample request file, to the proper directories
az storage azcopy blob upload -c mlopsdemotest --account-name $storage_name -s "data/taxi-batch.csv" -d "taxibatch/taxi-batch.csv"
az storage azcopy blob upload -c mlopsdemotest --account-name $storage_name -s "data/taxi-request.json" -d "taxioutput/taxi-request.json"
Repeat for the prod container
az storage azcopy blob upload -c mlopsdemoprod --account-name $storage_name -s "data/taxi-batch.csv" -d "taxibatch/taxi-batch.csv"
az storage azcopy blob upload -c mlopsdemoprod --account-name $storage_name -s "data/taxi-request.json" -d "taxioutput/taxi-request.json"
In this step you will run a job in the Dev workspace and register a model. This model will be transferred to the Test and Prod workspaces in the following steps.
az ml environment create --file ./dev/train-env.yml --workspace-name $workspace01 --resource-group $resource_group_ml
az ml job create --file ./dev/pipeline.yml --resource-group $resource_group_ml --workspace-name $workspace01
After this command, a pipeline will be triggered in the Dev workspace. The result of this execution is a model being registered in the Dev workspace.
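The later steps download version 1 of taxi-model-mlops-demo, so it helps to wait for the pipeline to finish and confirm which versions exist before moving on. The sketch below is one way to do that, used in place of the plain az ml job create call; $job_name is a variable introduced here for illustration.

```powershell
# Submit the pipeline and keep the generated job name.
$job_name = $(az ml job create --file ./dev/pipeline.yml `
    --resource-group $resource_group_ml --workspace-name $workspace01 `
    --query name -o tsv)

# Stream the job logs until the job finishes.
az ml job stream --name $job_name `
    --resource-group $resource_group_ml --workspace-name $workspace01

# List the registered versions of the model produced by the pipeline.
az ml model list --name taxi-model-mlops-demo `
    --resource-group $resource_group_ml --workspace-name $workspace01 `
    --query "[].version" -o tsv
```

If the pipeline has run more than once, the list may show versions above 1; in that case adjust the --version value in the download step accordingly.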
az ml environment create --file ./test/test-env.yml --workspace-name $workspace02 --resource-group $resource_group_ml
Datastore
az ml datastore create --file ./test/data-store.yml --workspace-name $workspace02 --resource-group $resource_group_ml --set account_name=$storage_name
Data Asset
az ml data create -f ./test/file-data-asset.yml --workspace-name $workspace02 --resource-group $resource_group_ml
az ml model download --name taxi-model-mlops-demo --version 1 --resource-group $resource_group_ml --workspace-name $workspace01 --download-path ./model
az ml model create --name taxi-test-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group $resource_group_ml --workspace-name $workspace02
$endpoint_name_test = "taxifare-b-mldemo-t-$resource_sufix"
az ml batch-endpoint create --file ./test/batch-endpoint-test.yml --resource-group $resource_group_ml --workspace-name $workspace02 --set name=$endpoint_name_test
az ml batch-deployment create --file ./test/batch-deployment-test.yml --resource-group $resource_group_ml --workspace-name $workspace02 --set endpoint_name=$endpoint_name_test
az ml batch-endpoint invoke --name $endpoint_name_test --deployment-name batch-dp-mlopsdemo-test --input-type uri_file --input azureml://datastores/mlopsdemotestcointainer/paths/taxibatch/taxi-batch.csv --resource-group $resource_group_ml --workspace-name $workspace02 --output-path azureml://datastores/mlopsdemotestcointainer/paths/taxioutput
This command starts a batch job that scores the data in taxi-batch.csv with the model deployed in the Test workspace and writes the results to the taxioutput folder in the test container.
Now you can verify the results and analyze the performance of the model using shadow production data.
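One way to check that the batch job wrote its output is to list the blobs under the taxioutput folder. This sketch assumes the mlopsdemotestcointainer datastore points at the mlopsdemotest container created earlier, and that your user holds the Storage Blob Data Owner role granted above.

```powershell
# List the files under taxioutput in the test container.
# taxi-request.json was uploaded there earlier, so it appears as well;
# any additional files are the output of the batch job.
az storage blob list `
    --account-name $storage_name `
    --container-name mlopsdemotest `
    --prefix taxioutput `
    --auth-mode login `
    --query "[].name" -o tsv
```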
az ml environment create --file ./prod/prod-env.yml --workspace-name $workspace03 --resource-group $resource_group_ml
Datastore
az ml datastore create --file ./prod/data-store.yml --workspace-name $workspace03 --resource-group $resource_group_ml --set account_name=$storage_name
Data Asset
az ml data create -f ./prod/file-data-asset.yml --workspace-name $workspace03 --resource-group $resource_group_ml
The model was already downloaded in the Test step, so only the registration is needed here.
az ml model create --name taxi-prod-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group $resource_group_ml --workspace-name $workspace03
$endpoint_name_prod = "taxifare-b-mldemo-p-$resource_sufix"
az ml batch-endpoint create --file ./prod/batch-endpoint-prod.yml --resource-group $resource_group_ml --workspace-name $workspace03 --set name=$endpoint_name_prod
az ml batch-deployment create --file ./prod/batch-deployment-prod.yml --resource-group $resource_group_ml --workspace-name $workspace03 --set endpoint_name=$endpoint_name_prod
az ml batch-endpoint invoke --name $endpoint_name_prod --deployment-name batch-dp-mlopsdemo-prod --input-type uri_file --input azureml://datastores/mlopsdemoprodcointainer/paths/taxibatch/taxi-batch.csv --resource-group $resource_group_ml --workspace-name $workspace03 --output-path azureml://datastores/mlopsdemoprodcointainer/paths/taxioutput
In this demo, the Test and Production workspaces are expected to produce the same results because both containers hold the same file. The idea, however, is that the file in the prod container represents actual production data, while the file in the test container represents shadow production data: a sample of real data selected to test the model.
In a real use case, the Development, Test, and Production environments would each work with a different dataset.
You'll need to create an Azure Active Directory application and service principal, and then assign the application a role so that your workflow has access to your Azure resources.
You will create one service principal per environment.
$githubapp_dev="gitAppdev$resource_sufix"
$githubapp_test="gitApptest$resource_sufix"
$githubapp_prod="gitAppprod$resource_sufix"
az ad app create --display-name $githubapp_dev
az ad app create --display-name $githubapp_test
az ad app create --display-name $githubapp_prod
$githubapp_dev_cid=$(az ad app list --display-name $githubapp_dev --query [*].appId -o tsv)
$githubapp_dev_oid=$(az ad app list --display-name $githubapp_dev --query [*].id -o tsv)
az ad sp create --id $githubapp_dev_cid
$githubapp_dev_assigneeid=$(az ad sp show --id $githubapp_dev_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id $githubapp_dev_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml
$githubapp_test_cid=$(az ad app list --display-name $githubapp_test --query [*].appId -o tsv)
$githubapp_test_oid=$(az ad app list --display-name $githubapp_test --query [*].id -o tsv)
az ad sp create --id $githubapp_test_cid
$githubapp_test_assigneeid=$(az ad sp show --id $githubapp_test_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id $githubapp_test_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml
$githubapp_prod_cid=$(az ad app list --display-name $githubapp_prod --query [*].appId -o tsv)
$githubapp_prod_oid=$(az ad app list --display-name $githubapp_prod --query [*].id -o tsv)
az ad sp create --id $githubapp_prod_cid
$githubapp_prod_assigneeid=$(az ad sp show --id $githubapp_prod_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id $githubapp_prod_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml
Set your GitHub account name and the repository name as variables
Replace the values below with your own GitHub account and repository
$github_org="jlobrant"
$github_repo="mlopsdemov2"
Configure the GitHub connection
$devgraphuri="https://graph.microsoft.com/beta/applications/$githubapp_dev_oid/federatedIdentityCredentials"
$devgraphbody="{'name':'GitHubDevDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Dev','description':'Development Environment','audiences':['api://AzureADTokenExchange']}"
az rest --method POST --uri $devgraphuri --body $devgraphbody
After this step, the credential appears in the Azure portal under App registrations. Select the application you just created, then select Certificates & secrets from the menu on the left, as shown in the screenshot below:
Repeat this step for the Test and Prod Apps
$testgraphuri="https://graph.microsoft.com/beta/applications/$githubapp_test_oid/federatedIdentityCredentials"
$testgraphbody="{'name':'GitHubTestDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Test','description':'Test Environment','audiences':['api://AzureADTokenExchange']}"
az rest --method POST --uri $testgraphuri --body $testgraphbody
$prodgraphuri="https://graph.microsoft.com/beta/applications/$githubapp_prod_oid/federatedIdentityCredentials"
$prodgraphbody="{'name':'GitHubProdDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Prod','description':'Prod Environment','audiences':['api://AzureADTokenExchange']}"
az rest --method POST --uri $prodgraphuri --body $prodgraphbody
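The three request bodies differ only in name, environment, and description, so they can be generated from one template. The sketch below builds the body as a hashtable and serializes it with ConvertTo-Json. New-FederatedCredentialBody and the devcred.json file name are hypothetical, introduced here for illustration; the field values are the ones used in the commands above.

```powershell
# Hypothetical helper: returns the federated credential request body as JSON.
function New-FederatedCredentialBody {
    param(
        [string]$Name,
        [string]$Org,
        [string]$Repo,
        [string]$Environment,
        [string]$Description
    )
    $body = [ordered]@{
        name        = $Name
        issuer      = 'https://token.actions.githubusercontent.com'
        subject     = "repo:$Org/${Repo}:environment:$Environment"
        description = $Description
        audiences   = @('api://AzureADTokenExchange')
    }
    return ($body | ConvertTo-Json -Compress)
}

$devbody = New-FederatedCredentialBody -Name 'GitHubDevDeploy' `
    -Org $github_org -Repo $github_repo `
    -Environment 'Dev' -Description 'Development Environment'

# Write the JSON to a file so the shell cannot alter its double quotes.
Set-Content -Path ./devcred.json -Value $devbody

# Then pass the file to az rest instead of an inline string:
# az rest --method POST --uri $devgraphuri --body '@devcred.json'
```

The subject must match the environment name used in the workflow exactly, so generating it from a single parameter avoids a mismatch between Dev, Test, and Prod.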
This step is required to build an end-to-end GitHub Actions workflow.
Under Environment secrets, create secrets for AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_SUBSCRIPTION_ID in each environment (Dev, Test, and Prod).
Get the client ID and tenant ID values from App registrations in the Azure portal. Also get your subscription ID value.
Also create a RESOURCE_GROUP secret and a WORKSPACE_NAME secret in each environment, holding the resource group name and the name of that environment's workspace.
In Dev, use the value of the variable $workspace01; in Test, $workspace02; and in Prod, $workspace03.
IMPORTANT: Also create a WORKSPACE_NAME_DEV secret in Test and Prod. The workflow needs it to download the model from Dev before registering it in the target environment.
Use the value of the $workspace01 variable.
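If you prefer the command line to the GitHub web UI, the GitHub CLI can create environment secrets. This is a sketch that assumes gh is installed and authenticated (gh auth login) and that the Dev, Test, and Prod environments already exist in the repository. It reuses the variables defined earlier in this guide; $repo and $tenant_id are names introduced here.

```powershell
$repo = "$github_org/$github_repo"
$tenant_id = $(az account show --query tenantId -o tsv)

# Dev environment secrets.
gh secret set AZURE_CLIENT_ID       --env Dev --repo $repo --body $githubapp_dev_cid
gh secret set AZURE_TENANT_ID       --env Dev --repo $repo --body $tenant_id
gh secret set AZURE_SUBSCRIPTION_ID --env Dev --repo $repo --body $subscriptionId
gh secret set RESOURCE_GROUP        --env Dev --repo $repo --body $resource_group_ml
gh secret set WORKSPACE_NAME        --env Dev --repo $repo --body $workspace01

# Test and Prod additionally need the Dev workspace name to download the model.
gh secret set WORKSPACE_NAME_DEV    --env Test --repo $repo --body $workspace01
gh secret set WORKSPACE_NAME_DEV    --env Prod --repo $repo --body $workspace01
```

Repeat the first five commands for Test and Prod, substituting the matching client ID ($githubapp_test_cid, $githubapp_prod_cid) and workspace ($workspace02, $workspace03).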
The following workflow sample is configured in this repository. It runs a pipeline in the Dev workspace, downloads the resulting model, and registers it in the Test and Prod environments. You can improve it by adding other actions, such as invoking the endpoints and evaluating the results of the batch invocation.
on:
  push:
    paths:
      - 'data-science/**'
permissions:
  id-token: write
  contents: read
jobs:
  Pipeline-Dev:
    runs-on: ubuntu-latest
    environment: Dev
    steps:
      - name: check out repo
        uses: actions/checkout@v2
      - name: login
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Setup Azure ML Cli
        run: bash setupml.sh
        working-directory: scripts
      - name: Execute ML Pipeline
        run: az ml job create --file ./dev/pipeline.yml --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}
  Promote-to-Test:
    runs-on: ubuntu-latest
    environment: Test
    needs: [Pipeline-Dev]
    steps:
      - name: check out repo
        uses: actions/checkout@v2
      - name: login
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Setup Azure ML Cli
        run: bash setupml.sh
        working-directory: scripts
      - name: Download Model
        run: az ml model download --name taxi-model-mlops-demo --version 1 --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME_DEV }} --download-path ./model
      - name: Register Model Test
        run: az ml model create --name taxi-test-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}
  Promote-to-Prod:
    runs-on: ubuntu-latest
    environment: Prod
    needs: [Pipeline-Dev, Promote-to-Test]
    steps:
      - name: check out repo
        uses: actions/checkout@v2
      - name: login
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Setup Azure ML Cli
        run: bash setupml.sh
        working-directory: scripts
      - name: Download Model
        run: az ml model download --name taxi-model-mlops-demo --version 1 --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME_DEV }} --download-path ./model
      - name: Register Model Prod
        run: az ml model create --name taxi-prod-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}
This workflow is configured to run whenever you push changes to the Python code inside the data-science folder.
For the Test and Prod environments, configure the environment protection rules. Add at least one user under Required reviewers.
This way, the MLOps process requires a review before the model moves to Test and, later, to Prod.
When a workflow run reaches a job that requires manual approval, you will receive an email requesting approval.
Click Review deployments to approve and release the job.
Do the same for the Production environment.
If you have followed all the steps up to this point, your MLOps workflow is now working, and you can start improving your repository based on your needs.
With the advance of AI and machine learning, companies have started to use complex machine learning pipelines in applications such as recommendation systems and fraud detection. These systems usually require hundreds to thousands of features to support time-sensitive business applications, and the feature pipelines are maintained by different team members across various business groups.
In these machine learning systems, several problems consume much of the time of machine learning engineers and data scientists, in particular duplicated feature engineering, online-offline skew, and low-latency feature serving.
Reference: Feathr: LinkedIn’s feature store is now available on Azure