MLOps - Step by step guide


This guide was inspired by the Azure MLOps (v2) solution accelerator. Its goal is to help you understand all the steps involved in building the foundation of an ML environment with MLOps.

Check the MLOps (v2) solution accelerator repository for more information:

Azure MLOps (v2) solution accelerator

Step by Step guide - Manual Execution (Learning purpose)

Prerequisite - Setting up a new MLOps environment

Fork this repository

In the top-right corner of the page, click Fork


Select an owner for the forked repository, optionally add a description of your fork, and click Create fork.


Use Visual Studio Code to clone the forked repository:



If you need help setting this up, check the link below:

VSCode - Source Control

IMPORTANT!!! Execute the demo in the root folder of your project

Open a New Terminal


Use the root folder for this demo


Authenticate using az login

az login



Edit the env.ps1 file in the scripts folder

IMPORTANT! Update the $resource_sufix parameter and $subscriptionId before setting the environment variables (executing the env.ps1)
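For orientation, here is a minimal sketch of what scripts/env.ps1 could contain. The variable names are the ones the commands in this guide reference. Every value is a hypothetical placeholder, and the file in your fork may define them differently.

```powershell
# Hypothetical values: replace them with your own before dot-sourcing the file.
$resource_sufix  = "demo01"                                # short, lowercase, unique per demo
$subscriptionId  = "00000000-0000-0000-0000-000000000000"  # your Azure subscription ID
$resource_region = "eastus"                                # any region that supports Azure ML

# Resource groups for the ML resources and for the storage account
$resource_group_ml  = "rg-mlops-ml-$resource_sufix"
$resource_group_stg = "rg-mlops-stg-$resource_sufix"

# AML workspaces (Dev, Test, Prod)
$workspace01 = "aml-dev-$resource_sufix"
$workspace02 = "aml-test-$resource_sufix"
$workspace03 = "aml-prod-$resource_sufix"

# Storage account name: 3 to 24 characters, lowercase letters and digits only
$storage_name = "stmlops$resource_sufix"

# User-assigned managed identity attached to the compute clusters
$managed_identity_mlgroup = "id-mlops-$resource_sufix"

# App registrations used later for GitHub Actions (one per environment)
$githubapp_dev  = "github-mlops-dev-$resource_sufix"
$githubapp_test = "github-mlops-test-$resource_sufix"
$githubapp_prod = "github-mlops-prod-$resource_sufix"
```

Deriving every name from $resource_sufix keeps the resources of one demo run easy to identify and avoids name clashes when you repeat the demo.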


Execute the PS script to set the environment variables

. .\scripts\env.ps1

Check at least one of the variables to make sure the environment variables are set

Write-Output $resource_sufix
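To check all variables at once instead of one by one, a small loop can help. This is a sketch: the list of names is collected from the commands in this guide and is an assumption about what env.ps1 sets.

```powershell
# Names of the variables this guide relies on (taken from the commands in this document).
$required = @(
    'resource_sufix', 'subscriptionId', 'resource_region',
    'resource_group_ml', 'resource_group_stg',
    'workspace01', 'workspace02', 'workspace03',
    'storage_name', 'managed_identity_mlgroup',
    'githubapp_dev', 'githubapp_test', 'githubapp_prod'
)

# Collect every name whose variable is missing or empty in the current session.
$missing = @($required | Where-Object {
    [string]::IsNullOrWhiteSpace((Get-Variable -Name $_ -ValueOnly -ErrorAction SilentlyContinue))
})

if ($missing.Count -gt 0) {
    Write-Warning "Unset variables: $($missing -join ', ')"
} else {
    Write-Output "All required variables are set."
}
```

If the warning lists any name, dot-source the script again from the root folder and re-run the check.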


Set the default subscription id

az account set --subscription $subscriptionId

Create the ML resource group you will use in this demo

az group create -l $resource_region -n $resource_group_ml

Create the 3 AML Workspaces that you need for this demo (Dev, Test and Prod)

01 - Create Dev Workspace

az ml workspace create --resource-group $resource_group_ml --name $workspace01 --location $resource_region --display-name "Dev Workspace"

02 - Create Test Workspace

az ml workspace create --resource-group $resource_group_ml --name $workspace02 --location $resource_region --display-name "Test Workspace"

03 - Create Prod Workspace

az ml workspace create --resource-group $resource_group_ml --name $workspace03 --location $resource_region --display-name "Prod Workspace"

You should see the three workspaces in your resource group after this step


Create a Storage Account

Create the resource group for the storage account

az group create -l $resource_region -n $resource_group_stg

Create a storage account

az storage account create --name $storage_name --resource-group $resource_group_stg --location $resource_region --sku Standard_ZRS --kind StorageV2 --enable-hierarchical-namespace true

Important: Storage account names must be globally unique. Make sure to use a different suffix for each new demo.
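Besides being unique, a storage account name must be 3 to 24 characters long and contain only lowercase letters and digits. The helper below is a sketch for checking that rule locally before calling Azure; Test-StorageAccountName is a hypothetical function name, not part of the repository.

```powershell
# Returns $true when the name satisfies the storage account naming rule:
# 3 to 24 characters, lowercase letters and digits only.
function Test-StorageAccountName {
    param([Parameter(Mandatory)][string]$Name)
    # -cmatch is case-sensitive; the default -match would accept uppercase letters.
    return $Name -cmatch '^[a-z0-9]{3,24}$'
}

Test-StorageAccountName -Name 'stmlopsdemo01'   # valid name
Test-StorageAccountName -Name 'St-MLOps-Demo'   # invalid: uppercase letters and hyphens

# Global availability needs a call to Azure (requires az login):
# az storage account check-name --name $storage_name --query nameAvailable -o tsv
```

The local check catches formatting mistakes early, while az storage account check-name reports whether another account already uses the name.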

Create a User Managed Identity

Execute the command below. It stores the ID of the managed identity in the $managed_identity_id variable

$managed_identity_id=$(az identity create  -n $managed_identity_mlgroup --query id -o tsv -g $resource_group_ml)

Create Compute in all AML workspaces


az ml compute create -f ./compute/computedev.yml --workspace-name $workspace01 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id


az ml compute create -f ./compute/computetest.yml --workspace-name $workspace02 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id


az ml compute create -f ./compute/computeprod.yml --workspace-name $workspace03 --resource-group $resource_group_ml --identity-type user_assigned --user-assigned-identities $managed_identity_id

Grant access on the Storage Account you will use for the demo:

$storage_acc_id=$(az storage account show --name $storage_name --resource-group $resource_group_stg --query id -o tsv)

$managed_identity_principal_id=$(az identity show --name $managed_identity_mlgroup --resource-group $resource_group_ml --query principalId -o tsv)

az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $managed_identity_principal_id --scope $storage_acc_id


Grant access to the managed identities of the AML workspaces:

$workspace01spID=$(az resource list -n $workspace01 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)
$workspace02spID=$(az resource list -n $workspace02 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)
$workspace03spID=$(az resource list -n $workspace03 --resource-group $resource_group_ml --query [*].identity.principalId --out tsv)

az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace01spID --scope $storage_acc_id
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace02spID --scope $storage_acc_id
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $workspace03spID --scope $storage_acc_id

Also grant access to your own user account

$selfid=$(az ad signed-in-user show --query id -o tsv)
az role assignment create --role "Storage Blob Data Owner" --assignee-object-id $selfid --scope $storage_acc_id
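To confirm that the assignments were created, you can list them at the storage account scope. This is a sketch that reuses the $storage_acc_id variable set above; role assignments can take a short time to appear.

```powershell
# List who holds which role on the storage account.
# Five entries are expected after the steps above: the user-assigned identity,
# the three workspace identities, and your own user.
az role assignment list --scope $storage_acc_id --query "[].{principalId:principalId, role:roleDefinitionName}" --output table
```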


Create the containers in the Storage Account

az storage container create --name mlopsdemodev --account-name $storage_name --resource-group $resource_group_stg
az storage container create --name mlopsdemotest --account-name $storage_name --resource-group $resource_group_stg
az storage container create --name mlopsdemoprod --account-name $storage_name --resource-group $resource_group_stg


Upload the CSV file used in the batch deployment, together with the sample request JSON file, to the proper directories in the test container

az storage azcopy blob upload -c mlopsdemotest --account-name $storage_name -s "data/taxi-batch.csv" -d "taxibatch/taxi-batch.csv"

az storage azcopy blob upload -c mlopsdemotest --account-name $storage_name -s "data/taxi-request.json" -d "taxioutput/taxi-request.json"


Repeat for prod container

az storage azcopy blob upload -c mlopsdemoprod --account-name $storage_name -s "data/taxi-batch.csv" -d "taxibatch/taxi-batch.csv"

az storage azcopy blob upload -c mlopsdemoprod --account-name $storage_name -s "data/taxi-request.json" -d "taxioutput/taxi-request.json"

1) Dev Workspace Steps

In this step you will run a job in the Dev workspace and register a model. In the following steps, this model is transferred to the Test and Prod workspaces.

Create AML Environment

az ml environment create --file ./dev/train-env.yml --workspace-name $workspace01 --resource-group $resource_group_ml

Pipeline run

az ml job create --file ./dev/pipeline.yml --resource-group $resource_group_ml --workspace-name $workspace01

After this command, a pipeline will be triggered in the Dev workspace. The result of this execution is a model being registered in the Dev workspace.
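If you prefer to follow the run from the terminal, the sketch below is an alternative to the plain az ml job create call above: it captures the job name, streams the logs, and then lists the registered model versions. The model name taxi-model-mlops-demo is the one used by the download step later in this guide.

```powershell
# Submit the pipeline and keep the generated job name.
$job_name = $(az ml job create --file ./dev/pipeline.yml --resource-group $resource_group_ml --workspace-name $workspace01 --query name -o tsv)

# Stream the job logs until the pipeline finishes.
az ml job stream --name $job_name --resource-group $resource_group_ml --workspace-name $workspace01

# List the versions registered for the model.
az ml model list --name taxi-model-mlops-demo --resource-group $resource_group_ml --workspace-name $workspace01 --query "[].version" -o tsv
```

The later steps download version 1 of the model, so if you run the pipeline more than once, check which version you actually want to promote.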



2) Test Workspace Steps

Create AML Environment

az ml environment create --file ./test/test-env.yml --workspace-name $workspace02 --resource-group $resource_group_ml

Create datastore and data asset


az ml datastore create --file ./test/data-store.yml --workspace-name $workspace02 --resource-group $resource_group_ml --set account_name=$storage_name

Data Asset

az ml data create -f ./test/file-data-asset.yml --workspace-name $workspace02 --resource-group $resource_group_ml

Download model from Dev Workspace

az ml model download --name taxi-model-mlops-demo --version 1 --resource-group $resource_group_ml --workspace-name $workspace01 --download-path ./model

Register model on Test Workspace

az ml model create --name taxi-test-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group $resource_group_ml --workspace-name $workspace02

Register Batch Endpoint

$endpoint_name_test = "taxifare-b-mldemo-t-$resource_sufix"

az ml batch-endpoint create --file ./test/batch-endpoint-test.yml --resource-group $resource_group_ml --workspace-name $workspace02 --set name=$endpoint_name_test

Register Batch Deployment

az ml batch-deployment create --file ./test/batch-deployment-test.yml --resource-group $resource_group_ml --workspace-name $workspace02 --set endpoint_name=$endpoint_name_test

Execute Batch Job

az ml batch-endpoint invoke --name $endpoint_name_test --deployment-name batch-dp-mlopsdemo-test  --input-type uri_file --input azureml://datastores/mlopsdemotestcointainer/paths/taxibatch/taxi-batch.csv --resource-group $resource_group_ml --workspace-name $workspace02 --output-path azureml://datastores/mlopsdemotestcointainer/paths/taxioutput

This command invokes a job that uses the model deployed in the Test workspace to score the data in taxi-batch.csv and writes the results to the taxioutput folder in the test container





Now you can verify the results and analyze the performance of the model using shadow production data.
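One way to check that the job wrote its output is to list the blobs under the taxioutput folder. This sketch assumes that the datastore used above points to the mlopsdemotest container and relies on the Storage Blob Data Owner role granted to your user earlier.

```powershell
# List the files under taxioutput in the test container, authenticating with your az login identity.
az storage blob list --container-name mlopsdemotest --account-name $storage_name --prefix taxioutput --auth-mode login --query "[].name" -o tsv
```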

3) Prod Steps - Workspace 03 (Prod)

Create Environment

az ml environment create --file ./prod/prod-env.yml --workspace-name $workspace03 --resource-group $resource_group_ml

Create datastore and data asset


az ml datastore create --file ./prod/data-store.yml --workspace-name $workspace03 --resource-group $resource_group_ml --set account_name=$storage_name

Data Asset

az ml data create -f ./prod/file-data-asset.yml --workspace-name $workspace03 --resource-group $resource_group_ml

Download model from Dev Workspace

Already done in Test step.

Register Model

az ml model create --name taxi-prod-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group $resource_group_ml --workspace-name $workspace03

Register Batch Endpoint

$endpoint_name_prod = "taxifare-b-mldemo-p-$resource_sufix"

az ml batch-endpoint create --file ./prod/batch-endpoint-prod.yml --resource-group $resource_group_ml --workspace-name $workspace03 --set name=$endpoint_name_prod

Register Batch Deployment

az ml batch-deployment create --file ./prod/batch-deployment-prod.yml --resource-group $resource_group_ml --workspace-name $workspace03 --set endpoint_name=$endpoint_name_prod

Execute Batch Job

az ml batch-endpoint invoke --name $endpoint_name_prod --deployment-name batch-dp-mlopsdemo-prod --input-type uri_file --input azureml://datastores/mlopsdemoprodcointainer/paths/taxibatch/taxi-batch.csv --resource-group $resource_group_ml --workspace-name $workspace03 --output-path azureml://datastores/mlopsdemoprodcointainer/paths/taxioutput

In this demo, the Test and Prod workspaces are expected to produce the same results, because both containers hold the same file. In a real scenario, the file in the prod container would be actual production data, while the file in the test container would be shadow production data, that is, a sample of real data selected to test the model.

In a real use case, the Development, Test and Production environments would each work with a different dataset.

GitHub Actions

Setup GitHub Authentication

Create application and service principal

You need to create an Azure Active Directory application and service principal, and then assign the application a role (scoped here to the ML resource group) so that your workflow has access to your resources.

You will create one Service Principal per environment


az ad app create --display-name $githubapp_dev
az ad app create --display-name $githubapp_test
az ad app create --display-name $githubapp_prod

$githubapp_dev_cid=$(az ad app list --display-name $githubapp_dev --query [*].appId -o tsv)
$githubapp_dev_oid=$(az ad app list --display-name $githubapp_dev --query [*].id -o tsv)
az ad sp create --id $githubapp_dev_cid

$githubapp_dev_assigneeid=$(az ad sp show --id $githubapp_dev_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id  $githubapp_dev_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml

$githubapp_test_cid=$(az ad app list --display-name $githubapp_test --query [*].appId -o tsv)
$githubapp_test_oid=$(az ad app list --display-name $githubapp_test --query [*].id -o tsv)
az ad sp create --id $githubapp_test_cid

$githubapp_test_assigneeid=$(az ad sp show --id $githubapp_test_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id  $githubapp_test_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml

$githubapp_prod_cid=$(az ad app list --display-name $githubapp_prod --query [*].appId -o tsv)
$githubapp_prod_oid=$(az ad app list --display-name $githubapp_prod --query [*].id -o tsv)
az ad sp create --id $githubapp_prod_cid

$githubapp_prod_assigneeid=$(az ad sp show --id $githubapp_prod_cid --query id -o tsv)
az role assignment create --role contributor --subscription $subscriptionId --assignee-object-id  $githubapp_prod_assigneeid --assignee-principal-type ServicePrincipal --scope /subscriptions/$subscriptionId/resourceGroups/$resource_group_ml

Set your GitHub account name and the repository name as variables

Replace the values with your GitHub account and repository
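The federated credential commands in the next step read $github_org and $github_repo. A minimal sketch of setting them, with placeholder values you need to replace:

```powershell
# Placeholders: use your own GitHub account (or organization) and the name of your fork.
$github_org  = "<your-github-account>"
$github_repo = "<your-forked-repository>"
```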


Configure the GitHub connection
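The az rest calls in this section post to $devgraphuri, $testgraphuri and $prodgraphuri, which need to be defined first. The sketch below assumes they point to the Microsoft Graph federatedIdentityCredentials endpoint of each application, addressed by the object IDs stored earlier in the $githubapp_*_oid variables; the v1.0 API version is an assumption.

```powershell
# Microsoft Graph endpoint for the federated credentials of each app registration.
$devgraphuri  = "https://graph.microsoft.com/v1.0/applications/$githubapp_dev_oid/federatedIdentityCredentials"
$testgraphuri = "https://graph.microsoft.com/v1.0/applications/$githubapp_test_oid/federatedIdentityCredentials"
$prodgraphuri = "https://graph.microsoft.com/v1.0/applications/$githubapp_prod_oid/federatedIdentityCredentials"
```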

$devgraphbody="{'name':'GitHubDevDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Dev','description':'Development Environment','audiences':['api://AzureADTokenExchange']}"

az rest --method POST --uri $devgraphuri --body $devgraphbody

After this step, you will see the credential configured in the Azure portal under App registrations. Select the application you just created and select Certificates & secrets from the menu on the left, as shown in the screenshot below:


Repeat this step for the Test and Prod Apps

$testgraphbody="{'name':'GitHubTestDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Test','description':'Test Environment','audiences':['api://AzureADTokenExchange']}"

az rest --method POST --uri $testgraphuri --body $testgraphbody

$prodgraphbody="{'name':'GitHubProdDeploy','issuer':'https://token.actions.githubusercontent.com','subject':'repo:$github_org/${github_repo}:environment:Prod','description':'Prod Environment','audiences':['api://AzureADTokenExchange']}"

az rest --method POST --uri $prodgraphuri --body $prodgraphbody
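To confirm from the terminal that each application now has its federated credential, you can list them. This sketch assumes a recent Azure CLI version that includes the az ad app federated-credential command group.

```powershell
# Show the name and subject of the federated credentials on each app registration.
az ad app federated-credential list --id $githubapp_dev_oid  --query "[].{name:name, subject:subject}" -o table
az ad app federated-credential list --id $githubapp_test_oid --query "[].{name:name, subject:subject}" -o table
az ad app federated-credential list --id $githubapp_prod_oid --query "[].{name:name, subject:subject}" -o table
```

The subject must match the repository and environment name exactly (for example repo:<account>/<repository>:environment:Dev), otherwise the login step in the workflow fails.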

Configure your GitHub

Create the Environments in your GitHub repository

This step is required to build an end-to-end GitHub Actions workflow


Under Environment secrets, create secrets for AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_SUBSCRIPTION_ID


Get the client and tenant values from App registrations in the Azure portal. Also get your subscription ID


Also create a RESOURCE_GROUP secret and a WORKSPACE_NAME secret in each environment (Dev, Test and Prod), holding the resource group name and the name of that environment's workspace


In Dev, use the value of the variable $workspace01, in Test $workspace02, and in Prod $workspace03

IMPORTANT: Also create a WORKSPACE_NAME_DEV secret in Test and Prod, as you will need it to download the model from Dev and register it in the proper environment


Use the value from parameter $workspace01
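As an alternative to creating the secrets in the browser, they can be set with the GitHub CLI. This is a sketch under several assumptions: the gh CLI is installed and authenticated (the guide itself does not use it), the Dev, Test and Prod environments already exist in the repository, and the secret names match the ones read by the workflow in this repository.

```powershell
$tenant_id = $(az account show --query tenantId -o tsv)
$repo      = "$github_org/$github_repo"

# One entry per GitHub environment: its app registration client ID and its workspace.
$envs = @(
    @{ Name = 'Dev';  ClientId = $githubapp_dev_cid;  Workspace = $workspace01 },
    @{ Name = 'Test'; ClientId = $githubapp_test_cid; Workspace = $workspace02 },
    @{ Name = 'Prod'; ClientId = $githubapp_prod_cid; Workspace = $workspace03 }
)

foreach ($e in $envs) {
    gh secret set AZURE_CLIENT_ID       --env $e.Name --repo $repo --body $e.ClientId
    gh secret set AZURE_TENANT_ID       --env $e.Name --repo $repo --body $tenant_id
    gh secret set AZURE_SUBSCRIPTION_ID --env $e.Name --repo $repo --body $subscriptionId
    gh secret set RESOURCE_GROUP        --env $e.Name --repo $repo --body $resource_group_ml
    gh secret set WORKSPACE_NAME        --env $e.Name --repo $repo --body $e.Workspace

    # Test and Prod also need the Dev workspace name to download the model.
    if ($e.Name -ne 'Dev') {
        gh secret set WORKSPACE_NAME_DEV --env $e.Name --repo $repo --body $workspace01
    }
}
```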

Workflow Sample

The following workflow sample is configured in this repository. It runs a pipeline in the Dev workspace, then downloads the model and registers it in the Test and Prod environments. You can improve it by adding other actions, such as invoking the endpoints and evaluating the results of the batch invocation.

on:
  push:
    paths:
      - 'data-science/**'

permissions:
  id-token: write
  contents: read

jobs:
  Pipeline-Dev:
    runs-on: ubuntu-latest
    environment: Dev
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: login
      uses: azure/login@v1
      with:
        client-id: ${{ secrets.AZURE_CLIENT_ID }}
        tenant-id: ${{ secrets.AZURE_TENANT_ID }}
        subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

    - name: Setup Azure ML Cli
      run: bash
      working-directory: scripts
    - name: Execute ML Pipeline
      run: az ml job create --file ./dev/pipeline.yml --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}

  Promote-to-Test:
    runs-on: ubuntu-latest
    environment: Test
    needs: [Pipeline-Dev]
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: login
      uses: azure/login@v1
      with:
        client-id: ${{ secrets.AZURE_CLIENT_ID }}
        tenant-id: ${{ secrets.AZURE_TENANT_ID }}
        subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

    - name: Setup Azure ML Cli
      run: bash
      working-directory: scripts

    - name: Download Model
      run: az ml model download --name taxi-model-mlops-demo --version 1 --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME_DEV }} --download-path ./model
    - name: Register Model Test
      run: az ml model create --name taxi-test-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}

  Promote-to-Prod:
    runs-on: ubuntu-latest
    environment: Prod
    needs: [Pipeline-Dev,Promote-to-Test]
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: login
      uses: azure/login@v1
      with:
        client-id: ${{ secrets.AZURE_CLIENT_ID }}
        tenant-id: ${{ secrets.AZURE_TENANT_ID }}
        subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

    - name: Setup Azure ML Cli
      run: bash
      working-directory: scripts
    - name: Download Model
      run: az ml model download --name taxi-model-mlops-demo --version 1 --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME_DEV }} --download-path ./model
    - name: Register Model Prod
      run: az ml model create --name taxi-prod-model-mlops-demo --version 1 --path ./model/taxi-model-mlops-demo --resource-group ${{ secrets.RESOURCE_GROUP }} --workspace-name ${{ secrets.WORKSPACE_NAME }}

This workflow is configured to trigger when you change the Python code inside the data-science folder

Configuring Manual Approvals

For the Test and Prod environments, configure the Environment protection rules. Add at least one user under Required reviewers



This way, the MLOps process will require a review before moving the model to Test and later to Prod

As you submit an Action and a manual approval is required, you will receive an email requesting approval


Click Review deployments to approve and release the task


Do the same for Production environment



WE DID IT!!!

If you have followed all the steps correctly up to this point, your MLOps setup is working. Now it's time to improve your repository based on your needs

Coming Soon

Feathr Feature Store


Feature store motivation

With the advance of AI and machine learning, companies start to use complex machine learning pipelines in various applications, such as recommendation systems, fraud detection, and more. These complex systems usually require hundreds to thousands of features to support time-sensitive business applications, and the feature pipelines are maintained by different team members across various business groups.

In these machine learning systems, we see many problems that consume lots of energy of machine learning engineers and data scientists, in particular duplicated feature engineering, online-offline skew, and feature serving with low latency.


Reference: Feathr: LinkedIn’s feature store is now available on Azure


