MLOps V2 Computer Vision Quickstart

This repository is a prebuilt project demonstrating a computer vision MLOps scenario using Azure Machine Learning and GitHub workflows.

This project was generated using the MLOps v2 Solution Accelerator using the following project generation parameters:

Parameter	Value
Project Type:	cv
Azure ML Interface:	aml-cli-v2
CI/CD Platform	github-actions
IAC Provider	terraform

The project is organized according to the table:

Location	Contents
`.github/workflows/`	GitHub workflows for infrastructure, training pipeline, and model deployment
`data/`	Sample request data to test the deployed endpoint
`data-science/`	Source code for model training
`images/`	Images for this README.md
`infrastructure/`	Terraform templates for Azure ML infrastructure
`mlops/azureml/`	Azure ML training and deployment pipelines

Below is a quickstart to deploying this prebuilt project. Refer to the MLOps v2 Solution Accelerator project for more comprehensive documentation and deployment guides for bootstrapping your own MLOps projects.

Steps to Deploy

Clone this repository to your own GitHub organization and follow the steps below to deploy the demo.

Create an Azure Service Principal and configure GitHub actions secrets.
Configure dev and/or prod environments and create a dev branch.
Use a GitHub workflow to create Azure ML infrastructure for dev and/or prod environments.
Use a GitHub workflow to create and run a pytorch vision model training pipeline in Azure ML.
Use a GitHub workflow to deploy the vision model as a real-time endpoint in Azure ML.

Configure GitHub Actions Secrets

This step creates a service principal and GitHub secrets to allow the GitHub action workflows to create and interact with Azure Machine Learning Workspace resources.

From the command line, execute the following Azure CLI command with your choice of a service principal name:

# az ad sp create-for-rbac --name <service_principal_name> --role contributor --scopes /subscriptions/<subscription_id> --sdk-auth

You will get output similar to below:

{
"clientId": "<service principal client id>",
"clientSecret": "<service principal client secret>",
"subscriptionId": "<Azure subscription id>",
"tenantId": "<Azure tenant id>",
"activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
"resourceManagerEndpointUrl": "https://management.azure.com/",
"activeDirectoryGraphResourceId": "https://graph.windows.net/",
"sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
"galleryEndpointUrl": "https://gallery.azure.com/",
"managementEndpointUrl": "https://management.core.windows.net/"
}

Copy all of this output, braces included. From your GitHub project, select Settings:

Then select Secrets, then Actions:

Select New repository secret. Name this secret AZURE_CREDENTIALS and paste the service principal output as the content of the secret. Select Add secret.

Note:
As this infrastructure is deployed using terraform, add the following additional GitHub secrets using the corresponding values from the service principal output as the content of the secret:

ARM_CLIENT_ID
ARM_CLIENT_SECRET
ARM_SUBSCRIPTION_ID
ARM_TENANT_ID

The GitHub configuration is complete.

Configure Azure ML Environment Parameters

In your Github project repository, there are two configuration files in the root, config-infra-dev.yml and config-infra-prod.yml. These files are used to define and deploy Dev and Prod Azure Machine Learning environments. With the default deployment, config-infra-prod.yml will be used when working with the main branch or your project and config-infra-dev.yml will be used when working with any non-main branch.

It is recommended to first create a dev branch from main and deploy this environment first.

Edit each file to configure a namespace, postfix string, Azure location, and environment for deploying your Dev and Prod Azure ML environments. Default values and settings in the files are show below:

namespace: mlopsv2 #maximum of 6 characters.  
postfix: 0001  
location: eastus  
environment: dev  
enable_aml_computecluster: true  
enable_monitoring: false

The first four values are used to create globally unique names for your Azure environment and contained resources. Edit these values to your liking then save, commit, push, or pr to update these files in the project repository. Leave enable_monitoring set to false for this demo.

As this is a deep learning workload, ensure your subscription and Azure location has available GPU compute.

Deploy Azure Machine Learning Infrastructure

In your GitHub project repository, select Actions

This will display the pre-defined GitHub workflows associated with your project. For a classical machine learning project, the available workflows will look similar to this:

Depending on the the use case, available workflows may vary. Select the workflow to 'deploy-infra'. In this scenario, the workflow to select would be tf-gha-deploy-infra.yml. This would deploy the Azure ML infrastructure using GitHub Actions and Terraform.

On the right side of the page, select Run workflow and select the branch to run the workflow on. This will deploy Dev Infrastructure if run on a dev branch or Prod infrastructure if running on main. Monitor the pipeline for successful completion.

When the pipeline has complete successfully, you can find your Azure ML Workspace and associated resources by logging in to the Azure Portal.

Next, a sample model training pipeline will be deployed into the new Azure Machine Learning environment.

Train a Pytorch Classifier in Azure Machine Learning

The solution accelerator includes code and data for a sample machine learning pipeline which trains a dog breed classifier on the Stanford Dogs Dataset.

The https://github.com/sdonohoo/mlops-cv-demo/blob/main/.github/workflows/deploy-model-training-pipeline.ymlGitHub workflow creates both cpu and gpu-based compute clusters in the Azure ML workspace. The cpu cluster is used for the data registration job that downloads and registers the training image dataset in Azure ML. The gpu cluster is used by an Azure ML pipeline with a single component that trains and registers a pytorch classifier model on this dataset.

To deploy the model training pipeline in the previously created Azure ML workspace, select Actions in your GitHub project repository.

Then select the deploy-cv-model-training-pipeline.

As before, select Run workflow on the right and select the branch to run from. This will run the workflow, create compute clusters, register the dataset, and deploy the training pipeline in Azure ML.

Once the run-model-training-pipeline job begins running, you can follow the execution of this job in the Azure ML workspace.

When the Azure ML pipeline completes, the trained model should be registered in the workspace.

Next, the registered model will be deployed to a real-time endpoint for classifying new images.

Deploy the Registered Model to an Online Endpoint

This step uses a GitHub workflow to deploy the registered model to an Azure ML Managed Online Endpoint for predicting the class of new images.

This workflow will register the interference environment with prerequisite python packages, create an Azure ML endpoint, create a deployment of the registered model to that endpoint, then allocate traffic to the endpoint.

To run the workflow to deploy the registered model as a managed online endpoint, select Actions in your GitHub project repository.

Then select the deploy-online-endpoint-pipeline.

Run the workflow.

As the create-endpoint job begins, you can monitor the ednpoint creation, deployment creation, and traffic allocation in the Azure ML workspace.

Next Steps

Explore the construction of the GitHub workflows to deploy infastructure and deploy pipelines and register artifacts in Azure ML.
Explore the python code in data-science/ and Azure ML pipeline definitions in mlops/azureml/ to understand how to adapt your own computer vision code to this pattern.
Configure the GitHub repository and workflows to fit your MLOps workflows and policies utilizing prod/dev branches, branch protection, and pull requests.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.github/workflows		.github/workflows
data-science		data-science
data		data
images		images
infrastructure		infrastructure
mlops/azureml		mlops/azureml
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
SUPPORT.md		SUPPORT.md
config-infra-dev.yml		config-infra-dev.yml
config-infra-prod.yml		config-infra-prod.yml
environment.yml		environment.yml

License

Azure/mlops-v2-cv-demo

Folders and files

Latest commit

History

Repository files navigation

MLOps V2 Computer Vision Quickstart

Steps to Deploy

Configure GitHub Actions Secrets

Configure Azure ML Environment Parameters

Deploy Azure Machine Learning Infrastructure

Train a Pytorch Classifier in Azure Machine Learning

Deploy the Registered Model to an Online Endpoint

Next Steps

Contributing

Trademarks

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Languages