Merge pull request #11866 from MicrosoftDocs/learn-build-service-prod…
…bot/docutune-autopr-20240415-050706-1280128-ignore-build

[DocuTune-Remediation] - DocuTune scheduled execution in AAC (part 4)
prmerger-automator[bot] committed Apr 15, 2024
2 parents acaee11 + 2ccd706 commit 171d0a8
Showing 2 changed files with 13 additions and 13 deletions.
10 changes: 5 additions & 5 deletions docs/ai-ml/guide/mlops-maturity-model-content.md
@@ -4,11 +4,11 @@ The purpose of this maturity model is to help clarify the Machine Learning Opera

The MLOps maturity model helps clarify the Development Operations (DevOps) principles and practices necessary to run a successful MLOps environment. It's intended to identify gaps in an existing organization's attempt to implement such an environment. It's also a way to show you how to grow your MLOps capability in increments rather than overwhelm you with the requirements of a fully mature environment. Use it as a guide to:

-* Estimate the scope of the work for new engagements.
+- Estimate the scope of the work for new engagements.

-* Establish realistic success criteria.
+- Establish realistic success criteria.

-* Identify deliverables you'll hand over at the conclusion of the engagement.
+- Identify deliverables you'll hand over at the conclusion of the engagement.

As with most maturity models, the MLOps maturity model qualitatively assesses people/culture, processes/structures, and objects/technology. As the maturity level increases, the probability increases that incidents or errors will lead to improvements in the quality of the development and production processes.

@@ -28,13 +28,13 @@ The tables that follow identify the detailed characteristics for that level of p

| People | Model Creation | Model Release | Application Integration |
| ------ | -------------- | ------------- | ----------------------- |
-| <ul><li>Data scientists: siloed, not in regular communications with the larger team<li>Data engineers (_if exists_): siloed, not in regular communications with the larger team<li>Software engineers: siloed, receive model remotely from the other team members</ul> | <ul><li>Data gathered manually<li>Compute is likely not managed<li>Experiments aren't predictably tracked<li>End result may be a single model file manually handed off with inputs/outputs</ul> | <ul><li>Manual process<li>Scoring script may be manually created well after experiments, not version controlled<li>Release handled by data scientist or data engineer alone</ul> | <ul><li>Heavily reliant on data scientist expertise to implement<li>Manual releases each time</ul> |
+| <ul><li>Data scientists: siloed, not in regular communications with the larger team<li>Data engineers (*if exists*): siloed, not in regular communications with the larger team<li>Software engineers: siloed, receive model remotely from the other team members</ul> | <ul><li>Data gathered manually<li>Compute is likely not managed<li>Experiments aren't predictably tracked<li>End result might be a single model file manually handed off with inputs/outputs</ul> | <ul><li>Manual process<li>Scoring script might be manually created well after experiments, not version controlled<li>Release handled by data scientist or data engineer alone</ul> | <ul><li>Heavily reliant on data scientist expertise to implement<li>Manual releases each time</ul> |

## Level 1: DevOps no MLOps

| People | Model Creation | Model Release | Application Integration |
| ------ | -------------- | ------------- | ----------------------- |
-| <ul><li>Data scientists: siloed, not in regular communications with the larger team<li>Data engineers (if exists): siloed, not in regular communication with the larger team<li>Software engineers: siloed, receive model remotely from the other team members</ul> | <ul><li>Data pipeline gathers data automatically<li>Compute is or isn't managed<li>Experiments aren't predictably tracked<li>End result may be a single model file manually handed off with inputs/outputs</ul> | <ul><li>Manual process<li>Scoring script may be manually created well after experiments, likely version controlled<li>Is handed off to software engineers</ul> | <ul><li>Basic integration tests exist for the model<li>Heavily reliant on data scientist expertise to implement model<li>Releases automated<li>Application code has unit tests</ul> |
+| <ul><li>Data scientists: siloed, not in regular communications with the larger team<li>Data engineers (if exists): siloed, not in regular communication with the larger team<li>Software engineers: siloed, receive model remotely from the other team members</ul> | <ul><li>Data pipeline gathers data automatically<li>Compute is or isn't managed<li>Experiments aren't predictably tracked<li>End result might be a single model file manually handed off with inputs/outputs</ul> | <ul><li>Manual process<li>Scoring script might be manually created well after experiments, likely version controlled<li>Is handed off to software engineers</ul> | <ul><li>Basic integration tests exist for the model<li>Heavily reliant on data scientist expertise to implement model<li>Releases automated<li>Application code has unit tests</ul> |

## Level 2: Automated Training

16 changes: 8 additions & 8 deletions docs/ai-ml/guide/mlops-python-content.md
@@ -32,13 +32,13 @@ This architecture consists of the following services:

## MLOps Pipeline

-This solution demonstrates end-to-end automation of various stages of an AI project using tools that are already familiar to software engineers. The machine learning problem is simple to keep the focus on the DevOps pipeline. The solution uses the [scikit-learn diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html) and builds a ridge linear regression model to predict the likelihood of diabetes.
+This solution demonstrates end-to-end automation of various stages of an AI project using tools that are already familiar to software engineers. The machine learning problem is simple to keep the focus on the DevOps pipeline. The solution uses the [scikit-learn diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html) and builds a ridge linear regression model to predict the likelihood of diabetes.
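
The training step the pipelines automate reduces to a few lines of scikit-learn. The following is an illustrative sketch only; the output file name and the alpha value are assumptions, not the repository's actual training script:

```python
# Illustrative sketch: train a ridge regression model on the scikit-learn
# diabetes dataset. The output file name and alpha value are assumptions,
# not the solution's actual training code.
import joblib
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = Ridge(alpha=0.5)
model.fit(X_train, y_train)

mse = mean_squared_error(y_test, model.predict(X_test))
print(f"Test MSE: {mse:.2f}")

joblib.dump(model, "sklearn_regression_model.pkl")  # artifact consumed by later pipeline stages
```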

This solution is based on the following three pipelines:

-- **Build pipeline**. Builds the code and runs a suite of tests.
-- **Retraining pipeline**. Retrains the model on a schedule or when new data becomes available.
-- **Release pipeline**. Operationalizes the scoring image and promotes it safely across different environments.
+- **Build pipeline.** Builds the code and runs a suite of tests.
+- **Retraining pipeline.** Retrains the model on a schedule or when new data becomes available.
+- **Release pipeline.** Operationalizes the scoring image and promotes it safely across different environments.

The next sections describe each of these pipelines.

@@ -72,7 +72,7 @@ This pipeline covers the following steps:

- **Evaluate model.** A simple evaluation test compares the new model with the existing model. Only when the new model is better does it get promoted. Otherwise, the model is not registered and the pipeline is canceled.

-- **Register model.** The retrained model is registered with the [Azure ML Model registry](/azure/machine-learning/service/concept-azure-machine-learning-architecture). This service provides version control for the models along with metadata tags so they can be easily reproduced.
+- **Register model.** The retrained model is registered with the [Azure Machine Learning Model registry](/azure/machine-learning/service/concept-azure-machine-learning-architecture). This service provides version control for the models along with metadata tags so they can be easily reproduced.
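
As a rough illustration of the evaluate/register pattern described above, the following sketch assumes the Azure Machine Learning Python SDK v1 (azureml-core); the model name, tag names, and model path are hypothetical:

```python
# Minimal sketch of gating model registration on an evaluation result,
# assuming the Azure Machine Learning SDK v1 (azureml-core). Model name,
# tags, and file path are illustrative.
from azureml.core import Run
from azureml.core.model import Model

run = Run.get_context()
ws = run.experiment.workspace

new_mse = run.get_metrics().get("mse")  # metric logged by the training step

# Compare against the currently registered model; register only if better.
try:
    current = Model(ws, name="diabetes-ridge-model")
    current_mse = float(current.tags.get("mse", "inf"))
except Exception:
    current_mse = float("inf")  # no model registered yet

if new_mse is not None and new_mse < current_mse:
    Model.register(
        workspace=ws,
        model_path="outputs/sklearn_regression_model.pkl",
        model_name="diabetes-ridge-model",
        tags={"mse": str(new_mse), "run_id": run.id},
    )
else:
    run.parent.cancel()  # stop the enclosing pipeline run when the new model isn't better
```

The published solution's comparison and cancellation details may differ; the point is that registration is gated on the evaluation outcome.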

### Release pipeline

@@ -114,7 +114,7 @@ Ideally, have your build pipeline finish quickly and execute only unit tests and

The release pipeline publishes a real-time scoring web service. A release to the QA environment is done using Container Instances for convenience, but you can use another Kubernetes cluster running in the QA/staging environment.
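
The scoring web service wraps an entry script that follows the Azure Machine Learning init()/run() convention. A minimal sketch, assuming SDK v1 and a hypothetical registered model name:

```python
# Sketch of an init()/run() entry script for the real-time scoring web
# service (Azure Machine Learning SDK v1 convention). The registered model
# name and request shape are assumptions.
import json
import joblib
import numpy as np
from azureml.core.model import Model

model = None

def init():
    # Called once when the container starts; load the registered model.
    global model
    model_path = Model.get_model_path("diabetes-ridge-model")
    model = joblib.load(model_path)

def run(raw_data):
    # Called per request; expects {"data": [[...feature values...], ...]}.
    try:
        data = np.array(json.loads(raw_data)["data"])
        predictions = model.predict(data)
        return {"result": predictions.tolist()}
    except Exception as exc:
        return {"error": str(exc)}
```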

-Scale the production environment according to the size of your Azure Kubernetes Service cluster. The size of the cluster depends on the load you expect for the deployed scoring web service. For real-time scoring architectures, throughput is a key optimization metric. For non-deep learning scenarios, the CPU should be sufficient to handle the load; however, for deep learning workloads, when speed is a bottleneck, GPUs generally provide better performance compared to CPUs. Azure Kubernetes Service supports both CPU and GPU node types, which is the reason this solution uses it for image deployment. For more information, see [GPUs vs CPUs for deployment of deep learning models.](https://azure.microsoft.com/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/)
+Scale the production environment according to the size of your Azure Kubernetes Service cluster. The size of the cluster depends on the load you expect for the deployed scoring web service. For real-time scoring architectures, throughput is a key optimization metric. For non-deep learning scenarios, the CPU should be sufficient to handle the load; however, for deep learning workloads, when speed is a bottleneck, GPUs generally provide better performance compared to CPUs. Azure Kubernetes Service supports both CPU and GPU node types, which is the reason this solution uses it for image deployment. For more information, see [GPUs vs CPUs for deployment of deep learning models](https://azure.microsoft.com/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/).

Scale the retraining pipeline up and down depending on the number of nodes in your Azure Machine Learning Compute resource, and use the [autoscaling](/azure/machine-learning/service/how-to-set-up-training-targets#persistent) option to manage the cluster. This architecture uses CPUs. For deep learning workloads, GPUs are a better choice and are supported by Azure Machine Learning Compute.
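
A minimal sketch of provisioning such an autoscaling compute cluster with the Azure Machine Learning SDK v1 (azureml-core); the cluster name, VM size, and node counts are illustrative:

```python
# Sketch of creating an autoscaling Azure Machine Learning Compute cluster
# for the retraining pipeline, assuming azureml-core (SDK v1). Cluster name,
# VM size, and node counts are illustrative.
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS3_V2",           # CPU SKU; choose a GPU SKU for deep learning
    min_nodes=0,                          # scale to zero between retraining runs
    max_nodes=4,                          # upper bound for autoscaling
    idle_seconds_before_scaledown=1800,
)

cluster = ComputeTarget.create(ws, "train-cluster", config)
cluster.wait_for_completion(show_output=True)
```

Setting min_nodes to 0 lets the cluster scale down completely between retraining runs, keeping cost roughly proportional to use.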

@@ -142,11 +142,11 @@ To deploy this reference architecture, follow the steps described in the [Gettin

## Contributors

-*This article is maintained by Microsoft. It was originally written by the following contributors.*
+*This article is maintained by Microsoft. It was originally written by the following contributors.*

Principal author:

-- Praneet Singh Solanki | Senior Software Engineer
+- Praneet Singh Solanki | Senior Software Engineer

## Next steps

