Training, deploying, and running inference on a high volume of machine learning models is a common use case. One example of this scenario is forecasting, where an organization creates hundreds of thousands or millions of independent models to predict volumes or revenues at a hyper-specific level.
This scenario brings scalability challenges: executing a machine learning development lifecycle for hundreds of thousands of models requires a parallel and repeatable framework.
The goal of this repository is to provide a thorough example of a scalable and repeatable framework for training, deployment, and inference in a 'many models' architecture using Azure Machine Learning.
Follow the SDK notebooks in order to step through the process of developing and deploying repeatable, parallel training and inference with Azure Machine Learning.
There are three ways to define and run jobs in Azure Machine Learning: the Python SDK, the Azure ML CLI, and the Azure Machine Learning studio UI. The CLI is a convenient way to manage pipeline definitions as code and run them from a simple command-line interface, which makes it the ideal choice within CI/CD orchestration.
The cli_pipelines folder contains the YAML definition files for the training, inference, and evaluation pipelines that the notebooks implement with the Python SDK.
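As a rough sketch of what such a YAML definition contains (the compute cluster, environment, data asset, and script names below are illustrative assumptions, not this repository's actual files), a minimal CLI v2 pipeline might look like:

```yaml
# Illustrative Azure ML CLI v2 pipeline definition.
# Compute, environment, data asset, and paths are hypothetical placeholders.
$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: many-models-training
settings:
  default_compute: azureml:cpu-cluster        # assumed compute cluster name
jobs:
  train_many_models:
    type: command
    code: ../scripts                          # hypothetical path to training code
    command: python train.py --data ${{inputs.training_data}}
    environment: azureml:sklearn-env@latest   # assumed registered environment
    inputs:
      training_data:
        type: uri_folder
        path: azureml:oj_sales_data@latest    # assumed registered data asset
```

Keeping each pipeline as a file like this means the same definition can be versioned in source control and submitted unchanged from a CI/CD pipeline.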
You can run any of these from a terminal using the `az ml job create` command, for example:
`az ml job create -f 3_feedback_pipeline.yml`
The primary dataset is a subset of a dataset from the Dominick's / University of Chicago Booth repository, extended with simulated data so that many models can be trained simultaneously on Azure Machine Learning. This specific subset covers 33 stores and 3 brands of orange juice, for 99 total models.
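The 'many models' pattern amounts to partitioning the data by store and brand and fitting one independent model per partition. A minimal, framework-free sketch of that grouping, using a trivial per-group mean predictor as a stand-in for a real forecaster, might look like:

```python
from collections import defaultdict
from statistics import mean

def train_many_models(rows):
    """Fit one independent "model" per (store, brand) partition.

    `rows` is an iterable of (store, brand, quantity) records. The model
    here is just the partition's mean quantity -- a placeholder for a real
    forecaster trained on each group's history.
    """
    groups = defaultdict(list)
    for store, brand, quantity in rows:
        groups[(store, brand)].append(quantity)
    # One model per partition: 33 stores x 3 brands would yield 99 models.
    return {key: mean(values) for key, values in groups.items()}

# Hypothetical sample records, not drawn from the actual dataset.
sales = [
    (1, "tropicana", 120), (1, "tropicana", 80),
    (1, "minute_maid", 60),
    (2, "tropicana", 200),
]
models = train_many_models(sales)
print(models[(1, "tropicana")])  # -> 100
```

In the actual pipelines, this per-partition fan-out is what Azure Machine Learning parallelizes across compute, rather than running in a single process as above.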
The full 3,991-store dataset is available here: AzureML Open Datasets
Managing Pipeline and Component Inputs and Outputs
Query and Compare Experiment Runs with Azure MLflow
Azure ML Client Docs
MLflow Client Docs