| 1 | +# Customizing the Azure DevOps job container |
| 2 | + |
| 3 | +The Model training and deployment pipeline uses a Docker container |
| 4 | +on the Azure Pipelines agents to provide a reproducible environment |
| 5 | +to run test and deployment code. |
| 6 | +The container image
| 7 | +`mcr.microsoft.com/mlops/python:latest` is built with this |
| 8 | +[Dockerfile](../environment_setup/Dockerfile). |
| 9 | + |
| 10 | +In your project you will want to build your own |
| 11 | +Docker image that only contains the dependencies and tools required for your |
| 12 | +use case. This image will likely be smaller, and therefore faster to pull, and it
| 13 | +will be maintained entirely by your team.
| 14 | + |
| 15 | +## Provision an Azure Container Registry |
| 16 | + |
| 17 | +An Azure Container Registry is deployed alongside your Azure ML Workspace to manage models.
| 18 | +You can use that registry instance to store your MLOps container image as well, or |
| 19 | +provision a separate instance. |
| 20 | + |
| 21 | +## Create a Registry Service Connection |
| 22 | + |
| 23 | +[Create a service connection](https://docs.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints?view=azure-devops&tabs=yaml#sep-docreg) to your Azure Container Registry: |
| 24 | +- As *Connection type*, select *Docker Registry* |
| 25 | +- As *Registry type*, select *Azure Container Registry* |
| 26 | +- As *Azure container registry*, select your Container registry instance |
| 27 | +- As *Service connection name*, enter `acrconnection` |
| 28 | + |
| 29 | +## Update the environment definition |
| 30 | + |
| 31 | +Modify the [Dockerfile](../environment_setup/Dockerfile) and/or the |
| 32 | +[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) CI Conda |
| 33 | +environment definition to tailor your environment. |
| 34 | +Conda provides a [reusable environment for training and deployment with Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments). |
| 35 | +The Conda environment used for CI should use the same package versions as the Conda environment |
| 36 | +used for the Azure ML training and scoring environments (defined in [conda_dependencies.yml](../diabetes_regression/conda_dependencies.yml)). |
| 37 | +This enables you to run unit and integration tests using the exact same dependencies as used in the ML pipeline. |
| 38 | + |
| 39 | +If a package is available in a Conda package repository, then we recommend that |
| 40 | +you use the Conda installation rather than the pip installation. Conda packages |
| 41 | +typically come with prebuilt binaries that make installation more reliable. |
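
As a minimal sketch of that recommendation, a Conda environment file can list packages available from Conda directly under `dependencies` and reserve the nested `pip` section for packages that are only published on PyPI. The environment name, package names, and version numbers below are illustrative assumptions, not the contents of the repository's `ci_dependencies.yml`:

```yaml
# Hypothetical CI environment definition; names and versions are placeholders.
name: example_ci_environment
dependencies:
  # Installed from a Conda channel (prebuilt binaries).
  - python=3.7
  - scikit-learn=0.22
  - pip
  # Only packages that are not available from Conda go here.
  - pip:
      - example-pip-only-package==1.0.0
```

Pinning the same versions here and in `conda_dependencies.yml` keeps the CI tests aligned with the training and scoring environments.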
| 42 | + |
| 43 | +## Create a container build pipeline |
| 44 | + |
| 45 | +In your [Azure DevOps](https://dev.azure.com) project, create a new build
| 46 | +pipeline that refers to the
| 47 | +[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) |
| 48 | +pipeline definition in your forked repository. |
| 49 | + |
| 50 | +Edit the [environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) file |
| 51 | +and replace the string `'public/mlops/python'` with a name that describes your environment,
| 52 | +e.g. `'mlops/diabetes_regression'`. |
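
For orientation, a build-and-push step with the renamed repository might look roughly like the sketch below. It assumes the pipeline uses the standard `Docker@2` task and the `acrconnection` service connection created earlier; the display name, build context, and tag are assumptions, and the actual file in your fork may be structured differently. Only the repository name needs to change.

```yaml
# Illustrative sketch only; not a copy of docker-image-pipeline.yml.
steps:
- task: Docker@2
  displayName: Build and push the CI image   # hypothetical display name
  inputs:
    command: buildAndPush
    containerRegistry: acrconnection          # service connection created earlier
    repository: mlops/diabetes_regression     # replaces 'public/mlops/python'
    Dockerfile: environment_setup/Dockerfile
    buildContext: .                           # assumption: build from the repository root
    tags: latest                              # assumption
```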
| 53 | + |
| 54 | +Save and run the pipeline. This builds a container image and pushes it to your Azure Container Registry under
| 55 | +the name you have just set. The next step is to modify the model pipeline so that the CI job runs in a container
| 56 | +created from that image.
| 57 | + |
| 58 | +## Modify the model pipeline |
| 59 | + |
| 60 | +Modify the model pipeline file [diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml) by replacing this section: |
| 61 | + |
| 62 | +``` |
| 63 | +resources: |
| 64 | +  containers:
| 65 | +  - container: mlops
| 66 | +    image: mcr.microsoft.com/mlops/python:latest
| 67 | +``` |
| 68 | + |
| 69 | +with (using the image name previously defined): |
| 70 | + |
| 71 | +``` |
| 72 | +resources: |
| 73 | +  containers:
| 74 | +  - container: mlops
| 75 | +    image: mlops/diabetes_regression
| 76 | +    endpoint: acrconnection
| 77 | +``` |
| 78 | + |
| 79 | +Run the pipeline and verify that the job ran in your container.
| 80 | + |
| 81 | +## Addressing conflicting dependencies |
| 82 | + |
| 83 | +Especially when working in a team, it's possible for environment changes across branches to interfere with one another. |
| 84 | + |
| 85 | +For example, suppose the master branch uses scikit-learn and you create a branch that uses TensorFlow instead. If you
| 86 | +remove scikit-learn from the
| 87 | +[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) Conda environment definition
| 88 | +and run the [docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) Docker image pipeline,
| 89 | +then the master branch will stop building, because both branches use the same image name.
| 90 | + |
| 91 | +You could keep scikit-learn alongside TensorFlow in the environment, but that is not ideal, because you would need an extra step to remove scikit-learn after merging your branch to master.
| 92 | + |
| 93 | +A better approach would be to use a distinct name for your modified environment, such as `mlops/diabetes_regression/tensorflow`. |
| 94 | +By changing the name of the image in your branch in both the container build pipeline |
| 95 | +[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) |
| 96 | +and the model pipeline file |
| 97 | +[diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml), |
| 98 | +and running both pipelines in sequence on your branch, |
| 99 | +you avoid any branch conflicts, and the name does not have to be changed after merging to master. |
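
On such a branch, the container resource in the model pipeline would then point at the distinct image name. This sketch reuses the `mlops` container alias and the `acrconnection` service connection from the earlier sections and changes only the image:

```yaml
resources:
  containers:
  - container: mlops
    # Branch-specific image name, so master keeps using its own image.
    image: mlops/diabetes_regression/tensorflow
    endpoint: acrconnection
```

The same name must also be set in `docker-image-pipeline.yml` on the branch, so that the image exists in the registry before the model pipeline runs.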