# Azure Machine Learning Custom Environment for PyTorch Elastic Training

Azure Machine Learning environments encapsulate the environment in which your machine learning training runs. They are managed and versioned entities within your Machine Learning workspace that enable reproducible, auditable, and portable machine learning workflows across a variety of compute targets. By default, your workspace has several curated environments available.

To run our PyTorch Elastic Training examples, we will build on top of the [ACPT Curated Environment](https://learn.microsoft.com/en-us/azure/machine-learning/resource-azure-container-for-pytorch?view=azureml-api-2) `acpt-pytorch-2.0-cuda11.7`.
We will add the packages our examples need and then register the resulting environment in our workspace.


## 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the environment will be registered.

In [None]:
# Some necessary imports
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Environment, BuildContext
from azure.identity import DefaultAzureCredential

To connect to a workspace, we need three identifiers: a subscription ID, a resource group, and a workspace name. We pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. This tutorial uses the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python). Check the [configuration notebook](https://github.com/Azure/azureml-examples/blob/8b9eade1f20aff30831221e97f1931c5345ee92f/sdk/python/jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [None]:
credential = DefaultAzureCredential()
ml_client = None
try:
    # Read the workspace details from a config.json file, if one is found
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # No usable config.json: enter the details of your AML workspace instead
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace = "<AML_WORKSPACE_NAME>"
    ml_client = MLClient(credential, subscription_id, resource_group, workspace)
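
`MLClient.from_config` relies on a `config.json` file that describes the workspace. As a minimal sketch, such a file could be generated with the standard library as shown here. The three keys are the ones the SDK reads; the helper name `write_workspace_config` is hypothetical and not part of the SDK.

```python
import json
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config.json describing an Azure ML workspace and return its path.

    Hypothetical helper for illustration; it only creates the file and does
    not contact Azure or validate the values.
    """
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(directory) / "config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return path
```

Calling `write_workspace_config(".", "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>")` with your own values places the file next to the notebook, where `MLClient.from_config(credential)` should be able to find it.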

## 2. Create a Custom Environment

The `Environment` class is used to create a custom environment. It accepts the following key parameters:
- `name` - Name of the environment.
- `version` - Version of the environment. If omitted, Azure ML autogenerates a version.
- `image` - The Docker image to use for the environment. Either `image` or `build` is required to create an environment.
- `conda_file` - The standard conda YAML [configuration file](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-file-manually) of the dependencies for a conda environment. It can be used together with an `image`, in which case Azure ML builds the conda environment on top of the Docker image provided.
- `build` - The Docker build context configuration to use for the environment, given as a `BuildContext` object. Either `image` or `build` is required to create an environment. `BuildContext` accepts:
  - `path` - Local path to the directory to use as the build context.
  - `dockerfile_path` - Relative path to the Dockerfile within the build context.
- `description` - Description of the environment.

In the example below, we will create a custom environment from a Docker build context.

The directory structure for the build context (under [context](./context)) is as follows:
``` 
.
├── Dockerfile
└── requirements.txt
```

The [Dockerfile](./context/Dockerfile) contains the instructions for building the image on top of ACPT. The [requirements.txt](./context/requirements.txt) file lists the additional packages required for our examples. If your training needs more packages, add them to the `requirements.txt` file.
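
Before building, it can help to confirm that the build context is complete and to add extra packages without duplicating entries. The following is a minimal sketch using only the standard library; the helper names `missing_context_files` and `add_requirement` are hypothetical, and the list of required files simply mirrors the directory structure shown above.

```python
from pathlib import Path

# Files the build context is expected to contain (per the layout above)
REQUIRED_FILES = ("Dockerfile", "requirements.txt")


def missing_context_files(context_dir):
    """Return the required build-context files that are absent from context_dir."""
    root = Path(context_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def add_requirement(context_dir, requirement):
    """Append a line to requirements.txt unless an identical line already exists.

    Returns True if the file was changed, False otherwise.
    """
    req_file = Path(context_dir) / "requirements.txt"
    existing = req_file.read_text().splitlines() if req_file.exists() else []
    requirement = requirement.strip()
    if requirement in (line.strip() for line in existing):
        return False
    existing.append(requirement)
    req_file.write_text("\n".join(existing) + "\n")
    return True
```

For example, `add_requirement("./context", "tqdm")` would add `tqdm` (an arbitrary package chosen for illustration) once, and `missing_context_files("./context")` returns an empty list when both files are in place.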

In [None]:
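env_docker_context = Environment(
    build=BuildContext(path="./context"),  # directory containing the Dockerfile
    name="TorchElasticAML",
    description="Custom Environment for running Torch Elastic jobs on AzureML.",
)
# Register the environment in the workspace and keep the returned entity
registered_env = ml_client.environments.create_or_update(env_docker_context)
print(f"Registered {registered_env.name}, version {registered_env.version}")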

The code above creates a custom environment named `TorchElasticAML` and registers it in the workspace. Note that this process may take some time, because the environment is built from the Dockerfile.

## Next Steps

With the custom environment registered, we are ready to submit PyTorch Elastic Training jobs that use it.