# Working with Azure ML Environments

__Lab objectives:__

- learn to create and retrieve Azure ML Environments
- learn to reuse the Environments you have created.

## Preparing the environment

Import the required modules and check the Azure ML SDK version:

In [1]:
import os

import azureml.core
from azureml.core import Workspace, Environment, Experiment
from azureml.core.conda_dependencies import CondaDependencies
from azureml.train.estimator import Estimator
from azureml.train.sklearn import SKLearn
from azureml.widgets import RunDetails

# Check core SDK version number
print(f'SDK version: {azureml.core.VERSION}')

SDK version: 1.19.0


Load the experiment configuration:

In [2]:
%run core.py

config = get_experiment_config('lab_5A')
init_experiment(config)
experiment_dir = get_experiment_dir(config)

config

Experiment environments-experiment was initialized successfully.


{'experiment_name': 'environments-experiment',
 'working_subdir': 'environments-experiment-lab',
 'core': {'expriments_root_dir': 'experiments/',
  'datastore_name': 'winter_school_2020',
  'dataset_name': 'diabetes-data',
  'ml_cluster_name': 'aml-ws-cluster',
  'ml_model_name': 'diabetes-predict-model'}}

## Connecting to the Azure ML Workspace

Connect to the Workspace in Azure ML:

In [3]:
ws = Workspace.from_config()
print(f'Successfully connected to Workspace: {ws.name}.')

Successfully connected to Workspace: ai-in-cloud-workspace.


## Available Environments

List the curated Environments (those whose names start with `AzureML`) available in the Workspace:

In [4]:
envs = Environment.list(workspace=ws)

print('Environments:')
for env in envs:
    if env.startswith('AzureML'):
        print(f'\t{env}')

Environments:
	AzureML-AutoML
	AzureML-PyTorch-1.0-GPU
	AzureML-Scikit-learn-0.20.3
	AzureML-TensorFlow-1.12-CPU
	AzureML-PyTorch-1.2-GPU
	AzureML-TensorFlow-2.0-GPU
	AzureML-TensorFlow-2.0-CPU
	AzureML-Chainer-5.1.0-GPU
	AzureML-TensorFlow-1.13-CPU
	AzureML-Minimal
	AzureML-Chainer-5.1.0-CPU
	AzureML-PyTorch-1.4-GPU
	AzureML-PySpark-MmlSpark-0.15
	AzureML-PyTorch-1.3-CPU
	AzureML-PyTorch-1.1-GPU
	AzureML-TensorFlow-1.10-GPU
	AzureML-PyTorch-1.2-CPU
	AzureML-TensorFlow-1.13-GPU
	AzureML-TensorFlow-1.10-CPU
	AzureML-PyTorch-1.3-GPU
	AzureML-PyTorch-1.4-CPU
	AzureML-Tutorial
	AzureML-PyTorch-1.0-CPU
	AzureML-PyTorch-1.1-CPU
	AzureML-TensorFlow-1.12-GPU
	AzureML-Designer-VowpalWabbit
	AzureML-TensorFlow-2.2-GPU
	AzureML-TensorFlow-2.2-CPU
	AzureML-PyTorch-1.6-CPU
	AzureML-PyTorch-1.6-GPU
	AzureML-Triton
	AzureML-TensorFlow-2.3-CPU
	AzureML-TensorFlow-2.3-GPU
	AzureML-DeepSpeed-0.3-GPU
	AzureML-Sidecar
	AzureML-Dask-CPU
	AzureML-Dask-GPU
	AzureML-TensorFlow-2.1-GPU
	AzureML-PyTorch-1.5-GPU
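
The flat listing above is easier to scan when the curated Environments are grouped by framework. The following is a minimal sketch in plain Python, not part of the original lab: `env_names` is a hand-copied sample of the names printed above (in the notebook it would be the keys of the dict returned by `Environment.list`), while `group_curated` and `my-custom-env` are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Sample of the names printed above; in the notebook this would be list(envs).
# 'my-custom-env' is a hypothetical user-registered Environment.
env_names = [
    'AzureML-AutoML',
    'AzureML-PyTorch-1.6-CPU',
    'AzureML-PyTorch-1.6-GPU',
    'AzureML-TensorFlow-2.3-CPU',
    'AzureML-TensorFlow-2.3-GPU',
    'AzureML-Scikit-learn-0.20.3',
    'my-custom-env',
]


def group_curated(names, prefix='AzureML-'):
    """Group curated Environment names by the first token after the prefix."""
    groups = defaultdict(list)
    for name in names:
        if not name.startswith(prefix):
            continue  # skip Environments that are not curated
        family = name[len(prefix):].split('-')[0]
        groups[family].append(name)
    return {family: sorted(members) for family, members in sorted(groups.items())}


curated = group_curated(env_names)
for family, members in curated.items():
    print(f'{family}: {", ".join(members)}')
```

Splitting on the first hyphen is a heuristic: it shortens `Scikit-learn` to `Scikit`, which is good enough for a quick overview.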

## Creating a custom Environment

Name the new Environment:

In [5]:
new_env_name = config['experiment_name']

Specify the packages and ML frameworks (dependencies) to install with the `conda` and `pip` package managers:

In [6]:
env_packages = CondaDependencies.create(conda_packages=['scikit-learn', 'ipykernel', 'matplotlib', 'pandas'],
                                        pip_packages=['azureml-sdk', 'pyarrow'])

Define a custom Azure ML Environment with the required dependencies:

In [7]:
# Create a Python environment for the experiment
new_env = Environment(new_env_name)
new_env.python.user_managed_dependencies = False # Let Azure ML manage dependencies
new_env.docker.enabled = True # Use a docker container

# Add the dependencies to the environment
new_env.python.conda_dependencies = env_packages

print(f'Environment {new_env.name} was defined successfully.')

Environment environments-experiment was defined successfully.


Register the Azure ML Environment:

In [8]:
new_env.register(workspace=ws)

{
    "databricks": {
        "eggLibraries": [],
        "jarLibraries": [],
        "mavenLibraries": [],
        "pypiLibraries": [],
        "rcranLibraries": []
    },
    "docker": {
        "arguments": [],
        "baseDockerfile": null,
        "baseImage": "mcr.microsoft.com/azureml/intelmpi2018.3-ubuntu16.04:20200821.v1",
        "baseImageRegistry": {
            "address": null,
            "password": null,
            "registryIdentity": null,
            "username": null
        },
        "enabled": true,
        "platform": {
            "architecture": "amd64",
            "os": "Linux"
        },
        "sharedVolumes": true,
        "shmSize": null
    },
    "environmentVariables": {
        "EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"
    },
    "inferencingStackVersion": null,
    "name": "environments-experiment",
    "python": {
        "baseCondaEnvironment": null,
        "condaDependencies": {
            "channels": [
                "anaconda",
                "co

Inspect the registered Environment:

In [9]:
envs = Environment.list(ws)

print(envs[new_env_name])
print(envs[new_env_name].python.conda_dependencies.serialize_to_string())

Environment(Name: environments-experiment,
Version: 1)
channels:
- anaconda
- conda-forge
dependencies:
- python=3.6.2
- pip:
  - azureml-sdk~=1.19.0
  - pyarrow
- scikit-learn
- ipykernel
- matplotlib
- pandas
name: azureml_96967d47636fcfd2cc2e184f757f972e
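
In the serialized specification above, the pip packages are nested under a single `pip:` item, while the conda packages sit directly under `dependencies`. The sketch below reproduces that layout in plain Python purely as an illustration. `render_conda_spec` is a hypothetical helper, not an SDK function, and it deliberately omits what the SDK adds by itself (the Python version, the `azureml-sdk` version pin and the generated `name`).

```python
def render_conda_spec(conda_packages, pip_packages,
                      channels=('anaconda', 'conda-forge')):
    """Render a minimal conda environment file as text (illustration only)."""
    lines = ['channels:']
    lines += [f'- {channel}' for channel in channels]
    lines.append('dependencies:')
    if pip_packages:
        # pip packages are nested one level deeper, under a single `pip` item
        lines.append('- pip:')
        lines += [f'  - {package}' for package in pip_packages]
    # conda packages are listed directly under `dependencies`
    lines += [f'- {package}' for package in conda_packages]
    return '\n'.join(lines)


spec = render_conda_spec(
    conda_packages=['scikit-learn', 'ipykernel', 'matplotlib', 'pandas'],
    pip_packages=['azureml-sdk', 'pyarrow'],
)
print(spec)
```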



## Using the registered Environment

Retrieve the Environment created earlier:

In [10]:
registered_env = Environment.get(ws, new_env_name)
print(f'Environment {registered_env.name} will be reused.')

Environment environments-experiment will be reused.


Prepare the input data:

In [11]:
data_ds = ws.datasets.get(config['core']['dataset_name'])
print(f'Used dataset {data_ds.name}: {data_ds.description}')

Used dataset diabetes-data: Diabetes Disease Database (Winter School 2020)


Copy the model training script into the experiment directory:

In [21]:
!cp scripts/train_model.py $experiment_dir
!ls $experiment_dir

train_model.py


### Create and run the Experiment

In [None]:
# Create the experiment
experiment = Experiment(workspace=ws, name=config['experiment_name'])

# Create an SKLearn estimator that runs in the registered Environment
estimator = SKLearn(source_directory=experiment_dir,
                    inputs=[data_ds.as_named_input('data')],
                    entry_script='train_model.py',
                    script_params={'--reg_rate': 0.1},
                    compute_target='local',
                    environment_definition=registered_env)  # set the environment here

# Run the experiment
run = experiment.submit(config=estimator)

# Wait for the run to complete, then show the run details
run.wait_for_completion(show_output=True)
RunDetails(run).show()

Review the model training results in the Azure ML portal by following the link above.

## Results

Review the results of training the model with the new Environment:

In [15]:
metrics = run.get_metrics()

for key, value in metrics.items():
    print(key, value)
print('\n')

for file in run.get_file_names():
    print(file)

Regularization Rate 0.1
Accuracy 0.7788888888888889
AUC 0.846851712258014
ROC aml://artifactId/ExperimentRun/dcid.new_env_demo_exp_1599494769_5a39b844/ROC_1599494783.png


ROC_1599494783.png
azureml-logs/60_control_log.txt
azureml-logs/70_driver_log.txt
logs/azureml/8_azureml.log
logs/azureml/dataprep/backgroundProcess.log
logs/azureml/dataprep/backgroundProcess_Telemetry.log
logs/azureml/dataprep/engine_spans_759936b0-accf-4206-8242-6b2c73d35a33.jsonl
logs/azureml/dataprep/python_span_759936b0-accf-4206-8242-6b2c73d35a33.jsonl
outputs/diabetes_model.pkl
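
The output above mixes scalar metrics with a reference to an image artifact (the `ROC` entry, whose value starts with `aml://`) and lists log files alongside the trained model. Here is a minimal sketch of separating them, assuming the dict returned by `run.get_metrics()` and the list returned by `run.get_file_names()` look like the values printed above. The literals below are copied from that output (the file list is shortened), and `split_metrics` is a hypothetical helper.

```python
# Values copied from the output above; in the notebook they would come from
# run.get_metrics() and run.get_file_names().
metrics = {
    'Regularization Rate': 0.1,
    'Accuracy': 0.7788888888888889,
    'AUC': 0.846851712258014,
    'ROC': 'aml://artifactId/ExperimentRun/'
           'dcid.new_env_demo_exp_1599494769_5a39b844/ROC_1599494783.png',
}
file_names = [
    'ROC_1599494783.png',
    'azureml-logs/60_control_log.txt',
    'azureml-logs/70_driver_log.txt',
    'logs/azureml/8_azureml.log',
    'outputs/diabetes_model.pkl',
]


def split_metrics(metrics, artifact_prefix='aml://'):
    """Separate plain metric values from artifact references."""
    scalars, artifacts = {}, {}
    for key, value in metrics.items():
        if isinstance(value, str) and value.startswith(artifact_prefix):
            artifacts[key] = value
        else:
            scalars[key] = value
    return scalars, artifacts


scalars, artifacts = split_metrics(metrics)
for name, value in scalars.items():
    print(f'{name}: {value:.4f}')
print('Artifacts:', ', '.join(artifacts))

# Files written by the training script to the outputs/ folder
model_files = [name for name in file_names if name.startswith('outputs/')]
print('Model files:', model_files)
```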
