# Azure ML Compute Targets 

Lab topic: working with __Compute Targets__ in Azure ML.

## Connecting to the Azure ML Workspace

Import the required modules and check the AzureML SDK version:

In [22]:
import os

import azureml.core
from azureml.core import Workspace, Environment, Experiment, Model
from azureml.core.conda_dependencies import CondaDependencies
from azureml.train.estimator import Estimator
from azureml.train.sklearn import SKLearn
from azureml.widgets import RunDetails

# Check core SDK version number
print(f'SDK version: {azureml.core.VERSION}')

SDK version: 1.12.0


Connect to the Workspace in Azure ML:

In [2]:
ws = Workspace.from_config()
print(f'Successfully connected to Workspace: {ws.name}.')

Successfully connected to Workspace: ai-in-cloud-workspace.
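
`Workspace.from_config()` reads the connection details from a `config.json` file; by default it searches the current directory and its parents, including an `.azureml` subdirectory. As a minimal sketch of what that file contains, the snippet below writes one using only the standard library. The subscription ID is a placeholder, and the helper name `write_workspace_config` is hypothetical rather than part of the SDK.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config.json with the three keys Workspace.from_config() expects."""
    config_dir = Path(directory) / '.azureml'
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / 'config.json'
    config = {
        'subscription_id': subscription_id,
        'resource_group': resource_group,
        'workspace_name': workspace_name,
    }
    config_path.write_text(json.dumps(config, indent=4))
    return config_path


# Placeholder subscription ID; resource group and workspace names as used in this lab
demo_dir = tempfile.mkdtemp()
config_path = write_workspace_config(demo_dir, '<your-subscription-id>',
                                     'ai-in-cloud-workshop-rg', 'ai-in-cloud-workspace')
print(json.loads(config_path.read_text())['workspace_name'])
```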


## Creating a Compute Target

Azure ML supports several types of Compute Targets that you can define in your Workspace and use to run Experiments. You pay for these resources only while they are in use.

In [8]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

cluster_name = 'ml-cluster'

try:
    # Check whether the cluster already exists
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print(f'The cluster {cluster.name} already exists.')
except ComputeTargetException:
    # If the cluster doesn't exist, then create it
    try:
        # min_nodes=1 keeps one node allocated (and billed) even when idle; use 0 to scale down fully
        cluster_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                               min_nodes=1, max_nodes=3)
        cluster = ComputeTarget.create(ws, cluster_name, cluster_config)
        cluster.wait_for_completion(show_output=True)
    except Exception as ex:
        print(ex)

The cluster ml-cluster already exists.
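
The `min_nodes` and `max_nodes` settings determine how many nodes stay allocated: an AmlCompute cluster scales between the two bounds, so with `min_nodes=1` one node remains allocated while the cluster is idle. The sketch below is a rough back-of-the-envelope model, not an SDK feature: `billed_node_hours` is a hypothetical helper, the hourly demand figures are made up for illustration, and scale-down delays are ignored.

```python
def billed_node_hours(min_nodes, max_nodes, demand_per_hour):
    """Sum the nodes allocated in each hour, clamping demand to [min_nodes, max_nodes]."""
    return sum(min(max(demand, min_nodes), max_nodes) for demand in demand_per_hour)


# Illustrative demand: nodes requested by jobs in each of six consecutive hours
demand = [0, 0, 2, 5, 1, 0]

always_on = billed_node_hours(min_nodes=1, max_nodes=3, demand_per_hour=demand)
scale_to_zero = billed_node_hours(min_nodes=0, max_nodes=3, demand_per_hour=demand)

print(always_on)      # 9 node-hours: idle hours still hold one node
print(scale_to_zero)  # 6 node-hours: idle hours hold no nodes
```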


Check the status of the Compute Cluster:

In [11]:
cluster_state = cluster.get_status()

print(cluster_state.allocation_state, cluster_state.current_node_count, cluster_state.vm_size, cluster_state.creation_time)

Steady 1 STANDARD_D2_V2 2020-09-09 15:10:35.443260+00:00
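
The allocation state reported above is `Steady`; while a cluster is adding or removing nodes it reports `Resizing`. As a minimal sketch, a notebook could poll until the cluster settles before submitting work. The helper `wait_until_steady` is hypothetical, and it takes a plain callable that returns the state string so that it can be tried without a live cluster; with the SDK one would pass something like `lambda: cluster.get_status().allocation_state`.

```python
import time


def wait_until_steady(get_state, timeout_s=600.0, poll_s=10.0, sleep=time.sleep):
    """Poll get_state() until it returns 'Steady'; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state() == 'Steady':
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


# Stand-in for a cluster that reports 'Resizing' twice before settling
states = iter(['Resizing', 'Resizing', 'Steady'])
print(wait_until_steady(lambda: next(states), poll_s=0.0))  # True
```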


## Training an ML model on the ML cluster

Run the training of a machine learning model on the cluster configured above.

### Preparation

Use the `diabetes_db` dataset [loaded earlier](03-datastores-and-datasets.ipynb) as the training data:

In [13]:
data_ds = ws.datasets.get('diabetes_db')
print(f'Used dataset {data_ds.name}: {data_ds.description}')

Used dataset diabetes_db: Diabetes Disease Database


Use the `diabetes-experiment-env` Environment [registered earlier](05A-environments.ipynb) to train the model:


In [17]:
env = Environment.get(ws, 'diabetes-experiment-env')
print(f'Environment {env.name} will be used.')

Environment diabetes-experiment-env will be used.


Set the Experiment parameters:

In [20]:
experiment_name = 'ml_cluster_demo'
experiment_dir = 'new_env_demo'
os.makedirs(experiment_dir, exist_ok=True)

Create an Estimator, passing the name of the Compute Cluster as the `compute_target` parameter:

In [23]:
estimator = SKLearn(source_directory=experiment_dir,
                    inputs=[data_ds.as_named_input('data')],
                    entry_script='train-model.py',
                    script_params={'--reg_rate': 0.1},
                    compute_target=cluster_name, # Use the compute target created previously
                    environment_definition=env
                    )
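
Each key in `script_params` is passed to the entry script as a command-line argument. The contents of `train-model.py` are not shown here, so the following is only a sketch of how such a script might read `--reg_rate` with `argparse`; the default value and the function name `parse_training_args` are assumptions.

```python
import argparse


def parse_training_args(argv=None):
    """Parse the command-line arguments a training script could receive."""
    parser = argparse.ArgumentParser(description='Hypothetical training entry script')
    # Matches the '--reg_rate' key given in script_params; the default is an assumption
    parser.add_argument('--reg_rate', type=float, default=0.01,
                        help='regularization rate')
    # Tolerate any additional arguments instead of failing on them
    args, _unknown = parser.parse_known_args(argv)
    return args


args = parse_training_args(['--reg_rate', '0.1'])
print(args.reg_rate)  # 0.1
```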



Create the Experiment and start training the model:

In [24]:
# Create and run the experiment
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=estimator)

# Stream the logs until the run completes, then show the run details widget
run.wait_for_completion(show_output=True)
RunDetails(run).show()

RunId: ml_cluster_demo_1599666049_4ecd2ca9
Web View: https://ml.azure.com/experiments/ml_cluster_demo/runs/ml_cluster_demo_1599666049_4ecd2ca9?wsid=/subscriptions/9aef4ce1-e591-4870-9443-0b0eb98df2aa/resourcegroups/ai-in-cloud-workshop-rg/workspaces/ai-in-cloud-workspace

Streaming azureml-logs/20_image_build_log.txt

2020/09/09 15:40:57 Downloading source code...
2020/09/09 15:40:58 Finished downloading source code
2020/09/09 15:40:58 Creating Docker network: acb_default_network, driver: 'bridge'
2020/09/09 15:40:59 Successfully set up Docker network: acb_default_network
2020/09/09 15:40:59 Setting up Docker configuration...
2020/09/09 15:40:59 Successfully set up Docker configuration
2020/09/09 15:40:59 Logging in to registry: aiincloudword7a28d0f.azurecr.io
2020/09/09 15:41:00 Successfully logged into aiincloudword7a28d0f.azurecr.io
2020/09/09 15:41:00 Executing step ID: acb_step_0. Timeout(sec): 5400, Working directory: '', Network: 'acb_default_network'
2020/09/09 15:41:00 Scannin


Review the model training results:

In [25]:
metrics = run.get_metrics()
for key in metrics.keys():
    print(key, metrics.get(key))
        
print('\n')
for file in run.get_file_names():
    print(file)

Regularization Rate 0.1
Accuracy 0.7788888888888889
AUC 0.846851712258014
ROC aml://artifactId/ExperimentRun/dcid.ml_cluster_demo_1599666049_4ecd2ca9/ROC_1599666712.png


ROC_1599666712.png
azureml-logs/20_image_build_log.txt
azureml-logs/55_azureml-execution-tvmps_eef41315fe4a190fd72ae8cedaaae422b531a56db52ef5cc94e9b375db6c8990_d.txt
azureml-logs/65_job_prep-tvmps_eef41315fe4a190fd72ae8cedaaae422b531a56db52ef5cc94e9b375db6c8990_d.txt
azureml-logs/70_driver_log.txt
azureml-logs/75_job_post-tvmps_eef41315fe4a190fd72ae8cedaaae422b531a56db52ef5cc94e9b375db6c8990_d.txt
azureml-logs/process_info.json
azureml-logs/process_status.json
logs/azureml/107_azureml.log
logs/azureml/dataprep/backgroundProcess.log
logs/azureml/dataprep/backgroundProcess_Telemetry.log
logs/azureml/dataprep/engine_spans_714a0951-2f55-4bf4-99e4-09f0d8f3d704.jsonl
logs/azureml/dataprep/python_span_714a0951-2f55-4bf4-99e4-09f0d8f3d704.jsonl
logs/azureml/dataprep/python_span_bd5b2bdc-cd4b-43da-a6e0-f8df64560300.jsonl
logs/
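
In the output above, `ROC` is not a number but an `aml://` reference to an image artifact, while the other metrics are scalars. Below is a small sketch of separating the two kinds, using the printed values as plain Python data. The helper `split_metrics` is hypothetical, and the prefix check is an assumption based on this output only.

```python
def split_metrics(metrics):
    """Separate numeric metrics from references to stored artifacts."""
    scalars, artifacts = {}, {}
    for name, value in metrics.items():
        if isinstance(value, str) and value.startswith('aml://'):
            artifacts[name] = value
        else:
            scalars[name] = value
    return scalars, artifacts


# Values copied from the run output above
metrics = {
    'Regularization Rate': 0.1,
    'Accuracy': 0.7788888888888889,
    'AUC': 0.846851712258014,
    'ROC': 'aml://artifactId/ExperimentRun/dcid.ml_cluster_demo_1599666049_4ecd2ca9/ROC_1599666712.png',
}
scalars, artifacts = split_metrics(metrics)
print(sorted(scalars))    # ['AUC', 'Accuracy', 'Regularization Rate']
print(sorted(artifacts))  # ['ROC']
```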

## Registering the model

Register the trained model:

In [28]:
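# Register the model file produced by the run; this relies on the model file
# being the last entry returned by run.get_file_names()
metrics = run.get_metrics()
run.register_model(model_path=run.get_file_names()[-1], model_name='diabetes_predict_model',
                   properties={'AUC': metrics['AUC'], 'Accuracy': metrics['Accuracy']},
                   tags={'Demo': 'Target compute'})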

Model(workspace=Workspace.create(name='ai-in-cloud-workspace', subscription_id='9aef4ce1-e591-4870-9443-0b0eb98df2aa', resource_group='ai-in-cloud-workshop-rg'), name=diabetes_predict_model, id=diabetes_predict_model:1, version=1, tags={'Demo': 'Target compute'}, properties={'AUC': '0.846851712258014', 'Accuracy': '0.7788888888888889'})
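
Indexing the file list with `[-1]` depends on the order in which the run returns its files. A more defensive sketch is to select the model file by name. The file list below is illustrative: the `outputs/model.pkl` entry and the `.pkl` suffix are assumptions, since the real listing above is cut off before the model file, and `find_model_file` is a hypothetical helper.

```python
def find_model_file(file_names, prefix='outputs/', suffix='.pkl'):
    """Return the single file under prefix with the given suffix, or raise."""
    matches = [name for name in file_names
               if name.startswith(prefix) and name.endswith(suffix)]
    if len(matches) != 1:
        raise ValueError(f'expected exactly one model file, found {len(matches)}')
    return matches[0]


# Illustrative listing; 'outputs/model.pkl' is a made-up name
files = [
    'ROC_1599666712.png',
    'azureml-logs/70_driver_log.txt',
    'outputs/model.pkl',
    'logs/azureml/107_azureml.log',
]
print(find_model_file(files))  # outputs/model.pkl
```

The returned path could then be passed as `model_path` to `run.register_model`.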

List all registered ML models:

In [31]:
for model in Model.list(ws):
    print(f'{model.name} v{model.version}')

    # Print the tags first, then the properties, of each model
    for tag_name, tag in model.tags.items():
        print('\t', tag_name, ':', tag)
    for prop_name, prop in model.properties.items():
        print('\t', prop_name, ':', prop)

diabetes_predict_model v1
	 Demo : Target compute
	 AUC : 0.846851712258014
	 Accuracy : 0.7788888888888889
diabetes_model v4
	 Dataset : Diabetes
	 AUC : 0.846851712258014
	 Accuracy : 0.7788888888888889
diabetes_model v3
	 Dataset : Diabetes
	 AUC : 0.846851712258014
	 Accuracy : 0.7788888888888889
diabetes_model v2
	 Dataset : Diabetes
	 AUC : 0.8468519356081545
	 Accuracy : 0.7788888888888889
diabetes_model v1
	 Training context : Estimator
	 AUC : 0.8468519356081545
	 Accuracy : 0.7788888888888889
amlstudio-covid19-service v1
	 CreatedByAMLStudio : true
amlstudio-covid19-service-pipe v1
	 CreatedByAMLStudio : true
amlstudio-covid19-spread-servi v1
	 CreatedByAMLStudio : true
amlstudio-pima-diabets-service v2
	 CreatedByAMLStudio : true
amlstudio-letter-recognition-s v1
	 CreatedByAMLStudio : true
amlstudio-pima-diabetes-model v1
	 CreatedByAMLStudio : true
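
Because the metrics were stored as model properties, registered models can be compared without rerunning anything. Property values come back as strings (see the `register_model` output above), so they need converting before comparison. Here is a minimal sketch over plain dictionaries: the first two records are copied from the listing, the third is a made-up record without metrics, and `best_by_property` is a hypothetical helper, not an SDK call.

```python
def best_by_property(models, prop):
    """Return the model record with the highest numeric value of prop."""
    candidates = [m for m in models if prop in m['properties']]
    # Properties are stored as strings, so convert before comparing
    return max(candidates, key=lambda m: float(m['properties'][prop]))


models = [
    {'name': 'diabetes_predict_model', 'version': 1,
     'properties': {'AUC': '0.846851712258014', 'Accuracy': '0.7788888888888889'}},
    {'name': 'diabetes_model', 'version': 2,
     'properties': {'AUC': '0.8468519356081545', 'Accuracy': '0.7788888888888889'}},
    # Hypothetical record with no metrics, to show that it is skipped
    {'name': 'model_without_metrics', 'version': 1, 'properties': {}},
]
best = best_by_property(models, 'AUC')
print(best['name'], best['version'])  # diabetes_model 2
```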
