# Work with Compute

When you run a script as an Azure Machine Learning job, you need to define the execution context for the job run. One key configuration is the compute target on which the script will be run. This could be the local workstation (in this case the compute instance), or a remote compute target such as the Azure Machine Learning managed compute cluster that is provisioned on-demand.

In this notebook, you'll create a compute cluster and explore compute targets for jobs.

## Before you start

You'll need the latest version of the  **azureml-ai-ml** package to run the code in this notebook. Run the cell below to verify that it is installed.

> **Note**:
> If the **azure-ai-ml** package is not installed, run `pip install azure-ai-ml` to install it.

In [33]:
pip show azure-ai-ml

Name: azure-ai-ml
Version: 1.6.0
Summary: Microsoft Azure Machine Learning Client Library for Python
Home-page: https://github.com/Azure/azure-sdk-for-python
Author: Microsoft Corporation
Author-email: azuresdkengsysadmins@microsoft.com
License: MIT License
Location: c:\users\anshu\appdata\roaming\python\python310\site-packages
Requires: azure-common, azure-core, azure-mgmt-core, azure-storage-blob, azure-storage-file-datalake, azure-storage-file-share, colorama, isodate, jsonschema, marshmallow, msrest, opencensus-ext-azure, pydash, pyjwt, pyyaml, strictyaml, tqdm, typing-extensions
Required-by: 
Note: you may need to restart the kernel to use updated packages.


## Connect to your workspace

With the required SDK packages installed, now you're ready to connect to your workspace.

To connect to a workspace, we need identifier parameters - a subscription ID, resource group name, and workspace name. The resource group name and workspace name are already filled in for you. You only need the subscription ID to complete the command.

To find the necessary parameters, click on the subscription and workspace name at the top right of the Studio. A pane will open on the right.

<p style="color:red;font-size:120%;background-color:yellow;font-weight:bold"> Copy the subscription ID and replace **YOUR-SUBSCRIPTION-ID** with the value you copied. </p>

In [12]:
# enter details of your AML workspace
subscription_id = "84a5808b-5549-459a-98f2-f102e84fa1bb"
resource_group = "EYMAY"
workspace = "aml-anshu2"

In [13]:
from azure.identity import DefaultAzureCredential,InteractiveBrowserCredential
credential = InteractiveBrowserCredential(tenant_id="13a86542-2185-4187-8e07-7512f5525c55")

In [14]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# get a handle to the workspace
ml_client = MLClient(
    credential, subscription_id, resource_group, workspace
)

In [15]:
from azure.ai.ml.entities import Workspace

In [16]:
try:
    ws = Workspace(name="aml-anshu2",location="westus",
                )

    ws = ml_client.workspaces.begin_create(ws)
except:
    print("workspace already exists")


Class ManagedNetwork: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
The deployment request aml-anshu2-3082457 was accepted. ARM deployment URI for reference: 
https://portal.azure.com//#blade/HubsExtension/DeploymentDetailsBlade/overview/id/%2Fsubscriptions%2F84a5808b-5549-459a-98f2-f102e84fa1bb%2FresourceGroups%2FEYMAY%2Fproviders%2FMicrosoft.Resources%2Fdeployments%2Faml-anshu2-3082457


    # creating a new workspace with Azure ml SDK v2, authentication with interactive browser auth
    from azure.ai.ml.entities import Workspace
    ml_client2 = MLClient(
        credential, subscription_id, "EY23", "aml-anshu3"
    )

    ws = Workspace(name="aml-anshu3",location="westus",
                )

    ws = ml_client2.workspaces.begin_create(ws)



## Create a compute cluster

In many cases, your local compute resources may not be sufficient to process a complex or long-running experiment that needs to process a large volume of data; and you may want to take advantage of the ability to dynamically create and use compute resources in the cloud. Azure Machine Learning supports a range of compute targets, which you can define in your workpace and use to run jobs; paying for the resources only when using them.

You can create a compute cluster in [Azure Machine Learning studio](https://ml.azure.com), by using the Python SDK, or the Azure CLI. The following code cell checks your workspace for the existence of a compute cluster names `aml-cluster`, and if it doesn't exist, creates it.

In [18]:
from azure.ai.ml.entities import AmlCompute

# Name assigned to the compute cluster
cpu_compute_target = "aml-cluster"

try:
    # let's see if the compute target already exists
    cpu_cluster = ml_client.compute.get(cpu_compute_target)
    print(
        f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
    )

except Exception:
    print("Creating a new cpu compute target...")

    # Let's create the Azure ML compute object with the intended parameters
    cpu_cluster = AmlCompute(
        name=cpu_compute_target,
        # Azure ML Compute is the on-demand VM service
        type="amlcompute",
        # VM Family
        size="STANDARD_DS11_V2",
        # Minimum running nodes when there is no job running
        min_instances=0,
        # Nodes in cluster
        max_instances=2,
        # How many seconds will the node running after the job termination
        idle_time_before_scale_down=120,
        # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
        tier="Dedicated",
    )

    # Now, we pass the object to MLClient's create_or_update method
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster)


Creating a new cpu compute target...


After you've created a compute cluster, you can only change the configuration for:

- `min_instances`: Minimum number of nodes
- `max_instances`: Maximum number of nodes
- `idle_time_before_scale_down`: Idle time before scale down

Currently, your compute cluster `aml-cluster` can only scale do a maximum of one node. Let's change that to two, to allow for parallel compute.

In [20]:
from azure.ai.ml.entities import AmlCompute

cluster_scale = AmlCompute(
    name="aml-cluster",
    max_instances=2,
)
ml_client.begin_create_or_update(cluster_scale)

<azure.core.polling._poller.LROPoller at 0x1c576a8de10>

When the compute cluster is updated, you can verify its configuration by printing its attributes.

In [21]:
cpu_cluster = ml_client.compute.get("aml-cluster")

print (
        f"AMLCompute with name {cpu_cluster.name} has a maximum of {cpu_cluster.max_instances} nodes"
    )

AMLCompute with name aml-cluster has a maximum of 2 nodes


## Create a script to train a model

To train a model, you'll first create the **diabetes_training.py** script in the **src** folder. The script uses the **diabetes.csv** file in the same folder as the training data.

In [22]:
import os
os.makedirs("src",exist_ok=True)

In [45]:
%%writefile src/diabetes-training.py
# import libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve

# load the diabetes dataset
print("Loading Data...")
diabetes = pd.read_csv('diabetes.csv')

# separate features and labels
X, y = diabetes[['Pregnancies','PlasmaGlucose','DiastolicBloodPressure','TricepsThickness','SerumInsulin','BMI','DiabetesPedigree','Age']].values, diabetes['Diabetic'].values

# split data into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)



# set regularization hyperparameter
reg = 0.01

# train a logistic regression model
print('Training a logistic regression model with regularization rate of', reg)
model = LogisticRegression(C=1/reg, solver="liblinear").fit(X_train, y_train)

# calculate accuracy
y_hat = model.predict(X_test)
acc = np.average(y_hat == y_test)
print('Accuracy:', acc)

# calculate AUC
y_scores = model.predict_proba(X_test)
auc = roc_auc_score(y_test,y_scores[:,1])
print('AUC: ' + str(auc))

import os
import joblib
os.makedirs('outputs', exist_ok=True)
joblib.dump(value=model, filename='outputs/model.pkl')


Overwriting src/diabetes-training.py


## Working with Environment

In [None]:
#list all existing environment options
env_list = ml_client.environments.list()
for env in env_list:
    print(env.name)

churn-aci-env
AzureML-ACPT-pytorch-1.13-py38-cuda11.7-gpu
AzureML-ACPT-pytorch-1.12-py38-cuda11.6-gpu
AzureML-ACPT-pytorch-1.12-py39-cuda11.6-gpu
AzureML-ACPT-pytorch-1.11-py38-cuda11.5-gpu
AzureML-ACPT-pytorch-1.11-py38-cuda11.3-gpu
AzureML-responsibleai-0.21-ubuntu20.04-py38-cpu
AzureML-responsibleai-0.20-ubuntu20.04-py38-cpu
AzureML-tensorflow-2.5-ubuntu20.04-py38-cuda11-gpu
AzureML-tensorflow-2.6-ubuntu20.04-py38-cuda11-gpu
AzureML-tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu
AzureML-sklearn-1.0-ubuntu20.04-py38-cpu
AzureML-pytorch-1.10-ubuntu18.04-py38-cuda11-gpu
AzureML-pytorch-1.9-ubuntu18.04-py37-cuda11-gpu
AzureML-pytorch-1.8-ubuntu18.04-py37-cuda11-gpu
AzureML-sklearn-0.24-ubuntu18.04-py37-cpu
AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu
AzureML-pytorch-1.7-ubuntu18.04-py37-cuda11-gpu
AzureML-tensorflow-2.4-ubuntu18.04-py37-cuda11-gpu
AzureML-Triton
AzureML-Designer-Score
AzureML-VowpalWabbit-8.8.0
AzureML-PyTorch-1.3-CPU


In [None]:
myenv = ml_client.environments.get("AzureML-sklearn-0.24-ubuntu18.04-py37-cpu",version=44)
print(myenv.description,myenv.tags)

An environment for tasks such as regression, clustering, and classification with Scikit-learn. Contains the Azure ML SDK and additional python packages. {'Scikit-learn': '0.24.1', 'OS': 'Ubuntu18.04', 'Training': ''}


In [None]:
%%writefile src/conda-env.yml
name:basic-env-cpu
channel:
- conda-forge
dependencies:
- python=3.7
- scikit-learn
- pandas
- numpy 
- matplotlib
- tensorflow

In [None]:
# custom env

base_img = "mcr.microsoft.com/azureml/minimal-ubuntu18.04-py37-cpu-inference:latest"

from azure.ai.ml.entities import Environment

myenv = Environment(image=base_img,
                    name='sample_env',
                    conda_file="src/conda-env.yml")

ml_client.environments.create_or_update(myenv)

## Run a job on a compute cluster

Now, you're ready to run the job on the compute cluster you created.

> **Note**:
> The job will take some time to start as the compute cluster will need to scale from zero to one node. Once the compute cluster is ready, the script will be run. When the job has finished, the compute cluster will scale back down to zero nodes. You can review the compute cluster's status in the **Compute** page.

Writing src/conda-env.yml


In [46]:
from azure.ai.ml import command

# configure job
job = command(
    code="./src",
    command="python diabetes-training.py",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="aml-cluster",
    display_name="diabetes-train-cluster",
    experiment_name="diabetes-training"
    )

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

Monitor your job at https://ml.azure.com/runs/funny_engine_hhjgb1l13c?wsid=/subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourcegroups/EYMAY/workspaces/aml-anshu2&tid=13a86542-2185-4187-8e07-7512f5525c55


In [28]:
print(returned_job.display_name,returned_job,returned_job.name)

diabetes-train-cluster type: command
outputs:
  default:
    mode: rw_mount
    type: uri_folder
    path: azureml://datastores/workspaceartifactstore/ExperimentRun/dcid.eager_rabbit_jzpg45b5hx
environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest
resources:
  instance_count: 1
  shm_size: 2g
compute: azureml:aml-cluster
services:
  Tracking:
    endpoint: azureml://westus.api.azureml.ms/mlflow/v1.0/subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourceGroups/EYMAY/providers/Microsoft.MachineLearningServices/workspaces/aml-anshu2?
    type: Tracking
  Studio:
    endpoint: https://ml.azure.com/runs/eager_rabbit_jzpg45b5hx?wsid=/subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourcegroups/EYMAY/workspaces/aml-anshu2&tid=13a86542-2185-4187-8e07-7512f5525c55
    type: Studio
component:
  name: eager_rabbit_jzpg45b5hx
  display_name: diabetes-train-cluster
  type: command
  outputs:
    default:
      type: uri_folder
      mode: rw_mount
  command: python diabete

In [49]:
#update the compute to perform scheduled start and stop

from azure.ai.ml.entities import ComputeInstance,ComputeSchedules, ComputeStartStopSchedule,RecurrenceTrigger,RecurrencePattern
from azure.ai.ml.constants import TimeZone

trigger1 = RecurrenceTrigger(start_time="2023-05-11 12:21:00",time_zone=TimeZone.INDIA_STANDARD_TIME,
                            frequency="week",interval=1,
                            schedule=RecurrencePattern(week_days=["Monday"],hours=9,minutes=[10]))

myschedule1 = ComputeStartStopSchedule(trigger=trigger1,action="start")


trigger2 = RecurrenceTrigger(start_time="yyyy-MM-ddTHH:mm:ss",time_zone=TimeZone.INDIA_STANDARD_TIME,
                            frequency="week",interval=1,
                            schedule=RecurrencePattern(week_days=["Monday"],hours=10,minutes=[10]))

myschedule2 = ComputeStartStopSchedule(trigger=trigger2,action="stop")
compute_schedule = ComputeSchedules(compute_start_stop=[myschedule1,myschedule2])


mycompute = ComputeInstance(name="aml-cluster",schedules=compute_schedule)
ml_client.compute.begin_create_or_update(mycompute)

HttpResponseError: Operation returned an invalid status 'The Recurrence StartTime must be in the format: yyyy-MM-ddTHH:mm:ss'
Content: {
  "error": {
    "code": "UserError",
    "severity": null,
    "message": "The Recurrence StartTime must be in the format: yyyy-MM-ddTHH:mm:ss",
    "messageFormat": null,
    "messageParameters": null,
    "referenceCode": null,
    "detailsUri": null,
    "target": null,
    "details": [],
    "innerError": null,
    "debugInfo": null,
    "additionalInfo": null
  },
  "correlation": {
    "operation": "6aa231ce2d057aa6859cce33ba7e8905",
    "request": "75cdba3e06fa72c4"
  },
  "environment": "westus",
  "location": "westus",
  "time": "2023-05-11T06:51:48.3780309+00:00",
  "componentName": "machinelearningcompute"
}

In [52]:
# delete compute
#ml_client.compute.begin_delete("aml-cluster").wait()

* for job model path
    azureml://jobs/<job-id>/outputs/model.pkl

* registered model path
    azureml:<model-name>:<version>


    A path on your local computer	                    mlflow-model/model.pkl
    A path on an Azure Machine Learning Datastore	    azureml://datastores/<datastore-name>/paths/<path_on_datastore>
    A path from an Azure Machine Learning job	        azureml://jobs/<job-name>/outputs/<output-name>/paths/<path-to-model-relative-to-the-named-output-location>
    A path from an MLflow job	                        runs:/<run-id>/<path-to-model-relative-to-the-root-of-the-artifact-location>
    A path from a Model Asset in Azure Machine Learning Workspace	azureml:<model-name>:<version>
    A path from a Model Asset in Azure Machine Learning Registry	azureml://registries/<registry-name>/models/<model-name>/versions/<version>

In [29]:
print(returned_job.name)

eager_rabbit_jzpg45b5hx


In [47]:
# register model
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
path = f"azureml://jobs/{returned_job.name}/outputs/artifacts/paths/outputs/model.pkl"

file_model = Model(path=path,type=AssetTypes.CUSTOM_MODEL,
                   name="diabetes_model")

out = ml_client.models.create_or_update(file_model)
out.path

'azureml://subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourceGroups/EYMAY/workspaces/aml-anshu2/datastores/workspaceartifactstore/paths/ExperimentRun/dcid.funny_engine_hhjgb1l13c/outputs/model.pkl'

In [48]:
out.name

'diabetes_model'

In [49]:
out.version

'3'

### Deployment

In [93]:
from azure.ai.ml.entities import ManagedOnlineEndpoint
import datetime

ep_name = "endpoint-diabetes2" 
ep = ManagedOnlineEndpoint(name = ep_name,
                           auth_mode="key")
ep_new = ml_client.online_endpoints.begin_create_or_update(ep).result()

In [104]:
ep = ml_client.online_endpoints.get("endpoint-diabetes2")

azure.ai.ml.entities._endpoint.online_endpoint.ManagedOnlineEndpoint

In [38]:
out.name

'diabetes_model'

In [50]:
out.version

'3'

In [51]:
import os
os.makedirs("score",exist_ok=True)

In [72]:
%%writefile score/score.py

import os
import joblib
import json, logging, numpy
def init():
    global model
    path = os.path.join(os.getenv("AZUREML_MODEL_DIR"),"./model.pkl") # AZUREML_MODEL_DIR = ./azureml-models/MODEL_NAME/VERSION
    model = joblib.load(path)
    loggign.info("initialization completed")

def run(raw_data):
    logging.info("model: request received")
    data = json.loads(raw_data)['data']
    data = numpy.array(data)
    result = model.predict(data)
    logging.info("request procsesed")
    return json.dumps(result.tolist())



Overwriting score/score.py


In [102]:
# deployment configuration
from azure.ai.ml.entities import Model,ManagedOnlineDeployment,CodeConfiguration
from azure.ai.ml.constants import AssetTypes


model = ml_client.models.get(name=out.name, version=out.version)

# define an online deployment
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=ep_new,
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    code_path="./score",
    scoring_script="score.py"
)

In [90]:
os.getcwd()

'd:\\AI\\MLOps\\EYMAY23\\MLOps-with-Azure'

In [103]:
ml_client.online_deployments.begin_create_or_update(blue_deployment)

Check: endpoint auth_mode: key
id: /subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourceGroups/EYMAY/providers/Microsoft.MachineLearningServices/workspaces/aml-anshu2/onlineEndpoints/endpoint-diabetes2
identity:
  principal_id: e59f2618-96b0-4b83-ab1c-fdf26e2b3465
  tenant_id: 13a86542-2185-4187-8e07-7512f5525c55
  type: system_assigned
kind: Managed
location: westus
mirror_traffic: {}
name: endpoint-diabetes2
openapi_uri: https://endpoint-diabetes2.westus.inference.ml.azure.com/swagger.json
properties:
  AzureAsyncOperationUri: https://management.azure.com/subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/providers/Microsoft.MachineLearningServices/locations/westus/mfeOperationsStatus/oe:83a32fcd-1a62-4d40-917a-38866f11f415:5b9da9f8-bb1a-4509-a796-73ee5206679b?api-version=2022-02-01-preview
  azureml.onlineendpointid: /subscriptions/84a5808b-5549-459a-98f2-f102e84fa1bb/resourcegroups/eymay/providers/microsoft.machinelearningservices/workspaces/aml-anshu2/onlineendpoints/endpoi

HttpResponseError: Operation returned an invalid status 'Bad Request'
Content: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request - Invalid URL</h2>
<hr><p>HTTP Error 400. The request URL is invalid.</p>
</BODY></HTML>
