# Run a training script as a command job

You can use the Python SDK for Azure Machine Learning to submit scripts as command jobs. By using jobs, you can easily keep track of the input parameters and outputs when training a machine learning model.

## Before you start

You'll need the latest version of the **azure-ai-ml** package to run the code in this notebook. Run the cell below to verify that it is installed.

> **Note**:
> If the **azure-ai-ml** package is not installed, run `pip install azure-ai-ml` to install it.

In [1]:
pip show azure-ai-ml

Name: azure-ai-ml
Version: 1.22.4
Summary: Microsoft Azure Machine Learning Client Library for Python
Home-page: https://github.com/Azure/azure-sdk-for-python
Author: Microsoft Corporation
Author-email: azuresdkengsysadmins@microsoft.com
License: MIT License
Location: c:\users\alienware\miniconda3\envs\py310\lib\site-packages
Requires: azure-common, azure-core, azure-mgmt-core, azure-storage-blob, azure-storage-file-datalake, azure-storage-file-share, colorama, isodate, jsonschema, marshmallow, msrest, opencensus-ext-azure, opencensus-ext-logging, pydash, pyjwt, pyyaml, strictyaml, tqdm, typing-extensions
Required-by: 
Note: you may need to restart the kernel to use updated packages.




## Connect to your workspace

With the required SDK package installed, you're now ready to connect to your workspace.

To connect to a workspace, you need three identifiers: a subscription ID, a resource group name, and a workspace name. The cell below reads the subscription ID and the service principal credentials from a local `config.ini` file, exports them as environment variables so that `DefaultAzureCredential` can pick them up, and then creates the `MLClient`.

In [2]:
import configparser
import os
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient


# Load environment variables from config.ini file
config = configparser.ConfigParser()

# Use an absolute path to avoid path issues
config_file_path = "G:/My Drive/Ingegneria/Data Science GD/My-Practice/my models/Azure ML/config.ini"
config.read(config_file_path)

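# Export the service principal credentials and IDs as environment variables.
# DefaultAzureCredential reads AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, and AZURE_TENANT_ID from the environment.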
os.environ['AZURE_CLIENT_ID'] = config['azure']['client_id']
os.environ['AZURE_CLIENT_SECRET'] = config['azure']['client_secret']
os.environ['AZURE_TENANT_ID'] = config['azure']['tenant_id']
os.environ['AZURE_SUBSCRIPTION_ID'] = config['azure']['subscription_id']
os.environ['AZURE_STORAGE_KEY'] = config['azure']['storage_key']


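credential = DefaultAzureCredential()
# Request a token up front so that authentication problems surface here, before the client is created
credential.get_token("https://management.azure.com/.default")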

# Initialize MLClient with the obtained credential
ml_client = MLClient(
    credential=credential,
    subscription_id=os.environ['AZURE_SUBSCRIPTION_ID'],
    resource_group_name="rg-dp100-labs",
    workspace_name="mlw-dp100-labs"
)
# ml_client
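
The cell above expects a `config.ini` file with an `[azure]` section containing the five keys it reads. As a minimal sketch of that layout, the snippet below parses an in-memory example and fails early if a key is missing. Every value is a placeholder, and the `load_azure_settings` helper and `REQUIRED_KEYS` list are hypothetical names introduced for illustration. Interpolation is switched off as a precaution, on the assumption that a secret might contain a `%` character, which `configparser` would otherwise try to interpret.

```python
import configparser

# Hypothetical config.ini contents; every value below is a placeholder.
SAMPLE_CONFIG = """
[azure]
client_id = <service-principal-app-id>
client_secret = <service-principal-secret>
tenant_id = <tenant-id>
subscription_id = <subscription-id>
storage_key = <storage-account-key>
"""

# The keys the notebook cell reads from the [azure] section
REQUIRED_KEYS = ["client_id", "client_secret", "tenant_id", "subscription_id", "storage_key"]


def load_azure_settings(text):
    """Parse INI text and return the [azure] section as a dict, failing early on missing keys."""
    # interpolation=None keeps values containing '%' literal instead of treating them as references
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("azure"):
        raise KeyError("config is missing the [azure] section")
    section = parser["azure"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return dict(section)


settings = load_azure_settings(SAMPLE_CONFIG)
print(sorted(settings))
```

Keep the real `config.ini` out of version control, since it holds a client secret and a storage key.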

## Custom tracking with MLflow

When running a script as a job, you can use MLflow in your training script to track the model. MLflow allows you to track any custom parameters, metrics, or artifacts you want to store with your job output.

Run the following cells to create the **train-model-mlflow.py** script in the **src** folder. The script trains a classification model by using the **diabetes.csv** file in the same folder, which is passed as an argument. 

Review the code below. The script imports `mlflow` and logs:

- The regularization rate as a **parameter**. 
- The accuracy and AUC as **metrics**.
- The plotted ROC curve as an **artifact**.

In [3]:
import os

# create a folder for the script files
script_folder = 'src'
os.makedirs(script_folder, exist_ok=True)
print(script_folder, 'folder created')

src folder created


In [4]:
%%writefile $script_folder/train-model-mlflow.py
# import libraries
import mlflow
import argparse
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

def main(args):
    # read data
    df = get_data(args.training_data)

    # split data
    X_train, X_test, y_train, y_test = split_data(df)

    # train model
    model = train_model(args.reg_rate, X_train, X_test, y_train, y_test)

    # evaluate model
    eval_model(model, X_test, y_test)

# function that reads the data
def get_data(path):
    print("Reading data...")
    df = pd.read_csv(path)
    
    return df

# function that splits the data
def split_data(df):
    print("Splitting data...")
    X, y = df[['Pregnancies','PlasmaGlucose','DiastolicBloodPressure','TricepsThickness',
    'SerumInsulin','BMI','DiabetesPedigree','Age']].values, df['Diabetic'].values

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)

    return X_train, X_test, y_train, y_test

# function that trains the model
def train_model(reg_rate, X_train, X_test, y_train, y_test):
    mlflow.log_param("Regularization rate", reg_rate)
    print("Training model...")
    model = LogisticRegression(C=1/reg_rate, solver="liblinear").fit(X_train, y_train)

    return model

# function that evaluates the model
def eval_model(model, X_test, y_test):
    # calculate accuracy
    y_hat = model.predict(X_test)
    acc = np.average(y_hat == y_test)
    print('Accuracy:', acc)
    mlflow.log_metric("Accuracy", acc)

    # calculate AUC
    y_scores = model.predict_proba(X_test)
    auc = roc_auc_score(y_test,y_scores[:,1])
    print('AUC: ' + str(auc))
    mlflow.log_metric("AUC", auc)

    # plot ROC curve
    fpr, tpr, thresholds = roc_curve(y_test, y_scores[:,1])
    fig = plt.figure(figsize=(6, 4))
    # Plot the diagonal 50% line
    plt.plot([0, 1], [0, 1], 'k--')
    # Plot the FPR and TPR achieved by our model
    plt.plot(fpr, tpr)
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve')
    plt.savefig("ROC-Curve.png")
    mlflow.log_artifact("ROC-Curve.png")    

def parse_args():
    # setup arg parser
    parser = argparse.ArgumentParser()

    # add arguments
    parser.add_argument("--training_data", dest='training_data',
                        type=str)
    parser.add_argument("--reg_rate", dest='reg_rate',
                        type=float, default=0.01)

    # parse args
    args = parser.parse_args()

    # return args
    return args

# run script
if __name__ == "__main__":
    # add space in logs
    print("\n\n")
    print("*" * 60)

    # parse args
    args = parse_args()

    # run main function
    main(args)

    # add space in logs
    print("*" * 60)
    print("\n\n")


Writing src/train-model-mlflow.py
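
The script's two arguments determine how the job command is written. The command used in this notebook passes only `--training_data`, so `reg_rate` falls back to its default of `0.01` and the model is trained with `C = 1/reg_rate`. The snippet below is a minimal local sketch of that behaviour. It rebuilds the same parser and splits a command string with `shlex`. The `build_parser` and `parse_command` helpers, and the `--reg_rate 0.1` variant, are illustrative and not part of the notebook.

```python
import argparse
import shlex


def build_parser():
    # Same two arguments as the training script's parse_args()
    parser = argparse.ArgumentParser()
    parser.add_argument("--training_data", dest="training_data", type=str)
    parser.add_argument("--reg_rate", dest="reg_rate", type=float, default=0.01)
    return parser


def parse_command(command):
    """Split a shell-style command string and parse everything after the script name."""
    tokens = shlex.split(command)
    # tokens[0] is "python" and tokens[1] is the script; the rest are script arguments
    return build_parser().parse_args(tokens[2:])


default_args = parse_command("python train-model-mlflow.py --training_data diabetes.csv")
custom_args = parse_command(
    "python train-model-mlflow.py --training_data diabetes.csv --reg_rate 0.1"
)

for args in (default_args, custom_args):
    # LogisticRegression takes C, the inverse of the regularization rate
    print("reg_rate:", args.reg_rate, "C:", 1 / args.reg_rate)
```

To train with a different regularization rate, append `--reg_rate <value>` to the `command` string of the job.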


Now, you can submit the script as a command job.

Run the cell below to train the model. 

In [5]:
from azure.ai.ml import command

# configure job

job = command(
    code="./src",
    command="python train-model-mlflow.py --training_data diabetes.csv",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="aml-cluster",
    display_name="diabetes-train-mlflow",
    experiment_name="diabetes-training", 
    tags={"model_type": "LogisticRegression"}
    )

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Uploading src (0.82 MBs): 100%|##

Monitor your job at https://ml.azure.com/runs/khaki_seal_88kss2k8jw?wsid=/subscriptions/a90ed0cd-b0b9-4e3a-bd85-67272a44de15/resourcegroups/rg-dp100-labs/workspaces/mlw-dp100-labs&tid=8bfc37bf-8e21-4420-841d-49303c72ec1a


In the Studio, navigate to the **diabetes-train-mlflow** job to explore the overview of the command job you ran:

- Find the logged parameters in the **Overview** tab, under **Params**.
- Find the logged metrics in the **Metrics** tab.
- Find the logged artifacts in the **Images** tab (specifically for images), and in the **Outputs + logs** tab (all files).
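
The last path segment of the Studio URL is the job name, which is the same identifier that MLflow reports as `run_id` later in this notebook. The snippet below is a small sketch that pulls the job name and workspace identifiers out of such a URL. It assumes the URL layout seen in the output above (`/runs/<job-name>?wsid=...`), which is an observation from this notebook and not a documented contract. The GUIDs are placeholders, and `parse_studio_url` is a hypothetical helper. In the notebook itself, `returned_job.name` gives the job name directly.

```python
from urllib.parse import urlparse, parse_qs


def parse_studio_url(url):
    """Return (job_name, workspace_ids) from a Studio job URL of the observed shape."""
    parsed = urlparse(url)
    # The job name is the last segment of the path, e.g. /runs/<job-name>
    job_name = parsed.path.rstrip("/").split("/")[-1]
    # wsid alternates keys and values:
    # /subscriptions/<id>/resourcegroups/<name>/workspaces/<name>
    wsid_parts = parse_qs(parsed.query)["wsid"][0].strip("/").split("/")
    workspace_ids = dict(zip(wsid_parts[0::2], wsid_parts[1::2]))
    return job_name, workspace_ids


# Placeholder GUIDs; the job, resource group, and workspace names match this notebook
example_url = (
    "https://ml.azure.com/runs/khaki_seal_88kss2k8jw"
    "?wsid=/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourcegroups/rg-dp100-labs/workspaces/mlw-dp100-labs"
    "&tid=11111111-1111-1111-1111-111111111111"
)

job_name, workspace_ids = parse_studio_url(example_url)
print(job_name)
print(workspace_ids)
```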

## Autologging with MLflow

Instead of writing custom logging code, you can have MLflow log parameters, metrics, and artifacts automatically. Autologging with MLflow requires only one line of code.

Run the following cell to create the **train-model-autolog.py** script in the **src** folder. The script trains a classification model by using the **diabetes.csv** file in the same folder, which is passed as an argument. 

Review the code below. The script imports `mlflow` and enables autologging with the line:

`mlflow.autolog()`

In [6]:
%%writefile $script_folder/train-model-autolog.py
# import libraries
import mlflow
import argparse
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

def main(args):
    # enable autologging
    mlflow.autolog()

    # read data
    df = get_data(args.training_data)

    # split data
    X_train, X_test, y_train, y_test = split_data(df)

    # train model
    model = train_model(args.reg_rate, X_train, X_test, y_train, y_test)

    # evaluate model
    eval_model(model, X_test, y_test)

# function that reads the data
def get_data(path):
    print("Reading data...")
    df = pd.read_csv(path)
    
    return df

# function that splits the data
def split_data(df):
    print("Splitting data...")
    X, y = df[['Pregnancies','PlasmaGlucose','DiastolicBloodPressure','TricepsThickness',
    'SerumInsulin','BMI','DiabetesPedigree','Age']].values, df['Diabetic'].values

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)

    return X_train, X_test, y_train, y_test

# function that trains the model
def train_model(reg_rate, X_train, X_test, y_train, y_test):
    print("Training model...")
    model = LogisticRegression(C=1/reg_rate, solver="liblinear").fit(X_train, y_train)

    return model

# function that evaluates the model
def eval_model(model, X_test, y_test):
    # calculate accuracy
    y_hat = model.predict(X_test)
    acc = np.average(y_hat == y_test)
    print('Accuracy:', acc)

    # calculate AUC
    y_scores = model.predict_proba(X_test)
    auc = roc_auc_score(y_test,y_scores[:,1])
    print('AUC: ' + str(auc))

    # plot ROC curve
    fpr, tpr, thresholds = roc_curve(y_test, y_scores[:,1])
    fig = plt.figure(figsize=(6, 4))
    # Plot the diagonal 50% line
    plt.plot([0, 1], [0, 1], 'k--')
    # Plot the FPR and TPR achieved by our model
    plt.plot(fpr, tpr)
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve')
    plt.savefig("ROC-Curve.png") 

def parse_args():
    # setup arg parser
    parser = argparse.ArgumentParser()

    # add arguments
    parser.add_argument("--training_data", dest='training_data',
                        type=str)
    parser.add_argument("--reg_rate", dest='reg_rate',
                        type=float, default=0.01)

    # parse args
    args = parser.parse_args()

    # return args
    return args

# run script
if __name__ == "__main__":
    # add space in logs
    print("\n\n")
    print("*" * 60)

    # parse args
    args = parse_args()

    # run main function
    main(args)

    # add space in logs
    print("*" * 60)
    print("\n\n")


Writing src/train-model-autolog.py


Now, you can submit the script as a command job.

Run the cell below to train the model. 

In [7]:
from azure.ai.ml import command

# configure job

job = command(
    code="./src",
    command="python train-model-autolog.py --training_data diabetes.csv",
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="aml-cluster",
    display_name="diabetes-train-autolog",
    experiment_name="diabetes-training"
    )

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

Uploading src (0.82 MBs): 100%|############################################| 821230/821230 [00:01<00:00, 577609.80it/s]



Monitor your job at https://ml.azure.com/runs/bold_hair_znh0m8l1ty?wsid=/subscriptions/a90ed0cd-b0b9-4e3a-bd85-67272a44de15/resourcegroups/rg-dp100-labs/workspaces/mlw-dp100-labs&tid=8bfc37bf-8e21-4420-841d-49303c72ec1a


In the Studio, navigate to the **diabetes-train-autolog** job to explore the overview of the command job you ran:

- Find the logged parameters in the **Overview** tab, under **Params**.
- Find the logged metrics in the **Metrics** tab.
- Find the logged artifacts in the **Images** tab (specifically for images), and in the **Outputs + logs** tab (all files, including the model files).

## Use MLflow to view and search for experiments

The Azure Machine Learning Studio is an easy-to-use UI to view and compare job runs. Alternatively, you can use MLflow to view experiment jobs. 

To list the experiments in the workspace, first point MLflow at the workspace's tracking URI, then call `mlflow.search_experiments()`:


In [12]:
import mlflow
from azureml.core import Workspace

ws = Workspace.get(name="mlw-dp100-labs",
                   subscription_id=os.environ['AZURE_SUBSCRIPTION_ID'],
                   resource_group="rg-dp100-labs")

mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())

Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.scriptrun = azureml.core.script_run:ScriptRun._from_run_dto with exception (urllib3 1.26.15 (c:\users\alienware\miniconda3\envs\py310\lib\site-packages), Requirement.parse('urllib3<3.0.0,>1.26.17')).


In [13]:
import mlflow
experiments = mlflow.search_experiments()
for exp in experiments:
    print(exp.name)

diabetes-training
move-diabetes-data
prepare_image
auto-ml-class-dev
mlflow-experiment-diabetes


To retrieve a specific experiment, you can get it by its name:

In [14]:
experiment_name = "diabetes-training"
exp = mlflow.get_experiment_by_name(experiment_name)
print(exp)

<Experiment: artifact_location='', creation_time=1732880954417, experiment_id='8b578913-146b-42b1-a3ad-7e6a415a07ce', last_update_time=None, lifecycle_stage='active', name='diabetes-training', tags={}>


Using the experiment's ID, you can retrieve all jobs (MLflow runs) of that experiment:

In [19]:
import pandas as pd
pd.set_option('display.max_columns', None)
mlflow.search_runs(exp.experiment_id)

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.Accuracy,metrics.AUC,metrics.training_score,metrics.training_f1_score,metrics.training_log_loss,metrics.training_precision_score,metrics.training_recall_score,metrics.training_accuracy_score,metrics.training_roc_auc_score,params.Regularization rate,params.random_state,params.warm_start,params.class_weight,params.tol,params.dual,params.multi_class,params.intercept_scaling,params.penalty,params.verbose,params.fit_intercept,params.l1_ratio,params.max_iter,params.n_jobs,params.C,params.solver,tags.mlflow.rootRunId,tags.mlflow.user,tags.mlflow.runName,tags.model_type,tags.mlflow.autologging,tags.estimator_class,tags.estimator_name
0,amiable_nose_75gs2gyz11,8b578913-146b-42b1-a3ad-7e6a415a07ce,FAILED,,2024-11-29 11:53:10.608000+00:00,2024-11-29 11:54:31.795000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,amiable_nose_75gs2gyz11,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-pythonv2-train,,,,
1,olden_cheese_t267mf4sml,8b578913-146b-42b1-a3ad-7e6a415a07ce,FAILED,,2024-11-29 12:42:44.592000+00:00,2024-11-29 12:44:57.986000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,olden_cheese_t267mf4sml,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-pythonv2-train,,,,
2,quirky_apricot_t1c2v5pc65,8b578913-146b-42b1-a3ad-7e6a415a07ce,FAILED,,2024-11-29 13:49:21.573000+00:00,2024-11-29 13:51:32.661000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,quirky_apricot_t1c2v5pc65,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-pythonv2-train,,,,
3,silver_camel_jxtfjlh9nl,8b578913-146b-42b1-a3ad-7e6a415a07ce,FAILED,,2024-11-29 14:21:43.891000+00:00,2024-11-29 14:24:18.751000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,silver_camel_jxtfjlh9nl,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-pythonv2-train,,,,
4,bubbly_bucket_xvxz2m8xml,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-11-29 15:13:53.688000+00:00,2024-11-29 15:15:59.123000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,bubbly_bucket_xvxz2m8xml,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-pythonv2-train,,,,
5,happy_garlic_wr326ppnql,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-02 12:05:02.967000+00:00,2024-12-02 12:06:34.272000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,happy_garlic_wr326ppnql,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-cluster,,,,
6,zen_cup_mw8zwptm5b,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-03 10:54:00.566000+00:00,2024-12-03 10:55:33.769000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,zen_cup_mw8zwptm5b,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-curated-env,,,,
7,wheat_whistle_887dry6fnj,8b578913-146b-42b1-a3ad-7e6a415a07ce,FAILED,,2024-12-03 11:13:30.117000+00:00,2024-12-03 11:14:20.666000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,wheat_whistle_887dry6fnj,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-custom-env,,,,
8,magenta_spring_lycgxh8y6p,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-03 11:33:27.299000+00:00,2024-12-03 11:34:38.689000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,magenta_spring_lycgxh8y6p,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-custom-env,,,,
9,salmon_peach_530mmkhf33,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-06 10:17:40.963000+00:00,2024-12-06 10:19:07.956000+00:00,,,,,,,,,,,,,,,,,,,,,,,,,,salmon_peach_530mmkhf33,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-script-parameters,,,,


To compare job runs and outputs more easily, you can order and limit the search results. For example, the following cell orders the results by `start_time` in descending order (most recent first) and returns at most `2` results:

In [20]:
mlflow.search_runs(exp.experiment_id, order_by=["start_time DESC"], max_results=2)

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.training_score,metrics.training_roc_auc_score,metrics.training_precision_score,metrics.training_f1_score,metrics.training_log_loss,metrics.training_recall_score,metrics.training_accuracy_score,metrics.Accuracy,metrics.AUC,params.random_state,params.warm_start,params.class_weight,params.tol,params.dual,params.multi_class,params.intercept_scaling,params.penalty,params.verbose,params.fit_intercept,params.l1_ratio,params.max_iter,params.n_jobs,params.C,params.solver,params.Regularization rate,tags.mlflow.rootRunId,tags.estimator_class,tags.estimator_name,tags.mlflow.runName,tags.mlflow.autologging,tags.mlflow.user,tags.model_type
0,bold_hair_znh0m8l1ty,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-07 10:29:42.921000+00:00,2024-12-07 10:31:15.846000+00:00,0.790857,0.861979,0.78576,0.785843,0.434396,0.790857,0.790857,,,,False,,0.0001,False,auto,1.0,l2,0.0,True,,100.0,,100.0,liblinear,,bold_hair_znh0m8l1ty,sklearn.linear_model._logistic.LogisticRegression,LogisticRegression,diabetes-train-autolog,sklearn,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,
1,khaki_seal_88kss2k8jw,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-07 10:19:50.268000+00:00,2024-12-07 10:21:21.328000+00:00,,,,,,,,0.774,0.84832,,,,,,,,,,,,,,,,0.01,khaki_seal_88kss2k8jw,,,diabetes-train-mlflow,,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,LogisticRegression
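
The effect of `order_by=["start_time DESC"]` combined with `max_results=2` can be mimicked in plain Python. The sketch below doesn't call MLflow. It sorts three `(run_id, start_time)` pairs copied from the outputs in this notebook and keeps the two most recent ones. The `latest_runs` helper is a hypothetical name used only for illustration.

```python
from datetime import datetime

# (run_id, start_time) pairs copied from the search_runs outputs in this notebook
runs = [
    ("salmon_peach_530mmkhf33", "2024-12-06 10:17:40.963000+00:00"),
    ("khaki_seal_88kss2k8jw", "2024-12-07 10:19:50.268000+00:00"),
    ("bold_hair_znh0m8l1ty", "2024-12-07 10:29:42.921000+00:00"),
]


def latest_runs(runs, max_results):
    """Order runs by start time, newest first, and keep at most max_results of them."""
    ordered = sorted(runs, key=lambda run: datetime.fromisoformat(run[1]), reverse=True)
    return [run_id for run_id, _ in ordered[:max_results]]


print(latest_runs(runs, max_results=2))
```

The two run IDs it keeps are the same two jobs returned by the cell above: the autologging job first, then the custom-logging job.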


You can even create a query to filter the runs. Filter query strings are written with a simplified version of the SQL `WHERE` clause. 

To filter, you can use two classes of comparators:

- Numeric comparators (metrics): =, !=, >, >=, <, and <=.
- String comparators (params, tags, and attributes): = and !=.

Learn more about [how to track experiments with MLflow](https://learn.microsoft.com/azure/machine-learning/how-to-track-experiments-mlflow).

In [21]:
query = "metrics.AUC > 0.8 and tags.model_type = 'LogisticRegression'"
mlflow.search_runs(exp.experiment_id, filter_string=query)

Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.Accuracy,metrics.AUC,params.Regularization rate,tags.model_type,tags.mlflow.rootRunId,tags.mlflow.user,tags.mlflow.runName
0,khaki_seal_88kss2k8jw,8b578913-146b-42b1-a3ad-7e6a415a07ce,FINISHED,,2024-12-07 10:19:50.268000+00:00,2024-12-07 10:21:21.328000+00:00,0.774,0.84832,0.01,LogisticRegression,khaki_seal_88kss2k8jw,cd1e38bd-fc3d-4eaf-b1a9-4eba1da5430a,diabetes-train-mlflow
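
To make the filter semantics concrete, the sketch below applies the same two conditions in plain Python to values copied from the two most recent runs. It doesn't call MLflow, and the `matches` helper is hypothetical. It shows why only the custom-logging run is returned: the autologging run recorded `training_roc_auc_score` but no `AUC` metric, and its job was submitted without a `model_type` tag.

```python
# Values copied from the search_runs output; None marks a field the run did not log
runs = [
    {
        "run_id": "bold_hair_znh0m8l1ty",  # autologging job
        "metrics.AUC": None,
        "tags.model_type": None,
    },
    {
        "run_id": "khaki_seal_88kss2k8jw",  # custom-logging job
        "metrics.AUC": 0.84832,
        "tags.model_type": "LogisticRegression",
    },
]


def matches(run):
    """Mimic: metrics.AUC > 0.8 and tags.model_type = 'LogisticRegression'."""
    auc = run["metrics.AUC"]
    # A run that never logged the metric cannot satisfy a numeric comparison
    has_high_auc = auc is not None and auc > 0.8
    has_tag = run["tags.model_type"] == "LogisticRegression"
    return has_high_auc and has_tag


selected = [run["run_id"] for run in runs if matches(run)]
print(selected)
```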
