# Running the script as an experiment

You can use a ScriptRunConfig to run a script-based experiment that trains a machine learning model.

To prepare for an experiment that trains a model, create a training script and save it in a folder. For example, you could save the script as training_script.py in a folder named training_folder. If the script loads its training data from data.csv, save that file in the same folder as well.

To run the script, create a ScriptRunConfig that references the folder and script file. You generally also need to define a Python (Conda) environment that includes any packages the script requires. In this example, the script uses Scikit-Learn, so the environment must include it. The script also uses Azure Machine Learning to log metrics, so the environment must include the azureml-defaults package as well.

In [1]:
from azureml.core import Experiment, ScriptRunConfig, Environment, Workspace
from azureml.core.conda_dependencies import CondaDependencies

In [3]:
# connect to workspace
ws = Workspace.from_config()
ws

# create a python environment for the experiment
sklearn_env = Environment("sklearn-dev")

# ensure required packages are installed
packages = CondaDependencies.create(
    conda_packages=['scikit-learn', 'pip'],
    pip_packages=['azureml-defaults', 'pandas'])
sklearn_env.python.conda_dependencies = packages

# create a script config
script_config = ScriptRunConfig(
    source_directory=".",
    script="training_with_arguments.py",
    arguments=['--reg-rate', 0.1],
    environment=sklearn_env
)

# submit the experiment
experiment = Experiment(workspace=ws, name="training-experiment-with-argument")
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)

RunId: training-experiment-with-argument_1662117643_3812c36c
Web View: https://ml.azure.com/runs/training-experiment-with-argument_1662117643_3812c36c?wsid=/subscriptions/3ed3266f-ff5e-4b56-b844-7568f3957f98/resourcegroups/am-rg/workspaces/aml-workspace&tid=fd50ea2b-9154-4926-9399-6cc1b0859c88

Streaming azureml-logs/70_driver_log.txt

[2022-09-02T11:20:49.653187] Entering context manager injector.
[2022-09-02T11:20:50.471020] context_manager_injector.py Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath', 'RunHistory:context_managers.RunHistory', 'TrackUserError:context_managers.TrackUserError'], invocation=['training_with_arguments.py', '--reg-rate', '0.1'])
Script type = None
[2022-09-02T11:20:50.474537] Entering Run History Context Manager.
[2022-09-02T11:20:52.161553] Current directory: /private/var/folders/3g/fqf9w8vj3kn455_6g9l325kh0000gp/T/azureml_runs/training-experiment-with-argument_1662117643_3812c36c
[2022-09-02T11:20:52.161807] P

{'runId': 'training-experiment-with-argument_1662117643_3812c36c',
 'target': 'local',
 'status': 'Completed',
 'startTimeUtc': '2022-09-02T11:20:47.129048Z',
 'endTimeUtc': '2022-09-02T11:22:08.304256Z',
 'services': {},
 'properties': {'_azureml.ComputeTargetType': 'local',
  'ContentSnapshotId': '8300f301-a750-422b-90ad-58bde7856389',
  'azureml.git.repository_uri': 'https://github.com/mazumdarabhinav/azureml.git',
  'mlflow.source.git.repoURL': 'https://github.com/mazumdarabhinav/azureml.git',
  'azureml.git.branch': 'main',
  'mlflow.source.git.branch': 'main',
  'azureml.git.commit': '53ea90cc1d920915bb4aa76f2dff5d273cab9faa',
  'mlflow.source.git.commit': '53ea90cc1d920915bb4aa76f2dff5d273cab9faa',
  'azureml.git.dirty': 'True'},
 'inputDatasets': [],
 'outputDatasets': [],
 'runDefinition': {'script': 'training_with_arguments.py',
  'command': '',
  'useAbsolutePath': False,
  'arguments': ['--reg-rate', '0.1'],
  'sourceDirectoryDataStore': None,
  'framework': 'Python',
  'co
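
The driver log above shows the script being invoked as `training_with_arguments.py --reg-rate 0.1`, so each entry in the `arguments` list reaches the script as a command-line string. The script itself is not shown here. As a minimal sketch, assuming it uses the standard `argparse` module, it could read the value as follows. The `parse_args` helper, the `reg_rate` attribute name, and the default of 0.01 are illustrative assumptions, not taken from the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the regularization rate passed on the command line.

    `argv=None` makes argparse read sys.argv, which is what happens
    when the script is launched by a run.
    """
    parser = argparse.ArgumentParser()
    # Hypothetical default; the original script's default is not known.
    parser.add_argument("--reg-rate", type=float, dest="reg_rate", default=0.01)
    return parser.parse_args(argv)


# Simulate the invocation recorded in the driver log.
args = parse_args(["--reg-rate", "0.1"])
print(args.reg_rate)  # prints 0.1
```

Declaring the argument with `type=float` matters because the value arrives as the string `'0.1'`, as the `runDefinition` output shows.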

# Registering models

## Retrieving model files

After an experiment run has completed, you can use the run object's get_file_names method to list the files it generated. Standard practice is for scripts that train models to save them in the run's outputs folder.
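
As a rough sketch of that practice, a training script might write its model into an `outputs` folder like this. The `save_model` helper is hypothetical, and using `pickle` is an assumption: the source only shows that the file is named model.pkl, not how it was written (joblib is a common alternative for Scikit-Learn models).

```python
import os
import pickle


def save_model(model, output_dir="outputs", file_name="model.pkl"):
    """Serialize `model` into `output_dir` and return the path written."""
    # Create the folder if the script is the first thing to write to it.
    os.makedirs(output_dir, exist_ok=True)
    model_path = os.path.join(output_dir, file_name)
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return model_path


# In a real training script, `model` would be the fitted estimator, e.g.:
# save_model(model)
```

With the default arguments, the file lands at outputs/model.pkl relative to the script's working directory.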

You can also use the run object's download_file and download_files methods to download output files to the local file system.

In [4]:
# "run" is a reference to a completed experiment run

# list the files generated by the experiment
for file in run.get_file_names():
    print(file)

# download a named file
run.download_file(name="outputs/model.pkl", output_file_path='model.pkl')

azureml-logs/60_control_log.txt
azureml-logs/70_driver_log.txt
logs/azureml/69024_azureml.log
outputs/model.pkl
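
The listing above mixes log files with model outputs. If only the contents of the outputs folder are wanted, the two calls already shown (`get_file_names` and `download_file`) can be combined. The following is a minimal sketch; the `download_run_outputs` helper and its `target_dir` default are illustrative names, not part of the SDK.

```python
import os


def download_run_outputs(run, target_dir="downloaded_outputs", prefix="outputs/"):
    """Download every run file whose name starts with `prefix`.

    `run` is expected to be a completed run exposing get_file_names()
    and download_file(name=..., output_file_path=...).
    Returns the list of local paths requested.
    """
    os.makedirs(target_dir, exist_ok=True)
    downloaded = []
    for name in run.get_file_names():
        if not name.startswith(prefix):
            continue  # skip logs and other non-output files
        local_path = os.path.join(target_dir, os.path.basename(name))
        run.download_file(name=name, output_file_path=local_path)
        downloaded.append(local_path)
    return downloaded


# Usage with the completed run from the earlier cell:
# download_run_outputs(run)
```

The run object's download_files method mentioned above may cover the same need in a single call; check its parameters in the SDK reference before relying on it.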


## Registering a model

Model registration enables you to track multiple versions of a model, and retrieve models for inferencing (predicting label values from new data). When you register a model, you can specify a name, description, tags, framework (such as Scikit-Learn or PyTorch), framework version, custom properties, and other useful metadata. The first model registered under a given name becomes version 1; registering another model with the same name automatically creates a new version, with the version number increasing by 1 each time.

To register a model from a local file, you can use the register method of the Model class as shown here:

### Option-1

In [7]:
from azureml.core import Model

model = Model.register(
    workspace=ws,
    model_name="diabetes_classification_model",
    model_path="model.pkl", # local path
    description="A diabetes classification model",
    tags={'data-format': 'CSV'},
    model_framework=Model.Framework.SCIKITLEARN,
    model_framework_version='0.20.3'
)

Registering model diabetes_classification_model


### Option-2

Alternatively, if you have a reference to the Run used to train the model, you can use its register_model method as shown here:

In [8]:
run.register_model(
    model_name="diabetes_classification_model",
    model_path="outputs/model.pkl", # run outputs path
    description="A diabetes classification model",
    tags={'data-format': 'CSV'},
    model_framework=Model.Framework.SCIKITLEARN,
    model_framework_version='0.20.3'
)

Model(workspace=Workspace.create(name='aml-workspace', subscription_id='3ed3266f-ff5e-4b56-b844-7568f3957f98', resource_group='am-rg'), name=diabetes_classification_model, id=diabetes_classification_model:2, version=2, tags={'data-format': 'CSV'}, properties={})

## Viewing registered models

In [9]:
from azureml.core import Model

for model in Model.list(ws):
    print(model.name, "version:", model.version)

diabetes_classification_model version: 2
diabetes_classification_model version: 1
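
Because every registration adds a version, the listing grows quickly. As a small sketch, the loop above can be reduced to the highest version per model name. The `latest_versions` helper is a hypothetical name; it relies only on the `name` and `version` attributes printed above.

```python
def latest_versions(models):
    """Return {model name: highest version number} for an iterable of models."""
    latest = {}
    for model in models:
        current = latest.get(model.name)
        if current is None or model.version > current:
            latest[model.name] = model.version
    return latest


# Usage with the workspace from the earlier cells:
# print(latest_versions(Model.list(ws)))
```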
