# Managing Experiments & Training Runs in Azure
A simple notebook showcasing some of the functions and attributes of the `Experiment` object in the Azure ML Python SDK.

Link to documentation -> [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment?view=azure-ml-py)

### Connecting to the workspace

In [1]:
from azureml.core import Workspace

ws = Workspace.from_config(path="../")

Performing interactive authentication. Please follow the instructions on the terminal.


Note, we have launched a browser for you to login. For old experience with device code, use "az login --use-device-code"


You have logged in. Now let us find all the subscriptions to which you have access...
Interactive authentication successfully completed.


## Run Experiment from script

### Setting up the Environment

The `Environment` object lets us define an environment as code, which makes it reproducible. The doc [linked here](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments) provides details on how to create and manage environments.

Link to documentation -> [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment.environment?view=azure-ml-py)

To get the list of curated environments, run the following code:
```python
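from azureml.core import Environment

# Environment.list returns a dict mapping environment names to Environment objects
envs = Environment.list(workspace=ws)

for env in envs:
    # Curated environments have names starting with "AzureML"
    if env.startswith("AzureML"):
        print("Name", env)
        print("packages", envs[env].python.conda_dependencies.serialize_to_string())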
```

In [2]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Create a Python environment for the experiment
diabetes_env = Environment("diabetes-experiment-env")
diabetes_env.python.user_managed_dependencies = False # Let Azure ML manage dependencies
diabetes_env.docker.enabled = True # Use a docker container

# Create a set of package dependencies (conda or pip as required)
diabetes_packages = CondaDependencies.create(conda_packages=['scikit-learn','ipykernel','matplotlib','pandas','pip'],
                                             pip_packages=['azureml-sdk','pyarrow'])

# Add the dependencies to the environment
diabetes_env.python.conda_dependencies = diabetes_packages

print(diabetes_env.name, 'defined.')

# Register the environment
diabetes_env.register(workspace=ws)

diabetes-experiment-env defined.


{
    "databricks": {
        "eggLibraries": [],
        "jarLibraries": [],
        "mavenLibraries": [],
        "pypiLibraries": [],
        "rcranLibraries": []
    },
    "docker": {
        "arguments": [],
        "baseDockerfile": null,
        "baseImage": "mcr.microsoft.com/azureml/intelmpi2018.3-ubuntu16.04:20200821.v1",
        "baseImageRegistry": {
            "address": null,
            "password": null,
            "registryIdentity": null,
            "username": null
        },
        "enabled": true,
        "platform": {
            "architecture": "amd64",
            "os": "Linux"
        },
        "sharedVolumes": true,
        "shmSize": null
    },
    "environmentVariables": {
        "EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"
    },
    "inferencingStackVersion": null,
    "name": "diabetes-experiment-env",
    "python": {
        "baseCondaEnvironment": null,
        "condaDependencies": {
            "channels": [
                "anaconda",
                "co

### Setting up the Compute Target

In [3]:
ws.compute_targets.keys()

dict_keys(['compute-instance-small', 'compute-cluster'])

In [4]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

cluster_name = "compute-cluster"

try:
    # Check for existing compute target
    training_cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    # If it doesn't already exist, create it
    try:
        compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS11_V2', max_nodes=2)
        training_cluster = ComputeTarget.create(ws, cluster_name, compute_config)
        training_cluster.wait_for_completion(show_output=True)
    except Exception as ex:
        print(ex)

Found existing cluster, use it.


In [5]:
from azureml.core import Dataset

print("Datasets:")
for dataset_name in list(ws.datasets.keys()):
    dataset = Dataset.get_by_name(ws, dataset_name)
    print("\t", dataset.name, 'version', dataset.version)

Datasets:
	 diabetes_portal version 1


`ScriptRunConfig` defines the script to run, the arguments to pass to it, the Python environment to run it in, and the compute target to run it on.

In [6]:
from azureml.core import Experiment, ScriptRunConfig
from azureml.widgets import RunDetails

# get the registered environment
registered_env = Environment.get(ws, 'diabetes-experiment-env')

# Get the training dataset
diabetes_ds = ws.datasets.get("diabetes_portal")

# Create a script config
script_config = ScriptRunConfig(source_directory="../scripts/",
                                script='basic_training.py',
                                arguments=['--input-data', diabetes_ds.as_named_input('training_data')],
                                environment=registered_env,
                                compute_target=cluster_name)

# submit the experiment
experiment_name = 'mslearn-train-diabetes'
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=script_config)
RunDetails(run).show()

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…
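
The `arguments` list given to `ScriptRunConfig` is passed to `basic_training.py` on the command line. The script itself is not shown in this notebook, so the following is only a minimal sketch of how such a script might read `--input-data`, using the standard-library `argparse`. The function name `parse_args` and the sample value are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line arguments a training script might receive.

    Hypothetical sketch: the real basic_training.py is not shown here.
    """
    parser = argparse.ArgumentParser(description="Training script arguments")
    # Matches the '--input-data' flag used in the ScriptRunConfig arguments
    parser.add_argument("--input-data", dest="input_data", type=str,
                        required=True, help="Reference to the training dataset")
    return parser.parse_args(argv)


# Simulate the command line with a placeholder value (not a real dataset ID)
args = parse_args(["--input-data", "placeholder-dataset-reference"])
print("Input data argument:", args.input_data)
```

Inside a submitted run, the SDK also exposes named inputs through `Run.get_context().input_datasets`, so a script would typically look up `'training_data'` there rather than rely on the raw argument value alone.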

In [7]:
# List the files generated by the experiment
for file in run.get_file_names():
    print(file)

ROC_1629284084.png
azureml-logs/20_image_build_log.txt
azureml-logs/55_azureml-execution-tvmps_aa4c12a9d64ab4e97f19bf630502407054bb04c38556f4a50e7246661e78f215_p.txt
azureml-logs/65_job_prep-tvmps_aa4c12a9d64ab4e97f19bf630502407054bb04c38556f4a50e7246661e78f215_p.txt
azureml-logs/70_driver_log.txt
azureml-logs/75_job_post-tvmps_aa4c12a9d64ab4e97f19bf630502407054bb04c38556f4a50e7246661e78f215_p.txt
azureml-logs/process_info.json
azureml-logs/process_status.json
logs/azureml/105_azureml.log
logs/azureml/dataprep/backgroundProcess.log
logs/azureml/dataprep/backgroundProcess_Telemetry.log
logs/azureml/job_prep_azureml.log
logs/azureml/job_release_azureml.log
outputs/diabetes_model.pkl
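
The list above mixes system logs with artifacts written by the script. As a rough sketch, the names returned by `run.get_file_names()` can be grouped by top-level folder to make the `outputs/` artifacts easier to spot. The helper `group_by_folder` is hypothetical, and the sample list is a subset of the file names printed above.

```python
from collections import defaultdict


def group_by_folder(file_names):
    """Group run file names by their top-level folder.

    Files with no folder component are placed under the "(root)" key.
    """
    groups = defaultdict(list)
    for name in file_names:
        folder, sep, _ = name.partition("/")
        groups[folder if sep else "(root)"].append(name)
    return dict(groups)


# Subset of the file names listed by the cell above
sample_files = [
    "ROC_1629284084.png",
    "azureml-logs/70_driver_log.txt",
    "logs/azureml/105_azureml.log",
    "outputs/diabetes_model.pkl",
]

grouped = group_by_folder(sample_files)
for folder, names in grouped.items():
    print(folder, "->", names)
```

With a live run, `group_by_folder(run.get_file_names())` would be called instead of using the sample list. The path under `outputs/` is the one passed as `model_path` when registering the model.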


In [10]:
from azureml.core import Model

# Register the model file produced by the run
run.register_model(model_name='classification_model',
                   model_path='outputs/diabetes_model.pkl',
                   description='A classification model',
                   tags={'data-format': 'CSV'},
                   model_framework=Model.Framework.SCIKITLEARN,
                   model_framework_version='0.20.3')

Model(workspace=Workspace.create(name='azureMLWN', subscription_id='a859c8d5-c4f8-43b2-bc8c-178f5180bb8a', resource_group='azureMLRG'), name=classification_model, id=classification_model:1, version=1, tags={'data-format': 'CSV'}, properties={})

In [11]:
for model in Model.list(ws):
    # Get model name and auto-generated version
    print(model.name, 'version:', model.version)

classification_model version: 1
