## **Introduction**

Azure Machine Learning is a cloud platform for running machine learning workloads. Its main capabilities are:
- Model registration and management
- Data storage and connectivity
- Scalable computational resources
- ML workflow orchestration
- Metrics and monitoring
- Model deployment

https://docs.microsoft.com/it-it/learn/modules/work-with-data-in-aml/

### **Workspace**

A workspace is the context in which experiments are run, data is stored, and other assets are managed.

#### Create a workspace

In [17]:
from azureml.core import Workspace

In [18]:
"""
ws = Workspace.create(name = "aml-workspace",
                     subscription_id="123456-abs-c...",
                     resource_group="aml-workspace",
                     create_resource_group=True,
                     location='eastus'
                     )
"""


#### Connect to a workspace

You can connect either by loading a configuration file with `Workspace.from_config` or by passing the subscription ID, resource group, and workspace name explicitly.

In [19]:
ws = Workspace.from_config(path="./Config/config.json")
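
`Workspace.from_config` reads a small JSON file that identifies the workspace. As a minimal sketch, such a file can be written with the standard library alone. The three keys are the ones the SDK looks for; the helper name `write_workspace_config`, the temporary directory, and the placeholder values are assumptions made for illustration and are not part of the original notebook.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(path, subscription_id, resource_group, workspace_name):
    """Write a config.json in the layout read by Workspace.from_config."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config


# Placeholder values (assumptions): replace them with your own identifiers.
config_path = Path(tempfile.mkdtemp()) / "Config" / "config.json"
written = write_workspace_config(
    config_path,
    subscription_id="<your-subscription-id>",
    resource_group="aml-workspace",
    workspace_name="aml-workspace",
)
print(config_path.read_text(encoding="utf-8"))
```

The resulting path is what would be passed as `path` to `Workspace.from_config`.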

#### **List compute targets**

In [25]:
for compute_name in ws.compute_targets:
    compute = ws.compute_targets[compute_name]
    print(compute.name, ":", compute.type)

cpu-cluster : AmlCompute
v100-cluster-X1 : AmlCompute
v100-cluster-X4 : AmlCompute


## **Experiments**

An experiment is a named process, usually a script or a pipeline, that produces metrics and can be run with different datasets and settings.

Metrics can be logged with several methods of the run object:
- _log_: a single scalar value.
- _log_list_: a list of values.
- _log_row_: a row with multiple named columns.
- _log_table_: a dictionary of columns.
- _log_image_: an image file or a plot.
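
To make the data shape of each method concrete, here is a minimal sketch that runs without a workspace. `InMemoryRun` is a hypothetical stand-in that only mimics the method names of the real run object and stores values locally; it is not part of the Azure ML SDK, and the logged values are the DataFrame dimensions used later in this notebook.

```python
class InMemoryRun:
    """Hypothetical stand-in mimicking some logging methods of a run.

    It records values in a local dictionary so the example needs no workspace.
    """

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        # One scalar value stored under a name.
        self.metrics[name] = value

    def log_list(self, name, value):
        # A list of values stored under a name.
        self.metrics[name] = list(value)

    def log_row(self, name, **columns):
        # Each call appends one row; columns are passed as keyword arguments.
        table = self.metrics.setdefault(name, {})
        for column, cell in columns.items():
            table.setdefault(column, []).append(cell)

    def log_table(self, name, value):
        # A whole table at once: a dictionary keyed by column name.
        self.metrics[name] = dict(value)


run = InMemoryRun()
run.log("Rows", 161568)
run.log_list("Shape dataframe", [161568, 5])
run.log_row("Shape per call", rows=161568, columns=5)
run.log_table("Shape dataframe dict", {"rows": 161568, "columns": 5})
print(run.metrics)
```

With a real run, the same calls are made on the object returned by `experiment.start_logging()`.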

### **Logging metrics**

In [20]:
from azureml.core import Experiment

import pandas as pd

In [21]:
# Create the experiment (or get it, if it already exists).
experiment = Experiment(workspace=ws, name="my-exp-test-2")

# Start an interactive run of the experiment.
run = experiment.start_logging()

url = "https://raw.githubusercontent.com/datasets/covid-19/main/data/countries-aggregated.csv"

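try:
    df = pd.read_csv(url)
except Exception as exc:
    # Keep the original error attached so the cause of the failure is visible.
    raise RuntimeError(f"Could not download {url}; retry later.") from exc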

# log values.
run.log("Rows", df.shape[0])
run.log_list("Shape dataframe", [df.shape[0], df.shape[1]])
run.log_table("Shape dataframe dict", {"rows": df.shape[0], "columns": df.shape[1]})


# Mark the run as completed.
run.complete()

In [22]:
# retrieve the metrics (you can also see them in AzureML studio).

from azureml.widgets import RunDetails

In [23]:
RunDetails(run).show()

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…

In [24]:
import json

# Get logged metrics
metrics = run.get_metrics()
json.dumps(metrics)

'{"Rows": 161568, "Shape dataframe": [161568, 5], "Shape dataframe dict": {"rows": 161568, "columns": 5}}'
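
The dictionary returned by `run.get_metrics()` mixes scalars, lists, and nested dictionaries. The following sketch shows one way to print it more readably. It uses the values from the output above instead of a live run, so it can be executed without a workspace; the variable name `kinds` is only illustrative.

```python
import json

# Metrics copied from the output above; with a live run use run.get_metrics().
metrics = {
    "Rows": 161568,
    "Shape dataframe": [161568, 5],
    "Shape dataframe dict": {"rows": 161568, "columns": 5},
}

# indent makes the nested structure easier to read than a single-line dump.
print(json.dumps(metrics, indent=2))

# Map each metric name to the kind of value that was logged.
kinds = {name: type(value).__name__ for name, value in metrics.items()}
print(kinds)
```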

### **Log output files**

In [None]:
import pickle 

with open('shape_df.pickle', 'wb') as f:
    pickle.dump(df.shape, f, protocol=pickle.HIGHEST_PROTOCOL)

In [None]:
# path_or_stream must point to the local file written in the previous cell.
run.upload_file(name='outputs/shape_df.pickle', path_or_stream='./shape_df.pickle')

In [None]:
# retrieve files 

files = run.get_file_names()
print(json.dumps(files, indent=2))
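
`run.upload_file` needs the local path of a file that already exists, so it can be worth checking the pickle before uploading it. A minimal sketch using only the standard library follows; the temporary directory and the hard-coded shape tuple (the same shape reported in the metrics above) are assumptions for illustration.

```python
import pickle
import tempfile
from pathlib import Path

shape = (161568, 5)  # same shape as reported in the logged metrics

local_path = Path(tempfile.mkdtemp()) / "shape_df.pickle"
with open(local_path, "wb") as f:
    pickle.dump(shape, f, protocol=pickle.HIGHEST_PROTOCOL)

# Read the file back to confirm it exists and holds the expected object.
with open(local_path, "rb") as f:
    restored = pickle.load(f)

print(local_path.exists(), restored)
```

The verified `local_path` is the value that would be passed as `path_or_stream`.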

### **Run a Script as an Experiment**
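
The content of `1_tutorial.py` is not shown in this notebook. As a hypothetical sketch, a script submitted as an experiment usually retrieves its run with `Run.get_context()` and logs to it. The helper names `get_run` and `compute_metrics`, the toy input values, and the fallback for a missing SDK are assumptions added so the example also runs on a machine without `azureml-core`.

```python
def get_run():
    """Return the current Azure ML run, or None if the SDK is not installed."""
    try:
        from azureml.core import Run
    except ImportError:
        return None
    # Inside a submitted script this returns the run created by the submission.
    return Run.get_context()


def compute_metrics(values):
    """Toy computation standing in for the real work of the script."""
    return {"count": len(values), "total": sum(values)}


def main():
    run = get_run()
    metrics = compute_metrics([1, 2, 3])
    for name, value in metrics.items():
        if run is not None:
            run.log(name, value)
        print(name, value)
    return metrics


result = main()
```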

In [47]:
from azureml.core import Experiment, ScriptRunConfig

In [52]:
script_config = ScriptRunConfig(source_directory = "./Experiment",
                                script = "1_tutorial.py")

exp = Experiment(workspace=ws, name = "exp-with-script")
run = exp.submit(config=script_config)

run.wait_for_completion(show_output=True)

RunId: exp-with-script_1652104776_311dea45
Web View: https://ml.azure.com/runs/exp-with-script_1652104776_311dea45?wsid=/subscriptions/7e15aa60-0ca2-46f7-bad5-ab0c241c73d3/resourcegroups/rg-wp41prod/workspaces/obxihbwp41ws&tid=292eb105-33d7-4285-b2ec-60243d6187c2

Streaming azureml-logs/70_driver_log.txt

[2022-05-09T13:59:38.678557] Entering context manager injector.
[2022-05-09T13:59:39.577474] context_manager_injector.py Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath', 'RunHistory:context_managers.RunHistory', 'TrackUserError:context_managers.TrackUserError'], invocation=['1_tutorial.py'])
Script type = None
[2022-05-09T13:59:39.582474] Entering Run History Context Manager.
[2022-05-09T13:59:40.891475] Current directory: C:\Users\LUCIAR~1\AppData\Local\Temp\azureml_runs\exp-with-script_1652104776_311dea45
[2022-05-09T13:59:40.891475] Preparing to call script [1_tutorial.py] with arguments:[]
[2022-05-09T13:59:40.894478] After variable e

{'runId': 'exp-with-script_1652104776_311dea45',
 'target': 'local',
 'status': 'Completed',
 'startTimeUtc': '2022-05-09T13:59:38.154567Z',
 'endTimeUtc': '2022-05-09T13:59:45.683309Z',
 'services': {},
 'properties': {'_azureml.ComputeTargetType': 'local',
  'ContentSnapshotId': 'cf782475-05a2-4486-ae13-1f7c1a0b3367',
  'azureml.git.repository_uri': 'https://github.com/LuciaRavazzi/AzureML',
  'mlflow.source.git.repoURL': 'https://github.com/LuciaRavazzi/AzureML',
  'azureml.git.branch': 'main',
  'mlflow.source.git.branch': 'main',
  'azureml.git.commit': '501dd29e398df4697b4a4938b90b91bfb1fdb1f9',
  'mlflow.source.git.commit': '501dd29e398df4697b4a4938b90b91bfb1fdb1f9',
  'azureml.git.dirty': 'True'},
 'inputDatasets': [],
 'outputDatasets': [],
 'runDefinition': {'script': '1_tutorial.py',
  'command': '',
  'useAbsolutePath': False,
  'arguments': [],
  'sourceDirectoryDataStore': None,
  'framework': 'Python',
  'communicator': 'None',
  'target': 'local',
  'dataReferences': {}