In [111]:
%reload_azureml_ws

2020-04-01 15:57:06,175 - azureml.core.workspace - INFO - Found the config file in: /home/yuvraj/projects/AzureAworkspace/azureds/.azureml/config.json
Ready to use Azure ML 1.2.0 to work with azml-workspace
Imported workspace as ws
2020-04-01 15:57:08,281 - root - INFO - Defined global variable `ws` and `conf_catalog`


# Azure Experiment 


## 1.1 Azure Experiment 

In Azure ML, an experiment is an abstraction that lets you run a script or a pipeline. Its main feature is the ability to generate metrics and outputs that can be tracked in the Azure Machine Learning workspace.

When you submit an experiment, you use its run context to initialize and end the experiment run that is tracked in Azure Machine Learning, as shown in the following code sample:

```python
from azureml.core import Experiment

# Create an experiment in the workspace
experiment = Experiment(workspace=ws, name="my-experiment")

# Start the experiment run
run = experiment.start_logging()

# Experiment code goes here
def do_something():
    pass

do_something()

# End the experiment run
run.complete()
```
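
If the experiment code raises an exception, `run.complete()` is never reached and the run is left open. The following is a minimal sketch of one way to guard against that. The helper name `run_tracked` and the caller-supplied `work` function are assumptions made for illustration and are not part of the SDK; the sketch relies only on `start_logging()`, `complete()`, and `fail()`.

```python
def run_tracked(experiment, work):
    """Start a run, execute work(run), and always close the run.

    `run_tracked` and `work` are hypothetical names for this sketch;
    `experiment` is expected to be an azureml.core.Experiment.
    """
    run = experiment.start_logging()
    try:
        result = work(run)
    except Exception as exc:
        # Mark the run as failed so it does not stay open in the workspace
        run.fail(error_details=str(exc))
        raise
    # Mark the run as completed only when the work finished without errors
    run.complete()
    return result
```

For example, `run_tracked(experiment, lambda run: run.log('observations', 42))` would log one metric and then complete the run.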

## 1.2 Experiment Logging Metric and Outputs

Every experiment generates log files that include the messages that would be written to the terminal during interactive execution, so you can use simple print statements to write messages to the log. However, if you want to record named metrics for comparison across runs, use the Run object, which provides a range of logging functions for this purpose. These include:

- ***log:*** Record a single named value.
- ***log_list:*** Record a named list of values.
- ***log_row:*** Record a row with multiple columns.
- ***log_table:*** Record a dictionary as a table.
- ***log_image:*** Record an image file or a plot.

For example, the following code records the number of observations (records) in a CSV file:

```python
from azureml.core import Experiment
import pandas as pd

# Create an Azure ML experiment in your workspace
experiment = Experiment(workspace=ws, name='my-experiment')

# Start logging data from the experiment
run = experiment.start_logging()

# Load the dataset and count the rows
data = pd.read_csv('data.csv')
row_count = len(data)

# Log the row count
run.log('observations', row_count)

# Complete the experiment
run.complete()
```
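
The other logging functions listed above follow the same pattern. The sketch below calls `log`, `log_list`, `log_row`, and `log_table` once each; the function name `log_examples`, its parameters, and the metric names are assumptions chosen for illustration. `log_image` is left out because it needs an image file or a plot.

```python
def log_examples(run, losses, label_counts):
    """Record metrics with several Run logging functions.

    `log_examples`, `losses`, and `label_counts` are hypothetical names;
    `run` is expected to be an azureml.core.Run.
    """
    # log: a single named value
    run.log("final_loss", losses[-1])

    # log_list: a named list of values
    run.log_list("losses", losses)

    # log_row: one row with named columns, here one row per label
    for label, count in label_counts.items():
        run.log_row("label_counts", label=label, count=count)

    # log_table: a dictionary mapping column names to lists of values
    run.log_table("label_table", {
        "label": list(label_counts.keys()),
        "count": list(label_counts.values()),
    })
```

With a run started by `experiment.start_logging()`, this could be called as `log_examples(run, [0.9, 0.5, 0.3], {"yes": 40, "no": 60})`.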

## 1.3 Running an Experiment Script
To run a script as an experiment, you must define a run configuration that defines the Python environment in which the script will be run, and a script run configuration that associates the run environment with the script. These are implemented by using the RunConfiguration and ScriptRunConfig objects.

For example, the following code runs an experiment based on a script in the experiment_files folder (which must also contain any files used by the script, such as the data.csv file from the previous code example):

```python
from azureml.core import Experiment, RunConfiguration, ScriptRunConfig

# Folder containing the script and any files it uses
experiment_folder = 'experiment_files'

# Create a new RunConfiguration object
experiment_run_config = RunConfiguration()

# Create a script config
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='experiment.py',
                                run_config=experiment_run_config)

# submit the experiment
experiment = Experiment(workspace = ws, name = 'my-experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)
```

>Note: The RunConfig object defines the Python environment for the experiment, including the packages available to the script. If your script depends on packages that are not included in the default environment, you must associate the RunConfig with an Environment object that makes use of a CondaDependencies object to specify the Python packages required. Runtime environments are discussed in more detail later in this course.
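
As a rough sketch of what the note above describes, the function below attaches an Environment with a CondaDependencies package list to a run configuration. The function name `build_run_config`, its parameters, and the default environment name `experiment-env` are assumptions made for illustration; the imports sit inside the function only so that the sketch can be loaded without the SDK installed.

```python
def build_run_config(pip_packages, conda_packages=None, env_name="experiment-env"):
    """Return a RunConfiguration whose environment includes the given packages.

    `build_run_config` and `env_name` are hypothetical names for this sketch.
    """
    # Imported here so the module loads even when azureml is not installed
    from azureml.core import Environment, RunConfiguration
    from azureml.core.conda_dependencies import CondaDependencies

    # Define the Python environment and the packages it must contain
    env = Environment(name=env_name)
    env.python.conda_dependencies = CondaDependencies.create(
        conda_packages=conda_packages or [],
        pip_packages=pip_packages,
    )

    # Associate the environment with the run configuration
    run_config = RunConfiguration()
    run_config.environment = env
    return run_config
```

The result could then be passed as `run_config` when creating the `ScriptRunConfig`, for example `build_run_config(pip_packages=['pandas'])`.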

## 1.4 View Experiment Results

After the experiment has finished, you can use the **run** object to get information about the run and its outputs:

```python
import json

# Get run details
details = run.get_details()
print(details)

# Get logged metrics
metrics = run.get_metrics()
print(json.dumps(metrics, indent=2))

# Get output files
files = run.get_file_names()
print(json.dumps(files, indent=2))
```
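
Because `get_metrics()` returns a dictionary, logged metrics can also be compared across several runs. The sketch below is one possible way to do that; the helper name `summarize_runs` is an assumption, and it presumes the chosen metric was logged as a single numeric value in each run.

```python
def summarize_runs(runs, metric_name):
    """Return (run id, metric value) pairs sorted from highest to lowest value.

    `summarize_runs` is a hypothetical helper; `runs` is any iterable of
    azureml.core.Run objects.
    """
    rows = []
    for run in runs:
        metrics = run.get_metrics()
        # Skip runs that never logged this metric
        if metric_name in metrics:
            rows.append((run.id, metrics[metric_name]))
    # Assumes scalar numeric values; a metric logged several times comes back as a list
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

For example, `summarize_runs(experiment.get_runs(), 'observations')` would list the runs of an experiment ordered by their logged row count.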

In Jupyter Notebooks, you can use the **RunDetails** widget to get a better visualization of the run details.

```python
from azureml.widgets import RunDetails

RunDetails(run).show()
```

Note that the **RunDetails** widget includes a link to view the run in Azure Machine Learning studio. Click this to open a new browser tab with the run details (you can also just open [Azure Machine Learning studio](https://ml.azure.com) and find the run on the **Experiments** page). When viewing the run in Azure Machine Learning studio, note the following:

- The **Properties** tab contains the general properties of the experiment run.
- The **Metrics** tab enables you to select logged metrics and view them as tables or charts.
- The **Images** tab enables you to select and view any images or plots that were logged in the experiment (in this case, the *Label Distribution* plot).
- The **Child Runs** tab lists any child runs (in this experiment there are none).
- The **Outputs** tab shows the output files generated by the experiment.
- The **Logs** tab shows any logs that were generated by the compute context for the experiment (in this case, the experiment was run inline so there are no logs).
- The **Snapshots** tab contains all files in the folder where the experiment code was run (in this case, everything in the same folder as this notebook).
- The **Raw JSON** tab shows a JSON representation of the experiment details.
- The **Explanations** tab is used to show model explanations generated by the experiment (in this case, there are none).

# 2.0 Azure Estimators

You can use a ***Run Configuration*** and a ***Script Run Configuration*** to run a script-based experiment that trains a machine learning model. However, depending on the machine learning framework being used and the dependencies it requires, the run configuration may become complex. 

Azure Machine Learning provides a higher-level abstraction called an ***Estimator*** that encapsulates a run configuration and a script configuration in a single object, and for which there are pre-defined, framework-specific variants that already include the package dependencies for common machine learning frameworks such as Scikit-Learn, PyTorch, and TensorFlow.
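
As an illustration of a framework-specific variant, the sketch below builds a Scikit-Learn estimator with the `SKLearn` class from `azureml.train.sklearn`. The function name `build_sklearn_estimator` and its parameters are assumptions made for this example, and the import sits inside the function only so that the sketch can be loaded without the SDK installed.

```python
def build_sklearn_estimator(source_directory, entry_script, compute_target="local"):
    """Return a Scikit-Learn estimator for a training script.

    `build_sklearn_estimator` is a hypothetical helper for this sketch.
    """
    # Imported here so the module loads even when azureml is not installed
    from azureml.train.sklearn import SKLearn

    # The framework-specific estimator already includes scikit-learn,
    # so no explicit package list is passed
    return SKLearn(
        source_directory=source_directory,
        entry_script=entry_script,
        compute_target=compute_target,
    )
```

The returned object could be submitted in the same way as a script run configuration, for example `experiment.submit(config=build_sklearn_estimator('experiment_files', 'training.py'))`, where the folder and script names are placeholders.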

## 2.1 Writing a script for an estimator

When using an experiment to train a model, your script should save the trained model in the outputs folder. For example, the following code shows how a model trained using Scikit-Learn can be saved in the outputs folder using the joblib package:


```python
import os

import joblib
import numpy as np
import pandas as pd
from azureml.core import Run
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Get the experiment run context
run = Run.get_context()

# Prepare the dataset
data = pd.read_csv('data.csv')
X, y = data[['Feature1', 'Feature2', 'Feature3']].values, data['Label'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)

# Train a logistic regression model (C is the inverse of the regularization rate)
reg = 0.1
model = LogisticRegression(C=1/reg, solver="liblinear").fit(X_train, y_train)

# Calculate accuracy on the held-out test set
y_hat = model.predict(X_test)
acc = np.average(y_hat == y_test)
run.log('Accuracy', float(acc))

# Save the trained model in the outputs folder
os.makedirs('outputs', exist_ok=True)
joblib.dump(value=model, filename='outputs/model.pkl')

run.complete()
```

## 2.2 Using an Estimator

You can use a generic Estimator class to define a run configuration for a training script like this:

```python
from azureml.train.estimator import Estimator
from azureml.core import Experiment

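# Arguments passed to the training script
# (ds_data, project_folder, and compute_target are assumed to be defined earlier)
script_params = {
   '--num_epochs': 20,
   '--data_dir': ds_data.as_mount(),
   '--output_dir': './outputs'
}

# Create an estimator
estimator = Estimator(source_directory=project_folder,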
                     compute_target=compute_target,
                     entry_script='cntk_distr_mnist.py',
                     script_params=script_params,
                     node_count=2,
                     process_count_per_node=1,
                     distributed_backend='mpi',
                     pip_packages=['cntk-gpu==2.6'],
                     custom_docker_image='microsoft/mmlspark:gpu-0.12',
                     use_gpu=True)
                     
# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
```
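
The entries in `script_params` are passed to the entry script as command-line arguments, so the script has to parse them. The sketch below shows how a hypothetical entry script might read the three arguments used above with `argparse`; it is not the actual `cntk_distr_mnist.py`, and the function name `parse_args` and the default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments supplied through script_params.

    `parse_args` and the defaults below are assumptions for this sketch.
    """
    parser = argparse.ArgumentParser(description="Training script arguments")
    parser.add_argument("--num_epochs", type=int, default=10,
                        help="number of training epochs")
    parser.add_argument("--data_dir", type=str, required=True,
                        help="folder containing the training data")
    parser.add_argument("--output_dir", type=str, default="./outputs",
                        help="folder where output files are written")
    # argv=None makes argparse read from sys.argv, as it would in a submitted run
    return parser.parse_args(argv)
```

Inside the script, `args = parse_args()` would then expose the values as `args.num_epochs`, `args.data_dir`, and `args.output_dir`.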

>NOTE: See this [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training-with-deep-learning/distributed-cntk-with-custom-docker/distributed-cntk-with-custom-docker.ipynb) for a more comprehensive example of using estimators with the Microsoft Cognitive Toolkit (CNTK) and a custom Docker image.