**<center><h1>Introduction</h1></center>**

Azure Machine Learning is a platform for operating machine learning workloads in the cloud.

<img src="images/01-01-what-is-azure-ml.jpeg" />

Built on the Microsoft Azure cloud platform, Azure Machine Learning enables you to manage:

- Scalable on-demand compute for machine learning workloads.
- Data storage and connectivity to ingest data from a wide range of sources.
- Machine learning workflow orchestration to automate model training, deployment, and management processes.
- Model registration and management, so you can track multiple versions of models and the data on which they were trained.
- Metrics and monitoring for training experiments, datasets, and published services.
- Model deployment for real-time and batch inferences.

**<h2>Learning objectives</h2>**

In this module, you will learn how to:

- Provision an Azure Machine Learning workspace.
- Use tools and interfaces to work with Azure Machine Learning.
- Run code-based experiments in an Azure Machine Learning workspace.

<hr>

**<center><h1>Azure Machine Learning workspaces</h1></center>**

A workspace is a context for the experiments, data, compute targets, and other assets associated with a machine learning workload.

**<h2>Workspaces for Machine Learning Assets</h2>**

A workspace defines the boundary for a set of related machine learning assets. You can use workspaces to group machine learning assets based on projects, deployment environments (for example, test and production), teams, or some other organizing principle. The assets in a workspace include:

- Compute targets for development, training, and deployment.
- Data for experimentation and model training.
- Notebooks containing shared code and documentation.
- Experiments, including run history with logged metrics and outputs.
- Pipelines that define orchestrated multi-step processes.
- Models that you have trained.


**<h2>Workspaces as Azure Resources</h2>**

Workspaces are Azure resources, and as such they are defined within a resource group in an Azure subscription, along with other related Azure resources that are required to support the workspace.

<img src="images/01-02-workspace.png" />

The Azure resources created alongside a workspace include:

- A storage account, used to store files used by the workspace as well as data for experiments and model training.
- An Application Insights instance, used to monitor predictive services in the workspace.
- An Azure Key Vault instance, used to manage secrets such as authentication keys and credentials used by the workspace.
- A container registry, created as-needed to manage containers for deployed models.


**<h2>Role-Based Access Control</h2>**

You can assign role-based authorization policies to a workspace, enabling you to manage permissions that restrict what actions specific Azure Active Directory (AAD) principals can perform. For example, you could create a policy that allows only users in the **IT Operations** group to create compute targets and datastores, while allowing users in the **Data Scientists** group to create and run experiments and register models.

**<h2>Creating a Workspace</h2>**

You can create a workspace in any of the following ways:

- In the Microsoft Azure portal, create a new **Machine Learning** resource, specifying the subscription, resource group and workspace name.
- Use the Azure Machine Learning Python SDK to run code that creates a workspace. For example, the following code creates a workspace named aml-workspace (assuming the Azure ML SDK for Python is installed and a valid subscription ID is specified):

```
from azureml.core import Workspace
    
ws = Workspace.create(name='aml-workspace', 
                      subscription_id='123456-abc-123...',
                      resource_group='aml-resources',
                      create_resource_group=True,
                      location='eastus'
                     )
```

- Use the Azure Command Line Interface (CLI) with the Azure Machine Learning CLI extension. For example, you could use the following command (which assumes a resource group named aml-resources has already been created):

```
# Bash Command
az ml workspace create -w 'aml-workspace' -g 'aml-resources'
```

- Create an Azure Resource Manager template. For more information about the template format for an Azure Machine Learning workspace, see the Azure Machine Learning documentation.

<hr>

**<center><h1>Exercise - Create a workspace</h1></center>**

Now it's your chance to get started with Azure Machine Learning for yourself by creating a workspace.

In this exercise, you will:

- Provision an Azure Machine Learning workspace.
- Create a compute instance.
- Run a notebook.

**<h2>Instructions</h2>**

Follow these instructions to complete the exercise.

1. If you do not already have an Azure subscription, sign up for a free trial at https://azure.microsoft.com.
2. View the exercise repo at https://aka.ms/mslearn-dp100.
3. Complete the Create an Azure Machine Learning workspace exercise.

<hr>

**<center><h1>Azure Machine Learning tools and interfaces</h1></center>**

Azure Machine Learning provides a cloud-based service that offers flexibility in how you use it. There are user interfaces specifically designed for Azure Machine Learning, or you can use a programmatic interface to manage workspace resources and run machine learning operations.

**<h2>Azure Machine Learning studio</h2>**

You can manage the assets in your Azure Machine Learning workspace in the Azure portal, but as this is a general interface for managing all kinds of resources in Azure, data scientists and other users involved in machine learning operations may prefer to use a more focused, dedicated interface.

<img src="images/01-03-aml-studio.jpeg"/>

Azure Machine Learning studio is a web-based tool for managing an Azure Machine Learning workspace. It enables you to create, manage, and view all of the assets in your workspace and provides the following graphical tools:

- Designer: A drag and drop interface for "no code" machine learning model development.
- Automated Machine Learning: A wizard interface that enables you to train a model using a combination of algorithms and data preprocessing techniques to find the best model for your data.

<mark>Note: A previously released tool named Azure Machine Learning Studio provided a free service for drag and drop machine learning model development. The studio interface for the Azure Machine Learning service includes this capability in the designer tool, as well as other workspace asset management capabilities.</mark>

To use Azure Machine Learning studio, use a web browser to navigate to https://ml.azure.com and sign in using credentials associated with your Azure subscription. You can then select the subscription and workspace you want to manage.

**<h2>The Azure Machine Learning SDK</h2>**

While graphical interfaces like Azure Machine Learning studio make it easy to create and manage machine learning assets, it is often advantageous to use a code-based approach to managing resources. By writing scripts to create and manage resources, you can:

- Run machine learning operations from your preferred development environment.
- Automate asset creation and configuration to make it repeatable.
- Ensure consistency for resources that must be replicated in multiple environments (for example, development, test, and production).
- Incorporate machine learning asset configuration into developer operations (DevOps) workflows, such as continuous integration / continuous deployment (CI/CD) pipelines.

Azure Machine Learning provides software development kits (SDKs) for Python and R, which you can use to create, manage, and use assets in an Azure Machine Learning workspace.

<mark>Note: This course focuses on the Python SDK because it has broader capabilities than the R SDK, which is in preview at the time of writing.</mark>

**<h2>Installing the Azure Machine Learning SDK for Python</h2>**

You can install the Azure Machine Learning SDK for Python by using the `pip` package management utility, as shown in the following code sample:

```
# Bash Command
pip install azureml-sdk
```

The SDK consists of the main **azureml-sdk** package as well as numerous ancillary packages that contain specialized functionality. For example, the **azureml-widgets** package provides support for interactive widgets in a Jupyter notebook environment. To install additional packages, include them in the `pip install` command:

```
# Bash Command
pip install azureml-sdk azureml-widgets
```

**<h2>Connecting to a Workspace</h2>**

After installing the SDK package in your Python environment, you can write code to connect to your workspace and perform machine learning operations. The easiest way to connect to a workspace is to use a workspace configuration file, which includes the Azure subscription, resource group, and workspace details as shown here:

```
{
    "subscription_id": "1234567-abcde-890-fgh...",
    "resource_group": "aml-resources",
    "workspace_name": "aml-workspace"
}
```

To connect to the workspace using the configuration file, you can use the **from_config** method of the **Workspace** class in the SDK, as shown here:

```
# Python Code
from azureml.core import Workspace

ws = Workspace.from_config()
```

By default, the **from_config** method looks for a file named **config.json** in the folder containing the Python code file, but you can specify another path if necessary.
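
As a minimal sketch of the file layout described above, the following standard-library code writes a configuration file with the three expected keys and reads it back. The helper name `write_workspace_config` and the placeholder values are assumptions for illustration; substitute your own subscription, resource group, and workspace details.

```python
# Python Code
import json
import tempfile
from pathlib import Path


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write a config.json file in the layout shown above and return its path."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(folder) / "config.json"
    path.write_text(json.dumps(config, indent=4))
    return path


# Use a temporary folder so that no existing config.json is overwritten.
config_folder = tempfile.mkdtemp()

# Placeholder values (assumptions); replace them with your own details.
config_path = write_workspace_config(
    config_folder,
    subscription_id="<your-subscription-id>",
    resource_group="aml-resources",
    workspace_name="aml-workspace",
)

# Read the file back to confirm that it contains the expected keys.
loaded = json.loads(config_path.read_text())
print(sorted(loaded))
```

With the SDK installed, passing the file's location to the **path** parameter of **from_config** should then connect using these details.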

As an alternative to using a configuration file, you can use the **get** method of the **Workspace** class with explicitly specified subscription, resource group, and workspace details. The configuration file technique is generally preferred because it is more flexible when multiple scripts connect to the same workspace, but the explicit approach looks like this:

```
# Python code
from azureml.core import Workspace

ws = Workspace.get(name='aml-workspace',
                   subscription_id='1234567-abcde-890-fgh...',
                   resource_group='aml-resources')
```

Whichever technique you use, if there is no current active session with your Azure subscription, you will be prompted to authenticate.

**<h2>Working with the Workspace Class</h2>**

The **Workspace** class is the starting point for most code operations. For example, you can use its **compute_targets** attribute to retrieve a dictionary object containing the compute targets defined in the workspace, like this:

```
# Python Code
for compute_name in ws.compute_targets:
    compute = ws.compute_targets[compute_name]
    print(compute.name, ":", compute.type)
```

The SDK contains a rich library of classes that you can use to create, manage, and use many kinds of assets in an Azure Machine Learning workspace.

**<h2>The Azure Machine Learning CLI Extension</h2>**

The Azure command-line interface (CLI) is a cross-platform command-line tool for managing Azure resources. The Azure Machine Learning CLI extension is an additional package that provides commands for working with Azure Machine Learning.

To install the Azure Machine Learning CLI extension, you must first install the Azure CLI. See the full [installation instructions for all supported platforms](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) for more details.

After installing the Azure CLI, you can add the Azure Machine Learning CLI extension by running the following command:
```
# Bash Commands
az extension add -n azure-cli-ml
```
To use the Azure Machine Learning CLI extension, run the `az ml` command with the appropriate parameters for the action you want to perform. For example, to list the compute targets in a workspace, run the following command:
```
# Bash Commands
az ml computetarget list -g 'aml-resources' -w 'aml-workspace'
```

<mark>Note: In the code sample above, the -g parameter specifies the name of the resource group in which the Azure Machine Learning workspace specified in the -w parameter is defined. These parameters are shortened aliases for --resource-group and --workspace-name.</mark>



**<h2>Compute Instances</h2>**
Azure Machine Learning includes the ability to create Compute Instances in a workspace to provide a development environment that is managed with all of the other assets in the workspace.

<img src="images/01-05-notebook-vm.jpeg" />

Compute Instances include Jupyter Notebook and JupyterLab installations that you can use to write and run code that uses the Azure Machine Learning SDK to work with assets in your workspace.

You can choose a compute instance image that provides the compute specification you need, from small CPU-only VMs to large GPU-enabled workstations. Because compute instances are hosted in Azure, you only pay for the compute resources while they are running, so you can create a compute instance to suit your needs and stop it when your workload has completed to minimize costs.

You can store notebooks independently in workspace storage, and open them in any compute instance.

**<h2>Visual Studio Code</h2>**

Visual Studio Code is a lightweight code editing environment for Microsoft Windows, Apple macOS, and Linux. It provides a visual interface for many kinds of code, including Microsoft C#, JavaScript, Python, and others, as well as IntelliSense and syntax formatting for common data formats such as JSON and XML.

Visual Studio Code's flexibility is based on the ability to install modular extensions that add syntax checking, debugging, and visual management interfaces for specific workloads. For example, the Microsoft Python extension for Visual Studio Code adds support for writing and running Python code in scripts or notebooks within the Visual Studio Code interface.

<img src="images/01-04-vscode.jpeg" />

The Azure Machine Learning Extension for Visual Studio Code provides a graphical interface for working with assets in an Azure Machine Learning workspace. You can combine the capabilities of the Azure Machine Learning and Python extensions to manage a complete end-to-end machine learning workload in Azure Machine Learning from the Visual Studio Code environment.

<hr>


**<center><h1>Azure Machine Learning experiments</h1></center>**

Like any scientific discipline, data science involves running experiments, typically to explore data or to build and evaluate predictive models. In Azure Machine Learning, an experiment is a named process, usually the running of a script or a pipeline, that can generate metrics and outputs and be tracked in the Azure Machine Learning workspace.

<img src="images/03-01-experiment.jpeg" />

An experiment can be run multiple times with different data, code, or settings. Azure Machine Learning tracks each run, enabling you to view the run history and compare results across runs.



**<h2>The Experiment Run Context</h2>**

When you submit an experiment, you use its run context to initialize and end the experiment run that is tracked in Azure Machine Learning, as shown in the following code sample:
```
# Python Code
from azureml.core import Experiment

# create an experiment variable
experiment = Experiment(workspace = ws, name = "my-experiment")

# start the experiment
run = experiment.start_logging()

# experiment code goes here

# end the experiment
run.complete()
```
After the experiment run has completed, you can view the details of the run in the Experiments tab in Azure Machine Learning studio.


**<h2>Logging Metrics and Creating Outputs</h2>**


Experiments are most useful when they produce metrics and outputs that can be tracked across runs.

**<h3>Logging Metrics</h3>**



Every experiment generates log files that include the messages that would be written to the terminal during interactive execution. This enables you to use simple print statements to write messages to the log. However, if you want to record named metrics for comparison across runs, you can do so by using the **Run** object, which provides a range of logging functions specifically for this purpose. These include:

- **log:** Record a single named value.
- **log_list:** Record a named list of values.
- **log_row:** Record a row with multiple columns.
- **log_table:** Record a dictionary as a table.
- **log_image:** Record an image file or a plot.

For example, the following code records the number of observations (records) in a CSV file:
```
# Python Code
from azureml.core import Experiment
import pandas as pd

# Create an Azure ML experiment in your workspace
experiment = Experiment(workspace = ws, name = 'my-experiment')

# Start logging data from the experiment
run = experiment.start_logging()

# load the dataset and count the rows
data = pd.read_csv('data.csv')
row_count = (len(data))

# Log the row count
run.log('observations', row_count)

# Complete the experiment
run.complete()
```
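
The other logging functions listed above follow the same pattern. The sketch below calls **log**, **log_list**, **log_row**, and **log_table** once each. So that it runs without an Azure connection, it uses a hypothetical `RecordingRun` stand-in that only records the calls; in a real workspace you would pass the **Run** object returned by **start_logging** instead. The metric names and values are arbitrary illustrations, not results from a real experiment.

```python
# Python Code
class RecordingRun:
    """Hypothetical stand-in for a Run object; it only records the calls made."""

    def __init__(self):
        self.calls = []

    def log(self, name, value):
        self.calls.append(("log", name, value))

    def log_list(self, name, value):
        self.calls.append(("log_list", name, value))

    def log_row(self, name, **kwargs):
        self.calls.append(("log_row", name, kwargs))

    def log_table(self, name, value):
        self.calls.append(("log_table", name, value))


def log_sample_metrics(run):
    """Log one example of each metric type (names and values are illustrative)."""
    # A single named value
    run.log("observations", 15000)
    # A named list of values
    run.log_list("label_values", [0, 1])
    # One row with multiple named columns
    run.log_row("age_range", min_age=21, max_age=77)
    # A dictionary recorded as a table (each key is a column)
    run.log_table("column_counts", {"column": ["Age", "BMI"], "count": [15000, 15000]})


# Replace RecordingRun() with experiment.start_logging() in a real workspace.
run = RecordingRun()
log_sample_metrics(run)
print(len(run.calls), "metrics logged")
```
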
**<h3>Retrieving and Viewing Logged Metrics</h3>**


You can view the metrics logged by an experiment run in Azure Machine Learning studio or by using the RunDetails widget in a notebook, as shown here:
```
# Python Code
from azureml.widgets import RunDetails

RunDetails(run).show()
```
You can also retrieve the metrics using the **Run** object's **get_metrics** method, which returns a dictionary of the metrics that can be serialized as JSON, as shown here:
```
# Python Code
import json

# Get logged metrics
metrics = run.get_metrics()
print(json.dumps(metrics, indent=2))
```
The previous code might produce output similar to this:
```
# JSON Code
{
  "observations": 15000
}
```
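
Because each run keeps its own metrics, you can also compare a metric across the runs of an experiment. The following sketch assumes an experiment object that exposes **get_runs**, and runs that expose **id** and **get_metrics**, as the SDK's **Experiment** and **Run** classes do. The helper name `metric_by_run` and the stand-in objects used to exercise it locally are hypothetical.

```python
# Python Code
from types import SimpleNamespace


def metric_by_run(experiment, metric_name):
    """Return a {run id: value} dictionary for every run that logged the metric."""
    results = {}
    for run in experiment.get_runs():
        metrics = run.get_metrics()
        if metric_name in metrics:
            results[run.id] = metrics[metric_name]
    return results


# Hypothetical stand-ins so the sketch runs without a workspace;
# in practice, pass an Experiment object from azureml.core.
fake_runs = [
    SimpleNamespace(id="run-1", get_metrics=lambda: {"observations": 15000}),
    SimpleNamespace(id="run-2", get_metrics=lambda: {}),
]
fake_experiment = SimpleNamespace(get_runs=lambda: iter(fake_runs))

print(metric_by_run(fake_experiment, "observations"))
```

Runs that did not log the requested metric are skipped rather than reported with a missing value.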


**<h3>Experiment Output Files</h3>**

In addition to logging metrics, an experiment can generate output files. Often these are trained machine learning models, but you can save any sort of file and make it available as an output of your experiment run. The output files of an experiment are saved in its **outputs** folder.

The technique you use to add files to the outputs of an experiment depends on how you're running the experiment. The examples shown so far control the experiment lifecycle inline in your code, and when taking this approach you can upload local files to the run's **outputs** folder by using the **Run** object's **upload_file** method in your experiment code as shown here:
```
# Python Code
run.upload_file(name='outputs/sample.csv', path_or_stream='./sample.csv')
```
When running an experiment in a remote compute context (which we'll discuss later in this course), any files written to the **outputs** folder in the compute context are automatically uploaded to the run's **outputs** folder when the run completes.

Whichever approach you use to run your experiment, you can retrieve a list of output files from the **Run** object like this:
```
# Python Code
import json

files = run.get_file_names()
print(json.dumps(files, indent=2))
```
The previous code would produce output similar to this:
```
# JSON Code
[
  "outputs/sample.csv"
]
```



**<h3>Running a Script as an Experiment</h3>**

You can run an experiment inline using the **start_logging** method of the **Experiment** object, but it's more common to encapsulate the experiment logic in a script and run the script as an experiment. The script can be run in any valid compute context, making this a more flexible solution for running experiments at scale.

An experiment script is just a Python code file that contains the code you want to run in the experiment. To access the experiment run context (which is needed to log metrics) the script must import the **azureml.core.Run** class and call its **get_context** method. The script can then use the run context to log metrics, upload files, and complete the experiment, as shown in the following example:
```
# Python Code
from azureml.core import Run
import pandas as pd
import matplotlib.pyplot as plt
import os

# Get the experiment run context
run = Run.get_context()

# load the diabetes dataset
data = pd.read_csv('data.csv')

# Count the rows and log the result
row_count = (len(data))
run.log('observations', row_count)

# Save a sample of the data
os.makedirs('outputs', exist_ok=True)
data.sample(100).to_csv("outputs/sample.csv", index=False, header=True)

# Complete the run
run.complete()
```

To run a script as an experiment, you must define a script configuration that defines the script to be run and the Python environment in which to run it. This is implemented by using a **ScriptRunConfig** object.

For example, the following code could be used to run an experiment based on a script in the **experiment_files** folder (which must also contain any files used by the script, such as the data.csv file in the previous script code example):
```
# Python Code
from azureml.core import Experiment, ScriptRunConfig

# Create a script config
script_config = ScriptRunConfig(source_directory='experiment_files',
                                script='experiment.py') 

# submit the experiment
experiment = Experiment(workspace = ws, name = 'my-experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)
```

<mark> Note: An implicitly created RunConfiguration object defines the Python environment for the experiment, including the packages available to the script. If your script depends on packages that are not included in the default environment, you must associate the ScriptRunConfig with an Environment object that makes use of a CondaDependencies object to specify the Python packages required. Runtime environments are discussed in more detail later in this course.</mark>
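
As a rough sketch of the approach described in the note, the following function builds a **ScriptRunConfig** that is associated with an **Environment** whose packages are specified through **CondaDependencies**. The function name `build_script_config` and the environment name `experiment-env` are hypothetical, and the sketch assumes a version of the SDK in which **ScriptRunConfig** accepts an **environment** argument.

```python
# Python Code
def build_script_config(source_directory, script, conda_packages, pip_packages):
    """Return a ScriptRunConfig whose environment includes the given packages."""
    # Imported inside the function so that the sketch can be defined
    # even where the Azure ML SDK is not installed.
    from azureml.core import Environment, ScriptRunConfig
    from azureml.core.conda_dependencies import CondaDependencies

    # "experiment-env" is an arbitrary, hypothetical environment name.
    env = Environment("experiment-env")

    # Specify the Python packages that the script depends on.
    env.python.conda_dependencies = CondaDependencies.create(
        conda_packages=conda_packages,
        pip_packages=pip_packages,
    )

    # Associate the environment with the script configuration.
    return ScriptRunConfig(
        source_directory=source_directory,
        script=script,
        environment=env,
    )
```

For the script shown earlier, you might call `build_script_config('experiment_files', 'experiment.py', ['pandas'], ['azureml-defaults'])` and pass the result to the **submit** method of the **Experiment** object; the package lists here are assumptions based on the imports in that script.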



<hr>




**<center><h1>Exercise - Run experiments</h1></center>**


Now it's your chance to try out Azure Machine Learning for yourself.

In this exercise, you will:

- Run an Azure Machine Learning experiment.
- Run a script as an experiment.
- Use MLflow to track experiment metrics.

**<h2>Instructions</h2>**

Follow these instructions to complete the exercise.

1. If you do not already have an Azure subscription, sign up for a free trial at https://azure.microsoft.com.
2. View the exercise repo at https://aka.ms/mslearn-dp100.
3. If you have not already done so, complete the **Create an Azure Machine Learning workspace** exercise to provision an Azure Machine Learning workspace, create a compute instance, and clone the required files.
4. Complete the **Run experiments** exercise.

<hr>
