**<center><h1>Introduction</h1></center>**

Automated machine learning enables you to try multiple algorithms and preprocessing transformations with your data. Combined with scalable cloud-based compute, this makes it possible to find the best performing model for your data without the time-consuming manual trial and error that would otherwise be required.

<img src = "images/08-02-automl.jpeg" />

Azure Machine Learning supports automated machine learning through a visual interface in Azure Machine Learning studio. You can also use the Azure Machine Learning SDK to run automated machine learning experiments.

**<h2>Learning objectives</h2>**

In this module, you will learn how to:

- Use Azure Machine Learning's automated machine learning capabilities to determine the best performing algorithm for your data.
- Use automated machine learning to preprocess data for training.
- Run an automated machine learning experiment.

<hr>

**<center><h1>Automated machine learning tasks and algorithms</h1></center>**

You can use automated machine learning in Azure Machine Learning to train models for the following types of machine learning tasks:

- Classification
- Regression
- Time Series Forecasting



**<h2>Task-specific algorithms</h2>**

Azure Machine Learning includes support for numerous commonly used algorithms for these tasks, including:

**<h3>Classification algorithms</h3>**

- Logistic Regression
- Light Gradient Boosting Machine (GBM)
- Decision Tree
- Random Forest
- Naive Bayes
- Linear Support Vector Machine (SVM)
- XGBoost
- Deep Neural Network (DNN) Classifier
- Others...


**<h3>Regression algorithms</h3>**

- Linear Regression
- Light Gradient Boosting Machine (GBM)
- Decision Tree
- Random Forest
- Elastic Net
- LARS Lasso
- XGBoost
- Others...

**<h3>Forecasting algorithms</h3>**

- Linear Regression
- Light Gradient Boosting Machine (GBM)
- Decision Tree
- Random Forest
- Elastic Net
- LARS Lasso
- XGBoost
- Others...

**<h2>Restrict algorithm selection</h2>**

By default, automated machine learning randomly selects from the full range of algorithms for the specified task. You can block individual algorithms from being selected, which can be useful if you know that your data is not suited to a particular type of algorithm, or if you have to comply with a policy that restricts the types of machine learning algorithms you can use in your organization.
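As a rough illustration of blocking algorithms, the sketch below builds a settings dictionary that could be unpacked into `AutoMLConfig`. It assumes the SDK's `blocked_models` parameter. The candidate list and the model names are illustrative assumptions, not an authoritative list, so check the SDK reference for the names your version accepts.

```python
# Hypothetical candidate algorithm names for a classification task.
# The exact names accepted by the SDK are an assumption here.
candidate_models = [
    "LogisticRegression",
    "LightGBM",
    "DecisionTree",
    "RandomForest",
    "LinearSVM",
    "XGBoostClassifier",
]

# Algorithms we do not want automated machine learning to try.
blocked_models = ["LinearSVM", "XGBoostClassifier"]

# What remains available for selection once the blocked ones are excluded.
remaining_models = [m for m in candidate_models if m not in blocked_models]

# Settings that could be passed as AutoMLConfig(**automl_settings, ...),
# assuming the SDK's `blocked_models` parameter.
automl_settings = {
    "task": "classification",
    "primary_metric": "AUC_weighted",
    "blocked_models": blocked_models,
}

print(remaining_models)
```

Keeping the blocked list in one place makes it easy to review against an organizational policy before an experiment is submitted.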

<hr>

**<center><h1>Preprocessing and featurization</h1></center>**

As well as trying a selection of algorithms, automated machine learning can apply preprocessing transformations to your data to improve the performance of the model.




**<h2>Scaling and normalization</h2>**

Automated machine learning applies scaling and normalization to numeric data automatically, helping prevent any large-scale features from dominating training. During an automated machine learning experiment, multiple scaling or normalization techniques will be applied.
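To see why scaling matters, the following sketch applies two common techniques, min-max scaling and standardization, to two hypothetical feature columns. The specific techniques the service applies are not listed here, so treat this as a generic illustration using only the Python standard library.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature columns on very different scales.
income = [20000.0, 50000.0, 80000.0]
age = [20.0, 35.0, 50.0]

scaled_income = min_max_scale(income)
scaled_age = min_max_scale(age)
standardized_age = standardize(age)

print(scaled_income, scaled_age)
```

After scaling, both columns span the same range, so the much larger raw income values no longer outweigh age during training.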

**<h2>Optional featurization</h2>**

You can choose to have automated machine learning apply preprocessing transformations, such as:

- Missing value imputation to eliminate nulls in the training dataset.

- Categorical encoding to convert categorical features to numeric indicators.

- Dropping high-cardinality features, such as record IDs.

- Feature engineering (for example, deriving individual date parts from DateTime features).

- Others...
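The transformations listed above can be sketched by hand on a few hypothetical records. This is not how the service implements featurization; the column names, values, and the choice of mean imputation are assumptions made only to show what each step produces.

```python
from datetime import datetime
from statistics import mean

# Hypothetical raw records: a null value, a categorical column, and a timestamp.
rows = [
    {"weight": 3.5, "island": "Biscoe", "observed": "2021-03-15T08:30:00"},
    {"weight": None, "island": "Dream", "observed": "2021-07-01T17:45:00"},
    {"weight": 4.5, "island": "Biscoe", "observed": "2021-12-24T12:00:00"},
]

# Missing value imputation: replace nulls with the mean of the observed values.
weight_mean = mean(r["weight"] for r in rows if r["weight"] is not None)

# Categorical encoding: one numeric indicator column per distinct category.
categories = sorted({r["island"] for r in rows})

featurized = []
for r in rows:
    observed = datetime.fromisoformat(r["observed"])
    features = {
        "weight": r["weight"] if r["weight"] is not None else weight_mean,
        # Feature engineering: derive individual date parts.
        "observed_year": observed.year,
        "observed_month": observed.month,
        "observed_day": observed.day,
        "observed_hour": observed.hour,
    }
    for c in categories:
        features[f"island_{c}"] = 1 if r["island"] == c else 0
    featurized.append(features)

for f in featurized:
    print(f)
```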


<hr>

**<center><h1>Running automated machine learning experiments</h1></center>**

To run an automated machine learning experiment, you can either use the user interface in Azure Machine Learning studio, or submit an experiment using the SDK.



**<h2>Configure an automated machine learning experiment</h2>**

The user interface provides an intuitive way to select options for your automated machine learning experiment. When using the SDK, you have greater flexibility, and you can set experiment options using the **AutoMLConfig** class, as shown in the following example.

```
from azureml.core.runconfig import RunConfiguration
from azureml.train.automl import AutoMLConfig

# Assumes aml_compute (a compute target) and train_dataset / test_dataset
# (datasets containing the features and the label) are already defined.
automl_run_config = RunConfiguration(framework='python')
automl_config = AutoMLConfig(name='Automated ML Experiment',
                             task='classification',
                             primary_metric='AUC_weighted',
                             compute_target=aml_compute,
                             training_data=train_dataset,
                             validation_data=test_dataset,
                             label_column_name='Label',
                             featurization='auto',
                             iterations=12,
                             max_concurrent_iterations=4)
```

**<h3>Specify data for training</h3>**

Automated machine learning is designed to enable you to simply bring your data, and have Azure Machine Learning figure out how best to train a model from it.

When using the Automated Machine Learning user interface in Azure Machine Learning studio, you can create or select an Azure Machine Learning [dataset](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets) to be used as the input for your automated machine learning experiment.

When using the SDK to run an automated machine learning experiment, you can submit the data in the following ways:

- Specify a dataset or dataframe of training data that includes features and the label to be predicted.
- Optionally, specify a second dataset or dataframe of validation data that will be used to validate the trained model. If this is not provided, Azure Machine Learning will apply cross-validation using the training data.

Alternatively:

- Specify a dataset, dataframe, or numpy array of X values containing the training features, with a corresponding y array of label values.
- Optionally, specify X_valid and y_valid datasets, dataframes, or numpy arrays containing the features and labels to be used for validation.
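When no validation data is supplied, cross-validation reuses the training data by holding out a different slice of rows for validation in each round. The exact splitting strategy the service uses is not described here; the sketch below shows a generic, unshuffled k-fold split, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_rows, n_folds):
    """Yield (train_indices, validation_indices) for each fold."""
    indices = list(range(n_rows))
    # Distribute rows as evenly as possible; earlier folds absorb the remainder.
    fold_sizes = [
        n_rows // n_folds + (1 if i < n_rows % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_rows=10, n_folds=3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every row is used for validation exactly once and for training in all the other folds, which is why a separate validation dataset is optional.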

**<h3>Specify the primary metric</h3>**

One of the most important settings you must specify is the **primary_metric**. This is the target performance metric for which the optimal model will be determined. Azure Machine Learning supports a set of named metrics for each type of task. To retrieve the list of metrics available for a particular task type, you can use the **get_primary_metrics** function as shown here:

```
from azureml.train.automl.utilities import get_primary_metrics

get_primary_metrics('classification')
```

More Information: You can find a full list of primary metrics and their definitions in [Understand automated machine learning results](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml).
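To make the role of the primary metric concrete, the sketch below picks a winner from a set of hypothetical iteration results. The metric values, algorithm names, and the `best_iteration` helper are invented for illustration and do not come from a real experiment or from the SDK.

```python
# Hypothetical metrics recorded for three iterations of an experiment.
iterations = [
    {"algorithm": "LogisticRegression", "AUC_weighted": 0.91, "accuracy": 0.88},
    {"algorithm": "RandomForest", "AUC_weighted": 0.95, "accuracy": 0.87},
    {"algorithm": "DecisionTree", "AUC_weighted": 0.89, "accuracy": 0.90},
]


def best_iteration(results, primary_metric, higher_is_better=True):
    """Return the result that optimizes the chosen primary metric."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda r: r[primary_metric])


best_by_auc = best_iteration(iterations, "AUC_weighted")
best_by_accuracy = best_iteration(iterations, "accuracy")

print(best_by_auc["algorithm"], best_by_accuracy["algorithm"])
```

The same set of results yields a different best model depending on the metric, which is why the choice of primary metric matters.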

**<h2>Submit an automated machine learning experiment</h2>**

You can submit an automated machine learning experiment like any other SDK-based experiment.

```
from azureml.core.experiment import Experiment

# Assumes ws is an existing Workspace object and automl_config was created earlier.
automl_experiment = Experiment(ws, 'automl_experiment')
automl_run = automl_experiment.submit(automl_config)
```

You can monitor automated machine learning experiment runs in Azure Machine Learning studio, or by using the **RunDetails** widget in a Jupyter notebook.



**<h2>Retrieve the best run and its model</h2>**

You can easily identify the best run in Azure Machine Learning studio, and download or deploy the model it generated. To accomplish this programmatically with the SDK, you can use code like the following example:

```
best_run, fitted_model = automl_run.get_output()
best_run_metrics = best_run.get_metrics()
for metric_name in best_run_metrics:
    metric = best_run_metrics[metric_name]
    print(metric_name, metric)
```

**<h2>Explore preprocessing steps</h2>**

Automated machine learning uses scikit-learn pipelines to encapsulate preprocessing steps with the model. You can view the steps in the fitted model that you obtained from the best run with the following code:


```
for step_ in fitted_model.named_steps:
    print(step_)
```

<hr>

**<center><h1>Using automated machine learning</h1></center>**

Now it's your chance to use Azure Machine Learning's automated machine learning capabilities to train a machine learning model.

In this exercise, you'll:

- Run an automated machine learning experiment.
- Review the best performing model.

**<h2>Instructions</h2>**

Follow these instructions to complete the exercise.

1. If you do not already have an Azure subscription, sign up for a free trial at https://azure.microsoft.com.
2. View the exercise repo at https://aka.ms/mslearn-dp100.
3. If you have not already done so, complete the Create an Azure Machine Learning workspace exercise to provision an **Azure Machine Learning workspace**, create a compute instance, and clone the required files.
4. Complete the Use automated machine learning from the SDK exercise.

<mark>**Note:** There is also an exercise named Use automated machine learning, in which you can use the visual interface for automated machine learning. You can also complete this if you want to explore the "no-code" approach.</mark>

<hr>

**<center><h1>Knowledge check</h1></center>**

1. You are using automated machine learning to train a model that predicts the species of a penguin based on its bill and flipper measurements. Which kind of task should you specify for automated machine learning?

- Regression

- Forecasting

- Classification

2. You want to use automated machine learning to find the model with the best AUC_weighted metric. Which parameter of the AutoMLConfig object should you set?

- task='AUC_weighted'

- label_column_name='AUC_weighted'

- primary_metric='AUC_weighted'



<hr>

**<center><h1>Summary</h1></center>**

In this module, you learned how to:

- Use Azure Machine Learning's automated machine learning capabilities to determine the best performing algorithm for your data.
- Use automated machine learning to preprocess data for training.
- Run an automated machine learning experiment.

<hr>