# Automated ML

In [None]:
import joblib

from azureml.core import Dataset, Workspace, Experiment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.automl import AutoMLConfig
from azureml.core.model import Model
from azureml.widgets import RunDetails

In [None]:
ws = Workspace.from_config()

# Choose a name for the experiment
experiment_name = 'sleep-health-project'

exp = Experiment(ws, experiment_name)

In [None]:
cluster_name = "sleep-health-compute"

# Reuse the cluster if it already exists; otherwise create it
try:
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print("Found existing cluster, use it.")
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2", max_nodes=4)
    cluster = ComputeTarget.create(ws, cluster_name, compute_config)

cluster.wait_for_completion(show_output=True)

## Dataset

### Overview
The Sleep Health and Lifestyle Dataset from Kaggle is used for a classification task. The data covers a wide range of variables related to sleep and daily habits. The goal is to determine whether a person has a particular sleep disorder or none.

In [None]:
dataset = Dataset.get_by_name(ws, name='Sleep-Health-Dataset')

## AutoML Configuration

The following AutoML configuration is used to find the best model:
- compute_target: The compute resource on which the AutoML experiment is executed. The cluster created or retrieved above is used.
- experiment_timeout_minutes: The maximum time, in minutes, that the AutoML experiment is allowed to run. Here the experiment is limited to 30 minutes.
- task: A classification task is specified because the model is trained to predict a categorical variable.
- primary_metric: The evaluation metric that AutoML optimizes and uses to compare models. Accuracy is used here; it is a reasonable choice for classification when the classes are roughly balanced.
- training_data: The dataset with all relevant data, including the target variable.
- label_column_name: The column name of the target variable.
- n_cross_validations: Cross-validation assesses how well the model generalizes by splitting the training data into multiple subsets for training and validation. 5-fold cross-validation is used.
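
To make the primary_metric and n_cross_validations settings more concrete, the following is a minimal sketch in plain Python of how 5-fold splitting partitions row indices and how accuracy is averaged across folds. It is an illustration only, not AutoML's internal implementation: the function names are hypothetical, the toy labels are made up, and real tools may shuffle or stratify rows before splitting.

```python
from collections import Counter


def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds over range(n_rows)."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    # Distribute the remainder so fold sizes differ by at most one row.
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        valid_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_rows))
        yield train_idx, valid_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_accuracy(labels, k=5):
    """Mean validation accuracy of a majority-class baseline across k folds."""
    scores = []
    for train_idx, valid_idx in k_fold_indices(len(labels), k):
        # The "model" simply predicts the most frequent label in the training fold.
        majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
        y_true = [labels[i] for i in valid_idx]
        scores.append(accuracy(y_true, [majority] * len(y_true)))
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only (not taken from the dataset).
toy_labels = ["None"] * 6 + ["Insomnia"] * 2 + ["Sleep Apnea"] * 2
print(cross_validated_accuracy(toy_labels, k=5))
```

A baseline that always predicts the majority class can still reach a fairly high accuracy on imbalanced labels, which is why the class distribution is worth checking before relying on accuracy alone.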

In [None]:
# Set parameters for AutoMLConfig
automl_config = AutoMLConfig(
    compute_target=cluster,
    experiment_timeout_minutes=30,
    task="classification",
    primary_metric="accuracy",
    training_data=dataset,
    label_column_name="Sleep Disorder",
    n_cross_validations=5
)

In [None]:
# Submit the AutoML run
remote_run = exp.submit(automl_config, show_output=True)

## Run Details

In [None]:
RunDetails(remote_run).show()
remote_run.wait_for_completion()

## Best Model

In [None]:
best_run, model = remote_run.get_output()

In [None]:
best_run

In [None]:
model_path = "best_model_automl.pkl"
joblib.dump(model, model_path)

In [None]:
model = Model.register(workspace=ws, model_path=model_path, model_name="best_automl_model")

## Model Deployment

Remember that only one of the two trained models has to be deployed, but both models still need to be registered. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

TODO: In the cell below, send a request to the web service you deployed to test it.
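
As a starting point for the request step, the following is a minimal sketch that uses only the Python standard library. It rests on several assumptions that are not confirmed by this notebook: that the scoring script accepts a JSON body with a top-level `data` list of records, that key-based authentication (if enabled) uses a bearer token, and that the helper names and the feature names in the usage example are hypothetical placeholders. Check the deployed service's scoring script for the input format it actually expects.

```python
import json
import urllib.request


def build_scoring_payload(records):
    """Serialize a list of feature dictionaries into a JSON request body.

    Assumption: the scoring script expects {"data": [...]}; adjust the
    top-level key if the deployed entry script uses a different format.
    """
    if not records:
        raise ValueError("at least one record is required")
    return json.dumps({"data": list(records)})


def send_scoring_request(scoring_uri, payload, api_key=None, timeout=30):
    """POST the payload to the scoring URI and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: key-based auth is enabled and expects a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `send_scoring_request(service.scoring_uri, build_scoring_payload([{"Age": 35, "Sleep Duration": 6.5}]))`, where `service` is the deployed web service object and the record contains every feature column the model was trained on; the two columns shown here are placeholders only.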

TODO: In the cell below, print the logs of the web service and delete the service.

In [None]:
cluster.delete()

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a web service.
- I have tested the web service by sending a request to the model endpoint.
- I have deleted the web service and shut down all the compute resources that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.
