# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [None]:
from azureml.core import Workspace, Experiment, Dataset
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.automl.utilities import get_primary_metrics
from azureml.train.automl import AutoMLConfig
from azureml.pipeline.steps import AutoMLStep
from azureml.widgets import RunDetails
import pickle
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice.aci import AciWebservice
from azureml.core.webservice import Webservice
import os

## Dataset

### Overview

Task: predict house prices in King County (a regression problem).

Data: the dataset is taken from https://www.kaggle.com/harlfoxem/housesalesprediction

In [None]:
ws = Workspace.from_config()

# choose a name for experiment
experiment_name = 'exp1'

experiment = Experiment(ws, experiment_name)

In [None]:
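try:
    # Reuse the cluster if it already exists in the workspace.
    cluster = ComputeTarget(workspace=ws, name="compute11")
    print("Cluster exists")
except ComputeTargetException:
    # Otherwise provision a new cluster with up to 4 nodes.
    config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)
    cluster = ComputeTarget.create(ws, "compute11", config)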

cluster.wait_for_completion()

In [None]:
train_data = Dataset.get_by_name(ws, name="housedata")

train_data.take(5).to_pandas_dataframe()

## AutoML Configuration

The task is set to regression, matching the problem statement. The primary metric is a mean-absolute-error metric (lower is better); accuracy does not apply here because it is a classification metric. Three-fold cross-validation gives a more reliable estimate of how the model generalizes and helps detect overfitting. `max_concurrent_iterations` is set to 4; this value must be less than or equal to the maximum number of nodes in the compute cluster (`max_nodes=4` when the cluster was created).
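To make the metric concrete, here is a minimal sketch of mean absolute error and a range-normalized variant, written in plain Python. The helper names and the sample prices are made up for illustration, and the normalization (dividing by the range of the actual values) is an assumption about how the normalized metric is defined, not code taken from the AutoML SDK.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def normalized_mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MAE divided by the range of the actual values (assumed normalization)."""
    value_range = max(y_true) - min(y_true)
    if value_range == 0:
        raise ValueError("range of y_true is zero")
    return mean_absolute_error(y_true, y_pred) / value_range


# Made-up house prices, for illustration only.
actual = [200_000.0, 350_000.0, 500_000.0]
predicted = [210_000.0, 340_000.0, 530_000.0]

mae = mean_absolute_error(actual, predicted)
nmae = normalized_mae(actual, predicted)
print(f"MAE: {mae:.2f}")
print(f"Normalized MAE: {nmae:.4f}")
```

A normalized value is easier to compare across datasets whose target values have very different scales.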

In [None]:
automl_settings = {
    "experiment_timeout_minutes": 30,
    "task": "regression",
    # AutoML regression expects a normalized primary metric.
    "primary_metric": "normalized_mean_absolute_error",
    "training_data": train_data,
    "label_column_name": "price",
    "n_cross_validations": 3,
    "enable_early_stopping": True,
    "max_cores_per_iteration": -1,
    "max_concurrent_iterations": 4,
    "compute_target": cluster
}


automl_config = AutoMLConfig(**automl_settings)
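The constraint between concurrent iterations and cluster size can be checked before submitting a run. The sketch below is a hypothetical helper (`check_concurrency` is not part of the Azure ML SDK) that operates on a plain settings dictionary and a node count supplied by the caller.

```python
def check_concurrency(settings: dict, max_nodes: int) -> None:
    """Raise if more concurrent iterations are requested than nodes exist."""
    requested = settings.get("max_concurrent_iterations", 1)
    if requested > max_nodes:
        raise ValueError(
            f"max_concurrent_iterations={requested} exceeds max_nodes={max_nodes}"
        )


# Values mirror the notebook: 4 concurrent iterations on a 4-node cluster.
example_settings = {"max_concurrent_iterations": 4}
check_concurrency(example_settings, max_nodes=4)
print("Concurrency setting is consistent with the cluster size")
```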

In [None]:
# TODO: Submit your experiment
remote_run = experiment.submit(automl_config)

## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [None]:
RunDetails(remote_run).show()

In [None]:
remote_run.wait_for_completion(show_output=True)

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [None]:
best_auto_run, best_auto_model = remote_run.get_output()
best_auto_model._final_estimator

In [None]:
#TODO: Save the best model

model_name = "automl_model.pkl"
with open(model_name, 'wb') as f:
    pickle.dump(best_auto_model,f)
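Loading the file back is a quick way to confirm that the save succeeded. The sketch below shows the round trip with a stand-in dictionary in a temporary directory, since the real model is only available after a run completes; the stand-in object and its contents are hypothetical.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted model; any picklable object works for this check.
stand_in_model = {"name": "stand-in", "coefficients": [1.5, -0.25]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "automl_model.pkl")
    # Write the object to disk, then read it back.
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == stand_in_model)
```

Unpickling the actual AutoML model additionally requires the same packages to be installed in the loading environment.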

## Model Deployment




In [None]:
model = Model.register(ws, model_path='./automl_model.pkl', model_name='houseprice_prediction')

In [None]:
env = best_auto_run.get_environment()
inference_config = InferenceConfig(entry_script = "score.py", environment = env)
deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)

In [None]:
# env = best_auto_run.get_environment()
# entry_script='score.py'
# best_auto_run.download_file('outputs/scoring_file_v_1_0_0.py', entry_script)

In [None]:
# model = remote_run.register_model(model_name=best_auto_run.properties['model_name'], 
#                                            description='AutoML model')
# inference_config = InferenceConfig(entry_script = entry_script, environment = env)
#  deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)

In [None]:
service = Model.deploy(ws, 'deploy1', [model], inference_config, deployment_config)
service.wait_for_deployment(True)
print("State: " + service.state)
print("Scoring URI: " + service.scoring_uri)

In [None]:
%run endpoint.py
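The contents of `endpoint.py` are not shown here. As a rough sketch of what such a script might do, the code below builds a JSON request body and defines a helper that posts it to a scoring URI using only the standard library. The `{"data": [...]}` body shape, the helper names, and the sample feature values are assumptions; a real request would need every feature column the model was trained on, in the format `score.py` expects.

```python
import json
import urllib.request


def build_payload(rows: list) -> str:
    """Serialize input rows into the JSON body shape assumed here."""
    return json.dumps({"data": rows})


def score(scoring_uri: str, payload: str, timeout: float = 30.0) -> str:
    """POST the payload to the scoring URI and return the raw response text."""
    request = urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Made-up, incomplete feature row for illustration only.
sample_rows = [{"bedrooms": 3, "bathrooms": 2.0, "sqft_living": 1800}]
payload = build_payload(sample_rows)
print(payload)

# With a deployed service, the call would look like:
# print(score(service.scoring_uri, payload))
```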

In [None]:
service.get_logs()

In [None]:
service.update(enable_app_insights=True)

In [None]:
service.delete()