# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [None]:
import azureml.core
from azureml.core import Dataset
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.data.dataset_factory import TabularDatasetFactory
from azureml.widgets import RunDetails
from azureml.data.datapath import DataPath
from azureml.train.automl import AutoMLConfig
from azureml.interpret import ExplanationClient
from azureml.automl.core.featurization import FeaturizationConfig
import pandas as pd
import logging
from matplotlib import pyplot as plt
import train
import joblib 
import os

## Dataset

### Overview
What is Churn?
Churn occurs when customers stop, or plan to stop, using a company's services or contracts. Churn prediction is therefore about identifying customers who are likely to cancel soon, so that the company can offer them discounts or other benefits and keep them as customers.

Naturally, we can use historical data about customers who churned to build a model that identifies current customers who are about to leave. This is a binary classification problem: the target variable is categorical and has only two possible outcomes, churn or no churn.
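To make the binary framing concrete, here is a minimal sketch that maps yes/no churn labels to 1/0 and computes the share of churners. The sample labels and the `encode_churn` helper are made up for illustration; they are not taken from the dataset or from `train.py`.

```python
def encode_churn(label):
    """Map a yes/no churn label to 1 (churn) or 0 (no churn)."""
    normalized = label.strip().lower()
    if normalized not in ("yes", "no"):
        raise ValueError(f"Unexpected churn label: {label!r}")
    return 1 if normalized == "yes" else 0


# Hypothetical labels, for illustration only
sample_labels = ["No", "Yes", "No", "No", "Yes"]
encoded = [encode_churn(label) for label in sample_labels]

# Fraction of customers in the sample who churned
churn_rate = sum(encoded) / len(encoded)
print(encoded)
print(f"Churn rate: {churn_rate:.0%}")
```

Raising on unexpected labels is a deliberate choice in this sketch: silently mapping unknown values to 0 would hide data quality problems.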


### Task

Some of our customers are churning: they stop using our services and move to a different provider. We would like to prevent that, so we develop a system that identifies these customers, which lets us offer them an incentive to stay. We also need to be able to interpret the model's predictions.

First, we do some exploratory data analysis (EDA) to identify which features in the data are important. We then split the data into train and test sets so that we can evaluate our models, and finally we deploy the best model.

According to the description, this dataset has the following information:

Services of the customers — phone; multiple lines; internet; tech support and extra services such as online security, backup, device protection, and TV streaming

Account information — how long they have been clients, type of contract, type of payment method

Charges — how much the client was charged in the past month and in total

Demographic information — gender, age, and whether they have dependents or a partner

Churn — yes/no, whether the customer left the company within the past month

The label "Churn" tells us whether a customer left the company or not, and this is the target column for predictions.




In [None]:
ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')
# Choose a name for the experiment
# (experiment names may contain only letters, numbers, dashes and underscores, so no spaces)
experiment_name = 'churn-prediction'

experiment=Experiment(ws, experiment_name)

In [None]:
# Choose a name for your CPU cluster
cpu_cluster_name = "project-compute"

# Verify that cluster does not exist already
try:
    compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS3_V2',
                                                           max_nodes=6)
    compute_target = ComputeTarget.create(ws, cpu_cluster_name, compute_config)

compute_target.wait_for_completion(show_output=True)

In [None]:

# Loading dataset using url
# NOTE: update the key to match the dataset name
found = False
key = "Churn Prediction Dataset"
description_text = "Churn Prediction for Capstone Project"

if key in ws.datasets.keys(): 
        found = True
        ds = ws.datasets[key] 

if not found:
        # Create Dataset and register it into Workspace
        example_data = 'https://raw.githubusercontent.com/tejasbangera/Udacity-Captstone-Project/main/WA_Fn-UseC_-Telco-Customer-Churn.csv?token=AO3MXNWWA5PXCKIPPCWWT33AO7TLY'
        ds = TabularDatasetFactory.from_delimited_files(path = example_data)        
        #Register Dataset in Workspace
        ds = ds.register(workspace=ws,name=key,description=description_text)

In [None]:
df=ds.to_pandas_dataframe()
df.head()

In [None]:
from train import clean_data

# Use the clean_data function to clean your data.
x, y = clean_data(ds)

In [None]:
from sklearn.model_selection import train_test_split
#create train test split
train_x, test_x, train_y, test_y = train_test_split(x,y, test_size=0.2, random_state=200)
#join the train_x and train_y to create train dataset
train_df=pd.concat([train_x,train_y], axis=1)
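As a rough illustration of what a seeded 80/20 split does, the sketch below shuffles rows with a fixed seed and holds out a test fraction. It uses only the standard library and is not the algorithm `train_test_split` uses internally; the `split_rows` helper and the toy rows are hypothetical.

```python
import random


def split_rows(rows, test_size=0.2, seed=200):
    """Shuffle rows with a fixed seed and hold out a test fraction."""
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_size))
    # The first n_test shuffled rows form the test set, the rest the train set
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # toy stand-in for dataset rows
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))
```

Fixing the seed means the same rows end up in the test set on every run, which keeps model comparisons fair.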

## AutoML Configuration

TODO: Explain why you chose the AutoML settings and configuration you used below.

In [None]:

# TODO: Put your automl settings here
automl_settings = {
    "experiment_timeout_hours" : 0.3,
    "enable_early_stopping" : True,
    "iteration_timeout_minutes": 5,
    "max_concurrent_iterations": 4,
    "primary_metric": 'accuracy',
    "featurization": 'auto',
    "verbosity": logging.INFO,
}

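# TODO: Put your automl config here
# Assumption: clean_data returns the target as a named pandas Series
label = train_y.name
# Assumption: any local folder works here; this name is a placeholder
project_folder = './automl-churn'
# Remote AutoML runs expect a TabularDataset, so register the training DataFrame
# (the dataset name 'churn-train' is a placeholder)
train_data = Dataset.Tabular.register_pandas_dataframe(
    train_df, ws.get_default_datastore(), 'churn-train')

automl_config = AutoMLConfig(compute_target=compute_target,
                             task="classification",
                             training_data=train_data,
                             label_column_name=label,
                             path=project_folder,
                             debug_log="automl_errors.log",
                             **automl_settings
)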
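For a feel of how the time-related settings interact, here is a back-of-the-envelope estimate. It is only arithmetic on the values used above, not a description of how AutoML actually schedules work: it ignores setup, featurization and early stopping, and most iterations finish well before their timeout.

```python
# Values copied from automl_settings above
experiment_timeout_hours = 0.3
iteration_timeout_minutes = 5
max_concurrent_iterations = 4

# Total experiment budget in minutes
experiment_minutes = round(experiment_timeout_hours * 60)

# Number of back-to-back iterations that fit if each one used its full timeout
full_rounds = experiment_minutes // iteration_timeout_minutes

# Pessimistic iteration count when every slot runs full-length iterations
worst_case_iterations = full_rounds * max_concurrent_iterations

print(f"Budget: {experiment_minutes} min, "
      f"worst case about {worst_case_iterations} iterations")
```

The concurrency of 4 stays below the cluster's `max_nodes=6`, so each concurrent iteration can get its own node.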

In [None]:
# TODO: Submit your experiment
remote_run = experiment.submit(automl_config)

## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [None]:
# Show the progress of the AutoML run in the RunDetails widget
RunDetails(remote_run).show()
remote_run.wait_for_completion(show_output=True)

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [None]:
from azureml.core.run import Run

# Get the best run object
best_run, fitted_model = remote_run.get_output()
print(best_run)
print(fitted_model)
print(best_run.get_tags())
print(best_run.properties['model_name'])

In [None]:
#TODO: Save the best model
import joblib
os.makedirs('outputs', exist_ok = True)
joblib.dump(fitted_model, 'outputs/fitted_model.joblib')

## Model Deployment

Remember that you have to deploy only one of the two models you trained. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

In [None]:
best_run, fitted_model = remote_run.get_output()
model_name = best_run.properties['model_name']

script_file_name = './score.py'

best_run.download_file('outputs/scoring_file_v_1_0_0.py', './score.py')

In [None]:
description = 'AutoML Model trained on Churn dataset to predict if a customer has churned or not.'
tags = None
model = remote_run.register_model(model_name = model_name, description = description, tags = tags)

print(remote_run.model_id)

TODO: In the cell below, send a request to the web service you deployed to test it.

In [None]:
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
from azureml.core.webservice import Webservice
from azureml.core.model import Model
from azureml.core.environment import Environment

env = best_run.get_environment()    

# Use the scoring script downloaded from the best run
inference_config = InferenceConfig(entry_script=script_file_name, environment=env)

deploy_config =AciWebservice.deploy_configuration(cpu_cores = 1, 
                                               memory_gb = 1,
                                               enable_app_insights=True,
                                               auth_enabled=True,
                                                 )

# Web service names may contain only lowercase letters, numbers and dashes
aci_service_name = 'churn-prediction'
print(aci_service_name)
aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, deploy_config)
aci_service.wait_for_deployment(True)
print(aci_service.state)

In [None]:
print(aci_service.scoring_uri)

In [None]:
aci_service.wait_for_deployment(show_output=True)
aci_service.update(enable_app_insights = True)
print("State : "+aci_service.state)
print("Key " + aci_service.get_keys()[0])
print("Swagger URI : "+aci_service.swagger_uri)
print("Scoring URI : "+aci_service.scoring_uri)

In [None]:
import numpy as np
from numpy import array
import json
import requests

# URL for the web service
scoring_uri = aci_service.scoring_uri
# If the service is authenticated, set the key or token
primary,secondary = aci_service.get_keys()
key = primary

# Convert to JSON string
test_x_json = test_x.to_json(orient='records')
data = "{\"data\": " + test_x_json +"}"

# Set the content type
headers = {'Content-Type': 'application/json'}
# If authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {key}'

# Make the request and display the response
resp = requests.post(scoring_uri, data, headers=headers)
print(resp.text)
#resp = requests.post(aci_service.scoring_uri, data, headers=headers)

y_pred = json.loads(json.loads(resp.text))['result']
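The double `json.loads` above is needed when the response body is a JSON string that itself contains JSON, which is what the generated scoring script appears to return here. A minimal sketch with a simulated body (`fake_body` is made up, not real service output):

```python
import json

# Simulate a scoring script that returns json.dumps(...) of its result,
# which the service then serializes a second time
inner = json.dumps({"result": ["No", "Yes", "No"]})
fake_body = json.dumps(inner)

first_pass = json.loads(fake_body)     # still a str holding JSON text
second_pass = json.loads(first_pass)   # now a dict
y_pred_example = second_pass["result"]

print(type(first_pass).__name__, y_pred_example)
```

If the service returned a plain JSON object instead, the second `json.loads` would raise a `TypeError`, so checking the type after the first pass is a reasonable safeguard.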

In [None]:
# Set the content type
headers = {'Content-Type': 'application/json'}
# If authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {aci_service.get_keys()[0]}'

# Make the request and display the response
response = requests.post(aci_service.scoring_uri, data, headers=headers)
print('Prediction output:', response.text)

# Print original labels
print('True Values :', test_y.values)
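Rather than comparing the two printed lists by eye, the predictions and true labels can be reduced to a single accuracy figure. The sketch below uses made-up lists and a hypothetical `accuracy` helper; it assumes both sequences use the same label encoding.

```python
def accuracy(predicted, actual):
    """Return the fraction of positions where prediction and label agree."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("Inputs must be non-empty and of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical values standing in for the service output and the test labels
example_pred = ["No", "Yes", "No", "No"]
example_true = ["No", "Yes", "Yes", "No"]

example_accuracy = accuracy(example_pred, example_true)
print(f"Accuracy: {example_accuracy:.2f}")
```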


TODO: In the cell below, print the logs of the web service and delete the service

In [None]:
print(aci_service.get_logs())
# Delete the web service once the logs have been inspected
aci_service.delete()
