# Flight Delay Demo - Security & Enterprise Readiness

In [1]:
import warnings
warnings.filterwarnings("ignore")

import logging
logging.basicConfig(level = logging.ERROR)

## Setup working directory

The cell below creates our working directory. This will hold our generated scripts.

In [2]:
import os

project_folder = './scripts'

# Working directory
if not os.path.exists(project_folder):
    os.makedirs(project_folder)
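
As a minimal sketch, the existence check and the directory creation can be folded into a single call with `exist_ok=True`. The helper name `ensure_dir` is hypothetical and not part of the notebook, and the demonstration runs inside a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile


def ensure_dir(path):
    """Create `path` if it is missing and return its absolute form."""
    # exist_ok=True makes the call safe to repeat when the folder already exists.
    os.makedirs(path, exist_ok=True)
    return os.path.abspath(path)


# Demonstration in a throwaway location (an assumption made for this sketch).
with tempfile.TemporaryDirectory() as tmp:
    scripts_dir = ensure_dir(os.path.join(tmp, "scripts"))
    ensure_dir(scripts_dir)  # second call does nothing and raises no error
    created = os.path.isdir(scripts_dir)

print(created)
```

In the notebook itself, `ensure_dir(project_folder)` would stand in for the two-line check above.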

## Getting started

Import and verify the Azure ML SDK version.

In [None]:
import azureml.core

azureml.core.VERSION

## Connect to workspace

In the next cell, we create a `Workspace` object from the `<subscription_id>`, `<resource_group_name>`, and `<workspace_name>` values. This fetches the matching Workspace and prompts you for authentication; follow the link shown in the cell output and enter the provided details. The cell then writes the connection details to a config file so that later cells can call `Workspace.from_config()`.

For more information on **Workspace**, please visit: [Microsoft Workspace Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py)

`<subscription_id>` = You can get this ID from the landing page of your Resource Group.

`<resource_group_name>` = This is the name of your Resource Group.

`<workspace_name>` = This is the name of your Workspace.

In [None]:
from azureml.core.workspace import Workspace

try:    
    # Get instance of the Workspace and write it to config file
    ws = Workspace(
        subscription_id = '<subscription_id>', 
        resource_group = '<resource_group_name>',
        workspace_name = '<workspace_name>')

    # Writes workspace config file
    ws.write_config()
    
    print('Library configuration succeeded')
except Exception as e:
    print(e)
    print('Workspace not found')
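
The sketch below is one way to catch placeholders that were never filled in before a later cell relies on `Workspace.from_config()`. It assumes the config file written by `ws.write_config()` is a JSON document with the keys `subscription_id`, `resource_group`, and `workspace_name`; the helper name `check_workspace_config` and the sample values are hypothetical.

```python
import json
import os
import tempfile

# Keys assumed to be present in the workspace config file.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_workspace_config(path):
    """Return the keys that are missing or still hold a <placeholder> value."""
    with open(path) as handle:
        config = json.load(handle)
    problems = []
    for key in REQUIRED_KEYS:
        value = config.get(key, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            problems.append(key)
    return problems


# Demonstration with a throwaway file holding made-up values.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w") as handle:
        json.dump({"subscription_id": "<subscription_id>",
                   "resource_group": "my-rg",
                   "workspace_name": "my-ws"}, handle)
    problems = check_workspace_config(config_path)

print(problems)  # ['subscription_id']
```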

## Load Data from Azure Dataset Registry

The first step is to get our data using the Dataset module. The function `Dataset.get_by_name()` returns a registered Dataset from a given `workspace` and its registration `name`.

`workspace` = The existing AzureML workspace in which the Dataset was registered.

`name` = The registration name.

`tabular.take(count)` = Returns a new dataset that contains only the first `count` records, which keeps the preview at the end of the cell small.

For more information on **Dataset**, please visit: [Microsoft Dataset Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#get-by-name-workspace--name--version--latest--)


In [None]:
from azureml.core import Dataset

tabular = Dataset.get_by_name(ws, 'flight_dataset_2008_with_weather')

data = tabular.to_pandas_dataframe()
tabular.take(3).to_pandas_dataframe()

## Fetch existing AML Compute Cluster

**Note:** This cluster was deployed by the setup guide. The cluster sits under a private VNet.

In [None]:
from azureml.core.compute import ComputeTarget

# Fetch the existing AML CPU compute cluster
compute_target = ComputeTarget(workspace=ws, name='cpu-cluster')
print('Found existing compute target.')

## Instantiate an Automated ML Config
Before an Automated ML run is executed, the `AutoMLConfig` should be set up. `AutoMLConfig` is a configuration object that contains and persists the parameters of the experiment run. It is a key element of the run because it defines settings such as the number of iterations and the primary metric to optimize. In the example below, the run is set up to execute a classification task with 3 iterations, using `accuracy` as the primary metric.

For more information on **AutoMLConfig**, please visit: [Microsoft AutoMLConfig Documentation](https://docs.microsoft.com/en-us/python/api/azureml-train-automl/azureml.train.automl.automlconfig?view=azure-ml-py)
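
Because the primary metric has to match the task type, a small pre-flight check can catch a mismatch before a run is submitted. The sketch below is illustrative only: the helper `check_primary_metric` is hypothetical, and the metric names are listed as recalled from the Automated ML documentation, so verify them against the SDK version you have installed.

```python
# Metric names per task type (an assumption; check your SDK version).
CLASSIFICATION_METRICS = {
    "accuracy",
    "AUC_weighted",
    "average_precision_score_weighted",
    "norm_macro_recall",
    "precision_score_weighted",
}
REGRESSION_METRICS = {
    "spearman_correlation",
    "normalized_root_mean_squared_error",
    "r2_score",
    "normalized_mean_absolute_error",
}
METRICS_BY_TASK = {
    "classification": CLASSIFICATION_METRICS,
    "regression": REGRESSION_METRICS,
}


def check_primary_metric(task, primary_metric):
    """Return True when `primary_metric` is listed for `task`."""
    if task not in METRICS_BY_TASK:
        raise ValueError("unknown task: " + task)
    return primary_metric in METRICS_BY_TASK[task]


print(check_primary_metric("classification", "accuracy"))  # True
print(check_primary_metric("regression", "accuracy"))      # False
```

`accuracy` is a classification metric, which is why the configuration in this notebook pairs it with `task = 'classification'`.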

In [7]:
from azureml.train.automl import AutoMLConfig

training_data, validation_data = tabular.random_split(percentage=0.9, seed=1)

automl_config = AutoMLConfig(task = 'classification',
                             iterations = 3,
                             iteration_timeout_minutes = 30, 
                             max_cores_per_iteration = 10,
                             primary_metric = 'accuracy',
                             debug_log = 'automl.log',
                             training_data = training_data,
                             validation_data = validation_data,
                             label_column_name = "ArrDelay15",
                             compute_target = compute_target,
                             path = project_folder,
                             model_explainability = True,
                             experiment_exit_score = 0.8,
                             enable_early_stopping = True,
                             enable_onnx_compatible_models=True)

## Run our Experiment on AML Compute

The `Experiment` constructor creates an experiment instance. It takes the current workspace, which is fetched by calling `Workspace.from_config()`, and an experiment name.

The `experiment.submit()` function sends the experiment for execution. It receives the `AutoMLConfig` object instantiated previously in this module; `show_output=False` keeps the run output from being streamed into the notebook.

For more information on **Experiment**, please visit: [Microsoft Experiment Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment.experiment?view=azure-ml-py)

In [None]:
from azureml.core.experiment import Experiment

# Get an instance of the Workspace from the config file
ws = Workspace.from_config()
ws.update(image_build_compute = 'cpu-cluster')

# Create Experiment
experiment = Experiment(ws, 'flight-delay-exp')

remote_run = experiment.submit(automl_config, show_output=False)
remote_run

## Display Automated ML Run Details
An object of type `AutoMLRun` lets us observe the experiment progress and results. It is created by calling the constructor `AutoMLRun()` with the experiment and the identifier of the run to fetch. Once the object has been instantiated, the `RunDetails` widget retrieves the progress, metrics, and tasks for that run and displays them when its `show()` method is called.

For more information on **AutoMLRun**, please visit: [Microsoft AutoMLRun Documentation](https://docs.microsoft.com/en-us/python/api/azureml-train-automl/azureml.train.automl.run.automlrun?view=azure-ml-py)

For more information on **RunDetails**, please visit: [Microsoft RunDetails Documentation](https://docs.microsoft.com/en-us/python/api/azureml-widgets/azureml.widgets.rundetails?view=azure-ml-py)


**Note:** Please wait for the execution of the cell to finish before moving forward. (Status should be **Completed**)
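
Where the widget is not available, waiting for a terminal status can be done with a simple polling loop. The sketch below is not the SDK's implementation: `wait_until_terminal` is a hypothetical helper, the set of terminal states is an assumption, and a stub stands in for the real status source (in the notebook, something like `remote_run.get_status` could be passed instead).

```python
import time

# Status values treated as final (an assumption for this sketch).
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_until_terminal(get_status, poll_seconds=30.0, timeout_seconds=3600.0,
                        sleep=time.sleep):
    """Poll `get_status()` until it reports a terminal state or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("run still in state " + str(status))
        sleep(poll_seconds)
        waited += poll_seconds


# Stub status source that finishes on the fourth poll.
statuses = iter(["Preparing", "Running", "Running", "Completed"])
final_status = wait_until_terminal(lambda: next(statuses), poll_seconds=0.0)
print(final_status)  # Completed
```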

In [None]:
from azureml.train.automl.run import AutoMLRun 
from azureml.widgets import RunDetails
from azureml.core.experiment import Experiment

experiment = Experiment(ws, 'flight-delay-exp')
remote_run = AutoMLRun(experiment=experiment, run_id=remote_run.id)

RunDetails(remote_run).show()

## Show best run

Select the best model from your iterations. The `get_output` function returns the best run and the fitted model for the last fit invocation. By using the overloads on `get_output`, you can retrieve the best run and fitted model for any logged metric or a particular iteration.
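
To make the idea of "best run for a given metric" concrete, the sketch below ranks a handful of iterations by a chosen metric. It is only an illustration of the selection logic, not the SDK's implementation: `pick_best` is a hypothetical helper and the metric values are made up. In the SDK, the equivalent choice is expressed through the `metric` and `iteration` arguments of `get_output`.

```python
def pick_best(iterations, metric, higher_is_better=True):
    """Return the iteration record with the best value for `metric`."""
    scored = [item for item in iterations if metric in item["metrics"]]
    if not scored:
        raise ValueError("no iteration logged the metric " + metric)
    chooser = max if higher_is_better else min
    return chooser(scored, key=lambda item: item["metrics"][metric])


# Made-up metric values, for illustration only.
iterations = [
    {"iteration": 0, "metrics": {"accuracy": 0.78, "AUC_weighted": 0.81}},
    {"iteration": 1, "metrics": {"accuracy": 0.80, "AUC_weighted": 0.79}},
    {"iteration": 2, "metrics": {"accuracy": 0.79, "AUC_weighted": 0.83}},
]

best_by_accuracy = pick_best(iterations, "accuracy")
best_by_auc = pick_best(iterations, "AUC_weighted")
print(best_by_accuracy["iteration"], best_by_auc["iteration"])  # 1 2
```

The two metrics disagree on the winner here, which is why it matters which metric is passed when retrieving the best run.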

In [None]:
remote_run.wait_for_completion()
best_run, fitted_model = remote_run.get_output()
print(best_run)

## Register Model

Next, register the model obtained from the best run by calling `register_model()` on that run. The call below also records the training dataset alongside the registered model.

In [11]:
# register the model for deployment
model = best_run.register_model(model_name='flight_delay_weather', 
                                model_path='outputs/model.pkl',
                                datasets=[(Dataset.Scenario.TRAINING, tabular)])

In [12]:
from azureml.core.model import Model
model = Model(ws, 'flight_delay_weather')

## Create/connect to the Kubernetes compute cluster

The `AksCompute Class` manages an Azure Kubernetes Service compute target in Azure Machine Learning.

The `ComputeTarget Class` is an abstract parent class for all compute targets managed by Azure Machine Learning. A compute target is a designated compute resource/environment where you run your training script or host your service deployment. 

The `ComputeTargetException` is an exception related to failures when creating, interacting with, or configuring a compute target. This exception is commonly raised for failures attaching a compute target, missing headers, and unsupported configuration values.

For more information on **AksCompute Class**, please visit: [Microsoft Machine Learning - AksCompute Class Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.aks.akscompute?view=azure-ml-py)

For more information on **ComputeTarget Class**, please visit: [Microsoft Machine Learning - ComputeTarget Class Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.computetarget?view=azure-ml-py)

For more information on **ComputeTargetException Class**, please visit: [Microsoft Machine Learning - ComputeTargetException Class Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.exceptions.computetargetexception?view=azure-ml-py)


In [None]:
from azureml.core.compute import ComputeTarget, AksCompute
from azureml.exceptions import ComputeTargetException

try:
    aks_target = AksCompute(ws, 'secure-aks')
except ComputeTargetException:
    # Create the compute configuration and set virtual network information
    config = AksCompute.provisioning_configuration(location="<region>")
    config.vnet_resourcegroup_name = "<resource-group-name>"
    config.vnet_name = "<vnet-name>"
    config.subnet_name = "<subnet-name>"
    config.service_cidr = "10.0.0.0/16"
    config.dns_service_ip = "10.0.0.10"
    config.docker_bridge_cidr = "172.17.0.1/16"
    config.vm_size = "Standard_DS2_v2"

    # Create the compute target
    aks_target = ComputeTarget.create(workspace=ws,
                                    name="secure-aks",
                                    provisioning_configuration=config)
    aks_target.wait_for_completion(True)
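
The address ranges in the configuration above have to be consistent with each other and with the subnet that hosts the cluster. The sketch below checks three rules as I understand them from AKS networking guidance (treat them as assumptions): the DNS service IP lies inside the service CIDR, the service CIDR does not overlap the subnet, and the Docker bridge range overlaps neither. The helper `check_aks_network` and the subnet range `10.1.0.0/24` are hypothetical.

```python
import ipaddress


def check_aks_network(service_cidr, dns_service_ip, docker_bridge_cidr, subnet_cidr):
    """Return a list of problems; an empty list means the ranges look consistent."""
    service_net = ipaddress.ip_network(service_cidr)
    subnet_net = ipaddress.ip_network(subnet_cidr)
    # The bridge value is a host address plus prefix (172.17.0.1/16),
    # so strict=False is needed to derive the enclosing network.
    bridge_net = ipaddress.ip_network(docker_bridge_cidr, strict=False)
    dns_ip = ipaddress.ip_address(dns_service_ip)

    problems = []
    if dns_ip not in service_net:
        problems.append("dns_service_ip is outside service_cidr")
    if service_net.overlaps(subnet_net):
        problems.append("service_cidr overlaps the subnet")
    if bridge_net.overlaps(subnet_net) or bridge_net.overlaps(service_net):
        problems.append("docker_bridge_cidr overlaps another range")
    return problems


# Values from the cell above plus a hypothetical subnet range.
problems = check_aks_network("10.0.0.0/16", "10.0.0.10", "172.17.0.1/16", "10.1.0.0/24")
print(problems)  # []
```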

## Update AKS to enable the Internal Load Balancer

The internal load balancer enables services to be published to the virtual network.

In [None]:
from azureml.core.compute.aks import AksUpdateConfiguration

# Change to the name of the subnet that contains AKS
subnet_name = "<subnet-name>"
# Update AKS configuration to use an internal load balancer
update_config = AksUpdateConfiguration(None, "InternalLoadBalancer", subnet_name)
aks_target.update(update_config)
# Wait for the operation to complete
aks_target.wait_for_completion(show_output = True)

## Create Scoring File

Creating the scoring file is the next step before deploying the service. This file is responsible for the actual generation of predictions using the model. The values or scores generated can represent predictions of future values, but they might also represent a likely category or outcome.

The first thing to do in the scoring file is to fetch the model. This is done by calling `Model.get_model_path()` and passing the model name as a parameter.

After the model has been loaded, the `model.predict()` function should be called to start the scoring process.

For more information on **Machine Learning - Score**, please visit: [Microsoft Machine Learning - Score Documentation](https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/machine-learning-score)
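
The contract between the two functions of a scoring file can be exercised locally without any Azure dependency. The sketch below is a simplified stand-in, not the scoring file used in this notebook: `ThresholdModel` is a hypothetical replacement for the deserialized model, and the input rows are made up. It shows `init()` loading the model once and `run()` returning either a `result` or an `error` payload.

```python
import json

model = None


class ThresholdModel:
    """Hypothetical stand-in for the deserialized AutoML model."""

    def predict(self, rows):
        # Flag a row as delayed (1) when its first feature exceeds 10.
        return [1 if row[0] > 10 else 0 for row in rows]


def init():
    """Load the model once, when the service starts."""
    global model
    model = ThresholdModel()


def run(data):
    """Score `data` and wrap the outcome in a JSON-friendly dictionary."""
    try:
        result = model.predict(data)
    except Exception as exc:
        return {"error": str(exc)}
    return {"result": list(result)}


init()
ok = run([[12, 3], [4, 8]])
bad = run([None])  # a malformed row triggers the error branch
print(json.dumps(ok))  # {"result": [1, 0]}
print("error" in bad)  # True
```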


In [None]:
%%writefile score.py
import pickle
import json
import time
import numpy as np
import pandas as pd
import azureml.automl.core
from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
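# sklearn.externals.joblib was removed from recent scikit-learn releases;
# joblib is installed as its own package by the environment file.
import joblib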
 
input_sample = np.array([[3,30,7,820,930,'MQ',70,'DFW','LIT',304,32.89595056,-97.0372,'TX',34.72939611,-92.22424556,'AR',44236.8,44236.8,0.0,11.0,409.6,208.0,0.0,0.0,28.5,16.0,15.0,8.5,1720.0,1120.0]])
output_sample = np.array([1])
 
def init():
    global model, inputs_dc, prediction_dc
    print("model initialized at " + time.strftime("%H:%M:%S"))
    
    # this is the registered name of the model that we want to deploy
    model_path = Model.get_model_path(model_name = 'flight_delay_weather')
    
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)
    
@input_schema('data', NumpyParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        df = pd.DataFrame(data, columns=['Month', 'DayofMonth', 'DayOfWeek', 'CRSDepTime', 'CRSArrTime', 'UniqueCarrier', 'CRSElapsedTime', 'Origin', 'Dest', 'Distance', 'Origin_Lat', 'Origin_Lon', 'Origin_State', 'Dest_Lat', 'Dest_Lon', 'Dest_State', 'Origin_dayl', 'Dest_dayl', 'Origin_prcp', 'Dest_prcp', 'Origin_srad', 'Dest_srad', 'Origin_swe', 'Dest_swe', 'Origin_tmax', 'Dest_tmax', 'Origin_tmin', 'Dest_tmin', 'Origin_vp', 'Dest_vp']) 
        result = model.predict(df)
    except Exception as e:
        result = str(e)
        print(result)
        return {"error": result}
    return {"result":result.tolist()}

## Create Environment Definition File

The dependencies file allows us to define the libraries to be included in the inferencing environment.

In [None]:
%%writefile score-new.yml
name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2

- pip:
  - azureml-sdk[notebooks,automl]
  - azureml-defaults
  - inference-schema
  - azureml-monitoring
  - joblib
- numpy
- scikit-learn
channels:
- anaconda
- conda-forge

## Deploy the model to AKS

The first step is to define the dependencies that the service needs in order to run. In this notebook they are declared in the `score-new.yml` file written in the previous cell. An equivalent file can also be generated with `CondaDependencies.create()`, which receives the pip and conda packages to install on the remote machine and whose output can be persisted into a `.yml` file.

Now that the AKS cluster has been deployed and the dependencies have been declared, it’s time to create an `InferenceConfig` object by calling its constructor and passing the runtime type, the path to the `entry_script` (score.py), and the `conda_file` (the previously created file that holds the environment dependencies).

Next, define the configuration of the web service to deploy. This is done by calling `AksWebservice.deploy_configuration()` and passing along the number of `cpu_cores` and `memory_gb` that the service needs.

Finally, to deploy the model and service to the AKS cluster, call `Model.deploy()` with the workspace object, the service name, a list of models to deploy, the inference configuration, the deployment configuration, and the AKS object created in the step above. The cell first tries to connect to an existing service with the same name and deploys only if none is found.

For more information on **CondaDependencies**, please visit: [Microsoft CondaDependencies Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.conda_dependencies.condadependencies?view=azure-ml-py)

For more information on **InferenceConfig**, please visit: [Microsoft InferenceConfig Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py)

For more information on **AksWebService**, please visit: [Microsoft AksWebService Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice.akswebservice?view=azure-ml-py)

For more information on **Model**, please visit: [Microsoft Model Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py)


**Note:** Please wait for the execution of the cell to finish before moving forward.

In [None]:
from azureml.exceptions import WebserviceException
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AksWebservice
from azureml.core import Workspace

# Load workspace from an existing config file
ws = Workspace.from_config()
# Update the workspace to use an existing compute cluster
ws.update(image_build_compute = 'cpu-cluster')

inference_config = InferenceConfig(runtime= "python",
                                    entry_script="score.py",
                                    conda_file="score-new.yml")

deployment_config = AksWebservice.deploy_configuration(cpu_cores = 1, 
                                                        memory_gb = 1,
                                                        collect_model_data=True, 
                                                        enable_app_insights=True)

try:
    service = AksWebservice(ws, 'flight-delay-secure')
    print(service.state)
except WebserviceException:
    service = Model.deploy(ws, 
                            'flight-delay-secure', 
                            [model], 
                            inference_config, 
                            deployment_config, 
                            aks_target)

    service.wait_for_deployment(show_output = True)
    print(service.state)

## Connect to the deployed webservice

With test data in hand, we can put it into a suitable format to consume the web service. First, obtain an instance of the web service by calling the constructor `Webservice()` with the Workspace object and the service name as parameters. Next, drop the label column from the validation data so that no unexpected columns are sent to the web service. Finally, call the service via POST using the `requests` module: `requests.post()` takes as parameters the service URL, the test data, and a headers dictionary that contains the authentication key.

For more information on **Webservice**, please visit: [Microsoft Webservice Documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice?view=azure-ml-py)
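
The request and response handling can be separated from the network call, which makes it easy to test offline. The sketch below is a minimal illustration under stated assumptions: `build_scoring_request` and `parse_scoring_response` are hypothetical helpers, the key and the sample row are made up, and the payload shapes (`data` in the request, `result` or `error` in the response) follow the scoring file in this notebook.

```python
import json


def build_scoring_request(rows, key=None):
    """Return the headers and JSON body for a scoring call."""
    headers = {"Content-Type": "application/json"}
    if key:
        # Key-based authentication sends the service key as a bearer token.
        headers["Authorization"] = "Bearer " + key
    body = json.dumps({"data": rows})
    return headers, body


def parse_scoring_response(payload):
    """Return the predictions, or raise if the service reported an error."""
    if "error" in payload:
        raise RuntimeError(payload["error"])
    return payload["result"]


headers, body = build_scoring_request([[3, 30, 7, "MQ"]], key="example-key")
predictions = parse_scoring_response({"result": [1]})
print(headers["Authorization"])  # Bearer example-key
print(predictions)               # [1]
```

Checking for the `error` key before indexing into `result` avoids a confusing `KeyError` when the service returns a failure payload.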

In [18]:
import json
import requests
import pandas as pd
from azureml.core.webservice import Webservice

aks_service = Webservice(ws, 'flight-delay-secure')
aks_service.update(enable_app_insights=True)

# prepare the test data
val = validation_data.to_pandas_dataframe()
val = val.drop(columns=['ArrDelay15'])
sample = val.sample(n=10, random_state=4).values.tolist()

headers = {'Content-Type':'application/json'}

if aks_service.auth_enabled:
    headers['Authorization'] = 'Bearer '+ aks_service.get_keys()[0]

output_df = []
for x in sample:    
    test_sample = json.dumps({'data': [x]})
    response = requests.post(aks_service.scoring_uri, data=test_sample, headers=headers)
    prediction = [response.json()['result'][0]]
    prediction.extend(x)
    output_df.append(prediction)

## Check the IP of the scoring webservice

The scoring URI of the service should contain a private IP address, which shows that the service is published inside the virtual network.

In [None]:
service.scoring_uri
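
Rather than reading the address by eye, the host part of the URI can be checked programmatically. The sketch below uses only the standard library; `scoring_host_is_private` is a hypothetical helper and the example URIs are made up, not taken from a real deployment.

```python
import ipaddress
from urllib.parse import urlparse


def scoring_host_is_private(scoring_uri):
    """Return True or False for an IP host, or None when the host is a DNS name."""
    host = urlparse(scoring_uri).hostname
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # The host is not a literal IP address (for example, a DNS name).
        return None


internal = scoring_host_is_private("http://10.1.0.5:80/api/v1/service/flight-delay-secure/score")
public = scoring_host_is_private("http://8.8.8.8/api/v1/service/flight-delay-secure/score")
named = scoring_host_is_private("https://scoring.example.com/score")
print(internal, public, named)  # True False None
```

In the notebook, passing `service.scoring_uri` to the helper would be expected to return `True` once the internal load balancer is in place.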

## Present scoring service predictions

Let's format our service responses and present them in a suitable way to our end users.

In [None]:
def highlight_delays(val):
    return 'background-color: yellow' if val == True else ''

predictions = pd.DataFrame(output_df, columns =['Prediction', 'Month', 'DayofMonth', 'DayOfWeek', 'CRSDepTime', 'CRSArrTime', 'UniqueCarrier', 'CRSElapsedTime', 'Origin', 'Dest', 'Distance', 'Origin_Lat', 'Origin_Lon', 'Origin_State', 'Dest_Lat', 'Dest_Lon', 'Dest_State', 'Origin_dayl', 'Dest_dayl', 'Origin_prcp', 'Dest_prcp', 'Origin_srad', 'Dest_srad', 'Origin_swe', 'Dest_swe', 'Origin_tmax', 'Dest_tmax', 'Origin_tmin', 'Dest_tmin', 'Origin_vp', 'Dest_vp'])
predictions = predictions.style.applymap(highlight_delays, subset=['Prediction'])
predictions