## Introduction:

In this notebook we train a simple neural network with one hidden layer to predict CO gas emissions from turbine sensor data. Hyperparameter optimization covers the scaling method, the number of hidden-layer neurons, the hidden-layer activation function, the learning rate, the number of epochs and the momentum. The original work was published at https://journals.tubitak.gov.tr/cgi/viewcontent.cgi?article=1505&context=elektrik (DOI: 10.3906/elk-1807-87).

We use the Azure Machine Learning (AzureML) cloud platform for training, testing and deploying the model. We were able to achieve slightly higher validation and testing performance, in terms of the coefficient of determination, than reported in the original paper.

### Prerequisites:
1. An Azure account with an active subscription and the Owner or Contributor role.
2. An AzureML workspace with all required dependencies.
3. The Azure Machine Learning Python SDK v2 in the notebook execution environment.

### 1. Connect to Azure Machine Learning Workspace Using the config.json File

First, **DefaultAzureCredential** is tried; if it fails, **InteractiveBrowserCredential** is used instead, which requests the Azure username and password in the browser. Once the credential is created, an MLClient object is created from the config.json file, which is directly accessible from Azure compute resources associated with a workspace. The config.json file can also be downloaded from AzureML studio.

In [None]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

try:
    credential = DefaultAzureCredential()
    # Check if given credential can get token successfully.
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

ml_client = MLClient.from_config(credential)
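
As a quick sanity check before calling `MLClient.from_config`, the sketch below verifies that a workspace configuration contains the three fields an AzureML config.json carries (`subscription_id`, `resource_group`, `workspace_name`). The helper name `missing_config_keys` and the sample values are illustrative assumptions, not part of the SDK.

```python
import json

# Fields expected in an AzureML workspace config.json
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")

def missing_config_keys(config_text):
    """Return the required workspace keys that are absent or empty."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]

# Placeholder values for illustration only; the workspace name is left empty on purpose
sample = json.dumps({
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "my-resource-group",
    "workspace_name": "",
})
print(missing_config_keys(sample))  # ['workspace_name']
```

An empty list means the file has everything `from_config` needs to locate the workspace.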

### 2. Create or Get a Reference to a Compute Cluster for Training and Deployment

First, the workspace is checked for a compute resource with the given name; if one exists, a reference to it is obtained. Otherwise, a compute cluster of size "Standard_E8s_v3" is created with an idle time of 180 seconds before scale-down and default settings otherwise.

In [None]:
from azure.ai.ml.entities import AmlCompute
compute_cluster_name = "GasEmission-cc-bhathiya"
try:
    # Try to get a reference to a compute resource with the name
    compute_cluster = ml_client.compute.get(name = compute_cluster_name)
    print(f"Compute-cluster with name {compute_cluster_name} already exists")
except Exception:
    # If the lookup fails, create a compute cluster with that name
    compute_cluster = AmlCompute(
        name=compute_cluster_name,
        size="Standard_E8s_v3",
        idle_time_before_scale_down=180,
        )
    print(f"Compute-cluster with name {compute_cluster_name} is being created")
    ml_client.compute.begin_create_or_update(compute_cluster).wait()
    print(f"Compute-cluster with name {compute_cluster_name} successfully created")

### 3. Download Training and Testing Data into a Local Folder Structure

Sensor data from 2011 to 2015 can be downloaded from the [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/dataset/551/gas+turbine+co+and+nox+emission+data+set). According to the original research paper, the first three years (2011, 2012, 2013) were used for training and the last two years (2014, 2015) for testing. For performance comparison, the same train-test split is used here.

The code checks whether the data already exists in the required folder structure; if not, the data .zip file is downloaded from the repository and extracted locally. Training data is moved to the "./Data/PipelineNN/Train/" folder and testing data to the "./Data/PipelineNN/Test/" folder. The downloaded .zip file is then deleted.

In [None]:
import os
from zipfile import ZipFile
import requests

if (
    # Check whether the data files exist in the local folder structure
    os.path.exists("./Data/PipelineNN/Train/gt_2011.csv") and
    os.path.exists("./Data/PipelineNN/Train/gt_2012.csv") and
    os.path.exists("./Data/PipelineNN/Train/gt_2013.csv") and
    os.path.exists("./Data/PipelineNN/Test/gt_2014.csv") and
    os.path.exists("./Data/PipelineNN/Test/gt_2015.csv")
    ):
    print("Data files already exist")
else:
    # Data files do not exist
    
    # Get a handle to the repository .zip file
    print("Data files need to be downloaded")
    zipfile_handle = requests.get("https://archive.ics.uci.edu/static/public/551/gas+turbine+co+and+nox+emission+data+set.zip", allow_redirects=True)
    
    # Download the .zip file to Data folder
    local_dir_Data = "./Data"
    if not os.path.exists(local_dir_Data):
        os.mkdir(local_dir_Data)
    with open("./Data/GasEmission.zip",'wb') as output_file:
        output_file.write(zipfile_handle.content)
    
    # Try to extract the downloaded .zip file
    try:
        with ZipFile("./Data/GasEmission.zip", 'r') as zipfile_Object:
            zipfile_Object.extractall(path="./Data")
            Extracted_bool = True
            print("Zip file successfully extracted")
    except Exception:
        Extracted_bool = False
        print("Downloaded Zipfile Error! Extraction Failed!")
    
    # If extraction was successful, move training files to the Train folder and testing files to the Test folder.
    if (Extracted_bool):
        local_dir_PipelineNN = "./Data/PipelineNN"
        if not os.path.exists(local_dir_PipelineNN):
            os.mkdir(local_dir_PipelineNN)

        local_dir_Train = "./Data/PipelineNN/Train"
        if not os.path.exists(local_dir_Train):
            os.mkdir(local_dir_Train)

        os.replace("./Data/gt_2011.csv","./Data/PipelineNN/Train/gt_2011.csv")
        os.replace("./Data/gt_2012.csv","./Data/PipelineNN/Train/gt_2012.csv")
        os.replace("./Data/gt_2013.csv","./Data/PipelineNN/Train/gt_2013.csv")
        print("Successfully created ./Data/PipelineNN/Train folder")

        local_dir_Test = "./Data/PipelineNN/Test"
        if not os.path.exists(local_dir_Test):
            os.mkdir(local_dir_Test)

        os.replace("./Data/gt_2014.csv","./Data/PipelineNN/Test/gt_2014.csv")
        os.replace("./Data/gt_2015.csv","./Data/PipelineNN/Test/gt_2015.csv")
        print("Successfully created ./Data/PipelineNN/Test folder")
        
        # Delete the downloaded .zip file
        os.remove("./Data/GasEmission.zip")
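
The year-based split described above can also be expressed as a small mapping from file name to destination folder. This is only a sketch: the helper `target_path` and the two year sets are illustrative names, and the `gt_<year>.csv` naming is taken from the files used in the cell above.

```python
import posixpath

TRAIN_YEARS = {2011, 2012, 2013}  # years used for training in the paper
TEST_YEARS = {2014, 2015}         # years held out for testing

def target_path(file_name, root="./Data/PipelineNN"):
    """Map a yearly file such as 'gt_2014.csv' to its Train or Test destination."""
    stem = posixpath.splitext(file_name)[0]  # 'gt_2014'
    year = int(stem.split("_")[1])           # 2014
    if year in TRAIN_YEARS:
        split = "Train"
    elif year in TEST_YEARS:
        split = "Test"
    else:
        raise ValueError(f"Unexpected year in file name: {file_name}")
    return posixpath.join(root, split, file_name)

for name in ["gt_2011.csv", "gt_2014.csv"]:
    print(name, "->", target_path(name))
```

Keeping the year-to-split rule in one place makes it harder for a file to end up in the wrong folder when the list of years changes.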

### 4. Define and Create the Environment for Training, Testing and Deployment

#### 4.1 Create a Folder to Hold Environment Definition

In [None]:
import os
dependencies_dir = "./dependencies"
os.makedirs(dependencies_dir, exist_ok=True)

#### 4.2 Environment Definition in .yaml Format

Both scikit-learn and TensorFlow are included, so the environment can be used for any model supported by either library.

In [None]:
%%writefile {dependencies_dir}/conda_sk_tensor.yaml
name: model-env
channels:
- anaconda
- conda-forge
dependencies:
- python=3.8
- pip=21.3.1
- pandas~=1.3.0
- scipy~=1.7.0
- numpy~=1.22.0
- pip:
  - scikit-learn-intelex==2023.1.1
  - wheel~=0.38.1
  - matplotlib~=3.5.0
  - psutil~=5.8.0
  - tqdm~=4.62.0
  - ipykernel~=6.20.2
  - azureml-core==1.50.0
  - azureml-defaults==1.50.0
  - azureml-mlflow==1.50.0
  - azureml-telemetry==1.50.0
  - scikit-learn~=1.1.0
  - joblib~=1.2.0
  - debugpy~=1.6.3
  - tensorflow~=2.12.0

#### 4.3 Create and Register the Environment in the Workspace

Uses the environment definition .yaml file written previously.

In [None]:
from azure.ai.ml.entities import Environment

custom_env_name = "aml-scikit-learn_tensorflow"
custom_env = Environment(
    name=custom_env_name,
    description="Custom environment for Gas Emission Prediction with Sklearn and Tensorflow",
    tags={"scikit-learn": "1.1.0", "tensorflow": "2.12.0"},
    conda_file=os.path.join(dependencies_dir, "conda_sk_tensor.yaml"),
    image="mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.6-cudnn8-ubuntu20.04"
)
# Keep the returned object: it carries the version assigned by the workspace
custom_env = ml_client.environments.create_or_update(custom_env)

print(f"Environment with name {custom_env.name} is registered to workspace, the environment version is {custom_env.version}")

### 5. Create and Run Training Pipeline

#### 5.1 Create a Folder to Hold Component Codes

In [None]:
local_dir_Pipeline = "./src/Pipeline_components"
# makedirs also creates the parent "./src" folder if it is missing
os.makedirs(local_dir_Pipeline, exist_ok=True)

#### 5.2 Code for the Component to Combine Data and Create Train and Test Datasets
Train and test datasets are created by combining the data files contained in the input data folder "data_files". The datasets are built according to the instructions in the research paper, as described in Section 3 of this notebook. Train and test data are saved as .csv files in the folders given by "train_data" and "test_data" respectively. The test dataset is kept as the holdout dataset, which is used to obtain the final performance results for comparison.

In [None]:
%%writefile {local_dir_Pipeline}/combine_data.py
import os
import argparse
import pandas as pd
import mlflow

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_files", type=str, help="path to input data folder")
    parser.add_argument("--train_data", type=str, help="path to output train data folder")
    parser.add_argument("--test_data", type=str, help="path to output test data folder")
    args = parser.parse_args()

    df_train_2011 = pd.read_csv(os.path.join(args.data_files, "Train/gt_2011.csv"))
    df_train_2012 = pd.read_csv(os.path.join(args.data_files, "Train/gt_2012.csv"))
    df_train_2013 = pd.read_csv(os.path.join(args.data_files, "Train/gt_2013.csv"))
    
    df_train_full = pd.concat([df_train_2011, df_train_2012, df_train_2013])

    X_train_full = df_train_full.drop(["CO","NOX"],axis=1)
    y_train_full = df_train_full[["CO"]]

    X_train_full.to_csv(os.path.join(args.train_data,"train_X.csv"), index=False)
    y_train_full.to_csv(os.path.join(args.train_data,"train_Y.csv"), index=False)

    df_test_2014 = pd.read_csv(os.path.join(args.data_files, "Test/gt_2014.csv"))
    df_test_2015 = pd.read_csv(os.path.join(args.data_files, "Test/gt_2015.csv"))
    
    df_test_full = pd.concat([df_test_2014, df_test_2015])

    X_test_full = df_test_full.drop(["CO","NOX"],axis=1)
    y_test_full = df_test_full[["CO"]]

    X_test_full.to_csv(os.path.join(args.test_data,"test_X.csv"), index=False)
    y_test_full.to_csv(os.path.join(args.test_data,"test_Y.csv"), index=False)

if __name__ == "__main__":
    main()
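
To make the combine step concrete, the sketch below repeats it on two toy DataFrames standing in for yearly files. The values and the reduced column set are made up for illustration, and `ignore_index=True` is an addition not used in the component; it only renumbers the rows, which does not change the saved files because they are written with `index=False`.

```python
import pandas as pd

# Toy stand-ins for two yearly files; the real files have nine sensor columns
df_a = pd.DataFrame({"AT": [1.0, 2.0], "AP": [1010.0, 1012.0],
                     "CO": [0.3, 0.5], "NOX": [80.0, 82.0]})
df_b = pd.DataFrame({"AT": [3.0], "AP": [1015.0],
                     "CO": [0.9], "NOX": [75.0]})

# Stack the yearly frames, then separate the features from the CO target
df_full = pd.concat([df_a, df_b], ignore_index=True)
X = df_full.drop(columns=["CO", "NOX"])  # both emission columns are removed from the features
y = df_full[["CO"]]                      # only CO is predicted

print(list(X.columns))   # ['AT', 'AP']
print(X.shape, y.shape)  # (3, 2) (3, 1)
```

Dropping NOX together with CO keeps the second emission measurement out of the feature set, matching the component above.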

#### 5.3 Code for the Component to Train and Save the Model

The model consists of a scaler followed by a neural network with a single hidden layer. The model hyperparameters are the type of scaling, the number of hidden-layer neurons, the hidden-layer activation function, the learning rate, the number of epochs and the momentum. The training dataset is divided into a training set and a validation set. The validation RMSE and R2 score are logged using MLflow for hyperparameter tuning. The training RMSE and R2 score are also logged for bias-variance analysis when needed. The trained model pipeline is saved as an MLflow model.
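
For reference, the three scaling options transform each feature column roughly as sketched below. This is a simplified pure-Python illustration with hypothetical function names; it ignores edge cases such as constant columns, which the scikit-learn scalers handle.

```python
import statistics

def minmax_scale(values):
    """Map values linearly onto [0, 1], as MinMaxScaler does by default."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def maxabs_scale(values):
    """Divide by the largest absolute value, so results lie in [-1, 1]."""
    largest = max(abs(v) for v in values)
    return [v / largest for v in values]

print(minmax_scale([1.0, 2.0, 3.0]))   # [0.0, 0.5, 1.0]
print(maxabs_scale([-2.0, 1.0, 4.0]))  # [-0.5, 0.25, 1.0]
print(standard_scale([1.0, 2.0, 3.0]))
```

In the pipeline, the scaler is fitted on the training split only, so the validation and test data are transformed with statistics learned from training data.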

In [None]:
%%writefile {local_dir_Pipeline}/train_NNh1_sklearn.py
import os
import argparse
import pandas as pd
import numpy as np
import mlflow
import mlflow.sklearn
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler, MinMaxScaler, MaxAbsScaler
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
import joblib

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--random_state", type=int, required=False, default=0)
    parser.add_argument("--train_data", type=str, help="Path to training data")
    parser.add_argument("--test_train_ratio", type=float, required=False, default=0.25)
    parser.add_argument("--scaler_type", type=str, required=False, default="minmax")
    parser.add_argument("--hidden_layer_neurons", type=int, required=False, default=100)
    parser.add_argument("--hidden_layer_activation", type=str, required=False, default="relu")
    parser.add_argument("--learning_rate", type=float, required=False, default=10)
    parser.add_argument("--epochs", type=int, required=False, default=30)
    parser.add_argument("--momentum", type=float, required=False, default=0)
    parser.add_argument("--ouput_model_path", type=str, help="Path for the model")
    args = parser.parse_args()

    #Load training set into Dataframes
    X_full = pd.read_csv(os.path.join(args.train_data, "train_X.csv"))
    y_full = pd.read_csv(os.path.join(args.train_data, "train_Y.csv"))

    mlflow.start_run()
    mlflow.sklearn.autolog()

    #Split training data in to training and validation sets
    X_train, X_vali, y_train, y_vali =  train_test_split(X_full,y_full,test_size=args.test_train_ratio,random_state=args.random_state)
    
    #Holds the neural-network model from SkLearn
    model_sklearnNN = MLPRegressor(
        hidden_layer_sizes=args.hidden_layer_neurons, 
        activation=args.hidden_layer_activation, 
        solver="sgd", 
        learning_rate="adaptive", 
        learning_rate_init=args.learning_rate,  
        max_iter=args.epochs,
        momentum=args.momentum,
        random_state=args.random_state
        )

    # Choose the scaler from the requested scaling type. The final model is a
    # pipeline of the scaler followed by the neural network; any other value
    # (e.g. "none") trains the network on unscaled features.
    scalers = {
        "minmax": MinMaxScaler,
        "standard": StandardScaler,
        "maxabs": MaxAbsScaler,
    }
    if args.scaler_type in scalers:
        scaler_model_pipeline = Pipeline([
            ("scaler", scalers[args.scaler_type]()),
            ("nn_model", model_sklearnNN)
            ])
    else:
        scaler_model_pipeline = model_sklearnNN

    scaler_model_pipeline.fit(X_train, y_train)

    #Log training performance
    y_predict = scaler_model_pipeline.predict(X_train)
    mlflow.log_metric("Train RMSE",  np.sqrt(mean_squared_error(y_train,y_predict)))
    mlflow.log_metric("Train R2-score", r2_score(y_train,y_predict))

    #Log validation performance
    y_predict = scaler_model_pipeline.predict(X_vali)
    mlflow.log_metric("Validation RMSE",  np.sqrt(mean_squared_error(y_vali,y_predict)))
    mlflow.log_metric("Validation R2-score", r2_score(y_vali,y_predict))

    # Saving the model to a file
    mlflow.sklearn.save_model(
        sk_model=scaler_model_pipeline,
        path=args.ouput_model_path,
    )
    mlflow.end_run()

if __name__ == "__main__":
    main()
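
The two logged metrics can be written out by hand to show what is being optimized. The sketch below is a pure-Python illustration with made-up numbers; the component itself uses `mean_squared_error` and `r2_score` from scikit-learn.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 4.0]
print(rmse(y_true, y_pred))
print(r2(y_true, y_pred))  # 0.5
```

A lower RMSE and a higher R2 score both indicate a better fit; an R2 of 1 means perfect predictions, and 0 matches always predicting the mean.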

#### 5.4 Code for the Component to Test and Register the Model
The purpose of this component is to calculate the final test performance of the model on the holdout test set and to register the model using MLflow.

In [None]:
%%writefile {local_dir_Pipeline}/test_and_register_model_nnsklearn.py
import os
import argparse
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
import joblib
import mlflow
import mlflow.keras
import mlflow.sklearn

def main():

    parser = argparse.ArgumentParser()
    parser.add_argument("--test_data_folder", type=str, help="Path of the test data folder")
    parser.add_argument("--input_model_path", type=str, help="path of the input model")
    parser.add_argument("--registered_model_name", type=str, help="Name for the registered best model")
    args = parser.parse_args()

    X_test = pd.read_csv(os.path.join(args.test_data_folder, "test_X.csv"))
    y_test = pd.read_csv(os.path.join(args.test_data_folder, "test_Y.csv"))

    mlflow.start_run()
    
    loaded_best_model = mlflow.sklearn.load_model(args.input_model_path)

    print("Registering the best swept model via MLflow")
    mlflow.sklearn.log_model(
        sk_model=loaded_best_model,
        registered_model_name=args.registered_model_name,
        artifact_path=args.registered_model_name,
    )

    y_predict = loaded_best_model.predict(X_test)
    mlflow.log_metric("Test RMSE",  np.sqrt(mean_squared_error(y_test,y_predict)))
    mlflow.log_metric("Test R2-score", r2_score(y_test,y_predict))

    mlflow.end_run()


if __name__ == "__main__":
    main()    

#### 5.5 Create Pipeline Components from the Code Definitions Above

In [None]:
from azure.ai.ml import command
from azure.ai.ml import Input, Output

combine_data_component = command(
    name="combine_files_for_train_test",
    display_name="combine_files_for_train_test",
    description="Combine .csv files for training and testing separately",
    inputs={
        "data_files": Input(type="uri_folder"),
    },
    outputs=dict(
        train_data=Output(type="uri_folder", mode="rw_mount"),
        test_data=Output(type="uri_folder", mode="rw_mount"),
    ),
    
    code=local_dir_Pipeline,
    command="python combine_data.py \
            --data_files ${{inputs.data_files}} \
            --train_data ${{outputs.train_data}} --test_data ${{outputs.test_data}} \
            ",
    environment=f"{custom_env.name}:{custom_env.version}"
)
combine_data_component = ml_client.create_or_update(combine_data_component.component)

train_nnh1_component = command(
    name="train_nnh1_regressor",
    display_name="train_nnh1_regressor",
    inputs={
        "random_state": Input(type="number"),
        "train_data": Input(type="uri_folder"),
        "test_train_ratio": Input(type="number"),
        "scaler_type": Input(type="string"),
        "hidden_layer_neurons": Input(type="number"),
        "hidden_layer_activation": Input(type="string"),
        "learning_rate": Input(type="number"),
        "epochs": Input(type="number"),
        "momentum": Input(type="number")
    },
    outputs={
        "ouput_model_path": Output(type="mlflow_model")
    },
    
    code=local_dir_Pipeline,
    command="python train_NNh1_sklearn.py \
            --random_state ${{inputs.random_state}} \
            --train_data ${{inputs.train_data}} \
            --test_train_ratio ${{inputs.test_train_ratio}} \
            --scaler_type ${{inputs.scaler_type}}\
            --hidden_layer_neurons ${{inputs.hidden_layer_neurons}} \
            --hidden_layer_activation ${{inputs.hidden_layer_activation}} \
            --learning_rate ${{inputs.learning_rate}} \
            --epochs ${{inputs.epochs}} \
            --momentum ${{inputs.momentum}}\
            --ouput_model_path ${{outputs.ouput_model_path}} \
            ",
    environment=f"{custom_env.name}:{custom_env.version}"
)
train_nnh1_component = ml_client.create_or_update(train_nnh1_component.component)

test_and_register_model_component = command(
    name="test_model",
    display_name="test_model",
    inputs={
        "test_data_folder": Input(type="uri_folder"),
        "input_model_path": Input(type="mlflow_model"),
        "registered_model_name": Input(type="string")
    },
    
    code=local_dir_Pipeline,
    command="python test_and_register_model_nnsklearn.py \
            --test_data_folder ${{inputs.test_data_folder}} \
            --input_model_path ${{inputs.input_model_path}} \
            --registered_model_name ${{inputs.registered_model_name}} \
            ",
    environment=f"{custom_env.name}:{custom_env.version}"
)
test_and_register_model_component = ml_client.create_or_update(test_and_register_model_component.component)

#### 5.6 Define the Pipeline
Inputs for the pipeline are the following.
1. input_data_folder : Contains the Data .csv files
2. test_train_ratio : Used for the split within model training component
3. registered_model_name : To register the best model
4. random_state : Used within model training component for the train test split and the neural-network.

The pipeline contains a hyperparameter tuning step using "sweep". The best model from the sweep step is registered and tested by the "test_and_register_model_component" of the pipeline.

In [None]:
from azure.ai.ml import dsl
from azure.ai.ml.sweep import Choice
import numpy as np

@dsl.pipeline(
    compute=compute_cluster_name, # to use serverless compute, change this to: compute="serverless"
    description="Gas Emission Prediction Pipeline - NNh1",
)
def GassEmission_prediction_pipeline_nnh1_sklearn(
    pipeline_job_input_data_folder,
    pipeline_job_test_train_ratio,
    pipeline_job_registered_model_name,
    pipeline_job_random_state
):
    
    combine_data_job = combine_data_component(
        data_files=pipeline_job_input_data_folder
    )

    train_job = train_nnh1_component(
        random_state=pipeline_job_random_state,
        train_data=combine_data_job.outputs.train_data,
        test_train_ratio=pipeline_job_test_train_ratio,
        scaler_type=Choice(values=["minmax", "none"]),
        hidden_layer_neurons=50,
        hidden_layer_activation="relu",
        learning_rate=Choice(values=[0.1, 0.2]),
        epochs=Choice(values=[160, 240]),
        momentum=0
    )

    sweep_step = train_job.sweep(
        sampling_algorithm="grid",
        primary_metric="Validation R2-score",
        goal="maximize"
    )
    sweep_step.set_limits(max_total_trials=500, max_concurrent_trials=20, timeout=72000)

    test_and_register_model_job = test_and_register_model_component(
        test_data_folder=combine_data_job.outputs.test_data,
        registered_model_name=pipeline_job_registered_model_name,
        input_model_path=sweep_step.outputs.ouput_model_path
    )
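
Because grid sampling tries every combination of the `Choice` values, the number of trials in this sweep can be counted ahead of time. The sketch below enumerates the same search space with `itertools.product`; the dictionary `search_space` is an illustrative stand-in for the `Choice` arguments passed in the pipeline.

```python
from itertools import product

# Same values as the Choice(...) arguments in the pipeline definition
search_space = {
    "scaler_type": ["minmax", "none"],
    "learning_rate": [0.1, 0.2],
    "epochs": [160, 240],
}

names = list(search_space)
trials = [dict(zip(names, combo)) for combo in product(*search_space.values())]

print(len(trials))  # 8
print(trials[0])    # {'scaler_type': 'minmax', 'learning_rate': 0.1, 'epochs': 160}
```

With 2 x 2 x 2 = 8 combinations, the limit of 500 total trials is not binding here; it only becomes relevant when the search space is widened.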

#### 5.7 Create an Instance of the Pipeline and Run
Input data folder should be local to the compute running this Notebook.

In [None]:
registered_best_model_name = "pipeline_best_model_nnsklearn"

pipeline = GassEmission_prediction_pipeline_nnh1_sklearn(
    pipeline_job_random_state=0,
    pipeline_job_input_data_folder=Input(type="uri_folder", path="./Data/PipelineNN"),
    pipeline_job_test_train_ratio=0.2,
    pipeline_job_registered_model_name=registered_best_model_name
)

pipeline_job = ml_client.jobs.create_or_update(
    pipeline,
    experiment_name="train_model_GasEmission_prediction_Pipeline_nnh1_sklearn",
)
ml_client.jobs.stream(pipeline_job.name)

### 6. Create a Reference to the Trained Best Model
This reference is then used for the model deployment. Note that another training process running in parallel may register a model under the same name. To guard against this, the name of the job that created the best model is cross-checked against the pipeline job name.

In [None]:
from mlflow.tracking.client import MlflowClient

#Get the latest model with name given by registered_best_model_name
latest_model_version = max(
    [int(m.version) for m in ml_client.models.list(name=registered_best_model_name)]
)
registered_best_model = ml_client.models.get(name=registered_best_model_name,version=latest_model_version)

#Get a reference to the job that created the latest model with name given by registered_best_model_name
best_model_job = ml_client.jobs.get(name=registered_best_model.job_name)

#Get a reference to the MLflow run associated with the job that created the latest model
MLflow_client = MlflowClient()
mlflow_best_model_job = MLflow_client.get_run(best_model_job.name)

#Get a reference to its MLFlow parent run
mlflow_best_model_job_parent = MLflow_client.get_run(mlflow_best_model_job.data.tags["mlflow.parentRunId"])

#If the MLflow parent run of the job that created the latest model with the name given by registered_best_model_name is
#not the pipeline job, there is a version conflict: some other job has created a model with the same name
if (mlflow_best_model_job_parent.data.tags["mlflow.rootRunId"] != pipeline_job.name):
    print("Registered Best Model Version Conflict Detected! \nPipeline_job runID and the best model's associated job's parent job's runID are mismatched. The model may be updated by some other job")
    registered_best_model = None
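
The `int(m.version)` conversion in the cell above matters when versions are handled as strings, because strings compare character by character. The sketch below uses a hypothetical list of version strings to show the difference.

```python
# Hypothetical version strings, in the order a registry might list them
versions = ["1", "2", "9", "10"]

latest_numeric = max(int(v) for v in versions)  # compares numbers
latest_lexical = max(versions)                  # compares text, character by character

print(latest_numeric)  # 10
print(latest_lexical)  # 9
```

Once a model reaches version 10, a plain string comparison would keep returning version 9, so the numeric conversion is the safer choice.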

In [None]:
print("Registered Best Model Name: ", registered_best_model.name)
print("Registered Best Model Version: ", registered_best_model.version)
print("Registered Best Model Path: ", registered_best_model.path)
print("Registered Best Model ID: ", registered_best_model.id)

### 7. Model Deployment as a Managed Online Deployment

#### 7.1 Create the End-Point for Deployment

In [None]:
import uuid
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment

#End-point name has to be unique
endpoint_name = "gaspredict-nnsklearn-" + str(uuid.uuid4())[:8]

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="This is an online endpoint for CO gas emission prediction - nnsklearn",
    auth_mode="key",
    tags={
        "training_dataset": "sensor_data",
        "model_type": "sklearn.MLPRegressor",
    },
)

endpoint_result = ml_client.begin_create_or_update(endpoint).result()

print(f"Endpoint {endpoint_result.name} provisioning state: {endpoint_result.provisioning_state}")

#### 7.2 Create the Deployment inside the Endpoint

In [None]:
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=registered_best_model,
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

blue_deployment_results = ml_client.online_deployments.begin_create_or_update(
    blue_deployment
).result()

print(
    f"Deployment {blue_deployment_results.name} provisioning state: {blue_deployment_results.provisioning_state}"
)

#### 7.3 Test the Deployment

First a folder is created to hold the file containing the testing samples.

In [None]:
deploy_dir = "./deploy"
if not os.path.exists(deploy_dir):
    os.mkdir(deploy_dir)

A .json file containing the samples is written. The samples are taken from the hold-out test set.

In [None]:
%%writefile {deploy_dir}/sample-request.json
{
  "input_data": {
    "columns": ["AT", "AP", "AH", "AFDP", "GTEP", "TIT", "TAT", "TEY", "CDP"],
    "index": [0, 1],
    "data": [
            [1.9532,1020.1,84.985,2.5304,20.116,1048.7,544.92,116.27,10.799],
            [1.2191,1020.1,87.523,2.3937,18.584,1045.5,548.5,109.18,10.347]
        ]
  }
}
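
The same request body can also be built programmatically, which helps when testing with more than a couple of rows. This is a minimal sketch: the helper `build_request` is a hypothetical name, and the column order is copied from the sample file above.

```python
import json

# Feature columns in the order used by the sample request
COLUMNS = ["AT", "AP", "AH", "AFDP", "GTEP", "TIT", "TAT", "TEY", "CDP"]

def build_request(rows):
    """Wrap rows of sensor readings in the 'input_data' structure shown above."""
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"Expected {len(COLUMNS)} values per row, got {len(row)}")
    return {
        "input_data": {
            "columns": COLUMNS,
            "index": list(range(len(rows))),
            "data": rows,
        }
    }

rows = [
    [1.9532, 1020.1, 84.985, 2.5304, 20.116, 1048.7, 544.92, 116.27, 10.799],
    [1.2191, 1020.1, 87.523, 2.3937, 18.584, 1045.5, 548.5, 109.18, 10.347],
]
payload = build_request(rows)
print(json.dumps(payload, indent=2))
```

Writing `json.dumps(payload)` to a file gives a request file equivalent to the one created above.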

Invoke the End-point with the Sample File.

In [None]:
ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    request_file="./deploy/sample-request.json",
    deployment_name="blue",
)