
# Milestone 1: Advertising

**Business Context:**

A retail company, "Fashion Haven," operates multiple stores in different cities. The company invests in advertising campaigns to promote its latest collections through media sources such as TV, newspaper, and radio. It wants to understand the impact of each media source on its sales revenue to optimize its advertising strategy and improve overall business performance.

Currently, Fashion Haven lacks an effective method to accurately predict the sales revenue generated by its advertising efforts. As a result, it struggles to allocate its advertising budget optimally across media channels, leading to suboptimal returns on investment and inefficient resource allocation.

To address this business problem, Fashion Haven has collected historical data containing information on various advertising campaigns (TV, Newspaper, Radio) and their corresponding sales revenue across their different store locations. The goal is to build a robust predictive model that accurately estimates the sales revenue based on the media sources' advertising budgets, helping the company make data-driven decisions and drive business growth.


Dataset Description:

The data contains the different attributes of the advertising business. The detailed data dictionary is given below.

* TV: Expenditure on TV advertising
* Radio: Expenditure on radio advertising
* Newspaper: Expenditure on newspaper advertising
* Sales: Target column, the amount of sales

**Note:** I have used the class workbooks, my previous projects, and internet searches as examples for much of the code in this project.

### Upload the dataset on Blob Storage as Data Asset

In [None]:
# pip install azure-ai-ml

In [None]:
# pip show azure-ai-ml

In [None]:
# pip install --upgrade azure-ai-ml

In [None]:
# Handle to the workspace
from azure.ai.ml import MLClient

# Authentication package
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

In [None]:
# Get a handle to the workspace
ml_client = MLClient(
    credential=credential,
    subscription_id="4f8c6956-21ba-4424-aa60-2705417ce538", # Provide your own subscription ID
    resource_group_name="test-resource", # Provide your own resource group name
    workspace_name="demo-azureml",
)

**Observation:**
- Using the subscription_id, resource_group_name, and workspace_name from this Azure session/setup.

In [None]:
# Import the necessary modules
from azure.ai.ml.entities import AmlCompute

# Name assigned to the compute cluster
cpu_compute_target = "mls-oct24-cluster"

try:
    # Checking to see if the compute target already exists
    cpu_cluster = ml_client.compute.get(cpu_compute_target)
    print(
        f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
    )

except Exception:
    print("Creating a new cpu compute target...")

    # If not already created, creating the Azure ML compute object with the intended parameters
    cpu_cluster = AmlCompute(
        name=cpu_compute_target,
        # Azure ML Compute is the on-demand VM service
        type="amlcompute",
        # VM Family
        size="STANDARD_D2_V3",
        # Minimum running nodes when there is no job running
        min_instances=0,
        # Nodes in cluster
        max_instances=1,
        # Seconds a node stays idle after a job finishes before it is scaled down
        idle_time_before_scale_down=600,
        # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
        tier="Dedicated",
    )

    # Passing the object to MLClient's create_or_update method
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster).result()

print(
    f"AMLCompute with name {cpu_cluster.name} is created, the compute size is {cpu_cluster.size}"
)

You already have a cluster named mls-oct24-cluster, we'll reuse it as is.
AMLCompute with name mls-oct24-cluster is created, the compute size is STANDARD_D2_V3


**Observation:**
- The compute cluster already exists, so it is reused; its size is STANDARD_D2_V3.

### Create a processing script to perform the data preprocessing job

In [None]:
# Import the necessary modules
import os

# Set the name of the directory we want to create
src_dir = "./src"

# The os.makedirs() function creates a directory
# exist_ok=True means that the function will not raise an exception if the directory already exists
os.makedirs(src_dir, exist_ok=True)

**Observation:**
- Creating a folder for the .py files in this notebook.

In [None]:
%%writefile {src_dir}/pre_process.py

# Import the necessary modules
import os
import argparse

import pandas as pd
import mlflow
from sklearn.preprocessing import StandardScaler


def main():
    """Main function of the script."""

    # Input and output arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", type=str, help="path to input data")
    parser.add_argument("--output", type=str, help="path to output data")
    args = parser.parse_args()

    # Start logging
    mlflow.start_run()

    ###################
    # <prepare the data>
    ###################

    print("input data:", args.data)

    data = pd.read_csv(args.data)

    ###################
    # <processing>
    ###################

    # Collecting the numerical features (every column here, including the Sales target)
    numerical_columns = data.select_dtypes(include=['float64', 'int64']).columns

    # Apply data scaling to numerical columns
    scaler = StandardScaler()
    data[numerical_columns] = scaler.fit_transform(data[numerical_columns])

    # Exporting processed data to the job's output folder
    processed_data_path = os.path.join(args.output, 'advertising_sales_processed.csv')
    data.to_csv(processed_data_path, index=False)

    # Stop logging
    mlflow.end_run()


if __name__ == "__main__":
    main()

Overwriting ./src/pre_process.py


**Observation:**
- The pre_process.py file is created and stored in the ./src folder.
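
The scaling step in pre_process.py relies on `StandardScaler`, which centers each column on its mean and divides by its population standard deviation. The following is a minimal standard-library sketch of that arithmetic, not the scikit-learn implementation; the `standardize` helper is a hypothetical name, and the values are the TV figures from the sample request used later in this notebook.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return z-scores computed with the population standard deviation (ddof=0)."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant column carries no spread, so every z-score is zero
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# TV spend values taken from the sample request in this notebook
tv_spend = [17.8, 112.3, 74.7, 206.3, 228.7]
tv_scaled = standardize(tv_spend)

# After scaling, the column has mean 0 and population standard deviation 1
print(round(fmean(tv_scaled), 6), round(pstdev(tv_scaled), 6))
```

Because the script selects every numeric column, the Sales target is scaled along with the features, so a model trained on the processed file would predict sales in standardized units rather than in the original units.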

### Configure the processing job

In [None]:
# Import the necessary modules
from azure.ai.ml.entities import Data
from azure.ai.ml import command
from azure.ai.ml import Input, Output

# Define a new AML job using the `command` function
job = command(
    inputs=dict(
        data=Input(
            type="uri_file",
            path="./Data/Advertising_Sales.csv",
        ),
    ),
    outputs=dict(
        output=Output(
            type="uri_folder",
            # Creating the path with name for this processing job
            path="azureml://datastores/workspaceblobstore/paths/advertising_sales_processed_data",
        ),
    ),
    code="src/",  # Location of the source code
    command="python pre_process.py --data ${{inputs.data}} --output ${{outputs.output}}",
    # Specify the environment to be used for the job
    environment="mls-oct24-env@latest",
    # Specify the compute target to be used for the job
    compute="mls-oct24-cluster",
    # Specify the display name
    display_name="advertising_sales_processing",
    # Specify the experiment name
    experiment_name="advertising_sales_price_processing",
)

**Observation:**
- The processing job is configured with parameters for this workbook/milestone.

### Run the processing job

In [None]:
# Running the processing job
ml_client.create_or_update(job)

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
pathOnCompute is not a known attribute

Experiment,Name,Type,Status,Details Page
advertising_sales_price_processing,hungry_jewel_4rnydqkgnl,command,Starting,Link to Azure Machine Learning studio


**Observation:**
- The processing job is submitted, and a link to it in Azure Machine Learning studio is returned. The job did complete; the messages above are informational warnings about experimental SDK classes.

### Create a training script to perform the training job

In [None]:
# Import the necessary modules
import os

# Set the name of the directory we want to create
src_dir = "./src"

# The os.makedirs() function creates a directory
# exist_ok=True means that the function will not raise an exception if the directory already exists
os.makedirs(src_dir, exist_ok=True)

**Observation:**
- Creating a folder for the .py files in this notebook.  Although this was created earlier, I am putting another instance here just in case it is needed, if that part of the notebook was not run first.

In [None]:
%%writefile {src_dir}/main.py

# Import the necessary modules
import argparse

import mlflow
import mlflow.sklearn
import pandas as pd

from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor


def main():

    # Create an argument parser to take input arguments from the command line
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", type=str, help="path to train data")
    parser.add_argument("--n_estimators", required=False, default=100, type=int)
    parser.add_argument("--learning_rate", required=False, default=0.1, type=float)

    args = parser.parse_args()

    # Start of MLflow tracking
    mlflow.start_run()

    # Load input data
    df = pd.read_csv(args.data)

    # Defining the variables
    target = 'Sales'
    numeric_features = ['TV', 'Radio', 'Newspaper']
    categorical_features = []

    # Creating X and y from the variables
    X = df.drop([target], axis=1)
    y = df[target]

    # Splitting the train and test data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Defining the GradientBoostingRegressor model
    model_gbr = GradientBoostingRegressor(
        n_estimators=args.n_estimators,
        learning_rate=args.learning_rate
    )

    # Wrap the model in a pipeline and train it
    model_pipeline = make_pipeline(model_gbr)
    model_pipeline.fit(X_train, y_train)

    # Compute and log model RSquared results on the test split
    rsq = model_pipeline.score(X_test, y_test)
    mlflow.log_metric("RSquared", float(rsq))

    # Logging the model and registering it to the workspace
    print("Registering model pipeline")
    mlflow.sklearn.log_model(
        sk_model=model_pipeline,
        # Setting the name for the registered_model
        registered_model_name="advertising-sales-price-predictor",
        # Setting the name for the artifact_path
        artifact_path="advertising-sales-price-predictor"
    )

    # End of MLflow tracking
    mlflow.end_run()


if __name__ == '__main__':
    main()

Overwriting ./src/main.py


**Observation:**
- The main.py file is created and stored in the ./src folder.  The model chosen is GradientBoostingRegressor, a tree-based ensemble that suits regression problems such as predicting sales.  The primary metric chosen is RSquared (the coefficient of determination), which measures the proportion of the variance in Sales that the model explains on the test split.
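
The RSquared value that main.py logs comes from the regressor's `score` method, which returns the coefficient of determination. As a rough sketch of what that number means, the hypothetical helper below computes it by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sample values are made up purely for illustration and are not taken from the advertising data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error left over after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not from the dataset)
y_true = [3.0, 5.0, 7.0, 9.0]

perfect = r_squared(y_true, [3.0, 5.0, 7.0, 9.0])        # predictions match exactly
mean_only = r_squared(y_true, [6.0, 6.0, 6.0, 6.0])      # always predicts the mean
print(perfect, mean_only)
```

A score of 1 means the predictions match the targets exactly, while a score of 0 means the model does no better than always predicting the mean, which is why the sweep is set to maximize this metric.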

### Configure the training job

In [None]:
# Import the necessary modules
from azure.ai.ml import command
from azure.ai.ml import Input

# Define a new AML job using the `command` function
job = command(
    inputs={
        "data": Input(type="uri_file", path="./Data/Advertising_Sales.csv"),
        "n_estimators": 100,
        "learning_rate": 0.1
    },
    code="src/main.py", # Location of the source code
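    # Pass the hyperparameters through to main.py; otherwise the script always uses its argparse defaults
    command="python main.py --data ${{inputs.data}} --n_estimators ${{inputs.n_estimators}} --learning_rate ${{inputs.learning_rate}}",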
    # Specify the environment to be used for the job
    environment="mls-oct24-env@latest",
    # Specify the compute target to be used for the job
    compute="mls-oct24-cluster",
    # Specify the display name
    display_name="advertising_sales_training",
    # Specify the experiment name
    experiment_name="advertising_sales_price_training",
)

**Observation:**
- The training job is configured with parameters for this workbook/milestone.
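
main.py reads its hyperparameters from the command line, so a value listed under `inputs` only reaches the model if the command string also passes the matching flag. The sketch below reuses the same argparse definitions as main.py to show the difference; the `build_parser` wrapper and the example argument lists are assumptions added for illustration.

```python
import argparse


def build_parser():
    """Recreate the argument definitions used in main.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data", type=str, help="path to train data")
    parser.add_argument("--n_estimators", required=False, default=100, type=int)
    parser.add_argument("--learning_rate", required=False, default=0.1, type=float)
    return parser


parser = build_parser()

# Only --data on the command line: the script falls back to its defaults
defaults_only = parser.parse_args(["--data", "Advertising_Sales.csv"])

# Flags present on the command line: the supplied values are used
with_flags = parser.parse_args(
    ["--data", "Advertising_Sales.csv", "--n_estimators", "300", "--learning_rate", "0.05"]
)

print(defaults_only.n_estimators, defaults_only.learning_rate)
print(with_flags.n_estimators, with_flags.learning_rate)
```

This matters most for the hyperparameter sweep: if a flag is missing from the command, every trial trains with the default value regardless of what the sweep samples.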

### Run the training job

In [None]:
# ml_client.create_or_update will create a new job if it does not exist or update the existing job if it does
ml_client.create_or_update(job)

Experiment,Name,Type,Status,Details Page
advertising_sales_price_training,icy_beach_7x462s6xvf,command,Starting,Link to Azure Machine Learning studio


**Observation:**
- The training job is submitted, and a link to it in Azure Machine Learning studio is returned.

### Define the parameter space for hyperparameter tuning

In [None]:
# Import the necessary modules
from azure.ai.ml.sweep import Choice

# Reusing the command_job created before
job_for_sweep = job(
    n_estimators=Choice(values=[100, 200, 300, 400]),
    learning_rate=Choice(values=[0.001, 0.005, 0.05, 0.1])
)

**Observation:**
- Setting the hyperparameters for the sweep job, trying different n_estimators and different learning_rate.
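
To get a feel for the size of this search space, the two `Choice` lists can be crossed with `itertools.product`. This is only a sketch that enumerates the combinations locally; it does not call Azure ML, and the variable names are chosen for illustration.

```python
from itertools import product

# The same candidate values passed to Choice(...) above
n_estimators_choices = [100, 200, 300, 400]
learning_rate_choices = [0.001, 0.005, 0.05, 0.1]

# Every (n_estimators, learning_rate) pair the sweep could sample
grid = [
    {"n_estimators": n, "learning_rate": lr}
    for n, lr in product(n_estimators_choices, learning_rate_choices)
]

print(len(grid))
print(grid[0])
```

The space holds 4 x 4 = 16 distinct combinations, the same number as the `max_total_trials=16` cap used when the sweep is configured. A sampling algorithm is not guaranteed to visit each combination exactly once, so the cap should not be read as an exhaustive grid search.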

### Configure the sweep job for tuning

In [None]:
# compute specifies the compute target where the sweep job will run.
# sampling_algorithm specifies the search algorithm to use for hyperparameter tuning.
# primary_metric specifies the metric to optimize during hyperparameter tuning.
# goal specifies whether to maximize or minimize the primary metric.
# max_total_trials specifies the maximum number of trials to run during hyperparameter tuning.
# max_concurrent_trials specifies the maximum number of trials to run concurrently during hyperparameter tuning.

sweep_job = job_for_sweep.sweep(
    compute="mls-oct24-cluster",
    sampling_algorithm="bayesian",
    primary_metric="RSquared",
    goal="Maximize",
    max_total_trials=16,
    max_concurrent_trials=3
)

**Observation:**
- Configuring the sweep job for tuning.  The sampling algorithm chosen is Bayesian, which picks each new hyperparameter combination based on how the earlier trials performed.  The primary metric chosen is RSquared, which measures the proportion of the variance in Sales that the model explains, and the goal is to maximize it.  The sweep runs at most 16 trials, with up to 3 running concurrently.

In [None]:
# Creating the names for the sweep job
sweep_job.experiment_name = "advertising_sales_price_tuning"
sweep_job.display_name = "advertising_sales_tuning"
sweep_job.description = "Run a hyperparameter sweep job for GBR"

### Run the sweep job

In [None]:
# Create or update the sweep job
returned_sweep_job = ml_client.create_or_update(sweep_job)

# Stream the output and wait until the job is finished
ml_client.jobs.stream(returned_sweep_job.name)

# Refresh the latest status of the job after streaming
returned_sweep_job = ml_client.jobs.get(name=returned_sweep_job.name)

RunId: olden_tongue_273yy7s51p
Web View: https://ml.azure.com/runs/olden_tongue_273yy7s51p?wsid=/subscriptions/4f8c6956-21ba-4424-aa60-2705417ce538/resourcegroups/test-resource/workspaces/demo-azureml

Streaming azureml-logs/hyperdrive.txt

[2024-11-10T17:25:57.0418840Z][GENERATOR][DEBUG]Sampled 3 jobs from search space 
[2024-11-10T17:25:57.3183497Z][SCHEDULER][INFO]Scheduling job, id='olden_tongue_273yy7s51p_0' 
[2024-11-10T17:25:57.4160730Z][SCHEDULER][INFO]Scheduling job, id='olden_tongue_273yy7s51p_1' 
[2024-11-10T17:25:57.4172441Z][SCHEDULER][INFO]Scheduling job, id='olden_tongue_273yy7s51p_2' 
[2024-11-10T17:25:57.9572334Z][SCHEDULER][INFO]Successfully scheduled a job. Id='olden_tongue_273yy7s51p_2' 
[2024-11-10T17:25:57.9000683Z][SCHEDULER][INFO]Successfully scheduled a job. Id='olden_tongue_273yy7s51p_1' 
[2024-11-10T17:25:57.9485385Z][SCHEDULER][INFO]Successfully scheduled a job. Id='olden_tongue_273yy7s51p_0' 
[2024-11-10T17:27:00.6242056Z][GENERATOR][DEBUG]Sampled 1 jobs fr

**Observation:**
- The sweep job is submitted and its log is streamed until it finishes.  A link to the job in Azure Machine Learning studio is also printed.

### Extract the run that gave best modeling results

In [None]:
# Import the necessary modules
from azure.ai.ml.entities import Model

if returned_sweep_job.status == "Completed":

    # Collecting the run which gave the best result
    best_run = returned_sweep_job.properties["best_child_run_id"]

    # Collecting the model from this run
    model = Model(
        # main.py logs the model under the artifact path "advertising-sales-price-predictor"
        path="azureml://jobs/{}/outputs/artifacts/paths/advertising-sales-price-predictor/".format(
            best_run
        ),
        name="advertising_sales_best_model",
        description="Model created from advertising sales",
        type="custom_model",
    )

else:
    print(
        "Sweep job status: {}. Please wait until it completes".format(
            returned_sweep_job.status
        )
    )

**Observation:**
- Extracting the sweep run that had the best results.  The best model had RSquared: 0.98409, n_estimators: 100, and learning_rate: 0.005.

### Register the best model

In [None]:
# Registering the best model for the run
registered_model = ml_client.models.create_or_update(model=model)

### Configure an Endpoint

In [None]:
# Import the necessary modules
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.ai.ml.constants import AssetTypes

In [None]:
# Importing the required modules
import random
import string

# Creating a unique endpoint name by including a random suffix

# Defining a list of allowed characters for the endpoint suffix
allowed_chars = string.ascii_lowercase + string.digits

# Generating a random 5-character suffix for the endpoint name by choosing
# characters randomly from the list of allowed characters
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))

# Creating the final endpoint name by concatenating a prefix string
# with the generated suffix string
endpoint_name = "advertising-sales-endpoint-" + endpoint_suffix

**Observation:**
- Creating an endpoint name that is unique and appends a randomly chosen 5 characters at the end of the name.
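
The naming logic above can also be wrapped in a small function, which makes it easy to reuse and to reproduce a name when debugging. This is a sketch only: `make_endpoint_name` and its `rng` parameter are hypothetical additions, not part of the Azure SDK.

```python
import random
import string

# Lowercase letters and digits, as in the cell above
ALLOWED_CHARS = string.ascii_lowercase + string.digits


def make_endpoint_name(prefix="advertising-sales-endpoint-", suffix_len=5, rng=None):
    """Return prefix plus a random suffix drawn from ALLOWED_CHARS."""
    if rng is None:
        rng = random.Random()
    suffix = "".join(rng.choice(ALLOWED_CHARS) for _ in range(suffix_len))
    return prefix + suffix


# Passing a seeded generator makes the name reproducible between runs
name = make_endpoint_name(rng=random.Random(0))
print(name, len(name))
```

Calling `make_endpoint_name()` without a seeded generator gives a fresh suffix each time, which is the behavior the notebook relies on to avoid name clashes.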

In [None]:
# Printing the endpoint name
print(f"Endpoint name: {endpoint_name}")

Endpoint name: advertising-sales-endpoint-22vpm


**Observation:**
- Printing the name of the endpoint to show the unique name with the randomly chosen extra 5 characters.

In [None]:
# Configuring the endpoint
endpoint = ManagedOnlineEndpoint(
    # Name of the endpoint, should be unique within the Azure region
    name=endpoint_name,
    # A string describing the purpose of the endpoint
    description="An online endpoint serving an MLflow model for the advertising sales classification task",
    # Authentication mode to use for the endpoint (in this case, using an API key)
    auth_mode="key",
    # A dictionary of key-value pairs that can be used to tag the endpoint
    tags={"foo": "bar"},
)

**Observation:**
- Configuring the endpoint with a description, auth_mode, and tags.

### Create an Endpoint

In [None]:
# Creating the endpoint
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://advertising-sales-endpoint-22vpm.eastus2.inference.ml.azure.com/score', 'openapi_uri': 'https://advertising-sales-endpoint-22vpm.eastus2.inference.ml.azure.com/swagger.json', 'name': 'advertising-sales-endpoint-22vpm', 'description': 'An online endpoint serving an MLflow model for the advertising sales classification task', 'tags': {'foo': 'bar'}, 'properties': {'createdBy': 'Sarah Garrett', 'createdAt': '2024-11-10T17:39:29.719279+0000', 'lastModifiedAt': '2024-11-10T17:39:29.719279+0000', 'azureml.onlineendpointid': '/subscriptions/4f8c6956-21ba-4424-aa60-2705417ce538/resourcegroups/test-resource/providers/microsoft.machinelearningservices/workspaces/demo-azureml/onlineendpoints/advertising-sales-endpoint-22vpm', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/4f8c6956-21ba-4424-aa60-2705417ce538/providers/Microsoft.MachineLearningServices/locati

**Observation:**
- Creating the endpoint.  Information about the endpoint creation is shown above for further details.

### Create a deployment script to perform model deployment

In [None]:
%%writefile {src_dir}/score.py

# Import necessary libraries and modules
import logging
import os
import json
import mlflow
from io import StringIO
from mlflow.pyfunc.scoring_server import infer_and_parse_json_input, predictions_to_json

######################LOGGER#####################
# Set up Azure logging
from opencensus.ext.azure.log_exporter import AzureLogHandler

# Connect to Application Insights and set logging level to INFO
application_insights_connection_string = 'InstrumentationKey=e040f1cb-4a7a-4009-87be-7861c813da51;IngestionEndpoint=https://eastus-8.in.applicationinsights.azure.com/;LiveEndpoint=https://eastus.livediagnostics.monitor.azure.com/'
handler = AzureLogHandler(
    connection_string=application_insights_connection_string
)
logger = logging.getLogger()
logger.addHandler(handler)
logger.setLevel(logging.INFO)

####################################################

# Define the init() function to load the MLflow model
def init():
    global model
    global input_schema
    # The model files sit in the "advertising-sales-price-predictor" folder inside
    # AZUREML_MODEL_DIR, matching the artifact_path used when main.py logged the model.
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "advertising-sales-price-predictor")
    model = mlflow.pyfunc.load_model(model_path)
    input_schema = model.metadata.get_input_schema()

# Define the run() function to make predictions using the loaded model
def run(raw_data):
    # Parse input data
    json_data = json.loads(raw_data)
    if "input_data" not in json_data.keys():
        raise Exception("Request must contain a top level key named 'input_data'")
    serving_input = json.dumps(json_data["input_data"])
    data = infer_and_parse_json_input(serving_input, input_schema)
    # Make predictions
    predictions = model.predict(data)

    # Log the input data and predictions to Azure
    logger.info("Data:{0},Predictions:{1}".format(str(data),str(predictions)))

    # Convert predictions to JSON format and return
    result = StringIO()
    predictions_to_json(predictions, result)
    return result.getvalue()

Writing ./src/score.py


**Observation:**
- The score.py file is created and stored in the ./src folder.

In [None]:
# Create a new deployment with name "blue"
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    # Use the previously generated endpoint name
    endpoint_name=endpoint_name,
    # Use the registered model
    model=registered_model,
    # Use the latest environment named "mls-oct24-env@latest"
    environment="mls-oct24-env@latest",
    # Use the code in the "./src" directory and the "score.py" script
    code_configuration=CodeConfiguration(
        code="./src", scoring_script="score.py"
    ),
    # Use a single instance of type "Standard_E2s_v3"
    instance_type="Standard_E2s_v3",
    instance_count=1,
    # Enable Application Insights for the deployment
    app_insights_enabled=True,
)

**Observation:**
- Creating a new deployment named blue with parameters.

In [None]:
# Create or update the blue_deployment
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

Check: endpoint advertising-sales-endpoint-22vpm exists
Uploading src (0.01 MBs): 100%|██████████| 5394/5394 [00:00<00:00, 57532.20it/s]



.........................................................................

ManagedOnlineDeployment({'private_network_connection': None, 'package_model': False, 'provisioning_state': 'Succeeded', 'endpoint_name': 'advertising-sales-endpoint-22vpm', 'type': 'Managed', 'name': 'blue', 'description': None, 'tags': {}, 'properties': {'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/4f8c6956-21ba-4424-aa60-2705417ce538/providers/Microsoft.MachineLearningServices/locations/eastus2/mfeOperationsStatus/odidp:9f5627da-e74d-4fdb-a9a3-135424c647ab:e4596965-ac91-42a7-a8cf-79b86ce0531b?api-version=2023-04-01-preview'}, 'print_as_yaml': False, 'id': '/subscriptions/4f8c6956-21ba-4424-aa60-2705417ce538/resourceGroups/test-resource/providers/Microsoft.MachineLearningServices/workspaces/demo-azureml/onlineEndpoints/advertising-sales-endpoint-22vpm/deployments/blue', 'Resource__source_path': '', 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/compute12/code/Users/scharon.garrett', 'creation_context': <azure.ai.ml._restclient.v2023_04_01_previe

**Observation:**
- Creating or updating the blue deployment.  Information about this deployment is shown above for further details.

### Test the deployment

In [None]:
# Test the deployment by invoking the endpoint with a sample request
ml_client.online_endpoints.invoke(
    #Name of the endpoint
    endpoint_name=endpoint_name,
    #Name of the specific deployment to test in an endpoint
    deployment_name="blue",
    #File with request data for testing
    request_file="sample-request-sklearn.json",
)

'"{\\"predictions\\": [7.123222905473065, 14.593084394064093, 13.71235667273795, 19.53728678525383, 12.288216842394489]}"'

**Observation:**
- Testing the blue deployment with a sample JSON request.  I created the JSON file myself since one has not been provided yet.  These are its contents:

{"input_data": {
  "dataframe_split": {
    "columns": [
      "TV",
      "Radio",
      "Newspaper"
    ],
    "data": [
      [17.8, 19.6, 23.5],
      [112.3, 29.3, 49.6],
      [74.7, 41.5, 35.2],
      [206.3, 33.4, 1.8],
      [228.7, 5.2, 18.3]
    ]
  }
}
}
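
A request file in this shape can also be generated from Python, and the endpoint's reply can be decoded the same way. The sketch below uses only the standard `json` module; `build_request` and `decode_response` are hypothetical helper names, and the rows and predictions are the first two from the request and output shown above. The reply printed by `invoke` is a JSON string that itself contains JSON, so it is decoded twice.

```python
import json


def build_request(columns, rows):
    """Build the {"input_data": {"dataframe_split": ...}} payload that score.py expects."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs one value per column")
    return {
        "input_data": {
            "dataframe_split": {
                "columns": list(columns),
                "data": [list(row) for row in rows],
            }
        }
    }


def decode_response(raw):
    """Return the list of predictions from the endpoint's raw reply."""
    decoded = json.loads(raw)
    if isinstance(decoded, str):
        # The reply is a JSON-encoded string wrapping the actual JSON document
        decoded = json.loads(decoded)
    return decoded["predictions"]


payload = build_request(
    ["TV", "Radio", "Newspaper"],
    [[17.8, 19.6, 23.5], [112.3, 29.3, 49.6]],
)
request_body = json.dumps(payload, indent=2)

# First two predictions from the output above, in the same doubly encoded form
raw_reply = '"{\\"predictions\\": [7.123222905473065, 14.593084394064093]}"'
predictions = decode_response(raw_reply)
print(predictions)
```

Writing `request_body` to a file named `sample-request-sklearn.json` would produce an input that can be passed as `request_file` to `invoke`.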

### Delete the Endpoint

**Important!** An endpoint is a live node that is always running, ready to process requests and return predictions. Unless you are making real-time predictions on streaming data, delete your endpoints after use.

In [None]:
# Deleting the endpoint
ml_client.online_endpoints.begin_delete(name=endpoint_name)

<azure.core.polling._poller.LROPoller at 0x7fa3952e41f0>

..........

**Observation:**
- Deleting the endpoint so that no live endpoint is left running.

### **Conclusions**

- The data provided was complete and had 200 rows.  All of the data was numerical, which made the problem and its solution more straightforward.
- The model chosen is GradientBoostingRegressor, which suits regression problems such as predicting sales.  The primary metric chosen is RSquared, which measures the proportion of the variance in Sales that the model explains.  The sampling algorithm chosen for the sweep is Bayesian, which picks each new hyperparameter combination based on how the earlier trials performed.
- The jobs took longer to run than the rest of the code, and the sweep took the longest.  The sweep made it possible to observe the results in real time, spot trends, and find the best run.  I liked being able to choose the best run and use it going forward with the deployment.
- The best model had RSquared: 0.98409, n_estimators: 100, and learning_rate: 0.005, which was used for the deployment.
- Even though the JSON request file for this project was not provided, it was simple to create one from the sample in module 3.
- This was a good start in learning about the different types of storage and how they are billed; the same goes for live endpoints.