This notebook is adapted from a [Microsoft Azure Machine Learning Notebooks GitHub repository](https://github.com/katiehouse3/microsoft-azure-ml-notebooks/tree/master/commercial-blocks-classification) and has been modified to demonstrate posting run/job metrics to a Microsoft Teams channel and registering the run/job as a model from Teams.

# Classifying Commercial Blocks with Microsoft Azure

---

## Contents
1. [Introduction](#Introduction)
1. [Setup](#Setup)
    1. [Accessing the Azure Workspace](#Accessing-the-Azure-Workspace)
    1. [Configuring the Azure Workspace](#Configuring-the-Azure-Workspace)
    1. [Importing Libraries](#Importing-Libraries)
    1. [Creating an Experiment](#Creating-an-Experiment)
1. [Data](#Data)
1. [Classifying with Scikit Learn](#Classifying-with-SciKit-Learn)
1. [Viewing Metrics on Microsoft Teams](#Azure-KPI-Metrics)

---

## Introduction

**The Task:** Classify commercial blocks from TV news segments (+1 = commercial, -1 = non-commercial).

This classification model uses a dataset of broadcast data to decide whether a given television segment is a commercial. The dataset comes from the UCI Machine Learning Repository and has over 120,000 instances and 4,125 features. For more information about the features in this dataset, see [the UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/datasets/tv+news+channel+commercial+detection+dataset).

**The Method:** This experiment will compare training a classification model with traditional machine learning models (KNN, Random Forest, Neural Networks) to training it with Microsoft Azure Automated Machine Learning. 

<br>

---

## Setup

### Accessing the Azure Workspace
To configure a [Microsoft Azure workspace](https://learn.microsoft.com/en-us/azure/machine-learning/quickstart-create-resources?view=azureml-api-2), you must [set up an Azure subscription](https://azure.microsoft.com/en-us/free/) and manage the subscription from the [Azure portal](https://portal.azure.com/).

In [1]:
# Load your workspace
from azureml.core import Workspace

ws = Workspace.from_config()
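
`Workspace.from_config()` reads the workspace details from a `config.json` file, which carries the keys `subscription_id`, `resource_group`, and `workspace_name`. As a minimal sketch, the following standard-library check reports which of those keys are missing or empty before you try to connect. The helper name `check_workspace_config` is hypothetical and not part of the Azure SDK, and the placeholder values are for illustration only.

```python
import json

# Keys that Workspace.from_config() expects to find in config.json.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_workspace_config(config_text):
    """Return the required keys that are missing or empty in a config.json string."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Placeholder config with an empty workspace name (illustrative values only).
example = json.dumps({
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "",
})
print(check_workspace_config(example))
```

An empty list means all three keys are present, so `Workspace.from_config()` has what it needs to locate the workspace.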

### Configuring the Azure Workspace
Before configuring your workspace, be sure to have the following installed:
```
$ pip install -U scikit-learn
$ pip install azureml-core
```

### Importing Libraries

In [4]:
import azureml.core
import logging
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace

### Creating an Experiment
The following cell creates a new experiment in the Azure Machine Learning workspace (or retrieves it if an experiment with that name already exists). Experiments track key metrics of each model run, such as accuracy, precision, and F1 score.

In [5]:
# Choose a name for the experiment
experiment_name = 'azure-kpi-metrics'   # DEFAULT VALUE: modify if needed

experiment = Experiment(ws, experiment_name)

---

## Data
The data comes from the UCI Machine Learning Repository, with over 120,000 instances and 4,125 features, and is used to [classify commercial blocks](http://archive.ics.uci.edu/ml/datasets/tv+news+channel+commercial+detection+dataset).

Data Citation:
Dr. Prithwijit Guha, Raghvendra D. Kannao and Ravishankar Soni 
Multimedia Analytics Lab, 
Department of Electrical and Electronics Engineering, 
Indian Institute of Technology, Guwahati, India 
rdkannao@gmail.com, prithwijit.guha@gmail.com

These functions load the data as a feature matrix and a label vector for use with the scikit-learn models in the `train.py` script.

In [6]:
# Data Upload Functions
from sklearn.datasets import load_svmlight_file

def get_data(filepath):
    data = load_svmlight_file(filepath)
    return data[0], data[1]


# Import Data
print("\nImporting Data...")

X_train, y_train = get_data("./data/train_data.txt")
X_test, y_test = get_data("./data/test_data.txt")

X_train = X_train.toarray()  # convert sparse matrix to a dense array
X_test = X_test.toarray()
print("Data imported.")

print("\nTraining data has %i rows" % len(X_train))
print("Testing data has %i rows" % len(X_test))



Importing Data...
Data imported.

Training data has 13500 rows
Testing data has 5750 rows
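
`load_svmlight_file` expects the svmlight/libsvm text format, in which each line holds a label followed by sparse `index:value` pairs. To illustrate that layout only, the sketch below parses two made-up lines in plain Python. It is a simplified stand-in (no comment or query-id handling), not scikit-learn's parser, and the sample values are invented.

```python
def parse_svmlight_line(line):
    """Split one 'label index:value ...' line into (label, {index: value})."""
    label_text, *pairs = line.split()
    features = {}
    for pair in pairs:
        index_text, value_text = pair.split(":")
        features[int(index_text)] = float(value_text)
    return float(label_text), features


sample_lines = [
    "1 1:123 2:1.316440 4:3.790000",  # +1 = commercial
    "-1 1:29 3:0.049508",             # -1 = non-commercial
]

for line in sample_lines:
    label, features = parse_svmlight_line(line)
    print(label, features)
```

Feature indices that do not appear on a line are implicitly zero, which is why the loader returns a sparse matrix that the notebook then converts with `toarray()`.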


## Classifying with SciKit Learn

### Managing Dependencies
This is a local run of the classification model, so you must ensure that all the necessary packages are available in the Python environment where the training script runs.

In [7]:
from azureml.core.runconfig import RunConfiguration

# Edit a run configuration property on the fly.
run_config_user_managed = RunConfiguration()

run_config_user_managed.environment.python.user_managed_dependencies = True
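
Because the dependencies are user-managed, it can help to confirm up front that the modules `train.py` imports are available locally. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the list of module names is taken from the imports shown in this notebook.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises instead of returning None in some cases,
            # e.g. when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names used by train.py in this notebook.
needed = ["sklearn", "numpy", "azureml.core"]
print("Missing:", missing_packages(needed) or "none")
```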

### Train with `train.py` Script and Log Metrics
With Azure, you can train with almost any local `train.py` script. The training script in this scenario implements scikit-learn models using the standard scikit-learn libraries. It was taken from a homework assignment that did not originally use Microsoft Azure. Azure Machine Learning lets you run a standalone `train.py` script in an experiment without any alterations. However, logging calls were added to the original script to track model accuracy and other important metrics. The following lines were added to the original `train.py` script:

**Added to the library imports:**
```
from azureml.core.run import Run
```
**Added after the model was trained:**

This code gets the run context, which acts as the logger for your experiment.
```
run_logger = Run.get_context()
```

**Added to log the variables to track:**
```
run_logger.log(name='Model', value=model_name)
run_logger.log(name='Accuracy', value=accuracy)
```


For more documentation about logging variables, [navigate here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-track-experiments#viewing-charts-in-run-details).
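
The snippets above log `Model` and `Accuracy`, and the same `log(name=..., value=...)` call can record the remaining metrics. The sketch below wraps that call in a hypothetical helper, `log_metrics`, and exercises it with a stand-in `FakeRun` class invented here so the example runs without an Azure workspace. The metric values are made up. In `train.py` you would pass the object returned by `Run.get_context()` instead.

```python
class FakeRun:
    """Stand-in for an Azure ML run; it only records what was logged."""

    def __init__(self):
        self.logged = []

    def log(self, name, value):
        self.logged.append((name, value))


def log_metrics(run_logger, metrics):
    """Log every name/value pair through the run's log() method."""
    for name, value in metrics.items():
        run_logger.log(name=name, value=value)


run_logger = FakeRun()  # in train.py: run_logger = Run.get_context()
log_metrics(run_logger, {"Model": "RandomForest", "Accuracy": 0.91, "F1 Score": 0.9})
print(run_logger.logged)
```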

### Read Local Training Script
This local training script (with added logging capabilities) iterates through three Scikit Learn models, `RandomForestClassifier`, `MLPClassifier`, `KNeighborsClassifier`, with default parameters and logs the test accuracy, precision and F1 Score of each model.
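
For reference, accuracy, precision, and F1 can all be derived from the counts of true and false positives and negatives. The sketch below does this in plain Python for the ±1 labels used here, treating +1 (commercial) as the positive class. The labels and predictions are made up, and the function name `classification_metrics` is hypothetical. Note that `train.py` itself calls scikit-learn's `average_precision_score` for its precision figure, which is a different quantity from the plain precision computed here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, f1


# Made-up labels: 3 true positives, 2 false positives, 1 false negative.
y_true = [1, 1, 1, -1, -1, -1, 1, -1]
y_pred = [1, 1, -1, -1, 1, 1, 1, -1]
accuracy, precision, f1 = classification_metrics(y_true, y_pred)
print(round(accuracy, 2), round(precision, 2), round(f1, 2))
```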

If you change the model type (`RandomForestClassifier`, `MLPClassifier`, `KNeighborsClassifier`) in the `train.py` file after the initial run, you only need to rerun the cells below.

In [8]:
with open('train.py', 'r') as f:
    print(f.read())

# IMPORT LIBRARIES
from sklearn.datasets import load_svmlight_file
import numpy as np
import time
from sklearn.metrics import f1_score
from sklearn.metrics import accuracy_score
from sklearn.metrics import average_precision_score
import csv
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
# IMPORT AZURE LIBRARY
from azureml.core.run import Run


# DEFINE FUNCTIONS
def get_data(filepath):
    data = load_svmlight_file(filepath)
    return data[0], data[1]


def Classifier_Test_Train(model, model_name):
    # Training Model...
    start_time = time.time()  # track train time
    model.fit(X_train, y_train)
    train_t = round(time.time() - start_time, 2)

    # Testing Model...
    predictions = model.predict(X_test)
    accuracy = round(accuracy_score(y_test, predictions),2)
    f1score = round(f1_score(y_test, predictions), 2)
    precision = round(average_precision_score(y_test, pred

### Submit a Local Run of the Training Script
Passing the `script='train.py'` argument to `ScriptRunConfig()` runs your local training script in your experiment.

Edit `train.py` if you want to change the classifier model and get different metric values.

In [9]:
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='./', script='train.py', run_config=run_config_user_managed)
run = experiment.submit(src)

### View your Evaluation Metrics in Azure
Azure automatically visualizes your logged metrics. This experiment logs `Model Name`, `Accuracy`, `Precision` and `F1 score` for the model chosen from the Random Forest, neural network, and K-Nearest Neighbors models in scikit-learn.

Evaluate `run` by itself to get a link to the run. **Click `Link to Azure Machine Learning studio` below** to view your visualizations.

In [10]:
run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| azure-kpi-metrics | azure-kpi-metrics_1689745634_ffe6502e | azureml.scriptrun | Starting | Link to Azure Machine Learning studio | Link to Documentation |


In [11]:
run.wait_for_completion()

{'runId': 'azure-kpi-metrics_1689745634_ffe6502e',
 'target': 'local',
 'status': 'Finalizing',
 'startTimeUtc': '2023-07-19T05:47:17.669972Z',
 'services': {},
 'properties': {'_azureml.ComputeTargetType': 'local',
  'ContentSnapshotId': '7f0a91c7-b3e2-45fe-9935-d66674c6e186'},
 'inputDatasets': [],
 'outputDatasets': [],
 'runDefinition': {'script': 'train.py',
  'command': '',
  'useAbsolutePath': False,
  'arguments': [],
  'sourceDirectoryDataStore': None,
  'framework': 'Python',
  'communicator': 'None',
  'target': 'local',
  'dataReferences': {},
  'data': {},
  'outputData': {},
  'datacaches': [],
  'jobName': None,
  'maxRunDurationSeconds': None,
  'nodeCount': 1,
  'instanceTypes': [],
  'priority': None,
  'credentialPassthrough': False,
  'identity': None,
  'environment': {'name': 'default-environment',
   'version': 'Autosave_2023-07-18T05:29:45Z_8c766d72',
   'assetId': 'azureml://locations/eastus/workspaces/fa386377-f84a-480e-af3e-d558bb8884df/environments/default-e
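
Once the run has completed, the logged values can also be read back in the notebook, since `run.get_metrics()` returns them as a dictionary. The sketch below formats such a dictionary for display. The helper name `format_metrics` and the sample values are made up for illustration.

```python
def format_metrics(metrics):
    """Render a metrics dictionary as aligned 'name : value' lines."""
    width = max((len(name) for name in metrics), default=0)
    return "\n".join(f"{name.ljust(width)} : {value}" for name, value in metrics.items())


# Made-up values; in the notebook use: sample_metrics = run.get_metrics()
sample_metrics = {"Model": "RandomForest", "Accuracy": 0.91, "F1 Score": 0.9}
print(format_metrics(sample_metrics))
```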

## Azure KPI Metrics
Post the metrics of the current and previous runs to a Microsoft Teams channel using the `azure_kpi_metrics` function imported from the `azure_kpi_metrics.py` file.

##### Install the required Python libraries

In [12]:
pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


##### Import the `azure_kpi_metrics` function from `azure_kpi_metrics.py` file.

In [13]:
from azure_kpi_metrics import azure_kpi_metrics

##### Set Incoming Webhook URL and Experiment Name

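Change `exp_name` if you renamed the experiment above, and replace `<INCOMING_WEBHOOK_URL>` with the incoming webhook URL of your Teams channel.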

In [14]:
exp_name = 'azure-kpi-metrics'
teams_url = "<INCOMING_WEBHOOK_URL>"

##### Call the `azure_kpi_metrics` function

In [15]:
azure_kpi_metrics(exp_name, teams_url)
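
The implementation of `azure_kpi_metrics` is not shown in this notebook. As a rough illustration of the general idea only, the sketch below builds a simple message comparing two runs' metrics and posts it to a classic Teams incoming webhook, which accepts a JSON body with a `text` field. The function names `build_message` and `post_to_teams`, the message layout, and the metric values are all assumptions made for this example.

```python
import json
import urllib.request


def build_message(current, previous):
    """Build a Teams message body comparing current and previous run metrics."""
    lines = []
    for name, value in current.items():
        before = previous.get(name, "n/a")
        lines.append(f"**{name}**: {value} (previous: {before})")
    # A blank line between entries keeps each metric in its own paragraph.
    return {"text": "\n\n".join(lines)}


def post_to_teams(webhook_url, message):
    """POST the message as JSON to the incoming webhook URL; return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Made-up metric values for illustration.
message = build_message(
    current={"Accuracy": 0.91, "F1 Score": 0.9},
    previous={"Accuracy": 0.89},
)
print(json.dumps(message, indent=2))
# post_to_teams("<INCOMING_WEBHOOK_URL>", message)  # uncomment with a real URL
```

Keeping the message construction separate from the network call makes it possible to inspect the payload before anything is sent to the channel.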

#### **Ensure your compute instance is stopped after running the notebook**