# TalkingStageBot: Training and Usage Explanation

This notebook walks through the `TalkingStageBot`: how to train its model on the training data, and how to use the trained model to respond to user questions. The bot treats response generation as multiclass text classification: each question is classified, and the predicted label is returned as the answer.

## 1. Setup and Dependencies

We start by importing the necessary libraries and setting up the environment.


In [None]:
# Import necessary libraries
# NOTE: the `microsoft.ml` module mirrors the ML.NET (C#) API; Python bindings
# with this layout are assumed to be available in your environment.
# Loaders, transforms, and trainers are all reached through the MLContext
# object, so it is the only name that needs to be imported.
from microsoft.ml import MLContext


## 2. Loading the Training Data

In this section, we load the training data. It is assumed to be a CSV file named `training_data.csv` with a header row and two columns, `Text` (the question) and `Label` (the response).
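
To make the expected file layout concrete, here is a minimal sketch that reads the same kind of CSV with Python's standard `csv` module. The two sample rows are invented for illustration, the helper name `load_rows` is hypothetical, and the column names `Text` and `Label` are taken from the `InputData` class defined in the next cell. It is not the ML.NET loader itself.

```python
import csv
import io

# Hypothetical sample content standing in for training_data.csv
SAMPLE_CSV = """Text,Label
What's your name?,My name is TalkingStageBot
Where do you live?,I live on the internet
"""

def load_rows(csv_text):
    """Parse CSV text with a header row into a list of {'Text', 'Label'} dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{'Text': row['Text'], 'Label': row['Label']} for row in reader]

rows = load_rows(SAMPLE_CSV)
print(len(rows))         # 2
print(rows[0]['Label'])  # My name is TalkingStageBot
```

Because the separator is a comma, the `csv` module only parses a question or response that itself contains a comma correctly if that field is quoted in the file.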


In [None]:
# Path to the training data
training_data_path = 'path/to/training_data.csv'  # Update this path to the actual location of your training data

# Define the InputData class for schema
class InputData:
    def __init__(self, text, label):
        self.Text = text
        self.Label = label

# Create an ML context
ml_context = MLContext()

# Load the training data
training_data = ml_context.Data.LoadFromTextFile(
    path=training_data_path,
    separatorChar=',',
    hasHeader=True
)


## 3. Data Preprocessing

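We normalize the training data so that the same question or label always appears in the same form: both the `Text` and `Label` columns are converted to lowercase and stripped of leading and trailing whitespace. Because the label doubles as the bot's response, responses are lowercased as well.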
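
As a small illustration of this normalization, the sketch below applies it to two invented rows that differ only in case and spacing. The helper name `normalize` and the sample rows are assumptions made for the example.

```python
def normalize(value):
    """Lowercase a string and strip leading/trailing whitespace."""
    return value.strip().lower()

# Hypothetical rows, as (text, label) pairs
raw_rows = [
    ("  What's Your Name? ", " My name is Bot "),
    ("what's your name?", "my name is bot"),
]

clean_rows = [(normalize(text), normalize(label)) for text, label in raw_rows]
print(clean_rows[0] == clean_rows[1])  # True: both rows now have the same form
```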


In [None]:
# Preprocess the training data: lowercase and trim both columns of every row
rows = ml_context.Data.CreateEnumerable(training_data, reuseRowObject=False)
preprocessed_training_data = [
    InputData(text=row.Text.strip().lower(), label=row.Label.strip().lower())
    for row in rows
]

# Create an IDataView from preprocessed data
preprocessed_training_data_view = ml_context.Data.LoadFromEnumerable(preprocessed_training_data)


## 4. Defining the Training Pipeline

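We now define the data preparation and training pipeline. It featurizes the `Text` column, maps the `Label` strings to numeric keys, trains an SDCA maximum-entropy multiclass classifier, and maps the predicted key back to its original label string.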
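
The two data preparation ideas can be sketched in plain Python. This is a simplified illustration, not what the ML.NET transforms do internally: `fit_label_keys` and `bag_of_words` are hypothetical helpers, starting the keys at 1 is a choice made for this sketch, and a plain word count stands in for the richer text featurization.

```python
from collections import Counter

def fit_label_keys(labels):
    """Assign each distinct label a key in order of first appearance, starting at 1."""
    value_to_key = {}
    for label in labels:
        if label not in value_to_key:
            value_to_key[label] = len(value_to_key) + 1
    key_to_value = {key: value for value, key in value_to_key.items()}
    return value_to_key, key_to_value

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]

labels = ["greeting", "location", "greeting"]
value_to_key, key_to_value = fit_label_keys(labels)
print(value_to_key)     # {'greeting': 1, 'location': 2}
print(key_to_value[2])  # location

vocabulary = ["name", "live", "you"]
print(bag_of_words("where do you live", vocabulary))  # [0, 1, 1]
```

The classifier only ever sees the numeric keys and feature vectors; the reverse mapping is what turns its numeric prediction back into a readable response.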


In [None]:
# Define the data preparation and training pipeline
pipeline = ml_context.Transforms.Text.FeaturizeText(
    # Turn the raw question text into a numeric feature vector
    outputColumnName='Features',
    inputColumnName='Text'
).Append(
    # Map each distinct label string to a numeric key for the trainer
    ml_context.Transforms.Conversion.MapValueToKey(
        outputColumnName='LabelKey',
        inputColumnName='Label'
    )
).AppendCacheCheckpoint(ml_context).Append(
    # Cache the prepared data, then train the multiclass classifier.
    # The regularization and iteration values are starting points to tune.
    ml_context.MulticlassClassification.Trainers.SdcaMaximumEntropy(
        labelColumnName='LabelKey',
        featureColumnName='Features',
        l2Regularization=0.1,
        l1Regularization=0.01,
        maximumNumberOfIterations=1000
    )
).Append(
    # Map the predicted key back to the original label string
    ml_context.Transforms.Conversion.MapKeyToValue(
        outputColumnName='PredictedLabel',
        inputColumnName='PredictedLabel'
    )
)


## 5. Training the Model

We will train the model using the defined pipeline and save the trained model to a file named `model.zip`.


In [None]:
# Train the model
model = pipeline.Fit(preprocessed_training_data_view)

# Save the trained model
model_path = 'path/to/model.zip'  # Update this path to the desired location to save the model
ml_context.Model.Save(model, preprocessed_training_data_view.Schema, model_path)


## 6. Using the Trained Model

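We now load the saved model and use it to respond to user input. The input is split into individual questions, each question is classified, and the distinct predicted labels are joined into a single response.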
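
The splitting and combining logic can be tried without a trained model. The sketch below uses a small invented lookup table in place of the classifier; `split_questions`, `combine_responses`, and the `canned` answers are hypothetical names and data for illustration only.

```python
import re

def split_questions(user_input):
    """Split input on '.', '?', ',' and '!' and drop empty fragments."""
    parts = re.split(r'[.?,!]', user_input)
    return [part.strip().lower() for part in parts if part.strip()]

def combine_responses(questions, classify):
    """Classify each question and join the distinct responses in input order."""
    responses = []
    for question in questions:
        response = classify(question) or "I don't have an answer for that."
        if response not in responses:
            responses.append(response)
    return ', '.join(responses)

# Hypothetical lookup standing in for the trained model
canned = {"what's your name": "my name is bot", "where do you live": "i live online"}

questions = split_questions("What's your name? Where do you live?")
print(questions)  # ["what's your name", 'where do you live']
print(combine_responses(questions, canned.get))  # my name is bot, i live online
```

Collecting responses in a list rather than a set keeps them in the order the questions were asked, so the combined response is the same on every run.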


In [None]:
import re

# Load the trained model together with the schema it was saved with
# (assumes the binding returns the C# `out` schema parameter as a second value)
model, model_schema = ml_context.Model.Load(model_path)

# Define the function to predict responses
def get_response(question):
    # Split the input into individual questions on punctuation, dropping empty fragments
    question_list = [part.strip() for part in re.split(r'[.?,!]', question) if part.strip()]
    responses = []

    for part in question_list:
        # Normalize the same way as the training data, then classify with the model
        prediction = predict(part.lower())
        if prediction and prediction.PredictedLabel:
            response = prediction.PredictedLabel
        else:
            response = "I don't have an answer for that."
        # Keep each distinct response once, in the order the questions were asked
        if response not in responses:
            responses.append(response)

    return ', '.join(responses)

# Define the prediction function
def predict(text):
    # Wrap the text in a single-row data view and run it through the model
    input_data = [{'Text': text}]
    input_data_view = ml_context.Data.LoadFromEnumerable(input_data)
    transformed_data = model.Transform(input_data_view)
    predictions = list(ml_context.Data.CreateEnumerable(transformed_data, reuseRowObject=False))
    return predictions[0] if predictions else None


## 7. Testing the Bot

Let's test the bot with a sample input.


In [None]:
# Test the bot with a sample input
sample_input = "What's your name? Where do you live?"
response = get_response(sample_input)
print(f"Response: {response}")


## Conclusion

This notebook walked through the `TalkingStageBot`, from training the model to generating responses. To improve the bot, extend the training data with more question and response pairs, and tune the trainer's parameters, such as the regularization strengths and the iteration limit.
