Build an AI chatbot using Keras Sequential Model with real chat data and a pre-trained Universal Sentence Encoder. This project covers data preprocessing, model training, and evaluation, offering a deployable chatbot for improved customer interaction.

ssjiyobindas/AI-Chatbot-using-Keras-Sequential-Model

AI Chatbot Development using Keras Sequential Model

Overview

This project utilizes internal customer support data to create a robust chatbot using the Sequential model in Keras. The aim is to efficiently process unstructured data, label data through unsupervised and supervised techniques, and ultimately build an AI Chatbot for customer assistance.

Tech Stack

  • Language: Python
  • Libraries: pandas, numpy, seaborn, spacy, tensorflow, sklearn, nltk, matplotlib, hyperopt, keras, chatintents

Data

The dataset comprises unstructured ProjectPro customer service inquiry chat logs, involving dialogues between a human customer agent and a visitor to the ProjectPro website.

Approach

  1. Preprocess semi-structured data
  2. Perform exploratory data analysis
  3. Unsupervised labeling
  4. Supervised labeling
  5. Training data preparation
  6. Hyperparameter tuning
  7. Train deep learning sequential model
  8. Evaluate the model
  9. Use the model for prediction
  10. Run the chatbot

Note: Refer to the README.md file for instructions on running all code files.

Project Takeaways

  1. Understand the role of Chatbots in customer service.
  2. Learn to process semi-structured data effectively.
  3. Explore unsupervised and supervised labeling techniques.
  4. Clean textual data for optimal model input.
  5. Conduct exploratory data analysis for textual data.
  6. Prepare data for training chatbots.
  7. Create embeddings for textual data.
  8. Build a Keras sequential model.
  9. Perform hyperparameter tuning for optimal results.
  10. Evaluate the model using accuracy and F1 score.
  11. Make predictions using the developed model.
  12. Run the Chatbot seamlessly.

Implementation Details

Chatbot built using real chat data and a pre-trained model

This project is written in Python and uses a pre-trained Universal Sentence Encoder (USE) model. The code is designed to run locally in a terminal.

Virtual environment

Create a Python 3.9 virtual environment (ensure you have Miniconda installed first):

conda create -n myenv python=3.9

Then activate the virtual environment:

conda activate myenv

Add myenv to Jupyter Notebook/Lab

To ensure Python versions are compatible between myenv and Jupyter, it is necessary to create a myenv IPython kernel.

Begin by installing ipykernel:

pip install --user ipykernel

Then link the myenv kernel to Jupyter:

python -m ipykernel install --user --name=myenv

After starting Jupyter, select Kernel from the menu bar, then Change kernel..., and choose the myenv kernel from the pop-up box.

Install dependencies

To install the dependencies, run:

pip install -r requirements.txt

Data

The dataset is an unstructured assortment of ProjectPro customer service enquiry chat logs. The chat logs consist of timestamped dialogue between a human customer agent and a visitor to the ProjectPro website. The dialogue consists predominantly of queries about ProjectPro's services, prices, location, and signup information.

Preprocess, explore, and cluster

Initialise the Cluster class to begin these steps:

python engine.py --cluster

Preprocessing

Data label clustering is performed in an unsupervised way. An initial step before any clustering is to preprocess the chat logs one by one:

  • Extract only the text transcripts with the relevant chatter.
  • Remove URLs.
  • Normalise contractions and other shorthand.
  • Strip everything except letter characters.
  • Lemmatise words.
  • Load the text and user into a dataframe.
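A minimal sketch of these cleaning steps, using only the standard library. The project itself relies on spacy for lemmatisation, which is omitted here, and the contraction map below is illustrative rather than the project's actual list:

```python
import re

# Illustrative contraction map; the project likely normalises a fuller set.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "i'm": "i am", "it's": "it is"}

def preprocess(utterance: str) -> str:
    """Sketch of the per-utterance cleaning steps listed above."""
    text = utterance.lower()
    # Remove URLs.
    text = re.sub(r"https?://\S+|www\.\S+", "", text)
    # Normalise contractions and other shorthand.
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Strip everything except letter characters (and spaces).
    text = re.sub(r"[^a-z\s]", " ", text)
    # Collapse the whitespace left behind by stripping.
    return " ".join(text.split())

print(preprocess("I'm interested, see https://example.com for info!"))
```

The cleaned strings would then be loaded, alongside the user column, into a pandas dataframe.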

Exploratory Data Analysis

Following preprocessing, the data is explored to identify and visualise features. Begin by initialising the EDA object:

python engine.py --eda

Various data exploration methods can be called on the data to explore the features:

# check token frequency distribution
python engine.py --eda_token_dist

# plot the frequency distribution
python engine.py --eda_plot_token_dist

# get top N tokens
python engine.py --eda_top_n_tokens N  # integer

# get token length histogram
python engine.py --eda_tokens_hist

# get sentence length histogram
python engine.py --eda_sent_hist
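The token-frequency checks above can be sketched with `collections.Counter`; the function names here are illustrative, not the project's actual implementation:

```python
from collections import Counter

def token_distribution(utterances):
    """Count token frequencies across a list of cleaned utterances."""
    counts = Counter()
    for utt in utterances:
        counts.update(utt.split())
    return counts

def top_n_tokens(utterances, n):
    """Return the n most frequent (token, count) pairs."""
    return token_distribution(utterances).most_common(n)

sample = ["what is the price", "what courses do you offer", "the price please"]
print(top_n_tokens(sample, 2))
```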

The dataframe derived from preprocessing is clustered using the chatintents module. The outcomes of the different stages in the clustering process can be explored using the following methods:

# check the best hyperparameters derived through bayesian optimization
python engine.py --cluster_best_params

# get cluster visualisation
python engine.py --cluster_plot

# get summary dataframe slice (20) of cluster labels e.g. label count
python engine.py --cluster_labels_summary

# get dataframe slice (20) of labelled text
python engine.py --cluster_labeled_utts

The final dataframe is exported to a CSV file for further human review and amendment. The data can then be exported to a JSON file or kept in the CSV and processed using the parse_data_csv function.

Train the model

To train the model, run:

python engine.py --train

The data is prepared by one-hot encoding the labels and splitting it into train, test, and eval sets.
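A hedged sketch of this preparation step using plain numpy. The project may instead use scikit-learn or Keras utilities, and the labels below are illustrative:

```python
import numpy as np

def one_hot(labels):
    """One-hot encode string intent labels; returns (matrix, class order)."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    encoded = np.zeros((len(labels), len(classes)))
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1.0
    return encoded, classes

def split(X, y, train=0.8, test=0.1):
    """Slice into train/test/eval portions (the remainder goes to eval).
    A real preparation step would shuffle, and ideally stratify, first."""
    n = len(X)
    i, j = int(n * train), int(n * (train + test))
    return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])

y, classes = one_hot(["price", "signup", "price", "location"])
```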

Training entails a pre-step of hyperparameter tuning using keras_tuner, which searches over hyperparameters such as:

  • Number of layers
  • Number of perceptrons
  • Dropout layer value
  • Activation function
  • Learning rate
  • Number of epochs

Once these are optimized the best hyperparameter configuration is used to train the model.

The model has an early stopping mechanism, which uses the validation loss as a stopping condition: once the validation loss stops improving, training continues for a set number of epochs and stops if there is no improvement over the historic best value.
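In Keras this is handled by the EarlyStopping callback; the underlying rule can be sketched in plain Python:

```python
def early_stop_index(val_losses, patience=3):
    """Return the epoch at which training would stop: once the best
    validation loss has not improved for `patience` epochs."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # no improvement on the historic best for `patience` epochs
    return len(val_losses) - 1  # trained through to the final epoch

losses = [0.9, 0.7, 0.6, 0.65, 0.64, 0.66, 0.5]
print(early_stop_index(losses, patience=3))
```

Note the trade-off this illustrates: with a patience of 3, training halts at epoch 5 even though epoch 6 would have improved on the best value.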

The pre-training process can be explored using a number of methods:

# Check the summary of the hyperparameter tuning
python engine.py --train_search_summary

# Check the results of the hyperparameter tuning
python engine.py --train_results_summary

# check the summary of the model
python engine.py --train_model_summary

# get diagram of model
python engine.py --train_model_diagram

Once the model has finished training, the model with the best weights is saved.

Evaluate the model

The saved model can then be evaluated using the test data. To evaluate the model, begin by initializing the Eval class:

python engine.py --eval

A number of evaluation methods can be called to explore the model performance:

# check test data loss and accuracy
python engine.py --eval_test_loss_acc

# Plot of validation vs train accuracy over epochs
python engine.py --eval_acc_plot

# Plot of validation vs train loss over epochs
python engine.py --eval_loss_plot

# Compare predicted vs actual intents of test data
python engine.py --eval_comp_preds

# Get f-score of predicted intents
python engine.py --eval_fscore

# Get confusion matrix of predicted vs actual intents
python engine.py --eval_conf_matrix
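The F-score and confusion-matrix checks can be sketched with scikit-learn (already in the tech stack); the intent labels below are illustrative:

```python
from sklearn.metrics import confusion_matrix, f1_score

# Illustrative predicted vs actual intent labels for five test utterances.
actual    = ["price", "signup", "price", "location", "signup"]
predicted = ["price", "signup", "signup", "location", "signup"]

# Weighted F1 accounts for imbalance between intent classes.
score = f1_score(actual, predicted, average="weighted")
# Rows are actual intents, columns are predicted intents.
matrix = confusion_matrix(actual, predicted, labels=["location", "price", "signup"])
print(score)
print(matrix)
```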

Predictions in chat

To run the model in a chat environment use:

python engine.py --chat

The chat will run in a terminal and simulate a deployed chatbot with predicted responses given some user input.
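A minimal sketch of such a loop, where `predict_intent` and the response table are illustrative stand-ins for the project's trained model and labels:

```python
# Hypothetical canned responses keyed by predicted intent label.
RESPONSES = {
    "price": "Our pricing plans are listed on the pricing page.",
    "signup": "You can sign up with your email address on the homepage.",
}
FALLBACK = "Sorry, I didn't understand that. Could you rephrase?"

def respond(user_input, predict_intent):
    """Map a predicted intent to a canned response, with a fallback."""
    intent = predict_intent(user_input)
    return RESPONSES.get(intent, FALLBACK)

def chat(predict_intent):
    """Minimal terminal chat loop; type 'quit' or 'exit' to leave."""
    while (line := input("You: ")) not in {"quit", "exit"}:
        print("Bot:", respond(line, predict_intent))
```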
