Creating a chatbot intent architecture using clustering methods

ssears219/Chatbot-Intent-Architecture

Chatbot Intent Architecture

Creating an intent architecture using clustering

Description

Given a dataset of user utterances, how do we determine which intents, or classifications, to train a chatbot on? We could manually label each utterance with its intent, but that would take too much time. We could filter the utterances by keyword, but different words may mean the same thing, and the same words may mean different things. We could deploy intents iteratively, but the bot would have a high chance of mistaking utterances it was not trained on for ones it was. This project explores a solution: cluster the entire dataset of user utterances by similarity and use the resulting clusters as the intents in the bot.
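
As a rough illustration of the idea, the sketch below groups a handful of utterances by cosine similarity using hierarchical clustering. It is not the code from the notebooks in this repository: the TF-IDF representation, the average-linkage setting, the helper name `cluster_utterances`, and the sample utterances are all assumptions chosen to keep the example small.

```python
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_utterances(utterances, n_clusters):
    """Group utterances into candidate intents (hypothetical helper)."""
    # Represent each utterance as a TF-IDF vector over its words.
    vectors = TfidfVectorizer().fit_transform(utterances).toarray()
    # Pairwise cosine distance (1 - cosine similarity), condensed form.
    distances = pdist(vectors, metric="cosine")
    # Build the hierarchy bottom-up using average linkage (an assumption).
    tree = linkage(distances, method="average")
    # Cut the hierarchy so that at most n_clusters groups remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Made-up utterances for illustration; not taken from the Bitext dataset.
utterances = [
    "cancel my order",
    "I want to cancel an order",
    "track my refund",
    "where is my refund",
]

labels = cluster_utterances(utterances, n_clusters=2)
for label, text in zip(labels, utterances):
    print(label, text)
```

Each cluster label can then be reviewed and given an intent name. On real data the number of clusters would have to be chosen by inspection or with a validation metric, since it is not known in advance.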

Data

Bitext Free Dataset

Contents

  • bitext_free_dataset.csv - starting data from Bitext
  • Training Dataset.xlsx - data used for training the AWS Lex bot
  • Hierarchical Clustering and Cosine Similarity.ipynb - Initial attempt at clustering
  • Clustering with Word2Vec word embedding.ipynb - Final clustering method notebook
  • Chatbot Intent Architecture.docx - Writeup
  • Chatbot Intent Architecture.pptx - Presentation Video
  • Results.xlsx - Performance Metrics from testing

Tools

  • Python
  • Gensim
  • SKLearn
  • Scipy
  • NLTK

Author

Samuel Sears @ssears219

Acknowledgments
