# A Basic Chatbot Using RNN (LSTM)

### Simple Chatbot Project: Introduction

#### Introduction

In this project, I set out to build a simple chatbot from seven intent datasets that cover different domains. Chatbots are software applications that interact with users in natural language and perform specific tasks.

#### Datasets Used

The following datasets were used for training and testing the chatbot:

1. **IT Helpdesk Chatbot Dataset**:
   - **Path**: `/kaggle/input/it-helpdesk-chatbot-dataset/intents.json`
   - **Description**: This dataset contains various intents and responses for IT helpdesk scenarios.

2. **Simple Chatbot Dataset**:
   - **Path**: `/kaggle/input/simple-chatbot-dataset/intents.json`
   - **Description**: This dataset includes general intents and responses for building a simple chatbot.

3. **Computer Science Theory QA Dataset**:
   - **Path**: `/kaggle/input/computer-science-theory-qa-dataset/intents.json`
   - **Description**: This dataset is designed for questions and answers related to computer science theory.

4. **Books Dataset**:
   - **Paths**:
     - `/kaggle/input/books-dataset/intents.json`
     - `/kaggle/input/books-dataset/data.csv`
   - **Description**: This dataset contains intents and data related to books, such as book recommendations and information.

5. **Chatbot Dataset by Mohammad Nourullahi**:
   - **Path**: `/kaggle/input/d/mohammadnourullahi/chatbot-dataset/intents.json`
   - **Description**: A general-purpose chatbot dataset created by Mohammad Nourullahi.

6. **General Chatbot Dataset**:
   - **Path**: `/kaggle/input/chatbot-dataset/intents.json`
   - **Description**: This dataset includes a variety of intents and responses for general chatbot interactions.

7. **Star Wars Chatbot Dataset**:
   - **Path**: `/kaggle/input/star-wars-chat-bot/starwarsintents.json`
   - **Description**: This dataset contains intents and responses related to the Star Wars universe.

#### Objective

The main objective of this project is to create a chatbot capable of handling various types of interactions by leveraging the diverse intents and responses from the provided datasets. The chatbot will be designed to understand user queries and provide appropriate responses, simulating a natural conversation.

#### Methodology

1. **Data Preparation**: The intent files were merged into a single JSON file; the patterns were then tokenized, stemmed, and encoded as bag-of-words vectors for training.
2. **Model Training**: A stacked LSTM classifier was trained on the prepared data to predict intent tags, from which appropriate responses can be selected.
3. **Evaluation**: Accuracy was monitored on the training data during fitting; this notebook does not include a separate test split.
4. **Implementation**: The trained model was saved so that it can be loaded into a simple chatbot application for user interaction.

#### Conclusion

This project demonstrates the development of a simple chatbot using various datasets, showcasing the potential of chatbot technology in different domains. The resulting chatbot can handle a wide range of queries, providing a versatile tool for user interaction.


In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/it-helpdesk-chatbot-dataset/intents.json
/kaggle/input/simple-chatbot-dataset/intents.json
/kaggle/input/computer-science-theory-qa-dataset/intents.json
/kaggle/input/books-dataset/intents.json
/kaggle/input/books-dataset/data.csv
/kaggle/input/d/mohammadnourullahi/chatbot-dataset/intents.json
/kaggle/input/chatbot-dataset/intents.json
/kaggle/input/star-wars-chat-bot/starwarsintents.json


In [2]:
intents = pd.read_json('/kaggle/input/books-dataset/intents.json')

In [3]:
intents.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hello', 'Hi'..."
1,"{'tag': 'goodbye', 'patterns': ['Goodbye', 'By..."
2,"{'tag': 'thanks', 'patterns': ['Thanks', 'Than..."
3,"{'tag': 'book_search', 'patterns': ['Can you r..."
4,"{'tag': 'Fiction', 'patterns': ['Fiction', 'Re..."


In [4]:
intents.shape

(51, 1)

In [5]:
intents['intents'][0]

{'tag': 'greeting',
 'patterns': ['Hello',
  'Hi',
  'Hey',
  'Greetings',
  'Good morning',
  'Good afternoon',
  'Good evening'],
 'responses': ['Hello! How can I help you today?',
  'Hi there! What can I do for you?',
  'Hey! What brings you here today?']}

In [6]:
intents2=pd.read_json("/kaggle/input/simple-chatbot-dataset/intents.json")

In [7]:
intents2.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hi', 'Hey', ..."
1,"{'tag': 'weather', 'patterns': ['What's the we..."
2,"{'tag': 'hobbies', 'patterns': ['What are your..."
3,"{'tag': 'music', 'patterns': ['What's your fav..."
4,"{'tag': 'movies', 'patterns': ['What's a good ..."


In [8]:
intents2.shape

(18, 1)

In [9]:
intents3=pd.read_json("/kaggle/input/chatbot-dataset/intents.json")

In [10]:
intents3.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hi', 'How ar..."
1,"{'tag': 'goodbye', 'patterns': ['cya', 'see yo..."
2,"{'tag': 'creator', 'patterns': ['what is the n..."
3,"{'tag': 'name', 'patterns': ['name', 'your nam..."
4,"{'tag': 'hours', 'patterns': ['timing of colle..."


In [11]:
intents3.shape

(38, 1)

In [12]:
intents4=pd.read_json("/kaggle/input/d/mohammadnourullahi/chatbot-dataset/intents.json")

In [13]:
intents4.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hi', 'How ar..."
1,"{'tag': 'goodbye', 'patterns': ['cya', 'see yo..."
2,"{'tag': 'creator', 'patterns': ['what is the n..."
3,"{'tag': 'name', 'patterns': ['name', 'your nam..."
4,"{'tag': 'hours', 'patterns': ['timing of colle..."


In [14]:
intents4.shape

(38, 1)

In [15]:
intents5=pd.read_json("/kaggle/input/computer-science-theory-qa-dataset/intents.json")
intents5.head()

Unnamed: 0,intents
0,"{'tag': 'abstraction', 'patterns': ['Explain d..."
1,"{'tag': 'error', 'patterns': ['What is a synta..."
2,"{'tag': 'documentation', 'patterns': ['Explain..."
3,"{'tag': 'testing', 'patterns': ['What is softw..."
4,"{'tag': 'datastructure', 'patterns': ['How do ..."


In [16]:
intents5.shape

(172, 1)

In [17]:
intents6=pd.read_json("/kaggle/input/it-helpdesk-chatbot-dataset/intents.json")
intents6.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hi there', '..."
1,"{'tag': 'goodbye', 'patterns': ['Bye', 'See yo..."
2,"{'tag': 'thanks', 'patterns': ['Thanks', 'Than..."
3,"{'tag': 'noanswer', 'patterns': ['q', 'random'..."
4,"{'tag': 'options', 'patterns': ['How you could..."


In [18]:
intents6.shape

(18, 1)

In [19]:
intents7=pd.read_json("/kaggle/input/star-wars-chat-bot/starwarsintents.json")
intents7.head()

Unnamed: 0,intents
0,"{'tag': 'greeting', 'patterns': ['Hi', 'Hey', ..."
1,"{'tag': 'goodbye', 'patterns': ['Bye', 'See yo..."
2,"{'tag': 'thanks', 'patterns': ['Thanks', 'Than..."
3,"{'tag': 'tasks', 'patterns': ['What can you do..."
4,"{'tag': 'alive', 'patterns': ['Are you alive.'..."


In [20]:
%%bash
jq -s '[.[].intents] | add | {intents: .}' \
    /kaggle/input/books-dataset/intents.json \
    /kaggle/input/chatbot-dataset/intents.json \
    /kaggle/input/simple-chatbot-dataset/intents.json \
    /kaggle/input/d/mohammadnourullahi/chatbot-dataset/intents.json \
    /kaggle/input/computer-science-theory-qa-dataset/intents.json \
    /kaggle/input/it-helpdesk-chatbot-dataset/intents.json \
    /kaggle/input/star-wars-chat-bot/starwarsintents.json \
    > merged_intents.json
echo "Done"

Done
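For environments without `jq`, the same merge can be sketched in Python. This is a minimal sketch, assuming each input file is a JSON object with a top-level `intents` list, as in the files loaded above; the function names `merge_intent_files` and `write_merged` are hypothetical.

```python
import json
from pathlib import Path


def merge_intent_files(paths):
    """Concatenate the "intents" lists of several JSON files into one dict."""
    merged = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as fh:
            merged.extend(json.load(fh)["intents"])
    return {"intents": merged}


def write_merged(paths, out_path="merged_intents.json"):
    """Write the merged intents to out_path and return the number of intents."""
    merged = merge_intent_files(paths)
    Path(out_path).write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return len(merged["intents"])
```

Like the `jq` command, this sketch concatenates the lists without deduplicating, so a tag such as `greeting` that occurs in several files appears several times in the merged output.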


In [21]:
import json
import numpy as np
import random
import nltk
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM
from keras.optimizers import Adam
from nltk.stem import SnowballStemmer

with open('/kaggle/working/merged_intents.json', 'r') as file:
    data = json.load(file)



In [22]:
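# Snowball stemmer reduces words to their stems (e.g. "running" -> "run").
stemmer = SnowballStemmer("english")

words = []   # every token seen in any pattern (becomes the vocabulary)
labels = []  # unique intent tags
docs_x = []  # tokenized patterns
docs_y = []  # tag of each pattern, aligned with docs_x

# nltk.word_tokenize needs NLTK's Punkt tokenizer data to be available.
for intent in data["intents"]:
    for pattern in intent["patterns"]:
        wrds = nltk.word_tokenize(pattern)
        words.extend(wrds)
        docs_x.append(wrds)
        docs_y.append(intent["tag"])

    if intent["tag"] not in labels:
        labels.append(intent["tag"])

# Build the sorted vocabulary of unique lowercase stems, dropping "?".
words = [stemmer.stem(w.lower()) for w in words if w != "?"]
words = sorted(set(words))

labels = sorted(labels)

training = []
output = []
out_empty = [0] * len(labels)

# Each sample is a window of `sequence_length` consecutive patterns;
# its target is the tag of the pattern that follows the window.
sequence_length = 5
for i in range(len(docs_x) - sequence_length):
    seq_in = docs_x[i:i + sequence_length]
    seq_out = docs_y[i + sequence_length]

    # Concatenate one bag-of-words vector per pattern in the window.
    bag = []
    for seq in seq_in:
        wrds = {stemmer.stem(w.lower()) for w in seq}
        bag.extend(1 if w in wrds else 0 for w in words)

    # One-hot encode the target tag.
    output_row = out_empty[:]
    output_row[labels.index(seq_out)] = 1

    training.append(bag)
    output.append(output_row)

# Shape: (samples, sequence_length, vocabulary size).
training = np.array(training)
training = np.reshape(training, (training.shape[0], sequence_length, len(words)))

output = np.array(output)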
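To make the windowing in the cell above concrete, here is a minimal sketch on toy data. It uses lowercasing and `str.split` as a stand-in for `nltk.word_tokenize` and the Snowball stemmer, and the helper names `bag_of_words` and `build_windows` are hypothetical. It illustrates that each training sample stacks the bag-of-words vectors of `sequence_length` consecutive patterns and is labelled with the tag of the pattern that follows the window.

```python
def bag_of_words(tokens, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in tokens."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def build_windows(patterns, tags, sequence_length):
    """Pair each run of `sequence_length` patterns with the tag of the next pattern."""
    tokenized = [p.lower().split() for p in patterns]  # stand-in for NLTK + stemming
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    samples = []
    for i in range(len(tokenized) - sequence_length):
        window = [bag_of_words(toks, vocabulary)
                  for toks in tokenized[i:i + sequence_length]]
        samples.append((window, tags[i + sequence_length]))
    return vocabulary, samples


# Toy data, not taken from the datasets.
patterns = ["Hello", "Hi there", "Goodbye", "See you later"]
tags = ["greeting", "greeting", "goodbye", "goodbye"]
vocabulary, samples = build_windows(patterns, tags, sequence_length=2)

print(vocabulary)     # ['goodbye', 'hello', 'hi', 'later', 'see', 'there', 'you']
print(len(samples))   # 2 windows from 4 patterns
print(samples[0][1])  # goodbye (tag of the third pattern, which follows the window)
```

With N patterns, the loop yields N - `sequence_length` samples. Because the merged file lists patterns grouped by intent, a window and its label tend to share a tag except near the boundary between two intents.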

In [23]:
model = Sequential()
model.add(LSTM(550, input_shape=(training.shape[1], training.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(256, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(64, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(len(labels), activation='softmax'))

model.summary()
model.compile(Adam(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(training, output, epochs=500, verbose=1, batch_size=4)




Epoch 1/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 11s 11ms/step - accuracy: 0.0349 - loss: 5.2278
Epoch 2/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.0352 - loss: 4.4915
Epoch 3/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.0665 - loss: 4.2451
Epoch 4/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.0743 - loss: 4.0833
Epoch 5/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.0983 - loss: 3.7320
Epoch 6/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.1015 - loss: 3.5294
Epoch 7/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.1322 - loss: 3.4145
Epoch 8/500
391/391 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.1097 - loss: 3.4290
Epoch 9/500
391/391

<keras.src.callbacks.history.History at 0x7d7680241bd0>

### Explanation of the Model

The provided code defines and trains a sequential model using LSTM layers for a chatbot application. Here’s a detailed explanation of the model:

#### Model Architecture

1. **Sequential Model**:
   - The `Sequential` model is a linear stack of layers in Keras, allowing you to build a neural network layer by layer.

2. **LSTM Layers**:
   - **LSTM (550 units)**: The first LSTM layer with 550 units. It receives input with a shape defined by `(training.shape[1], training.shape[2])` and returns sequences (`return_sequences=True`) to the next layer.
   - **Dropout (0.2)**: A dropout layer with a dropout rate of 0.2, which helps prevent overfitting by randomly setting 20% of the input units to 0 at each update during training.
   - **LSTM (512 units)**: The second LSTM layer with 512 units, also returning sequences.
   - **Dropout (0.2)**: Another dropout layer to reduce overfitting.
   - **LSTM (256 units)**: The third LSTM layer with 256 units, returning sequences.
   - **Dropout (0.2)**: Dropout layer to reduce overfitting.
   - **LSTM (128 units)**: The fourth LSTM layer with 128 units, returning sequences.
   - **Dropout (0.2)**: Dropout layer to reduce overfitting.
   - **LSTM (64 units)**: The fifth LSTM layer with 64 units. This layer does not return sequences (`return_sequences=False`), so it outputs only the hidden state of the final time step for each sample.
   - **Dropout (0.2)**: Dropout layer to reduce overfitting.

3. **Dense Layer**:
   - **Dense**: A fully connected (dense) layer with the number of units equal to the number of labels (`len(labels)`). The activation function used is `softmax`, which is suitable for multi-class classification.
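As a worked illustration of the softmax step, the following sketch turns raw scores into class probabilities and picks the most likely tag. The scores and the three-tag label list are made up for the example; they are not outputs of the trained model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["goodbye", "greeting", "thanks"]  # illustrative subset of tags
probs = softmax([1.0, 3.0, 0.5])            # illustrative raw scores
predicted = labels[probs.index(max(probs))]
print(predicted)  # greeting
```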

#### Model Summary

- The `model.summary()` function prints a summary of the model, showing the output shape and number of parameters for each layer.
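The parameter counts that `model.summary()` reports can be checked by hand. The sketch below assumes the standard LSTM formulation with biases (four gates, each with an input kernel, a recurrent kernel, and a bias); the helper names are hypothetical. The first layer is left out because its count depends on the vocabulary size `len(words)`, which is not shown in the notebook output.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of a standard LSTM layer with biases.

    Each of the 4 gates has an input kernel (input_dim x units),
    a recurrent kernel (units x units), and a bias vector (units).
    """
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units


# Layers 2-5 of the stack: each layer's input size is the previous layer's units.
for input_dim, units in [(550, 512), (512, 256), (256, 128), (128, 64)]:
    print(units, lstm_param_count(input_dim, units))
# 512 2177024
# 256 787456
# 128 197120
# 64 49408
```

Dropout layers add no trainable parameters, and the final dense layer contributes `dense_param_count(64, len(labels))`.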

#### Model Compilation

- **Optimizer**: Adam optimizer with a learning rate of 0.001.
- **Loss Function**: `categorical_crossentropy`, which is used for multi-class classification problems.
- **Metrics**: `accuracy`, to monitor the accuracy during training and evaluation.
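To show what the loss measures, here is a small worked example of categorical cross-entropy for a single sample with a one-hot target. The probability vectors are invented for illustration, and the `eps` guard against `log(0)` is an assumption of this sketch.

```python
import math


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Loss for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


target = [0, 1, 0]  # the true class is the second one

# Confident and correct prediction: low loss.
print(round(categorical_crossentropy(target, [0.1, 0.8, 0.1]), 4))  # 0.2231

# Probability mass on the wrong class: higher loss.
print(round(categorical_crossentropy(target, [0.6, 0.2, 0.2]), 4))  # 1.6094
```

Only the probability assigned to the true class enters the sum, so the loss is simply the negative log of that probability.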

#### Model Training

- The model is trained using the `fit` method with the following parameters:
  - **Training Data**: `training` (input data).
  - **Output Data**: `output` (labels).
  - **Epochs**: 500, indicating the number of times the model will iterate over the entire training dataset.
  - **Verbose**: 1, to print progress and performance metrics during training.
  - **Batch Size**: 4, specifying the number of samples processed per gradient update.
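The `391/391` counter in the training log follows from the batch size. The sketch below works the arithmetic backwards from the log to the number of training samples; the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


def sample_range_for_steps(steps, batch_size):
    """Smallest and largest sample counts that produce `steps` batches."""
    return (steps - 1) * batch_size + 1, steps * batch_size


print(sample_range_for_steps(391, 4))  # (1561, 1564)
print(391 * 500)                       # 195500 gradient updates over 500 epochs
```

So the training array holds between 1,561 and 1,564 windows, and a full run of 500 epochs performs 195,500 weight updates.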

#### Summary

This model is designed to process sequential data using multiple LSTM layers, which are well-suited for capturing dependencies in sequential data, such as text. Dropout layers are used to mitigate overfitting, and a dense layer with softmax activation is used to produce the final class probabilities. The model is compiled with the Adam optimizer and trained over 500 epochs with a batch size of 4.


In [24]:
model.save("chatbot.h5")
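The notebook stops at saving the model. As a minimal sketch of the remaining step, the function below maps a vector of class probabilities (such as one row returned by `model.predict`) to a reply drawn from the matching intent. The function name `choose_response`, the `threshold` parameter, the fallback message, and the toy intents are all assumptions made for illustration.

```python
import random


def choose_response(probabilities, labels, intents, threshold=0.5,
                    fallback="Sorry, I did not understand that."):
    """Map class probabilities to a reply drawn from the matching intent."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] < threshold:
        return fallback  # the model is not confident enough
    tag = labels[best]
    for intent in intents:
        if intent["tag"] == tag:
            return random.choice(intent["responses"])
    return fallback


# Toy intents in the same shape as the dataset entries.
intents = [
    {"tag": "greeting", "responses": ["Hello! How can I help you today?"]},
    {"tag": "goodbye", "responses": ["Goodbye!"]},
]
labels = sorted(intent["tag"] for intent in intents)  # ['goodbye', 'greeting']

print(choose_response([0.1, 0.9], labels, intents))
# Hello! How can I help you today?
print(choose_response([0.55, 0.45], labels, intents, threshold=0.7))
# Sorry, I did not understand that.
```

In the merged file several intents share a tag (for example `greeting`), and this sketch uses the first match; pooling the responses of all intents with the same tag would be an equally reasonable choice.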

### Conclusion

This project demonstrates the process of developing a simple chatbot from several datasets with a neural network built from stacked LSTM layers. The network classifies user input into intent tags, and each tag maps to a set of predefined responses, so the chatbot can reply to a range of user inputs across the covered domains.

The following key points summarize the project:

- **Data Utilization**: Multiple datasets were used to cover diverse intents and responses, enhancing the chatbot's versatility.
- **Model Architecture**: The model consists of stacked LSTM layers, each followed by dropout layers to prevent overfitting, and a final dense layer with softmax activation for classification.
- **Training and Evaluation**: The model was trained using the Adam optimizer and categorical cross-entropy loss function, with accuracy as the primary metric for performance evaluation.
- **Implementation**: The trained model can be implemented in a simple chatbot application, providing a foundation for more complex conversational agents.

This project showcases the potential of LSTM-based models in creating conversational AI systems and serves as a starting point for further enhancements and developments in the field of chatbot technology.
