# 1.Creating the model and training it with Simple Intents 

This code is for training a chatbot model using TensorFlow and Natural Language Processing (NLP) techniques.
Here's a breakdown of the code:

### 1.Imports:
This section imports the libraries needed for data handling (json, pickle, numpy), machine learning (tensorflow), Wikipedia lookups (wikipedia), and text processing (nltk), and creates a WordNetLemmatizer for word normalization. It then loads the chatbot intents from a JSON file, which defines conversation patterns and their corresponding responses.

In [None]:
import random
import json
import pickle
import numpy as np
import tensorflow as tf

import wikipedia

import nltk
from nltk.stem import WordNetLemmatizer
# Initialize lemmatizer for word normalization
lemmatizer = WordNetLemmatizer()

# Load chatbot intents from JSON file (conversation patterns and responses)
# A raw string keeps the backslashes in the Windows path from being read as escape sequences
with open(r'D:\Python Projects\ChatBot\Intent.json') as f:
    intents = json.load(f)
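
The cells in this notebook read the keys `intents`, `intent`, `text`, `responses`, and `extension` (with its own `responses` list) from the JSON file. A minimal sketch of a file with that shape is shown below. The two example intents and their wording are made up for illustration and are not taken from the actual Intent.json; only the key names come from the notebook code.

```python
import json

# Hypothetical miniature intents file; only the key names are taken from the notebook code
sample_json = """
{
  "intents": [
    {
      "intent": "Greeting",
      "text": ["Hi", "Hello there"],
      "responses": ["Hello!", "Hi, how can I help?"],
      "extension": {"responses": []}
    },
    {
      "intent": "Goodbye",
      "text": ["Bye", "See you later"],
      "responses": ["Goodbye!"],
      "extension": {"responses": []}
    }
  ]
}
"""

sample_intents = json.loads(sample_json)

# Collect (pattern, intent) pairs in the same spirit as the preprocessing loop,
# using str.split() instead of nltk.word_tokenize to keep the sketch dependency-free
pairs = [(pattern.split(), item["intent"])
         for item in sample_intents["intents"]
         for pattern in item["text"]]
print(pairs)
```

Each pattern under `text` becomes one training example labeled with its `intent`, so an intent with more patterns contributes more examples.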


### 2.Data Preprocessing
Takes the chatbot intents loaded from the JSON file (conversation patterns and responses).
Extracts words from conversation patterns and removes punctuation.
Lemmatizes words (reduces words to their base form; with the lemmatizer's default noun setting, e.g., "cars" -> "car").
Creates a vocabulary of unique words and a list of intent classes.
Saves the vocabulary and classes as pickle files.

In [None]:
words = []  # List to store all unique words from conversation patterns
classes = []  # List to store all unique intent classes
documents = []  # List to store (pattern, intent) pairs
ignoreLetters = ['?', '!', '.', ',']  # List of punctuation to remove

# Extract words, remove punctuation, and lemmatize from each conversation pattern
for intent in intents['intents']:
    for pattern in intent['text']:
        wordList = nltk.word_tokenize(pattern)  # Tokenize the pattern into words
        words.extend(wordList)  # Add words to the words list
        documents.append((wordList, intent['intent']))  # Add (pattern, intent) pair
        if intent['intent'] not in classes:  # Add unique intent class to classes list
            classes.append(intent['intent'])

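# Remove punctuation, then lowercase and lemmatize all words in the vocabulary
# (lowercasing keeps the vocabulary consistent with the lowercased patterns used to build the bags)
words = [lemmatizer.lemmatize(word.lower()) for word in words if word not in ignoreLetters]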
# Sort and remove duplicates to create a unique vocabulary
words = sorted(set(words))

# Sort the list of intent classes
classes = sorted(set(classes))

# Save the vocabulary and classes for later use
with open('words.pkl', 'wb') as f:
    pickle.dump(words, f)  # Save words list as pickle file
with open('classes.pkl', 'wb') as f:
    pickle.dump(classes, f)  # Save classes list as pickle file
# Training Data Preparation
training = []  # List to store training data (bag-of-words + output vectors)
outputEmpty = [0] * len(classes)  # Create an empty output vector

### 3.Training Data Preparation:
Creates a bag-of-words representation for each conversation pattern.
This is a vector where each element represents a word in the vocabulary and its value indicates if the word appears in the pattern.
Creates one-hot encoded output vectors for each intent class.
These vectors have a length equal to the number of intent classes, with a 1 at the index corresponding to the intended class.
Combines bag-of-words and output vectors for each pattern and intent.
Shuffles the training data.
Converts lists to NumPy arrays for efficient machine learning operations.

In [None]:

# Create bag-of-words representation and one-hot encoded output vector for each pattern-intent pair
for document in documents:
    bag = []  # List to store bag-of-words representation
    wordPatterns = document[0]  # Get the word pattern from the document
    wordPatterns = [lemmatizer.lemmatize(word.lower()) for word in wordPatterns]  # Lemmatize and lowercase words

    # Check if each word in the vocabulary exists in the pattern
    for word in words:
        bag.append(1 if word in wordPatterns else 0)

    outputRow = list(outputEmpty)  # Create a copy of the empty output vector
    outputRow[classes.index(document[1])] = 1  # Set the element corresponding to the intent class to 1
    training.append(bag + outputRow)  # Combine bag-of-words and output vector

# Shuffle the training data for randomization
random.shuffle(training)

# Convert training data lists to NumPy arrays for efficient processing
training = np.array(training)
trainX = training[:, :len(words)]  # Separate input (bag-of-words) data
trainY = training[:, len(words):]  # Separate output (one-hot encoded) data
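
To make the encoding concrete, here is a small worked sketch of a bag-of-words vector and a one-hot label. The vocabulary, class names, and the `encode` helper are hypothetical and exist only for this illustration; the logic mirrors the loop above.

```python
import numpy as np

# Toy vocabulary and class list (hypothetical, already sorted as in the notebook)
vocab = ["bye", "hello", "how", "you"]
labels = ["Goodbye", "Greeting"]

def encode(tokens, label):
    """Return (bag-of-words vector, one-hot label vector) for one pattern."""
    tokens = [t.lower() for t in tokens]
    # 1 where the vocabulary word occurs in the pattern, 0 otherwise
    bag = [1 if w in tokens else 0 for w in vocab]
    one_hot = [0] * len(labels)
    one_hot[labels.index(label)] = 1
    return np.array(bag), np.array(one_hot)

x, y = encode(["Hello", "you"], "Greeting")
print(x)  # [0 1 0 1]
print(y)  # [0 1]
```

The bag records only whether a word is present, so word order and repetition are discarded.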


### 4.Model Building:
Creates a sequential neural network model with TensorFlow Keras.
The model has two hidden layers with ReLU activation and dropout for regularization.
The output layer has a softmax activation for multi-class classification (one output for each intent class).



In [None]:
# Model Building
model = tf.keras.Sequential()  # Create a sequential neural network model
model.add(tf.keras.layers.Dense(128, input_shape=(len(trainX[0]),), activation='relu'))  # Hidden layer with 128 neurons, ReLU activation
model.add(tf.keras.layers.Dropout(0.5))  # Dropout layer with 50% chance of dropping neurons for regularization
model.add(tf.keras.layers.Dense(64, activation='relu'))  # Second hidden layer with 64 neurons, ReLU activation
model.add(tf.keras.layers.Dropout(0.5))  # Dropout layer with 50% chance of dropping neurons for regularization
model.add(tf.keras.layers.Dense(len(trainY[0]), activation='softmax'))  # Output layer with softmax activation (one output for each intent class)
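
As a rough illustration of what the softmax output layer computes, the sketch below turns raw scores into probabilities with NumPy. The `softmax` helper and the example scores are assumptions made for this illustration; the Keras layer performs the equivalent computation internally.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of raw scores."""
    shifted = logits - np.max(logits)  # subtracting the max avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores for three intent classes
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs, probs.sum())
```

The outputs are all positive and sum to 1, which is why they can be read as one probability per intent class.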


### 5.Training:
Defines an SGD optimizer with a learning rate of 0.01, a momentum of 0.9, and Nesterov momentum enabled.
Compiles the model with categorical crossentropy loss for multi-class classification and accuracy metric.
Trains the model on the prepared training data for 200 epochs with a batch size of 5.
Saves the trained model as a .h5 file.

In [None]:
sgd = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

hist = model.fit(np.array(trainX), np.array(trainY), epochs=200, batch_size=5, verbose=1)
# Save the trained model; the training history (hist) is not an argument to save()
model.save('chatbot_model.h5')
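
For intuition about the loss being minimized, here is a simplified per-example sketch of categorical cross-entropy in NumPy. The function name, the `eps` clipping value, and the example vectors are assumptions for illustration, not Keras's exact implementation (which also averages over the batch).

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return -np.sum(y_true * np.log(y_pred))

target = np.array([0, 1, 0])
confident = np.array([0.05, 0.90, 0.05])  # most mass on the correct class
unsure = np.array([0.40, 0.30, 0.30])     # little mass on the correct class

print(categorical_crossentropy(target, confident))
print(categorical_crossentropy(target, unsure))
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the correct class, so the confident prediction scores a much lower loss than the unsure one.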

### 6.Output:

Prints a message indicating successful training completion.

In [None]:
print('Done')

# 2.Chatbot Interaction

### Imports and Loading Data:

Imports libraries for data manipulation, model loading, natural language processing, and time-related functions.
Loads a WordNetLemmatizer for word normalization.
Loads chatbot intents from a JSON file.
Loads the vocabulary and classes from pickle files created during model training.
Loads the trained chatbot model.

In [None]:
import random
import json
import pickle
import numpy as np
import nltk
from datetime import datetime
import wikipedia

from nltk.stem import WordNetLemmatizer
from keras.models import load_model

lemmatizer = WordNetLemmatizer()
# A raw string keeps the backslashes in the Windows path from being read as escape sequences
with open(r'D:\Python Projects\ChatBot\Intent.json') as f:
    intents = json.load(f)

words = pickle.load(open('words.pkl', 'rb'))
classes = pickle.load(open('classes.pkl', 'rb'))
model = load_model('chatbot_model.h5')

### Text Preprocessing Functions:

#### clean_up_sentence(sentence):

Tokenizes a sentence into words using nltk.word_tokenize.
Lemmatizes each word using the WordNetLemmatizer (reduces to base form).
Returns a list of cleaned words.

In [None]:
def clean_up_sentence(sentence):
    sentence_words = nltk.word_tokenize(sentence)
    # Lowercase before lemmatizing, matching how the training patterns were processed
    sentence_words = [lemmatizer.lemmatize(word.lower()) for word in sentence_words]
    return sentence_words

#### bag_of_words(sentence):

Cleans the sentence using clean_up_sentence.
Creates a bag-of-words representation: a vector where each element indicates the presence or absence of a word from the vocabulary.
Returns a NumPy array representing the bag-of-words.

In [None]:
def bag_of_words(sentence):
    sentence_words = clean_up_sentence(sentence)
    bag = [0] * len(words)
    for w in sentence_words:
        for i, word in enumerate(words):
            if word == w:
                bag[i] = 1
    return np.array(bag)

### Intent Prediction:


#### predict_class(sentence):
Converts the sentence to a bag-of-words using bag_of_words.
Passes the bag-of-words to the model for prediction.
Keeps only the classes whose predicted probability exceeds a threshold (0.25), discarding low-confidence predictions.
Sorts results by probability in descending order.
Returns a list of predicted intents with scores.

In [None]:
def predict_class(sentence):
    bow = bag_of_words(sentence)
    res = model.predict(np.array([bow]))[0]
    ERROR_THRESHOLD = 0.25
    # Keep only [class index, probability] pairs above the threshold
    results = [[i, r] for i, r in enumerate(res) if r > ERROR_THRESHOLD]

    # Most probable intent first
    results.sort(key=lambda x: x[1], reverse=True)
    return_list = []
    for r in results:
        return_list.append({'intent': classes[r[0]], 'probability': str(r[1])})
    return return_list
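
The filtering and sorting step can be tried in isolation, without a trained model. In the sketch below the class names and the probability vector are hypothetical stand-ins for `classes` and the output of `model.predict`.

```python
import numpy as np

# Hypothetical class names and model output for one sentence
example_classes = ["Goodbye", "Greeting", "Thanks"]
example_probs = np.array([0.10, 0.60, 0.30])

THRESHOLD = 0.25
# Keep (class index, probability) pairs above the threshold
kept = [(i, p) for i, p in enumerate(example_probs) if p > THRESHOLD]
# Most probable class first
kept.sort(key=lambda pair: pair[1], reverse=True)

ranked = [{"intent": example_classes[i], "probability": str(p)} for i, p in kept]
print(ranked)
```

When no class exceeds the threshold the resulting list is empty, so any code that reads the first entry needs to handle that case.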

### Response Generation:

#### get_response(intents_list, intents_json):
Takes a list of predicted intents and the loaded intents JSON.
Finds a matching intent in the JSON and randomly selects a response from it.
If an extension response is available, randomly selects one.
Returns the chosen response and extension (or empty string if no extension).

In [None]:
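def get_response(intents_list, intents_json):
    # Fallback reply (an assumption; adjust the wording as needed) for when no intent
    # clears the threshold or the predicted intent is missing from the JSON
    fallback = "Sorry, I didn't understand that."
    if not intents_list:
        return fallback, ""
    intent = intents_list[0]['intent']
    for i in intents_json['intents']:
        if i['intent'] == intent:
            result = random.choice(i['responses'])
            extension_responses = i['extension']['responses']
            # Empty string if no extension response
            extension = random.choice(extension_responses) if extension_responses else ""
            return result, extension
    return fallback, ""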


### Main Interaction Loop:

Prints a message indicating the chatbot is running.
Enters an infinite loop:
Prompts the user for input.
Predicts the intent of the user's message using predict_class.
Generates a response using get_response.
Prints the generated response to the user.

In [None]:
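print("GO! Bot is running!")

while True:
    message = input("")
    ints = predict_class(message)
    # get_response returns a (response, extension) pair
    response, extension = get_response(ints, intents)
    # Print the response, followed by the extension text when there is one
    print(f"{response} {extension}".strip())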

GO! Bot is running!
