Python modules and libraries:
To code our chatbot, we need a few modules built into Python, some popular libraries for NLP and Deep Learning, and NumPy, the de facto standard library for processing arrays.

In [2]:
import json
import string
import random 
import nltk
import numpy as np
from nltk.stem import WordNetLemmatizer 
import tensorflow as tf 
from tensorflow.keras import Sequential 
from tensorflow.keras.layers import Dense, Dropout
nltk.download("punkt")
nltk.download("wordnet")
nltk.download('omw-1.4')

[nltk_data] Downloading package punkt to /home/hakim/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /home/hakim/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

Data:
Before writing any Python, we need to set up a JSON intents file that defines the intents likely to occur during interactions with our chatbot. To do this, we first create a set of tags into which user requests can fit.
For example:

A user may want to know the first name of our chatbot, so we create an intent with the tag name.
A user may want to know the age of our chatbot, so we create an intent with the tag age, and so on.

Tags and Patterns:
For each tag we create, we need to specify patterns. These define the different ways a user can put a question to our chatbot. For example, under the name tag, a user can ask for the chatbot's first name in different ways: "What is your first name?", "Who are you?", "What is your name?".

The chatbot then uses these patterns as training data to learn what a request for its first name looks like. The goal is for it to adapt to the different ways a person might ask for its first name, so users don't need to use the exact queries the chatbot has learned. They could ask "What's your name?" and our chatbot would be able to deduce that the user wants to know its first name, and would provide it in response.

Remark: 
Our chatbot won't be very intelligent, so it won't always recognize what is said or asked. But with enough examples, it can do a decent job of deciphering requests. Keep in mind that our main goal is to apply NLP and Deep Learning techniques, and to build our chatbot with these two elements under the hood.

Responses associated with the Patterns:
In the JSON intents file, each intent carries pre-recorded responses alongside its tag and patterns. For our chatbot (which, I repeat, will be very simple and naive), these responses will not be generated. This means that our responses will not be as fluid as the patterns users may type (i.e. they will not adapt to the situation and the context).
What does this mean? Simply that the answers are static: the chatbot returns one of them when asked a question.

In [4]:
# use a dictionary to represent the JSON intents file
data = {"intents": [
             {"tag": "greeting",
              "patterns": ["Hello", "La forme?", "yo", "Salut", "ça roule?"],
              "responses": ["Salut à toi!", "Hello", "Comment vas tu?", "Salutations!", "Enchanté"],
             },
             {"tag": "age",
              "patterns": ["Quel âge as-tu?", "C'est quand ton anniversaire?", "Quand es-tu né?"],
              "responses": ["J'ai 25 ans", "Je suis né en 1996", "Ma date d'anniversaire est le 3 juillet et je suis né en 1996", "03/07/1996"]
             },
             {"tag": "date",
              "patterns": ["Que fais-tu ce week-end?", "Tu veux qu'on fasse un truc ensemble?", "Quels sont tes plans pour cette semaine"],
              "responses": ["Je suis libre toute la semaine", "Je n'ai rien de prévu", "Je ne suis pas occupé"]
             },
             {"tag": "name",
              "patterns": ["Quel est ton prénom?", "Comment tu t'appelles?", "Qui es-tu?"],
              "responses": ["Mon prénom est Miki", "Je suis Miki", "Miki"]
             },
             {"tag": "goodbye",
              "patterns": [ "bye", "Salut", "see ya", "adios", "cya"],
              "responses": ["C'était sympa de te parler", "à plus tard", "On se reparle très vite!"]
             }
]}
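
The dictionary above stands in for a JSON file. As a minimal sketch, assuming the intents were stored on disk (the file name intents.json and the one-intent sample are hypothetical, chosen only for illustration), the standard json module could write and reload the same structure:

```python
import json
import os
import tempfile

# a one-intent sample with the same structure as the dictionary above
sample = {"intents": [
    {"tag": "name",
     "patterns": ["Quel est ton prénom?", "Qui es-tu?"],
     "responses": ["Je suis Miki"]}
]}

# hypothetical file name, written to a temporary directory for the demo
path = os.path.join(tempfile.mkdtemp(), "intents.json")

# ensure_ascii=False keeps accented characters readable in the file
with open(path, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False, indent=2)

# reload the file into a Python dictionary
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# every intent is expected to provide a tag, patterns and responses
for intent in loaded["intents"]:
    assert {"tag", "patterns", "responses"} <= intent.keys()

print(loaded == sample)
```

Keeping the intents in a separate file would let you add tags and patterns without touching the code.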

Data separation:
To create our training data, we first need to perform a few operations on our data:

Create a vocabulary of all the words used in the patterns (recall that the patterns are the requests/questions asked by the user).
Create a list of the classes, which are simply the tags of each intent.
Create a list of all the patterns in the intents file.
Create a list of all the tags associated with each pattern in the intents file.

In [10]:
# initialize the lemmatizer to reduce words to their base form (lemma)
lemmatizer = WordNetLemmatizer()
# create the lists
words = []
classes = []
doc_X = []
doc_y = []
# loop over all the intents:
# tokenize each pattern and add the tokens to the words list, then add the
# pattern and the tag of its intent to the corresponding lists
for intent in data["intents"]:
    for pattern in intent["patterns"]:
        tokens = nltk.word_tokenize(pattern)
        words.extend(tokens)
        doc_X.append(pattern)
        doc_y.append(intent["tag"])
    
    # add the tag to the classes if it is not already there 
    if intent["tag"] not in classes:
        classes.append(intent["tag"])
# lemmatize all vocabulary words and convert them to lower case,
# skipping tokens that are punctuation marks
words = [lemmatizer.lemmatize(word.lower()) for word in words if word not in string.punctuation]
# remove duplicates with set(), then sort the vocabulary and
# the classes alphabetically
words = sorted(set(words))
classes = sorted(set(classes))

Here is what each list looks like:

In [11]:
print(words)
print(classes)
print(doc_X)
print(doc_y)

['adios', 'anniversaire', 'as-tu', 'bye', "c'est", 'ce', 'cette', 'comment', 'cya', 'ensemble', 'es-tu', 'est', 'fais-tu', 'fasse', 'forme', 'hello', 'la', 'né', 'plan', 'pour', 'prénom', "qu'on", 'quand', 'que', 'quel', 'quels', 'qui', 'roule', 'salut', 'see', 'semaine', 'sont', "t'appelles", 'te', 'ton', 'truc', 'tu', 'un', 'veux', 'week-end', 'ya', 'yo', 'âge', 'ça']
['age', 'date', 'goodbye', 'greeting', 'name']
['Hello', 'La forme?', 'yo', 'Salut', 'ça roule?', 'Quel âge as-tu?', "C'est quand ton anniversaire?", 'Quand es-tu né?', 'Que fais-tu ce week-end?', "Tu veux qu'on fasse un truc ensemble?", 'Quels sont tes plans pour cette semaine', 'Quel est ton prénom?', "Comment tu t'appelles?", 'Qui es-tu?', 'bye', 'Salut', 'see ya', 'adios', 'cya']
['greeting', 'greeting', 'greeting', 'greeting', 'greeting', 'age', 'age', 'age', 'date', 'date', 'date', 'name', 'name', 'name', 'goodbye', 'goodbye', 'goodbye', 'goodbye', 'goodbye']


Data Processing:

Now that we have separated our data, we are almost ready to train our model. However, neural networks expect numerical values, not words, so we must first convert our data into a form a neural network can read.

To convert our data into numerical values, we will use the "bag of words" technique: each pattern becomes a vector with one entry per vocabulary word, set to 1 if the word occurs in the pattern and 0 otherwise.
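
To make the idea concrete, here is a minimal sketch on a toy example. The five-word vocabulary and the tokenized sentence are made up for illustration and are not the notebook's data:

```python
# toy vocabulary and tokenized sentence, chosen only for illustration
vocab = ["age", "hello", "name", "what", "your"]
tokens = ["what", "is", "your", "name"]

# one entry per vocabulary word: 1 if the word occurs in the sentence, else 0
bow = [1 if word in tokens else 0 for word in vocab]

print(bow)  # [0, 0, 1, 1, 1]
```

Note that "is" is simply ignored because it is not in the vocabulary, and that word order is lost: only the presence of each word is kept.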

In [12]:
# list for the training data
training = []
out_empty = [0] * len(classes)
# build the bag-of-words model
for idx, doc in enumerate(doc_X):
    # tokenize and lemmatize the pattern the same way the vocabulary was built
    tokens = [lemmatizer.lemmatize(token.lower()) for token in nltk.word_tokenize(doc)]
    # 1 if the vocabulary word is one of the pattern's tokens, 0 otherwise
    bow = [1 if word in tokens else 0 for word in words]
    # mark the index of the class associated with the current pattern
    output_row = list(out_empty)
    output_row[classes.index(doc_y[idx])] = 1
    # add the BoW vector and the one-hot encoded class to the training list
    training.append([bow, output_row])
# shuffle the data and convert it into an array
random.shuffle(training)
training = np.array(training, dtype=object)
# separate the features and the target labels
train_X = np.array(list(training[:, 0]))
train_y = np.array(list(training[:, 1]))



Building the Deep Learning neural network:

After converting our data to numerical format, we can now build a neural network model and feed it our training data. The idea is that the model examines the features and predicts the label associated with them; an appropriate response is then selected based on that label.
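
As a rough illustration of what "predicting the label from the features" means, the sketch below runs a forward pass through a tiny two-layer network in plain NumPy. The weights are random and untrained, the hidden size of 8 is arbitrary, and the sizes 44 and 5 are assumed to match this notebook's vocabulary and tags; this is not the Keras model built in the next cells:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # keep positive values, replace negative ones with 0
    return np.maximum(z, 0.0)

def softmax(z):
    # subtract the maximum for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

# 44 vocabulary words and 5 tags (assumed); 8 hidden units is arbitrary
n_features, n_hidden, n_classes = 44, 8, 5

# random, untrained weights: the prediction is meaningless, only the flow matters
w1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

# a random 0/1 vector standing in for a bag-of-words encoding
x = rng.integers(0, 2, size=n_features)

# hidden layer with ReLU, then output layer with softmax
probs = softmax(relu(x @ w1 + b1) @ w2 + b2)

# the predicted label is the index of the highest probability
label_idx = int(np.argmax(probs))

print(probs.shape, label_idx)
```

The softmax output is a vector of probabilities, one per tag, that sums to 1; training adjusts the weights so that the highest probability lands on the correct tag.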

In [13]:
# definition of some parameters
input_shape = (len(train_X[0]),)
output_shape = len(train_y[0])
epochs = 200

In [14]:
# model definition
model = Sequential()
model.add(Dense(128, input_shape=input_shape, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(64, activation="relu"))
model.add(Dropout(0.3))
model.add(Dense(output_shape, activation="softmax"))
# note: the `decay` argument works with the TensorFlow release used here (mid-2022);
# newer Keras optimizers may reject it
adam = tf.keras.optimizers.Adam(learning_rate=0.01, decay=1e-6)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=["accuracy"])



In [15]:
# summary() prints the table itself, so no print() call is needed
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 128)               5760      
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 64)                8256      
                                                                 
 dropout_1 (Dropout)         (None, 64)                0         
                                                                 
 dense_2 (Dense)             (None, 5)                 325       
                                                                 
Total params: 14,341
Trainable params: 14,341
Non-trainable params: 0
_________________________________________________________________
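
The parameter counts in the summary above can be checked by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit. A short sketch, where dense_params is a hypothetical helper written for this check:

```python
def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

vocab_size = 44   # number of words in the vocabulary printed earlier
n_classes = 5     # number of tags

counts = [
    dense_params(vocab_size, 128),  # first hidden layer
    dense_params(128, 64),          # second hidden layer
    dense_params(64, n_classes),    # output layer
]

print(counts, sum(counts))  # [5760, 8256, 325] 14341
```

The Dropout layers contribute no parameters, which is why they show 0 in the table.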


In our sequential model, we used dropout layers (Dropout), which randomly deactivate a fraction of the units during training and are an effective way to limit overfitting in Deep Learning models.
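
For intuition, here is a minimal NumPy sketch of what a dropout layer does during training. It is a simplified illustration, not Keras's actual implementation, and the dropout function below is a hypothetical helper:

```python
import numpy as np

def dropout(activations, rate, rng):
    # keep each unit with probability 1 - rate
    keep = rng.random(activations.shape) >= rate
    # rescale the kept units so the expected total activation is unchanged
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones(10)

# with rate=0.5, each value is either dropped (0.0) or doubled (2.0)
dropped = dropout(activations, rate=0.5, rng=rng)

print(dropped)
```

At prediction time dropout is switched off and the activations pass through unchanged, so the randomness only affects training.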

In [16]:
# train the model
model.fit(x=train_X, y=train_y, epochs=epochs, verbose=1)

Epoch 1/200
...
Epoch 78

<keras.callbacks.History at 0x7f689b8d21f0>

Creation of the chatbot application:

Great! We've trained our Deep Learning model, but now we need to create the functions that let us use it in a chatbot application. For this task, I wrote a set of utility functions.

In [17]:
def clean_text(text):
    # tokenize the text, then lowercase and lemmatize each token
    # so that it matches the vocabulary built during training
    tokens = nltk.word_tokenize(text)
    tokens = [lemmatizer.lemmatize(word.lower()) for word in tokens]
    return tokens

def bag_of_words(text, vocab):
    # encode the text as a 0/1 vector over the vocabulary
    tokens = clean_text(text)
    bow = [0] * len(vocab)
    for w in tokens:
        for idx, word in enumerate(vocab):
            if word == w:
                bow[idx] = 1
    return np.array(bow)

def pred_class(text, vocab, labels):
    # predict the class probabilities for the text
    bow = bag_of_words(text, vocab)
    result = model.predict(np.array([bow]))[0]
    # keep the classes whose probability exceeds the threshold,
    # sorted from most to least likely
    thresh = 0.2
    y_pred = [[idx, res] for idx, res in enumerate(result) if res > thresh]
    y_pred.sort(key=lambda x: x[1], reverse=True)
    return [labels[r[0]] for r in y_pred]

def get_response(intents_list, intents_json):
    # pick a random response from the most likely intent
    tag = intents_list[0]
    for intent in intents_json["intents"]:
        if intent["tag"] == tag:
            return random.choice(intent["responses"])

The next part is simple. We create a while loop that lets the user enter a query, which is then cleaned up: we tokenize it and lemmatize each word. We then convert the text into numerical values with our "bag of words" model and predict which tag best matches these features.
From there, we pick a random response among those listed under that intent's tag and use it to answer the user's query.
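
The selection step can be sketched without the model. In the example below, the probability vector is made up to stand in for a model prediction, and rank_intents and the small responses dictionary are hypothetical names used only for this illustration:

```python
import random

def rank_intents(probs, labels, thresh=0.2):
    # keep (label, probability) pairs above the threshold
    kept = [(label, p) for label, p in zip(labels, probs) if p > thresh]
    # most likely intent first
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in kept]

labels = ["age", "date", "goodbye", "greeting", "name"]
probs = [0.05, 0.05, 0.25, 0.60, 0.05]  # made-up model output

ranked = rank_intents(probs, labels)
print(ranked)  # ['greeting', 'goodbye']

# a random response is drawn from the top-ranked intent
responses = {"greeting": ["Hello", "Salutations!"], "goodbye": ["à plus tard"]}
reply = random.choice(responses[ranked[0]])
print(reply)
```

Only the first intent in the ranked list is used to pick the response; the threshold merely filters out unlikely tags.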

In [18]:
# launch the chatbot (the loop runs until it is interrupted)
while True:
    message = input("")
    intents = pred_class(message, words, classes)
    result = get_response(intents, data)
    print(result)

Comment vas tu?
Salut à toi!
Mon prénom est Miki
Hello
Salut à toi!
Salutations!
Comment vas tu?
Salut à toi!


KeyboardInterrupt: Interrupted by user
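
The loop above only stops when it is interrupted. As a possible variant (a sketch, not the notebook's code), the loop could end when the user types a quit word. The chat function, its parameters, and the words "quit" and "exit" are assumptions introduced for this example; the input and output functions are passed in so the loop can be run with scripted messages:

```python
def chat(respond, read=input, write=print, quit_words=("quit", "exit")):
    # respond: function that maps a user message to a reply
    while True:
        message = read("").strip()
        # leave the loop when the user types a quit word
        if message.lower() in quit_words:
            break
        write(respond(message))

# scripted demo: messages come from a list instead of the keyboard
scripted = iter(["Hello", "quit"])
replies = []
chat(lambda m: "echo: " + m,
     read=lambda prompt: next(scripted),
     write=replies.append)

print(replies)  # ['echo: Hello']
```

In the notebook, respond could be a function that calls get_response(pred_class(message, words, classes), data) on the message.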