--- 
--- 
# Chatbot
--- 
--- 

This notebook brings together the models built in the other notebooks in order to run our chatbot.

---
## Importing libraries and modules
---

In [1]:
import keras
import tensorflow as tf 
import pandas as pd
import random
import re
import numpy as np
import csv
import pickle
import spacy
import nltk
from nltk.stem import SnowballStemmer
from nltk.tokenize import word_tokenize

Using TensorFlow backend.


In [2]:
!pip install --upgrade tensorflow

Requirement already up-to-date: tensorflow in /usr/local/lib/python3.6/dist-packages (2.1.0)


In [3]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


---
## Components of the generative model
---


In [4]:
model =  tf.keras.models.load_model('/content/gdrive/My Drive/generative_model.h5')



In [0]:
char2idx = pickle.load(open('/content/gdrive/My Drive/lst_char.pkl', "rb"))
idx2char = pickle.load(open('/content/gdrive/My Drive/lst_idx.pkl', "rb"))
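
As a rough illustration of what the two pickled lookup objects are expected to contain, the sketch below builds a character-to-index mapping and its inverse from a tiny made-up string. The names `demo_text`, `demo_char2idx` and `demo_idx2char` are hypothetical; the real vocabulary comes from the training notebook and is not shown here.

```python
# Hypothetical training text; the real corpus lives in another notebook
demo_text = "bonjour le monde"

# Sorted list of distinct characters: position in the list = index
demo_idx2char = sorted(set(demo_text))

# Reverse lookup: character -> index
demo_char2idx = {char: idx for idx, char in enumerate(demo_idx2char)}

# Encode a string to indices, then decode it back
encoded = [demo_char2idx[char] for char in "bonjour"]
decoded = "".join(demo_idx2char[idx] for idx in encoded)
print(encoded)
print(decoded)
```

If the mapping is a plain dictionary like this one, a character that was never seen during training raises a `KeyError` at encoding time, so user input may need to be filtered first.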

In [0]:
def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 100

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty string to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  # Experiment to find the best setting.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # use a categorical distribution to sample the next character predicted by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
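
To make the effect of `temperature` more concrete, here is a minimal pure-Python sketch of how dividing the scores by the temperature reshapes the sampling distribution. The function name `softmax_with_temperature` and the example scores are assumptions made for illustration; `generate_text` itself relies on `tf.random.categorical`, which samples from the scaled scores directly.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing them by the temperature."""
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - highest) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]

# Hypothetical scores for three candidate characters
demo_logits = [2.0, 1.0, 0.1]

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(demo_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

A temperature below 1 concentrates the probability mass on the top-scoring character, while a temperature above 1 flattens the distribution.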

---
## Components of the business-question classification model
---


In [0]:
scrapping = pd.read_csv("/content/gdrive/My Drive/scraping.csv", sep=';', header=0)

In [0]:
vectorizer_bin = pickle.load(open('/content/gdrive/My Drive/vectorizer_bin.pkl', "rb"))
model_bin =  tf.keras.models.load_model('/content/gdrive/My Drive/model_bin.h5')

In [0]:
vectorizer_multi = pickle.load(open('/content/gdrive/My Drive/vectorizer_multi.pkl', "rb"))
classifier_multilog = pickle.load(open('/content/gdrive/My Drive/model_multilog.pkl', "rb"))

In [10]:
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [11]:
!python -m spacy download fr_core_news_sm

✔ Download and installation successful
You can now load the model via spacy.load('fr_core_news_sm')


In [0]:
stemmer = SnowballStemmer('french')
sw = nltk.corpus.stopwords.words("french")
sw.extend(["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"])
sw.extend(["dans","si","même","mêmes","tous","tout","toutes","puis","depuis","lors","votre","notre","ainsi","ici",
          "prendre","avoir","rendre","être"])
nlp = spacy.load('fr_core_news_sm')

In [0]:
corpus_question = ["qu'","c'est",'comment', 'pourquoi', 'quoi', 'depuis', 'combien', 'quand', 'où', 'est ce que', 'lequel', 'laquelle', 'lesquels', 'quel', 'quelle', 'quels', 'quelles', "qu'est ce que", 'êtes', 'je', 'tu', 'il', 'elle', 'nous', 'vous', 'ils']

def mots_vides(phrase,sw=sw):
    liste_mots = []
    for mot in phrase.split():
        if mot not in sw :
            liste_mots.append(mot)
    return " ".join(liste_mots)

def lemmatise_text(text):
    return " ".join([token.lemma_ for token in nlp(text)])

def mots_question(phrase,corpus=corpus_question):
    liste_mots = []
    for mot in phrase.split():
        if mot not in corpus :
            liste_mots.append(mot)
    return " ".join(liste_mots)
  
def nettoyage(question):
    q = question.lower()
    q = re.sub(r'[^\w\s]',' ',q)
    q = re.sub(r'\d+',' ',q)
    q = mots_vides(q)
    q = mots_question(q)
    q = lemmatise_text(q)
    return  q

In [0]:
def choix_question(theme, user_question):
  # Candidate answers scraped for the predicted theme
  theme_reponses = scrapping[scrapping.theme==theme].reponse
  doc1 = nlp(nettoyage(user_question))
  # spaCy similarity between the cleaned question and each cleaned answer
  list_similarity = [doc1.similarity(nlp(nettoyage(reponse))) for reponse in theme_reponses]
  result = pd.DataFrame({'similarité' : list_similarity, "reponse":theme_reponses})
  # Return the answer with the highest similarity score
  idx = result.similarité.idxmax(axis = 0)
  return result.reponse[idx]
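
The retrieval idea (score every candidate answer against the question and keep the best one) can be sketched independently of spaCy. The example below swaps spaCy's `Doc.similarity` for a plain bag-of-words cosine similarity, so it is a stand-in for the technique rather than the notebook's actual scoring; `cosine_similarity`, `best_answer` and the sample answers are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_answer(question, answers):
    """Return the candidate answer most similar to the question."""
    return max(answers, key=lambda answer: cosine_similarity(question, answer))

# Hypothetical cleaned answers for one theme
demo_answers = ["livraison sous dix jours", "retour gratuit en magasin"]
print(best_answer("délai livraison", demo_answers))
```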

In [0]:
def chatbot():
  question = str(input("Bonjour, je suis le chatbot :)\n"))
  # The loop ends only when the input is exactly "au revoir"
  while question != "au revoir":
    question_vect  = vectorizer_bin.transform([question])
    result = model_bin.predict(question_vect.toarray())

    # Business question: extract an answer from the knowledge base
    if result > 0.5 :
      # Find the theme of the question
      question_vectmulti = vectorizer_multi.transform([question])
      theme = classifier_multilog.predict(question_vectmulti)
      print(choix_question(theme[0],question),'\n')

    # Otherwise: generate an original answer with the text-generation model
    else:
      gen = generate_text(model, start_string=question)
      # Keep the segment between the first and second sentence-ending marks;
      # if there is no such mark, fall back to the text generated after the question
      parts = re.split(r'[.?!]',gen)
      rep = parts[1] if len(parts) > 1 else gen[len(question):]
      print(rep.strip(),'\n')

    question = str(input())
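
The routing logic of `chatbot()` can be isolated and tested without loading any model. The sketch below is only an illustration under stated assumptions: `route` and the three `fake_*` functions are hypothetical stand-ins for `model_bin`, `choix_question` and `generate_text`, and the 0.5 threshold is the one used in the notebook.

```python
def route(question, score_fn, retrieve_fn, generate_fn, threshold=0.5):
    """Send business questions to retrieval and everything else to generation."""
    if score_fn(question) > threshold:
        return ("retrieval", retrieve_fn(question))
    return ("generation", generate_fn(question))

# Hypothetical stand-ins for the real models
def fake_score(question):
    # Pretend that any question mentioning delivery is a business question
    return 0.9 if "livraison" in question.lower() else 0.1

def fake_retrieve(question):
    return "Réponse issue de la base"

def fake_generate(question):
    return "Réponse générée"

print(route("Comment suivre ma livraison ?", fake_score, fake_retrieve, fake_generate))
print(route("Bonjour", fake_score, fake_retrieve, fake_generate))
```

Passing the scoring, retrieval and generation steps as functions keeps the decision rule separate from the models, which makes it easier to check on its own.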


In [0]:
import warnings
warnings.filterwarnings("ignore")

In [17]:
chatbot()

Bonjour, je suis le chatbot :)
Bonjour
Dis-le-Fräuls, le soleil opre nuit pour ux 

Comment ca va ? 
Le choue soit loin 

Comment se faire livrer ? 
Lors de la réception de votre commande, vous devez vérifier l'état de vos meubles en présence des livreurs. Toute anomalie constatée doit être spécifiée sur le bon de livraison. Pas d'inquiétude cependant, vous disposez d'un délai supplémentaire de 72h pour nous signaler toute avarie sur votre livraison. Pour effectuer votre demande : Si vous n'avez pas de compte client ou si vous avez effectué une commande en magasin et qu'elle n'est pas rattachée à votre compte, contactez le Service Client par ici.  

au revoir 
Vous fous tuois 

au revoir
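
In the session above, the first `au revoir` appears to have been typed with a trailing space, so the exact string comparison in `chatbot()` did not match and the bot generated a reply instead of stopping. One possible way to make the exit check more tolerant is sketched below; `is_goodbye` is a hypothetical helper and not part of the original notebook.

```python
def is_goodbye(text, farewell="au revoir"):
    """Return True when the input is the farewell phrase, ignoring case and outer whitespace."""
    return text.strip().lower() == farewell

print(is_goodbye("au revoir "))
print(is_goodbye("Au revoir"))
print(is_goodbye("Bonjour"))
```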
