# Chatting bot Intent Detection

<b style = "color:red">Important!</b>

**This demo could only be run with Internet Connetion. Because our dataset and part of the model is online**

**Check the `setup.ipynb` to find the modules needed**

Dataset url https://github.com/google-research-datasets/Taskmaster/raw/master/TM-1-2019

In [1]:
__author__ = ["Haolin Pan", "Riade Benbaki"]
__version__ = "École Polytechnique, 2020/3/31"
__data__ = "https://github.com/google-research-datasets/Taskmaster/raw/master/TM-1-2019/self-dialogs.json"

# Set up

In [2]:
# set up
from collections import defaultdict
import numpy as np
import pandas as pd
import requests
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import os
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_hub as hub
tf.compat.v1.enable_eager_execution() #Should be commented for training with tf1 only
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import metrics

from sklearn import preprocessing
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics import f1_score

Using TensorFlow backend.


## Local Modules

Functions of Topic Clustering are show in the demo `Demo_topic_clustering.ipynb` 

The class `Intent_classification` is to form the data set used for the intent detection

The class `Intent_detection` is to realize the models of detecting the  intents of one sentences

In [3]:
from src.topic_clustering_model import Topic_clustering
from src.sub_intents_detection import Intent_classifiction
from src.sub_intents_detection import Intent_detection

# Initialization

In [4]:
IC = Intent_classifiction()

module use loaded
Instructions for updating:
If using Keras pass *_constraint arguments to layers.


Instructions for updating:
If using Keras pass *_constraint arguments to layers.


module elmo loaded


# Data

### Raw Data

In [5]:
IC.data[:2]

[{'conversation_id': 'dlg-00055f4e-4a46-48bf-8d99-4e477663eb23',
  'instruction_id': 'restaurant-table-2',
  'utterances': [{'index': 0,
    'speaker': 'USER',
    'text': "Hi, I'm looking to book a table for Korean fod."},
   {'index': 1,
    'speaker': 'ASSISTANT',
    'text': 'Ok, what area are you thinking about?'},
   {'index': 2,
    'speaker': 'USER',
    'text': 'Somewhere in Southern NYC, maybe the East Village?',
    'segments': [{'start_index': 13,
      'end_index': 49,
      'text': 'Southern NYC, maybe the East Village',
      'annotations': [{'name': 'restaurant_reservation.location.restaurant.accept'}]},
     {'start_index': 13,
      'end_index': 25,
      'text': 'Southern NYC',
      'annotations': [{'name': 'restaurant_reservation.location.restaurant.accept'}]}]},
   {'index': 3,
    'speaker': 'ASSISTANT',
    'text': "Ok, great.  There's Thursday Kitchen, it has great reviews.",
    'segments': [{'start_index': 20,
      'end_index': 35,
      'text': 'Thursday Ki

We can see that in the raw data, there is some phrases which is mark by the intents of the user, like `restaurant_reservation.location`

So we want to use this label to establish a deep learning model to detect the intents of the user given a sentence

We store the intents associated with one sentences in the data structure of `dict` and store them in `Intent_Classification.phrs2inetnts` 

In [6]:
IC.intents_clustering()

Here we set the intent **'other'** to the sentence without any intents in the topic.


<div style= "color : red "><b> Attention</b> </div>
Here, the opening phrase of a conversation is marked as **'other'** too. Because they introduce no specific intents other than the topic which we already know

In [7]:
IC.phrs2intents

{"Hi, I'm looking to book a table for Korean fod.": ['other'],
 'Somewhere in Southern NYC, maybe the East Village?': ['restaurant_reservation.location.restaurant.accept'],
 "That's great. So I need a table for tonight at 7 pm for 8 people. We don't want to sit at the bar, but anywhere else is fine.": ['restaurant_reservation.type.seating',
  'restaurant_reservation.time.reservation',
  'restaurant_reservation.num.guests'],
 'What times are available?': ['other'],
 "Yikes, we can't do those times.": ['other'],
 'Let me check.': ['other'],
 'Lets try Boka, are they free for 8 people at 7?': ['restaurant_reservation.name.restaurant.accept',
  'restaurant_reservation.num.guests.accept',
  'restaurant_reservation.time.reservation.accept'],
 "Great, let's book that.": ['other'],
 "No, that's it, just book.": ['other'],
 'Yes please.': ['other',
  'coffee_ordering.preference.accept',
  'coffee_ordering.preference'],
 'Hi I would like to see if the Movie What Men Want is playing here.': ['mov

# Word Embedding

We embed these sentences with the model "elmo", and store the phrases embedded in a `.json` file

In [8]:
IC.word_embedding()

intents clustering completed!
embedded data loading completed
Embedding Progress:  0.0
Embedding Progress:  0.023909716908951797
Embedding Progress:  0.047819433817903594
Embedding Progress:  0.0717291507268554
Embedding Progress:  0.09563886763580719
Embedding Progress:  0.11954858454475899
Embedding Progress:  0.1434583014537108
Embedding Progress:  0.16736801836266257
Embedding Progress:  0.19127773527161437
Embedding Progress:  0.21518745218056617
Embedding Progress:  0.23909716908951797
Embedding Progress:  0.2630068859984698
Embedding Progress:  0.2869166029074216
Embedding Progress:  0.3108263198163734
Embedding Progress:  0.33473603672532515
Embedding Progress:  0.358645753634277
Embedding Progress:  0.38255547054322875
Embedding Progress:  0.4064651874521806
Embedding Progress:  0.43037490436113235
Embedding Progress:  0.4542846212700842
Embedding Progress:  0.47819433817903595
Embedding Progress:  0.5021040550879877
Embedding Progress:  0.5260137719969395
Embedding Progress: 

Finally, we get the data we need to build the model of intent detection, a `p2i` mapping the sentences to its intents and `p2v` mapping the intentces to the its vector representation

In [9]:
p2i, p2v = IC.get_embedded_data_and_clustered_data()

# Modeling

## Initialization the models

In [10]:
ID = Intent_detection(p2i, p2v)

module elmo loaded


## Training

Now we want to train a 0-1 classfication models for **every topic $\times$ every intents in the topic**

The Input is the vector representation of the sentence and the output is bool value whether the topic contains the 

Firstly, we form a well balanced development set:

**50% positive samples**

**25% negative samples with wrong intents**

**25% negative samples without any intents (intent = 'other')**

For example: 

In [11]:
ID.get_data('restaurant', 'time')

We get: 
2257  postive sample
1129  sample of wrong intents
1128  sample of no intents


(array([[ 0.47903794,  0.19902727,  0.09933247, ..., -0.11180887,
          0.04152172, -0.06517152],
        [ 0.07155436, -0.2274629 , -0.08247091, ..., -0.27136704,
          0.2191311 ,  0.09496994],
        [ 0.21138269, -0.4100072 ,  0.2637415 , ..., -0.06332628,
         -0.08134664, -0.09446161],
        ...,
        [ 0.37168708,  0.2541787 , -0.31583005, ..., -0.08804071,
         -0.02512026,  0.22882295],
        [-0.14834785,  0.06300254, -0.06442624, ..., -0.12333031,
          0.21194746,  0.22607255],
        [-0.13581023,  0.04395352, -0.12871231, ..., -0.00336048,
          0.17237607,  0.2781555 ]]), array([0, 1, 1, ..., 0, 0, 0]))

After a simple early_stop research, we find the best early stop epoch within 1 to 10, and then store the best model with the highest `f1_sscore` on the test set

In [58]:
ID.train_model()

Now training the classsfier for topic:  auto  ; intent:  name
Input: str; Output: boolean(if the str contents the intent:  name  ).
----------------------------------------------------------------
We get: 
2616  postive sample
1309  sample of wrong intents
1308  sample of no intents
data_loaded!
Train on 4186 samples
f1_score on dev set: 
f1_macro:  0.9655209004156664
f1_micro:  0.9656160458452722
f1_weighted:  0.9655710680058222
Train on 4186 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9741848111461271
f1_micro:  0.9742120343839542
f1_weighted:  0.9742080309666266
Train on 4186 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.973233459453932
f1_micro:  0.9732569245463228
f1_weighted:  0.9732554106693944
Train on 4186 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.9760907592040301
f1_micro:  0.9761222540592168
f1_weighted:  0.9761147947514094
Train on 4186 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoc

Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.975132640533193
f1_micro:  0.9751671442215855
f1_weighted:  0.9751582971219976

Now training the classsfier for topic:  auto  ; intent:  year
Input: str; Output: boolean(if the str contents the intent:  year  ).
----------------------------------------------------------------
We get: 
964  postive sample
483  sample of wrong intents
482  sample of no intents
data_loaded!
Train on 1543 samples
f1_score on dev set: 
f1_macro:  0.9348127132097139
f1_micro:  0.9352331606217616
f1_weighted:  0.9348940901281748
Train on 1543 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9870395863412014
f1_micro:  0.9870466321243523
f1_weighted:  0.9870442835299689
Train on 1543 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9766762673971279
f1_micro:  0.9766839378238342
f1_weighted:  0.9766828420485906
Train on 1543 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.984

Epoch 8/8
f1_score on dev set: 
f1_macro:  0.9660442073170732
f1_micro:  0.9660493827160492
f1_weighted:  0.9660429134673291
Train on 2588 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9598578059031517
f1_micro:  0.9598765432098766
f1_weighted:  0.9598551291450481

Now training the classsfier for topic:  auto  ; intent:  date
Input: str; Output: boolean(if the str contents the intent:  date  ).
----------------------------------------------------------------
We get: 
956  postive sample
479  sample of wrong intents
478  sample of no intents
data_loaded!
Train on 1530 samples
f1_score on dev set: 
f1_macro:  0.8082150237324475
f1_micro:  0.814621409921671
f1_weighted:  0.8101369395892146
Train on 1530 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9109594661123571
f1_micro:  0.9112271540469974
f1_weighted:  0.9112271540469974
Train on 1530 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score

Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9372158333845806
f1_micro:  0.937246963562753
f1_weighted:  0.9372101733521856
Train on 1975 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.9230315057484461
f1_micro:  0.9230769230769231
f1_weighted:  0.9230239361936999
Train on 1975 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9433114754098362
f1_micro:  0.9433198380566802
f1_weighted:  0.9433086878608881

Now training the classsfier for topic:  coffee  ; intent:  location
Input: str; Output: boolean(if the str contents the intent:  location  ).
----------------------------------------------------------------
We get: 
1024  postive sample
513  sample of wrong intents
512  sample of no intents
data_loaded!
Train on 1639 samples
f1_score on dev set: 
f1_macro:  0.9558918453704368
f1_micro:  0.9560975609756097
f1_weighted:  0.956024

Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.7425988700564973
f1_micro:  0.746268656716418
f1_weighted:  0.7448924867189477
Train on 266 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.34951456310679613
f1_micro:  0.5373134328358209
f1_weighted:  0.37559773945804953
Train on 266 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.7761194029850746
f1_micro:  0.7761194029850746
f1_weighted:  0.7761194029850746
Train on 266 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.7161649944258639
f1_micro:  0.7164179104477613
f1_weighted:  0.7155327043711209

Now training the classsfier for topic:  coffee  ; intent:  name
Input: str; Output: boolean(if the str contents the intent:  name  ).
----------------------------------------------------------------
We get: 
1998  p

Train on 3197 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9222648511748037
f1_micro:  0.9225
f1_weighted:  0.9224679342511095
Train on 3197 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9211463814168306
f1_micro:  0.92125
f1_weighted:  0.9212821574913286
Train on 3197 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.32260795935647757
f1_micro:  0.47625
f1_weighted:  0.3072840812870449
Train on 3197 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.32260795935647757
f1_micro:  0.47625
f1_weighted:  0.3072840812870449

Now training the classsfier for topic:  coffee  ; intent:  type
Input: str; Output: boolean(if the str contents the intent:  type  ).
------------------------------------------------------------

f1_score on dev set: 
f1_macro:  0.862048125035138
f1_micro:  0.8622516556291391
f1_weighted:  0.8621393628876212
Train on 3018 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.33713784021071114
f1_micro:  0.5086092715231788
f1_weighted:  0.34294286262493534
Train on 3018 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.8688290697735637
f1_micro:  0.8688741721854305
f1_weighted:  0.8688709505845829
Train on 3018 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.8688290697735637
f1_micro:  0.8688741721854305
f1_weighted:  0.8688709505845829
Train on 3018 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.8660409211564171
f1_micro:  0.866225165562914
f1_weighted:  0.8661264632022905
Train on 3018 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch

f1_micro:  0.958204334365325
f1_weighted:  0.9581100070751634
Train on 2580 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.9485155783967327
f1_micro:  0.9489164086687306
f1_weighted:  0.9487827985780646
Train on 2580 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9610177387391954
f1_micro:  0.9613003095975232
f1_weighted:  0.96121296951404
Train on 2580 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9563736698407186
f1_micro:  0.9566563467492261
f1_weighted:  0.9565802414277048
Train on 2580 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9610380524333726
f1_micro:  0.9613003095975232
f1_weighted:  0.9612260858718202
Train on 2580 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.9564723548436778
f1_micro:  0.95665

Train on 1863 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9333139451684231
f1_micro:  0.9334763948497854
f1_weighted:  0.933342197286921
Train on 1863 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.9333410239254321
f1_micro:  0.9334763948497854
f1_weighted:  0.9333668088634041
Train on 1863 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9355795992848321
f1_micro:  0.9356223175965666
f1_weighted:  0.9355938387220769
Train on 1863 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9398432457353619
f1_micro:  0.9399141630901288
f1_weighted:  0.9398609750740535
Train on 1863 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9420277835372175
f1_micro:  0.9420600858369099
f1_weighted:  0.9420395298280148
Train on 1863 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoc

Train on 1892 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9039541904130173
f1_micro:  0.904862579281184
f1_weighted:  0.9041714138380137
Train on 1892 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9491598294579198
f1_micro:  0.9492600422832981
f1_weighted:  0.9492123218902608
Train on 1892 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.3384615384615385
f1_micro:  0.5116279069767442
f1_weighted:  0.34633273703041145
Train on 1892 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9619407788863856
f1_micro:  0.9619450317124736
f1_weighted:  0.9619501351037791
Train on 1892 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.972491757369158
f1_micro:  0.9725158562367865
f1_weighted:  0.9725106921937233
Train on 1892 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1

We get: 
2422  postive sample
1212  sample of wrong intents
1211  sample of no intents
data_loaded!
Train on 3876 samples
f1_score on dev set: 
f1_macro:  0.8532790960646122
f1_micro:  0.8544891640866874
f1_weighted:  0.8529628282861152
Train on 3876 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.8873011757399991
f1_micro:  0.8875128998968008
f1_weighted:  0.8871852315588934
Train on 3876 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9050340031702204
f1_micro:  0.9050567595459237
f1_weighted:  0.905068896279632
Train on 3876 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.8978171464726352
f1_micro:  0.8978328173374613
f1_weighted:  0.8978471822968852
Train on 3876 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9050340031702204
f1_micro:  0.9050567595459237
f1_weighted:  0.9049991100608087
Train on 3876 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoc

Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9498125664410004
f1_micro:  0.9498327759197325
f1_weighted:  0.9498159346874556

Now training the classsfier for topic:  movie  ; intent:  price
Input: str; Output: boolean(if the str contents the intent:  price  ).
----------------------------------------------------------------
We get: 
45  postive sample
23  sample of wrong intents
22  sample of no intents
data_loaded!
Train on 72 samples
f1_score on dev set: 
f1_macro:  0.4581939799331104
f1_micro:  0.5
f1_weighted:  0.4247491638795987
Train on 72 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.8285714285714285
f1_micro:  0.8333333333333334
f1_weighted:  0.834920634920635
Train on 72 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.8036363636363637
f1_micro:  0.8333333333333334
f1_weighted:  0.8206060606060606
Train on 72 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.766233

Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.6625000000000001
f1_micro:  0.6666666666666666
f1_weighted:  0.6541666666666667

Now training the classsfier for topic:  pizza  ; intent:  size
Input: str; Output: boolean(if the str contents the intent:  size  ).
----------------------------------------------------------------
We get: 
1492  postive sample
747  sample of wrong intents
746  sample of no intents
data_loaded!
Train on 2388 samples
f1_score on dev set: 
f1_macro:  0.8943763427271578
f1_micro:  0.8944723618090452
f1_weighted:  0.8943923459074724
Train on 2388 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9261238356657517
f1_micro:  0.9262981574539364
f1_weighted:  0.9261418689541846
Train on 2388 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.929504048582996
f1_micro:  0.9296482412060302
f1_weighted:  0.9295200699855553
Train on 2388 samples
Epoch 1/4
Epoch 2/4
Epoch 

Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.915903890160183
f1_micro:  0.9166666666666666
f1_weighted:  0.916094584286804
Train on 670 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9402645427392975
f1_micro:  0.9404761904761905
f1_weighted:  0.9403492018340546

Now training the classsfier for topic:  pizza  ; intent:  preference
Input: str; Output: boolean(if the str contents the intent:  preference  ).
----------------------------------------------------------------
We get: 
943  postive sample
472  sample of wrong intents
471  sample of no intents
data_loaded!
Train on 1508 samples
f1_score on dev set: 
f1_macro:  0.8411586732406993
f1_micro:  0.8412698412698413
f1_weighted:  0.841581111751439
Train on 1508 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.8719801309550688
f1_micro:  0.8730158730158731
f1_weighted:  0.8731377250230263


Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.8909938569548047
f1_micro:  0.8910133843212237
f1_weighted:  0.8909659607170634
Train on 4184 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.888904759288762
f1_micro:  0.8891013384321224
f1_weighted:  0.8889941134448349
Train on 4184 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.8792869102006572
f1_micro:  0.8795411089866156
f1_weighted:  0.8791809940398412

Now training the classsfier for topic:  pizza  ; intent:  type
Input: str; Output: boolean(if the str contents the intent:  type  ).
----------------------------------------------------------------
We get: 
2907  postive sample
1453  sample of wrong intents
1453  sample of no intents
data_loaded!
Train on 4650 samples
f1_score on dev set: 
f1_macro:  0.9096777867932748
f1_micro:

Train on 3611 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9612395495178399
f1_micro:  0.9612403100775194
f1_weighted:  0.9612420213367984
Train on 3611 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9568104193621575
f1_micro:  0.9568106312292359
f1_weighted:  0.9568117964981672
Train on 3611 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.9501658684947971
f1_micro:  0.9501661129568106
f1_weighted:  0.9501674574978852
Train on 3611 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9490536324031555
f1_micro:  0.9490586932447398
f1_weighted:  0.9490463222986449

Now training the classsfier for topic:  restaurant  ; intent:  location
Input: str; Output: boolean(if the str contents the intent:  location  ).
-----

Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9499861072520144
f1_micro:  0.9500000000000001
f1_weighted:  0.9500083356487914
Train on 1198 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9465145302192906
f1_micro:  0.9466666666666667
f1_weighted:  0.9465905984429788
Train on 1198 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9465145302192906
f1_micro:  0.9466666666666667
f1_weighted:  0.9465905984429788
Train on 1198 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.943150784184771
f1_micro:  0.9433333333333332
f1_weighted:  0.9432366896664475
Train on 1198 samples
Epoch 1/9
Epoch 2/9
Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9398690481493029
f1_micro:  0.94
f1_weighted:  0.9399438777782727

Now training the clas

Train on 2804 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.8901247208691173
f1_micro:  0.8901569186875893
f1_weighted:  0.8900254442621627
Train on 2804 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.8987128219217111
f1_micro:  0.8987161198288159
f1_weighted:  0.8987433275624312
Train on 2804 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9228690889530051
f1_micro:  0.9229671897289586
f1_weighted:  0.9230142781014163
Train on 2804 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9141422389156528
f1_micro:  0.9144079885877318
f1_weighted:  0.9143943603994201
Train on 2804 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8
f1_score on dev set: 
f1_macro:  0.9157789918892556
f1_micro:  0.9158345221112696
f1_weighted:  0.9158931373456176
Train on 2804 s

Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9158688986240109
f1_micro:  0.9159021406727829
f1_weighted:  0.915858670301312
Train on 2614 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.9097384506340485
f1_micro:  0.9097859327217125
f1_weighted:  0.9097257887440047
Train on 2614 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9266048181954212
f1_micro:  0.926605504587156
f1_weighted:  0.9266034454119517
Train on 2614 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9189282114013195
f1_micro:  0.918960244648318
f1_weighted:  0.9189183550176279
Train on 2614 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  0.9327059791962883
f1_micro:  0.9327217125382263
f1_weighted:  0.932699685859513
Train on 2614 samples
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8

Train on 632 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.7969572607055507
f1_micro:  0.8037974683544303
f1_weighted:  0.7964855222470072
Train on 632 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.8531064311411132
f1_micro:  0.8544303797468354
f1_weighted:  0.85292990466035
Train on 632 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.8921934577563717
f1_micro:  0.8924050632911392
f1_weighted:  0.8921329990321524
Train on 632 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9303546383490282
f1_micro:  0.930379746835443
f1_weighted:  0.9303378993580851
Train on 632 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.904876580373269
f1_micro:  0.9050632911392406
f1_weighted:  0.9048232344401345
Train on 632 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7
f1_score on dev set: 
f1_macro:  

data_loaded!
Train on 3374 samples
f1_score on dev set: 
f1_macro:  0.9502369668246445
f1_micro:  0.9502369668246445
f1_weighted:  0.9502369668246445
Train on 3374 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9549740277972765
f1_micro:  0.9549763033175356
f1_weighted:  0.9549664427297461
Train on 3374 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9715510412745852
f1_micro:  0.9715639810426541
f1_weighted:  0.9715654187946617
Train on 3374 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  0.9739264751312944
f1_micro:  0.9739336492890995
f1_weighted:  0.9739367239281588
Train on 3374 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
f1_score on dev set: 
f1_macro:  0.9703741145581039
f1_micro:  0.9703791469194313
f1_weighted:  0.9703832643059718
Train on 3374 samples
Epoch 1/6
Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6
f1_score on dev set: 
f1_macro:  0.9762925343954878
f1_micro:  0.976303317535545

Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.9197801617079744
f1_micro:  0.9202898550724637
f1_weighted:  0.9201508477912393

Now training the classsfier for topic:  uber  ; intent:  duration
Input: str; Output: boolean(if the str contents the intent:  duration  ).
----------------------------------------------------------------
We get: 
20  postive sample
11  sample of wrong intents
10  sample of no intents
data_loaded!
Train on 32 samples
f1_score on dev set: 
f1_macro:  0.25
f1_micro:  0.3333333333333333
f1_weighted:  0.16666666666666666
Train on 32 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.41558441558441556
f1_micro:  0.4444444444444444
f1_weighted:  0.37229437229437223
Train on 32 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.55
f1_micro:  0.5555555555555556
f1_weighted:  0.5666666666666667
Train on 32 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/4
f1_score on dev set: 
f1_macro:  1.0
f1_micro:  1.0
f1_

Epoch 3/9
Epoch 4/9
Epoch 5/9
Epoch 6/9
Epoch 7/9
Epoch 8/9
Epoch 9/9
f1_score on dev set: 
f1_macro:  0.7749999999999999
f1_micro:  0.7777777777777778
f1_weighted:  0.7833333333333334

Now training the classsfier for topic:  uber  ; intent:  type
Input: str; Output: boolean(if the str contents the intent:  type  ).
----------------------------------------------------------------
We get: 
1644  postive sample
822  sample of wrong intents
822  sample of no intents
data_loaded!
Train on 2630 samples
f1_score on dev set: 
f1_macro:  0.901215577312611
f1_micro:  0.9012158054711246
f1_weighted:  0.9012219657509922
Train on 2630 samples
Epoch 1/2
Epoch 2/2
f1_score on dev set: 
f1_macro:  0.9269193391642372
f1_micro:  0.9270516717325228
f1_weighted:  0.9270516717325228
Train on 2630 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3
f1_score on dev set: 
f1_macro:  0.9255179806090791
f1_micro:  0.925531914893617
f1_weighted:  0.9255613317165303
Train on 2630 samples
Epoch 1/4
Epoch 2/4
Epoch 3/4
Epoch 4/

### f1 score on test_set

In [59]:
for t in ID.best_f1.keys():
    print(128 * "=")
    print("In topic ", t)
    for st in ID.best_f1[t].keys():
        print("\t", st, ": %.3f"%ID.best_f1[t][st])

In topic  auto
	 name : 0.980
	 year : 0.987
	 reason : 0.966
	 date : 0.950
In topic  coffee
	 size : 0.943
	 location : 0.968
	 num : 0.776
	 name : 0.927
	 type : 0.944
	 preference : 0.869
In topic  movie
	 time : 0.964
	 location : 0.944
	 num : 0.972
	 name : 0.924
	 type : 0.956
	 price : 0.829
In topic  pizza
	 size : 0.953
	 location : 0.940
	 preference : 0.882
	 name : 0.891
	 type : 0.941
In topic  restaurant
	 time : 0.961
	 location : 0.950
	 num : 0.963
	 name : 0.923
	 type : 0.933
In topic  uber
	 time : 0.937
	 location : 0.976
	 num : 0.920
	 duration : 1.000
	 type : 0.931


### Early stop epochs

In [60]:
for t in ID.best_f1.keys():
    print(128 * "=")
    print("In topic ", t)
    for st in ID.best_f1[t].keys():
        print("\t", st, ": %d"%ID.best_epoch[t][st])

In topic  auto
	 name : 6
	 year : 2
	 reason : 7
	 date : 9
In topic  coffee
	 size : 9
	 location : 4
	 num : 8
	 name : 5
	 type : 9
	 preference : 6
In topic  movie
	 time : 9
	 location : 8
	 num : 6
	 name : 9
	 type : 8
	 price : 2
In topic  pizza
	 size : 6
	 location : 9
	 preference : 8
	 name : 7
	 type : 5
In topic  restaurant
	 time : 6
	 location : 5
	 num : 8
	 name : 6
	 type : 7
In topic  uber
	 time : 9
	 location : 6
	 num : 9
	 duration : 4
	 type : 7


# Sentences Demonstration

In [5]:
ID = Intent_detection()
ID.load_model()

test_phrases = np.array([["Get me a restaurant in paris!"], ["This Evening at 8pm"], ["I want this"],
                    ["We have totally 4"], ["My dad, mom, me and my sister"], ["Do you have some recommends"],
                    ["Thank you! Bye"]])

t = 'restaurant'
print(128 * "=")
for phr in test_phrases:
    intent = []
    res = ID.predict( phr[0] , t  )
    sub_topic_array = list( ID.topic2sub_topic[t].keys() )
    print(phr[0])
    for st in sub_topic_array:
        if res[ ID.topic2sub_topic[t][st] - 1] > 0.4:
            intent.append(st)

    if len(intent) > 0:
        print("Detected intents: ", end =" ")
        for i in intent:
            print(i)
    else:
        print("No intent detected")
    print(128*"-")
print(128 * "=")

module elmo loaded
Loading models of topic:  auto
Loaded weights for model auto : name
Loaded weights for model auto : year
Loaded weights for model auto : reason
Loaded weights for model auto : date
Loading models of topic:  coffee
Loaded weights for model coffee : size
Loaded weights for model coffee : location
Loaded weights for model coffee : num
Loaded weights for model coffee : name
Loaded weights for model coffee : type
Loaded weights for model coffee : preference
Loading models of topic:  movie
Loaded weights for model movie : time
Loaded weights for model movie : location
Loaded weights for model movie : num
Loaded weights for model movie : name
Loaded weights for model movie : type
Loaded weights for model movie : price
Loading models of topic:  pizza
Loaded weights for model pizza : size
Loaded weights for model pizza : location
Loaded weights for model pizza : preference
Loaded weights for model pizza : name
Loaded weights for model pizza : type
Loading models of topic:  re

# Conversation Demonstration

In [6]:
TCM = Topic_clustering()

module use loaded
module elmo loaded
module nnlm loaded
Gnews Swivel loaded
Loaded weights for model elmo
Loaded weights for model use


In [7]:
class Demo_Conversation(object):
    def __init__(self, model_cl, model_id):
        self.model_cl = model_cl
        self.model_id = model_id
        self.classes_arr = ['auto', 'coffee', 'movie', 'non-opening', 'pizza','restaurant','uber']
        self.opening_response = {"auto" : "\tIt seems you want to repair your vehicle", \
                            "coffee" : "\tIt seems you want to order some coffee", \
                            "movie" : "\tIt seems you want to book movie tickets", \
                            "pizza" : "\tIt seems you want to order some pizza",\
                            "restaurant": "\tIt seems you want to book a table",\
                            "uber": "\tIt seems you want to take ride"}

    def inConversationDetected(self, embed = "elmo"):
        
        print("Conversation Test")
        print("Still working on it, not completed")
        print("Input \'~\' to stop")
        print(128 * "=")
        print("\tHello, what can I do for you?")
        print(128 * "-")
        userInput = ""
        current_topic = "non-opening";
        pre_topic = "non-opening"
        userInput = input()
        prediction = self.model_cl.prediction(embed, [userInput])[0]
        current_topic = prediction
        
        while current_topic == "non-opening":
            print("\t Sorry I can't understand what you mean")
            print(128 * "-")
            prediction = self.model_cl.prediction(embed, [userInput])[0]
            current_topic = prediction
            userInput = input()
        
        pre_topic = current_topic
        print("\ttopic predicted: ", current_topic)
        print("\tentering the scenario ", current_topic)
        print(self.opening_response[current_topic])
        
        
        while (userInput != "~"):  
            prediction = self.model_cl.prediction(embed, [userInput])[0]
            if prediction == "non-opening" or prediction == current_topic:
                
                pred_intents = self.model_id.predict( userInput , current_topic )
                sub_topic_array = list( self.model_id.topic2sub_topic[t].keys() )
                for st in sub_topic_array:
                    if pred_intents[ self.model_id.topic2sub_topic[t][st] - 1] > 0.4:
                        intent.append(st)

                if len(intent) > 0:
                    print("\tDetected intents: ", end =" ")
                    for i in intent:
                        print(i, end  = " ")
                else:
                    print("\tNo intent detected") 
                userInput = input()
                continue
            
            print(128 * "-")
            current_topic = prediction
            print("\ttopic predicted: ", current_topic)
            print("\tentering the scenario ", current_topic)
            print(self.opening_response[current_topic])
            userInput = input()
        
        print("Goodbye, thank you for your attention")

In [8]:
DC = Demo_Conversation(TCM, ID)

In [9]:
DC.inConversationDetected()


Conversation Test
Still working on it, not completed
Input '~' to stop
	Hello, what can I do for you?
--------------------------------------------------------------------------------------------------------------------------------
I want a ride
	topic predicted:  uber
	entering the scenario  uber
	It seems you want to take ride
	Detected intents:  type i am at fifth avenue
	Detected intents:  type location I need it for 2 people
--------------------------------------------------------------------------------------------------------------------------------
	topic predicted:  pizza
	entering the scenario  pizza
	It seems you want to order some pizza
I am very hungry
	Detected intents:  type location I need it at 6 am
	Detected intents:  type location location 

KeyboardInterrupt: 