 # Contextual Assistant (Chatbot)
- In this project, a chatbot is developed to converse with users about movies, help them find information on movies, and choose a movie to watch.
- The chatbot is built on the "IMDB's movies" dataset, which contains 28 features for 5043 movies. This project uses a subset of the dataset's columns (features): movie_title, director_name, genres, imdb_score, title_year, color, gross and budget.

In [1]:
import pandas as pd

results = pd.read_csv("movie_metadata.csv", encoding="utf-8") #Read data
results.head()

Unnamed: 0,color,director_name,num_critic_for_reviews,duration,director_facebook_likes,actor_3_facebook_likes,actor_2_name,actor_1_facebook_likes,gross,genres,...,num_user_for_reviews,language,country,content_rating,budget,title_year,actor_2_facebook_likes,imdb_score,aspect_ratio,movie_facebook_likes
0,Color,James Cameron,723.0,178.0,0.0,855.0,Joel David Moore,1000.0,760505847.0,Action|Adventure|Fantasy|Sci-Fi,...,3054.0,English,USA,PG-13,237000000.0,2009.0,936.0,7.9,1.78,33000
1,Color,Gore Verbinski,302.0,169.0,563.0,1000.0,Orlando Bloom,40000.0,309404152.0,Action|Adventure|Fantasy,...,1238.0,English,USA,PG-13,300000000.0,2007.0,5000.0,7.1,2.35,0
2,Color,Sam Mendes,602.0,148.0,0.0,161.0,Rory Kinnear,11000.0,200074175.0,Action|Adventure|Thriller,...,994.0,English,UK,PG-13,245000000.0,2015.0,393.0,6.8,2.35,85000
3,Color,Christopher Nolan,813.0,164.0,22000.0,23000.0,Christian Bale,27000.0,448130642.0,Action|Thriller,...,2701.0,English,USA,PG-13,250000000.0,2012.0,23000.0,8.5,2.35,164000
4,,Doug Walker,,,131.0,,Rob Walker,131.0,,Documentary,...,,,,,,,12.0,7.1,,0


In [2]:
results.columns

Index(['color', 'director_name', 'num_critic_for_reviews', 'duration',
       'director_facebook_likes', 'actor_3_facebook_likes', 'actor_2_name',
       'actor_1_facebook_likes', 'gross', 'genres', 'actor_1_name',
       'movie_title', 'num_voted_users', 'cast_total_facebook_likes',
       'actor_3_name', 'facenumber_in_poster', 'plot_keywords',
       'movie_imdb_link', 'num_user_for_reviews', 'language', 'country',
       'content_rating', 'budget', 'title_year', 'actor_2_facebook_likes',
       'imdb_score', 'aspect_ratio', 'movie_facebook_likes'],
      dtype='object')
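
The column subset named in the introduction can be selected up front. The snippet below is a minimal sketch of that step, not the project's actual code: `USED_COLUMNS` and `select_used_columns` are hypothetical names, and the two-row frame is invented sample data standing in for `movie_metadata.csv`.

```python
import pandas as pd

# Columns the introduction says the chatbot relies on
# (spelled as they appear in results.columns).
USED_COLUMNS = [
    "movie_title", "director_name", "genres", "imdb_score",
    "title_year", "color", "gross", "budget",
]

def select_used_columns(df):
    """Return a copy of df restricted to the columns the chatbot uses."""
    missing = [c for c in USED_COLUMNS if c not in df.columns]
    if missing:
        raise KeyError(f"missing expected columns: {missing}")
    return df[USED_COLUMNS].copy()

# Invented sample rows (assumption), not taken from the real dataset.
sample = pd.DataFrame({
    "movie_title": ["Example Movie A", "Example Movie B"],
    "director_name": ["Director A", "Director B"],
    "genres": ["Action|Thriller", "Documentary"],
    "imdb_score": [7.1, 6.8],
    "title_year": [2007.0, 2015.0],
    "color": ["Color", "Color"],
    "gross": [300.0, None],
    "budget": [200.0, None],
    "duration": [169.0, 148.0],  # extra column that gets dropped
})

subset = select_used_columns(sample)
print(list(subset.columns))
```

Failing early on a missing column makes a renamed or misspelled feature visible before the chatbot actions start querying the frame.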

# RASA A.I. Toolkit

To develop the chatbot, the RASA Stack AI toolkit is used. It consists of two main components:
- **RASA NLU (Natural Language Understanding)**, which classifies the user's intent and extracts entities.
- **RASA Core**, which handles dialogue management. RASA Core takes the structured output of RASA NLU and predicts the best next action using predictive models such as an LSTM.
  
Note that dialogue management is what keeps track of the conversation.
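
To make the hand-off between the two components concrete, the sketch below shows the general shape of an NLU result (one intent with a confidence, plus a list of entities) and a toy next-action step. This is an illustration only: RASA Core learns its policy from stories, whereas `NEXT_ACTION` here is a hypothetical fixed table, the confidence value is invented, and the action names other than `action_default_fallback` are assumptions.

```python
# Illustrative NLU result; the values are invented, not produced by the trained model.
nlu_output = {
    "text": "what is the content rating of titanic?",
    "intent": {"name": "content_check", "confidence": 0.91},
    "entities": [{"entity": "movie", "value": "titanic"}],
}

# Hypothetical stand-in for the learned dialogue policy: a fixed intent -> action table.
NEXT_ACTION = {
    "greet": "utter_greet",
    "content_check": "action_content_check",
    "goodbye": "utter_goodbye",
}

def predict_next_action(parsed, threshold=0.2):
    """Pick the next action for a parsed message, falling back on low confidence."""
    intent = parsed["intent"]
    if intent["confidence"] < threshold:
        return "action_default_fallback"
    return NEXT_ACTION.get(intent["name"], "action_default_fallback")

print(predict_next_action(nlu_output))
```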

# RASA NLU

The RASA NLU component is developed in the following steps:

1. First, data is collected. The data used to train the classifier are "Intent" and "Entity" examples, which are saved in the "intents.md" file.
2. The next step is to define a pipeline that processes the user's message and classifies its intent.
   RASA NLU has predefined pipelines for different languages and purposes. In this project, the "spacy_sklearn" pipeline is used (spaCy is an advanced NLP library, and scikit-learn is used for classification).
   The chosen pipeline is specified in the "config.yml" file.
3. As the last step, the RASA NLU model is trained using the training data and the defined pipeline.
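
As a rough illustration of step 1, the sketch below holds a few training examples in the Markdown format that RASA NLU reads (`## intent:<name>` headings, `- ` example lines, and `[value](entity)` annotations) and parses them with a small helper. The example sentences are assumptions modelled on the conversations in this notebook, not the contents of the real "intents.md", and `parse_intents` is a hypothetical helper rather than part of RASA.

```python
import re

# Assumed sample in the RASA NLU Markdown training format.
SAMPLE_INTENTS_MD = """\
## intent:greet
- hi
- hello there

## intent:content_check
- what is the content rating of [titanic](movie)?
- is [the matrix](movie) safe for children to watch?
"""

# Matches entity annotations of the form [value](entity)
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def parse_intents(markdown):
    """Return a list of (intent, plain_text, entities) tuples."""
    examples = []
    intent = None
    for line in markdown.splitlines():
        line = line.strip()
        if line.startswith("## intent:"):
            intent = line[len("## intent:"):]
        elif line.startswith("- ") and intent is not None:
            raw = line[2:]
            # Each entity is stored as an (entity_name, value) pair
            entities = [(m.group(2), m.group(1)) for m in ENTITY_PATTERN.finditer(raw)]
            # Strip the annotation markup to recover the plain sentence
            text = ENTITY_PATTERN.sub(r"\1", raw)
            examples.append((intent, text, entities))
    return examples

examples = parse_intents(SAMPLE_INTENTS_MD)
for example in examples:
    print(example)
```

For step 2, a "config.yml" for this setup would typically need only a `language` entry and a `pipeline: "spacy_sklearn"` entry; the exact file used in the project is not shown here.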


In [3]:
%matplotlib inline

import logging, io, json, warnings
logging.basicConfig(level="INFO")
warnings.filterwarnings('ignore')

import sys
!{sys.executable} -m spacy download en

def pprint(o):
    # small helper to make dict dumps a bit prettier
    print(json.dumps(o, indent=2))


    Linking successful
    C:\Anaconda\envs\mie451-assignment-ci\lib\site-packages\en_core_web_sm
    -->
    C:\Anaconda\envs\mie451-assignment-ci\lib\site-packages\spacy\data\en

    You can now load the model via spacy.load('en')





In [4]:
from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer
from rasa_nlu import config

# loading the nlu training samples
training_data = load_data("intents.md")

# trainer to educate our pipeline
trainer = Trainer(config.load("config.yml"))

# train the model!
interpreter = trainer.train(training_data, verbose=True)

# store it for future use
model_directory = trainer.persist("models/nlu", fixed_model_name="current")

INFO:rasa_nlu.training_data.loading:Training data format of intents.md is md
INFO:rasa_nlu.training_data.training_data:Training data stats: 
	- intent examples: 157 (9 distinct intents)
	- Found intents: 'deny', 'movie_suggestion', 'goodbye', 'inform', 'affirm', 'content_check', 'thanks', 'greet', 'General_info'
	- entity examples: 104 (5 distinct entities)
	- found entities: 'imdb', 'themecolor', 'movie', 'genre', 'year'

INFO:rasa_nlu.utils.spacy_utils:Trying to load spacy model with name 'en'
INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.
INFO:rasa_nlu.model:Starting to train component nlp_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component tokenizer_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_featurizer_spacy
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Starting to train component intent_entity_featurizer_re

Fitting 2 folds for each of 6 candidates, totalling 12 fits


[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.4s finished
INFO:rasa_nlu.model:Finished training component.
INFO:rasa_nlu.model:Successfully saved model into 'C:\mie1513\assignment-cai-sfalaki\assignment\models\nlu\default\current'


# RASA Core

Dialogue management is performed by the RASA Core component. To train it, the following steps are taken:  

1. Data collection. The data needed for this part are stories and actions.

   Stories are example conversations. From them, RASA Core learns to predict the best next action. The collected stories are saved in the "stories.md" file.
   
   Actions are the operations the chatbot performs to satisfy the user's needs.
2. Model training. The model is trained on the collected data to predict the next best action.
   Note that there is no need to import TensorFlow modules and train a model by hand; RASA Core takes care of model training.
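
To give a feel for the story data in step 1, here is a minimal sketch of one story in the RASA Core Markdown format (`## <story name>`, `* intent{entities}` for user turns, indented `- action` lines for bot turns) together with a small parser. The story itself is an assumption based on the conversations shown later, not a copy of "stories.md", and `parse_stories` and the action names are hypothetical.

```python
import json
import re

# Assumed sample story in the RASA Core Markdown format.
SAMPLE_STORIES_MD = """\
## content check with movie name
* greet
  - utter_greet
* content_check{"movie": "titanic"}
  - action_content_check
* thanks
  - utter_thanks
"""

# A user turn: "* intent" optionally followed by a JSON object of entities
USER_LINE = re.compile(r"^\*\s*(\w+)(\{.*\})?\s*$")

def parse_stories(markdown):
    """Parse stories into dicts holding a name and a list of user turns."""
    stories = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            stories.append({"name": stripped[3:], "turns": []})
        elif stripped.startswith("*") and stories:
            match = USER_LINE.match(stripped)
            if match is None:
                continue
            intent, raw_entities = match.groups()
            entities = json.loads(raw_entities) if raw_entities else {}
            stories[-1]["turns"].append(
                {"intent": intent, "entities": entities, "actions": []}
            )
        elif stripped.startswith("- ") and stories and stories[-1]["turns"]:
            # Bot actions belong to the most recent user turn
            stories[-1]["turns"][-1]["actions"].append(stripped[2:])
    return stories

stories = parse_stories(SAMPLE_STORIES_MD)
for turn in stories[0]["turns"]:
    print(turn["intent"], "->", turn["actions"])
```

Each user turn paired with the actions that follow it is essentially one training example for predicting the next action.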

In [5]:
from rasa_core.policies import KerasPolicy, MemoizationPolicy
from rasa_core.policies.fallback import FallbackPolicy
from rasa_core.agent import Agent

# Fallback action: used when the NLU or Core confidence falls below its threshold
fallback = FallbackPolicy(fallback_action_name="action_default_fallback",
                          core_threshold=0.2, nlu_threshold=0.2)

agent = Agent('domain.yml', policies=[MemoizationPolicy(), KerasPolicy(), fallback])

# loading our training dialogues
training_data = agent.load_data('stories.md')

agent.train(
    training_data,
    validation_split=0.0,
    epochs=200
)

agent.persist('models/dialogue')

INFO:apscheduler.scheduler:Scheduler started
Processed Story Blocks: 100%|█████████████████████████████████████████████| 9/9 [00:00<00:00, 163.63it/s, # trackers=1]
Processed Story Blocks: 100%|██████████████████████████████████████████████| 9/9 [00:00<00:00, 54.88it/s, # trackers=9]
Processed Story Blocks: 100%|█████████████████████████████████████████████| 9/9 [00:00<00:00, 53.89it/s, # trackers=17]
Processed Story Blocks: 100%|█████████████████████████████████████████████| 9/9 [00:00<00:00, 46.15it/s, # trackers=18]
Processed actions: 806it [00:04, 199.21it/s, # examples=806]


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
masking (Masking)            (None, 5, 35)             0         
_________________________________________________________________
lstm (LSTM)                  (None, 32)                8704      
_________________________________________________________________
dense (Dense)                (None, 16)                528       
_________________________________________________________________
activation (Activation)      (None, 16)                0         
Total params: 9,232
Trainable params: 9,232
Non-trainable params: 0
_________________________________________________________________


INFO:rasa_core.policies.keras_policy:Fitting model with 806 total samples and a validation split of 0.0


Epoch 1/200
...
Epoch 78

INFO:rasa_core.policies.keras_policy:Done fitting keras policy model
INFO:rasa_core.agent:Persisted model to 'C:\mie1513\assignment-cai-sfalaki\assignment\models\dialogue'


# Interaction Demo
### The sample stories are as follows:
<img src="graph.png">

# Chatbot

### **To build the chatbot, a server that listens for the user's requests is set up, and the two trained models are loaded to converse with the user.**

##  Test 1:
 The main goal in this section is to ask the chatbot for general information about a movie. The chatbot is designed to return the director's name, the first leading actor, and the IMDB score.
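
The lookup behind such an answer might look like the sketch below. It is an assumption about how the action could be written, not the project's actual action code: `general_info` is a hypothetical helper, `actor_1_name` is assumed to be the "first leading actor" column, and the sample rows are invented.

```python
import pandas as pd

def general_info(df, title):
    """Return director, first leading actor and IMDB score for a title, or None."""
    wanted = title.strip().lower()
    # Normalise titles so that case and stray whitespace do not block a match
    titles = df["movie_title"].str.strip().str.lower()
    matches = df[titles == wanted]
    if matches.empty:
        return None
    row = matches.iloc[0]
    return {
        "director": row["director_name"],
        "leading_actor": row["actor_1_name"],
        "imdb_score": float(row["imdb_score"]),
    }

# Invented sample rows (assumption), not taken from the real dataset.
sample = pd.DataFrame({
    "movie_title": ["Example Movie A ", "Example Movie B"],
    "director_name": ["Director A", "Director B"],
    "actor_1_name": ["Actor A", "Actor B"],
    "imdb_score": [7.5, 6.2],
})

print(general_info(sample, "example movie a"))
```

Returning `None` for an unknown title lets the calling action decide how to tell the user that nothing was found.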

In [6]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
hello there
Hey!!
give me information about the circle
General Information aboutthe circle movie:
 Director is Jafar Panahi
 Leading Actorctress is Fereshteh Sadre Orafaiy 
 IMDB score is 7.5  
stop


## Test 2
In this part, the goal of the stories is to check the content rating of a movie.
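
The two conversations in this section differ in whether the movie name is already known when the question is asked. A minimal sketch of that slot check follows; `RATINGS`, `content_check` and the "not found" message are hypothetical, and the lookup table stands in for the `content_rating` column of the dataset.

```python
# Hypothetical title -> content rating table; in the project this would come from the dataset.
RATINGS = {"example movie a": "PG-13", "example movie b": "R"}

def content_check(slots):
    """Answer a content-rating question, asking for the movie name if it is missing."""
    movie = slots.get("movie")
    if not movie:
        # The "movie" slot is empty, so ask the user to fill it
        return "What is the name of the movie?"
    rating = RATINGS.get(movie.strip().lower())
    if rating is None:
        return f"Sorry, I could not find {movie} in the dataset."
    return f"Content rating of {movie} movie is {rating}"

print(content_check({}))
print(content_check({"movie": "example movie b"}))
```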

In [7]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])
   

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
what is the content rating of titanic?
Content rating of titanic movie is PG-13 
thanks
No Problem
stop


In [8]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])
   

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
hi
Hey!!
Is this movie safe for children to watch?
What is the name of the movie?
the matrix
Content rating of the matrix movie is R 
stop


## Test 3
The idea in this section is to return the names of the 5 most profitable movies that match the user's preferences. 
First, a profit feature is calculated from the "gross" and "budget" features.
The information the chatbot uses to select the 5 movies includes "genre", "year", "imdb_score" and "color". 
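
A possible shape for this filter is sketched below with pandas. It rests on assumptions that the text does not confirm: profit is taken to be `gross - budget`, "after" a year is treated as strictly greater, and `top_profitable` with its parameters is a hypothetical helper. The sample rows are invented.

```python
import pandas as pd

def top_profitable(df, genre=None, min_year=None, min_score=None, color=None, n=5):
    """Return the n most profitable movies that satisfy the given preferences."""
    data = df.dropna(subset=["gross", "budget"]).copy()
    data["profit"] = data["gross"] - data["budget"]  # assumed definition of profit

    mask = pd.Series(True, index=data.index)
    if genre is not None:
        # genres is a pipe-separated string such as "Action|Thriller"
        genre_lists = data["genres"].fillna("").str.lower().str.split("|")
        mask &= genre_lists.apply(lambda gs: genre.lower() in gs)
    if min_year is not None:
        mask &= data["title_year"] > min_year
    if min_score is not None:
        mask &= data["imdb_score"] >= min_score
    if color is not None:
        mask &= data["color"].str.strip().str.lower() == color.lower()

    return data[mask].nlargest(n, "profit")[["movie_title", "profit"]]

# Invented sample rows (assumption), not taken from the real dataset.
sample = pd.DataFrame({
    "movie_title": ["A", "B", "C", "D"],
    "genres": ["Comedy|Romance", "Drama|Romance", "Action", "Romance"],
    "title_year": [2012.0, 2014.0, 2013.0, 2005.0],
    "imdb_score": [7.8, 7.4, 8.0, 7.9],
    "color": ["Color", "Color", "Color", "Color"],
    "gross": [150.0, 90.0, 500.0, 300.0],
    "budget": [50.0, 60.0, 100.0, 20.0],
})

result = top_profitable(sample, genre="romance", min_year=2010, min_score=7.3)
print(result)
```

Every preference is optional, which matches the way the chatbot asks only for the criteria the user has not yet provided.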

In [9]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
Recommend some romance movies that are produced after 2010
please enter the minimum acceptable IMDB score
7.3
Top 5 profitable movies with your desired criteria are :
deadpool             305024263.0
the king's speech             123795342.0
the fault in our stars             112868837.0
silver linings playbook             111088910.0
les misérables             87775460.0
stop


In [10]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
suggest some popular black and white movie
which genre do you like?
action
Top 5 profitable movies with your desired criteria are :
pearl harbor             58539855.0
kill bill: vol. 1             40098138.0
kill bill: vol. 2             36207920.0
space cowboys             25454043.0
die another day             18201106.0
thank you
No Problem
stop


In [11]:
import IPython
from IPython.display import clear_output
from rasa_core.agent import Agent
from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.utils import EndpointConfig

messages = ["Hi! you can chat in this window. Type 'stop' to end the conversation."]
interpreter = NaturalLanguageInterpreter.create('models/nlu/default/current/')
endpoint = EndpointConfig('http://localhost:5055/webhook')
agent = Agent.load('models/dialogue', interpreter=interpreter, action_endpoint = endpoint)
tracker = agent.tracker_store.get_or_create_tracker("sender_id") 
# get current tracker state
tracker.current_state()

print("Your bot is ready to talk! Type your messages here or send 'stop'")
while True:
    a = input()
    if a == 'stop':
        break
    #pprint(interpreter.parse(a))
    responses = agent.handle_text(a)
    for response in responses:
        print(response["text"])

INFO:rasa_nlu.components:Added 'nlp_spacy' to component cache. Key 'nlp_spacy-en'.


Your bot is ready to talk! Type your messages here or send 'stop'
hi
Hey!!
list the movies with horror genre
What is the year the movie is produced after
1990
please enter the minimum acceptable IMDB score
5.6
Top 5 profitable movies with your desired criteria are :
the blair witch project             140470114.0
the conjuring             117387272.0
the silence of the lambs             111727000.0
paranormal activity             107902283.0
i am legend             106386216.0
stop


**The end 😁**