### You can also run the notebook in [COLAB](https://colab.research.google.com/github/deepmipt/DeepPavlov/blob/master/examples/trippy_extended_tutorial.ipynb).

# TripPy Goal-Oriented Bot in DeepPavlov

This tutorial describes how to build an **advanced** Goal-Oriented Bot (Gobot) in DeepPavlov using the [TripPy architecture](https://arxiv.org/pdf/2005.02877.pdf).
You can also train a simple bot following the trippy_simple tutorial.


This tutorial follows the same structure and uses the same data as the gobot_extended tutorial. We only cover TripPy-specific points here, so consult the gobot_extended notebook for general insights. The only differences are the config used and that TripPy needs fewer steps.

0. [Data preparation](#0.-Data-Preparation)
1. [Build Database of items](#1.-Build-Database-of-items)
2. [Build and Train a Bot](#3.-Build-and-Train-a-Bot)
3. [Interact with bot](#4.-Interact-with-Bot)

In [None]:
!git clone -b rulebased_gobot_trippy https://github.com/Muennighoff/DeepPavlov
%cd DeepPavlov
!pip install -r requirements.txt
!pip install transformers==2.9.1

Cloning into 'DeepPavlov'...
remote: Enumerating objects: 58503, done.
remote: Counting objects: 100% (1446/1446), done.
remote: Compressing objects: 100% (519/519), done.
remote: Total 58503 (delta 1089), reused 1224 (delta 914), pack-reused 57057
Receiving objects: 100% (58503/58503), 37.54 MiB | 22.41 MiB/s, done.
Resolving deltas: 100% (44934/44934), done.
/content/DeepPavlov
Collecting aio-pika==6.4.1
  Downloading https://files.pythonhosted.org/packages/c8/07/196a4115cbef31fa0c3dabdea146f02dffe5e49998341d20dbe2278953bc/aio_pika-6.4.1-py3-none-any.whl (40kB)
Collecting Cython==0.29.14
  Downloading https://files.pythonhosted.org/packages/d8/58/2deb24de3c10cc4c0f09639b46f4f4b50059f0fdc785128a57dd9fdce026/Cython-0.29.14-cp37-cp37m-manylinux1_x86_64.whl (2.1MB)
Collecting fastapi==0.47.1
  Downloading https://files.pythonhosted.

Collecting transformers==2.9.1
  Downloading https://files.pythonhosted.org/packages/22/97/7db72a0beef1825f82188a4b923e62a146271ac2ced7928baa4d47ef2467/transformers-2.9.1-py3-none-any.whl (641kB)
Collecting sentencepiece
  Downloading https://files.pythonhosted.org/packages/ac/aa/1437691b0c7c83086ebb79ce2da16e00bef024f24fec2a5161c35476f499/sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2MB)
Collecting tokenizers==0.7.0
  Downloading https://files.pythonhosted.org/packages/ea/59/bb06dd5ca53547d523422d32735585493e0103c992a52a97ba3aa3be33bf/tokenizers-0.7.0-cp37-cp37m-manylinux1_x86_64.whl (5.6MB)
Installing collected packages: sentencepiece, tokenizers, transformers
Successfully installed sentencepiece-0.1.96 tokenizers-0.7.0 transformers-2.9.1


## 0. Data Preparation

In [None]:
from deeppavlov.dataset_readers.dstc2_reader import SimpleDSTC2DatasetReader

data = SimpleDSTC2DatasetReader().read('my_data')

2021-07-12 13:54:25.72 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 283: [PosixPath('my_data/simple-dstc2-val.json'), PosixPath('my_data/simple-dstc2-tst.json')]]
2021-07-12 13:54:25.73 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 284: [downloading data from http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz to my_data]
2021-07-12 13:54:25.74 INFO in 'deeppavlov.core.data.utils'['utils'] at line 95: Downloading from http://files.deeppavlov.ai/datasets/simple_dstc2.tar.gz to my_data/simple_dstc2.tar.gz
100%|██████████| 497k/497k [00:00<00:00, 691kB/s]
2021-07-12 13:54:27.359 INFO in 'deeppavlov.core.data.utils'['utils'] at line 272: Extracting my_data/simple_dstc2.tar.gz archive into my_data
2021-07-12 13:54:27.402 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_reader'] at line 304: [loading dialogs from my_data/simple-dstc2-trn.json]
2021-07-12 13:54:27.528 INFO in 'deeppavlov.dataset_readers.dstc2_reader'['dstc2_

In [None]:
!ls my_data

simple-dstc2-templates.txt  simple-dstc2-tst.json
simple-dstc2-trn.json	    simple-dstc2-val.json


To iterate over batches of preprocessed DSTC-2 dialogs, we need to import `DialogDatasetIterator`.

In [None]:
from deeppavlov.dataset_iterators.dialog_iterator import DialogDatasetIterator

iterator = DialogDatasetIterator(data)

You can now iterate over batches of preprocessed DSTC-2 dialogs:

In [None]:
from pprint import pprint

for dialog in iterator.gen_batches(batch_size=1, data_type='train'):
    turns_x, turns_y = dialog
    
    print("User utterances:\n----------------\n")
    pprint(turns_x[0], indent=4)
    print("\nSystem responses:\n-----------------\n")
    pprint(turns_y[0], indent=4)
    
    break

User utterances:
----------------

[   {'prev_resp_act': None, 'text': ''},
    {   'prev_resp_act': 'welcomemsg',
        'slots': [['area', 'west'], ['pricerange', 'cheap']],
        'text': 'can i have a cheap restaurant in the west part of town'},
    {   'db_result': {   'addr': '17 magdalene street city centre',
                         'area': 'west',
                         'food': 'vietnamese',
                         'name': 'thanh binh',
                         'phone': '01223 362456',
                         'postcode': 'c.b 3, 0 a.f',
                         'pricerange': 'cheap'},
        'prev_resp_act': 'api_call',
        'slots': [['area', 'west'], ['pricerange', 'cheap']],
        'text': 'can i have a cheap restaurant in the west part of town'},
    {   'prev_resp_act': 'inform_area+inform_pricerange+offer_name',
        'slots': [['slot', 'phone']],
        'text': 'can i have the phone number'},
    {'prev_resp_act': 'inform_phone+offer_name', 'text': 'thank 

In real life, annotating data is expensive. To bring this tutorial closer to production use cases, we keep only 50 dialogues for training.

In [None]:
!cp my_data/simple-dstc2-trn.json my_data/simple-dstc2-trn.full.json

In [None]:
import json

# Number of training dialogues to keep (the full training set has 967).
# 50 matches the text above and the recorded output below.
NUM_TRAIN = 50

with open('my_data/simple-dstc2-trn.full.json', 'rt') as fin:
    data = json.load(fin)
with open('my_data/simple-dstc2-trn.json', 'wt') as fout:
    json.dump(data[:NUM_TRAIN], fout, indent=2)
print(f"Train set is reduced to {NUM_TRAIN} dialogues (out of {len(data)}).")

Train set is reduced to 50 dialogues (out of 967).


## 1. Build Database of items

### Building database of restaurants

In [None]:
from deeppavlov.core.data.sqlite_database import Sqlite3Database

database = Sqlite3Database(primary_keys=["name"],
                           save_path="my_bot/db.sqlite")

2021-07-12 13:54:28.485 INFO in 'deeppavlov.core.data.sqlite_database'['sqlite_database'] at line 70: Initializing empty database on /content/DeepPavlov/my_bot/db.sqlite.


In [None]:
db_results = []

for dialog in iterator.gen_batches(batch_size=1, data_type='all'):
    turns_x, turns_y = dialog
    db_results.extend(x['db_result'] for x in turns_x[0] if x.get('db_result'))

print(f"Adding {len(db_results)} items.")
if db_results:
    database.fit(db_results)

2021-07-12 13:54:28.524 INFO in 'deeppavlov.core.data.sqlite_database'['sqlite_database'] at line 130: Created table with keys {'pricerange': 'text', 'area': 'text', 'addr': 'text', 'phone': 'text', 'postcode': 'text', 'food': 'text', 'name': 'text'}.


Adding 3016 items.


### Interacting with database

We can now play with the database and make requests to it:

In [None]:
database([{'pricerange': 'cheap', 'area': 'south'}])

[[{'addr': 'cambridge leisure park clifton way cherry hinton',
   'area': 'south',
   'food': 'chinese',
   'name': 'the lucky star',
   'phone': '01223 244277',
   'postcode': 'c.b 1, 7 d.y',
   'pricerange': 'cheap'},
  {'addr': 'cambridge leisure park clifton way',
   'area': 'south',
   'food': 'portuguese',
   'name': 'nandos',
   'phone': '01223 327908',
   'postcode': 'c.b 1, 7 d.y',
   'pricerange': 'cheap'}]]
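
To make the lookup above more concrete, here is a minimal sketch of how a dict of slot constraints can be turned into a parameterised SQL query with Python's built-in `sqlite3` module. It is not DeepPavlov's `Sqlite3Database` implementation: the table name `restaurants`, the helper `query_items`, and the reduced three-column schema are assumptions made for illustration, and the sample rows are taken from outputs shown in this notebook.

```python
import sqlite3


def query_items(conn, table, constraints):
    """Return all rows of `table` matching every key/value pair in `constraints`."""
    # Column names cannot be bound as SQL parameters, so validate them first.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    unknown = set(constraints) - set(columns)
    if unknown:
        raise ValueError(f"Unknown columns: {sorted(unknown)}")
    # An empty constraint dict matches every row ("WHERE 1").
    where = " AND ".join(f"{key} = ?" for key in constraints) or "1"
    cursor = conn.execute(
        f"SELECT * FROM {table} WHERE {where}", list(constraints.values())
    )
    return [dict(zip(columns, row)) for row in cursor]


# Hypothetical in-memory table with a reduced schema (name, area, pricerange).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurants (name TEXT PRIMARY KEY, area TEXT, pricerange TEXT)"
)
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?, ?)",
    [
        ("the lucky star", "south", "cheap"),
        ("nandos", "south", "cheap"),
        ("meghna", "north", "moderate"),
        ("frankie and bennys", "south", "expensive"),
    ],
)

cheap_south = query_items(conn, "restaurants", {"pricerange": "cheap", "area": "south"})
print(sorted(item["name"] for item in cheap_south))
```

Binding the values with `?` placeholders keeps user-provided slot values out of the SQL string itself, which is why only the column names need explicit validation.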

## 3. Build and Train a Bot

The image below comes from the [TripPy paper](https://arxiv.org/pdf/2005.02877.pdf) and sketches the model's architecture.

&nbsp;
![trippy_architecture_original.png](img/trippy_architecture_original.jpg)
&nbsp;

The entire dialogue history and the last system and user utterances are tokenized and fed into a [BERT model](https://arxiv.org/pdf/1810.04805.pdf). The model uses attention to weigh the importance of each token in the input. In TripPy, the BERT model is trained to perform binary classification on each input token, deciding whether it is part of the value of one of the predefined slot names.

For example, for the slot name "pricerange" the model will look at each token and classify whether it corresponds to that slot. For the input: *I want cheap food*, the output for pricerange should be [0,0,1,0], hence identifying that cheap corresponds to the pricerange. This span prediction is then used to copy the value out of the input.
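
The copy step in this example can be sketched in a few lines. This is a simplified illustration only: it uses whitespace tokens rather than BERT wordpieces, and the helper name `copy_span` is hypothetical and not part of DeepPavlov or the TripPy code.

```python
def copy_span(tokens, span_mask):
    """Join the tokens whose mask entry is 1 into the copied slot value."""
    if len(tokens) != len(span_mask):
        raise ValueError("tokens and span_mask must have the same length")
    picked = [token for token, flag in zip(tokens, span_mask) if flag == 1]
    # No flagged token means there is nothing to copy for this slot.
    return " ".join(picked) if picked else None


tokens = "I want cheap food".split()
pricerange_mask = [0, 0, 1, 0]  # per-token prediction for the "pricerange" slot
print(copy_span(tokens, pricerange_mask))  # cheap
```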

Apart from "span" (also called "copy_value"), the other "class types" (the predictions made for each slot name) are:
- "dontcare": The model thinks the user does not care about this slot name's value
- "none": The user has not yet indicated a preference for this slot name
- "refer": The user has indicated a preference via another slot name
- "inform": The model has previously informed the user about the slot name
- "true" / "false": Used when there are slot names with boolean values
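
One way to picture how a predicted class type turns into a slot value is the dispatch below. It is a sketch under assumptions, not the DeepPavlov implementation: the function `update_slot` and its argument names are hypothetical, and treating "none" as "keep the current value" is an interpretation of the description above.

```python
def update_slot(class_type, current_value, span_value=None,
                informed_value=None, referred_value=None):
    """Return the new value of one slot given the predicted class type."""
    if class_type == "none":
        return current_value      # nothing new was said about this slot
    if class_type == "dontcare":
        return "dontcare"         # the user accepts any value
    if class_type in ("span", "copy_value"):
        return span_value         # value copied out of the user input
    if class_type == "inform":
        return informed_value     # value the system mentioned earlier
    if class_type == "refer":
        return referred_value     # value taken over from another slot
    if class_type in ("true", "false"):
        return class_type         # boolean slots store the class itself
    raise ValueError(f"Unknown class type: {class_type}")


state = {"pricerange": None, "area": "west"}
state["pricerange"] = update_slot("span", state["pricerange"], span_value="cheap")
state["area"] = update_slot("none", state["area"])
print(state)  # {'pricerange': 'cheap', 'area': 'west'}
```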

Below is a sketch for how the full TripPy model has been implemented in DeepPavlov:

&nbsp;
![trippy_architecture.png](img/trippy_architecture.png)
&nbsp;

The image above also includes the input and the input processing steps, while the previous sketch starts with the BERT model (BERTForDST).
Novel things in the DeepPavlov TripPy implementation are:
- The preprocessing is robust to datasets that do not contain position labels (during training, TripPy requires position labels to learn its copy-value capabilities). This is done by calculating Levenshtein distances
- An action prediction head has been added, which predicts what action the system should take from a predefined list of actions
- A database connection has been added, which allows the model to retrieve information about slot values from an SQLite database
- A Natural Language Generation component has been added, which takes the predicted action and database results and puts together the final response to the user
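
To illustrate the first point, the sketch below shows one way position labels could be recovered with Levenshtein distances when a dataset only provides slot values. It is not the code used in DeepPavlov: the helpers `levenshtein` and `find_value_span`, the fixed window width, and the `max_distance` threshold are all assumptions chosen for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def find_value_span(tokens, value, max_distance=2):
    """Return (start, end) token indices, end exclusive, of the window that is
    closest to `value`, or None if no window is within `max_distance` edits."""
    width = len(value.split())
    best = None
    for start in range(len(tokens) - width + 1):
        candidate = " ".join(tokens[start:start + width]).lower()
        distance = levenshtein(candidate, value.lower())
        if distance <= max_distance and (best is None or distance < best[0]):
            best = (distance, start, start + width)
    return None if best is None else (best[1], best[2])


# The user misspells "cheap"; the labelled slot value is still located.
tokens = "i want something cheep in the west".split()
print(find_value_span(tokens, "cheap"))  # (3, 4)
```

A tolerance of a few edits lets the labelling survive typos and transcription noise, at the cost of occasional false matches on short values.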


We will now proceed with configuring the model & training.

In [None]:
from deeppavlov import configs
from deeppavlov.core.common.file import read_json

# Use TripPy Config
gobot_config = read_json(configs.go_bot.trippy_dstc2_minimal)




Configure bot to use our database:

In [None]:
gobot_config['chainer']['pipe'][-1]['database'] = {
    'class_name': 'sqlite_database',
    'primary_keys': ["name"],
    'save_path': 'my_bot/db.sqlite'
}

Configure bot to use templates:

In [None]:
gobot_config['chainer']['pipe'][-1]['nlg_manager']['template_type'] = 'DefaultTemplate'
gobot_config['chainer']['pipe'][-1]['nlg_manager']['template_path'] = 'my_data/simple-dstc2-templates.txt'

Specify train/valid/test data path and path to save the final bot model:

In [None]:
gobot_config['metadata']['variables']['DATA_PATH'] = 'my_data'
gobot_config['metadata']['variables']['MODEL_PATH'] = 'my_bot'
# Configure the possible slot names. The "this" slot name carries no meaning, but it appears in the training set
gobot_config['chainer']['pipe'][-1]['slot_names'] = ['pricerange', 'this', 'area', 'food']

In [None]:
from deeppavlov import train_model

gobot_config['train']['batch_size'] = 4 # batch size; ideally use 8 and set lr to 1e-4 if your GPU allows
gobot_config['train']['max_batches'] = 600 # maximum number of training batches
gobot_config['train']['val_every_n_batches'] = 40 # evaluate on full 'valid' split every x batches
gobot_config['train']['log_every_n_batches'] = 40 # compute metrics on 'train' data every x batches
gobot_config['train']['validation_patience'] = 10 # stop early after x validations without improvement
gobot_config['train']['log_on_k_batches'] = 10 # how many batches to use for logging

gobot_config['chainer']['pipe'][-1]['debug'] = False
gobot_config['chainer']['pipe'][-1]["optimizer_parameters"] = {"lr": 1e-5, "eps": 1e-6}

train_model(gobot_config)

Optionally, you can download the pre-trained model from Kaggle. You will need a Kaggle account and must upload your kaggle.json file. You may have to run the cell below twice.

In [None]:
### Optional - Download Pretrained TripPy from kaggle ###

# Make your json accessible to kaggle
#!cp /content/kaggle.json /root/.kaggle/

# Download the dataset
#!kaggle datasets download -d muennighoff/trippy-restaurant
#!unzip trippy-restaurant.zip

# Move into correct directory
#!mv db.sqlite /content/DeepPavlov/my_bot/
#!mv model.pth.tar /content/DeepPavlov/my_bot/

Downloading trippy-restaurant.zip to /content/DeepPavlov
 99% 985M/993M [00:09<00:00, 120MB/s]
100% 993M/993M [00:09<00:00, 110MB/s]


### Evaluation of training

Calculating **accuracy** of trained bot: whether predicted system responses match true responses (full string match).

In [None]:
from deeppavlov import evaluate_model

evaluate_model(gobot_config);

With `max_batches=800`, validation accuracy is `0.44` and test accuracy is `~0.45`. Note that the training cell above sets `max_batches` to 600, so your results may differ.
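
The full-string-match accuracy described above can be sketched as follows. This is an illustration only: the function name `response_accuracy` is hypothetical, and the metric DeepPavlov actually applies may normalise the strings (for example case or whitespace) before comparing, whereas this sketch compares them as-is.

```python
def response_accuracy(true_responses, predicted_responses):
    """Fraction of predicted responses that equal the true response exactly."""
    if len(true_responses) != len(predicted_responses):
        raise ValueError("Both lists must have the same length")
    if not true_responses:
        return 0.0
    matches = sum(true == pred
                  for true, pred in zip(true_responses, predicted_responses))
    return matches / len(true_responses)


true_responses = ["What kind of food would you like?", "You are welcome!"]
predicted_responses = ["What kind of food would you like?",
                       "What part of town do you have in mind?"]
print(response_accuracy(true_responses, predicted_responses))  # 0.5
```

Because a single differing word counts as a miss, this metric is strict, which partly explains why accuracies around 0.45 can still correspond to a usable bot.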


If you have the compute, try training the model with a larger batch size, such as 8 or 16. The code automatically detects multiple GPUs and runs data parallelism. You will, however, need to upgrade the Hugging Face transformers library to version 4.x and fix two transformers import statements in the TripPy code.



## 4. Interact with Bot

In [None]:
from deeppavlov import build_model

bot = build_model(gobot_config)

2021-07-12 14:09:13.932 INFO in 'deeppavlov.core.data.sqlite_database'['sqlite_database'] at line 66: Loading database from /content/DeepPavlov/my_bot/db.sqlite.
2021-07-12 14:09:17.459 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 152: Load path /content/DeepPavlov/my_bot/model is given.
2021-07-12 14:09:17.461 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 159: Load path /content/DeepPavlov/my_bot/model.pth.tar exists.
2021-07-12 14:09:17.463 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 160: Initializing `TripPy` from saved.
2021-07-12 14:09:17.465 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 163: Loading weights from /content/DeepPavlov/my_bot/model.pth.tar.
2021-07-12 14:09:18.263 INFO in 'deeppavlov.core.models.torch_model'['torch_model'] at line 98: Model was successfully initialized! Model summary:
 BertForDST(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx

In [None]:
bot.reset()
bot(['hi, i want to eat, can you suggest a place to go?'])

[['What kind of food would you like?']]

In [None]:
bot(['Perhaps something cheap'])

[['What part of town do you have in mind?']]

In [None]:
bot(['In the north of town'])

2021-07-16 14:09:36.730 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 390: Made api_call with dict_keys(['pricerange', 'this', 'area', 'food']), got 11 results.


[['api_call area="north" food="dontcare" pricerange="dontcare"',
  'Meghna is a nice place in the north of town and the prices are moderate.']]

In [None]:
bot(['Whats their phone number?'])

[['The phone number of meghna is 01223 727410.']]

In [None]:
bot(['and the address?'])

[['Sure, meghna is on 205 victoria road chesterton.']]

In [None]:
bot(['whats their pricerange again?'])

[['The price range at meghna is moderate.']]

In [None]:
bot(['Alright sounds good, thank you!'])

[['You are welcome!']]

#### Original

These are examples used in the original DeepPavlov Go Bot Extended Tutorial.

In [None]:
bot.reset()
bot(['hi, i want to eat, can you suggest a place to go?'])

[['What kind of food would you like?']]

In [None]:
bot(['i want cheap food'])

[['What part of town do you have in mind?']]

In [None]:
bot(['chinese food'])

2021-07-12 14:10:38.514 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 390: Made api_call with dict_keys(['pricerange', 'this', 'area', 'food']), got 16 results.


[['api_call area="dontcare" food="chinese" pricerange="dontcare"',
  'The good luck chinese food takeaway serves chinese food in the expensive price range.']]

In [None]:
bot(['thanks, give me their address'])

[['Sure, the good luck chinese food takeaway is on 82 cherry hinton road cherry hinton.']]

In [None]:
bot(['i want their phone number too'])

[['The phone number of the good luck chinese food takeaway is 01223 244149.']]

In [None]:
bot(['bye'])

[['The good luck chinese food takeaway serves chinese food in the expensive price range.']]

In [None]:
bot.reset()
bot(['Have you ever been in Cambridge?'])

2021-07-12 14:13:57.850 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 390: Made api_call with dict_keys(['pricerange', 'this', 'area', 'food']), got 109 results.


[['api_call area="dontcare" food="dontcare" pricerange="dontcare"',
  'Frankie and bennys is a great restaurant.']]

In [None]:
bot.reset()
bot(['Can you suggest me a portuguese restaurant in Cambridge?'])

2021-07-12 14:16:43.903 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 390: Made api_call with dict_keys(['pricerange', 'this', 'area', 'food']), got 2 results.


[['api_call area="dontcare" food="portuguese" pricerange="dontcare"',
  'Nandos serves portuguese food.']]

In [None]:
bot(['Does it have sangria?'])

[['Nandos serves portuguese food.']]

In [None]:
bot.reset()
bot(['Where can I get good pizza?'])

[['You are welcome!']]

In [None]:
bot(['Where can I get good pizza?'])

[['What part of town do you have in mind?']]

In [None]:
bot(['South of town'])

2021-07-12 14:18:43.686 INFO in 'deeppavlov.models.go_bot.trippy'['trippy'] at line 390: Made api_call with dict_keys(['pricerange', 'this', 'area', 'food']), got 9 results.


[['api_call area="south" food="dontcare" pricerange="dontcare"',
  'Frankie and bennys is a nice place in the south of town and the prices are expensive.']]

In [None]:
bot(['Whats their phone number?'])

[['The phone number of frankie and bennys is 01223 412430.']]