# Getting Started

In this tutorial, you will learn how to:
- use the models in **ConvLab-2** to build a dialog agent.
- build a simulator to chat with the agent and evaluate the performance.
- try different module combinations.
- use the analysis tool to diagnose your system.

Let's get started!

## Environment setup
Run the command below to install ConvLab-2. Then restart the notebook and skip this command.

In [0]:
# first install ConvLab-2 and restart the notebook
! git clone https://github.com/thu-coai/ConvLab-2.git && cd ConvLab-2 && pip install -e .

## Build an agent

We use models adapted to the [MultiWOZ](https://www.aclweb.org/anthology/D18-1547) dataset to build our agent. This pipeline agent consists of NLU, DST, Policy, and NLG modules.
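To make the data flow through the four stages concrete, here is a toy sketch. Every function in it is hypothetical and hard-coded for illustration; none of it is ConvLab-2 code, and the dialog-act layout `[intent, domain, slot, value]` is an assumption made for the example.

```python
# Toy illustration only: all names below are hypothetical, not ConvLab-2 APIs.

def toy_nlu(utterance):
    """NLU: map text to dialog acts of the form [intent, domain, slot, value]."""
    acts = []
    text = utterance.lower()
    if "hotel" in text:
        for price in ("cheap", "moderate", "expensive"):
            if price in text:
                acts.append(["Inform", "Hotel", "Price", price])
    return acts

def toy_dst(state, user_acts):
    """DST: fold the user's dialog acts into a belief state (nested dict)."""
    for intent, domain, slot, value in user_acts:
        if intent == "Inform":
            state.setdefault(domain, {})[slot] = value
    return state

def toy_policy(state):
    """Policy: choose system dialog acts from the current state."""
    if "Price" in state.get("Hotel", {}):
        return [["Recommend", "Hotel", "Name", "a and b guest house"]]
    return [["Request", "Hotel", "Price", "?"]]

def toy_nlg(sys_acts):
    """NLG: turn system dialog acts back into text with fixed templates."""
    parts = []
    for intent, domain, slot, value in sys_acts:
        if intent == "Recommend":
            parts.append(f"How about {value} ?")
        elif intent == "Request":
            parts.append(f"What {slot.lower()} range would you like ?")
    return " ".join(parts)

def toy_respond(utterance, state):
    """Chain the four stages: text -> acts -> state -> acts -> text."""
    return toy_nlg(toy_policy(toy_dst(state, toy_nlu(utterance))))

state = {}
reply = toy_respond("I want to find a moderate hotel", state)
print(reply)
```

The real modules are far richer, but the hand-off between stages follows the same order.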

First, import some models:

In [1]:
# common import: convlab2.$module.$model.$dataset
from convlab2.nlu.jointBERT.multiwoz import BERTNLU
from convlab2.nlu.milu.multiwoz import MILU
from convlab2.dst.rule.multiwoz import RuleDST
from convlab2.policy.rule.multiwoz import RulePolicy
from convlab2.nlg.template.multiwoz import TemplateNLG
from convlab2.dialog_agent import PipelineAgent, BiSession
from convlab2.evaluator.multiwoz_eval import MultiWozEvaluator
from pprint import pprint
import random
import numpy as np
import torch

Then, create the models and build an agent:

In [2]:
# go to README.md of each model for more information
# BERT nlu
sys_nlu = BERTNLU()
# simple rule DST
sys_dst = RuleDST()
# rule policy
sys_policy = RulePolicy()
# template NLG
sys_nlg = TemplateNLG(is_user=False)
# assemble
sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, name='sys')

intent num: 137
tag num: 331
Load from /home/archraven/Documents/GitHub/ConvLab-2/convlab2/nlu/jointBERT/multiwoz/output/all_context/pytorch_model.bin
bert-base-uncased




BERTNLU loaded


That's all! Let's chat with the agent using its response function:

In [3]:
sys_agent.response("I want to find a moderate hotel")

'How about a and b guest house ? Fits your request perfectly . There are 18 of those .'

In [4]:
sys_agent.response("Which type of hotel is it ?")

'It is a guesthouse .'

In [5]:
sys_agent.response("OK , where is its address ?")

'It is located at 124 tenison road.'

In [6]:
sys_agent.response("Thank you !")

'Have a good day .'

In [7]:
sys_agent.response("Try to find me a Chinese restaurant in south area .")

'I would suggest the good luck chinese food takeaway . They are located at 82 Cherry Hinton Road Cherry Hinton. The reference number is 00000003 . I have 3 options for you !.'

In [8]:
sys_agent.response("Which kind of food it provides ?")

'They serve chinese food .'

In [9]:
sys_agent.response("Book a table for 5 , this Sunday .")

'Reference number is : 00000003 .'

## Build a simulator to chat with the agent and evaluate

In many one-to-one task-oriented dialog systems, a simulator is essential for training an RL agent. Our framework does not distinguish between user and system: all speakers are **agents**. The simulator is also an agent, with a specific policy inside for accomplishing the user goal.

We use the `Agenda` policy for the simulator. This policy requires dialog act input, so we set the DST argument of `PipelineAgent` to `None`; the `PipelineAgent` then passes dialog acts directly to the policy. Refer to the `PipelineAgent` documentation for more details.
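The routing just described can be sketched as follows. This is a simplified, hypothetical stand-in for the behaviour the text describes, not the actual `PipelineAgent` source; `ToyPipeline` and the three helper functions are invented for illustration.

```python
class ToyPipeline:
    """Hypothetical stand-in for illustration; not the real PipelineAgent."""

    def __init__(self, nlu, dst, policy):
        self.nlu, self.dst, self.policy = nlu, dst, policy
        self.state = {}

    def step(self, utterance):
        acts = self.nlu(utterance)
        if self.dst is None:
            # No tracker: the policy consumes the dialog acts directly.
            policy_input = acts
        else:
            # With a tracker: the policy consumes the updated state.
            self.state = self.dst(self.state, acts)
            policy_input = self.state
        return self.policy(policy_input)

def nlu(text):
    # Toy NLU: recognise a single request.
    return [["Request", "Hotel", "Addr", "?"]] if "address" in text else []

def dst(state, acts):
    # Toy tracker: remember the most recent acts.
    new_state = dict(state)
    new_state["last_acts"] = acts
    return new_state

def describe(policy_input):
    # Toy "policy" that only reports what kind of input it received.
    return type(policy_input).__name__

with_dst = ToyPipeline(nlu, dst, describe)
without_dst = ToyPipeline(nlu, None, describe)
print(with_dst.step("what is the address ?"))     # dict
print(without_dst.step("what is the address ?"))  # list
```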

In [10]:
# MILU
user_nlu = MILU()
# no DST: the policy takes dialog acts directly
user_dst = None
# rule policy in user mode (the Agenda simulator policy)
user_policy = RulePolicy(character='usr')
# template NLG
user_nlg = TemplateNLG(is_user=True)
# assemble
user_agent = PipelineAgent(user_nlu, user_dst, user_policy, user_nlg, name='user')

Load from https://convlab.blob.core.windows.net/convlab-2/new_milu(20200922)_multiwoz_all_context.tar.gz




Loading goal model is done



Now we have a simulator and an agent. We will use `BiSession`, an existing simple one-to-one conversation controller; you can also define your own Session class for special needs.
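If you do write your own controller, its core is a loop that alternates between the two speakers. The sketch below shows that idea with scripted toy agents. `ScriptedAgent`, `ToySession`, and the three-value return of `next_turn` are all hypothetical simplifications, not the `BiSession` interface.

```python
class ScriptedAgent:
    """Hypothetical agent that replays a fixed list of utterances."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.turn = 0

    def response(self, observation):
        # Ignore the observation and replay the next scripted line.
        line = self.lines[min(self.turn, len(self.lines) - 1)]
        self.turn += 1
        return line

    def is_terminated(self):
        return self.turn >= len(self.lines)

class ToySession:
    """Hypothetical two-party controller: the user speaks, then the system."""

    def __init__(self, sys_agent, user_agent):
        self.sys_agent = sys_agent
        self.user_agent = user_agent

    def next_turn(self, last_sys_response):
        user_response = self.user_agent.response(last_sys_response)
        session_over = self.user_agent.is_terminated()
        sys_response = self.sys_agent.response(user_response)
        return sys_response, user_response, session_over

user = ScriptedAgent(["I need a restaurant .", "Thank you , bye ."])
system = ScriptedAgent(["Which area ?", "Goodbye ."])
session = ToySession(system, user)

transcript = []
sys_response = ""
for _ in range(20):  # hard cap so a misbehaving agent cannot loop forever
    sys_response, user_response, over = session.next_turn(sys_response)
    transcript.append((user_response, sys_response))
    if over:
        break
print(transcript)
```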

We add `MultiWozEvaluator` to evaluate performance. It uses the parsed dialog act input and the policy's output dialog acts to calculate **inform F1**, **book rate**, and whether the task is a **success**.
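As a reminder of what an F1 score over informed slots means, here is the standard precision/recall/F1 computation over two sets. It is a generic sketch, not the code inside `MultiWozEvaluator`; the slot names and the helper `prf1` are made up for the example.

```python
def prf1(predicted, required):
    """Standard precision, recall and F1 between two collections of items."""
    predicted, required = set(predicted), set(required)
    true_positives = len(predicted & required)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(required) if required else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: slots the user asked for vs. slots the system gave.
requested = {"restaurant-phone", "train-price"}
informed = {"restaurant-phone", "restaurant-address"}
precision, recall, f1 = prf1(informed, requested)
print(precision, recall, f1)
```

Precision penalizes information nobody asked for, while recall penalizes requested information that was never provided.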

In [11]:
evaluator = MultiWozEvaluator()
sess = BiSession(sys_agent=sys_agent, user_agent=user_agent, kb_query=None, evaluator=evaluator)

Let's make these two agents chat! The key is the `next_turn` method of the `BiSession` class.

In [12]:
def set_seed(r_seed):
    random.seed(r_seed)
    np.random.seed(r_seed)
    torch.manual_seed(r_seed)

set_seed(20200131)

sys_response = ''
sess.init_session()
print('init goal:')
pprint(sess.evaluator.goal)
print('-'*50)
for i in range(20):
    sys_response, user_response, session_over, reward = sess.next_turn(sys_response)
    print('user:', user_response)
    print('sys:', sys_response)
    print()
    if session_over is True:
        break
print('task success:', sess.evaluator.task_success())
print('book rate:', sess.evaluator.book_rate())
print('inform precision/recall/f1:', sess.evaluator.inform_F1())
print('-'*50)
print('final goal:')
pprint(sess.evaluator.goal)
print('='*100)

init goal:
{'restaurant': {'info': {'area': 'west', 'pricerange': 'moderate'},
                'reqt': {'phone': '?'}},
 'train': {'info': {'arriveBy': '10:45',
                    'day': 'sunday',
                    'departure': 'norwich',
                    'destination': 'cambridge'},
           'reqt': {'price': '?'}}}
--------------------------------------------------
user: I need to find information about a certain restaurant , can you help with that ? I am looking for a moderate restaurant . Is that located in the west ?
sys: I recommend saint johns chop house. Okay , may I suggest british food ? There are 3 different places that match your description .

user: What is the phone number of the restaurant ?
sys: 01799521260 is the restaurant phone number.

user: I also need a train. I need to find a train to cambridge please . I 'll be departing from norwich . I would like to leave on sunday .
sys: When would you like to leave by ? Is there a time you need to arrive by ?

user: 

## Try different module combinations

Pipeline agent modules can be combined flexibly. We support joint models such as MDBT, TRADE, and SUMBT for word-DST and MDRG, HDSA, and LaRL for word-Policy, as long as each module's input and output match those of the previous and next modules. We also support end-to-end models such as Sequicity.

Available models:

- NLU: BERTNLU, MILU, SVMNLU
- DST: RuleDST
- Word-DST: SUMBT, TRADE, MDBT (set `sys_nlu` to `None`)
- Policy: RulePolicy, Imitation, REINFORCE, PPO, GDPL
- Word-Policy: MDRG, HDSA, LaRL (set `sys_nlg` to `None`)
- NLG: Template, SCLSTM
- End2End: Sequicity, DAMD, RNN_rollout (directly used as `sys_agent`)
- Simulator policy: Agenda, VHUS (for `user_policy`)
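The two constraints in the list above (word-DST replaces the NLU, word-Policy replaces the NLG) can be written down as a small sanity check. The helper `check_combination` is hypothetical and works on model names as plain strings, with `None` standing for an omitted module; it is not part of ConvLab-2.

```python
# Model families taken from the list above.
WORD_DST = {"SUMBT", "TRADE", "MDBT"}
WORD_POLICY = {"MDRG", "HDSA", "LaRL"}

def check_combination(nlu, dst, policy, nlg):
    """Return a list of problems with a proposed module combination."""
    problems = []
    if dst in WORD_DST and nlu is not None:
        problems.append(f"{dst} is a word-DST: set nlu to None")
    if policy in WORD_POLICY and nlg is not None:
        problems.append(f"{policy} is a word-policy: set nlg to None")
    return problems

print(check_combination("BERTNLU", "RuleDST", "RulePolicy", "Template"))
print(check_combination("BERTNLU", "TRADE", "RulePolicy", "Template"))
print(check_combination(None, "SUMBT", "LaRL", None))
```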


In [14]:
# available NLU models
from convlab2.nlu.svm.multiwoz import SVMNLU
from convlab2.nlu.jointBERT.multiwoz import BERTNLU
from convlab2.nlu.milu.multiwoz import MILU
# available DST models
from convlab2.dst.rule.multiwoz import RuleDST
# from convlab2.dst.mdbt.multiwoz import MDBT
from convlab2.dst.sumbt.multiwoz import SUMBT
from convlab2.dst.trade.multiwoz import TRADE
# available Policy models
from convlab2.policy.rule.multiwoz import RulePolicy
from convlab2.policy.ppo.multiwoz import PPOPolicy
from convlab2.policy.pg.multiwoz import PGPolicy
from convlab2.policy.mle.multiwoz import MLEPolicy
from convlab2.policy.gdpl.multiwoz import GDPLPolicy
from convlab2.policy.vhus.multiwoz import UserPolicyVHUS
from convlab2.policy.mdrg.multiwoz import MDRGWordPolicy
from convlab2.policy.hdsa.multiwoz import HDSA
from convlab2.policy.larl.multiwoz import LaRL
# available NLG models
from convlab2.nlg.template.multiwoz import TemplateNLG
from convlab2.nlg.sclstm.multiwoz import SCLSTM
# available E2E models
from convlab2.e2e.sequicity.multiwoz import Sequicity
from convlab2.e2e.damd.multiwoz import Damd


NLU+RuleDST or Word-DST:

In [15]:
# NLU+RuleDST:
sys_nlu = BERTNLU()
# sys_nlu = MILU()
# sys_nlu = SVMNLU()
sys_dst = RuleDST()

# or Word-DST:
# sys_nlu = None
# sys_dst = SUMBT()
# sys_dst = TRADE()
# sys_dst = MDBT()


intent num: 137
tag num: 331
Load from /home/archraven/Documents/GitHub/ConvLab-2/convlab2/nlu/jointBERT/multiwoz/output/all_context/pytorch_model.bin
bert-base-uncased



BERTNLU loaded


Policy+NLG or Word-Policy:

In [16]:
# Policy+NLG:
sys_policy = RulePolicy()
# sys_policy = PPOPolicy()
# sys_policy = PGPolicy()
# sys_policy = MLEPolicy()
# sys_policy = GDPLPolicy()
sys_nlg = TemplateNLG(is_user=False)
# sys_nlg = SCLSTM(is_user=False)

# or Word-Policy:
# sys_policy = LaRL()
# sys_policy = HDSA()
# sys_policy = MDRGWordPolicy()
# sys_nlg = None

Assemble the Pipeline system agent:

In [17]:
sys_agent = PipelineAgent(sys_nlu, sys_dst, sys_policy, sys_nlg, 'sys')

Or directly use an end-to-end model:

In [0]:
# sys_agent = Sequicity()
# sys_agent = Damd()

Configure a user agent similarly:

In [18]:
user_nlu = BERTNLU()
# user_nlu = MILU()
# user_nlu = SVMNLU()
user_dst = None
user_policy = RulePolicy(character='usr')
# user_policy = UserPolicyVHUS(load_from_zip=True)
user_nlg = TemplateNLG(is_user=True)
# user_nlg = SCLSTM(is_user=True)
user_agent = PipelineAgent(user_nlu, user_dst, user_policy, user_nlg, name='user')



intent num: 137
tag num: 331
Load from /home/archraven/Documents/GitHub/ConvLab-2/convlab2/nlu/jointBERT/multiwoz/output/all_context/pytorch_model.bin
bert-base-uncased



BERTNLU loaded
Loading goal model is done


## Use analysis tool to diagnose the system
We provide an analysis tool that presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The analyzer generates an HTML report containing these statistics. For more information, please refer to `convlab2/util/analysis_tool`.
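To give a feel for the kind of aggregation such a report rests on, the sketch below summarizes a few hand-written dialogue records. The record fields (`domain`, `success`, `turns`), the sample values, and the helper `summarize` are assumptions made for this example; they are not the analyzer's data format or real results.

```python
from collections import defaultdict

def summarize(records):
    """Aggregate overall and per-domain statistics from dialogue records."""
    total = len(records)
    by_domain = defaultdict(lambda: [0, 0])  # domain -> [successes, dialogues]
    for record in records:
        by_domain[record["domain"]][0] += int(record["success"])
        by_domain[record["domain"]][1] += 1
    return {
        "success_rate": sum(r["success"] for r in records) / total,
        "avg_turns": sum(r["turns"] for r in records) / total,
        "domain_success": {d: s / n for d, (s, n) in by_domain.items()},
    }

# Made-up records for illustration only.
records = [
    {"domain": "hotel", "success": True, "turns": 6},
    {"domain": "hotel", "success": False, "turns": 12},
    {"domain": "train", "success": True, "turns": 8},
    {"domain": "taxi", "success": True, "turns": 6},
]
report = summarize(records)
print(report)
```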

In [19]:
from convlab2.util.analysis_tool.analyzer import Analyzer

# if sys_nlu!=None, set use_nlu=True to collect more information
analyzer = Analyzer(user_agent=user_agent, dataset='multiwoz')

set_seed(20200131)
analyzer.comprehensive_analyze(sys_agent=sys_agent, model_name='sys_agent', total_dialog=100)


To compare several models:

In [0]:
# sys_agent1 and sys_agent2 are placeholders: assemble two system agents as shown above first
set_seed(20200131)
analyzer.compare_models(agent_list=[sys_agent1, sys_agent2], model_name=['sys_agent1', 'sys_agent2'], total_dialog=100)