# NLP PROJECT
In this project, we use the [Reformer](https://arxiv.org/abs/2001.04451), also known as the efficient Transformer, to generate a dialogue between two bots. We feed conversations to the model so that it learns the context of each one. It learns not only how to answer questions but also how to ask for more information when it needs it. For example, after a customer asks for a train ticket, the chatbot can ask what time the customer wants to leave. The same idea can be used to automate call centers, hotel receptions, personal trainers, or any type of customer service.

The breakdown of this notebook is as follows:

* Understand how the Reformer works
* Explore the [MultiWoz](https://arxiv.org/abs/1810.00278) dataset
* Process the data to feed it into the model
* Train your model
* Generate a dialogue by feeding a question to the model

<a name="1"></a>
# Part 1:   Exploring the MultiWoz dataset

We will start by exploring the MultiWoz dataset. The dataset has more than 10,000 human-annotated dialogues and spans multiple domains and topics. Some dialogues cover multiple domains, while others cover a single domain.

In [None]:
import json
import random
import numpy as np
from termcolor import colored,cprint

!pip -q install trax
import trax   
from trax import layers as tl
from trax.supervised import training
!pip list | grep trax

trax                          1.4.1


In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


In [None]:
DATA_FILE = 'data.json'

DATA_DIR = '/content/gdrive/MyDrive/week4/data'

# dictionary where we load the dialogue dataset
DIALOGUE_DB = {}

# vocabulary filename
VOCAB_FILE = 'en_32k.subword'

# vocabulary file directory
VOCAB_DIR = '/content/gdrive/MyDrive/week4/data/vocabs'

In [None]:
def load_json(directory, file):
    # use a separate name for the file handle so the `file` argument is not shadowed
    with open(f'{directory}/{file}') as json_file:
        db = json.load(json_file)
    return db

DIALOGUE_DB = load_json(DATA_DIR, DATA_FILE)

In [None]:
print(f'The number of dialogues is: {len(DIALOGUE_DB)}')

The dialogues are composed of multiple files and the filenames are used as keys in our dictionary. Those with multi-domain dialogues have "MUL" in their filenames while single domain dialogues have either "SNG" or "WOZ".

In [None]:
print(list(DIALOGUE_DB.keys())[0:7]) 
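
To get a feel for how the dialogues are distributed, the filename markers can be tallied. The sketch below uses a small hypothetical dictionary (`toy_db`) in place of `DIALOGUE_DB`, and the helper name `count_by_marker` is an assumption made for illustration.

```python
from collections import Counter

def count_by_marker(dialogue_db, markers=('MUL', 'SNG', 'WOZ')):
    """Count dialogue filenames by the domain marker they contain."""
    counts = Counter()
    for filename in dialogue_db:
        # use the first marker found in the name; anything else is grouped under 'other'
        marker = next((m for m in markers if m in filename), 'other')
        counts[marker] += 1
    return counts

# hypothetical stand-in for DIALOGUE_DB (the values are irrelevant here)
toy_db = {'MUL0001.json': {}, 'MUL0002.json': {}, 'SNG0073.json': {}, 'WOZ20001.json': {}}
marker_counts = count_by_marker(toy_db)
print(marker_counts)
```

Running the same helper on `DIALOGUE_DB` would show how many multi-domain and single-domain dialogues are in the real data.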

There are 10,438 conversations, each in its own file.  We will train our model on all those conversations. Each file is also loaded into a dictionary and each has two keys which are the following:

In [None]:
print(DIALOGUE_DB['SNG0073.json'].keys())

The `goal` value is itself a dictionary, and it contains several keys that describe the objectives of the conversation. In the example below, we can see that the conversation is about booking a taxi.

In [None]:
DIALOGUE_DB['SNG0073.json']['goal']

The `log`, on the other hand, contains the dialogue. It is a list of dictionaries, and each element holds one utterance under the `text` key along with other annotations.

In [None]:
# get first element of the log list
DIALOGUE_DB['SNG0073.json']['log'][0]

In [None]:
print(' Person 1: ', DIALOGUE_DB['SNG0073.json']['log'][0]['text'])
print(' Person 2: ',DIALOGUE_DB['SNG0073.json']['log'][1]['text'])

<a name="ex01"></a>
The `get_conversation()` function extracts the conversation from one of the dataset's files, prefixing alternating turns with ` Person 1: ` and ` Person 2: `.

In [None]:
def get_conversation(file, data_db):
    result = ''
    len_msg_log = len(data_db[file]['log'])
    delimiter_1 = ' Person 1: '
    delimiter_2 = ' Person 2: '
    
    for i in range(len_msg_log):
        cur_log = data_db[file]['log'][i]
        if i%2 == 0:                   
            result += delimiter_1
        else: 
            result += delimiter_2
        result += cur_log['text']
    return result
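
To make the output format concrete, here is a small self-contained sketch. It restates the function in compact form and runs it on a hypothetical two-turn dialogue (`toy_db`) that mirrors the structure of `DIALOGUE_DB`.

```python
def get_conversation(file, data_db):
    """Compact version of the function above: join turns with speaker prefixes."""
    result = ''
    for i, cur_log in enumerate(data_db[file]['log']):
        # even turns belong to Person 1, odd turns to Person 2
        result += ' Person 1: ' if i % 2 == 0 else ' Person 2: '
        result += cur_log['text']
    return result

# hypothetical dialogue, not taken from the dataset
toy_db = {'toy.json': {'goal': {},
                       'log': [{'text': 'I need a taxi.'},
                               {'text': 'When would you like to leave?'}]}}
conversation = get_conversation('toy.json', toy_db)
print(repr(conversation))
```

Note that the whole dialogue becomes one string with a leading space. The model later learns to emit these speaker prefixes itself.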

In [None]:
file = 'SNG01856.json'
conversation = get_conversation(file, DIALOGUE_DB)
print(conversation)

In [None]:
DIALOGUE_DB['SNG01856.json']['log'][0]

The dataset also comes with attraction, hotel, hospital, taxi, train, police, and restaurant databases. If a user needs to reach a doctor, a hotel, or a taxi, for example, these databases supply the information needed to automate the entire conversation. Take a look at the files accompanying the dataset.

In [None]:
attraction_file = open('/content/gdrive/MyDrive/week4/data/attraction_db.json')
attractions = json.load(attraction_file)
print(attractions[0])

In [None]:
hospital_file = open('/content/gdrive/MyDrive/week4/data/hospital_db.json')
hospitals = json.load(hospital_file)
print(hospitals[0])

In [None]:
hotel_file = open('/content/gdrive/MyDrive/week4/data/hotel_db.json')
hotels = json.load(hotel_file)
print(hotels[0])

In [None]:
police_file = open('/content/gdrive/MyDrive/week4/data/police_db.json')
police = json.load(police_file)
print(police[0])

In [None]:
restaurant_file = open('/content/gdrive/MyDrive/week4/data/restaurant_db.json')
restaurants = json.load(restaurant_file)
print(restaurants[0])

In [None]:
with open('/content/gdrive/MyDrive/week4/data/README') as file:
    print(file.read())

In [None]:
all_files = DIALOGUE_DB.keys()

untokenized_data = []

for file in all_files:
    result = get_conversation(file, DIALOGUE_DB)
    untokenized_data.append(result)
print(untokenized_data[0])

Now let us split the list into a training set and an evaluation set.

In [None]:
# shuffle the list we generated above
random.shuffle(untokenized_data)

cut_off = int(len(untokenized_data) * .05)

train_data, eval_data = untokenized_data[:-cut_off], untokenized_data[-cut_off:]

print(f'number of conversations in the data set: {len(untokenized_data)}')
print(f'number of conversations in train set: {len(train_data)}')
print(f'number of conversations in eval set: {len(eval_data)}')
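
One detail of this slicing pattern is worth keeping in mind: if `cut_off` ever evaluates to 0 (fewer than 20 items at a 5% fraction), `data[:-0]` is an empty list and the training set ends up empty. The sketch below is one possible guarded variant. The function name `split_train_eval`, its `seed` parameter, and the toy data are assumptions for illustration.

```python
import random

def split_train_eval(data, eval_fraction=0.05, seed=None):
    """Shuffle a copy of `data` and split off the last `eval_fraction` for evaluation."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    # keep at least one eval example: with cut_off == 0, shuffled[:-0] would be empty
    cut_off = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[:-cut_off], shuffled[-cut_off:]

toy_data = [f'conversation {i}' for i in range(100)]
train_toy, eval_toy = split_train_eval(toy_data, seed=0)
print(len(train_toy), len(eval_toy))
```

With the full dataset the guard never triggers, so the result matches the cell above.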

<a name="2.1"></a>
## 2.1   Tokenizing, batching with bucketing
We can now proceed to generate tokenized batches of our data. Let's first define a utility generator function to yield elements from our datasets:

In [None]:
def stream(data):
    # loop indefinitely; each iteration samples from the data at random
    while True:
        # get a random element
        d = random.choice(data)
        # yield a tuple pair of identical values 
        # (i.e. our inputs to the model will also be our targets during training)
        yield (d, d)
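
Because the generator never terminates, it has to be consumed with something like `next()` or `itertools.islice`. The sketch below repeats the generator and pulls three pairs from a hypothetical two-item list to show that inputs and targets are identical.

```python
import itertools
import random

def stream(data):
    # endless generator: sample a random element and yield it as (input, target)
    while True:
        d = random.choice(data)
        yield (d, d)

# hypothetical data, standing in for train_data
toy_data = ['first conversation', 'second conversation']
pairs = list(itertools.islice(stream(toy_data), 3))
for inputs, targets in pairs:
    print(inputs == targets, inputs)
```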

Now let's define our data pipeline for tokenizing and batching our data.

In [None]:
# trax allows us to use combinators to generate our data pipeline
data_pipeline = trax.data.Serial(
    # randomize the stream
    trax.data.Shuffle(),
    
    # tokenize the data
    trax.data.Tokenize(vocab_dir=VOCAB_DIR,
                       vocab_file=VOCAB_FILE),
    
    # filter too long sequences
    trax.data.FilterByLength(2048),
    
    # bucket by length
    trax.data.BucketByLength(boundaries=[128, 256,  512, 1024],
                             batch_sizes=[16,    8,    4,   2, 1]),
    
    # add loss weights but do not add it to the padding tokens (i.e. 0)
    trax.data.AddLossWeights(id_to_mask=0)
)

# apply the data pipeline to our train and eval sets
train_stream = data_pipeline(stream(train_data))
eval_stream = data_pipeline(stream(eval_data))
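
To see how the `boundaries` and `batch_sizes` lists relate, the following pure-Python sketch maps a sequence length to a bucket. It assumes a sequence goes into the first bucket whose boundary is at least its length, with one extra bucket for anything longer than the last boundary. The helper `bucket_for_length` is hypothetical and not part of Trax.

```python
import bisect

BOUNDARIES = [128, 256, 512, 1024]
BATCH_SIZES = [16, 8, 4, 2, 1]  # one more entry than BOUNDARIES

def bucket_for_length(length, boundaries=BOUNDARIES, batch_sizes=BATCH_SIZES):
    """Return (bucket index, batch size) for a sequence of `length` tokens."""
    # bisect_left finds the first boundary that is >= length;
    # lengths above the last boundary fall into the extra final bucket
    idx = bisect.bisect_left(boundaries, length)
    return idx, batch_sizes[idx]

for n in (100, 128, 129, 600, 2000):
    print(n, bucket_for_length(n))
```

Longer sequences get smaller batches, which keeps the number of tokens per batch roughly bounded.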

In [None]:
# the stream generators will yield (input, target, weights). let's just grab the input for inspection
inp, _, _ = next(train_stream)

# print the shape. format is (batch size, token length)
print("input shape: ", inp.shape)

# detokenize the first element
print(trax.data.detokenize(inp[0], vocab_dir=VOCAB_DIR, vocab_file=VOCAB_FILE))
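
The third element of each batch, which the cell above discards, is the loss-weight array. As a rough illustration of what masking the padding id means, the NumPy sketch below builds such weights by hand. The helper `loss_weights` and the toy token ids are assumptions, not Trax code.

```python
import numpy as np

def loss_weights(targets, id_to_mask=0):
    """Weight 1.0 for real tokens and 0.0 for positions equal to `id_to_mask`."""
    return (targets != id_to_mask).astype(np.float32)

# hypothetical padded batch: 0 is the padding id
toy_targets = np.array([[12, 7, 33, 0, 0],
                        [5, 9, 0, 0, 0]])
weights = loss_weights(toy_targets)
print(weights)
```

Positions with weight 0.0 do not contribute to the loss, so the model is not rewarded for predicting padding.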

In [None]:
import sys
sys.path.insert(0,'/content/')

<a name="3"></a>
# Part 3:   Reversible layers

When running large deep models, we often run out of memory as each layer allocates memory to store activations for use in backpropagation. To save this resource, we need to be able to recompute these activations during the backward pass without storing them during the forward pass. Take a look first at the leftmost diagram below. 

This is how residual connections are implemented in the standard Transformer. Given that `F()` is Attention and `G()` is Feed-forward (FF), it follows that:

\begin{align}  
\mathrm{y}_\mathrm{a} &= \mathrm{x} + \mathrm{F}\left(\mathrm{x}\right)\tag{1} \\
\mathrm{y}_{b}&=\mathrm{y}_{a}+\mathrm{G}\left(\mathrm{y}_{a}\right)\tag{2}\\
\end{align}


As you can see, this requires that $\mathrm{x}$ and $\mathrm{y}_{a}$ be saved so they can be used during backpropagation. We want to avoid this to conserve memory, and this is where reversible residual connections come in. They are shown in the middle and rightmost diagrams above. The key idea is that we start with two copies of the input to the model, and at each layer we update only one of them. The activations that we *don't* update are the ones used to compute the residuals.

In this reversible setup, you get the following instead:

\begin{align}  
\mathrm{y}_{1}&=\mathrm{x}_{1}+\mathrm{F}\left(\mathrm{x}_{2}\right)\tag{3}\\
\mathrm{y}_{2}&=\mathrm{x}_{2}+\mathrm{G}\left(\mathrm{y}_{1}\right)\tag{4}\\
\end{align}

To recover $(\mathrm{x}_{1}, \mathrm{x}_{2})$ from $(\mathrm{y}_{1}, \mathrm{y}_{2})$, invert the two equations in reverse order:

\begin{align}  
\mathrm{x}_{2}&=\mathrm{y}_{2}-\mathrm{G}\left(\mathrm{y}_{1}\right)\tag{5}\\
\mathrm{x}_{1}&=\mathrm{y}_{1}-\mathrm{F}\left(\mathrm{x}_{2}\right)\tag{6}\\
\end{align}

With this configuration, we are now able to run the network fully in reverse. During the backward pass, $\mathrm{x}_{2}$ and $\mathrm{x}_{1}$ can be recomputed based solely on the values of $\mathrm{y}_{2}$ and $\mathrm{y}_{1}$, so there is no need to save them during the forward pass.

In [None]:
# UNQ_C2
# GRADED FUNCTION: reversible_layer_forward
def reversible_layer_forward(x, f, g):
    x1, x2 = np.split(x, 2, axis=-1) 

    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    # concatenate y1 and y2 along the depth dimension. be sure output is of type np.ndarray
    y = np.concatenate([y1, y2], axis=-1)
    return y

We will now implement the `reversible_layer_reverse` function. The inversion is possible because, given $y_1$ and $y_2$ along with the functions `f` (attention) and `g` (feed-forward), we can recompute $x_2$ and then $x_1$ using equations 5 and 6.

In [None]:
def reversible_layer_reverse(y, f, g):
    y1, y2 = np.split(y, 2, axis=-1)

    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    # concatenate x1 and x2 along the depth dimension
    x = np.concatenate([x1, x2], axis=-1)
    return x
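
A quick way to convince yourself that the two functions are inverses is a round-trip check. The sketch below repeats both functions so that it runs on its own, and uses simple element-wise functions as stand-ins for attention and feed-forward. Those stand-ins and the input shape are assumptions for illustration only.

```python
import numpy as np

def reversible_layer_forward(x, f, g):
    x1, x2 = np.split(x, 2, axis=-1)
    y1 = x1 + f(x2)          # equation 3
    y2 = x2 + g(y1)          # equation 4
    return np.concatenate([y1, y2], axis=-1)

def reversible_layer_reverse(y, f, g):
    y1, y2 = np.split(y, 2, axis=-1)
    x2 = y2 - g(y1)          # equation 5
    x1 = y1 - f(x2)          # equation 6
    return np.concatenate([x1, x2], axis=-1)

# toy stand-ins for attention and feed-forward (not the real layers)
f = lambda v: np.tanh(v)
g = lambda v: 0.5 * v ** 2

x = np.random.default_rng(0).normal(size=(2, 8))
y = reversible_layer_forward(x, f, g)
x_recovered = reversible_layer_reverse(y, f, g)
print(np.allclose(x, x_recovered))
```

Note that `f` and `g` do not need to be invertible themselves. Only the additive structure of the layer is inverted.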

In [None]:
#Reformer Language Model
def ReformerLM(vocab_size=33000, n_layers=2, mode='train', attention_type=tl.SelfAttention):
    model = tl.Serial( 
                trax.models.reformer.ReformerLM( 
                vocab_size=vocab_size,
                n_layers=n_layers,
                mode=mode,
                attention_type=attention_type
            ), tl.LogSoftmax() 
        )
    return model

In [None]:
temp_model = ReformerLM(mode='train')
print(str(temp_model))
del temp_model 

We will now use the `CrossEntropyLoss` loss function with the Adam optimizer to train our network.

In [None]:
def training_loop(ReformerLM, train_gen, eval_gen, output_dir = "./model/"):
    # use the warmup_and_rsqrt_decay learning rate schedule
    lr_schedule = trax.lr.warmup_and_rsqrt_decay(
        n_warmup_steps=1000, max_value=0.01)

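    # keyword arguments avoid depending on the positional order of TrainTask's parameters
    train_task = training.TrainTask(
        labeled_data=train_gen,
        loss_layer=tl.CrossEntropyLoss(),
        optimizer=trax.optimizers.Adam(0.01),
        lr_schedule=lr_schedule,
        n_steps_per_checkpoint=None)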

    eval_task = training.EvalTask(                      
        labeled_data=eval_gen,
        metrics=[tl.CrossEntropyLoss(), tl.Accuracy()]
    )

    loop = training.Loop(ReformerLM(mode='train'),
                         train_task,
                         eval_tasks=[eval_task],
                         output_dir=output_dir)
    return loop
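
For intuition about the learning rate schedule, here is a plain-Python sketch of the general warmup-then-inverse-square-root shape. It is a common textbook form of this schedule. Trax's exact implementation may differ in offsets and edge cases, and the function name `warmup_then_rsqrt` is hypothetical.

```python
import math

def warmup_then_rsqrt(step, n_warmup_steps=1000, max_value=0.01):
    """Linear warmup to `max_value`, then decay proportional to 1/sqrt(step)."""
    step = max(step, 1)
    if step < n_warmup_steps:
        # linear ramp from ~0 up to max_value
        return max_value * step / n_warmup_steps
    # after warmup, the rate shrinks with the inverse square root of the step
    return max_value * math.sqrt(n_warmup_steps / step)

for s in (1, 500, 1000, 4000):
    print(s, warmup_then_rsqrt(s))
```

Under this form, the rate peaks at the end of warmup and halves each time the step count quadruples.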

In [None]:
# # UNIT TEST COMMENT: Use the train task and eval task for grading train_model
# test_loop = training_loop(ReformerLM, train_stream, eval_stream)
# train_task = test_loop.tasks
# eval_task = test_loop.eval_tasks

# print(train_task)
# print(eval_task)



[<trax.supervised.training.TrainTask object at 0x7fb3600cf7c0>]
[<trax.supervised.training.EvalTask object at 0x7fb36006cc10>]


In [None]:
# # BEGIN UNIT TEST
# # w4_unittest.test_tasks(train_task, eval_task)
# w4_unittest.test_tasks(test_loop)
# # END UNIT TEST

All tests passed


In [None]:
# !rm -f model/model.pkl.gz
# loop = training_loop(ReformerLM, train_stream, eval_stream)
# loop.run(10)



**Approximate Expected output:**  

```

Step      1: Ran 1 train steps in 55.73 secs
Step      1: train CrossEntropyLoss |  10.41907787
Step      1: eval  CrossEntropyLoss |  10.41005802
Step      1: eval          Accuracy |  0.00000000

Step     10: Ran 9 train steps in 108.21 secs
Step     10: train CrossEntropyLoss |  10.15449715
Step     10: eval  CrossEntropyLoss |  9.63478279
Step     10: eval          Accuracy |  0.16350447
``` 
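
One way to sanity-check the initial loss: an untrained model that spreads probability uniformly over the vocabulary has a cross-entropy of `ln(vocab_size)`. With the `vocab_size=33000` used in this notebook, that comes to roughly 10.40, which is close to the step 1 values shown above.

```python
import math

vocab_size = 33000
# cross-entropy of a uniform distribution over the vocabulary
uniform_loss = math.log(vocab_size)
print(f'{uniform_loss:.4f}')
```

A starting loss far from this value usually points to a problem with the vocabulary size or the loss weights rather than with the model itself.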

We will be using the [autoregressive_sample_stream()](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.decoding.autoregressive_sample_stream) decoding method from Trax to do fast inference. Let's define a few parameters to initialize our model.

In [None]:
# define the `predict_mem_len` and `predict_drop_len` of tl.SelfAttention
def attention(*args, **kwargs):
    # number of input positions to remember in a cache when doing fast inference. 
    kwargs['predict_mem_len'] = 120
    # number of input elements to drop once the fast inference input cache fills up.
    kwargs['predict_drop_len'] = 120
    # return the attention layer with the parameters defined above
    return tl.SelfAttention(*args, **kwargs)

# define the model using the ReformerLM function you implemented earlier.
model = ReformerLM(
    vocab_size=33000,
    n_layers=6,
    mode='predict',
    attention_type=attention,
)

# define an input signature so we can initialize our model. shape will be (1, 1) and the data type is int32.
shape11 = trax.shapes.ShapeDtype((1, 1), dtype=np.int32)

We can now initialize our model from a file containing the pretrained weights. We will save this starting state so we can reset the model state when we generate a new conversation. This will become clearer in the `generate_dialogue()` function later.

In [None]:
# initialize from file
model.init_from_file('/content/gdrive/MyDrive/week4/chatbot_model1.pkl.gz',weights_only=True, input_signature=shape11)

# save the starting state
STARTING_STATE = model.state

Let's define a few utility functions as well to help us tokenize and detokenize. We can use the [tokenize()](https://trax-ml.readthedocs.io/en/latest/trax.data.html#trax.data.tf_inputs.tokenize) and [detokenize()](https://trax-ml.readthedocs.io/en/latest/trax.data.html#trax.data.tf_inputs.detokenize) from `trax.data.tf_inputs` to do this.

In [None]:
def tokenize(sentence, vocab_file, vocab_dir):
    return list(trax.data.tokenize(iter([sentence]), vocab_file=vocab_file, vocab_dir=vocab_dir))[0]

def detokenize(tokens, vocab_file, vocab_dir):
    return trax.data.detokenize(tokens, vocab_file=vocab_file, vocab_dir=vocab_dir)

We are now ready to define our decoding function. It returns a generator that yields the next symbol output by the model, so the model can continue a conversation from just a starting sentence.

<a name="ex06"></a>
### Exercise 06
**Instructions:** Implement the function below to return a generator that predicts the next word of the conversation.

In [None]:
# UNQ_C6
# GRADED FUNCTION
def ReformerLM_output_gen(ReformerLM, start_sentence, vocab_file, vocab_dir, temperature):
    """
    Args:
        ReformerLM:  the Reformer language model you just trained
        start_sentence (string): starting sentence of the conversation
        vocab_file (string): vocabulary filename
        vocab_dir (string): directory of the vocabulary file
        temperature (float): parameter for sampling ranging from 0.0 to 1.0.
            0.0: same as argmax, always pick the most probable token
            1.0: sampling from the distribution (can sometimes say random things)

    Returns:
        generator: yields the next symbol generated by the model
    """
    
    ### START CODE HERE (REPLACE INSTANCES OF 'None' WITH YOUR CODE) ###
    
    # Create input tokens using the tokenize function
    input_tokens = tokenize(start_sentence, vocab_file=vocab_file, vocab_dir=vocab_dir)
    
    # Add batch dimension to array. Convert from (n,) to (x, n) where 
    # x is the batch size. Default is 1. (hint: you can use np.expand_dims() with axis=0)
    # input_tokens_with_batch = np.array(input_tokens)[None, :]
    input_tokens_with_batch = np.expand_dims(input_tokens, axis=0)
    
    # call the autoregressive_sample_stream function from trax
    output_gen = trax.supervised.decoding.autoregressive_sample_stream( 
        # model
        model=ReformerLM,
        # inputs will be the tokens with batch dimension
        inputs=input_tokens_with_batch,
        # temperature
        temperature=temperature
    )
    
    ### END CODE HERE ###
    
    return output_gen
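
The effect of the `temperature` argument can be illustrated without the model. The sketch below uses the Gumbel-max trick: noise scaled by the temperature is added to the log-probabilities before taking the argmax. At temperature 0 this reduces to a plain argmax, and at 1.0 it samples from the distribution. The function name `sample_with_temperature` and the toy probabilities are assumptions, and Trax's internal sampling may differ in its details.

```python
import numpy as np

def sample_with_temperature(log_probs, temperature=1.0, rng=None):
    """Pick a token index from `log_probs` using temperature-scaled Gumbel noise."""
    if rng is None:
        rng = np.random.default_rng()
    # Gumbel noise; scaling it by the temperature controls the randomness
    u = rng.uniform(low=1e-6, high=1.0 - 1e-6, size=log_probs.shape)
    gumbel = -np.log(-np.log(u))
    return int(np.argmax(log_probs + gumbel * temperature, axis=-1))

# hypothetical next-token distribution over a 3-token vocabulary
toy_log_probs = np.log(np.array([0.1, 0.6, 0.3]))
greedy = sample_with_temperature(toy_log_probs, temperature=0.0)
sampled = sample_with_temperature(toy_log_probs, temperature=1.0,
                                  rng=np.random.default_rng(0))
print(greedy, sampled)
```

Low temperatures such as the 0.2 used later in this notebook stay close to the greedy choice while still allowing some variation.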

In [None]:
# BEGIN UNIT TEST
import pickle

WEIGHTS_FROM_FILE = ()

with open('weights', 'rb') as file:
    WEIGHTS_FROM_FILE = pickle.load(file)

shape11 = trax.shapes.ShapeDtype((1, 1), dtype=np.int32)

def attention(*args, **kwargs):
    kwargs['predict_mem_len'] = 120
    kwargs['predict_drop_len'] = 120
    return tl.SelfAttention(*args, **kwargs)

test_model = ReformerLM(vocab_size=5, n_layers=1, mode='predict', attention_type=attention)

test_output_gen = ReformerLM_output_gen(test_model, "test", vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, temperature=0)

test_model.init_weights_and_state(shape11)

test_model.weights = WEIGHTS_FROM_FILE

output = []

for i in range(6):
    output.append(next(test_output_gen)[0])

print(output)

# free memory
del test_model 
del WEIGHTS_FROM_FILE
del test_output_gen
# END UNIT TEST

[Array(1, dtype=int32), Array(0, dtype=int32), Array(4, dtype=int32), Array(3, dtype=int32), Array(0, dtype=int32), Array(4, dtype=int32)]


In [None]:
test_model = ReformerLM(vocab_size=5, n_layers=1, mode='predict', attention_type=attention)
test_output_gen = ReformerLM_output_gen(test_model, "test", vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, temperature=0)

import w4_unittest  # course-provided test module; assumed to be importable from sys.path

w4_unittest.test_ReformerLM_output_gen(test_model, test_output_gen)
del test_model, test_output_gen

Generated output [Array(1, dtype=int32), Array(0, dtype=int32), Array(4, dtype=int32), Array(3, dtype=int32), Array(0, dtype=int32), Array(4, dtype=int32)]
All tests passed


***Expected value:***

[1, 0, 4, 3, 0, 4]

Great! Now you will be able to see the model in action. The utility function below will call the generator you just implemented and will just format the output to be easier to read. 

In [None]:
shape11 = trax.shapes.ShapeDtype((1, 1), dtype=np.int32)

def attention(*args, **kwargs):
    kwargs['predict_mem_len'] = 120  # number of input positions to remember in the cache
    kwargs['predict_drop_len'] = 120  # number of input elements to drop once the cache fills up
    return tl.SelfAttention(*args, **kwargs)

model = ReformerLM(
    vocab_size=33000,
    n_layers=6,
    mode='predict',
    attention_type=attention,
)

In [None]:
model.init_from_file('/content/gdrive/MyDrive/week4/chatbot_model1.pkl.gz',
                     weights_only=True, input_signature=shape11)

STARTING_STATE = model.state

In [None]:
def generate_dialogue(ReformerLM, model_state, start_sentence, vocab_file, vocab_dir, max_len, temperature):
    """
    Args:
        ReformerLM:  the Reformer language model you just trained
        model_state (np.array): initial state of the model before decoding
        start_sentence (string): starting sentence of the conversation
        vocab_file (string): vocabulary filename
        vocab_dir (string): directory of the vocabulary file
        max_len (int): maximum number of tokens to generate 
        temperature (float): parameter for sampling ranging from 0.0 to 1.0.
            0.0: same as argmax, always pick the most probable token
            1.0: sampling from the distribution (can sometimes say random things)

    Returns:
        None: the dialogue is printed turn by turn as it is generated
    """  
    
    # define the delimiters we used during training
    delimiter_1 = 'Person 1: ' 
    delimiter_2 = 'Person 2: '
    
    # initialize detokenized output
    sentence = ''
    
    # token counter
    counter = 0
    
    # output tokens. we insert a ': ' for formatting
    result = [tokenize(': ', vocab_file=vocab_file, vocab_dir=vocab_dir)]
    
    # reset the model state when starting a new dialogue
    ReformerLM.state = model_state
    
    # calls the output generator implemented earlier, using the vocabulary passed to this function
    output = ReformerLM_output_gen(ReformerLM, start_sentence, vocab_file=vocab_file, vocab_dir=vocab_dir, temperature=temperature)
    
    # print the starting sentence
    print(start_sentence.split(delimiter_2)[0].strip())
    
    # loop below yields the next tokens until max_len is reached. the if-elif is just for prettifying the output.
    for o in output:
        
        result.append(o)
        
        sentence = detokenize(np.concatenate(result, axis=0), vocab_file=vocab_file, vocab_dir=vocab_dir)
        
        if sentence.endswith(delimiter_1):
            sentence = sentence.split(delimiter_1)[0]
            print(f'{delimiter_2}{sentence}')
            sentence = ''
            result.clear()
        
        elif sentence.endswith(delimiter_2):
            sentence = sentence.split(delimiter_2)[0]
            print(f'{delimiter_1}{sentence}')
            sentence = ''
            result.clear()

        counter += 1
        
        if counter > max_len:
            break
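
The turn-splitting logic in the loop can be tried without a model. The sketch below replaces the token stream with a hypothetical list of already-detokenized text pieces and collects the turns in a list instead of printing them. It slices off the delimiter rather than calling `split()`, which is a simplification of the function above, and the name `split_turns` is an assumption.

```python
def split_turns(pieces, delimiter_1='Person 1: ', delimiter_2='Person 2: '):
    """Group streamed text pieces into (speaker, utterance) turns."""
    turns = []
    buffer = ''
    for piece in pieces:
        buffer += piece
        if buffer.endswith(delimiter_1):
            # Person 1 is about to speak, so the buffered text came from Person 2
            turns.append(('Person 2', buffer[:-len(delimiter_1)].strip()))
            buffer = ''
        elif buffer.endswith(delimiter_2):
            # Person 2 is about to speak, so the buffered text came from Person 1
            turns.append(('Person 1', buffer[:-len(delimiter_2)].strip()))
            buffer = ''
    return turns

# hypothetical stream of detokenized pieces
pieces = ['I sure can. ', 'When would you like to leave? ', 'Person 1: ',
          'I want to leave after 13:00. ', 'Person 2: ']
turns = split_turns(pieces)
for speaker, text in turns:
    print(f'{speaker}: {text}')
```

Text that is still in the buffer when the stream ends is never emitted, which is why the last turn of a generated dialogue can be cut off.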

We can now feed in different starting sentences and see how the model generates the dialogue. You can even input your own starting sentence. Just remember to ask a question that covers the topics in the MultiWoz dataset so you can generate a meaningful conversation.

In [None]:
sample_sentence = ' Person 1: Are there theatres in town? Person 2: '
generate_dialogue(ReformerLM=model, model_state=STARTING_STATE, start_sentence=sample_sentence, vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, max_len=120, temperature=0.2)

Person 1: Are there theatres in town?
Person 2: : There are 4 theatres in town. Do you have a preference on area? 
Person 1: No, I don't care. Which one do you recommend? 
Person 2: I recommend the Mumford Theatre. It's in the east at Anglia Ruskin Enterprise, east road. Would you like more information on it? 
Person 1: Yes, could I have the postcode, and entrance fee please? 
Person 1: The postcode is cb11pt then i hate i hate i hate i hate i need their info 


In [None]:
sample_sentence = ' Person 1: Is there a hospital nearby? Person 2: '
generate_dialogue(ReformerLM=model, model_state=STARTING_STATE, start_sentence=sample_sentence, vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, max_len=120, temperature=0.2)

Person 1: Is there a hospital nearby?
Person 2: : Addensbrookes Hospital is located at Hills Rd, Cambridge, postcode CB20QQ. 
Person 1: Thank you for the information. That's all I need. 
Person 2: Thank you for using our services.Goodbye.
Person 1: Thank you for your help. 
Person 2: Thank you for using our services.Goodbye.
Person 1: Goodbye. 


In [None]:
sample_sentence = ' Person 1: Can you book a taxi? Person 2: '
generate_dialogue(ReformerLM=model, model_state=STARTING_STATE, start_sentence=sample_sentence, vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, max_len=120, temperature=0.2)

Person 1: Can you book a taxi?
Person 2: : I sure can. When would you like to leave? 
Person 1: I want to leave after 13:00. 
Person 2: I'm sorry, I have no listings for that time. Would you like to try a different time? 
Person 1: Yes, let's try to find a train. 
Person 2: I'm sorry, but I'm not able to book that train. Could you change your request? 


In [None]:
sample_sentence = ' Person 1: Where is the Japanese restaurant? Person 2: '
generate_dialogue(ReformerLM=model, model_state=STARTING_STATE, start_sentence=sample_sentence, vocab_file=VOCAB_FILE, vocab_dir=VOCAB_DIR, max_len=120, temperature=0.2)

Person 1: Where is the Japanese restaurant?
Person 2: : There is no such listing for the restaurant called Cote. Would you like to book a table? 
Person 1: Yes, I would like to book a table for 1 at 18:00 on Friday. 
Person 2: I'm sorry but I am unable to book that for you. Would you like to try another time or day perhaps another day perhaps? 
Person 1: How about 18:00? 
Person 2: I was able to book you at Cote. Your reference is  
