## Assignment: Create a Chatbot

### Name: Nikiforos Mandilaras

### Email: nikiforosmandi@windowslive.com

### Date: 15/11/2019

### Introduction

This notebook was developed as an assignment for the National Center of Scientific Research Demokritos.
The goal is to create a chatbot: a conversational model that responds to the user's inputs.

### Dependency packaging and Execution

To execute this Jupyter notebook, all dependent packages must be installed. For this reason we list them, along with their versions, in requirements.txt (included in the submission folder).

They can then be installed all at once using pip, as shown below. To avoid collisions with existing packages, create a new Python virtual environment and run the command inside it.

To access the data, the code below uses relative paths, assuming the data are in the same folder as the Jupyter notebook. However, all functions accept absolute paths as well.

In [1]:
!pip install -r requirements.txt



### Approach

To address this task we need a pretrained language model.
Many such state-of-the-art models are included in the widely used library **_transformers_**, developed by **_Hugging Face_**. We chose GPT because it is pretrained on next-word prediction and, compared to other models, it is relatively small, which matters since our resources are limited.

More details about the approach are provided in the report.

In [1]:
import os
import pickle
import numpy as np
import glob
import copy
import torch
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader 
from datetime import datetime
from transformers import OpenAIGPTLMHeadModel, OpenAIGPTTokenizer, WEIGHTS_NAME, CONFIG_NAME
from itertools import chain
from ast import literal_eval
from itertools import zip_longest
from sklearn.model_selection import train_test_split

### Initializing Model

Below we instantiate a GPT PyTorch model with weights pretrained on the language modelling task, along with its tokenizer.
The tokenizer is a helper class used to interact with the vocabulary on which our model has been pretrained.

In [2]:
def load_checkpoint(output_dir='openai-gpt'):
    """
    Loads GPT Model and the corresponding tokenizer from local checkpoint.
    If no path is specified the pretrained weights on language modelling task are downloaded.
    
    :param output_dir: path to checkpoint 
    :return : model and tokenizer
    """

    tokenizer = OpenAIGPTTokenizer.from_pretrained(output_dir)
    model = OpenAIGPTLMHeadModel.from_pretrained(output_dir)  
    
    return model, tokenizer

In [3]:
model, tokenizer = load_checkpoint() 

ftfy or spacy is not installed using BERT BasicTokenizer instead of SpaCy & ftfy.


In [4]:
print("Our language model have been pre-trained with a vocabulary of {} words.".format(tokenizer.vocab_size))

Our language model have been pre-trained with a vocabulary of 40478 words.


In [5]:
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print("Number of trainable model parameters: {}".format(trainable_params))

Number of trainable model parameters: 116534784


As we can see, the GPT model has been loaded; it consists of 12 layers with more than 116 million trainable parameters in total. We avoided the more recent GPT-2 because its largest version has 1.5 billion parameters, which may cause issues when loading it into RAM.
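
The reported parameter count can be reproduced by hand. The sketch below assumes the standard `openai-gpt` configuration (hidden size 768, 512 positions, 12 layers, feed-forward width 3072, output layer tied to the token embeddings). These hyperparameters are not printed in this notebook, so they are stated here as assumptions; only the vocabulary size comes from the output above.

```python
# Assumed openai-gpt hyperparameters (not shown in the notebook output).
vocab_size = 40478      # reported by tokenizer.vocab_size above
n_positions = 512
n_embd = 768
n_layer = 12
n_inner = 4 * n_embd    # feed-forward width

# token embeddings + position embeddings (the output layer reuses the token embeddings)
embeddings = vocab_size * n_embd + n_positions * n_embd

# per transformer block: query/key/value projection + output projection
attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
# per transformer block: two feed-forward layers
feed_forward = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
# per transformer block: two layer norms, each with a weight and a bias vector
layer_norms = 2 * (2 * n_embd)
per_layer = attention + feed_forward + layer_norms

total = embeddings + n_layer * per_layer
print(total)  # 116534784
```

Under these assumptions the total matches the number printed above, and roughly a quarter of the parameters sit in the token embedding matrix.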

In [8]:
def add_special_tokens_(model, tokenizer, special_tokens):
    """
    Adds special tokens to tokenizer and model.
    """
    orig_num_tokens = len(tokenizer.encoder)
    num_added_tokens = tokenizer.add_special_tokens(special_tokens) # adds only if not present
    
    if num_added_tokens > 0:
        print("New tokens added to model: {}".format(num_added_tokens))
        model.resize_token_embeddings(new_num_tokens=orig_num_tokens + num_added_tokens) 
    else:
        print("No new tokens found! Nothing added.")

We saw earlier that the pretrained model uses a vocabulary of around forty thousand words. Apart from those, our task needs some extra tokens with special meaning. Fortunately, the tokenizer can handle this process for us.

In [9]:
# various constants needed

SPECIAL_TOKENS_DICT = {'bos_token': '<bos>', 'eos_token': '<eos>', 'pad_token': '<pad>',
                         'additional_special_tokens': ('<speaker1>', '<speaker2>')}

max_history = 2 # pairs of question/answer to be retained
max_sentence_length = 20 # maximum length of a sentence produced by the model 

temperature = 0.75 # values below 1 sharpen the distribution, increasing confidence in the most probable outputs

use_cuda = True # whether to try to use cuda or not

SPECIAL_TOKENS = ["<bos>", "<eos>", "<speaker1>", "<speaker2>", "<pad>"]
add_special_tokens_(model, tokenizer, SPECIAL_TOKENS_DICT)  

SPECIAL_TOKENS_IDS = tokenizer.convert_tokens_to_ids(SPECIAL_TOKENS)
TIME_FORMAT = '%Y%m%d_%H%M'

New tokens added to model: 5


We added 5 special tokens: two denoting the start and the end of a sentence, two denoting the source of the input provided to the model (the user or the bot), and finally one denoting padding.

### Data Preprocessing

In this section we define a series of functions to handle the data appropriately. First, the function **_parser_** reads the txt files and keeps only the dialogs, in a list. It also tokenizes the phrases and converts all words to their corresponding ids in the model's vocabulary. To optimize performance, this step is performed early, before each sentence is copied into many samples as part of their history.

In [46]:
def parser(datafolder='metalwoz-v1/dialogues/', cache_file='cache_folder/dialogs.txt'): 
    """
    Function that reads files, keeps only 'turns' from each entry and tokenizes them

    :param datafolder: path to the folder that contains the files
    :param cache_file: filepath to save the result
    :return: a list that contains dialogs, each dialog is a list of lists 
             where each of them represents the ids of a phrase,
    """
    try:    # try to open cache file
        with open(cache_file, "rb") as f:
            print("Cache file found loading content.")
            dialogs = pickle.load(f)
            return dialogs
        
    except FileNotFoundError: # cache file not created yet
        print("Cache file not found. Start processing.")       
    
        dialogs = []
        files = list(glob.glob(os.path.join(datafolder ,"*.txt")))

        for file in files:
            with open(file) as f:
                for line in f.readlines():

                    dialog = literal_eval(line)['turns'][1:] # keep only turns without the first sentence
                    dialog = [tokenizer.convert_tokens_to_ids(tokenizer.tokenize(phrase)) for phrase in dialog]  
                    dialogs.append(dialog) 

        if len(dialogs) > 0:
            print("Saving parsed dialogs to file: {}".format(cache_file))
            with open(cache_file, "wb") as f: # save result so future calls can retrieve it right away
                pickle.dump(dialogs, f)            
                
    return dialogs      

Then, with the function **_extract_pairs_**, we use those dialogs to create input/output pairs. In every such pair the input preserves the entire past history of the dialog. Additionally, we discard the first sentence, which is fixed, and the last one when it is said by the user, as no bot answer follows.

In [11]:
def extract_pairs(dialogs = None, cache_file='cache_folder/pairs.txt'):
    """
    Function that creates pairs of input, output from dialogs, each dialogs corresponds now to many pairs.
    
    :param dialogs: a list with all the dialogs 
    :param cache_file: filepath to save the result
    
    :return : a list whose elements are pairs of input(history), output(expected bot reply)  
    """
    try:    # try to open cache file
        
        with open(cache_file, "rb") as f:
            print("Cache file found loading content.")
            pairs = pickle.load(f)       
            return pairs
        
    except FileNotFoundError:      # cache file not created yet
        print("Cache file not found. Start processing.")
        
        pairs = [] 
        for dialog in dialogs:
            
            t_dict = {'input': []}
            if len(dialog) % 2 != 0: # discard the last phrase if it was said by the user
                dialog = dialog[:-1]    
            dialog_it = iter(dialog)
            
            for i_phrase, o_phrase in zip_longest(dialog_it, dialog_it): # process phrases two by two        
                if "output" in t_dict: # the previous bot reply becomes part of the history
                    t_dict["input"].append(t_dict["output"])
                t_dict["input"].append(i_phrase) # history
                t_dict["output"] = o_phrase
                pairs.append(t_dict)
                t_dict = copy.deepcopy(t_dict) # so future changes address only the new dict
           
        if len(pairs) > 0:
            print("Saving extracted pairs to file: {}".format(cache_file))          
            with open(cache_file, "wb") as f:  # save result so future calls can retrieve it right away
                pickle.dump(pairs, f)
            
        return pairs
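
The pairing logic above can be illustrated on a toy dialog. This is a minimal sketch, not the notebook's function: `toy_extract_pairs` is a hypothetical helper, and plain strings stand in for the lists of token ids.

```python
def toy_extract_pairs(dialog):
    """Build input/output pairs from a list of alternating user/bot phrases."""
    if len(dialog) % 2 != 0:      # a trailing user phrase has no bot reply
        dialog = dialog[:-1]
    pairs = []
    # every bot phrase (odd index) is an output; everything before it is the input
    for end in range(1, len(dialog), 2):
        pairs.append({"input": dialog[:end], "output": dialog[end]})
    return pairs


toy_dialog = ["u1", "b1", "u2", "b2", "u3"]   # the last user phrase is dropped
toy_pairs = toy_extract_pairs(toy_dialog)
for pair in toy_pairs:
    print(pair)
# {'input': ['u1'], 'output': 'b1'}
# {'input': ['u1', 'b1', 'u2'], 'output': 'b2'}
```

Each dialog therefore yields one sample per bot reply, and later samples repeat the earlier phrases as history, which is why the number of pairs is much larger than the number of dialogs.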

A couple of auxiliary functions follow. First, **_adjust_history_** lets us tune the number of question/answer pairs retained in the history. In every case at least one sentence from the user is kept as input. We conducted training experiments with different amounts of retained history to observe whether, and to what extent, it affects the model's performance.

In [12]:
def adjust_history(pairs, max_history=2): 
    """
    Reduces the number of previous chat sentences that are included in the input
    
    :param pairs: list with samples 
    :param max_history: number of question/answer pairs to be preserved (the last user phrase is always preserved)
    :return : two lists, pairs with truncated history and their corresponding sequence lengths
    """
    
    pairs_len = []
    for pair in pairs:
        
        pair['input'] = pair['input'][-(2*max_history+1):] # at least one phrase is preserved
        pair_len = sum(len(phrase) for phrase in pair['input']) + len(pair['output'])
        pairs_len.append(pair_len)
        
    return pairs, pairs_len   
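
The slice used for truncation can be checked on a toy history. In this sketch `keep_recent` is a hypothetical helper that mirrors the expression `pair['input'][-(2*max_history+1):]`, with strings standing in for tokenized phrases.

```python
def keep_recent(history, max_history):
    """Keep the last user phrase plus up to max_history question/answer pairs before it."""
    return history[-(2 * max_history + 1):]


history = ["u1", "b1", "u2", "b2", "u3", "b3", "u4"]

print(keep_recent(history, 0))  # ['u4']
print(keep_recent(history, 1))  # ['u3', 'b3', 'u4']
print(keep_recent(history, 2))  # ['u2', 'b2', 'u3', 'b3', 'u4']
```

Because the input always ends with a user phrase and the slice length is odd, the truncated history also starts with a user phrase.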

Second, **_filter_samples_** assesses the samples based on their sequence length. Samples whose length exceeds a specified percentile are dropped. The impact of this function on performance is crucial, as we will see later.

In [13]:
def filter_samples(samples, samples_len, percentile=90):
    """
    Filters samples based on sequence lengths.  
    
    :param samples: a list with samples
    :param samples_len: their corresponding lengths
    :param percentile: percentile of the sequence lengths used as cut-off (roughly the percentage of samples to preserve)
    
    :return : two lists, preserved samples and their lengths
    """
    
    samples_length = np.array(samples_len)
    reasonable_length = np.percentile(samples_length, percentile)
    print("{}% of the samples have sequence length less than {}".format(percentile, reasonable_length))
    
    samples_red, samples_len_red = [], []
    for sample, sample_len in zip(samples, samples_len):
        
        if sample_len <= reasonable_length:
            samples_red.append(sample)
            samples_len_red.append(sample_len)
    
    return samples_red, samples_len_red  

Executing all the aforementioned procedures leads to the following results.

In [47]:
dialogs = parser()

print("Number of dialogs in the whole dataset: {}".format(len(dialogs)))

pairs = extract_pairs(dialogs) # list of dictionaries of input history and bot's reply

# keep only portion of the chat history to reduce seq_length
pairs, pairs_len = adjust_history(pairs, max_history=max_history) 

print("Number of pairs created: {}".format(len(pairs)))
print("Maximum seq_length observed: {}".format(max(pairs_len)))

pairs_reduced, pairs_len_reduced = filter_samples(pairs, pairs_len) 
print("Number of pairs retained: {}".format(len(pairs_reduced)))

Cache file found loading content.
Number of dialogs in the whole dataset: 37884
Cache file found loading content.
Number of pairs created: 193985
Maximum seq_length observed: 684
90% of the samples have sequence length less than 79.0
Number of pairs retained: 175289


We can see that almost 200,000 samples were created, with a maximum sequence length of 684 tokens. This is a serious problem for training, because every batch that enters the model is a tensor with fixed dimensions, so its shape is dictated by the longest sample in the batch (shorter samples are padded). As a result, training time increases dramatically.

Using **_filter_samples_** to cut off the longest 10% of the samples reduces the maximum sequence length to 79, which is something we can handle in combination with the other techniques used afterwards.
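
The effect of a single very long sample can be seen on made-up numbers. The sketch below is dependency-free: `percentile` is a hypothetical helper that applies the same linear-interpolation rule as the default of `np.percentile`, and the lengths are illustrative, not taken from the dataset.

```python
def percentile(values, q):
    """Percentile with linear interpolation between the two nearest ranks (q in [0, 100])."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (position - low)


lengths = [10, 12, 15, 18, 20, 22, 25, 30, 40, 684]   # one outlier, as in the real data
cutoff = percentile(lengths, 90)
kept = [n for n in lengths if n <= cutoff]

# token slots needed if all samples are padded to the longest one
slots_before = max(lengths) * len(lengths)
slots_after = max(kept) * len(kept)
print(len(kept), slots_before, slots_after)  # 9 6840 360
```

Dropping one sample out of ten shrinks the padded tensor by more than an order of magnitude in this toy case, which is the same mechanism that makes the 90th-percentile filter worthwhile on the real data.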

### Shaping Data for training

For training we split our dataset into a training set and a validation set. The validation set is used to monitor the performance of the model after every training epoch.

In [18]:
pairs_train, pairs_eval = train_test_split(pairs_reduced, test_size=0.3, shuffle=True)      

print("Number of samples in train set: {} and in validation: {}".format(len(pairs_train), len(pairs_eval)))

Number of samples in train set: 122702 and in validation: 52587


The input to our model consists of 4 components. First, the history together with the reply produced so far is given to the model. Second, the type of each token (whether it came from the user or the bot) is provided. Third, the labels, namely the tokens of the reply, are provided separately so that the model returns the cross-entropy loss. The final segment is the attention mask, which marks where the real values end and prevents the attention mechanism from taking padding into consideration.

These components are built from the given arguments by the function **_create_model_inputs_**.

In [19]:
def create_model_inputs(history, reply, tokenizer, with_eos=True):
    """
    Function that creates the various parts of the model input from input/output pairs.
    """
    
    bos, eos, speaker1, speaker2 = SPECIAL_TOKENS_IDS[:-1]
    sequence = [[bos]] + history + [reply + ([eos] if with_eos else [])]
    seq_len = len(sequence) # sequence: list of lists
    sequence = [sequence[0]] + [[speaker2 if (seq_len-i) % 2 != 1 else speaker1] + s 
                                for i, s in enumerate(sequence[1:])]
    
    instance = {}
    instance["input_ids"] = list(chain(*sequence)) # words
    instance["token_type_ids"] = [speaker1] + [speaker2 if i % 2 else speaker1 
                                               for i, s in enumerate(sequence[1:]) for _ in s] # for each word
    instance["mask"] = [1] * len(instance["input_ids"])                                     # attention mask
    instance["lm_labels"] = ([-1] * sum(len(s) for s in sequence[:-1])) + [-1] + sequence[-1][1:]
    # TODO positional embeddings
    
    return instance
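
To make the layout of the segments concrete, the sketch below repeats the same construction with readable strings instead of token ids. `toy_model_inputs` and the constants are hypothetical names introduced for illustration; the attention mask is omitted because it is all ones before padding.

```python
from itertools import chain

BOS, EOS, SPK1, SPK2 = "<bos>", "<eos>", "<speaker1>", "<speaker2>"
IGNORE = -1  # label value the notebook uses for positions excluded from the loss


def toy_model_inputs(history, reply):
    """Same construction as create_model_inputs, on lists of strings."""
    sequence = [[BOS]] + history + [reply + [EOS]]
    n = len(sequence)
    # prefix every phrase with its speaker token; the last phrase (the reply) is the bot's
    sequence = [sequence[0]] + [
        [SPK2 if (n - i) % 2 != 1 else SPK1] + s for i, s in enumerate(sequence[1:])
    ]
    input_ids = list(chain(*sequence))
    token_type_ids = [SPK1] + [
        SPK2 if i % 2 else SPK1 for i, s in enumerate(sequence[1:]) for _ in s
    ]
    # only the reply tokens (after the speaker token) contribute to the loss
    lm_labels = [IGNORE] * sum(len(s) for s in sequence[:-1]) + [IGNORE] + sequence[-1][1:]
    return input_ids, token_type_ids, lm_labels


ids, types, labels = toy_model_inputs([["hi", "there"]], ["hello"])
print(ids)     # ['<bos>', '<speaker1>', 'hi', 'there', '<speaker2>', 'hello', '<eos>']
print(types)   # ['<speaker1>', '<speaker1>', '<speaker1>', '<speaker1>', '<speaker2>', '<speaker2>', '<speaker2>']
print(labels)  # [-1, -1, -1, -1, -1, 'hello', '<eos>']
```

All three lists have the same length, so they can be padded and stacked position by position.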

In the next blocks we use the Dataset and DataLoader classes to create batches from our data. The DialogDataset class wraps the data; it creates the input segments from the samples and sorts the samples to optimize padding.

In [20]:
class DialogDataset(Dataset):
    """
    Class that wraps data. It takes care of creating input segments from samples as well as
    it sorts samples to optimize padding. Used by the dataloader to sample entries.
    """

    def __init__(self, dialog_pairs):
        self.dataset = self.create_segments(dialog_pairs) # create the segments of the input
        self.dataset = self.sort_on_seq_length() # sorting

    def __len__(self):
        return len(self.dataset)
    
    def create_segments(self, dialog_pairs):
        """
        Creates input segments for each sample
        """
        dataset = []
        for pair in dialog_pairs:
            instance = create_model_inputs(pair['input'], pair['output'], tokenizer)
            dataset.append(instance)
        return dataset
    
    def sort_on_seq_length(self): # could be optimized to use bucket sorting as the number of sample is big
        """
        Sorts dataset based on seq_len to minimize padding afterwards
        """
        return sorted(self.dataset, key=lambda x: len(x['input_ids']))
    

    def __getitem__(self, index):
        return  self.dataset[index]

The DataLoader fetches samples from the specified Dataset object and collates them to form batches. We want to avoid padding the dataset at the global level, because then every batch would have the maximum sequence length. Instead, we sort all samples inside the dataset class by length and disable random sampling in the DataLoader so that this order is preserved.

In this way samples of similar size end up in the same batch, and padding is performed inside **_custom_collate_fn_** at batch level.

In [21]:
def pad_sequenses(batch, pad_token=0):
    """
    Pads every input segment of the batch to the length of the longest sample. The padding value differs per input segment.
    """
    max_seq_len = max(len(entry["input_ids"]) for entry in batch)
    
    for entry in batch:
        for index_name in entry.keys():
            
            if index_name == "lm_labels":
                pad_token_ = -1
            elif index_name == "mask":
                pad_token_ = 0
            else:
                pad_token_ = pad_token
                
            entry[index_name] =  entry[index_name] + [pad_token_] * (max_seq_len - len(entry[index_name]))
  
    return batch  
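
The gain from sorting before batching can be estimated without any tensors. In this sketch `padded_slots` is a hypothetical helper and the sequence lengths are made up; it counts how many token slots are allocated when every batch is padded to its own longest sample.

```python
def padded_slots(lengths, batch_size):
    """Total token slots when each batch is padded to its longest sample."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += max(batch) * len(batch)
    return total


lengths = [5, 70, 8, 64, 6, 79, 7, 60]   # illustrative sequence lengths

global_cost = max(lengths) * len(lengths)               # pad everything to the global maximum
unsorted_cost = padded_slots(lengths, batch_size=2)     # batch-level padding, original order
sorted_cost = padded_slots(sorted(lengths), batch_size=2)  # batch-level padding, sorted by length

print(sum(lengths), global_cost, unsorted_cost, sorted_cost)  # 299 632 546 314
```

With 299 real tokens, sorted batches waste only 15 slots here, while padding to the global maximum more than doubles the tensor size. The price is that batches are no longer shuffled between epochs.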

In [36]:
def custom_collate_fn(batch):
    """
    Function that receives a list of samples and stacks them together to form tensors.
    Padding is applied here, at batch level.
    
    :param batch: a list of samples
    :return : input segments as tensors
    """
    batch = pad_sequenses(batch, SPECIAL_TOKENS_IDS[-1])
    
    # one LongTensor of shape (batch_size, max_seq_len) per input segment
    inputs = [torch.tensor([entry[index_name] for entry in batch], dtype=torch.long)
              for index_name in batch[0].keys()]

    if use_cuda and torch.cuda.is_available():
        inputs = [input_tensor.cuda() for input_tensor in inputs]   

    return inputs  

The last part is the creation of the DialogDataset objects and the call to the DataLoader.

In [52]:
training_set = DialogDataset(pairs_train) 
validation_set = DialogDataset(pairs_eval)

In [50]:
TRAIN_BATCH_SIZE = 32 
EVAL_BATCH_SIZE = 64 

dataloader_train = DataLoader(training_set, batch_size=TRAIN_BATCH_SIZE, shuffle=False, 
                              collate_fn=custom_collate_fn, num_workers=0) 

dataloader_valid = DataLoader(validation_set, batch_size=EVAL_BATCH_SIZE, shuffle=False,
                              collate_fn=custom_collate_fn, num_workers=0)

The batch sizes were chosen carefully, as the model together with the batch of the largest sequence length must fit in memory.

### Training procedure



In [25]:
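def train(model, dataloader, optimizer, print_period=400):
    """
    Runs one training epoch and returns the sum of the batch losses.
    print_period is currently unused.
    """
    epoch_loss = 0.0
    model.train()
    
    # the order of the tensors follows the keys created in create_model_inputs
    for i_batch, (input_ids, category_ids, attention_mask, label_ids) in enumerate(dataloader):
        
        # keyword arguments, so that the attention mask and the token types cannot be swapped
        outputs = model(input_ids=input_ids, attention_mask=attention_mask,
                        token_type_ids=category_ids, labels=label_ids)
        loss = outputs[0]
        optimizer.zero_grad() 
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()  
        
    return epoch_loss    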

In [26]:
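def evaluate(model, dataloader):
    """
    Computes the sum of the batch losses on the given dataloader without updating the model.
    """
    epoch_loss = 0.0
    model.eval()
    with torch.no_grad():
        
        # the order of the tensors follows the keys created in create_model_inputs
        for i_batch, (input_ids, category_ids, attention_mask, label_ids) in enumerate(dataloader):
            
            outputs = model(input_ids=input_ids, attention_mask=attention_mask,
                            token_type_ids=category_ids, labels=label_ids)
            epoch_loss += outputs[0].item()
            
    return epoch_loss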

In [27]:
def checkpoint_state(model, tokenizer, output_dir=None):
    """
    Function that saves the model's weights and configuration as well as the tokenizer's vocabulary.
    
    :param model: to checkpoint
    :param tokenizer: to checkpoint
    :param output_dir: directory where checkpointed files will be created
    :return : None
    """
    
    if output_dir is None:
        output_dir = 'metalwoz-v1/checkpoint_{}'.format(datetime.now().strftime(TIME_FORMAT))
    
    os.makedirs(output_dir, exist_ok=True) # no error if the directory already exists
    
    output_model_file = os.path.join(output_dir, WEIGHTS_NAME)
    output_config_file = os.path.join(output_dir, CONFIG_NAME)  
    
    torch.save(model.state_dict(), output_model_file) # checkpoint weights
    model.config.to_json_file(output_config_file)     # configuration
    tokenizer.save_vocabulary(output_dir)             # vocabulary

In [28]:
class EarlyStoppingException(Exception):
    def __init__(self, message):

        super().__init__(message)

In [41]:
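def early_stopping(min_loss, cur_patience, max_patience, epoch_eval_loss):
    """
    Updates the early-stopping state after an epoch. A new minimum of the validation loss
    triggers a checkpoint of the global model and tokenizer and resets the patience counter.
    Otherwise the counter grows, and EarlyStoppingException is raised when it reaches max_patience.
    """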
    if (epoch_eval_loss >= min_loss):  
        cur_patience += 1
        if (cur_patience >= max_patience):
            raise EarlyStoppingException("Execution terminated due to Early Stopping")
    else:
        print("New min validation loss}")
        checkpoint_state(model, tokenizer) # checkpointing
        print("New checkpoint created")
        min_loss, cur_patience = epoch_eval_loss, 0    
        
    return min_loss, cur_patience    

In [42]:
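def train_procedure(epochs, lr, use_cuda, min_loss, max_patience, cur_patience):
    """
    Trains the global model for at most the given number of epochs, reporting the mean
    train and validation loss per batch after each epoch and applying early stopping.
    """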
    if use_cuda and torch.cuda.is_available():
        model.cuda()

    optimizer = optim.Adam(model.parameters(), lr=lr) # , weight_decay=0.001 # TODO review those values
    for epoch in range(epochs):

        train_loss = train(model, dataloader_train, optimizer) / len(dataloader_train)
        print("Loss on train set: \t\t epoch {} : {:.4f}".format(epoch, train_loss))

        eval_loss = evaluate(model, dataloader_valid) / len(dataloader_valid)     
        print("Loss on validation set: \t epoch {} : {:.4f}".format(epoch, eval_loss))    

        try:
            min_loss, cur_patience = early_stopping(min_loss, cur_patience, max_patience, eval_loss) 
        except EarlyStoppingException as e:
            print("{} at epoch: {}".format(e, epoch))
            break    

In [51]:
epochs = 10
lr = 6.25e-5
min_loss, max_patience, cur_patience = np.inf, 3, 0

train_procedure(epochs, lr, use_cuda, min_loss, max_patience, cur_patience)

Loss on train set: 		 epoch 0 : 6.6461
Loss on validation set: 	 epoch 0 : 6.3160
New min validation loss}
New checkpoint created
Loss on train set: 		 epoch 1 : 6.1977
Loss on validation set: 	 epoch 1 : 5.8961
New min validation loss}
New checkpoint created
Loss on train set: 		 epoch 2 : 5.9382
Loss on validation set: 	 epoch 2 : 5.8530
New min validation loss}
New checkpoint created
Loss on train set: 		 epoch 3 : 5.8087
Loss on validation set: 	 epoch 3 : 5.8559
Loss on train set: 		 epoch 4 : 5.7147
Loss on validation set: 	 epoch 4 : 5.8270
New min validation loss}
New checkpoint created
Loss on train set: 		 epoch 5 : 5.6415
Loss on validation set: 	 epoch 5 : 5.8311
Loss on train set: 		 epoch 6 : 5.5931
Loss on validation set: 	 epoch 6 : 5.8536
Loss on train set: 		 epoch 7 : 5.5158
Loss on validation set: 	 epoch 7 : 5.8668
Execution terminated due to Early Stopping at epoch: 7
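
The stopping point in the log above follows directly from the patience rule. The sketch below replays the recorded validation losses through the same comparison as **_early_stopping_**; `replay_early_stopping` is a hypothetical helper that performs no checkpointing.

```python
def replay_early_stopping(val_losses, max_patience):
    """Return (stop_epoch, best_loss, best_epoch); stop_epoch is None if training never stops early."""
    best_loss, best_epoch, patience = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses):
        if loss >= best_loss:           # no improvement: spend one unit of patience
            patience += 1
            if patience >= max_patience:
                return epoch, best_loss, best_epoch
        else:                           # new minimum: remember it and reset the counter
            best_loss, best_epoch, patience = loss, epoch, 0
    return None, best_loss, best_epoch


# validation losses copied from the training log above
val_losses = [6.3160, 5.8961, 5.8530, 5.8559, 5.8270, 5.8311, 5.8536, 5.8668]
stop_epoch, best_loss, best_epoch = replay_early_stopping(val_losses, max_patience=3)
print(stop_epoch, best_loss, best_epoch)  # 7 5.827 4
```

The last checkpoint therefore corresponds to epoch 4, and the three following epochs without improvement exhaust the patience of 3.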


### Interaction with the bot - Inference

In [361]:
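def format_input(history, reply_so_far):
    """
    Encodes the plain-text history and combines it with the reply generated so far
    into the input tensors expected by the model (batch of size 1).
    """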
    history = [tokenizer.encode(phrase) for phrase in history]
    
    instance = create_model_inputs(history, reply_so_far, tokenizer, with_eos=False)
    
    input_ids = torch.tensor(instance["input_ids"]).unsqueeze(0)
    token_type_ids = torch.tensor(instance["token_type_ids"]).unsqueeze(0)
    
    return input_ids, token_type_ids

In [360]:
def decoding(probs, logits, method="top_p"):
    """
    Function that selects the next token to be emitted. Three different approaches are implemented: 
    
    Greedy: the most probable token is selected.
    Top-k : a token is sampled among the k most probable tokens.
    Top-p : a token is sampled among the smallest set of most probable tokens
            whose cumulative probability exceeds p (nucleus sampling).
    
    :param probs: probabilities of the next token over the vocabulary (modified in place)
    :param logits: the corresponding logits (currently unused)
    :param method: the decoding method to be used, Values={'greedy', 'top_k', 'top_p'}
    :return: the selected token
    """
    top_k = 40 # sample from the 40 most probable tokens based on their probs
    top_p = 0.9 # sample from the most probable tokens whose cumulative probability exceeds 0.9
    
    if method == "greedy":
        return torch.argmax(probs).item()
    
    elif method == "top_k":        
        prob_k = probs.topk(top_k)[0][-1].item() # value of the k-th most probable
        probs[probs < prob_k] = 0   # cut off the tail  
        
    elif method == "top_p":
        probs_sorted, probs_indexes = probs.sort(dim=-1, descending=True) # start the cumulation from the most probable token in descending order
        cum_probs = probs_sorted.cumsum(dim=-1)
        
        indices = cum_probs > top_p 
        indices[1:] = indices[:-1].clone()
        indices[0] = 0 # at least one token is preserved 
        
        probs[probs_indexes[indices]] = 0
    
    word = torch.multinomial(probs, 1).item()
    # TODO handle the case that special token was emitted in the first pick
    
    return word
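
The two filtering rules can be followed step by step on a five-token distribution. This is a dependency-free sketch of the same idea using plain lists; `top_k_filter` and `top_p_filter` are hypothetical helpers, and `random.Random.choices` plays the role of `torch.multinomial`, which also accepts unnormalized weights.

```python
import random


def top_k_filter(probs, k):
    """Zero out every probability below the k-th largest one."""
    threshold = sorted(probs, reverse=True)[k - 1]
    return [p if p >= threshold else 0.0 for p in probs]


def top_p_filter(probs, p):
    """Keep the most probable tokens until their cumulative probability exceeds p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [0.0] * len(probs), 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative > p:      # the token that crosses the threshold is still kept
            break
    return kept


probs = [0.5, 0.3, 0.1, 0.06, 0.04]
print(top_k_filter(probs, 2))     # [0.5, 0.3, 0.0, 0.0, 0.0]
print(top_p_filter(probs, 0.85))  # [0.5, 0.3, 0.1, 0.0, 0.0]

# sample a token index from the surviving (unnormalized) weights
rng = random.Random(0)
token = rng.choices(range(len(probs)), weights=top_p_filter(probs, 0.85))[0]
```

Top-k always keeps a fixed number of candidates, while top-p adapts the number of candidates to how peaked the distribution is.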

In [359]:
def infer_answer(history, model, tokenizer, method="top_p"):
    """
    Function that generates word by word the bot answer, based on user input and previous history.
    
    :param history: a list of past sentences and last user's input, in plain text
    :param model: the model to be used for inference
    :return: a list with the words of the answer in plain text 
    """
    model.eval()
    reply_so_far = []
    with torch.no_grad():
    
        for i in range(max_sentence_length):
            
            input_ids, category_ids = format_input(history, reply_so_far)
            outputs = model(input_ids=input_ids, token_type_ids=category_ids)
            logits = outputs[0]
            logits = logits[0, -1, :] / temperature # keep last 
            probs = F.softmax(logits, dim=-1) 
            word = decoding(probs, logits, method=method) 
            
            if word in SPECIAL_TOKENS_IDS: # we stop inference if we find a special token without emitting this token
                print("Bot terminate sentence!")
                break
            reply_so_far.append(word)
            
        answer_text = tokenizer.decode(reply_so_far, skip_special_tokens=True)    
        return answer_text
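
Dividing the logits by a temperature below 1 before the softmax, as done above with `temperature = 0.75`, concentrates probability on the most likely tokens. The sketch below shows this on made-up logits with a plain-Python `softmax`, a hypothetical helper standing in for `F.softmax`.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                        # subtracting the maximum avoids overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]                      # illustrative values
plain = softmax(logits)                       # temperature 1
sharp = softmax(logits, temperature=0.75)     # the value used in this notebook

print([round(p, 3) for p in plain])
print([round(p, 3) for p in sharp])
```

The most probable token gains probability mass and the least probable one loses it, so sampling becomes more conservative without becoming fully greedy.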

In [358]:
def interact_with_bot(model, tokenizer, method='top_p'):
    """
    Runs an interactive chat loop in the console until the user types the quit command.
    """
    bot_prompt = "bot:>>> "
    user_prompt = "user:>>> "

    history = []
    print(bot_prompt + "Hello how may I help you?")
    user_input = input(user_prompt)
    
    while user_input != "\\q": # TODO check if we need to truncate user input to not exceed max_length
        
        history.append(user_input)
        answer = infer_answer(history, model, tokenizer, method=method)
        history.append(answer)
        
        # keep max_history question/answer pairs, so that together with the next user input
        # the history has at most 2*max_history+1 phrases and starts with a user phrase, as in training
        history = history[-2*max_history:] if max_history > 0 else []
        
        print(bot_prompt + answer)
        
        user_input = input(user_prompt) 

In [344]:
checkpoint_dir = "metalwoz-v1/checkpoint_20191114_1250"

model_loaded ,tokenizer_loaded  = load_checkpoint(checkpoint_dir)

add_special_tokens_(model_loaded, tokenizer_loaded, SPECIAL_TOKENS_DICT)

ftfy or spacy is not installed using BERT BasicTokenizer instead of SpaCy & ftfy.


C:\Users\nikmand\nikmand\ncsr-chatbot\metalwoz-v1\checkpoint_20191114_1250


In [363]:
interact_with_bot(model_loaded, tokenizer_loaded, method='top_p')

bot:>>> Hello how may I help you?
user:>>> i need toothpaste
Bot terminate sentence!
bot:>>> 
user:>>> hey how are you?
Bot terminate sentence!
bot:>>> yes
user:>>> tell me something.
Bot terminate sentence!
bot:>>> yes
user:>>> \q

