# Hybrid Code Network Pharma Bot

The implementation is based on the following paper, which used the bAbI restaurant recommender dataset from Facebook (https://research.fb.com/downloads/babi/):

Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning  
Jason D. Williams, Kavosh Asadi, Geoffrey Zweig  
https://arxiv.org/abs/1702.03274

<img src="https://user-images.githubusercontent.com/166852/33999718-389cdb26-e0b9-11e7-8708-140da0803a5b.png" >

The code is forked from https://github.com/jojonki/Hybrid-Code-Networks and has been adapted to use a new dataset. The dataset construction follows a process similar to the one used for bAbI task 5. <br>
• One of seven possible user questions is chosen at random for each dialogue, and each question has four possible phrasings<br>
• The model outputs an action; some actions are masked until all the required data has been gathered<br>
• Once the bot has asked all the additional questions, the user may change their mind about some of the answers

For the moment, the data is generated randomly and then stored in a SQL database, to allow scaling to a much larger dataset. Training and test sets of 1000 dialogues were generated using the same code (the test set could be made harder by holding out particular phrasings of questions).
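
The generation code itself is not shown in this README. As a rough sketch only, random phrasing selection and SQL storage could look like the following. The phrasings, the `turns` table and its columns, and the `generate_opening` helper are all hypothetical and chosen purely for illustration.

```python
import random
import sqlite3

# Hypothetical phrasings for a single question type (the real templates are not shown here).
PHRASINGS = [
    "what do I do with {drug}?",
    "how should I take {drug}?",
    "can you give me instructions for {drug}?",
    "how is {drug} used?",
]


def generate_opening(rng):
    # pick a random drug name and a random phrasing of the question
    drug = "drug{}".format(rng.randint(1, 100))
    return rng.choice(PHRASINGS).format(drug=drug)


rng = random.Random(0)  # seeded so the generated set is reproducible
conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("CREATE TABLE turns (dialogue_id INTEGER, turn_idx INTEGER, user_uttr TEXT)")
for dialogue_id in range(5):
    conn.execute("INSERT INTO turns VALUES (?, ?, ?)", (dialogue_id, 0, generate_opening(rng)))
conn.commit()

rows = conn.execute("SELECT dialogue_id, user_uttr FROM turns ORDER BY dialogue_id").fetchall()
for row in rows:
    print(row)
```

Holding out one of the phrasings when generating the training set would be one way to build the harder test set mentioned above.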

**The dialog is stored as follows:**<br>
good morning<\t>hello, what can I help you with today?<\n><br>
what do I do with drug56?<\t>what version of drug56 is that, for example: oral or injectable?<\n><br>
intravenous<\t>ok, let me look in to that<\n><br>
actually i meant subcutaneous<\t>ok, is there anything else to update?<\n><br>
actually i meant drug41<\t>ok, is there anything else to update?<\n><br>
no<\t>answer q3: instructions for drug 41_subcutaneous<\n><br>
thank you<\t>you're welcome<\n><\n>
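
As an illustration of this format, the sketch below splits raw text into dialogues and (user, system) turns. The `parse_dialogues` name is an assumption made for this example; the actual loading is done by `preload` and `load_data` further down.

```python
def parse_dialogues(text):
    """Split raw dialogue text into a list of dialogues, each a list of (user, system) turns."""
    dialogues, turns = [], []
    for line in text.split('\n'):
        line = line.rstrip()
        if line == '':
            # a blank line closes the current dialogue
            if turns:
                dialogues.append(turns)
                turns = []
            continue
        parts = line.split('\t')
        if len(parts) == 2:  # keep only lines with both a user and a system utterance
            turns.append((parts[0], parts[1]))
    if turns:  # the text may not end with a blank line
        dialogues.append(turns)
    return dialogues


sample = (
    "good morning\thello, what can I help you with today?\n"
    "thank you\tyou're welcome\n"
    "\n"
)
dialogues = parse_dialogues(sample)
print(len(dialogues), dialogues[0][0])
```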

The first step in the code is to read a set of entities and store them in an Ordered Dictionary structure (get_entities function in utils).

In [10]:
from collections import OrderedDict
import re

def get_entities(fpath):
    # outputs list of entities for each type (dictionary key)
    entities = OrderedDict({'age_groups': [], 'conditions': [], 'symptoms': [], 'severities': [], 'delivery_methods': [],
                            'periods': [], 'strengths': [], 'units': [], 'drugs': []})
    with open(fpath, 'r') as file:
        # e.g. conditions<\t>heart problems
        lines = file.readlines()
        for l in lines:
            l = re.sub(r'\n', '', l)
            wds = l.split('\t')
            slot_type = wds[0] # e.g. conditions
            slot_val = wds[1] # e.g. heart problems
            # if slot_type not in entities:
            #     entities[slot_type] = []
            if slot_type in entities:
                if slot_val not in entities[slot_type]:
                    entities[slot_type].append(slot_val)
    return entities

entities = get_entities('entities.txt')


Next, the dataset is read to extract the vocabulary and the set of system actions, both stored as lists.

In [11]:
def reduce_actions(ls, system_acts):
    sys_act = ls[1]
    sys_act = re.sub(r'drug[0-9]+', '<drug>', sys_act)
    if sys_act.startswith('answer'): sys_act = sys_act[:9]
    if sys_act.startswith('these are some'): sys_act = 'question symptoms'
    if sys_act.startswith('<drug> may not be'): sys_act = 'question conditions'
    if sys_act not in system_acts: system_acts.append(sys_act)
    return system_acts, sys_act


def preload(fpath, vocab, system_acts):
    # goes through dialog and builds vocab from user utterances and also system actions
    with open(fpath, 'r') as f:
        lines = f.readlines()
        # e.g. do you have something else<\t>sure let me find an other option for you
        for idx, l in enumerate(lines):
            l = l.rstrip()
            if l != '':
                ls = l.split("\t")
                uttr = ls[0].split(' ')
                if len(ls) == 2: # includes user and system utterance
                    for w in uttr:
                        if w not in vocab:
                            vocab.append(w)
                    system_acts, _ = reduce_actions(ls, system_acts)
    vocab = sorted(vocab)
    system_acts = sorted(system_acts)
    return vocab, system_acts

fpath_train = 'dialogues_train.txt'
fpath_test = 'dialogues_test.txt'
SILENT = '<SILENT>'
UNK = '<UNK>'
system_acts = [SILENT]

vocab = []
vocab, system_acts = preload(fpath_train, vocab, system_acts) 


Now an index is created for the vocabulary (the action index is built later, when the data is loaded). UNK is added at index 0 for any unknown word.

In [14]:
w2i = dict((w, i) for i, w in enumerate(vocab, 1))
i2w = dict((i, w) for i, w in enumerate(vocab, 1))
w2i[UNK] = 0
i2w[0] = UNK
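
A small usage sketch of this indexing, on a toy vocabulary rather than the real one. The `encode` helper is hypothetical and only shows the UNK fallback.

```python
UNK = '<UNK>'
toy_vocab = sorted(['drug56', 'good', 'morning', 'what'])  # toy vocabulary for illustration

w2i = {w: i for i, w in enumerate(toy_vocab, 1)}  # real words start at index 1
w2i[UNK] = 0                                       # index 0 is reserved for unknown words
i2w = {i: w for w, i in w2i.items()}


def encode(sentence, w2i):
    # map each whitespace-separated token to its index, falling back to UNK
    return [w2i.get(w, w2i[UNK]) for w in sentence.split(' ')]


ids = encode('good morning doctor', w2i)
print(ids)                     # [2, 3, 0]
print([i2w[i] for i in ids])   # ['good', 'morning', '<UNK>']
```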

Next the data is processed and organised into a list of environmental states which will be fed into the RNN. At each step of the sequence a state is input and the RNN outputs an action. 

Each state is composed of six vectors: x: user utterance, y: system action, c: context, b: Bag of Words, p: previous system action, f: action filter.

At this stage, the user utterances and system actions are stored as text. The context is an indicator vector to record which of the entities have been detected from the user.

In [15]:
def update_context(context, sentence, entities):
    # indicator vector for all entities found in sentence
    for idx, (ent_key, ent_vals) in enumerate(entities.items()):
        for w in sentence:
            if w in ent_vals:
                context[idx] = 1
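
The following usage sketch repeats `update_context` so that it runs on its own, using a toy entity dictionary (an assumption for illustration). It shows that flags accumulate across turns and are never cleared, and that matching is exact per whitespace-separated token, so a token such as `drug56?` with trailing punctuation does not match `drug56`.

```python
from collections import OrderedDict


def update_context(context, sentence, entities):
    # same logic as above: set context[idx] to 1 when any token is a known value of entity type idx
    for idx, (ent_key, ent_vals) in enumerate(entities.items()):
        for w in sentence:
            if w in ent_vals:
                context[idx] = 1


# toy entities; the real ones are read from entities.txt
toy_entities = OrderedDict([
    ('delivery_methods', ['oral', 'subcutaneous']),
    ('drugs', ['drug41', 'drug56']),
])

context = [0] * len(toy_entities)
update_context(context, 'actually i meant subcutaneous'.split(' '), toy_entities)
after_first = list(context)    # [1, 0]: delivery method seen, drug not yet
update_context(context, 'actually i meant drug41'.split(' '), toy_entities)
after_second = list(context)   # [1, 1]: the earlier flag is kept

punct = [0] * len(toy_entities)
update_context(punct, 'what do I do with drug56?'.split(' '), toy_entities)
print(after_first, after_second, punct)  # punct stays [0, 0] because of the '?'
```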

A bag-of-words vector records how many times each vocabulary word occurs in the utterance, using the index defined above in w2i. Words that are not in w2i are skipped.

In [16]:
def get_bow(sentence, w2i):
    bow = [0] * len(w2i)
    for word in sentence:
        if word in w2i:
            bow[w2i[word]] += 1
    return bow
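
A short usage sketch, with the function repeated so the example is self-contained and a toy `w2i` assumed for illustration. Repeated words are counted, and out-of-vocabulary words are ignored rather than added to the UNK slot.

```python
def get_bow(sentence, w2i):
    bow = [0] * len(w2i)
    for word in sentence:
        if word in w2i:
            bow[w2i[word]] += 1
    return bow


toy_w2i = {'<UNK>': 0, 'no': 1, 'thank': 2, 'you': 3}  # toy index for illustration
bow = get_bow('no no thank you please'.split(' '), toy_w2i)
print(bow)  # [0, 2, 1, 1]: 'no' counted twice, 'please' skipped
```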

The action filter prevents certain actions from occurring until (or after) the required information has been provided by the user.

In [17]:
def generate_act_filter(action_size, context):
    mask = [0] * action_size
    ''' context: {'0age_groups': [], '1conditions': [], '2symptoms': [], '3severities': [], '4delivery_methods': [], 
        '5periods': [], '6strengths': [], '7units': [], '8drugs': []}
    '''
    # standard small talk
    mask[0] = 1
    mask[1] = 1
    mask[9] = 1
    mask[10] = 1
    mask[11] = 1
    mask[16] = 1

    # clarification questions if entities not found
    if context[1] == 0:
        mask[12] = 1
    if context[2] == 0:
        mask[13] = 1
    if context[6] == 0:
        mask[14] = 1
    if context[4] == 0:
        mask[15] = 1

    # answers
    if context[8] == 1:
        mask[2] = 1
        if context[1] == 1:
            mask[3] = 1
        if context[2] == 1:
            mask[6] = 1
        if context[4] == 1:
            mask[4] = 1
            mask[7] = 1
            mask[8] = 1
            if context[0] == 1 and context[6] == 1:
                mask[5] = 1
    return mask
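
To make the effect of the mask concrete, the sketch below applies a 0/1 mask to a toy probability vector in plain Python. The `apply_action_mask` helper and the optional renormalisation step are assumptions for illustration; the model code in this README multiplies the softmax output by the mask without renormalising.

```python
def apply_action_mask(probs, mask, renormalise=True):
    # zero out the probability of every action whose mask entry is 0
    masked = [p * m for p, m in zip(probs, mask)]
    total = sum(masked)
    if renormalise and total > 0:
        # rescale the remaining probabilities so they sum to 1 again
        masked = [p / total for p in masked]
    return masked


probs = [0.5, 0.25, 0.25]  # toy action probabilities
mask = [1, 0, 1]           # the second action is not yet allowed

masked = apply_action_mask(probs, mask)
raw = apply_action_mask(probs, mask, renormalise=False)
print(masked)
print(raw)
```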

**Now to bring that all together (and also creating an index for the actions):**

In [22]:
import copy

    
def load_data(fpath, entities, w2i, system_acts):
    # inputs from get_entities and preload
    data = []
    with open(fpath, 'r') as f:
        # e.g. do you have something else<\t>sure let me find an other option for you
        lines = f.readlines()
        # x: user uttr, y: sys act, c: context, b: BoW, p: previous sys act, f: action filter
        x, y, c, b, p, f = [], [], [], [], [], []
        context = [0] * len(entities.keys())
        for idx, l in enumerate(lines):
            l = l.rstrip()
            if l == '':
                data.append((x, y, c, b, p, f))
                # reset
                x, y, c, b, p, f = [], [], [], [], [], []
                context = [0] * len(entities.keys())
            else:
                ls = l.split("\t")
                uttr = ls[0].split(' ')
                update_context(context, uttr, entities)
                act_filter = generate_act_filter(len(system_acts), context)
                bow = get_bow(uttr, w2i)
                sys_act = SILENT
                if len(ls) == 2: # includes user and system utterance
                    system_acts, sys_act = reduce_actions(ls, system_acts)
                else:
                    continue

                x.append(uttr)
                if len(y) == 0:
                    p.append(SILENT)
                else:
                    p.append(y[-1])
                y.append(sys_act)
                c.append(copy.deepcopy(context))
                b.append(bow)
                f.append(act_filter)
    return data, system_acts

train_data, system_acts = load_data(fpath_train, entities, w2i, system_acts)
test_data, system_acts = load_data(fpath_test, entities, w2i, system_acts)
act2i = dict((act, i) for i, act in enumerate(system_acts))
print('vocab size:', len(vocab))
print('action size:', len(system_acts))

vocab size: 322
action size: 17
action_size: 17


Word embeddings for the words in the vocabulary are loaded from a word2vec model trained on the Google News corpus (https://code.google.com/archive/p/word2vec/). To avoid repeating this on every run, the embeddings are pickled for future use.

In [36]:
from gensim.models.keyedvectors import KeyedVectors
from utils import save_pickle, load_pickle, load_embd_weights, load_data, to_var, add_padding

# get and save embeddings for words in vocab
# print('loading a word2vec binary...')
# model_path = '/Users/graeme/GoogleNews-vectors-negative300.bin'
# word2vec = KeyedVectors.load_word2vec_format('/Users/graeme/GoogleNews-vectors-negative300.bin', binary=True)
# print('done')
# pre_embd_w = load_embd_weights(word2vec, len(w2i), 300, w2i)
# save_pickle(pre_embd_w, 'pre_embd_w.pickle')
pre_embd_w = load_pickle('pre_embd_w.pickle')

load pre_embd_w.pickle
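
The `save_pickle` and `load_pickle` helpers come from `utils` and are not shown here. A minimal sketch of the same caching idea with the standard `pickle` module might look like this; `load_or_compute` and the toy `expensive` function are hypothetical names used only for this example.

```python
import os
import pickle
import tempfile


def load_or_compute(path, compute):
    # reuse the cached value if the pickle file exists, otherwise compute and cache it
    if os.path.exists(path):
        with open(path, 'rb') as f:
            return pickle.load(f)
    value = compute()
    with open(path, 'wb') as f:
        pickle.dump(value, f)
    return value


calls = []


def expensive():
    # stand-in for the slow word2vec lookup; returns a toy 2 x 3 weight matrix
    calls.append(1)
    return [[0.0] * 3 for _ in range(2)]


path = os.path.join(tempfile.mkdtemp(), 'toy_embd.pickle')
first = load_or_compute(path, expensive)   # computes and writes the file
second = load_or_compute(path, expensive)  # reads the file, no recomputation
print(len(calls))
```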


**Now the model is defined using torch.**

In [27]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import random
import argparse


class WordEmbedding(nn.Module):
    '''
    In : (N, sentence_len)
    Out: (N, sentence_len, embd_size)
    '''
    def __init__(self, vocab_size, embd_size, pre_embd_w=None, is_train_embd=False):
        super(WordEmbedding, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embd_size)
        if pre_embd_w is not None:
            print('pre embedding weight is set')
            self.embedding.weight = nn.Parameter(pre_embd_w, requires_grad=is_train_embd)

    def forward(self, x):
        return self.embedding(x)


class HybridCodeNetwork(nn.Module):
    def __init__(self, vocab_size, embd_size, hidden_size, action_size, context_size, pre_embd_w=None):
        super(HybridCodeNetwork, self).__init__()
        self.embd_size = embd_size
        self.hidden_size = hidden_size
        self.embedding = WordEmbedding(vocab_size, embd_size, pre_embd_w)
        lstm_in_dim = embd_size + vocab_size + action_size + context_size + 1 # + 1 (Unknown vocab tag)
        self.lstm = nn.LSTM(lstm_in_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, action_size)

    def forward(self, uttr, context, bow, prev, act_filter):
        # uttr       : (bs, dialog_len, sentence_len)
        # context    : (bs, dialog_len, context_dim)
        # bow        : (bs, dialog_len, vocab_size)
        # prev       : (bs, dialog_len, action_size)
        # act_filter : (bs, dialog_len, action_size)
        bs = uttr.size(0)
        dlg_len = uttr.size(1)
        sent_len = uttr.size(2)

        # .view() is used to reshape
        embd = self.embedding(uttr.view(bs, -1)) # (bs, dialog_len*sentence_len, embd)
        embd = embd.view(bs, dlg_len, sent_len, -1) # (bs, dialog_len, sentence_len, embd)
        embd = torch.mean(embd, 2) # (bs, dialog_len, embd)
        x = torch.cat((embd, context, bow, prev), 2) # (bs, dialog_len, embd+context_dim+bow_dim+action_size)
        x, (h, c) = self.lstm(x) # (bs, dialog_len, hid), ((1, bs, hid), (1, bs, hid))
        y = self.fc(F.tanh(x)) # (bs, dialog_len, action_size)
        y = F.softmax(y, -1) # (bs, dialog_len, action_size)
        y = y * act_filter
        return y

    
model = HybridCodeNetwork(vocab_size=len(vocab), embd_size=300, hidden_size=128, action_size=len(system_acts), 
                          context_size=len(entities.keys()), pre_embd_w=pre_embd_w)
if torch.cuda.is_available():
    model.cuda()
optimizer = torch.optim.Adadelta(filter(lambda p: p.requires_grad, model.parameters()))

pre embedding weight is set
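
As a worked check of the LSTM input size, the arithmetic below uses the sizes printed earlier (vocab size 322, action size 17), the nine entity types defined in `get_entities`, and the 300-dimensional embeddings. The bag-of-words vector has one extra slot for UNK, which is where the `+ 1` in `lstm_in_dim` comes from.

```python
embd_size = 300      # word2vec embedding size
vocab_size = 322     # len(vocab), as printed above
action_size = 17     # len(system_acts), as printed above
context_size = 9     # number of entity types in get_entities

bow_size = vocab_size + 1  # get_bow uses len(w2i), which includes the UNK slot

# width of torch.cat((embd, context, bow, prev), 2) along the last dimension
concat_dim = embd_size + context_size + bow_size + action_size

# the expression used in HybridCodeNetwork.__init__
lstm_in_dim = embd_size + vocab_size + action_size + context_size + 1

print(concat_dim, lstm_in_dim)  # 649 649
```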


**Next, a couple of helper functions are defined.**

In [31]:
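def padding(data, default_val, maxlen):
    # pads every dialog in the batch to maxlen turns (in place)
    for i, d in enumerate(data):
        pad_len = maxlen - len(d)
        for _ in range(pad_len):
            # match the width of the existing rows: context, BoW and filter rows differ in width
            data[i].append([default_val] * len(d[0]))
    return to_var(torch.FloatTensor(data))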


def categorical_cross_entropy(preds, labels):
    loss = Variable(torch.zeros(1))
    for p, label in zip(preds, labels):
        loss -= torch.log(p[label] + 1.e-7).cpu()
    loss /= preds.size(0)
    return loss
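
The same loss can be worked through in plain Python on toy values. This sketch (the `mean_nll` name is an assumption) shows the role of the `1.e-7` term: a correct label with probability 0, for example because it was masked out, contributes about 16.1 to the sum instead of infinity.

```python
import math


def mean_nll(preds, labels, eps=1e-7):
    # mean negative log-likelihood of the correct label, with eps guarding against log(0)
    total = 0.0
    for p, label in zip(preds, labels):
        total -= math.log(p[label] + eps)
    return total / len(preds)


preds = [[0.9, 0.1], [0.0, 1.0]]  # toy predictions for two turns
labels = [0, 0]                   # the second turn's correct action has probability 0

loss = mean_nll(preds, labels)
print(loss)
print(-math.log(1e-7))  # contribution of a zero-probability label, roughly 16.1
```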

Now, when the data for each batch is loaded, the user utterances and system actions are vectorised.

In [33]:
def make_word_vector(uttrs_list, w2i, dialog_maxlen, uttr_maxlen):
    # returns batch of lists of word indices (as defined in w2i)
    dialog_list = []
    for uttrs in uttrs_list:
        dialog = []
        for sentence in uttrs:
            sent_vec = [w2i[w] if w in w2i else w2i[UNK] for w in sentence]
            sent_vec = add_padding(sent_vec, uttr_maxlen)
            dialog.append(sent_vec)
        for _ in range(dialog_maxlen - len(dialog)):
            dialog.append([0] * uttr_maxlen)
        dialog = torch.LongTensor(dialog[:dialog_maxlen])
        dialog_list.append(dialog)
    return to_var(torch.stack(dialog_list, 0))
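
`add_padding` is imported from `utils` and is not shown in this README. A minimal sketch, assuming it right-pads a list of word indices with zeros up to a fixed length, might be:

```python
def add_padding(sent_vec, maxlen, pad_idx=0):
    # hypothetical stand-in for utils.add_padding: truncate or right-pad to exactly maxlen entries
    return sent_vec[:maxlen] + [pad_idx] * max(0, maxlen - len(sent_vec))


short = add_padding([4, 7], 5)
exact = add_padding([1, 2, 3], 3)
print(short)  # [4, 7, 0, 0, 0]
print(exact)  # [1, 2, 3]
```

Under this assumption the padding index 0 coincides with the UNK index in `w2i`, so padded positions and unknown words share the same embedding row.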


def get_data_from_batch(batch, w2i, act2i):
    # vectorises input data
    uttrs_list = [d[0] for d in batch]
    dialog_maxlen = max([len(uttrs) for uttrs in uttrs_list])
    uttr_maxlen = max([len(u) for uttrs in uttrs_list for u in uttrs])
    uttr_var = make_word_vector(uttrs_list, w2i, dialog_maxlen, uttr_maxlen)

    batch_labels = [d[1] for d in batch]
    labels_var = []
    for labels in batch_labels:
        vec_labels = [act2i[l] for l in labels]
        pad_len = dialog_maxlen - len(labels)
        for _ in range(pad_len):
            vec_labels.append(act2i[SILENT])
        labels_var.append(torch.LongTensor(vec_labels))
    labels_var = to_var(torch.stack(labels_var, 0))

    batch_prev_acts = [d[4] for d in batch]
    prev_var = []
    for prev_acts in batch_prev_acts:
        vec_prev_acts = []
        for act in prev_acts:
            tmp = [0] * len(act2i)
            tmp[act2i[act]] = 1
            vec_prev_acts.append(tmp)
        pad_len = dialog_maxlen - len(prev_acts)
        for _ in range(pad_len):
            vec_prev_acts.append([0] * len(act2i))
        prev_var.append(torch.FloatTensor(vec_prev_acts))
    prev_var = to_var(torch.stack(prev_var, 0))

    context = copy.deepcopy([d[2] for d in batch])
    context = padding(context, 1, dialog_maxlen)

    bow = copy.deepcopy([d[3] for d in batch])
    bow = padding(bow, 0, dialog_maxlen)

    act_filter = copy.deepcopy([d[5] for d in batch])
    act_filter = padding(act_filter, 0, dialog_maxlen)

    return uttr_var, labels_var, context, bow, prev_var, act_filter
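
The one-hot encoding of previous actions can be seen in isolation in the plain-Python sketch below. The `one_hot_actions` helper and the toy `act2i` are assumptions for illustration; padded turns get an all-zero row, as in `get_data_from_batch`.

```python
def one_hot_actions(prev_acts, act2i, dialog_maxlen):
    rows = []
    for act in prev_acts:
        row = [0] * len(act2i)
        row[act2i[act]] = 1  # set the slot of the previous action
        rows.append(row)
    # pad short dialogues with all-zero rows up to dialog_maxlen turns
    rows.extend([0] * len(act2i) for _ in range(dialog_maxlen - len(prev_acts)))
    return rows


toy_act2i = {'<SILENT>': 0, 'answer q3': 1, "you're welcome": 2}  # toy action index
rows = one_hot_actions(['<SILENT>', 'answer q3'], toy_act2i, dialog_maxlen=3)
print(rows)  # [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
```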

**Finally, we are ready to train the model.**

In [37]:
def train(model, data, optimizer, w2i, act2i, n_epochs=2, batch_size=1):
    print('----Train---')
    data = copy.copy(data)
    for epoch in range(n_epochs):
        print('Epoch', epoch)
        random.shuffle(data)
        acc, total = 0, 0
        for batch_idx in range(0, len(data)-batch_size, batch_size):
            batch = data[batch_idx:batch_idx+batch_size]
            uttrs, labels, contexts, bows, prevs, act_fils = get_data_from_batch(batch, w2i, act2i)
            preds = model(uttrs, contexts, bows, prevs, act_fils)
            action_size = preds.size(-1)
            preds = preds.view(-1, action_size)
            labels = labels.view(-1)
            loss = categorical_cross_entropy(preds, labels)
            acc += torch.sum(labels == torch.max(preds, 1)[1]).data[0]
            total += labels.size(0)
            if batch_idx % (100 * batch_size) == 0:
                print('Acc: {:.3f}% ({}/{})'.format(100 * acc/total, acc, total))
                print('loss', loss.data[0])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

train(model, train_data, optimizer, w2i, act2i)


----Train---
Epoch 0
Acc: 0.000% (0/2)
loss 9.522615432739258
Acc: 35.639% (170/477)
loss 1.6240317821502686
Acc: 47.953% (445/928)
loss 0.6303272843360901
Acc: 56.516% (798/1412)
loss 1.1948012113571167
Acc: 61.983% (1169/1886)
loss 0.30662867426872253
Acc: 65.693% (1551/2361)
loss 0.32327380776405334
Acc: 68.087% (1901/2792)
loss 8.12259292602539
Acc: 70.227% (2262/3221)
loss 0.06248326227068901
Acc: 71.912% (2637/3667)
loss 0.048547301441431046
Acc: 73.522% (3035/4128)
loss 0.09099173545837402


**And to test:**

In [39]:
def test(model, data, w2i, act2i, batch_size=1):
    print('----Test---')
    model.eval()
    acc, total = 0, 0
    for batch_idx in range(0, len(data)-batch_size, batch_size):
        batch = data[batch_idx:batch_idx+batch_size]
        uttrs, labels, contexts, bows, prevs, act_fils = get_data_from_batch(batch, w2i, act2i)
        preds = model(uttrs, contexts, bows, prevs, act_fils)
        action_size = preds.size(-1)
        preds = preds.view(-1, action_size)
        labels = labels.view(-1)
        acc += torch.sum(labels == torch.max(preds, 1)[1]).data[0]
        total += labels.size(0)
    print('Test Acc: {:.3f}% ({}/{})'.format(100 * acc/total, acc, total))

test(model, test_data, w2i, act2i)

----Test---
Test Acc: 87.154% (3969/4554)
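
The accuracy reported above is the fraction of turns where the highest-scoring action equals the label. A plain-Python sketch of that computation on toy values follows (the `argmax` and `accuracy` helpers are assumptions for illustration), together with a check of the reported percentage.

```python
def argmax(row):
    # index of the largest value in the row
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(preds, labels):
    # count the turns where the top-scoring action matches the label
    correct = sum(1 for row, label in zip(preds, labels) if argmax(row) == label)
    return correct, len(labels)


preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]  # toy per-turn action scores
labels = [1, 0, 0]

correct, total = accuracy(preds, labels)
print('Acc: {:.3f}% ({}/{})'.format(100 * correct / total, correct, total))

reported = 100 * 3969 / 4554  # the counts printed by test() above
print('{:.3f}'.format(reported))  # 87.154
```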
