#Seq2Seq Notebook

#Original GPMN






##Background
Chit-chat models are often criticized for lacking a consistent persona and for producing unengaging speech. A study by Zhang et al. (https://arxiv.org/pdf/1801.07243.pdf) sought to address both problems by (i) conditioning chit-chat models on persona profile information, so that a model displays a consistent given persona, and (ii) collecting and conditioning on information about the agent the model is talking to, so that dialogue is more engaging and tailored. Several architectures were explored, including generative models such as the profile-agnostic Seq2Seq and the profile-conditioned Generative Profile Memory Network (GPMN).
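
To make "profile-conditioned" concrete, here is a toy sketch of one decoder step attending over a memory of encoded persona sentences using softmax dot-product attention. This is an illustration only, not the paper's exact formulation or ParlAI's code: the function names, the 2-dimensional vectors, and the numbers are all made up for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_over_profile(decoder_state, profile_memory):
    """Return (weights, context) for one decoder step.

    decoder_state: list of floats (hypothetical decoder hidden state).
    profile_memory: list of vectors, one per encoded persona sentence.
    """
    # Dot-product score between the decoder state and each profile entry
    scores = [sum(h * p for h, p in zip(decoder_state, entry))
              for entry in profile_memory]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the profile entries
    dim = len(decoder_state)
    context = [sum(w * entry[i] for w, entry in zip(weights, profile_memory))
               for i in range(dim)]
    return weights, context

# Toy example with made-up numbers: the state is aligned with the first entry
state = [1.0, 0.0]
memory = [[2.0, 0.0], [0.0, 2.0]]
weights, context = attend_over_profile(state, memory)
print(weights, context)
```

The first profile entry receives the larger weight because it has the larger dot product with the decoder state; a profile-agnostic model simply has no such memory to attend over.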

Our study (Alex Berg, Anders Parslov, Anjalie Kini) replicates the Seq2Seq and GPMN experiments of Zhang et al. and extends them by modifying these models and introducing transformer models. This notebook contains examples of how to run Seq2Seq models.

##Coding Notes
All files used and run in the unit below are from the original ParlAI implementation (parlai-master, or parlai-master-OLD in the shared repo). Note that the current ParlAI repo has deprecated and/or removed almost every file associated with the GPMN, so assembling the files required to run the GPMN took a significant amount of sleuthing and modifications to updated code (to undo deprecation).

In [2]:
#Prerequisites
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
%cd "gdrive/My Drive/ParlAI-master"
! python setup.py develop
! pip install torch tensorboardX stop-words

#Table of Contents


Just a little example of how to run Seq2Seq


#Replication

In [15]:
import numpy as np
import os
import signal
import json

from parlai.core.agents import create_agent, create_agent_from_shared, get_agent_module
from parlai.core.worlds import create_task
from parlai.core.params import ParlaiParser
from parlai.core.utils import Timer, round_sigfigs, warn_once
from parlai.core.logs import TensorboardLogger
from parlai.scripts.build_dict import build_dict, setup_args as setup_dict_args
from parlai.core.distributed_utils import (
    sync_object, is_primary_worker, all_gather_list, is_distributed, num_workers
)
from parlai.scripts.build_pytorch_data import get_pyt_dict_file
from parlai.scripts.train_model import *

#Option 1: Most Basic Seq2Seq

This is only an example Seq2Seq (encoder-decoder) model; it is not the one the paper actually uses. However, it appears easier to manipulate than the real one. See Option 2 below for the real one.

In [30]:
#task='parlai.agents.local_human.local_human:LocalHumanAgent'
#model='projects.personachat.persona_seq2seq:PersonachatSeqseqAgentBasic'

task = "personachat"
model = "seq2seq"
batch_size = 36
lr = 1e-2
hidden_size = 128
# The flags are split on whitespace below, so the model-file path is left unquoted
args = f"""-m {model}
           -t {task}
           -mf /tmp/model
           -bs {batch_size}
           -lr {lr}
           -hs {hidden_size}"""

In [None]:
opt = setup_args().parse_args(args.split())
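
One detail worth knowing when a flag string is passed through `args.split()`: `str.split()` splits on whitespace only and does not interpret shell-style quoting, so any quote characters stay inside the resulting tokens. The standard-library `shlex.split` removes them. The sketch below uses a shortened flag string made up for the demonstration and does not call ParlAI at all.

```python
import shlex

# Shortened, hypothetical flag string with a quoted path
flags = "-t personachat -mf '/tmp/model' -bs 36"

naive_tokens = flags.split()        # whitespace split, quotes kept literally
shell_tokens = shlex.split(flags)   # shell-style split, quotes removed

print(naive_tokens[3])
print(shell_tokens[3])
```

If a path may contain spaces or quotes, `shlex.split` is the safer choice; for simple paths, leaving the value unquoted works with either function.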

In [None]:
trainer = TrainLoop(opt) #This just defines a training object

In [None]:
trainer.train() #This actually trains it

#Option 2: Dedicated Seq2Seq (From original paper) -- with and without Attention

See persona_seq2seq.py in parlai-master-old -> projects -> personachat. As far as we know, this is the version the paper actually uses. (The other seq2seq is merely an example of an encoder-decoder model.)
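
The cells below select the agent with a `module.path:ClassName` string such as `projects.personachat.persona_seq2seq:PersonachatSeqseqAgentBasic`. As a rough illustration of how a string of that shape can be turned into a class, here is a minimal sketch using `importlib`. It is not ParlAI's actual loader; `resolve_agent_spec` is a hypothetical helper, and the demo resolves a standard-library class so that it runs without ParlAI installed.

```python
import importlib

def resolve_agent_spec(spec):
    """Resolve a 'module.path:ClassName' string to the class it names.

    Hypothetical helper for illustration; ParlAI's own lookup may differ.
    """
    module_path, _, class_name = spec.partition(":")
    if not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {spec!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Demo with a standard-library class instead of a ParlAI agent
cls = resolve_agent_spec("collections:OrderedDict")
print(cls.__name__)
```

With this form, switching agents only means changing the string after the colon, provided the module is importable from the current working directory.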

In [45]:
import numpy as np
import os
import signal
import json

from parlai.core.agents import create_agent, create_agent_from_shared, get_agent_module
from parlai.core.worlds import create_task
from parlai.core.params import ParlaiParser
from parlai.core.utils import Timer, round_sigfigs, warn_once
from parlai.core.logs import TensorboardLogger
from parlai.scripts.build_dict import build_dict, setup_args as setup_dict_args
from parlai.core.distributed_utils import (sync_object, is_primary_worker, all_gather_list, is_distributed, num_workers)
from parlai.scripts.build_pytorch_data import get_pyt_dict_file
from parlai.scripts.train_model import *
from projects.personachat.persona_seq2seq import PersonachatSeqseqAgentSplit #NOTE! This is the important difference between the above OPTION 1 and OPTION 2

In [56]:
task = "personachat:self"
model = "projects.personachat.persona_seq2seq:PersonachatSeqseqAgentBasic"   #NOTE! This is the important difference between the above OPTION 1 and OPTION 2
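# Hyperparameters are untuned; these values were copied from a pre-trained model
batch_size = 36  # same value as in Option 1, repeated so this cell runs on its own
lr = 1e-3
hidden_size = 1024
dr = 0.2
# The flags are split on whitespace below, so the model-file path is left unquoted
args = f"""-m {model}
           -t {task}
           -mf /tmp/model
           -bs {batch_size}
           -lr {lr}
           -dr {dr}
           -hs {hidden_size}"""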

In [None]:
opt = setup_args().parse_args(args.split())

In [None]:
trainer = TrainLoop(opt)

In [None]:
trainer.train()

#Additional: download a pretrained model and run an interactive version

In [50]:
from parlai.core.build_data import download_models
from parlai.core.params import ParlaiParser
from parlai.scripts.interactive import interactive
from projects.personachat.persona_seq2seq import PersonachatSeqseqAgentBasic

In [None]:
# Run this from the parlai-master directory; drop the %cd line if you are already there
%cd parlai-master
# Copy seq2seq_withprofile_interactive.py (currently in projects -> personachat -> scripts) into parlai-master first
! python seq2seq_withprofile_interactive.py