Seq2Seq model #96

Merged
merged 12 commits into master from first_learner on May 31, 2017

Conversation

@alexholdenmiller
alexholdenmiller commented May 22, 2017
Member

First draft of a simple seq2seq model.

@jaseweston
jaseweston May 24, 2017
Contributor

Remove?

@@ -0,0 +1,97 @@
# Copyright (c) 2017-present, Facebook, Inc.

@jaseweston
jaseweston May 24, 2017
Contributor

I'm confused: it's called rnn_baselines (plural), but you just have train.py and agents.py. I guess they could be moved if we add more RNN baselines? Or we could just give them more specific names now.

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

I'll update these! Haven't yet.

@@ -0,0 +1,97 @@
# Copyright (c) 2017-present, Facebook, Inc.
# All rights reserved.

@jaseweston
jaseweston May 24, 2017
Contributor

We need to start making a general train program now. Let's do it after you get this PR in.

parlai/agents/rnn_baselines/agents.py
class Seq2SeqAgent(Agent):
    """Simple agent which uses an LSTM to process incoming text observations."""

@jaseweston
jaseweston May 24, 2017
Contributor

Please explain the architecture a bit more (GRUs, layers, etc.).
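For context, an agent like this typically pairs an encoder RNN with a decoder RNN over a shared embedding. A minimal sketch of that shape, with hypothetical class and parameter names rather than the code in this PR:

    import torch.nn as nn

    class MinimalSeq2Seq(nn.Module):
        """Illustrative single-layer LSTM encoder-decoder; not the PR's actual model."""

        def __init__(self, vocab_size, emb_size=128, hidden_size=128, num_layers=1):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_size)
            self.encoder = nn.LSTM(emb_size, hidden_size, num_layers, batch_first=True)
            self.decoder = nn.LSTM(emb_size, hidden_size, num_layers, batch_first=True)
            self.out = nn.Linear(hidden_size, vocab_size)

        def forward(self, src_tokens, tgt_tokens):
            # Encode the source sequence; keep only the final hidden/cell state.
            _, hidden = self.encoder(self.embed(src_tokens))
            # Decode conditioned on the encoder state (teacher forcing on tgt_tokens).
            dec_out, _ = self.decoder(self.embed(tgt_tokens), hidden)
            return self.out(dec_out)  # logits over the vocabulary at each step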

parlai/core/dict.py
self.freq = SharedTable(self.freq)
self.tok2ind = SharedTable(self.tok2ind)
self.ind2tok = SharedTable(self.ind2tok)
# self.freq = SharedTable(self.freq)

@jaseweston
jaseweston May 24, 2017
Contributor

What happened here?

parlai/core/dict.py
@@ -294,8 +306,10 @@ def txt2vec(self, text, vec_type=np.ndarray):
             (self[token] for token in self.tokenize(str(text))),
             np.int
         )
-        else:
+        elif vec_type == list:

@jaseweston
jaseweston May 24, 2017
Contributor

We need to make sure you're not breaking anything else that uses the dict, e.g. the IR baseline.

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Actually, it looks like nothing was using this function except the remote agent.
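For context, the branch under discussion lets txt2vec return either a NumPy array or a plain Python list. A standalone sketch of that shape, written as a hypothetical free function rather than the actual parlai/core/dict.py method:

    import numpy as np

    def txt2vec(dictionary, text, vec_type=np.ndarray):
        # Map each token to its index, then package the indices as requested.
        ids = (dictionary[token] for token in dictionary.tokenize(str(text)))
        if vec_type == np.ndarray:
            return np.fromiter(ids, np.int64)
        elif vec_type == list:
            return list(ids)
        else:
            raise ValueError('unsupported vec_type: {}'.format(vec_type))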

parlai/agents/rnn_baselines/agents.py
argparser.add_arg('--gpu', type=int, default=-1,
                  help='which GPU device to use')
def __init__(self, opt, shared=None):

@jaseweston
jaseweston May 24, 2017
Contributor

It would be cool if this model also ranked candidates, which we need for many of the ParlAI tasks. I guess it doesn't yet?

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Yeah, this doesn't do that yet! I'd rather check it in before adding that functionality.
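One way candidate ranking could be added later is to score each candidate under the decoder and sort by log-likelihood. A hedged sketch, reusing the hypothetical MinimalSeq2Seq above and assuming a start-of-sequence index; this is not code from the PR:

    import torch
    import torch.nn.functional as F

    def rank_candidates(model, src_tokens, candidates, sos_idx=1):
        """Sort candidate token-id lists by the model's log-likelihood of generating them.

        src_tokens is a 1 x S tensor of source token ids; candidates is a list of
        Python int lists.
        """
        scores = []
        with torch.no_grad():
            for cand in candidates:
                dec_in = torch.tensor([[sos_idx] + cand[:-1]])   # teacher-forced decoder input
                target = torch.tensor([cand])                    # tokens to be predicted
                logp = F.log_softmax(model(src_tokens, dec_in), dim=-1)
                # Sum the log-probability assigned to each target token.
                scores.append(logp.gather(2, target.unsqueeze(-1)).sum().item())
        order = sorted(range(len(candidates)), key=lambda i: -scores[i])
        return [candidates[i] for i in order]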

@jaseweston
jaseweston May 26, 2017
Contributor

Do we want to kill this before checking in?

@jaseweston
jaseweston May 26, 2017
Contributor

I don't get why we need this? To fill a batch?

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Basically, yes: sometimes empty messages will need to be generated for the batch, and this allows the MultiTeacher to handle it instead of calling observe on the inner task teacher. (This follows from below, where act was not called on the task, so it makes sense not to call observe on it either.)
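A hedged sketch of that behavior, with hypothetical function and field names (not the actual MultiTeacher/BatchWorld code): slots with no active inner teacher get an empty padding message from the wrapper, and neither observe nor act is called on that inner teacher.

    def batch_step(inner_teachers, active, replies):
        """Produce one message per batch slot; inactive slots are padded locally."""
        messages = []
        for teacher, is_active, reply in zip(inner_teachers, active, replies):
            if is_active:
                teacher.observe(reply)               # only active teachers see the reply
                messages.append(teacher.act())
            else:
                # Pad the slot without calling observe or act on the inner teacher.
                messages.append({'text': '', 'episode_done': True})
        return messages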

@jaseweston
jaseweston May 26, 2017
Contributor

We didn't have a share before? Why is this new?

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Right, we didn't add a share before because we hadn't tried using the MultiTeacher with batching yet, where we don't want to read every data file batchsize times when we can just share the original data with the other inner worlds of the BatchWorld.
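A hedged sketch of that sharing pattern, with illustrative names (not the actual teacher code): the first copy reads the data file, and share() hands the loaded episodes to the copies a BatchWorld creates, so the file is read once instead of batchsize times.

    class SketchTeacher(object):
        def __init__(self, opt, shared=None):
            if shared and 'data' in shared:
                self.data = shared['data']               # reuse already-loaded episodes
            else:
                self.data = self._load(opt['datafile'])  # expensive read, done only once

        def share(self):
            # Everything a sibling copy needs in order to skip re-reading the file.
            return {'data': self.data}

        def _load(self, path):
            with open(path) as f:
                return f.read().splitlines()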

@jaseweston
jaseweston May 26, 2017
Contributor

We need to discuss these things again in person, I think.

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

This covers the case where we have, e.g., batchsize = 32 but only 20 episodes in the data (like with bAbI 1k valid). The first 20 inner worlds in the batch should be assigned the first 20 episodes, and the other 12 inner worlds should just churn: they have already finished all the episodes assigned to them (zero episodes).
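Illustrative arithmetic for the case described, assuming a simple modulo assignment of episodes to batch slots:

    batchsize = 32
    num_episodes = 20   # e.g. bAbI 1k valid

    # Slot w gets every episode whose index is congruent to w mod batchsize.
    assignments = [[e for e in range(num_episodes) if e % batchsize == w]
                   for w in range(batchsize)]

    assert sum(len(a) for a in assignments) == num_episodes
    assert all(len(a) == 1 for a in assignments[:20])   # first 20 worlds get one episode each
    assert all(len(a) == 0 for a in assignments[20:])   # last 12 worlds just churn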

@jaseweston
jaseweston commented on parlai/core/worlds.py in f774dd0 May 26, 2017
Contributor

In general, it scares me how many changes you have in core for this PR, which is supposed to be about RNNs!

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Moved them all to #105.

@jaseweston
jaseweston May 26, 2017
Contributor

We have a standard param for model file names, no?

@alexholdenmiller
alexholdenmiller May 26, 2017
Member

Right, I should move this over to that.

parlai/core/params.py
default_downloads_path = os.path.join(self.parlai_home, 'downloads')
self.parser.add_argument(
    '-t', '--task',
    help='ParlAI task(s), e.g. "babi:Task1" or "babi,cbt"')
self.parser.add_argument(
    '--logpath', default=default_log_path,

@jaseweston
jaseweston May 31, 2017
Contributor

Still here?

@alexholdenmiller merged commit 6fe19ce into master May 31, 2017

@alexholdenmiller deleted the first_learner branch May 31, 2017
