NMT #20
Conversation
batch_size=source_sentence.shape[0],
attended=representation,
attended_mask=tensor.ones(source_sentence.shape).T,
glimpses=self.attention.take_glimpses.outputs[0])
glimpses is unnecessary; I forgot to remove it.
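For reference, a minimal sketch of the same call with the redundant kwarg dropped; generate is a hypothetical stand-in for whatever application the diff belongs to, since the diff shows only the keyword arguments:

# Sketch: the call above without the unused glimpses kwarg. `generate`
# is a placeholder name; source_sentence, representation and tensor are
# assumed to come from the surrounding example code.
generated = generate(
    batch_size=source_sentence.shape[0],
    attended=representation,
    attended_mask=tensor.ones(source_sentence.shape).T)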
A quick question: why English and Czech?
We did the least amount of preprocessing in Czech-English among the WMT15 pairs (only tokenization), so I thought it would be the easiest to set up for others, though we can add other pairs as well. There is nothing specific or hard-coded for Cs-En (only a few names, which won't be a problem when changed).
# send end of file, read output.
mb_subprocess.stdin.close()
stdout = mb_subprocess.stdout.readline()
print "output ", stdout
It shouldn't be there.
fixed with 6df83ab
You use the logger and prints at the same time. I think we should stick with the logger.
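For instance, the debug print in the diff above could go through the logger instead; a sketch, assuming a module-level logger is set up the way the rest of the example does it (mb_subprocess comes from the diff):

import logging

logger = logging.getLogger(__name__)

# send end of file, read output -- same block as in the diff, with the
# stray print routed through the logger instead.
mb_subprocess.stdin.close()
stdout = mb_subprocess.stdout.readline()
logger.info("multi-bleu output: %s", stdout)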
if j == 0:
    # Write to subprocess and file if it exists
    print >> mb_subprocess.stdin, trans_out
We use Python 3-style print everywhere else (from __future__ import print_function).
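A sketch of the same lines in Python 3 style, assuming the module does the __future__ import at the top:

from __future__ import print_function

if j == 0:
    # Write to subprocess and file if it exists
    print(trans_out, file=mb_subprocess.stdin)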
done with ecbc38c
Thanks for this example! In __init__.py:319 you save the model to … Another related issue is that when setting … I'm currently trying to figure out how to save and load the machine translation model, but unfortunately haven't had much success so far, even when hardcoding the paths the model is saved to and loaded from. I'd really appreciate it if you could look into this.
@fhirschmann thanks a lot for the pointers. I've changed the whole checkpointing structure and made it more specific to the NMT example. So currently only …
@orhanf, thank you very very much for this. I was under the impression that the Save/Load architecture in blocks would suffice for this. There are some small problems with the current version:
In [4]: load("model/log")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-d4d06dd157f9> in <module>()
----> 1 load("model/log")

/home/fabian/msc/exp/blocks-examples/env/src/theano/theano/misc/pkl_utils.pyc in load(f, persistent_load)
    318     p = pickle.Unpickler(BytesIO(zip_file.open('pkl').read()))
    319     p.persistent_load = persistent_load(zip_file)
--> 320     return p.load()
    321
    322

/home/fabian/msc/exp/blocks-examples/local/lib/python2.7/pickle.pyc in load(self)
    856             while 1:
    857                 key = read(1)
--> 858                 dispatch[key](self)
    859         except _Stop, stopinst:
    860             return stopinst.value

/home/fabian/msc/exp/blocks-examples/local/lib/python2.7/pickle.pyc in load_newobj(self)
   1081         args = self.stack.pop()
   1082         cls = self.stack[-1]
-> 1083         obj = cls.__new__(cls, *args)
   1084         self.stack[-1] = obj
   1085     dispatch[NEWOBJ] = load_newobj

TypeError: buffer() takes at least 1 argument (0 given)

May I ask what version of Python you are using? The last point may actually be a bug in Python 2.7.
Please see this pull request as far as sampling is concerned. I have not yet gotten to the BLEU Validator.
Another issue I found that may be limited to sampling, but is more likely an issue for the NMT model itself: in stream_cs2en.py:51 you set the end-of-sequence marker to the size of the vocabulary. However, the EOS marker is never actually added when the model is computed. In GroundHog this was solved by setting the last element in the sequence to the EOS token, and indeed there are some remains in sampling.py:42 which do not seem to get executed at all. An example input sequence now looks like this (with a vocabulary size of 220): …
Likewise, a sequence does not start with a BOS token, but I believe this was also the case in GroundHog. I also noticed that, disregarding the 0-padding, all sequences end with 1 (the UNK token).
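To illustrate the GroundHog-style handling mentioned above, a minimal sketch of setting the last element of a tokenized sequence to the EOS index; the helper name and eos_idx variable are hypothetical, not taken from the example's code:

import numpy

def set_eos(seq, eos_idx):
    # GroundHog-style EOS handling: overwrite the last position of the
    # (already tokenized, unpadded) sequence with the EOS index.
    seq = numpy.asarray(seq).copy()
    seq[-1] = eos_idx
    return seq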
I figured out why the EOS token is not present. While …
@fhirschmann thanks a lot for the pointers and testing efforts again, really appreciate it :) Please see my comments below
This will be fixed as I start testing sampling/beam-search. The reason why we have separate …
This seems more like a blocks issue caused by logger resuming; I will take a closer look at this one soon.
Nice catch again. I will try to figure out the problem, but again, its source is probably beyond the scope of this PR.
Python 2.7.6 (64-bit) is the default here at MILA.
This is also apparently a sync blunder of mine: in GroundHog we set vocabulary size V and the EOS idx to V-1. So please either increase the vocabulary size by one or use one minus the vocabulary size as the EOS idx (as you suggested), depending on your problem. I am out of town attending a conference and will be back at MILA in one week; I will try to resolve these issues as I have some time.
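In code, the two consistent options described above look roughly like this (variable names are illustrative, not the example's actual config keys):

V = 30000                  # hypothetical vocabulary size

# Option 1: increase the vocabulary size by one so that the old V is
# a valid EOS index.
vocab_size = V + 1
eos_idx = V

# Option 2 (GroundHog convention): keep V and use V - 1 as EOS.
vocab_size = V
eos_idx = V - 1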
Thanks @orhanf, I would very much like to continue to test and help fix the rest of this code.
# Add early stopping based on bleu
if config['bleu_script'] is not None:
    logger.info("Building bleu validator")
    BleuValidator(sampling_input, samples=samples, config=config,
You forgot to extensions.append() here.
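Presumably the intended wiring looks like the sketch below; only the arguments visible in the diff are shown, and the trailing ones (the line ends with a comma) are elided:

# Add early stopping based on bleu -- appended to the main loop's
# extensions list, per the review comment above.
if config['bleu_script'] is not None:
    logger.info("Building bleu validator")
    extensions.append(
        BleuValidator(sampling_input, samples=samples, config=config))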
The binary value for …
I restarted Travis; the test should pass now.
That's weird. @orhanf, can you rebase?
Well, it's a huge PR already. I'll merge it and open an issue to refactor it sometime in the future.
@dmitriy-serdyuk, this is the initial implementation of the RNN encoder-decoder with attention for machine translation. Working with the following versions (latest commits as of June 30, 2015): blocks, fuel, and picklable_itertools.
Items TODO:
- … not tested but should be fine
- … not tested, have to clean it up and adapt the changes/fixes from NMT

I will continue on these items this week; all minor issues compared to the whole PR.
@ejls, can you please take a look in case I am missing something?
@rizar, your comments/recommendations are also highly welcome.