
Add concatenative feature embeddings #147

Merged
merged 8 commits into OpenNMT:master from embeddings on Aug 5, 2017

Conversation

bpopeters
Contributor

This pull request adds support for concatenating word and feature embeddings. Namely, if you have an x-dimensional word embedding and a y-dimensional feature embedding, they are concatenated into an (x+y)-dimensional embedding for the encoder. This is the default setting for using features in Lua.

I believe the previous behavior of the embeddings class corresponds to summing, so I preserved it under the sum option (a sketch of both merge modes follows the flag list below).

The goal is for the interface to be the same as in Lua OpenNMT. Consequently, the command line interface has been changed slightly to match Lua:

-feat_merge accepts concat and sum; the default is concat
-feat_vec_exponent defaults to 0.7; it is used to compute the size of the feature embeddings when using concat
-feature_vec_size has been renamed to -feat_vec_size, and its default is now 20
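
Roughly, the two merge modes behave like the following sketch. The class name and the sizing rule (feature dimension derived from the feature vocabulary size raised to feat_vec_exponent, unless feat_vec_size overrides it) are my illustrative reading of the options above, not the PR's actual code:

```python
import torch
import torch.nn as nn

class MergedEmbeddings(nn.Module):
    """Illustrative sketch of concat vs. sum feature merging."""

    def __init__(self, word_vocab_size, word_vec_size, feat_vocab_size,
                 feat_merge="concat", feat_vec_size=None, feat_vec_exponent=0.7):
        super().__init__()
        self.feat_merge = feat_merge
        if feat_merge == "sum":
            # Plain addition only works if both embeddings share a dimension.
            feat_dim = word_vec_size
        elif feat_vec_size is not None:
            # Explicit override of the feature embedding size.
            feat_dim = feat_vec_size
        else:
            # Derive the feature dimension from its vocabulary size.
            feat_dim = int(feat_vocab_size ** feat_vec_exponent)
        self.word_lut = nn.Embedding(word_vocab_size, word_vec_size)
        self.feat_lut = nn.Embedding(feat_vocab_size, feat_dim)
        self.embedding_size = (word_vec_size if feat_merge == "sum"
                               else word_vec_size + feat_dim)

    def forward(self, words, feats):
        w = self.word_lut(words)  # (batch, seq_len, word_vec_size)
        f = self.feat_lut(feats)  # (batch, seq_len, feat_dim)
        if self.feat_merge == "sum":
            return w + f
        return torch.cat([w, f], dim=-1)
```

With concat, downstream layers have to read the merged size (here, embedding_size) rather than assuming word_vec_size.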

While I was at it, I changed the implementation of make_positional_encodings() to use vectorized tensor operations instead of for loops and the built-in math module. It is faster now.
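
For reference, the vectorized form of the standard sinusoidal encoding looks roughly like this (a sketch; the PR's exact signature and argument order may differ):

```python
import math
import torch

def make_positional_encodings(dim, max_len):
    """Build a (max_len, dim) table of sinusoidal position encodings
    with tensor ops instead of per-element loops."""
    pe = torch.zeros(max_len, dim)
    position = torch.arange(0, max_len).unsqueeze(1).float()
    div_term = torch.exp(torch.arange(0, dim, 2).float()
                         * -(math.log(10000.0) / dim))
    pe[:, 0::2] = torch.sin(position * div_term)  # even indices: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd indices: cosine
    return pe
```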

@srush
Contributor

srush commented Aug 5, 2017

Great. This looks really good to me.

Also, I am moving this code around a bit in the torchtext branch, so I am going to try to port it over as well. I might need you to take a look there when it is done, just to make sure everything got moved correctly.

@srush
Contributor

srush commented Aug 5, 2017

Also, are we sure that sum is the same? Is there a linear+ReLU in Lua Torch?

@srush srush merged commit 47305c7 into OpenNMT:master Aug 5, 2017
@bpopeters bpopeters deleted the embeddings branch August 5, 2017 09:07
@bpopeters
Contributor Author

I'm not totally sure the sum is the same; I looked in the Lua code and couldn't figure it out. But at the very least, the result has the same dimension as word_vec_size, which is what should happen in the sum case. I also couldn't find much explanation of what's supposed to go on internally with sum.

@guillaumekln
Contributor

Here is the Lua code:

https://github.com/OpenNMT/OpenNMT/blob/master/onmt/modules/FeaturesEmbedding.lua#L25

It just adds the embeddings, nothing more.

@bpopeters
Contributor Author

Thanks. I interpreted what was going on primarily based on the documentation here:

http://opennmt.net/OpenNMT/options/train/#sequence-to-sequence-with-attention-options

The documentation seems to be in error with respect to feat_vec_size. If summing is all that happens, the feature embeddings must have the same dimension as the word embeddings, so there should be no reason to ever set feat_vec_size with sum: the only allowable feature size is word_vec_size. I assumed the documentation was correct, which is why I didn't dismiss the MLP interpretation of sum outright: a linear projection allows an arbitrary feat_vec_size while still returning the same dimension as a summing feature merge.
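
To make the two readings concrete, a toy sketch (the sizes and the linear+ReLU merge are illustrative assumptions, not taken from either codebase):

```python
import torch
import torch.nn as nn

word_vec_size, feat_vec_size = 500, 20
word_emb = torch.randn(4, 10, word_vec_size)

# Reading 1 (what the Lua code actually does): plain addition, which
# requires the feature embedding to already have word_vec_size dimensions.
feat_emb_same = torch.randn(4, 10, word_vec_size)
merged_plain = word_emb + feat_emb_same

# Reading 2 (the MLP interpretation): project the features up first,
# which permits an arbitrary feat_vec_size while keeping the output size.
feat_emb_small = torch.randn(4, 10, feat_vec_size)
proj = nn.Sequential(nn.Linear(feat_vec_size, word_vec_size), nn.ReLU())
merged_mlp = word_emb + proj(feat_emb_small)  # (4, 10, word_vec_size)
```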

I'll get cracking on fixing this soon.

@bpopeters bpopeters restored the embeddings branch August 8, 2017 08:02