
does the key masking work? #33

Closed
liuwei1206 opened this issue May 26, 2018 · 6 comments

Comments

@liuwei1206

liuwei1206 commented May 26, 2018

Hi @Kyubyong,
the key-masking code is as follows:

# Key Masking
key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) # (N, T_k)
key_masks = tf.tile(key_masks, [num_heads, 1]) # (h*N, T_k)
key_masks = tf.tile(tf.expand_dims(key_masks, 1), [1, tf.shape(queries)[1], 1]) # (h*N, T_q, T_k)

The parameter keys is the sum of the word embedding and the position embedding. That means even when a word in a sentence is padding (all zeros), adding the position embedding to the word embedding leaves no zero vector in the final embedding. Therefore key_masks must be all ones, with no zeros! So I'm confused whether this code actually works.
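
To illustrate the point, a minimal sketch (NumPy, with made-up toy values, not the repo's code): once a non-zero positional encoding is added, the padded rows are no longer all-zero, so the sign/abs/reduce_sum trick stops detecting them.

import numpy as np

T_k, d = 4, 3
word_emb = np.array([[0.5, -0.2, 0.1],
                     [0.3,  0.7, -0.4],
                     [0.0,  0.0,  0.0],   # padding token -> all-zero row
                     [0.0,  0.0,  0.0]])  # padding token -> all-zero row
pos_emb = 0.1 * np.random.randn(T_k, d)   # stand-in for the (non-zero) positional encoding

mask_before = np.sign(np.abs(word_emb.sum(axis=-1)))              # [1. 1. 0. 0.] -> padding detected
mask_after = np.sign(np.abs((word_emb + pos_emb).sum(axis=-1)))   # almost surely [1. 1. 1. 1.]
print(mask_before, mask_after)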

@Duum

Duum commented Jul 23, 2018

I have the same question; I don't think the masking works.

@mingxiansen

I tested the model and you are right: none of the masking works! @Kyubyong

@bobobe

bobobe commented Sep 5, 2018

I found that there are two ways to implement the position embedding. key_masks can work with a PE whose parameters are trained along with the model, but it does not work with the sinusoidal PE the paper describes, because the padding embeddings are not 0! Who can fix it? @Kyubyong
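
As a rough check of this claim, a minimal sketch (NumPy, assuming the sinusoidal formula PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)) from the paper): every row of the fixed PE is non-zero, including the rows at padded positions, so adding it destroys the zero rows the key masking relies on.

import numpy as np

def sinusoidal_pe(maxlen, d):
    pos = np.arange(maxlen)[:, None]
    i = np.arange(d)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

pe = sinusoidal_pe(6, 8)
print(np.abs(pe).sum(axis=-1))   # every row sums to a non-zero value (cos(0) = 1), even at padded positions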

@shaunzhuyw

I also found the same problem. If the raw code is used, the mask doesn't work at all. I verified it like this:
self.length_mask = tf.cast(tf.sequence_mask(length_batch, maxlen), tf.int32)  # (N, maxlen), 1 for real tokens, 0 for padding
length_embedding = tf.Variable(tf.concat([tf.zeros(shape=(1, num_units)), tf.ones(shape=(maxlen-1, num_units))], 0), trainable=False)  # row 0 is all zeros, the remaining rows are all ones
self.length_mask_embedding = tf.nn.embedding_lookup(length_embedding, self.length_mask)  # (N, maxlen, num_units)
self.dec_position_embedding *= self.length_mask_embedding  # zero the positional encoding at padded positions
self.dec += self.dec_position_embedding  # padded rows of dec stay all-zero, so the key masking still fires

where length_batch is the length of each sentence in the batch

@Yang-Charles

I recently ran this model and found that it doesn't make any progress; the tqdm progress bar never advances!

@zsgchinese

I also ran into the same problem: with the raw code, the masking doesn't work at all, and I verified it with the same snippet @shaunzhuyw posted above.

One question about that snippet: in length_embedding, shouldn't the block of ones have shape [1, num_units] rather than [maxlen - 1, num_units]? The indices produced by sequence_mask are only 0 and 1, so a two-row table should be enough.
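
For reference, a minimal sketch (TF1-style graph code, with hypothetical placeholder and size names) of that two-row variant:

import tensorflow as tf

maxlen, num_units = 10, 512                                   # example sizes (assumptions)
length_batch = tf.placeholder(tf.int32, [None])               # true length of each sentence

length_mask = tf.cast(tf.sequence_mask(length_batch, maxlen), tf.int32)        # (N, maxlen), values 0 or 1
length_embedding = tf.constant([[0.0] * num_units,            # row 0: zero out PE at padded positions
                                [1.0] * num_units])           # row 1: keep PE at real positions
length_mask_embedding = tf.nn.embedding_lookup(length_embedding, length_mask)  # (N, maxlen, num_units)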
