Multihead scaled dot product attention. #7791

lcy-seso · 2018-01-23T11:32:05Z

No description provided.

guoshengCS

LGTM

guoshengCS · 2018-01-24T08:05:04Z

python/paddle/v2/fluid/layers/nn.py

+        if len(x_shape) == 1:
+            x_shape = [1] + x_shape
+        if len(y_shape) == 1:
+            y_shape = [1] + y_shape


If the rank of y is 1, it is treated as [D, 1] in non-transposed form, so this should probably be y_shape = y_shape + [1] rather than [1] + y_shape.
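
To illustrate the shape convention raised in this comment, here is a minimal sketch in plain Python. The helper name normalize_matmul_shapes is hypothetical and is not taken from nn.py; it only assumes that a rank-1 x acts as a row vector [1, D] and a rank-1 y acts as a column vector [D, 1] when neither operand is transposed.

```python
def normalize_matmul_shapes(x_shape, y_shape):
    """Promote rank-1 operand shapes to rank 2 for a non-transposed matmul.

    Hypothetical helper for illustration only; not the code in nn.py.
    """
    x_shape = list(x_shape)
    y_shape = list(y_shape)
    if len(x_shape) == 1:
        # Left operand: [D] becomes a row vector [1, D].
        x_shape = [1] + x_shape
    if len(y_shape) == 1:
        # Right operand: [D] becomes a column vector [D, 1].
        y_shape = y_shape + [1]
    return x_shape, y_shape


# The inner dimensions now line up: [1, D] x [D, 1].
x_shape, y_shape = normalize_matmul_shapes([4], [4])
assert x_shape[-1] == y_shape[-2]
```

With the prepend used in the diff, y would become [1, D], and the inner dimensions of [1, D] x [1, D] would not match for D > 1.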

guoshengCS · 2018-01-24T08:06:12Z

python/paddle/v2/fluid/nets.py

+                    [bs, max_sequence_length, num_heads * hidden_dim].
+        """
+
+        if len(x.shape) == 3: return


Should this be return x here? A bare return hands None back to the caller.
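
A shape-level sketch of the point above, in plain Python. The function name combine_heads_shape is hypothetical, and the 4-D input layout [bs, num_heads, max_sequence_length, hidden_dim] is an assumption; only the 3-D output layout appears in the quoted docstring. The early exit returns its input instead of falling through to None.

```python
def combine_heads_shape(shape):
    """Return the output shape of a hypothetical combine-heads step.

    Assumed 4-D input layout (not confirmed by the diff):
        [bs, num_heads, max_sequence_length, hidden_dim]
    Output layout (from the quoted docstring):
        [bs, max_sequence_length, num_heads * hidden_dim]
    """
    shape = list(shape)
    if len(shape) == 3:
        # Already combined: return the input unchanged, not a bare return.
        return shape
    if len(shape) != 4:
        raise ValueError("expected a 3-D or 4-D shape")
    bs, num_heads, seq_len, hidden_dim = shape
    return [bs, seq_len, num_heads * hidden_dim]


combined = combine_heads_shape([2, 3, 5, 8])
unchanged = combine_heads_shape([2, 5, 24])
assert combined == unchanged
```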

guoshengCS · 2018-01-24T08:08:39Z

doc/api/v2/fluid/nets.rst

-dot_product_attention
---------------------
+scaled_dot_product_attention
+----------------------------
 ..  autofunction:: paddle.v2.fluid.nets.dot_product_attention


Should this be paddle.v2.fluid.nets.scaled_dot_product_attention here? The autofunction target still points at the old name.

lcy-seso added 5 commits January 22, 2018 17:55

add wrapper for multihead_attention.

3e195d8

Merge branch 'develop' into multihead_attention

abf9395

add multi-head scaled_dot_product attention.

113cd6b

Merge branch 'develop' into multihead_attention

dace68a

add multihead_attention.

3be6c73

lcy-seso force-pushed the multihead_attention branch 2 times, most recently from 7442fe0 to 4a24f76 Compare January 23, 2018 11:43

fix bugs.

9396c6d

lcy-seso force-pushed the multihead_attention branch 2 times, most recently from 9c550a9 to c0ac68b Compare January 23, 2018 13:28

lcy-seso requested a review from guoshengCS January 23, 2018 13:29

lcy-seso force-pushed the multihead_attention branch from c0ac68b to 90f334e Compare January 23, 2018 13:36

guoshengCS previously approved these changes Jan 23, 2018

Merge branch 'develop' into multihead_attention

d163592

lcy-seso dismissed guoshengCS’s stale review via d163592 January 24, 2018 00:39

lcy-seso force-pushed the multihead_attention branch from 90f334e to d163592 Compare January 24, 2018 00:39

guoshengCS previously approved these changes Jan 24, 2018

lcy-seso dismissed guoshengCS’s stale review via dc11205 January 24, 2018 02:36

lcy-seso force-pushed the multihead_attention branch from dc11205 to 738cc13 Compare January 24, 2018 02:37

guoshengCS previously approved these changes Jan 24, 2018

fix the documentation.

0d96899

lcy-seso dismissed guoshengCS’s stale review via 0d96899 January 24, 2018 04:57

lcy-seso force-pushed the multihead_attention branch 2 times, most recently from f4e5bd0 to d6f2d79 Compare January 24, 2018 06:42

add linear projection to q, k and v.

d00eb53

lcy-seso force-pushed the multihead_attention branch from d6f2d79 to d00eb53 Compare January 24, 2018 06:53

guoshengCS reviewed Jan 24, 2018

follow comments.

7210d0f

guoshengCS approved these changes Jan 24, 2018

lcy-seso merged commit 32a5dfd into PaddlePaddle:develop Jan 24, 2018
