add dot-product attention #4674
Conversation
Almost LGTM.
@@ -1396,6 +1398,85 @@ def simple_attention(encoded_sequence,
        input=scaled, pooling_type=SumPooling(), name="%s_pooling" % name)


@wrap_name_default()
def dot_product_attention(encoded_sequence,
                          attending_sequence,
attending_sequence --> attended_sequence
Done
                          name=None):
    """
    Calculate and return a context vector with dot-product attention mechanism.
    Size of the context vector equals to size of the attending_sequence.
The dimension of the context vector equals the dimension of the attended sequence.
Done
        c_{i} & = \\sum_{j=1}^{T_{x}}a_{i,j}z_{j}

    where :math:`h_{j}` is the jth element of encoded_sequence,
    :math:`z_{j}` is the jth element of attending_sequence,
attended sequence
Done
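For reference, a minimal NumPy sketch of what the docstring's formula describes (shapes and variable names here are illustrative, not part of the patch): attention scores are dot products between the transformed decoder state and each element of encoded_sequence, the scores are normalized with a softmax, and the context vector is the weighted sum over the attended sequence.

```python
import numpy as np

# Illustrative shapes: T_x encoder steps, hidden size d (not taken from the PR).
T_x, d = 5, 8
encoded_sequence = np.random.randn(T_x, d)   # h_1 ... h_{T_x}
attended_sequence = np.random.randn(T_x, d)  # z_1 ... z_{T_x}
transformed_state = np.random.randn(d)       # s, same size as each h_j

# e_j = s . h_j : one dot-product score per encoder position
scores = encoded_sequence @ transformed_state

# a_j = softmax_j(e_j)
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# c = sum_j a_j z_j : the context vector, same size as the attended sequence
context = weights @ attended_sequence
```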
    .. code-block:: python

        context = dot_product_attention(encoded_sequence=enc_seq,
                                        attending_sequence=att_seq,
attending_sequence --> attended_sequence
Done
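With the rename applied, the usage example in the docstring presumably reads along these lines (a sketch; only the keyword argument changes):

```python
context = dot_product_attention(encoded_sequence=enc_seq,
                                attended_sequence=att_seq,
                                transformed_state=state)
```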
                                        attending_sequence=att_seq,
                                        transformed_state=state,)

    :param name: name of the dot-product attention model.
A prefix attached to the name of each layer that is defined inside dot_product_attention.
Done
    :type softmax_param_attr: ParameterAttribute
    :param encoded_sequence: output of the encoder
    :type encoded_sequence: LayerOutput
    :param attending_sequence: attention weight is computed by a feed forward neural
The attention weight ...
Done
                               hidden state of previous time step and encoder's output.
                               attending_sequence is the sequence to be attended.
    :type attending_sequence: LayerOutput
    :param transformed_state: transformed hidden state of decoder in previous time step,
- The transformed ...
- Is "transformed hidden state" the commonly accepted term used in the original paper?
I use this just for flexibility considerations.
                               attending_sequence is the sequence to be attended.
    :type attending_sequence: LayerOutput
    :param transformed_state: transformed hidden state of decoder in previous time step,
                              its size should equal to encoded_sequence's. Here we do the
whose dimension should be equal to encoded_sequence's dimension. Or use a period at the end of the last sentence and change "its" into "Its".
Done
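To illustrate the "transformation outside for flexibility" point: the caller is expected to project the decoder's previous hidden state to encoded_sequence's size before passing it in, so the choice of projection stays up to the caller. A rough sketch with the v1 config helpers (the fc_layer wiring, layer names, and sizes below are assumptions, not part of this patch):

```python
from paddle.trainer_config_helpers import *

# Hypothetical sizes and layer names, used only to make the sketch concrete.
encoder_size = 512
enc_seq = data_layer(name="encoded_sequence", size=encoder_size)
att_seq = data_layer(name="attended_sequence", size=256)
decoder_prev = data_layer(name="decoder_prev_state", size=128)

# Project the previous decoder state to encoded_sequence's size outside the
# helper; the projection itself is left to the caller for flexibility.
transformed = fc_layer(input=decoder_prev,
                       size=encoder_size,
                       act=LinearActivation(),
                       name="transform_decoder_state")

context = dot_product_attention(encoded_sequence=enc_seq,
                                attended_sequence=att_seq,
                                transformed_state=transformed)
```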
                              transformation outside dot_product_attention for flexibility
                              consideration.
    :type transformed_state: LayerOutput
    :return: a context vector
The context vector.
Done
    :return: a context vector
    :rtype: LayerOutput
    """
    assert transformed_state.size == encoded_sequence.size
Please leave a message to explain the check.
Done
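For reference, a sketch of how the check might read with a message attached (the exact wording in the updated patch may differ):

```python
# Hypothetical wording; the merged change may phrase this differently.
assert transformed_state.size == encoded_sequence.size, (
    "the dimension of transformed_state must equal that of encoded_sequence, "
    "because attention scores are dot products between the two")
```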
LGTM.
@lcy-seso