
add embedding 2.0 #26649

Merged: 27 commits, Sep 1, 2020
Conversation

@seiriosPlus (Collaborator) commented Aug 25, 2020

PR types

New features

PR changes

OPs

Describe

Add a new embedding function and change its interface.

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

[0.0, 0.0, ..., 0.0 ]] # padding data
It will pad all-zero data when ids is 0.
Args:
input(Variable): A Tensor or LoDTensor with type int64, which contains the id information.

For 2.0, dtype Variable -> Tensor

input(Variable): A Tensor or LoDTensor with type int64, which contains the id information.
The last dimension of Tensor shape must be equal to 1. The value of the input id should
satisfy :math:`0<= id < size[0]` .
weight (Variable): The weight. A Tensor with shape of lookup table parameter. It should have two elements which

Same as above: use Tensor. Recommend checking and amending the wording throughout.

to :ref:`api_guide_Name`. Usually name is no need to set and
None by default.
Returns:
Variable: Embedding Tensor or LoDTensor mapped by input. The data type is the same as :attr:`dtype` .

Variable -> Tensor
No LoDTensor. show Tensor only in docs.

[0.0, 0.0, ..., 0.0 ]]] # padding data
The input padding_idx is less than 0, it is automatically converted to padding_idx = -1 + 128 = 127
It will pad all-zero data when ids is 127.
Case 2:

In 2.0, LoDTensor is not recommended. LoD examples can be removed.

@@ -0,0 +1,57 @@
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.

2019 -> 2020

indicates the size of the dictionary of embeddings and the size of each embedding vector respectively.
is_sparse(bool): The flag indicating whether to use sparse update. This parameter only
affects the performance of the backwards gradient update. It is recommended to set
True because sparse update is faster. But some optimizer does not support sparse update,

... some optimizers do not support...

such as :ref:`api_fluid_optimizer_AdadeltaOptimizer` , :ref:`api_fluid_optimizer_AdamaxOptimizer` ,
:ref:`api_fluid_optimizer_DecayedAdagradOptimizer` , :ref:`api_fluid_optimizer_FtrlOptimizer` ,
:ref:`api_fluid_optimizer_LambOptimizer` and :ref:`api_fluid_optimizer_LarsMomentumOptimizer` .
In these case, is_sparse must be False. Default: False.

case -> cases

@@ -108,3 +108,113 @@ def one_hot(x, num_classes, name=None):
outputs={'Out': one_hot_out},
stop_gradient=True)
return one_hot_out


def embedding(input, weight, padding_idx=None, is_sparse=True, name=None):
@Heeenrrry (Contributor) commented Aug 27, 2020

A preview page of this part is needed to check the display format and layout.

@iclementine

Confirmed the detailed parameter names; recording them here:

paddle.nn.Embedding(num_embeddings, embedding_dim, 
                    padding_idx=None, 
                    sparse=False, 
                    weight_attr=None, 
                    name=None)

paddle.nn.Embedding.forward(self, x)

paddle.nn.functional.embedding(x, weight, padding_idx=None,  sparse=False, name=None)
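The semantics behind these signatures can be related to the docstring's padding behavior with a small emulation. This is a hedged plain-Python sketch, not the Paddle implementation; `embedding_lookup` is an illustrative helper, and it assumes the documented rule that a negative `padding_idx` wraps by `num_embeddings` and that matching ids map to all-zero rows.

```python
def embedding_lookup(x, weight, padding_idx=None):
    # Resolve a negative padding_idx the way the docstring describes:
    # padding_idx = padding_idx + num_embeddings.
    num_embeddings = len(weight)
    dim = len(weight[0])
    if padding_idx is not None and padding_idx < 0:
        padding_idx += num_embeddings

    def lookup(node):
        if isinstance(node, int):
            # ids equal to padding_idx map to an all-zero row
            return [0.0] * dim if node == padding_idx else list(weight[node])
        return [lookup(child) for child in node]

    return lookup(x)

# 128 x 4 toy table: row i holds the value i in every slot
weight = [[float(i)] * 4 for i in range(128)]
out = embedding_lookup([[1, 3], [2, 4], [4, 127]], weight, padding_idx=-1)
# padding_idx = -1 wraps to 127, so entries with id 127 come back all-zero
```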

@iclementine left a comment

The docs need some changes, because lookup_v2_op does not require the last dimension of the input to be 1; the output shape is the input shape with embedding_size appended.
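The shape rule described here (no trailing dimension of 1; the output appends embedding_size) can be illustrated with NumPy fancy indexing, which behaves the same way. A hedged sketch, not the Paddle kernel:

```python
import numpy as np

x = np.array([[1, 3], [2, 4], [4, 7]])   # shape [3, 2], no trailing 1
weight = np.random.rand(128, 16)         # [num_embeddings, embedding_size]
out = weight[x]                          # indexing appends the weight's last dim
assert out.shape == (3, 2, 16)           # input shape + [embedding_size]
```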

Case 1:
input is a Tensor. padding_idx = -1
input.data = [[[1], [3]], [[2], [4]], [[4], [127]]]
input.shape = [3, 2, 1]


Should the shape description here be updated? The new op does not require the input's last dimension to be 1; instead, an embedding_size dimension is appended to the input's shape.

.. code-block:: python
import paddle.fluid as fluid
import numpy as np
data = fluid.data(name='x', shape=[None, 1], dtype='int64')


Same for the shapes here: the last dimension is not required to be 1, and the output appends an embedding_size dimension after the input's dimensions.

def embedding(input, weight, padding_idx=None, is_sparse=False, name=None):
"""
The operator is used to lookup embeddings vector of ids provided by :attr:`input` .
It automatically constructs a 2D embedding matrix based on the


These lines should also be revised, because functional.embedding does not create the weight.

Heeenrrry previously approved these changes Aug 27, 2020
**weight** (Parameter): the learnable weights of this layer.

Returns:
Variable: Embedding Tensor mapped by input. The data type is the same as :attr:`dtype` .

Here is Tensor


Case 1:
input is a Tensor. padding_idx = -1
input.data = [[[1], [3]], [[2], [4]], [[4], [127]]]


Written this way, the actual shape is [3, 2, 1].

"""
The operator is used to lookup embeddings vector of ids provided by :attr:`input` .

This OP requires the last dimension of Tensor shape must be equal to 1. The shape


The behavior here also appends an emb_size dimension, rather than replacing the last dimension of the input Tensor shape with emb_size.

None by default.

Returns:
Tensor: Embedding Tensor or LoDTensor mapped by input. The data type is the same as :attr:`dtype` .


The data type is the same as the dtype of weight.

Examples:
.. code-block:: python

import paddle.fluid as fluid


Recommend using dynamic-graph (dygraph) code for the example.

iclementine previously approved these changes Aug 27, 2020
@iclementine left a comment

LGTM
I added some comments, may be left for another document_fix PR?

@seiriosPlus (Collaborator, Author)

LGTM
I added some comments, may be left for another document_fix PR?

I will pull another request for document_fix.

@seiriosPlus dismissed stale reviews from iclementine and Heeenrrry via c2ebb07 August 28, 2020 01:50
iclementine previously approved these changes Aug 28, 2020
@iclementine left a comment

LGTM

iclementine previously approved these changes Aug 28, 2020
@iclementine left a comment

LGTM

@seiriosPlus (Collaborator, Author)

(screenshot attached)

sparse(bool): The flag indicating whether to use sparse update. This parameter only
affects the performance of the backwards gradient update. It is recommended to set
True because sparse update is faster. But some optimizers does not support sparse update,
such as :ref:`api_fluid_optimizer_AdadeltaOptimizer` , :ref:`api_fluid_optimizer_AdamaxOptimizer` ,

These references should later be updated to the API docs under paddle.optimizer.

The operator is used to lookup embeddings vector of ids provided by :attr:`input` .

The shape of output Tensor is generated by appending the last dimension of the input Tensor shape
with emb_size.

emb_size -> embedding size


The shape of output Tensor is generated by appending the last dimension of the input Tensor shape
with emb_size.
**Note:** The id in :attr:`input` must satisfy :math:`0 =< id < size[0]` ,

size[0] -> weight.shape[0]

padding_idx = -1
input.data = [[1, 3], [2, 4], [4, 127]]
input.shape = [3, 2]
Given size = [128, 16]

input.data -> x.data
input.shape -> x.shape
Given size -> weight.shape

It will pad all-zero data when ids is 127.

Args:
x(Tensor): A Tensor or LoDTensor with type int64, which contains the id information.

  1. Remove LoDTensor.
  2. Add int32 support.
  3. The last dimension is not required to be 1.
  4. size[0] -> weight.size[0]
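The points above about int32 ids and the dropped trailing-1 requirement can be sketched with NumPy indexing, which accepts either id dtype and appends the embedding dimension. A hedged emulation, not the Paddle op:

```python
import numpy as np

weight = np.random.rand(20, 8)               # [num_embeddings, embedding_dim]
for dtype in (np.int32, np.int64):           # both id dtypes the review asks for
    ids = np.array([[0, 3, 5], [2, 4, 19]], dtype=dtype)  # no trailing 1
    out = weight[ids]
    assert out.shape == (2, 3, 8)            # ids.shape + [embedding_dim]
    assert out.dtype == weight.dtype         # output dtype follows the weight
```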

inp_word.shape # [2, 3]
dict_size = 20

emb = nn.Embedding(dict_size, 32, weight_attr='emb.w', sparse=False)

Use an embedding example here.

helper = LayerHelper('embedding', **locals())
dtype = helper.input_dtype()

check_variable_and_dtype(x, 'input', ['int64'], 'embedding')

Support int32 as well.

@@ -15,7 +15,7 @@
# TODO: define the common classes to build a neural network
from ...fluid.dygraph import BilinearTensorProduct #DEFINE_ALIAS
from ...fluid.dygraph import Pool2D #DEFINE_ALIAS
from ...fluid.dygraph import Embedding #DEFINE_ALIAS
from ...fluid.dygraph import Linear #DEFINE_ALIAS

Why was Embedding changed to Linear?

@seiriosPlus (Collaborator, Author)

This came from resolving merge conflicts; this spot was not actually modified.

sparse(bool): The flag indicating whether to use sparse update. This parameter only
affects the performance of the backwards gradient update. It is recommended to set
True because sparse update is faster. But some optimizer does not support sparse update,
such as :ref:`api_fluid_optimizer_AdadeltaOptimizer` , :ref:`api_fluid_optimizer_AdamaxOptimizer` ,

Same as above.


emb = nn.Embedding(dict_size,
32,
weight_attr='emb.w',

What is emb.w?

XiaoguangHu01 previously approved these changes Aug 31, 2020
@XiaoguangHu01 (Contributor) left a comment

LGTM

XiaoguangHu01 previously approved these changes Aug 31, 2020
@XiaoguangHu01 (Contributor) left a comment

LGTM

chalsliu previously approved these changes Aug 31, 2020
@phlrain self-requested a review August 31, 2020 13:30
The shape of output Tensor is generated by appending the last dimension of the input Tensor shape
with embedding size.
**Note:** The id in :attr:`input` must satisfy :math:`0 =< id < weight.shape[0]` ,
otherwise the program will throw an exception and exit.

input -> x


.. code-block:: python

import paddle

This example code is static graph, right? It needs to be changed to a dynamic-graph example. Would a tensor created directly with paddle.randn work as the weight?

"""
:alias_main: paddle.nn.Embedding
:alias: paddle.nn.Embedding,paddle.nn.layer.Embedding,paddle.nn.layer.common.Embedding
:old_api: paddle.fluid.dygraph.Embedding

These three alias lines are not needed.

For specific usage, refer to code examples. It implements the function of the Embedding Layer.
This layer is used to lookup embeddings vector of ids provided by :attr:`input` .
It automatically constructs a 2D embedding matrix based on the
input :attr:`size` (vocab_size, emb_size) and :attr:`dtype` .

:attr:`size` no longer exists, and neither does :attr:`dtype`.

The shape of output Tensor is generated by appending an emb_size dimension to the
last dimension of the input Tensor shape.

**Note:** The id in :attr:`input` must satisfy :math:`0 =< id < size[0]` ,

size[0] -> num_embeddings

such as :ref:`api_optimizer_AdadeltaOptimizer` , :ref:`api_optimizer_AdamaxOptimizer` ,
:ref:`api_optimizer_DecayedAdagradOptimizer` , :ref:`api_optimizer_FtrlOptimizer` ,
:ref:`api_optimizer_LambOptimizer` and :ref:`api_optimizer_LarsMomentumOptimizer` .
In these case, is_sparse must be False. Default: False.

is_sparse

default weight parameter property is used. See usage for details in :ref:`api_fluid_ParamAttr` . In addition,
user-defined or pre-trained word vectors can be loaded with the :attr:`param_attr` parameter.
The local word vector needs to be transformed into numpy format, and the shape of local word
vector should be consistent with :attr:`size` . Then :ref:`api_fluid_initializer_NumpyArrayInitializer`

The size attribute no longer exists.

user-defined or pre-trained word vectors can be loaded with the :attr:`param_attr` parameter.
The local word vector needs to be transformed into numpy format, and the shape of local word
vector should be consistent with :attr:`size` . Then :ref:`api_fluid_initializer_NumpyArrayInitializer`
is used to load custom or pre-trained word vectors. See code example 2 for details.

Where is code example 2?

emb = nn.Embedding(
dict_size,
32,
sparse=False)

The call to emb is missing.


Attribute:
**weight** (Parameter): the learnable weights of this layer.


Use the form
"""
Shape:
    Input:
    Output:
"""
to describe the input and output shapes of forward.
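The Shape-section format suggested above could look like the following sketch. The wording is hypothetical, not the merged docstring:

```python
def forward(self, x):
    """Illustrative sketch of the suggested Shape section (hypothetical
    wording, not the merged docstring):

    Shape:
        Input: x, an int32 or int64 Tensor of shape [d_0, d_1, ..., d_n].
        Output: a Tensor of shape [d_0, d_1, ..., d_n, embedding_dim],
            with the same data type as the layer's weight.
    """
```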

@jzhang533 (Contributor) left a comment

will have followup pr to fix docs, after this pr is landed.

@phlrain (Collaborator) left a comment

LGTM for api

const int64_t *ids_p = nullptr;

if (ids_t->type() == framework::proto::VarType::INT32) {
InputTypeCovert<

This kind of type conversion hurts performance; suggest moving it into the kernel. This optimization can be left to a follow-up PR.

@seiriosPlus (Collaborator, Author)

I'll see whether there is a better solution; ideally one that avoids the forced cast.

@chalsliu self-requested a review September 1, 2020 03:25
@seiriosPlus merged commit ebc5f99 into PaddlePaddle:develop Sep 1, 2020
seiriosPlus added a commit to seiriosPlus/Paddle that referenced this pull request Sep 2, 2020
* add embedding 2.0

* add embedding support input int32
seiriosPlus added a commit that referenced this pull request Sep 2, 2020
* add embedding 2.0

* add embedding support input int32
7 participants