Add Factorization Machine Layer #4859

will-am · 2017-10-17T03:16:10Z

Resolve #4628 #3664 #3971


dzhwinter · 2017-10-18T05:31:33Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  const MatrixPtr& inputV = getInputValue(0);
+
+  size_t batchSize = inputV->getHeight();
+  size_t size = getSize();


What does getSize mean? I cannot validate this snippet of code without a comment.

getSize returns the output size of this layer.

Please give the variable a more meaningful name, following @dzhwinter's comment above.

Renamed to outputSize.


lcy-seso

Some simple comments first.

lcy-seso · 2017-11-14T06:22:32Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  factorSize_ = config_.factor_size();
+
+  /* initialize the latentVectors_ */
+  CHECK_EQ(inputLayers_.size(), 1UL);


Don't do lines 35 to 40 in init; move them into forward.

lcy-seso · 2017-11-14T06:23:28Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  latentVectors_ =
+      std::unique_ptr<Weight>(new Weight(height, factorSize_, parameters_[0]));
+
+  v2_ = Matrix::create(height, factorSize_, false, useGpu_);


The name v2_ is not readable; please use a more meaningful and readable name.

Renamed to latentVectorsSquare_.

lcy-seso · 2017-11-14T06:24:28Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  const MatrixPtr& inputV = getInputValue(0);
+
+  size_t batchSize = inputV->getHeight();
+  size_t size = getSize();


Please give the variable a more meaningful name, following @dzhwinter's comment above.

lcy-seso · 2017-11-14T06:30:03Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  Matrix::resizeOrCreate(tmpMul_, batchSize, factorSize_, false, useGpu_);
+  Matrix::resizeOrCreate(tmpOut_, batchSize, factorSize_, false, useGpu_);
+
+  REGISTER_TIMER_INFO("FwMulTimer", getName().c_str());


The first argument of REGISTER_TIMER_INFO is the timer's name. This looks like it was copied from the FC layer; please change the name.

lcy-seso · 2017-11-14T06:30:12Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  outV->sumRows(*tmpOut_, -0.5, 1.0);
+
+  /* activation */ {
+    REGISTER_TIMER_INFO("FwAtvTimer", getName().c_str());


Please change the timer name.

lcy-seso · 2017-11-14T06:30:31Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+
+void FactorizationMachineLayer::backward(const UpdateCallback& callback) {
+  /* Do derivation */ {
+    REGISTER_TIMER_INFO("BpAvtTimer", getName().c_str());


Please remember to change the timer name here as well.

lcy-seso · 2017-11-14T06:40:02Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+      CpuSparseMatrix* x2_s = dynamic_cast<CpuSparseMatrix*>(x2_.get());
+      CpuSparseMatrix* tmpIn_s = dynamic_cast<CpuSparseMatrix*>(tmpIn.get());
+      tmpIn_s->copyFrom(*inputV_s);
+      tmpIn_s->rowScale(0, *inputV_s, *oGrad);


The naming style of inputV_s, x2_s, and tmpIn_s is inconsistent; please follow the style used in layers. These names are also not readable; please consider using meaningful names.

Renamed to sparseInputV, sparseInputSquare, and sparseTmpInput.

lcy-seso · 2017-11-14T06:40:51Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+      latentVectors_->getWGrad()->mul(*tmpIn_s->getTranspose(), *tmpMul_, 1, 1);
+      tmpIn_s->rowScale(0, *x2_s, *oGrad);
+
+      MatrixPtr ones = Matrix::create(1, inputV->getHeight(), false, useGpu_);


Make the temporary variable ones a member variable.

lcy-seso · 2017-11-14T06:41:39Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+  if (inGrad != NULL) {
+    MatrixPtr latentVectors_T = latentVectors_->getW()->getTranspose();
+    inGrad->mul(*tmpMul_, *latentVectors_T, 1, 1);
+    tmpSum_T->sumRows(*v2_, -1, 0);


Please fix the naming style of tmpSum_T.

Renamed to tmpSumTrans.


lcy-seso · 2017-11-14T07:23:25Z

paddle/gserver/tests/test_LayerGrad.cpp

+  config.biasSize = 0;
+  config.inputDefs.push_back({type, "layer_0", 128, 1280});
+  config.layerConfig.add_inputs();
+  testLayerGrad(config, "factorization_machine", 16, false, useGpu, false);


Please add a unit test for the case where the input is a SparseMatrix.

lcy-seso · 2017-11-14T07:43:00Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+}
+
+void FactorizationMachineLayer::forward(PassType passType) {
+  Layer::forward(passType);


Running on GPU is not supported; please add a check that reports an error.

lcy-seso · 2017-11-14T08:10:51Z

paddle/gserver/layers/FactorizationMachineLayer.h

+ *
+ * \f[
+ *     y = \sum_{i=1}^{n-1}\sum_{j=i+1}^n\langle v_i, v_j \rangle x_i x_j
+ * \f]


You can cite the reference paper here.
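For readers checking the formula above, here is a minimal sketch in plain Python of the second-order term. It compares the direct double sum with the well-known linear-time rewrite (square of the sum minus sum of the squares, halved). The function names and the toy `x` and `v` values are illustrative assumptions, not code from this PR.

```python
from itertools import combinations


def pairwise_naive(x, v):
    """Direct evaluation of sum_{i<j} <v_i, v_j> x_i x_j."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def pairwise_fast(x, v):
    """Same quantity computed per latent dimension f as
    0.5 * ((sum_i v[i][f] * x[i])^2 - sum_i (v[i][f] * x[i])^2)."""
    n, k = len(x), len(v[0])
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        total += s * s - sq
    return 0.5 * total


# Toy input: 4 features, latent vectors of size 2 (hypothetical values).
x = [1.0, 0.0, 2.0, 3.0]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]

naive = pairwise_naive(x, v)
fast = pairwise_fast(x, v)
```

Both functions agree on this input; the second form avoids the quadratic loop over feature pairs.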

lcy-seso · 2017-11-14T08:13:28Z

paddle/math/CpuSparseMatrix.cpp

+  for (size_t i = 0; i < height_; i++) {
+    size_t start = getRowStartIdx(i);
+    size_t end = getRowStartIdx(i + 1);
+    CHECK(start == b.getRowStartIdx(i));


CHECK --> CHECK_EQ
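As context for the check above, the following is a rough Python sketch of row scaling on a CSR-style sparse matrix, assuming the operation multiplies every stored value in row i by the i-th entry of a scale vector and requires source and destination to share the same row layout. The function `row_scale_csr` and its arguments are hypothetical and do not reproduce the CpuSparseMatrix API.

```python
def row_scale_csr(dst_row_ptr, src_row_ptr, src_values, scales):
    """Return src_values with row i multiplied by scales[i].

    dst_row_ptr and src_row_ptr are CSR row-start arrays; they must match,
    mirroring the equality check in the reviewed loop.
    """
    if len(src_row_ptr) != len(scales) + 1:
        raise ValueError("need one scale per row")
    out = list(src_values)
    for i, s in enumerate(scales):
        start, end = src_row_ptr[i], src_row_ptr[i + 1]
        if start != dst_row_ptr[i]:
            raise ValueError("row %d starts at different offsets" % i)
        for p in range(start, end):
            out[p] = src_values[p] * s
    return out


# 3 rows: row 0 has two non-zeros, row 1 is empty, row 2 has one.
row_ptr = [0, 2, 2, 3]
values = [1.0, 2.0, 3.0]
scaled = row_scale_csr(row_ptr, row_ptr, values, [2.0, 5.0, -1.0])
```

An equality-style check reports both offsets on failure, which is the practical benefit of CHECK_EQ over a bare CHECK.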

lcy-seso · 2017-11-14T08:14:50Z

python/paddle/trainer_config_helpers/layers.py

+    The Factorization Machine can effectively capture feature interactions
+    especially when the input is sparse. In practice, usually order 2 feature
+    interactions are considered using Factorization Machine with the formula:
+    .. math::


Add a blank line before line 7166.
Add a blank line after line 7167; otherwise the formula will not render correctly.

lcy-seso · 2017-11-14T08:19:16Z

python/paddle/trainer_config_helpers/layers.py

+    The Factorization Machine models pairwise feature interactions as inner
+    product of the learned latent vectors corresponding to each input feature.
+    The Factorization Machine can effectively capture feature interactions
+    especially when the input is sparse. In practice, usually order 2 feature


usually order 2 feature --> this implementation only consider the 2-order feature interactions.

Please add a citation in the comment to the original paper that the FM layer implementation is based on.

lcy-seso · 2017-11-14T08:59:14Z

python/paddle/trainer_config_helpers/layers.py

+        is the latent vector corresponding to each input dimesion. The size of
+        each latent vector is k.
+    .. code-block:: python
+       factor_machine = factorization_machine(input=input_layer, factor_size=10)


Add a blank line before line 7172,
and a blank line after line 7173.

lcy-seso · 2017-11-14T09:11:43Z

proto/ModelConfig.proto

@@ -540,6 +540,9 @@ message LayerConfig {

  // for switch order layer
  optional ReshapeConfig reshape_conf = 59;
+
+  // for factorization machine layer
+  optional uint32 factor_size = 60;


Why define this new field instead of reusing the Layer's size?

The Layer's size is the output dimension, whereas this field is the dimension of the latent factors used internally.

Using size here would be ambiguous.

Sorry, I misunderstood. Please ignore this.

lcy-seso · 2017-11-14T09:22:09Z

python/paddle/trainer_config_helpers/layers.py

+    :param layer_attr: Extra Layer config.
+    :type layer_attr: ExtraLayerAttribute|None
+    :return: LayerOutput object.
+    :rtype: LayerOutput


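Please state in the comment and the example code that this layer is not a complete FM by itself; it only computes the second-order feature interactions. It has to be combined with other layers, so please give a complete example in the sample code section.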
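To illustrate the point numerically, here is a small plain-Python sketch of how a full Factorization Machine prediction combines a bias, a first-order (linear) term, and the second-order term that this layer computes. It is not a Paddle configuration; `fm_predict`, `second_order`, and all parameter values are assumptions made up for the example.

```python
def second_order(x, v):
    """Pairwise interaction term: sum_{i<j} <v_i, v_j> x_i x_j."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            total += dot * x[i] * x[j]
    return total


def fm_predict(x, w0, w, v):
    """Full FM output: bias + linear term + second-order term."""
    first_order = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return first_order + second_order(x, v)


# Hypothetical parameters for a 4-feature model with 2 latent factors.
x = [1.0, 0.0, 2.0, 3.0]
w0 = 0.5
w = [0.1, 0.2, 0.3, 0.4]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.0]]

prediction = fm_predict(x, w0, w, v)
```

In a layer-based configuration, the first-order part would typically come from a separate linear layer whose output is added to this layer's output.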


lcy-seso · 2017-11-27T09:36:55Z

python/paddle/trainer_config_helpers/layers.py

+
+
+@wrap_name_default()
+@wrap_act_default(act=LinearActivation())


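I think only a linear activation can be used here. If a nonlinear activation cannot be used in principle, hard-code the activation and don't let the user set it.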

A nonlinear activation can be used.

lcy-seso · 2017-11-27T09:45:21Z

python/paddle/trainer_config_helpers/layers.py

+    especially when the input is sparse.
+
+    This implementation only consider the 2-order feature interactions using
+    Factorization Machine with the formula:


In addition to this comment, add a complete configuration at line 7426 so that users reading this layer's documentation can write a complete FM model.

Document which input types are supported and which are not.

OK, I'll add that.

lcy-seso · 2017-11-27T09:46:56Z

python/paddle/trainer_config_helpers/layers.py

+    :param input: The input layer.
+    :type input: LayerOutput
+    :param factor_size: The hyperparameter that defines the dimensionality of
+                        the latent vector size


Add a period at the end of the sentence.

lcy-seso · 2017-11-27T09:47:23Z

python/paddle/trainer_config_helpers/layers.py

+    :param factor_size: The hyperparameter that defines the dimensionality of
+                        the latent vector size
+    :type context_len: int
+    :param act: Activation Type. Default is linear activation.


In principle, can a nonlinear activation be used here? I don't think it can.

lcy-seso · 2017-11-27T09:48:12Z

python/paddle/trainer_config_helpers/layers.py

+    :param act: Activation Type. Default is linear activation.
+    :type act: BaseActivation
+    :param param_attr: The Parameter Attribute. If None, the latent vectors will
+                       be initialized smartly. It's better to set it by


Since this is documentation, please explain exactly how the parameters are initialized when it says "be initialized smartly".

lcy-seso · 2017-11-27T09:55:18Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+      Matrix::resizeOrCreate(negOnes_, 1, inputV->getHeight(), false, useGpu_);
+      negOnes_->zeroMem();
+      negOnes_->add(-1);
+      tmpSum_->mul(*negOnes_, *sparseTmpInput, 1, 0);


// this = scaleAB*(a*b) + scaleT*this
mul(const Matrix& a, const Matrix& b, real scaleAB, real scaleT)

Why can't lines 125 to 127 be:

ones_->ones(); tmpSum_->mul(*ones_, *sparseTmpInput, -1, 0);

Paddle/paddle/math/Matrix.cpp

Line 2944 in b28b2f1

CHECK_EQ(scaleAB, static_cast<real>(1.0));

Because when b is sparse, mul only supports scaleAB equal to 1; other values are not supported.
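The equivalence behind this workaround can be checked with a small dense sketch in plain Python, assuming the semantics quoted above (this = scaleAB*(a*b) + scaleT*this). The `mul` helper below is a hypothetical stand-in for the matrix routine, not Paddle's implementation: a row of -1 entries with scaleAB = 1 gives the same result as a row of ones with scaleAB = -1.

```python
def mul(this, a, b, scale_ab, scale_t):
    """Return scale_ab * (a @ b) + scale_t * this for lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            prod = sum(a[r][m] * b[m][c] for m in range(inner))
            row.append(scale_ab * prod + scale_t * this[r][c])
        out.append(row)
    return out


# b stands in for the (sparse) input: 3 rows, 2 columns.
b = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
zeros = [[0.0, 0.0]]

neg_ones = [[-1.0, -1.0, -1.0]]
ones = [[1.0, 1.0, 1.0]]

with_neg_ones = mul(zeros, neg_ones, b, 1, 0)   # the form used in the PR
with_scale = mul(zeros, ones, b, -1, 0)         # the form the reviewer suggested
```

Both calls produce the negated column sums of b, so moving the sign into the left operand is a valid way around the scaleAB restriction.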

lcy-seso · 2017-11-27T10:02:20Z

paddle/gserver/layers/FactorizationMachineLayer.cpp

+
+  /* activation */ {
+    REGISTER_TIMER_INFO("FmFwAtvTimer", getName().c_str());
+    forwardActivation();


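Can the FM layer take a nonlinear activation? If it cannot in principle (I recall that it cannot, but please double-check), this can be removed. If it is allowed, keep it.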

A nonlinear activation can be added.

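This only computes the second-order interaction term. Do you mean it would also be fine to apply nonlinear activation A to the second-order term and nonlinear activation B to the first-order term?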

I haven't seen it used that way, but in theory it should be fine.

will-am added 8 commits October 11, 2017 16:07

Add framework of the factorization machine layer

1644c72

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

5e78c7a

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

28c9810

…chine_layer

Remove unnecessary configs

f504c8a

Implement factorization machine layer

947b6a7

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

22c5d1f

…chine_layer

Fix tests for factorization machine layer

2ce8f18

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

b3cd679

…chine_layer

will-am requested a review from lcy-seso October 17, 2017 03:16

lcy-seso requested review from Yancey1989 and dzhwinter October 17, 2017 03:59

will-am added 5 commits October 17, 2017 12:20

Reduce the input size in testing factorization machine

86053e7

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

0574915

…chine_layer

Change pow to square in factorization machine layer

9741ade

Fix dims in config parser for factorization machine layer

8654e8a

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

a30d53b

…chine_layer

dzhwinter reviewed Oct 18, 2017


will-am added 9 commits October 18, 2017 15:36

Fix creation of tmp variable in factorization machine layer

4c72b06

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

a8526f1

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

822ff38

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

601c1a3

…chine_layer

Add sparse matrix support in factorization machine layer

d9062cd

Add rowScale for CpuSparseMatrix

509ae79

Merge remote-tracking branch 'upstream/develop' into add_rowScale_for…

477e3eb

…_CpuSparseMatrix

Add sparse input support for factorization machine layer

4172fc0

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

f7941db

…chine_layer

lcy-seso requested changes Nov 14, 2017


will-am added 2 commits November 14, 2017 14:59

Merge branch 'add_rowScale_for_CpuSparseMatrix' into factorization_ma…

3ff683f

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

e5135e8

…chine_layer

lcy-seso reviewed Nov 14, 2017


will-am added 14 commits November 16, 2017 17:15

Update variable names and docs for factorization machine layer

7a1a586

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

d6e35ec

…chine_layer

Fix typo in factorization machine layer

0b6afb5

Add unitest for factorization machine layer with sparse input

09f4f92

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

571ef90

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

5ee63bb

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

6a0cfd9

…chine_layer

Update docs for factorization machine layer

d5a6c81

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

5392a50

…chine_layer

Add support of sparse_binary_vector as input for fm layer

6fed6f2

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

13ec6f9

…chine_layer

change clone to resizeOrCreate in fm layer

74a699a

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

b80cdce

…chine_layer

Merge remote-tracking branch 'upstream/develop' into factorization_ma…

89e63b1

…chine_layer

lcy-seso reviewed Nov 27, 2017


Update docs for fm layer

8a283db

lcy-seso approved these changes Nov 27, 2017


will-am merged commit 95cdbfe into PaddlePaddle:develop Nov 27, 2017

will-am deleted the factorization_machine_layer branch November 27, 2017 14:09


