The hyper parameters of paddle.optimizer do not work in the v2 API. #2042

Closed
qingqing01 opened this issue May 7, 2017 · 6 comments

qingqing01 commented May 7, 2017

The hyper parameters in paddle.optimizer do not work in the v2 API. For example, configure the Momentum optimizer in the sentiment demo as follows:

    optimizer = paddle.optimizer.Momentum(
        learning_rate=2e-3,
        momentum=0.9,
        gradient_clipping_threshold=25.0,
        regularization=paddle.optimizer.L2Regularization(rate=8e-4),
        model_average=paddle.optimizer.ModelAverage(average_window=0.5))

Then print the proto-string of the config before this line in python/paddle/v2/trainer.py; it shows that the proto-string of the parameters does not contain the hyper parameters, such as the L2 regularization and momentum. The momentum is 0 if you print it before this line in paddle/parameter/FirstOrderOptimizer.h. The proto-string of the parameters is as follows:

parameters {
  name: "___embedding_layer_0__.w0"
  size: 658816
  initial_mean: 0.0
  initial_std: 0.0139387206988
  dims: 5147
  dims: 128
  initial_strategy: 0
  initial_smart: true
}
parameters {
  name: "___sequence_conv_pool_0___conv_fc.w0"
  size: 49152
  initial_mean: 0.0
  initial_std: 0.051031036308
  dims: 384
  dims: 128
  initial_strategy: 0
  initial_smart: true
}
....

But the correct proto-string of the parameters should contain decay_rate and momentum, as follows:

parameters {
  name: "___embedding_0__.w0"
  size: 3840000
  momentum: 0.9
  initial_mean: 0.0
  initial_std: 0.0057735026919
  decay_rate: 0.0008
  dims: 30000
  dims: 128
  initial_strategy: 0
  initial_smart: true
  gradient_clipping_threshold: 25.0
}
parameters {
  name: "___fc_layer_0__.w0"
  size: 65536
  momentum: 0.9
  initial_mean: 0.0
  initial_std: 0.0883883476483
  decay_rate: 0.0008
  dims: 128
  dims: 512
  initial_strategy: 0
  initial_smart: true
  gradient_clipping_threshold: 25.0
}
...
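
For a quick sanity check, the dumped proto text itself can be scanned to see which parameters are missing the expected fields. This is plain text matching over the printed output, not a Paddle API, and the field list below covers just the three fields discussed in this issue:

    import re

    # Rough check over the dumped proto text (`dump` holds whatever the print
    # statement in trainer.py produced). For every `parameters { ... }` block,
    # report which of the expected hyper parameter fields are missing.
    EXPECTED = ('momentum', 'decay_rate', 'gradient_clipping_threshold')

    def missing_hyper_params(dump, expected=EXPECTED):
        missing = {}
        for match in re.finditer(r'parameters \{(.*?)\n\}', dump, re.S):
            body = match.group(1)
            name = re.search(r'name: "([^"]*)"', body).group(1)
            absent = [field for field in expected if field + ':' not in body]
            if absent:
                missing[name] = absent
        return missing
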
@qingqing01 qingqing01 added the Bug label May 8, 2017
@reyoung reyoung self-assigned this May 8, 2017
@reyoung reyoung added this to 未规划 in Scrum Board May 8, 2017
@luotao1 luotao1 added this to 已有BUG in V2 API Enhancement May 9, 2017
@lcy-seso lcy-seso added this to Top priorities in Defects board May 10, 2017
@lcy-seso lcy-seso moved this from Not in schedule to Next Week in Defects board May 10, 2017
@lcy-seso lcy-seso moved this from Next Week to Current Week ToDo in Defects board May 10, 2017
@lcy-seso lcy-seso moved this from Current Week ToDo to Not in schedule in Defects board May 10, 2017

reyoung commented May 10, 2017

This bug is hard to fix because we split the model configuration into two parts: the topology configuration and the optimizer settings. When we configure and parse the topology, the optimizer information has not been set yet. However, weight_decay currently belongs to the topology.

Here is a step-by-step solution to this issue.

  1. Disable weight_decay in the optimizer settings. If users want a global weight_decay, they can maintain a ParamAttr themselves (a rough sketch of this workaround follows below).

  2. Move weight_decay out of the Parameter configuration, or set the global weight_decay in the topology configuration (e.g. ModelConfig) itself.

The second step is a bit hard to implement and may require changes to the C++ core of Paddle.
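
As a rough illustration of step 1, a per-parameter attribute can be attached to each layer. The keyword names of paddle.attr.Param below (such as l2_rate, momentum, and gradient_clipping_threshold) are assumptions based on the old ParameterAttribute and may differ from the actual v2 API:

    import paddle.v2 as paddle

    # Sketch of the step-1 workaround: attach the hyper parameters to each
    # parameter explicitly instead of relying on the global optimizer settings.
    # The keyword names are assumptions, not verified against the v2 API.
    para_attr = paddle.attr.Param(
        momentum=0.9,
        l2_rate=8e-4,                      # per-parameter weight decay
        gradient_clipping_threshold=25.0)

    data = paddle.layer.data(
        name='input', type=paddle.data_type.dense_vector(128))
    fc = paddle.layer.fc(
        input=data, size=512, act=paddle.activation.Relu(),
        param_attr=para_attr)
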

@reyoung reyoung moved this from Not in schedule to Next Week in Defects board May 10, 2017

qingqing01 commented May 10, 2017

It is not only a problem with weight_decay; momentum and gradient_clipping_threshold are affected as well.


reyoung commented May 10, 2017

The solution might be as follows.

The global weight_decay, momentum, and gradient_clipping_threshold should be saved into proto::OptimizationConfig, which is stored inside TrainerConfig.proto, just like learning_rate in OptimizationConfig.

Then every optimizer can read the global weight_decay.
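
To make the idea concrete, here is a minimal sketch of the fallback lookup an optimizer would do under this proposal. Plain dicts stand in for the real proto messages, and the field names are assumptions:

    # Illustrative only: an optimizer resolves each hyper parameter from the
    # per-parameter config first and falls back to the global value kept in
    # OptimizationConfig.
    def resolve(param_conf, opt_conf, field, default=0.0):
        if field in param_conf:
            return param_conf[field]
        return opt_conf.get(field, default)

    opt_conf = {'momentum': 0.9, 'decay_rate': 8e-4,
                'gradient_clipping_threshold': 25.0}
    param_conf = {'name': '___fc_layer_0__.w0'}  # no per-parameter overrides

    momentum = resolve(param_conf, opt_conf, 'momentum')      # -> 0.9
    decay_rate = resolve(param_conf, opt_conf, 'decay_rate')  # -> 0.0008
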

@lcy-seso lcy-seso moved this from Next Week to Doing in Defects board May 22, 2017
@lcy-seso lcy-seso removed this from Doing in Defects board May 22, 2017
@lcy-seso lcy-seso moved this from BUG to 已完成 in V2 API Enhancement Jun 7, 2017

lcy-seso commented Jun 7, 2017

This problem has been fixed by PR #2288.
Thank you for the issue.

@lcy-seso lcy-seso closed this as completed Jun 7, 2017
@lcy-seso lcy-seso reopened this Jun 22, 2017
@lcy-seso

This problem is not solved yet, so I am reopening it.

@qingqing01

Closing, since the v2 API has fixed this issue.

heavengate pushed a commit to heavengate/Paddle that referenced this issue Aug 16, 2021