Smoothl1 loss and Python API. #1870

Merged · merged 10 commits · Apr 25, 2017
Changes from 3 commits
6 changes: 6 additions & 0 deletions doc/api/v1/trainer_config_helpers/layers.rst
@@ -498,6 +498,12 @@ hsigmoid
     :members: hsigmoid
     :noindex:
 
+smooth_l1
+---------
+.. automodule:: paddle.trainer_config_helpers.layers
+    :members: smooth_l1
+    :noindex:
+
 Check Layer
 ============

Contributor: Please add this to doc/api/v2/config/layer.rst; the v1 docs don't need to be changed.

Contributor (author): Done.
2 changes: 1 addition & 1 deletion paddle/gserver/layers/CostLayer.cpp
@@ -217,7 +217,7 @@ void SmoothL1CostLayer::forwardImp(Matrix& output,
     targetCpu->copyFrom(target);
     outputCpu->copyFrom(output);
     labelCpu->copyFrom(*label.value);
-    targetCpu->smoothL1(*outputCpu, *(labelCpu));
+    targetCpu->smoothL1(*outputCpu, *labelCpu);
     target.copyFrom(*targetCpu);
   } else {
     target.smoothL1(output, *label.value);
10 changes: 6 additions & 4 deletions paddle/gserver/layers/CostLayer.h
@@ -91,8 +91,8 @@ class MultiClassCrossEntropy : public CostLayer {
  *
  * [1] Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar,
  *     Richard Schwartz, and John Makhoul. Fast and robust neural
- *     network joint models for statistical machine translation.
- *     In Proceedings of the ACL 2014 Conference.
+ *     network joint models for statistical machine translation. * In
+ *     Proceedings of the ACL 2014 Conference.
  */
 class MultiClassCrossEntropyWithSelfNorm : public CostLayer {
  public:

Contributor: The formatting seems to have gotten broken here.

Contributor (author): Reverted.
@@ -164,9 +164,11 @@ class SumOfSquaresCostLayer : public CostLayer {
  * tasks.
  * \f[
  * L =
- *   (output - label)^2 * 0.5  /  -1 < (output - label) < 1  /
- *   (output - label) - 0.5    /  otherwise  /
+ *   0.5 * x^2   if  /  -1 < |x| < 1  /
+ *   |x| - 0.5   /  otherwise  /
  * \f]
+ *
+ * x = output - label
  */
 class SmoothL1CostLayer : public CostLayer {
  public:
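To make the piecewise definition above concrete, here is a minimal NumPy sketch (an illustration, not part of this PR) of smooth L1 with x = output - label. Note the two branches meet at |x| = 1 (0.5 * 1^2 = 0.5 = 1 - 0.5), so the loss is continuous and its slope never exceeds 1:

    import numpy as np

    def smooth_l1(x):
        # elementwise smooth L1: quadratic near zero, linear in the tails
        absx = np.abs(x)
        return np.where(absx < 1.0, 0.5 * x * x, absx - 0.5)

    print(smooth_l1(np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])))
    # -> [1.5   0.5   0.125 0.    0.125 0.5   1.5  ]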
2 changes: 1 addition & 1 deletion paddle/gserver/tests/test_LayerGrad.cpp
@@ -1685,7 +1685,7 @@ TEST(Layer, smooth_l1) {
   config.layerConfig.add_inputs();
 
   for (auto useGpu : {false, true}) {
-    testLayerGrad(config, "smooth_l1", 100, false, useGpu, false, 2.0);
+    testLayerGrad(config, "smooth_l1", 100, false, useGpu, false);
   }
 }

Collaborator: Can you extend the loss test to have more than 1 dim output?

  config.inputDefs.push_back({INPUT_DATA, "layer_0", 1, 0});
  config.inputDefs.push_back({INPUT_DATA_TARGET, "layer_1", 1, 0});

Change the size of input and label to 10 in order to make sure it works with more than 1 dim.

Contributor (author): Done. Thanks!

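As intuition for the requested multi-dim case, here is a hedged NumPy sketch (an illustration, not the PR's test code) of the kind of check testLayerGrad performs: the analytic smooth L1 gradient compared against a central finite difference on a (100, 10) batch:

    import numpy as np

    def smooth_l1_sum(out, lbl):
        x = out - lbl
        absx = np.abs(x)
        return np.where(absx < 1.0, 0.5 * x * x, absx - 0.5).sum()

    def smooth_l1_grad(out, lbl):
        # f'(x) = x if |x| < 1, sign(x) otherwise
        x = out - lbl
        return np.where(np.abs(x) < 1.0, x, np.sign(x))

    rng = np.random.RandomState(0)
    out, lbl = rng.randn(100, 10), rng.randn(100, 10)

    eps = 1e-6
    i, j = 3, 7  # probe one coordinate
    bump = np.zeros_like(out)
    bump[i, j] = eps
    numeric = (smooth_l1_sum(out + bump, lbl) -
               smooth_l1_sum(out - bump, lbl)) / (2 * eps)
    analytic = smooth_l1_grad(out, lbl)[i, j]
    # would only disagree if the probe straddles the kink at |x| = 1
    assert abs(numeric - analytic) < 1e-4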
30 changes: 17 additions & 13 deletions paddle/math/Matrix.cpp
@@ -3616,17 +3616,18 @@ void CpuMatrix::smoothL1(Matrix& output, Matrix& label) {
   CHECK_EQ(output.getHeight(), numSamples);
   CHECK_EQ(label.getWidth(), dim);
   CHECK_EQ(getWidth(), (size_t)1);
-  real* out = output.getData();
+
   real* cost = getData();
+  real* out = output.getData();
   real* lbl = label.getData();
 
-  for (size_t i = 0; i < numSamples; ++i, out += dim, cost += dim, lbl += dim) {
+  for (size_t i = 0; i < numSamples; ++i, out += dim, lbl += dim) {
     for (size_t j = 0; j < dim; ++j) {
-      cost[j] = std::fabs(out[j] - lbl[j]);
-      if (cost[j] < 1.0)
-        cost[j] = 0.5 * cost[j] * cost[j];
+      real absVal = std::fabs(out[j] - lbl[j]);
+      if (absVal < 1.0)
+        cost[i] += 0.5 * absVal * absVal;
       else
-        cost[j] = cost[j] - 0.5;
+        cost[i] += absVal - 0.5;
     }
   }
 }
@@ -3640,17 +3641,20 @@ void CpuMatrix::smoothL1Bp(Matrix& output, Matrix& label) {
   CHECK_EQ(label.getHeight(), numSamples);
   CHECK_EQ(output.getHeight(), numSamples);
   CHECK_EQ(label.getWidth(), dim);
-  CHECK_EQ(getWidth(), (size_t)1);
+  CHECK_EQ(getWidth(), dim);
 
   real* out = output.getData();
-  real* cost = getData();
   real* lbl = label.getData();
+  real* grad = getData();
 
   // f'(x) = x if |x| < 1
   //       = sign(x) otherwise
-  for (size_t i = 0; i < numSamples; ++i, out += dim, cost += dim, lbl += dim) {
+  for (size_t i = 0; i < numSamples; ++i, out += dim, grad += dim, lbl += dim) {
     for (size_t j = 0; j < dim; ++j) {
-      cost[j] = out[j] - lbl[j];
-      if (std::fabs(cost[j]) >= 1) cost[j] = (0 < cost[j]) - (cost[j] < 0);
+      real val = out[j] - lbl[j];
+      if (std::fabs(val) < 1) {
+        grad[j] += val;
+      } else {
+        grad[j] += (real(0) < val) - (val < real(0));
+      }
     }
   }
 }

@Noplz (Contributor, Apr 24, 2017): Shouldn't this be grad[i] += val, with grad += dim removed from the loop above? The dimension of grad should match numSamples, right? The old code could add dim inside the loop because dim was checked to be 1; if dim is not 1, that would be wrong.

Contributor (author): grad has size numSamples * dim; it holds the gradient with respect to the first input.

Contributor (author): See https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/gserver/layers/CostLayer.cpp, lines 60 -> 68 -> 227 -> 244. grad has the same dimensions as out.

@Noplz (Contributor): Hmm... you're right, I was mistaken.
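To make the shapes from this thread explicit, here is a small NumPy sketch (an illustration, not PaddlePaddle code) mirroring CpuMatrix::smoothL1 and smoothL1Bp: the forward pass reduces each row to a single per-sample cost, while the backward pass keeps the full numSamples x dim shape of out, hence grad[j] with grad advanced by dim per sample:

    import numpy as np

    def smooth_l1_forward(out, lbl):
        x = out - lbl                                    # (numSamples, dim)
        absx = np.abs(x)
        per_elem = np.where(absx < 1.0, 0.5 * x * x, absx - 0.5)
        return per_elem.sum(axis=1, keepdims=True)       # (numSamples, 1), like cost[i]

    def smooth_l1_backward(out, lbl):
        x = out - lbl
        return np.where(np.abs(x) < 1.0, x, np.sign(x))  # (numSamples, dim), like grad[j]

    out = np.array([[0.3, -2.0], [1.5, 0.1]])
    lbl = np.zeros((2, 2))
    print(smooth_l1_forward(out, lbl))   # [[1.545], [1.005]]
    print(smooth_l1_backward(out, lbl))  # [[ 0.3 -1. ], [ 1.   0.1]]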
1 change: 1 addition & 0 deletions python/paddle/trainer/config_parser.py
@@ -2119,6 +2119,7 @@ def init(cls, name, inputs, device=None, coeff=1.):
 define_cost('SoftBinaryClassCrossEntropy', 'soft_binary_class_cross_entropy')
 define_cost('HuberTwoClass', 'huber')
 define_cost('SumCost', 'sum_cost')
+define_cost('SmoothL1Cost', 'smooth_l1')
 
 
 @config_layer('hsigmoid')
57 changes: 55 additions & 2 deletions python/paddle/trainer_config_helpers/layers.py
Expand Up @@ -116,6 +116,7 @@
'spp_layer',
'pad_layer',
'eos_layer',
'smooth_l1_cost',
'layer_support',
]

Expand Down Expand Up @@ -201,6 +202,7 @@ class LayerType(object):
SOFT_BIN_CLASS_CROSS_ENTROPY = "soft_binary_class_cross_entropy"
MULTI_BIN_LABEL_CROSS_ENTROPY = "multi_binary_label_cross_entropy"
SUM_COST = "sum_cost"
SMOOTH_L1 = "smooth_l1"

@staticmethod
def is_layer_type(type_name):
Expand Down Expand Up @@ -5249,8 +5251,6 @@ def multi_binary_label_cross_entropy(input,
:type input: LayerOutput
:param label: The input label.
:type input: LayerOutput
:param type: The type of cost.
:type type: basestring
:param name: The name of this layers. It is not necessary.
:type name: None|basestring
:param coeff: The coefficient affects the gradient in the backward.
Expand Down Expand Up @@ -5279,3 +5279,56 @@ def multi_binary_label_cross_entropy(input,
LayerType.MULTI_BIN_LABEL_CROSS_ENTROPY,
parents=[input, label],
size=1)


+@wrap_name_default()
+@layer_support()
+def smooth_l1_cost(input, label, name=None, layer_attr=None):
+    """
+    This is an L1 loss, but smoother. It requires that the
+    sizes of input and label are equal. The formula is as follows,
+
+    .. math::
+
+        L = \sum_{i} smooth_{L1}(input_i - label_i)
+
+    in which
+
+    .. math::
+
+        smooth_{L1}(x) =
+        \begin{cases}
+        0.5x^2& \text{if} |x| < 1 \\
+        |x|-0.5& \text{otherwise}
+        \end{cases}
+
+    More details can be found by referring to `Fast R-CNN
+    <https://arxiv.org/pdf/1504.08083v2.pdf>`_
+
+    .. code-block:: python
+
+       cost = smooth_l1_cost(input=input_layer,
+                             label=label_layer)
+
+    :param input: The input layer.
+    :type input: LayerOutput
+    :param label: The input label.
+    :type label: LayerOutput
+    :param name: The name of this layer. It is not necessary.
+    :type name: None|basestring
+    :param layer_attr: Extra Layer Attribute.
+    :type layer_attr: ExtraLayerAttribute
+    :return: LayerOutput object.
+    :rtype: LayerOutput
+    """
+    assert isinstance(input, LayerOutput)
+    assert isinstance(label, LayerOutput)
+    assert input.size == label.size
+
+    Layer(
+        name=name,
+        type=LayerType.SMOOTH_L1,
+        inputs=[input.name, label.name],
+        **ExtraLayerAttribute.to_kwargs(layer_attr))
+    return LayerOutput(
+        name, LayerType.SMOOTH_L1, parents=[input, label], size=1)
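
For context, a hedged usage sketch (an assumed example config, not part of the PR) showing the new helper as the training cost of a small regressor built with the same trainer_config_helpers API:

    from paddle.trainer_config_helpers import *

    feats = data_layer(name='features', size=100)
    target = data_layer(name='target', size=10)
    # linear regressor; its output size must equal the label size,
    # since smooth_l1_cost asserts input.size == label.size
    pred = fc_layer(input=feats, size=10, act=LinearActivation())
    cost = smooth_l1_cost(input=pred, label=target)
    outputs(cost)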
@@ -5,6 +5,6 @@ last_first_seq test_expand_layer test_ntm_layers test_hsigmoid
 img_layers img_trans_layers util_layers simple_rnn_layers unused_layers test_cost_layers
 test_rnn_group shared_fc shared_lstm shared_gru test_cost_layers_with_weight
 test_spp_layer test_bilinear_interp test_maxout test_bi_grumemory math_ops
-test_seq_concat_reshape)
+test_seq_concat_reshape test_smooth_l1)
 
 export whole_configs=(test_split_datasource)
@@ -0,0 +1,40 @@
type: "nn"
layers {
name: "input"
type: "data"
size: 300
active_type: ""
}
layers {
name: "label"
type: "data"
size: 300
active_type: ""
}
layers {
name: "__smooth_l1_cost_0__"
type: "smooth_l1"
size: 1
active_type: ""
inputs {
input_layer_name: "input"
}
inputs {
input_layer_name: "label"
}
coeff: 1.0
}
input_layer_names: "input"
input_layer_names: "label"
output_layer_names: "__smooth_l1_cost_0__"
sub_models {
name: "root"
layer_names: "input"
layer_names: "label"
layer_names: "__smooth_l1_cost_0__"
input_layer_names: "input"
input_layer_names: "label"
output_layer_names: "__smooth_l1_cost_0__"
is_recurrent_layer_group: false
}

@@ -0,0 +1,7 @@
from paddle.trainer_config_helpers import *

data = data_layer(name='input', size=300)
lbl = data_layer(name='label', size=300)
smooth_l1 = smooth_l1_cost(input=data, label=lbl)

outputs(smooth_l1)