Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Huber loss #819

Merged
merged 1 commit into from
Feb 21, 2018
Merged

Added Huber loss #819

merged 1 commit into from
Feb 21, 2018

Conversation

Sentient07
Copy link
Contributor

For #814

Copy link
Member

@f0k f0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, and sorry for the delay! There are some things to be changed, but it looks good in general.

targets : Theano 2D tensor or 1D tensor
Either a vector of int giving the correct class index per data point
or a 2D tensor of one-hot encoding of the correct class in the same
layout as predictions (non-binary targets in [0, 1] do not work!)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't this meant for regression? You copied the description from a multi-class loss.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yes, I will revert.

.. math:: L_\delta (diff) = \frac{diff^2}{2} & \text{if } |diff| \le \
\delta
.. math:: L_\delta (diff) & = & \delta (|diff| - \frac{\delta}{2} ),\
&\text{else}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't checked what this renders like, but shouldn't this use a single .. math:: statement? http://www.sphinx-doc.org/en/stable/ext/math.html#directive-math
It seems rendering cases with a one-sided bracket is not easily possible in Sphinx, so using two lines is a good workaround.

Copy link
Contributor Author

@Sentient07 Sentient07 Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used two here because there are two different expressions that this will return, based on the value of delta.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually you'd use a single expression with a conditional bracket, but that's not easily possible in Sphinx (http://tex.stackexchange.com/questions/122407/writing-conditional-equations-with-braces-in-sphinx).
Using two lines seems like a good alternative, but you can write two lines in a single .. math:: statement, as shown at http://www.sphinx-doc.org/en/stable/ext/math.html#directive-math. I guess that would be cleaner.

layout as predictions (non-binary targets in [0, 1] do not work!)
delta : scalar, default 1
This delta value is defaulted to 1, for SmoothL1Loss
described in Fast-RCNN paper[1].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be [1]_, with an underscore in the end and a space before, if I recall the syntax correctly.

Notes
-----
This is an alternative to the Least Squared loss for
regression problems.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

least squares, or better squared error because that's what the function is called in Lasagne.

Returns
-------
Theano 1D tensor
An expression for the item-wise huber loss.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this was copied from a multi-class objective docstring. Copy it from the squared error instead (it should be a tensor of arbitrary dimensionality, and the element-wise loss).

diff = targets - predictions
ift = 0.5 * squared_error(targets, predictions)
iff = delta * (abs(diff) - delta / 2.)
return theano.tensor.switch(abs(diff) <= delta, ift, iff).sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't sum in the end, it should return the element-wise loss.

https://arxiv.org/pdf/1504.08083.pdf
"""
predictions, targets = align_targets(predictions, targets)
diff = targets - predictions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use abs_diff = abs(targets - predictions) instead. Maximizes reuse of Theano expressions, so Theano doesn't have to merge them later.

@@ -153,6 +153,35 @@ def test_binary_hinge_loss(colvect):


@pytest.mark.parametrize('colvect', (False, True))
def test_huber_loss(colvect):
from lasagne.objectives import huber_loss
delta = [0.5, 1.0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add delta to the test parameters and add another parametrize. Looks a little cleaner.

----------
.. [1] Ross Girshick et al (2015):
Fast RCNN
https://arxiv.org/pdf/1504.08083.pdf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also cite Huber, maybe as the first reference? https://en.wikipedia.org/wiki/Huber_loss#cite_note-1

@f0k
Copy link
Member

f0k commented Apr 4, 2017

Oh, and I forgot, the new function should be included in __all__ at the top of the file, and in the module docstring at the top of the file, and in docs/modules/objectives.rst.

@Sentient07
Copy link
Contributor Author

I have added a new commit that addresses all the comments to the best of my knowledge. I haven't verified myself what sphinx renders. There seems to be some problem with my browser(safari). If the sphinx isn't fine, please let me know I will change it and try to build that on another system and push one more commit.
Thanks

Copy link
Member

@f0k f0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update! Some more comments (or repetitions of earlier comments). Almost there!

@@ -86,6 +86,7 @@
"aggregate",
"binary_hinge_loss",
"multiclass_hinge_loss",
"huber_loss",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add it to the module docstring further above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to the .rst file as I mentioned in #819 (comment). Thank you!

.. math::
L_\delta (diff) = \frac{diff^2}{2} & \text{if } |diff| \le \ delta \\

L_\delta (diff) & = & \delta (|diff| - \frac{\delta}{2} ),\ \\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment characters (&) still seem off -- does this render correctly in Sphinx?

or a 2D tensor of one-hot encoding of the correct class in the same
layout as predictions (non-binary targets in [0, 1] do not work!)
Ground truth to which the prediction is to be compared
with.
Copy link
Member

@f0k f0k Apr 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please copy the predictions and targets from squared_error. This is still not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, it's a little unclear to me. You mean to say, i should be having predictions and targets in a single line separated by , similar to squared_error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this?

    Parameters
    ----------
    predictions,  targets : Theano tensors

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, or you can also call it a and b, since the loss is completely symmetric. Your current docstring says they should be 1D or 2D tensors, which is too strict! They can be anything -- it's a drop-in replacement for the squared error.

regression problems.

References
----------
.. [1] Ross Girshick et al (2015):
Fast RCNN
https://arxiv.org/pdf/1504.08083.pdf

.. [2] Huber, Peter et al (1964)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should have a colon : in the end. Sorry for the apparent nitpick, but note that the line break is only visible when viewing the docstring in Python, not in the Sphinx rendering. Also please add a period . after the title. Have a look at how this renders to be sure. (Last bullet point: https://github.com/Lasagne/Lasagne/blob/master/.github/PULL_REQUEST_TEMPLATE.md)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't see the comment about your browser problem. Do you have another browser you can try? (Using another computer seems overkill!) Otherwise I can also check out your PR and test the documentation here, but of course it's easier if you can debug and fix things directly!

"""
predictions, targets = align_targets(predictions, targets)
diff = targets - predictions
ab_diff = abs(targets - predictions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd strongly prefer abs_diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a typo. I don't have anything against abs_diff, but yeah I understand the convention :)

l1 = huber_loss(a, b, delta[0])
l2 = huber_loss(a, b, delta[1])
l1 = huber_loss(a, b, delta)
l2 = huber_loss(a, b, delta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're computing the same thing twice now. The idea of making delta a test parameter was to remove the duplication :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yeah, sorry. I rushed through that PR. I will check every comment again before the next commit.

@Sentient07
Copy link
Contributor Author

I very much apologise for the delay. The two months were really hectic in seeking positions. I have addressed the comments that were left over. The sphinx now looks good in my local build. If there are anymore changes, I will make them right away!
screenshot at jun 01 01-24-30

Copy link
Member

@f0k f0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, and thanks a lot for including the screenshot! There's just a bug in your test.

abs_diff = abs(x - y)
ift = 0.5 * abs_diff ** 2
iff = delta * (abs_diff - delta / 2.)
z = np.where(abdiff <= delta, ift, iff)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd merge this right away, only all your tests failed with NameError: global name 'abdiff' is not defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that, I will fix it.

@f0k
Copy link
Member

f0k commented Jun 1, 2017

Looks good now, can you please squash everything into the first commit?

Addressed comments

Fixed sphinx errors

changed variable name
@f0k f0k merged commit ffc8b8a into Lasagne:master Feb 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants