Conversation

arnoegw (Contributor) commented Dec 1, 2022

which uses `reduce_mean()`, not `reduce_sum()`, to achieve independence of batch sizes, consistent with Keras' default loss reduction by `SUM_OVER_BATCH_SIZE`. Along the way, clarify the difference between `add_loss` and `Loss`.
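
As a concrete illustration of the change, here is a minimal sketch of the batch-size-independent pattern (the layer name and `rate` value are illustrative, not taken from the guide):

```python
import tensorflow as tf


class ActivityRegularizer(tf.keras.layers.Layer):
    """Adds an activity-regularization term via add_loss()."""

    def __init__(self, rate=1e-2):
        super().__init__()
        self.rate = rate

    def call(self, inputs):
        # reduce_mean() keeps the added term independent of the batch size,
        # consistent with Keras' default SUM_OVER_BATCH_SIZE loss reduction;
        # reduce_sum() would grow linearly with the batch size instead.
        self.add_loss(self.rate * tf.reduce_mean(tf.square(inputs)))
        return inputs
```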

arnoegw commented Dec 1, 2022

This is from a conversation with @rchao and @yuefengz, who have more context here.


arnoegw commented Dec 6, 2022

The CI failure looks unrelated:

```
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.15/x64/bin/black", line 8, in <module>
    sys.exit(patched_main())
  File "src/black/__init__.py", line 1423, in patched_main
  File "src/black/__init__.py", line 1409, in patch_click
ImportError: cannot import name '_unicodefun' from 'click' (/opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/site-packages/click/__init__.py)
```


"""
Notice that `add_loss()` can take the result of plain TensorFlow operations.
There is no need to call a `Loss` object here.
A reviewer (Contributor) commented:

But I assume Loss objects can still be used as well, is that correct? The wording sounds as if it's discouraged.

arnoegw (author) replied:

My motivation for this change stems from design discussions with colleagues who had the impression that subclasses of Loss would be the natural way to encapsulate some custom loss functions that we planned to use with add_loss().

So let's consider what class Loss adds on top of a simple tensor-to-tensor function from (y_true, y_pred) to a total loss:

  1. A split between init-time hparams and call-time inputs, providing a parametrized callback for the training loop.
  2. Serialization and deserialization as part of a Keras Model config.
  3. Wrapping of a user-level call() function with automated application of per-example weights...
  4. ...and reduction across examples.

For use in Model.compile(loss=...), all of this makes sense.

For use with Layer.add_loss(...), not so much:

  1. add_loss() only takes the final result, no need to pass a parametrized callback.
  2. add_loss() does not track a Loss object for serialization as such.
  3. Example weights are typically not available within the model definition.
  4. The one type of reduction that makes sense for add_loss() is SUM_OVER_BATCH_SIZE, and that will unexpectedly blow up in your face the moment you try to move the Model under a distribution strategy.

So, not only did the similarity in name point us towards an abstraction that was overly complex (items 1-3), it actually hurt us as soon as we wanted to leverage TensorFlow's key feature: scaling up with distributed training (item 4).

Bottom line: Yes, I think it is appropriate to gently discourage unreflective use of Loss objects with add_loss(). I would have liked to leave a stronger hint at item 4 (see the first commit of this PR), but I couldn't find a good link target to provide the necessary context.
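
To make the contrast concrete, here is a minimal sketch under the same reasoning (the model, layer, and hyperparameters are made up for illustration): a `Loss` object where it earns its keep, in `compile(loss=...)`, versus plain tensor math feeding `add_loss()`:

```python
import tensorflow as tf


# Items 1-2: a Loss object is the parametrized, serializable callback that
# Model.compile() will invoke with (y_true, y_pred) at training time.
def build_model(num_features=4):
    inputs = tf.keras.Input(shape=(num_features,))
    outputs = tf.keras.layers.Dense(1)(inputs)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss=tf.keras.losses.MeanSquaredError())
    return model


# Items 3-4: add_loss() only takes the final scalar, so plain tensor math
# with reduce_mean() is enough; the value stays independent of the
# per-replica batch size, which is what keeps it well-behaved under a
# tf.distribute.Strategy.
class ReconstructionLayer(tf.keras.layers.Layer):
    def __init__(self, num_features=4):
        super().__init__()
        self.decode = tf.keras.layers.Dense(num_features)

    def call(self, inputs):  # expects inputs with num_features columns
        reconstructed = self.decode(inputs)
        self.add_loss(tf.reduce_mean(tf.square(inputs - reconstructed)))
        return reconstructed
```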

arnoegw (author) added:

Friendly ping... I think the proposed wording is completely correct and useful guidance for the user: While calling a Loss object is possible, with some care, a plain Tensor is often more appropriate (more often than the similar names suggest).

The reviewer (Contributor) replied:

I agree, especially given that the majority of the Loss classes have corresponding loss functions to use, so there really isn't a need to complicate the workflow with a class wrapper. We also don't want to make it easy for a model builder to mistakenly use a reduction method that doesn't work with distribution strategies, so discouraging that is an option too. In any case, the way it's stated here LGTM.
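
For instance (a minimal sketch; the tensors are made up for illustration), the functional form hands back per-example values and leaves the reduction to the caller:

```python
import tensorflow as tf

y_true = tf.constant([[0.0, 1.0], [1.0, 0.0]])
y_pred = tf.constant([[0.1, 0.9], [0.8, 0.3]])

# The function returns one loss value per example, with no class wrapper and
# no built-in reduction; the caller decides how to reduce it, e.g. with
# reduce_mean() before passing the scalar to add_loss().
per_example = tf.keras.losses.mean_squared_error(y_true, y_pred)  # shape [2]
batch_mean = tf.reduce_mean(per_example)
```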

Thanks again for the change and patience!


arnoegw commented Dec 20, 2022

Thank you, @rchao! The GitHub checks are telling me I need review from a code owner, and @MarkDaoust was requested to do it.

fchollet (Contributor) left a comment:

LGTM.

fchollet merged commit c05e0c7 into keras-team:master on Dec 21, 2022