Try adding an error if activation and loss are mismatched #1008
Conversation
Force-pushed from acb56a9 to 26df444.

/gcbrun
Force-pushed from 26df444 to c841d8b.

/gcbrun
Force-pushed from c841d8b to 8366616.
Example output:

```
>>> classifier.compile(loss="sparse_categorical_crossentropy")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 82, in compile
    self._check_from_logits_mismatch()
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 72, in _check_from_logits_mismatch
    raise ValueError(
ValueError: The `loss` passed to `compile()` expects softmax probability output, but the model is configured to output logits (`activation=None`). This will not converge! Pass `from_logits=True` to your loss, e.g. `loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True)`.
```
Force-pushed from 8366616 to 0b1e982.

/gcbrun
No problem with this, but I wonder if this is specific to KerasNLP or would apply equally to any Keras model. Is this the level of hand-holding that Keras users expect?
IMO this is one of the larger pain points in Keras (see here for example), and with our default compilation in KerasNLP, we further obfuscate these tricky choices from the entry-level user. From talking with @fchollet, the reason this is not in core Keras is that there is no well-defined way to get the output activation of a model as a general concept. That is not true for our classifiers, where the model output activation is well defined. Put another way: as a Keras user (yours truly), I am often frustrated that I get zero feedback from the framework when I misconfigure this. Opinions may vary, but I would greatly appreciate this as an end user.
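The mismatch check discussed here can be sketched in a few lines. This is a hypothetical, simplified version for illustration: the `check_from_logits_mismatch` helper name and the minimal stand-in loss class below are assumptions, not the actual keras-nlp implementation (which lives in `Task.compile()` and resolves real `keras.losses` objects).

```python
class SparseCategoricalCrossentropy:
    """Minimal stand-in for `keras.losses.SparseCategoricalCrossentropy`,
    included only so this sketch is self-contained."""

    def __init__(self, from_logits=False):
        self.from_logits = from_logits


def check_from_logits_mismatch(activation, loss):
    """Raise if the loss's `from_logits` flag disagrees with the model's
    output activation. Here `activation=None` means the model emits raw
    logits, and `activation="softmax"` means it emits probabilities.
    Hypothetical helper, not the exact keras-nlp code.
    """
    # Cross-entropy-style losses carry a `from_logits` attribute;
    # default to False for losses that expect probabilities.
    from_logits = getattr(loss, "from_logits", False)

    if activation is None and not from_logits:
        raise ValueError(
            "The `loss` passed to `compile()` expects softmax probability "
            "output, but the model is configured to output logits "
            "(`activation=None`). This will not converge! Pass "
            "`from_logits=True` to your loss."
        )
    if activation == "softmax" and from_logits:
        raise ValueError(
            "The `loss` passed to `compile()` expects logits "
            "(`from_logits=True`), but the model already outputs softmax "
            "probabilities (`activation='softmax'`)."
        )
```

Pairing `activation=None` with a probability-expecting loss raises, while the two matched configurations (logits with `from_logits=True`, softmax with `from_logits=False`) pass silently.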
Great, ship away!