Conversation

@mattdangerw (Member)

Example output:

```
>>> classifier.compile(loss="sparse_categorical_crossentropy")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 82, in compile
    self._check_from_logits_mismatch()
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 72, in _check_from_logits_mismatch
    raise ValueError(
ValueError: The `loss` passed to `compile()` expects softmax probability output, but the model is configured to output logits (`activation=None`). This will not converge! Pass `from_logits=True` to your loss, e.g. `loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True)`.
```
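The check behind that traceback is simple in spirit. Below is a minimal, self-contained sketch; the function name `check_from_logits_mismatch` and its exact logic here are illustrative assumptions for this comment thread, not the actual KerasNLP implementation:

```python
def check_from_logits_mismatch(loss, activation):
    """Raise if `loss` expects probabilities but the model outputs logits.

    Illustrative sketch only: string losses like
    "sparse_categorical_crossentropy" default to from_logits=False,
    i.e. they expect softmax probability output.
    """
    loss_expects_probabilities = (
        loss == "sparse_categorical_crossentropy"
        or getattr(loss, "from_logits", True) is False
    )
    # In this sketch, `activation=None` means the model emits raw logits.
    model_outputs_logits = activation is None
    if loss_expects_probabilities and model_outputs_logits:
        raise ValueError(
            "The `loss` passed to `compile()` expects softmax probability "
            "output, but the model is configured to output logits "
            "(`activation=None`). This will not converge! Pass "
            "`from_logits=True` to your loss."
        )
```

Because a classifier task knows its own head activation, `compile()` can run a check like this up front instead of letting training silently fail to converge.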

@mattdangerw (Member, Author)

/gcbrun

@mattdangerw (Member, Author)

/gcbrun

@mattdangerw (Member, Author)

/gcbrun

@jbischof (Contributor) left a comment


No problem with this, but I wonder whether this is specific to KerasNLP or would apply equally to any Keras model. Is this the level of hand-holding that Keras users expect?

@mattdangerw (Member, Author)

> No problem with this, but I wonder whether this is specific to KerasNLP or would apply equally to any Keras model. Is this the level of hand-holding that Keras users expect?

IMO this is one of the larger pain points in Keras (see here for example), and with our default compilation in KerasNLP, we obscure these tricky choices even further from the entry-level user.

From talking with @fchollet, the reason this is not in core Keras is that there is no well-defined way to get the output activation of an arbitrary model. That is not true for our classifiers, where the output activation is well defined.

Put another way: as a Keras user (yours truly), I am often frustrated that I get zero feedback from the framework when I misconfigure this. Opinions may vary, but I would greatly appreciate this check as an end user.
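To build intuition for why the mismatch is silent but fatal, here is a small standalone illustration (pure Python, no Keras, purely for intuition): a probability-expecting cross-entropy applied to raw logits produces nonsense, such as a negative loss.

```python
import math

def softmax(logits):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_crossentropy(probs, label):
    # Expects `probs` to be a probability distribution (from_logits=False).
    return -math.log(probs[label])

logits = [2.0, -1.0, 0.5]  # raw model output (activation=None)
label = 0

# Correct: convert logits to probabilities first (~0.24, a sensible loss).
correct = sparse_crossentropy(softmax(logits), label)
# Wrong: treat raw logits as probabilities; -log(2.0) is negative.
wrong = sparse_crossentropy(logits, label)
```

Nothing in the wrong version crashes; training just optimizes a meaningless objective, which is exactly why an explicit check at `compile()` time is worth having.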

@jbischof (Contributor)

Great, ship away

@mattdangerw mattdangerw merged commit c71c9de into keras-team:master Apr 28, 2023
chenmoneygithub pushed a commit that referenced this pull request Apr 28, 2023