Try adding an error if activation and loss are mismatched #1008
Conversation
Force-pushed from acb56a9 to 26df444.

/gcbrun
Force-pushed from 26df444 to c841d8b.

/gcbrun
Force-pushed from c841d8b to 8366616.
Example output:

```
>>> classifier.compile(loss="sparse_categorical_crossentropy")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 82, in compile
    self._check_from_logits_mismatch()
  File "/home/mattdangerw/checkout/keras-nlp/keras_nlp/models/task.py", line 72, in _check_from_logits_mismatch
    raise ValueError(
ValueError: The `loss` passed to `compile()` expects softmax probability output, but the model is configured to output logits (`activation=None`). This will not converge! Pass `from_logits=True` to your loss, e.g. `loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True)`.
```
Force-pushed from 8366616 to 0b1e982.

/gcbrun
No problem with this, but I wonder if this is specific to KerasNLP or would apply equally to any Keras model. Is this the level of hand-holding that Keras users expect?
IMO this is one of the larger pain points in Keras (see here for example), and with our default compilation in KerasNLP, we further obfuscate these tricky choices from the entry-level user. From talking with @fchollet, the reason this is not in core Keras is that there is no well-defined way to get the output activation of a model as a general concept. That is not true for our classifiers, where the model output activation is well defined. Put another way: as a Keras user (yours truly), I am often frustrated that I get zero feedback from the framework when I misconfigure this. Opinions may vary, but I would greatly appreciate this as an end user.
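The mismatch check discussed here can be sketched in a few lines. This is a hypothetical, simplified version for illustration: the `check_from_logits_mismatch` helper name and the minimal stand-in loss class below are assumptions, not the actual keras-nlp implementation (which lives in `Task.compile()` and resolves real `keras.losses` objects).

```python
class SparseCategoricalCrossentropy:
    """Minimal stand-in for `keras.losses.SparseCategoricalCrossentropy`,
    included only so this sketch is self-contained."""

    def __init__(self, from_logits=False):
        self.from_logits = from_logits


def check_from_logits_mismatch(activation, loss):
    """Raise if the loss's `from_logits` flag disagrees with the model's
    output activation. Here `activation=None` means the model emits raw
    logits, and `activation="softmax"` means it emits probabilities.
    Hypothetical helper, not the exact keras-nlp code.
    """
    # Cross-entropy-style losses carry a `from_logits` attribute;
    # default to False for losses that expect probabilities.
    from_logits = getattr(loss, "from_logits", False)

    if activation is None and not from_logits:
        raise ValueError(
            "The `loss` passed to `compile()` expects softmax probability "
            "output, but the model is configured to output logits "
            "(`activation=None`). This will not converge! Pass "
            "`from_logits=True` to your loss."
        )
    if activation == "softmax" and from_logits:
        raise ValueError(
            "The `loss` passed to `compile()` expects logits "
            "(`from_logits=True`), but the model already outputs softmax "
            "probabilities (`activation='softmax'`)."
        )
```

Pairing `activation=None` with a probability-expecting loss raises, while the two matched configurations (logits with `from_logits=True`, softmax with `from_logits=False`) pass silently.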
Great, ship away!