Bug: with `with strategy.scope():`, BERT output loses its shape
#870
Comments
I was able to replicate the same behaviour, although this does not look like a blocker, as both models (the model built inside strategy.scope() and the model built without the scope) produce the same output with the same shape (1, 128, 2). Please find attached the gist. Please let us know if this blocks you. Thank you!
@singhniraj08 What if we put some layer after that which requires the shape from the previous layer, maybe? Here is the notebook; you may find it helpful: https://www.kaggle.com/code/maifeeulasad/tfhub-bert-with-scope/
I tried the same setup, adding a layer after the encoder. The Flatten layer resulted in an error, for which the suggested workaround is to use the pooled output from the BERT encoder. This will result in flattened output from the encoder.
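To illustrate why a layer that needs a fully defined shape can fail on (None, None, 768) but not on (None, 128, 768), here is a minimal pure-Python sketch of the arithmetic involved. The helper name `flattened_size` is hypothetical and is not part of TensorFlow or Keras; it only mirrors the idea that flattening needs every non-batch dimension to be known up front. The (None, 768) shape used for the pooled output is an assumption based on the hidden size of 768 reported in this thread.

```python
from math import prod
from typing import Optional, Sequence


def flattened_size(shape: Sequence[Optional[int]]) -> Optional[int]:
    """Return the per-example feature count after flattening, or None if unknown.

    `shape` is a static shape with the batch dimension first. None marks a
    dimension that is not known until run time.
    """
    feature_dims = shape[1:]  # skip the batch dimension
    if any(dim is None for dim in feature_dims):
        # At least one non-batch dimension is unknown, so the flattened
        # size cannot be computed statically.
        return None
    return prod(feature_dims)


# Sequence output with a known sequence length: the size is well defined.
print(flattened_size((None, 128, 768)))

# Sequence output as reported under strategy.scope(): the size is unknown.
print(flattened_size((None, None, 768)))

# Pooled output (assumed shape): already flat, no sequence dimension involved.
print(flattened_size((None, 768)))
```

Under these assumptions, the pooled output sidesteps the problem because it has no sequence dimension that could be reported as None.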
@alenarepina, can you please look into this issue, where the output shape from the BERT encoder shows (None, None, 768) instead of (None, 128, 768) when using the tf.distribute.MirroredStrategy() scope?
@singhniraj08, here is a gist: https://colab.research.google.com/drive/1NLeVirYdVeHGit7QGsIvook6K_ezFtgB?usp=sharing You aren't using the And maybe in some updated version, it works, but the thing is, it breaks more vital tf features, like
It's best to forward this type of question to the main TensorFlow repo. On the tfhub.dev side, we are hosting models pre-trained by various publishers. The semantics of how code behaves under different distribution strategies is under the control of the TensorFlow library.
What happened?
I was trying to use BERT hosted at:
It should give me multiple outputs, and their shapes should look something like (None, 128, 768). Sometimes it works, but inside `with strategy.scope():` it loses its shape and becomes (None, None, 768).
Relevant code
Relevant log output
tensorflow_hub Version
0.12.0 (latest stable release)
TensorFlow Version
2.7
Other libraries
tensorflow-text==2.7.0
Python Version
3.x
OS
Linux