model.save() does not save Keras model that includes DistilBert layer #4444
Comments
Same issue.
Hi, we don't fully support saving/loading these models using Keras' save/load methods (yet). In the meantime, please use …
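The recommended alternative is elided above; it is very likely transformers' own save_pretrained / from_pretrained methods, which bypass Keras serialization entirely. A minimal sketch (checkpoint name and output directory are illustrative):

```python
# Sketch of the save_pretrained / from_pretrained workaround. The checkpoint
# and directory names here are examples, not from the thread.
from transformers import TFDistilBertModel

model = TFDistilBertModel.from_pretrained("distilbert-base-uncased")
model.save_pretrained("./distilbert_saved")  # writes config + TF weights
reloaded = TFDistilBertModel.from_pretrained("./distilbert_saved")
```

Note this saves the transformer model itself; a surrounding Keras model (extra heads, etc.) still needs separate handling, which is what the rest of the thread is about.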
Hello @LysandreJik, could you point me in a direction and tell me a little more about the implementation procedure, so that I could do some research and possibly implement the methods? If everything goes well, I could make a pull request that might benefit others as well. Sabber
I had this exact error. I got around it by saving the weights together with the code that creates the model. After training your model, save the weights. When you want to reuse the model later, run your model-building code again, then load the saved weights.
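The pattern described above can be sketched with a small stand-in Keras model (the builder function and file name are illustrative; with DistilBERT, build_model would construct the same architecture as in training):

```python
# Workaround pattern: keep the model-building code, persist only the weights,
# and rebuild + reload later. Stand-in model for illustration.
import numpy as np
import tensorflow as tf

def build_model():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

model = build_model()
model.predict(np.zeros((2, 4)))            # (train here in practice)
model.save_weights("my_model.weights.h5")  # weights only, no graph/config

# Later: re-run the same building code, then restore the trained weights.
restored = build_model()
restored.load_weights("my_model.weights.h5")
```

This sidesteps get_config entirely, at the cost of having to ship the model-building code alongside the weights file.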
Thanks, works perfectly!
Does this work now with newer versions? |
I am also facing the same issue. Any solution?
The issue still occurs on TF 2.6.0 which is very disappointing.
|
This still occurs, not only with distilbert but also many others. I don't see why this issue was closed: the described workaround is quite cumbersome and error-prone, and I don't see why this cannot be implemented inside the library, given that the configuration should already be in place to allow overriding the get_config / from_config methods?
Hi, TF maintainer here! You're right, and we're going to reopen this one. We're very constrained on time right now, though - I'll try to investigate it as soon as I get the chance. |
Thanks for reopening this. I think I was able to work around it by using the model.distilbert property, which itself is the base layer. Maybe it would be as simple as returning the base layer's get_config / from_config with some tweaks?
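For context, this is roughly what overriding get_config / from_config looks like on a custom Keras layer; the suggested fix would do something similar on the Hugging Face base layer (ScaleLayer here is purely illustrative, not part of either library):

```python
# A custom layer that serializes its constructor arguments so Keras can
# rebuild it. Illustrative only; not the actual HF implementation.
import tensorflow as tf

class ScaleLayer(tf.keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Record constructor arguments alongside the base-layer config.
        config = super().get_config()
        config["factor"] = self.factor
        return config

layer = ScaleLayer(factor=3.0)
rebuilt = ScaleLayer.from_config(layer.get_config())
```

Without such an override, Keras falls back to the base implementation, which cannot reconstruct layers whose __init__ takes extra arguments, and saving fails.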
@Zahlii You are correct: the underlying issue is simply that …
We've attempted a patch at #14361 - if anyone has any suggestions, or wants to try it out, please let us know! You can test the PR branch with |
The patch has now been merged. It'll be in the next release, or if anyone else is encountering this issue before then, you can install from master with |
Since the patch in #14361 has been reverted, is there a timeline for a fix? (Or is there a known workaround one could use?) Thanks :) |
Hi @Rocketknight1, thanks for your reply! You are right, it does work when saving in the TensorFlow format (not HDF5). This does solve the issue I was facing. What did not work for me was this (minimal example adapted from #14430):
Output:
and then it fails with
Hi @skbaur, your code runs fine for me! Here are my outputs:
Can you try, in order:
and let me know if either of those fixes it for you?
Option 1 already seems to work (installing transformers from master with pip install git+https://github.com/huggingface/transformers.git, but not updating TF). The error reappears when downgrading back to transformers 4.12.5.
@skbaur It seems like one of the relevant PRs didn't make it into the release, in that case - please use the master version for now, and hopefully once 4.13 is released you can just use that instead! |
🐛 Bug
Information
I am trying to build a Keras Sequential model in which I use DistilBERT as a non-trainable embedding layer. The model compiles and fits well, and even the predict method works. But when I want to save it using model.save("model.h5"), it fails and shows the following error:
The language I am using the model in is English.
The problem arises when using my own modified scripts: (give details below)
The task I am working on uses my own dataset.
To reproduce
Steps to reproduce the behavior:
You can get the same error if you try:
An interesting observation:
If you save the model without specifying ".h5", like … , it saves the model in the TensorFlow saved_model format and creates folders (assets (empty), variables, and some index files). But if you try to load the model, it produces different errors related to DistilBert/Bert. It may be due to some naming inconsistency (input_ids vs. inputs, see below) inside the DistilBert model.
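The format difference described in that observation can be shown with a stand-in model (on the TF 2.x / Keras 2 versions discussed in this thread; directory name is illustrative):

```python
# Omitting the ".h5" suffix makes Keras write a SavedModel directory
# (assets/, variables/, index files) instead of a single HDF5 file.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])

model.save("saved_model_dir")  # SavedModel directory, not one .h5 file
reloaded = tf.keras.models.load_model("saved_model_dir")
```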
Expected behavior
I expect to have a normal saving and loading of the model.
Environment info
transformers version: 2.9.1