[WIP] Ensure TF model configs can be converted to proper JSON #14415
Conversation
Thanks for this! I see a few failing tests, but I think something like this should work. One thing I'd suggest: I think the
Yes, I think these here need to be fixed individually, as the config indeed is not JSONifiable. Regarding your last comment, I'll try to see if I can add that and a test case for it later.
That's strange, the config for at least some of them seems to convert fine for me. For example, this works (after installing from master):
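The snippet the commenter ran was not captured in this thread. As a rough stand-in (not the original code), here is a minimal illustration of what "the config converts fine" means in practice: a config is JSON-serializable exactly when `json.dumps` accepts its dict form, and fails with a `TypeError` when it contains a non-serializable value.

```python
import json

# A config of plain Python scalars serializes without issue.
good_config = {"hidden_size": 768, "num_layers": 12}
serialized = json.dumps(good_config)

# A config holding a non-serializable object (e.g. a raw dtype or tensor)
# is what "not JSONifiable" refers to: json.dumps raises TypeError.
bad_config = {"hidden_size": 768, "dtype": object()}
try:
    json.dumps(bad_config)
    failed = False
except TypeError:
    failed = True
```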
It seems the problem is rather on loading the config: when running PretrainedConfig.from_dict(), the returned config is of class PretrainedConfig, NOT of the specific model's config class; hence getattr calls to non-standard properties fail. Is there a way to get the correct config class when calling from_config()? Otherwise, we would need to save the class name as part of get_config(), and when calling from_config() use it to map to the correct class. EDIT: I think we can use cls.config_class. Let's see if the tests go through.
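To make the `cls.config_class` idea concrete, here is a minimal, self-contained sketch (with hypothetical stand-in classes, not the real transformers implementation) of the pattern being discussed: the model class records its concrete config class, so `from_config()` can rebuild the correct subclass instead of the generic base class.

```python
import json


class PretrainedConfig:
    """Stand-in for the generic base config class."""

    def to_dict(self):
        return dict(self.__dict__)

    @classmethod
    def from_dict(cls, d):
        # Instantiate *cls*, so subclasses round-trip as themselves.
        cfg = cls.__new__(cls)
        cfg.__dict__.update(d)
        return cfg


class BertLikeConfig(PretrainedConfig):
    """Stand-in for a model-specific config with non-standard attributes."""

    def __init__(self, hidden_size=768):
        self.hidden_size = hidden_size


class TFModel:
    # Known at class-definition time, so from_config() can pick the
    # right subclass instead of the generic PretrainedConfig.
    config_class = BertLikeConfig

    def __init__(self, config):
        self.config = config

    def get_config(self):
        # Plain, JSON-serializable dict.
        return self.config.to_dict()

    @classmethod
    def from_config(cls, config_dict):
        # cls.config_class (not PretrainedConfig) preserves the subclass.
        return cls(cls.config_class.from_dict(config_dict))


model = TFModel(BertLikeConfig(hidden_size=128))
round_tripped = json.loads(json.dumps(model.get_config()))
restored = TFModel.from_config(round_tripped)
```

Calling `PretrainedConfig.from_dict` directly here would instead yield the base class, which is exactly the failure mode described above.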
@Zahlii Since we know things are broken, we're going to merge this PR urgently, then quickly follow it up with testing. I'll tag you in the PR - we're planning a revamp of our TF testing, since the tests that would have caught this were marked as `tooslow`.
Thanks for the fix!
@Zahlii Another update, and a change of plan! We're going to revert the last commit, do the fixes in this PR, and I might add some testing to this PR before it's merged. Is that okay with you?
Sure, go ahead and let me know if I can support further.
Cool, thank you!
Small comment without having checked the code: I observed that the SavedModel format traces all functions by default. For my use cases, I always disabled that, because it added an enormous overhead, and with correct config handling it wasn't required. On the one hand tracing removes the need for the config handling; on the other hand it is much slower. How is this currently handled, both inside tests and elsewhere? https://www.tensorflow.org/api_docs/python/tf/keras/models/save_model
@@ -28,6 +28,7 @@
from requests.exceptions import HTTPError
from transformers import is_tf_available
from transformers.models.auto import get_values
from transformers.testing_utils import tooslow  # noqa: F401
Why not leave it in the imports below?
This caused style issues because of the #noqa tag
But if it's not used anymore, why not remove it? I'm confused.
Ah, I'm sorry! I thought it should stay imported because it might be needed in that file, but that's probably a stupid way to do things. Removing it!
This is a good point - the short answer is that we want it to work when people save their model like that, but like you we found it much too slow to test every model with it in the CI. The solution we went with in this PR was to keep it as a 'core' test only for the most commonly used model classes (BERT, GPT2 and BART), and hope that if there's a problem with it, it shows up there.
What does this PR do?
This is an extension to https://github.com/huggingface/transformers/pull/14361/files, which hopefully will prevent errors such as #14403 from going unnoticed.
Who can review?
@Rocketknight1 I assume this test (if run) will fail for quite a few architectures. I'll try to see if I can provide a fix on this PR; feel free to review/comment.