
[WIP] Ensure TF model configs can be converted to proper JSON #14415

Merged: 14 commits merged into huggingface:master on Nov 17, 2021

Conversation

@Zahlii (Contributor) commented Nov 16, 2021

What does this PR do?

This is an extension to https://github.com/huggingface/transformers/pull/14361/files, which hopefully will prevent errors such as #14403 from going unnoticed.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@Rocketknight1 I assume this test (if run) will fail for quite a few architectures; I will try to see if I can provide a fix on this PR. Feel free to review/comment.

@Zahlii Zahlii changed the title Ensure TF model configs can be converted to proper JSON Draft: Ensure TF model configs can be converted to proper JSON Nov 16, 2021
@Zahlii Zahlii changed the title Draft: Ensure TF model configs can be converted to proper JSON [WIP] Ensure TF model configs can be converted to proper JSON Nov 16, 2021
@Rocketknight1 (Member)

Thanks for this! I see a few failing tests, but I think something like this should work. One thing I'd suggest: I think the from_config() method should probably work whether a dict or a true config object is passed. Can we check the type of the input, and only do the conversion if it's actually a dict?
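
A minimal sketch of that suggestion (hypothetical, not the exact code merged in this PR; it assumes cls.config_class points at the model's concrete config class, which comes up again further down):

@classmethod
def from_config(cls, config, **kwargs):
    # Accept either a plain dict (as produced by Keras get_config()) or an
    # already-instantiated config object; only convert in the dict case.
    if isinstance(config, dict):
        config = cls.config_class.from_dict(config)
    return cls(config, **kwargs)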

@Zahlii (Contributor, Author) commented Nov 16, 2021

Yes, I think these need to be fixed individually, as the config indeed is not JSON-serializable. Regarding your last comment, I'll try to see if I can add that and a test case for it later.


FAILED tests/test_modeling_tf_distilbert.py::TFDistilBertModelTest::test_save_load_config
FAILED tests/test_modeling_tf_funnel.py::TFFunnelModelTest::test_save_load_config
FAILED tests/test_modeling_tf_gpt2.py::TFGPT2ModelTest::test_save_load_config
FAILED tests/test_modeling_tf_flaubert.py::TFFlaubertModelTest::test_save_load_config
FAILED tests/test_modeling_tf_funnel.py::TFFunnelBaseModelTest::test_save_load_config
FAILED tests/test_modeling_tf_led.py::TFLEDModelTest::test_save_load_config
FAILED tests/test_modeling_tf_xlm.py::TFXLMModelTest::test_save_load_config
FAILED tests/test_modeling_tf_xlnet.py::TFXLNetModelTest::test_save_load_config

@Rocketknight1 (Member) commented Nov 16, 2021

That's strange, the config for at least some of them seems to convert fine for me. For example, this works (after installing from master):

from transformers import TFAutoModel
import json

model = TFAutoModel.from_pretrained('distilbert-base-cased')
json.dumps(model.get_config().to_dict())

@Zahlii (Contributor, Author) commented Nov 16, 2021

It seems the problem is rather on loading the config: when running PretrainedConfig.from_dict(), the returned config will be of class PretrainedConfig, NOT of the specific model's class; hence getattr calls to non-standard properties fail.

Is there a way to get the correct config class when calling from_config()? Otherwise, we would need to save the class name as part of get_config() and use it in from_config() to map to the correct class.

EDIT: I think we can use cls.config_class. Let's see if the tests go through.

src/transformers/models/distilbert/modeling_tf_distilbert.py:347: in __init__
    self.num_hidden_layers = config.num_hidden_layers
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = PretrainedConfig {
  "activation": "gelu",
  "attention_dropout": 0.1,
  "dim": 32,
  "dropout": 0.1,
  "hidden_act": ..._classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.13.0.dev0",
  "vocab_size": 99
}

key = 'num_hidden_layers'

    def __getattribute__(self, key):
        if key != "attribute_map" and key in super().__getattribute__("attribute_map"):
            key = super().__getattribute__("attribute_map")[key]
>       return super().__getattribute__(key)
E       AttributeError: 'PretrainedConfig' object has no attribute 'num_hidden_layers'
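
For illustration, the class-loss issue described above can be reproduced outside the test suite with a small snippet like this (a sketch, not code from the PR; it assumes a DistilBERT config, whose attribute_map provides num_hidden_layers):

from transformers import DistilBertConfig, PretrainedConfig

config = DistilBertConfig(dim=32)
# Round-tripping through the base class loses the concrete config type,
# so attribute_map lookups such as num_hidden_layers no longer resolve.
restored = PretrainedConfig.from_dict(config.to_dict())
print(type(restored).__name__)   # PretrainedConfig, not DistilBertConfig
print(config.num_hidden_layers)  # resolves via DistilBertConfig.attribute_map
# restored.num_hidden_layers would raise the AttributeError shown above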

@Rocketknight1 (Member) commented Nov 17, 2021

@Zahlii Since we know things are broken, we're going to merge this PR urgently, and then quickly work on testing to follow it up. I'll tag you in the PR - we're planning a revamp of TF testing, since the tests that would have caught this were marked as tooslow.

@sgugger (Collaborator) left a comment

Thanks for the fix!

@Rocketknight1 (Member)

@Zahlii Another update, and a change of plan! We're going to revert the last commit, do the fixes in this PR, and I might add some testing to this PR before it's merged. Is that okay with you?

@Zahlii (Contributor, Author) commented Nov 17, 2021

Sure, go ahead and let me know if I can support further.

@Rocketknight1 (Member)

Cool, thank you!

@Zahlii (Contributor, Author) commented Nov 17, 2021

Small comment without having checked the code: I observed that the SavedModel format by default traces all functions. For my use cases, I always disabled that because it added an enormous overhead, and with correct config handling it wasn't required. On the one hand tracing removes the need for the config handling; on the other hand it is much slower. How is this currently handled, both inside tests and elsewhere? https://www.tensorflow.org/api_docs/python/tf/keras/models/save_model
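
(For reference, the flag in question is save_traces; a minimal sketch of disabling it, assuming TF >= 2.4 and an already-built Keras model named model:)

import tensorflow as tf

# Skip tracing the forward-pass functions when exporting; the model can then
# only be reloaded in code that can rebuild it from its config.
tf.keras.models.save_model(model, "exported_model", save_traces=False)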

@@ -28,6 +28,7 @@
from requests.exceptions import HTTPError
from transformers import is_tf_available
from transformers.models.auto import get_values
from transformers.testing_utils import tooslow # noqa: F401
Collaborator

Why not leave it in the imports below?

Member

This caused style issues because of the # noqa tag.

Collaborator

But if it's not used anymore, why not remove it? I'm confused.

Member

Ah, I'm sorry! I thought it should stay imported because it might be needed in that file, but that's probably a stupid way to do things. Removing it!


@Rocketknight1 Rocketknight1 merged commit 1991da0 into huggingface:master Nov 17, 2021
@Rocketknight1 (Member)

> Small comment without having checked the code: I observed that the SavedModel format by default traces all functions. For my use cases, I always disabled that because it added an enormous overhead, and with correct config handling it wasn't required. On the one hand tracing removes the need for the config handling; on the other hand it is much slower. How is this currently handled, both inside tests and elsewhere? https://www.tensorflow.org/api_docs/python/tf/keras/models/save_model

This is a good point. The short answer is that we want it to work when people save their model like that, but like you we found it was much too slow to test every model with it in the CI. The solution we went with in this PR was to keep it as a 'core' test only for the most commonly used model classes (BERT, GPT2 and BART), and hope that if there's a problem with it, it shows up there.
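
(To make the cost concrete, here is a rough sketch of the kind of SavedModel round trip such a core test exercises; this is illustrative only, not the test added in this PR, it uses a deliberately tiny BERT config, and whether the reload needs custom_objects can depend on the TF/transformers versions:)

import tempfile

import tensorflow as tf
from transformers import BertConfig, TFBertModel

# Tiny config so the save/reload stays reasonably fast.
config = BertConfig(hidden_size=32, num_hidden_layers=2, num_attention_heads=2,
                    intermediate_size=37, vocab_size=99)
model = TFBertModel(config)
model(model.dummy_inputs)  # build the weights

with tempfile.TemporaryDirectory() as tmpdir:
    model.save(tmpdir)  # full SavedModel export, including traced functions
    reloaded = tf.keras.models.load_model(tmpdir)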
