TF GPT2: clearer model variable naming with @unpack_inputs #16311

Merged: 7 commits into huggingface:main on Mar 29, 2022
Conversation

@cakiki (Contributor) commented on Mar 21, 2022

What does this PR do?

Addresses #16051
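For context, #16051 tracks replacing the hand-written input-processing boilerplate in the TF models with the @unpack_inputs decorator, so that call() bodies can use plain argument names instead of inputs["..."] dict lookups. A quick sketch of the calling conventions that machinery has to normalize (standard public API; the gpt2 checkpoint is just an example):

```python
from transformers import GPT2Tokenizer, TFGPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = TFGPT2Model.from_pretrained("gpt2")
enc = tokenizer("Hello world", return_tensors="tf")

# All three call styles must resolve to the same named arguments inside
# call(); @unpack_inputs does that resolution once, in a decorator, so the
# method body reads input_ids directly rather than inputs["input_ids"].
out = model(enc["input_ids"], attention_mask=enc["attention_mask"])  # positional + keyword
out = model(dict(enc))                                               # dict as first argument
out = model(input_ids=enc["input_ids"])                              # keyword-only
```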

Before submitting

Who can review?

@gante @Rocketknight1

@cakiki marked this pull request as ready for review on March 21, 2022 20:41
@cakiki (Contributor, Author) commented on Mar 21, 2022

Some tests failed locally: `12 failed, 37 passed, 1 skipped, 1 warning in 205.75s (0:03:25)`

One such failure is `ValueError: The following keyword arguments are not supported by this model: ['past_key_values']`, even though GPT2 calls this argument `past` rather than `past_key_values`. Shouldn't the test be rewritten?

@HuggingFaceDocBuilderDev commented on Mar 21, 2022

The documentation is not available anymore as the PR was closed or merged.

@gante (Member) commented on Mar 22, 2022

Hey @cakiki 👋 The problem you're seeing is related to another change happening at the same time -- we are refactoring the auto-regressive generation function, and as part of it the `past` input variable in some older models was renamed to `past_key_values`, for consistency across models and frameworks.

(see next comment)

@gante (Member) commented on Mar 22, 2022

@sgugger @patrickvonplaten calling for your opinion here.

TL;DR:

  • In a past generate() refactor PR, I made prepare_inputs_for_generation() uniform across frameworks. In the process, one of the output keys in TF GPT2 was updated from `past` to `past_key_values` -- removing `past` was one of the TODO goals flagged by @patrickvonplaten;
  • However, the current version of the model still expects `past` as the input name, so passing `past_key_values` as a keyword argument raises the error @cakiki is seeing;
  • In PT/FLAX, this input is called `past_key_values`.

To fix it we have two options:

  1. Revert the output of prepare_inputs_for_generation() from `past_key_values` back to `past`;
  2. Update the GPT2 input from `past` to `past_key_values`. It would be an API change, but this variable is mostly used in generate(), right?

I'm pro option 2, but WDYT?
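For concreteness, a toy reproduction of the clash (invented minimal signatures, not the real code):

```python
def call(input_ids=None, past=None, use_cache=None):
    # stand-in for TF GPT-2's call(), which still expects the old name "past"
    return input_ids

# After the refactor, prepare_inputs_for_generation() emits the new key name:
model_inputs = {"input_ids": [[0]], "past_key_values": None}

try:
    call(**model_inputs)
except TypeError as err:
    # Plain Python raises a TypeError here; transformers' input processing
    # surfaces the same mismatch as: ValueError: The following keyword
    # arguments are not supported by this model: ['past_key_values'].
    print(err)
```

Option 1 renames the dict key back; option 2 renames the call() argument, which is the user-facing API change under discussion.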

@gante self-requested a review on March 22, 2022 11:08

@gante (Member) left a review comment:

Other than the issue under discussion, looks solid 🔥

Review thread on src/transformers/models/gpt2/modeling_tf_gpt2.py (outdated, resolved)
@patrickvonplaten (Contributor) commented
I'm in favor of option 1. I don't think I did a good job reviewing the refactor PR - see #15944 (comment).

Backwards compatibility for models such as GPT2 is extremely important, and people do use `past` outside of generate(). I think the easy fix here is to just revert the line above.

The other possibility is to deprecate `past` in general for all TF models, but this should be done over a deprecation cycle so that users are aware of the change.
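One common shape for such a deprecation cycle (a sketch, not the code that was eventually merged):

```python
import warnings

def call(input_ids=None, past_key_values=None, **kwargs):
    # Backward-compatible shim: honor the old name during the deprecation
    # window, but steer callers toward the new one.
    if "past" in kwargs:
        warnings.warn(
            "The `past` argument is deprecated and will be removed in a "
            "future version; use `past_key_values` instead.",
            FutureWarning,
        )
        past_key_values = kwargs.pop("past")
    return past_key_values
```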

@patrickvonplaten (Contributor) commented
Were there other models where we renamed `past` to `past_key_values` without changing the keyword argument name in the forward function?

@gante (Member) commented on Mar 22, 2022

> Were there other models where we renamed `past` to `past_key_values` without changing the keyword argument name in the forward function?

Perhaps; I'm going to check and will open a PR to fix it (including this one).

@cakiki I will fix the issue in a separate PR, and will ping you to rebase with main when it is sorted

@gante (Member) commented on Mar 23, 2022

@cakiki the fix is merged -- rebasing with main should fix the problems you're seeing :)

@gante (Member) commented on Mar 23, 2022

(please rerun the tests locally before merging, and confirm here that they pass)

@cakiki (Contributor, Author) commented on Mar 23, 2022

Rebasing onto main did indeed fix most of the failing tests. The following three are still failing, but they're unrelated to the previous issue.

```
=== short test summary info ===
FAILED tests/gpt2/test_modeling_tf_gpt2.py::TFGPT2ModelTest::test_gpt2_xla_generate - TypeError: function() got an unexpected keyword argument 'jit_compile'
FAILED tests/gpt2/test_modeling_tf_gpt2.py::TFGPT2ModelTest::test_onnx_runtime_optimize - ModuleNotFoundError: No module named 'onnxruntime'
FAILED tests/gpt2/test_modeling_tf_gpt2.py::TFGPT2ModelLanguageGenerationTest::test_lm_generate_gpt2_xla - TypeError: function() got an unexpected keyword argument 'jit_compile'
```

@gante (Member) commented on Mar 23, 2022

@cakiki

  • jit_compile as a flag of tf.function() was added in TF 2.5 -- can you confirm that you have TF >= 2.5? If not, can you try rerunning after updating TF to a version equal to or higher than 2.5? [added a personal TODO to throw an error if TF < 2.5; a quick check is sketched after this list]
  • for the other error, can you try rerunning the tests after reinstalling transformers with pip install -e ".[dev,onnx]"? It should be due to the onnx extra dependencies :)
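A quick local check for the first bullet (tf.function has accepted jit_compile since TF 2.5; on older versions the decorator call below fails with the same TypeError the tests report):

```python
import tensorflow as tf

print(tf.__version__)  # needs to be >= 2.5 for jit_compile

@tf.function(jit_compile=True)  # on TF < 2.5: TypeError, unexpected keyword argument
def double(x):
    return x * 2

print(double(tf.constant(21)))  # tf.Tensor(42, shape=(), dtype=int32)
```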

@cakiki (Contributor, Author) commented on Mar 23, 2022

@gante I uninstalled tensorflow, explicitly pinned it to >=2.5, and that worked. I noticed that pip install .[dev] was trying a bunch of tensorflow versions before finally settling on version 2.3 (fresh virtual env; setup.py pins it to >=2.3).

[screenshot: pip output cycling through several tensorflow versions]

test_onnx_runtime_optimize is the only test still failing:

```
E           onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. In Node, ("tfgp_t2for_sequence_classification_22/GatherV2", GatherV2, "", -1) : ("tfgp_t2for_sequence_classification_22/score/Tensordot:0": tensor(float),"tfgp_t2for_sequence_classification_22/sub:0": tensor(int32),"tfgp_t2for_sequence_classification_22/sub/y:0": tensor(int32),) -> ("logits": tensor(float),) , Error No Op registered for GatherV2 with domain_version of 10

../venv/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:370: InvalidGraph

ERROR    tf2onnx.tfonnx:tfonnx.py:303 Failed to convert node 'tfgp_t2for_sequence_classification_22/GatherV2' (fct=<bound method GatherV2.version_1 of <class 'tf2onnx.onnx_opset.tensor.GatherV2'>>)
'OP=GatherV2\nName=tfgp_t2for_sequence_classification_22/GatherV2\nInputs:\n\ttfgp_t2for_sequence_classification_22/score/Tensordot:0=Reshape, [-1, -1, 2], 1\n\ttfgp_t2for_sequence_classification_22/sub:0=Sub, [-1], 6\n\ttfgp_t2for_sequence_classification_22/GatherV2/axis:0=Const, [], 6\nOutpus:\n\ttfgp_t2for_sequence_classification_22/GatherV2:0=[-1, 2], 1'
Traceback (most recent call last):
  File "/media/ssd/BIGSCIENCE/venv/lib/python3.6/site-packages/tf2onnx/tfonnx.py", line 292, in tensorflow_onnx_mapping
    func(g, node, **kwargs, initialized_tables=initialized_tables, dequantize=dequantize)
  File "/media/ssd/BIGSCIENCE/venv/lib/python3.6/site-packages/tf2onnx/onnx_opset/tensor.py", line 444, in version_1
    utils.make_sure(node.get_attr_value("batch_dims", 0) == 0, err_msg)
  File "/media/ssd/BIGSCIENCE/venv/lib/python3.6/site-packages/tf2onnx/utils.py", line 260, in make_sure
    raise ValueError("make_sure failure: " + error_msg % args)
ValueError: make_sure failure: Opset 12 required for batch_dims attribute of GatherV2
```
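The traceback is explicit about the cause: GatherV2 with a batch_dims attribute needs ONNX opset 12, while the conversion here ran with domain_version 10. When converting manually, tf2onnx lets you request a newer opset; a sketch, where `model` and `spec` stand in for a Keras model and its input signature:

```python
import tf2onnx

# Request opset 12 so GatherV2's batch_dims attribute is representable.
onnx_model, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=12)
```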

@cakiki (Contributor, Author) commented on Mar 23, 2022

If it helps, here is my environment:

- `transformers` version: 4.18.0.dev0
- Platform: Linux-4.15.0-171-generic-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- Huggingface_hub version: 0.4.0
- PyTorch version (GPU?): 1.10.1+cu102 (True)
- Tensorflow version (GPU?): 2.6.2 (False)
- Flax version (CPU?/GPU?/TPU?): 0.3.5 (cpu)
- Jax version: 0.2.17
- JaxLib version: 0.1.69

@gante (Member) commented on Mar 24, 2022

@cakiki It is settling on TF 2.3 because of python 3.6-related limitations in several packages :( We are actually having an internal discussion about potentially dropping support for python 3.6, since it is causing issues with both TF and PT (e.g. TF 2.8 requires python >= 3.7).

As for the onnx test, let's not worry about it, since it is failing on master as well.

@gante (Member) commented on Mar 24, 2022

I see that there are further errors in the tests; I will take a look (I think I know how to fix them). I will ping here again when the related fix is merged.

Hah, it seems like you got the best model to apply @unpack_inputs on :D

@gante (Member) commented on Mar 29, 2022

@cakiki I've been working on issues related to TF GPT-2, and it seems like I've also solved the errors here. I've tested locally with these changes on top of main, and all tests pass (except the onnx one, which is also failing on main).

We've recently renamed our branch from master to main, so CI won't turn green until we rebase -- which I've just pushed :)

@cakiki (Contributor, Author) commented on Mar 29, 2022

@gante Thank you for the update!

@gante (Member) commented on Mar 29, 2022

The failing test is being tracked internally -- merging

@cakiki thank you for the contribution! (and for the patience)

@gante merged commit ee18d4d into huggingface:main on Mar 29, 2022