Tapas tf #13393

kamalkraj · 2021-09-02T15:51:54Z

What does this PR do?

TF Tapas

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@NielsRogge @sgugger @LysandreJik @Rocketknight1

kamalkraj · 2021-09-02T15:52:46Z

kamalkraj · 2021-09-02T15:58:03Z

pending

adding unit test cases
adding model to TFAutoTableQuestionAnswering pipeline
adding unit test for the pipeline
fix all the code comments copy-pasted from PT to TF
update tapas.rst with TF sample code
add `# Copied` comments
push TF weights to the official model hub (help needed)
update MLM model, Add TAPAS MLM-only models #13408
...

kamalkraj · 2021-09-08T14:05:18Z

@LysandreJik
ready for review 🤗

LysandreJik · 2021-09-08T14:54:49Z

Great news @kamalkraj!

@NielsRogge and @Rocketknight1, could you take a look at this?

docs/source/model_doc/tapas.rst

src/transformers/modeling_tf_pytorch_utils.py

src/transformers/models/tapas/configuration_tapas.py

tests/test_modeling_tf_common.py

src/transformers/models/tapas/modeling_tf_tapas.py

Rocketknight1 · 2021-09-20T18:21:05Z

Hi, I'd like to apologize for not getting to this sooner! It's a huge PR, but I'll try to get through it today or tomorrow and give feedback where I can.

src/transformers/models/tapas/modeling_tf_tapas.py

kamalkraj · 2021-11-22T17:07:52Z

Thanks, @Rocketknight1
Fixed the issue.

@NielsRogge
Could you please help me to upload the model weights to the official hub?

NielsRogge · 2021-11-23T14:13:49Z

@kamalkraj sure!

Btw I just discovered a (tiny) bug in the forward pass of TAPAS (when config.select_one_column is set to False). Let me open a PR for it first, so that you can include the fix in this PR.

NielsRogge · 2021-11-29T13:40:51Z

@kamalkraj I'm uploading all TAPAS TF checkpoints to the hub.

Can you resolve the conflict shown above? Also, can you confirm the test_pt_tf_model_equivalence tests pass? They don't seem to pass on CI.

kamalkraj · 2021-11-29T14:01:42Z

@NielsRogge

kamalkraj · 2021-11-29T14:07:21Z

@NielsRogge
The tests in CI failed due to a version conflict between TensorFlow Probability and TensorFlow. Should I pin the version?

NielsRogge · 2021-11-29T14:12:41Z

The TF version should not be pinned, but the TF probability version can be pinned.

kamalkraj · 2021-11-29T14:19:28Z

@NielsRogge
In setup.py, the TensorFlow version is specified as >= 2.3:

https://github.com/kamalkraj/transformers/blob/fbad9bb56e8f67dca2c29fb21a5a017c823c57b7/setup.py#L155-L156

But the TensorFlow version installed in CI is 2.6.2. Any idea why?
https://app.circleci.com/pipelines/github/huggingface/transformers/30600/workflows/a636c0a9-a0f0-4fbe-8124-276b7ec5d6c5/jobs/312697?invite=true#step-111-4303

The latest TF version is currently 2.7.

LysandreJik · 2021-11-29T14:52:17Z

It seems that this is due to pip not upgrading itself correctly:

WARNING: You are using pip version 21.2.4; however, version 21.3.1 is available.
You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.

Could you try to update the following in your PR:

transformers/.circleci/config.yml

Line 82 in 25156eb

- run: pip install --upgrade pip

to

/usr/local/bin/python -m pip install --upgrade pip

to check if it changes anything? Thanks, @kamalkraj

LysandreJik · 2021-11-29T15:20:21Z

Thanks for trying it out, it seems like it didn't work out. I'll try a few things and come back to you.

LysandreJik · 2021-11-29T15:54:13Z

Found the error! TensorFlow 2.7 does not support Python 3.6 anymore (cc @sgugger, @Rocketknight1, @patrickvonplaten, @patil-suraj).

Could you update this line: https://github.com/huggingface/transformers/blob/master/.circleci/config.yml#L68

to have circleci/python:3.7 as an image?
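To illustrate why a `>= 2.3` requirement can still resolve to 2.6.2, here is a minimal sketch of how an installer might filter releases by the running Python version before picking the newest one. The `RELEASES` table and the `pick_release` helper are hypothetical and heavily simplified; this is not pip's actual resolver, and only the fact that TensorFlow 2.7 dropped Python 3.6 comes from the discussion above.

```python
from typing import Optional, Tuple

# Hypothetical, simplified release metadata: (package version, minimum Python).
# Only "2.7 no longer supports Python 3.6" is taken from the thread; the other
# rows are illustrative assumptions.
RELEASES = [
    ((2, 3, 0), (3, 5)),
    ((2, 6, 2), (3, 6)),
    ((2, 7, 0), (3, 7)),
]


def pick_release(
    min_version: Tuple[int, ...], python_version: Tuple[int, int]
) -> Optional[Tuple[int, ...]]:
    """Return the newest release >= min_version that supports python_version."""
    candidates = [
        version
        for version, min_python in RELEASES
        if version >= min_version and python_version >= min_python
    ]
    return max(candidates) if candidates else None


def fmt(version: Optional[Tuple[int, ...]]) -> str:
    """Render a version tuple as a dotted string."""
    return "none" if version is None else ".".join(map(str, version))


# With a Python 3.6 image, 2.7.0 is filtered out and 2.6.2 is selected.
print(fmt(pick_release((2, 3), (3, 6))))
# With a Python 3.7 image, the newest release becomes eligible.
print(fmt(pick_release((2, 3), (3, 7))))
```

Under these assumptions, upgrading pip alone would not change the outcome, since the constraint comes from the interpreter version rather than from the installer.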

kamalkraj · 2021-11-29T16:23:11Z

Thanks @LysandreJik

Should I revert this commit f18cfa9?

I have changed the Python version only in the run_tests_torch_and_tf job.

kamalkraj · 2021-11-29T17:01:21Z

@NielsRogge
Thank you so much for uploading all the TF models to the hub.
I have updated the tests to load the model directly from the hub, rather than using from_pt.

Ready to merge 🤗

LysandreJik · 2021-11-30T08:53:22Z

Indeed, if you can revert the pip commit then we're ready to go! We can also merge it and revert it afterwards. Do you want to take care of that, @NielsRogge?

LysandreJik · 2021-11-30T09:50:38Z

Fantastic @kamalkraj, let's merge it once it's all green

* TF Tapas first commit
* updated docs
* updated logger message
* updated pytorch weight conversion script to support scalar array
* added use_cache to tapas model config to work properly with tf input_processing
* 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and lot other small fixes
* updated docs
* + test for tapas
* updated testing_utils to check is_tensorflow_probability_available
* converted model logits post processing using numpy to work with both PT and TF models
* + TFAutoModelForTableQuestionAnswering
* added TF support
* added test for TFAutoModelForTableQuestionAnswering
* added test for TFAutoModelForTableQuestionAnswering pipeline
* updated auto model docs
* fixed typo in import
* added tensorflow_probability to run tests
* updated MLM head
* updated tapas.rst with TF model docs
* fixed optimizer import in docs
* updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade
* updated pipeline: 1. with torch.no_grad removed, pipeline forward handles 2. token_type_ids converted to numpy
* updated docs.
* removed `use_cache` from config
* removed floats_tensor
* updated code comment
* updated Copyright Year and logits_aggregation Optional
* updated docstring
* fixed model weight loading
* make fixup
* fix indentation
* added tf slow pipeline test
* pip upgrade
* upgrade python to 3.7
* removed from_pt from tests
* revert commit f18cfa9
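The commit list mentions checking `is_tensorflow_probability_available` in the testing utilities. As a rough sketch of that kind of guard, assuming nothing about the actual implementation in the library, an optional dependency can be detected with the standard library and used to skip tests. The helper names `is_package_available` and `require_package` below are hypothetical.

```python
import importlib.util
import unittest


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


def require_package(name: str):
    """Decorator that skips a test unless `name` is installed (hypothetical helper)."""
    return unittest.skipUnless(is_package_available(name), f"test requires {name}")


class ExampleTest(unittest.TestCase):
    @require_package("tensorflow_probability")
    def test_needs_tfp(self):
        # Only runs when tensorflow_probability is installed; skipped otherwise.
        import tensorflow_probability  # noqa: F401
```

With a guard like this, environments without the optional package report the test as skipped instead of failing on import.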

kamalkraj force-pushed the tapas-tf branch from 09d0eb3 to eaff9a9 on September 4, 2021 13:00

kamalkraj marked this pull request as ready for review on September 4, 2021 13:55

kamalkraj force-pushed the tapas-tf branch 2 times, most recently from 13269bf to 477463f on September 8, 2021 13:40

LysandreJik requested review from NielsRogge and Rocketknight1 on September 8, 2021 14:54

kamalkraj force-pushed the tapas-tf branch 2 times, most recently from 47f0d80 to aacddb8 on September 10, 2021 17:19