[Don't merge yet][T5, Bart] Allow t5 torch trace #6268
Conversation
Looks innocuous to me, is there some test this allows us to enable for jit and T5?
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #6268      +/-   ##
==========================================
- Coverage   79.79%   78.50%    -1.29%
==========================================
  Files         148      148
  Lines       27196    27196
==========================================
- Hits        21701    21351     -350
- Misses       5495     5845     +350
```

Continue to review full report at Codecov.
tests/test_modeling_common.py (outdated)

```diff
@@ -245,15 +245,18 @@ def _create_and_check_torchscript(self, config, inputs_dict):
     inputs = self._prepare_for_class(inputs_dict, model_class)["input_ids"]  # Let's keep only input_ids

     try:
-        traced_gpt2 = torch.jit.trace(model, inputs)
+        if model.__class__.__name__ in ["T5Model", "T5ForConditionalGeneration"]:
```
Quite hacky here, but I didn't see another way....
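For context, a minimal sketch of what this branch presumably amounts to (hedged: reusing the encoder `inputs` as `decoder_input_ids` is an assumption about the test, not a confirmed detail of the PR):

```python
# Sketch only: encoder-decoder models such as T5 need decoder_input_ids as a
# second positional tensor, so the trace call feeds the same ids to both slots.
if model.__class__.__name__ in ["T5Model", "T5ForConditionalGeneration"]:
    traced_model = torch.jit.trace(model, (inputs, inputs))
else:
    traced_model = torch.jit.trace(model, inputs)
```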
Isn't it more general as `is_encoder_decoder`?
Think Bart can run without requiring the `decoder_input_ids` @sshleifer... but I guess it would be cleaner to call it `encoder_decoder` here... we will have to slightly change Bart then.
If I change to `is_encoder_decoder=True` and do the corresponding changes for Bart, Bart hits an assert:

```python
assert attn_output.size() == (bsz * self.num_heads, tgt_len, self.head_dim)
```

Not really sure what's going on there... do you have an idea @sshleifer?
I'll push the changes -> let's see if we want to change Bart accordingly or revert to the T5 hack...
I wrote that assert and should have written an error message, but I can't understand the issue without looking more closely. Does it matter that the second arg to `BartForConditionalGeneration.forward` is `attention_mask`?
It's outdated, seems like you figured it out! I will add a message to my assert.
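A sketch of what such a message could look like (hypothetical wording, not the actual commit; the placeholder values only make the snippet self-contained):

```python
import torch

# Names mirror Bart's attention code; the values here are placeholders.
bsz, num_heads, tgt_len, head_dim = 2, 4, 8, 16
attn_output = torch.zeros(bsz * num_heads, tgt_len, head_dim)

assert attn_output.size() == (bsz * num_heads, tgt_len, head_dim), (
    f"attn_output must have shape {(bsz * num_heads, tgt_len, head_dim)}, "
    f"but got {attn_output.size()}"
)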
```diff
@@ -282,7 +282,7 @@ class T5ModelTest(ModelTesterMixin, unittest.TestCase):
     all_model_classes = (T5Model, T5ForConditionalGeneration) if is_torch_available() else ()
     all_generative_model_classes = (T5ForConditionalGeneration,) if is_torch_available() else ()
     test_pruning = False
-    test_torchscript = False
+    test_torchscript = True
```
turn test on
(Branch force-pushed from `1352cc6` to `0038b24`.)
After digging a bit deeper into why the Bart tests fail, I think the reason is the Bart cache (see `transformers/tests/test_modeling_common.py`, line 252 in `ac001c4`). A bug was filed for this problem: #6348.

PR should be good for merge IMO. @LysandreJik @sshleifer @sgugger - would be great if you could take a quick second look.
LGTM. I'll do my part after this merges!
I'm a little concerned that jit cares about arg ordering, but Python doesn't care about kwarg ordering, so there might be some very confusing bugs in the future. Don't have a great idea of how to fix that and maintain the same API.
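To make the concern concrete, a small self-contained sketch (toy module, not the actual test code) of how a traced module binds tensors by position:

```python
import torch


class Toy(torch.nn.Module):
    def forward(self, input_ids, decoder_input_ids):
        # The result depends on which slot each tensor fills.
        return input_ids * 2 + decoder_input_ids


toy = Toy()
a, b = torch.ones(3), torch.zeros(3)

# Eager Python: keyword order is irrelevant.
assert torch.equal(toy(input_ids=a, decoder_input_ids=b), toy(decoder_input_ids=b, input_ids=a))

# Traced: tensors are bound by position, so swapping them silently
# produces a different result instead of raising an error.
traced = torch.jit.trace(toy, (a, b))
print(traced(a, b))  # tensor([2., 2., 2.])
print(traced(b, a))  # tensor([1., 1., 1.]) -- swapped slots, no warning
```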
```diff
     except RuntimeError:
         self.fail("Couldn't trace module.")

     with tempfile.TemporaryDirectory() as tmp_dir_name:
         pt_file_name = os.path.join(tmp_dir_name, "traced_model.pt")

         try:
-            torch.jit.save(traced_gpt2, pt_file_name)
+            torch.jit.save(traced_model, pt_file_name)
```
(out of scope) Why is the try/except -> `self.fail` pattern useful? Without try/except you get an error that traces back to a line with the word `save` in it. I just looked at `self.fail` and it turns things into `AssertionError`s, which seems like strictly less info.
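For illustration, a toy version of the two patterns (the failing call is deliberate; the behavior described is standard `unittest`, not something introduced by this PR):

```python
import unittest

import torch


class Demo(unittest.TestCase):
    def test_with_fail(self):
        try:
            torch.jit.save(None, "model.pt")  # deliberately broken call
        except Exception:
            # self.fail raises AssertionError with this message; the reader
            # has to dig through the chained traceback for the root cause.
            self.fail("Couldn't save module.")

    def test_without_fail(self):
        # The underlying error propagates directly, with a traceback that
        # points at the exact failing line.
        torch.jit.save(None, "model.pt")


if __name__ == "__main__":
    unittest.main()
```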
LGTM, great work!
Putting this on hold for now as it introduces a breaking change.
Any updates on this?
The problem is that it breaks backwards compatibility in the sense that the positional arguments of Bart and T5 are changed. At the moment this is the only option to make torch tracing work for Bart and T5 though... there might be a possibility to trace a wrapper around the model - see pytorch/pytorch#14455. But this currently leads to another problem, which is probably related to our PyTorch models not being scriptable at the moment.
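For reference, a hedged sketch of that wrapper idea (`TraceWrapper` is a hypothetical name, not code from this PR or from pytorch/pytorch#14455):

```python
import torch


class TraceWrapper(torch.nn.Module):
    """Hypothetical wrapper that pins down a positional signature for tracing,
    so the wrapped model's own argument order would not have to change."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, decoder_input_ids):
        # Pass the tensors on as keyword arguments: only the wrapper's
        # signature (not the model's) dictates the positional order.
        outputs = self.model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
        return outputs[0]


# Usage, assuming `model`, `input_ids` and `decoder_input_ids` exist:
# traced = torch.jit.trace(TraceWrapper(model), (input_ids, decoder_input_ids))
```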
This PR would fix #5647. It's not a great solution IMO though.

The problem with torch script is that one cannot pass keyword arguments, but has to pass positional arguments, and it is not possible to pass `None` because every input is required to be a tensor. Because T5 requires both `input_ids` and `decoder_input_ids`, these two arguments should arguably be placed as the first two arguments. There might be use cases though where the same error would occur and which we could not solve then, e.g. if one wants to input `input_embeds`.

Maybe @LysandreJik @sgugger have a better idea.