
Revert "Fix weight loading issue" #14406

Closed

Conversation

patrickvonplaten (Contributor)

Reverts #14016

patrickvonplaten (Contributor, Author) commented Nov 15, 2021

Sorry for reverting the PR here - it's on me! I merged it too quickly. We had some internal discussion and came to the conclusion that this hack is probably not worth the functionality it would give us here.

Saving and loading a model with tempfile inside the from_encoder_decoder_pretrained(...) function is a big hack and it's questionable whether it's worth it.
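
For context, a minimal sketch of roughly what that tempfile round-trip looks like - a simplified illustration under my assumptions, not the actual code in modeling_tf_encoder_decoder.py (the "./encoder" path reuses the example below):

import tempfile

from transformers import TFAutoModel

# Simplified illustration of the hack: load the PyTorch checkpoint through the
# cross-framework from_pt=True path, save the resulting TF weights to a
# temporary directory, and reload them as a regular TF checkpoint.
with tempfile.TemporaryDirectory() as tmp_dir:
    _encoder = TFAutoModel.from_pretrained("./encoder", from_pt=True)
    _encoder.save_pretrained(tmp_dir)
    encoder = TFAutoModel.from_pretrained(tmp_dir)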

Just to compare the current design to how it would look if we reverted the PR, for @LysandreJik, @sgugger, and @Rocketknight1:

If we leave master as it is, one can correctly convert a PyTorch model checkpoint as follows:

Current design:

from transformers import EncoderDecoderModel, TFEncoderDecoderModel

_model = EncoderDecoderModel.from_pretrained("patrickvonplaten/bert2bert-cnn_dailymail-fp16")
_model.encoder.save_pretrained("./encoder")
_model.decoder.save_pretrained("./decoder")

model = TFEncoderDecoderModel.from_encoder_decoder_pretrained(
    "./encoder", "./decoder", encoder_from_pt=True, decoder_from_pt=True
)

# then this works:
model.save_pretrained("./")
model = TFEncoderDecoderModel.from_pretrained("./")

If we remove the hack, then the only way (in my opinion) currently to convert a PT checkpoint to TF is the following:

Design after removing the hack:

from transformers import EncoderDecoderModel, TFEncoderDecoderModel, TFAutoModel, TFAutoModelForCausalLM

_model = EncoderDecoderModel.from_pretrained("patrickvonplaten/bert2bert-cnn_dailymail-fp16")
_model.encoder.save_pretrained("./encoder")
_model.decoder.save_pretrained("./decoder")

# All of these steps are currently done automatically. If we remove the hack,
# there is not really a way around doing them manually, IMO.
_encoder = TFAutoModel.from_pretrained("./encoder", from_pt=True)
_decoder = TFAutoModelForCausalLM.from_pretrained("./decoder", from_pt=True)
_encoder.save_pretrained("./encoder")
_decoder.save_pretrained("./decoder")

model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("./encoder", "./decoder")

# then this works:
model.save_pretrained("./")
model = TFEncoderDecoderModel.from_pretrained("./")

So we can see that removing the hack would force the user to do manually the exact same thing we are doing automatically right now.

patrickvonplaten (Contributor, Author) commented Nov 15, 2021

Given that the hack only lives in modeling_tf_encoder_decoder.py, and having thought about it again, I'm actually in favor of not merging this PR, but I defer to @LysandreJik and @sgugger to decide here.

ydshieh (Collaborator) commented Nov 15, 2021

No problem for me - I'll leave it to the HF members to make the decision. Just making this clear to users would be fine on my side 😀

LysandreJik (Member)

I agree with closing this and keeping the current hack, as long as we mention that the TFEncoderDecoder is experimental.

In my opinion, TensorFlow is generally ill-suited to managing several models combined into a single one, as is done here, and it will always have some hacky/kind-of-broken edge cases.

I would advocate for keeping the work @ydshieh has done so far and seeing if the community appreciates/uses the feature before spending time refactoring this complex piece of software.

sgugger (Collaborator) commented Nov 15, 2021

I agree with you two, @LysandreJik and @patrickvonplaten, even if I'm really not a fan of the hack behind the scenes. Let's worry about making it better when we have wide adoption of the TFEncoderDecoder :-)

patrickvonplaten deleted the revert-14016-fix_tf_enc_dec_weight_loading branch November 15, 2021 22:25