
Fix serving_output for TF composite models (encoder-decoder like models) #22743

Merged
ydshieh merged 3 commits into main from fix_save_2 on Apr 13, 2023

Conversation

ydshieh
Collaborator

@ydshieh ydshieh commented Apr 13, 2023

What does this PR do?

[If the concept is approved, I will apply the same changes to the other models in the TF encoder-decoder family]

Composite models use their components' configurations. See, for example:

# Encoder Decoder models delegate the application of the configuration options to their inner models.
if "EncoderDecoder" in self.__class__.__name__:
config = None
else:
config = self.config

However, in some places our codebase still tries to access attributes at the top level of the configuration (i.e. not inside the two components), like:

def serving_output(self, output):
pkv = tf.tuple(output.past_key_values)[1] if self.config.use_cache else None
dec_hs = tf.convert_to_tensor(output.decoder_hidden_states) if self.config.output_hidden_states else None
dec_attns = tf.convert_to_tensor(output.decoder_attentions) if self.config.output_attentions else None
enc_hs = tf.convert_to_tensor(output.encoder_hidden_states) if self.config.output_hidden_states else None
enc_attns = tf.convert_to_tensor(output.encoder_attentions) if self.config.output_attentions else None

In particular, self.config may not have use_cache; this is the case for the checkpoint "nlpconnect/vit-gpt2-image-captioning". We should instead look at self.config.decoder.use_cache.
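A minimal, library-free sketch of the problem (the class names here are illustrative stand-ins, not the actual transformers config classes):

```python
class SimpleConfig:
    """Stand-in for a component (e.g. decoder) config that defines use_cache."""
    def __init__(self, use_cache=True):
        self.use_cache = use_cache


class CompositeConfig:
    """Stand-in for an encoder-decoder config: options live on the components,
    and there is intentionally no top-level `use_cache` attribute."""
    def __init__(self):
        self.encoder = SimpleConfig()
        self.decoder = SimpleConfig(use_cache=True)


config = CompositeConfig()

# Top-level access, as in the current serving_output, fails for composite configs:
assert not hasattr(config, "use_cache")

# The component-level lookup this PR switches to works:
assert config.decoder.use_cache is True
```

This mirrors why self.config.use_cache raises for checkpoints like the one above while self.config.decoder.use_cache does not.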

This PR tries to follow the rule stated in the comment above: # Encoder Decoder models delegate the application of the configuration options to their inner models.

This PR is also (another) one necessary step to fix #22731.

  cross_attns = (
      tf.convert_to_tensor(output.cross_attentions)
-     if self.config.output_attentions and output.cross_attentions is not None
+     if self.config.decoder.output_attentions and output.cross_attentions is not None
Collaborator Author

Use the attributes in the components' configurations (i.e. self.config.encoder and self.config.decoder)

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Apr 13, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh requested review from amyeroberts, sgugger and Rocketknight1 and removed request for amyeroberts, sgugger and Rocketknight1 April 13, 2023 12:34
Collaborator

@sgugger sgugger left a comment


Looks good to me but would like for a TF expert to have a look too!

@ydshieh ydshieh requested a review from gante April 13, 2023 18:54
Member

@Rocketknight1 Rocketknight1 left a comment


This makes sense to me!

@ydshieh ydshieh removed the request for review from amyeroberts April 13, 2023 19:05
@ydshieh ydshieh removed the request for review from gante April 13, 2023 20:23
@ydshieh ydshieh merged commit a6752a7 into main Apr 13, 2023
4 checks passed
@ydshieh ydshieh deleted the fix_save_2 branch April 13, 2023 21:45
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
…dels) (huggingface#22743)

* fix

* style

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
4 participants