
Move logic into pixelshuffle layer#17899

Merged
amyeroberts merged 3 commits into huggingface:main from amyeroberts:move-logic-into-pixelshuffle-layer
Jun 28, 2022

Conversation

@amyeroberts
Contributor

What does this PR do?

Moves the logic relating to the PixelShuffle layer into the layer class. This provides usage consistent with the PyTorch pixel shuffle layer and makes sure all necessary logic is ported if any # Copied from statements are used.
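For context, PyTorch's nn.PixelShuffle rearranges a (N, C·r², H, W) tensor into (N, C, H·r, W·r). A minimal NumPy sketch of the same rearrangement in TF's channels-last layout (illustrative only; pixel_shuffle_nhwc is a hypothetical helper, not the code this PR moves):

```python
import numpy as np

def pixel_shuffle_nhwc(x, upscale_factor):
    """Torch-style pixel shuffle for channels-last tensors.

    Rearranges (N, H, W, C * r^2) -> (N, H * r, W * r, C), where the
    input channels are grouped torch-style as (C, r, r).
    """
    r = upscale_factor
    n, h, w, crr = x.shape
    c = crr // (r * r)
    x = x.reshape(n, h, w, c, r, r)    # split channels into (C, r, r)
    x = x.transpose(0, 1, 4, 2, 5, 3)  # interleave the r x r blocks spatially
    return x.reshape(n, h * r, w * r, c)

x = np.arange(16, dtype=np.float32).reshape(1, 2, 2, 4)  # N=1, H=W=2, C*r^2=4
y = pixel_shuffle_nhwc(x, upscale_factor=2)
print(y.shape)  # (1, 4, 4, 1)
```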

Also renamed the layer PixelShuffle -> TFSwinPixelShuffle to reflect naming conventions in the rest of the repo. The following script was run to make sure the models are still compatible with current weights:

from transformers import AutoFeatureExtractor, TFSwinForImageClassification

checkpoint = "microsoft/swin-tiny-patch4-window7-224"

# relative_position_index isn't updated during training. In TF set as instance param
print("\nTFSwinForImageClassification - from PyTorch checkpoint")
tf_model = TFSwinForImageClassification.from_pretrained(checkpoint, from_pt=True)
print("\nTFSwinForImageClassification - from TF checkpoint")
tf_model = TFSwinForImageClassification.from_pretrained(checkpoint)

This produced the following output. Note: relative_position_index isn't updated during training and is set as an instance parameter in the TF model.

TFSwinForImageClassification - from PyTorch checkpoint
Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFSwinForImageClassification: ['swin.encoder.layers.3.blocks.1.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.0.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.4.attention.self.relative_position_index', 'swin.encoder.layers.1.blocks.0.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.2.attention.self.relative_position_index', 'swin.encoder.layers.1.blocks.1.attention.self.relative_position_index', 'swin.encoder.layers.3.blocks.0.attention.self.relative_position_index', 'swin.encoder.layers.0.blocks.1.attention.self.relative_position_index', 'swin.encoder.layers.0.blocks.0.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.1.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.3.attention.self.relative_position_index', 'swin.encoder.layers.2.blocks.5.attention.self.relative_position_index']
- This IS expected if you are initializing TFSwinForImageClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFSwinForImageClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
All the weights of TFSwinForImageClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFSwinForImageClassification for predictions without further training.

TFSwinForImageClassification - from TF checkpoint
All model checkpoint layers were used when initializing TFSwinForImageClassification.

All the layers of TFSwinForImageClassification were initialized from the model checkpoint at microsoft/swin-tiny-patch4-window7-224.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFSwinForImageClassification for predictions without further training.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 27, 2022

The documentation is not available anymore as the PR was closed or merged.

@amyeroberts amyeroberts requested review from Rocketknight1 and sgugger and removed request for sgugger June 27, 2022 21:51
Collaborator

@sgugger sgugger left a comment


LGTM!

Member

@Rocketknight1 Rocketknight1 left a comment


LGTM as well!

Comment on lines +1245 to +1248
permutation = tf.constant(
    [[i + j * block_size_squared for i in range(block_size_squared) for j in range(output_depth)]]
)
hidden_states = tf.gather(params=hidden_states, indices=tf.tile(permutation, [batch_size, 1]), batch_dims=-1)
Member


Any appearance of gather in this kind of context is a 100% guarantee that someone is emulating the specific details of a weird Torch function.

Contributor Author


Too true 😭
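The gather in the snippet above reorders channels from torch's (C, r, r) grouping into the (r, r, C) grouping that tf.nn.depth_to_space (DCR mode) expects. A NumPy sketch of that equivalence, using the same index trick as the snippet under review (both helper functions here are illustrative, not the PR's code):

```python
import numpy as np

def pixel_shuffle_torch_order(x, r):
    # torch grouping: input channel index = c * r^2 + i * r + j
    n, h, w, crr = x.shape
    c = crr // (r * r)
    y = x.reshape(n, h, w, c, r, r).transpose(0, 1, 4, 2, 5, 3)
    return y.reshape(n, h * r, w * r, c)

def depth_to_space_dcr(x, r):
    # tf.nn.depth_to_space (DCR) grouping: input channel index = (i * r + j) * c + c_out
    n, h, w, crr = x.shape
    c = crr // (r * r)
    y = x.reshape(n, h, w, r, r, c).transpose(0, 1, 3, 2, 4, 5)
    return y.reshape(n, h * r, w * r, c)

r, c = 2, 3
x = np.arange(1 * 2 * 2 * c * r * r, dtype=np.float32).reshape(1, 2, 2, c * r * r)

# Same comprehension as the reviewed snippet: position i * c + j pulls
# old channel j * r^2 + i, turning (C, r, r) grouping into (r, r, C).
perm = [i + j * r * r for i in range(r * r) for j in range(c)]
assert np.array_equal(depth_to_space_dcr(x[..., perm], r),
                      pixel_shuffle_torch_order(x, r))
```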

@amyeroberts amyeroberts merged commit f71895a into huggingface:main Jun 28, 2022
@amyeroberts amyeroberts deleted the move-logic-into-pixelshuffle-layer branch June 28, 2022 12:04
viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
* Move all pixelshuffle logic into layer

* Rename layer

* Use correct input to function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
