[models] update vit and transformer layer norm #1059
Conversation
Codecov Report
@@            Coverage Diff            @@
##              main    #1059    +/-   ##
=========================================
  Coverage    95.13%   95.14%
=========================================
  Files          141      141
  Lines         5819     5827     +8
=========================================
+ Hits          5536     5544     +8
  Misses         283      283
Flags with carried forward coverage won't be shown.
@frgfm About nn.Sequential: I would keep nn.Module, to avoid having to build a separate nn.Module just for the head.
Thanks @felixdittrich92!
Also, thanks @frgfm for the review 👍
@odulcy-mindee will update some last things in a few minutes
@felixdittrich92 ok, I'll review it afterwards 👌
Last PR:
Ok, weird behaviour ... Conv2d with padding='valid' works well; without it (which should be the default) it doesn't ...
@odulcy-mindee should be ok now 😅 Unfortunately, I saw that padding='valid' with ONNX is a known issue which is being handled internally by Microsoft, so I think we will get a fix soon.
Will test "manual" patchify without Conv tomorrow morning... maybe a better solution |
Now it works much better and ONNX export works as well; the only disadvantage is that it is a bit slower than using Conv2d with padding='valid' (introduced in PyTorch 1.10). The TF side still runs fine with Conv2d.
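For context, a minimal sketch of what a "manual" (reshape-based) patchify could look like; the patch size and shapes below are illustrative, and the actual implementation in this PR may differ:

import torch

B, C, H, W = 2, 3, 32, 32
ph, pw = 8, 8                                # patch size (illustrative)
x = torch.rand(B, C, H, W)

# split H and W into patches, then flatten each patch into a vector
patches = x.reshape(B, C, H // ph, ph, W // pw, pw)
patches = patches.permute(0, 2, 4, 1, 3, 5)  # (B, H//ph, W//pw, C, ph, pw)
patches = patches.flatten(1, 2).flatten(2)   # (B, num_patches, C * ph * pw)

assert patches.shape == (B, (H // ph) * (W // pw), C * ph * pw)

A Linear layer applied to the last dimension then plays the role of the Conv2d projection with kernel_size = stride = patch size, at the cost of the extra reshapes, which matches the small slowdown mentioned above.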
Now I'm really done; ready for review 😅 sorry for all the changes afterwards :)
🚀 🚀
For reference, this is linked to #1050 (always better to be able to trace back the evolution/fixes :))
Late review again, but hopefully it helps :)
class ClassifierHead(nn.Module):
    """Classifier head for Vision Transformer

    Args:
        in_channels: number of input channels
        num_classes: number of output classes
    """

    def __init__(
        self,
        in_channels: int,
        num_classes: int,
    ) -> None:
        super().__init__()

        self.head = nn.Linear(in_channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch_size, num_classes) cls token
        return self.head(x[:, 0])
Mmmmh, what's the difference with
head = nn.Linear(in_channels, num_classes)
...
out = head(x[:, 0])
(Linear actually supports higher dimensions than 2, we can reshape it afterwards I think)
It would be cleaner to add a squeeze or flatten layer in the sequential, rather than creating a class that is doing 99% the same as a Linear :)
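To make the comparison concrete, here is a minimal sketch (the shapes and class count are illustrative, not taken from the PR) showing that a plain nn.Linear already handles both ways of selecting the cls token:

import torch
from torch import nn

batch, seq_len, embed_dim, num_classes = 2, 197, 768, 10
x = torch.rand(batch, seq_len, embed_dim)  # ViT encoder output

head = nn.Linear(embed_dim, num_classes)

# select the cls token first, then project
out = head(x[:, 0])        # (batch, num_classes)

# or project the whole sequence (Linear accepts inputs with more than 2 dims)
# and select the cls token afterwards
out_alt = head(x)[:, 0]    # (batch, num_classes)

assert torch.allclose(out, out_alt)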
@@ -109,7 +120,7 @@ def _vit(
     return model


-def vit(pretrained: bool = False, **kwargs: Any) -> VisionTransformer:
+def vit_b(pretrained: bool = False, **kwargs: Any) -> VisionTransformer:
     """VisionTransformer architecture as described in
I suggest specifying the version of the archi in the docstring as well "VisionTransformer-B"
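For illustration, the suggested tweak would just add the variant suffix to the first line of the docstring (a sketch, not the exact wording used in the PR; VisionTransformer is the model class from the diff and the rest of the docstring is elided):

from typing import Any


def vit_b(pretrained: bool = False, **kwargs: Any) -> "VisionTransformer":
    """VisionTransformer-B architecture as described in ..."""
    ...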
@@ -135,7 +135,7 @@ def _vit(
     return model


-def vit(pretrained: bool = False, **kwargs: Any) -> VisionTransformer:
+def vit_b(pretrained: bool = False, **kwargs: Any) -> VisionTransformer:
     """VisionTransformer architecture as described in
same here
def forward(self, x: torch.Tensor) -> torch.Tensor:
    B, C, H, W = x.shape
    assert H % self.patch_size[0] == 0, "Image height must be divisible by patch height"
    assert W % self.patch_size[1] == 0, "Image width must be divisible by patch width"

    patches = self.proj(x)  # BCHW
    # patchify image without convolution
    # adopted from:
typo "adapted"
This PR:
Any feedback is welcome!