Add BEiT #12994
Conversation
Awesome addition! No big remarks on my side; this looks ready to be merged soon (as long as the tests are fixed ;-) ). Left a few comments.
tests/test_modeling_beit.py
@require_torch
class BEiTModelTest(ModelTesterMixin, unittest.TestCase):
Quick question: the tests are different for a bunch of vision models now; maybe we should have a special tester class for them and refactor the common tests of vision models there? I'm not familiar enough with how similar those tests are to be sure it's worth it, so tell me if it makes no sense.
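(For concreteness, a rough sketch of what such a shared vision tester could look like; the mixin name and its contents are hypothetical, not code from this PR. The concrete test class would mix this in alongside `unittest.TestCase`, which provides the assertion methods, and define `all_model_classes` as the existing testers do:)

```python
import inspect


class VisionModelTesterMixin:
    """Hypothetical mixin for tests shared by vision models (ViT, BEiT, ...).

    Vision models take pixel_values rather than input_ids, so common NLP
    tests (attention masks, embedding resizing) don't apply, while checks
    like the forward signature below are the same for every vision model.
    """

    def test_forward_signature(self):
        # Every vision model should accept pixel_values as its first argument.
        for model_class in self.all_model_classes:
            signature = inspect.signature(model_class.forward)
            arg_names = list(signature.parameters.keys())
            # model_class.forward is unbound here, so "self" comes first.
            self.assertListEqual(arg_names[:2], ["self", "pixel_values"])
```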
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
I've uploaded all checkpoints to the hub: https://huggingface.co/models?search=microsoft/beit

I've renamed the checkpoints which are fine-tuned on ImageNet-1k (after being intermediately fine-tuned on ImageNet-22k) to be just […].

@donglixp if you're interested, could you write model cards for these models? Model cards are READMEs that describe the models in detail. You can take inspiration from ViT's model card. Also, I do have a notebook for […]
Overall very clean! I think you can safely ignore the error linked to model templates: it's running `make fixup`, which is looking for a file that was deleted in this PR. Left just a nit regarding the naming convention.
@NielsRogge great work! Any news on the future PR to add the semantic segmentation model and the pretrained ADE20K checkpoint? Thanks!
@JStumpp say no more, it's added ;)
What does this PR do?
It adds BEiT: BERT Pre-Training of Image Transformers to the library. It's the first paper that enables self-supervised pre-trained Vision Transformers (ViTs) to outperform their supervised pre-training counterparts. As a picture says more than a thousand (or 16x16?) words, this is a good summary of the approach:

[figure: overview of the BEiT pre-training approach]
The authors used the encoder of OpenAI's DALL-E to map images to discrete visual tokens, which the model then needs to predict based on the masked patches (a rough sketch of this objective is shown below).
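(As an illustration of this objective only, not the PR's actual code: the codebook size, masking ratio and tensor shapes below are assumptions based on the paper, and random tensors stand in for the frozen DALL-E tokenizer and the ViT encoder, which also uses blockwise rather than uniform masking:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size = 8192   # size of the DALL-E visual-token codebook
num_patches = 196   # 14 x 14 patches for a 224x224 image with 16x16 patches
hidden_size = 768   # base-sized ViT

# Stand-ins for the real pipeline: `visual_tokens` would come from the frozen
# DALL-E encoder, `hidden_states` from the ViT run on the partially masked patches.
visual_tokens = torch.randint(0, vocab_size, (1, num_patches))
hidden_states = torch.randn(1, num_patches, hidden_size)

# Mask roughly 40% of the patch positions.
mask = torch.rand(1, num_patches) < 0.4

# A linear head predicts a visual token for every patch position...
lm_head = nn.Linear(hidden_size, vocab_size)
logits = lm_head(hidden_states)

# ...but the cross-entropy loss is computed on the masked positions only.
loss = F.cross_entropy(logits[mask], visual_tokens[mask])
print(loss.item())
```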
There are 3 models defined: `BEiTModel`, `BEiTForMaskedImageModeling` and `BEiTForImageClassification`.
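(For reference, inference could look roughly like this once merged, assuming the API mirrors the existing `ViTFeatureExtractor`/`ViTForImageClassification` pattern. The class names follow this PR's naming, which may still change per the naming-convention nit above, and the checkpoint id is illustrative:)

```python
import requests
import torch
from PIL import Image
from transformers import BEiTFeatureExtractor, BEiTForImageClassification

# Illustrative checkpoint id; see the hub search link above for the real list.
checkpoint = "microsoft/beit-base-patch16-224"

feature_extractor = BEiTFeatureExtractor.from_pretrained(checkpoint)
model = BEiTForImageClassification.from_pretrained(checkpoint)

# Standard COCO test image used throughout the transformers docs.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# ImageNet-1k fine-tuned checkpoints map logits to human-readable labels.
predicted_class_idx = logits.argmax(-1).item()
print(model.config.id2label[predicted_class_idx])
```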
This PR also cleans up some scripts from the library, namely those that defined id2label dicts for several datasets. I have removed `imagenet_classes.py` and `coco_classes.py` from the utils directory. Instead, id2label mappings are now defined on the hub in their own repository. These can then be used in conversion scripts using the `huggingface_hub` library.
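(For example, a conversion script could fetch an id2label mapping along these lines; the repo id and filename below are assumptions for illustration, not necessarily the ones this PR uses:)

```python
import json

from huggingface_hub import hf_hub_download

# Assumed repo/file names for illustration; the PR may use different ones.
repo_id = "huggingface/label-files"
filename = "imagenet-1k-id2label.json"

# Download the mapping from the hub (cached locally after the first call).
path = hf_hub_download(repo_id, filename, repo_type="dataset")
with open(path, "r") as f:
    id2label = json.load(f)

# JSON keys are strings, so convert them back to integer class ids.
id2label = {int(k): v for k, v in id2label.items()}
print(id2label[0])
```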
To do:

- Decide on the checkpoint naming: `microsoft/beit_base_patch16_224_pt22k_ft22k_to_1k` is getting out of hand.
- Verify the `BEiTForMaskedImageModeling` model. For this, tagging one of the original authors: @donglixp

In a future PR, I also plan to add the semantic segmentation model, which obtains SOTA on ADE20K.