Contributor

@FrancescoSaverioZuppichini FrancescoSaverioZuppichini commented Mar 9, 2022

What does this PR do?

This PR adds Visual Attention Network.

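Currently, the model can be used as follows:

import requests
import torch
from io import BytesIO
from PIL import Image
from transformers import AutoFeatureExtractor, VanForImageClassification

# Download a sample image from the transformers test fixtures
res = requests.get('https://github.com/huggingface/transformers/blob/master/tests/fixtures/tests_samples/COCO/000000039769.png?raw=true')
image = Image.open(BytesIO(res.content))

feature_extractor = AutoFeatureExtractor.from_pretrained("zuppif/van-base")
model = VanForImageClassification.from_pretrained("zuppif/van-base").eval()

# Preprocess the image, run inference without tracking gradients,
# and map the highest-scoring logit to its class label
inputs = feature_extractor(image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(model.config.id2label[torch.argmax(outputs.logits).item()])
# tabby, tabby cat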
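
The last line of the snippet takes the arg-max of the logits and looks the index up in id2label. As a minimal, framework-free sketch of that step, extended to the top-k labels: the logits and the four-entry label table below are made up for illustration (a real ImageNet head has 1000 classes), and `softmax` and `top_k_labels` are hypothetical helpers, not part of transformers.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Made-up values for illustration only, not real model output.
toy_logits = [0.1, 2.5, -1.0, 1.7]
toy_id2label = {
    0: "goldfish",
    1: "tabby, tabby cat",
    2: "tiger cat",
    3: "Egyptian cat",
}

best_label, best_prob = top_k_labels(toy_logits, toy_id2label, k=1)[0]
print(best_label)  # tabby, tabby cat
```

With a real model, `outputs.logits[0].tolist()` and `model.config.id2label` could presumably be passed in place of the toy values.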

TODO

  • modeling
  • weights
  • doc
  • tests

Collaborator

@sgugger sgugger left a comment

Thanks for adding this model!

("vit_mae", "ViTFeatureExtractor"),
("segformer", "SegformerFeatureExtractor"),
("convnext", "ConvNextFeatureExtractor"),
("van", "ConvNextFeatureExtractor"),
Contributor

Re-using this for all models? ;)

You're sure it expects the same transforms at evaluation time?

Contributor Author

It was trained on ImageNet.
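
To make the exchange concrete, here is a rough sketch of what the added mapping entry implies: the "van" model type resolves to the same feature extractor class name as "convnext". The dictionary is a simplified stand-in built only from the four entries shown in the diff, and `feature_extractor_name_for` and `normalize_pixel` are hypothetical helpers, not the library's API. The mean and standard deviation are the commonly used ImageNet RGB statistics; whether resize and crop settings also match between the two models is not something this sketch checks.

```python
from collections import OrderedDict

# Entries copied from the diff under review; the real mapping is much longer.
FEATURE_EXTRACTOR_NAMES = OrderedDict(
    [
        ("vit_mae", "ViTFeatureExtractor"),
        ("segformer", "SegformerFeatureExtractor"),
        ("convnext", "ConvNextFeatureExtractor"),
        ("van", "ConvNextFeatureExtractor"),
    ]
)


def feature_extractor_name_for(model_type):
    """Hypothetical helper: look up the feature extractor class name for a model type."""
    try:
        return FEATURE_EXTRACTOR_NAMES[model_type]
    except KeyError:
        raise ValueError(f"No feature extractor registered for {model_type!r}") from None


# Commonly used ImageNet per-channel statistics (RGB, pixel values scaled to [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply per-channel (value - mean) / std to one RGB pixel in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


# Both model types share one preprocessing class in this mapping.
shared = feature_extractor_name_for("van") == feature_extractor_name_for("convnext")
print(shared)
```

Sharing one class is only safe if both checkpoints were trained with the same evaluation-time preprocessing, which is the point of the question above.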

logger = logging.get_logger(__name__)

VAN_PRETRAINED_CONFIG_ARCHIVE_MAP = {
"van-base": "https://huggingface.co/zuppif/van-base/blob/main/config.json",
Contributor

To be updated

src/transformers/models/convnext/modeling_convnext.py
src/transformers/models/poolformer/modeling_poolformer.py
src/transformers/models/vit_mae/modeling_vit_mae.py
src/transformers/models/van/modeling_van.py
Contributor

@NielsRogge NielsRogge Mar 11, 2022

Thanks for adding!

Did you check that the tests pass?

Contributor Author

It is done in the CI, isn't it?

@FrancescoSaverioZuppichini
Contributor Author

Thanks for the reviews. I've resolved all the comments that can be resolved, and I've asked the authors whether we can create an organization for them on the Hub.

@FrancescoSaverioZuppichini FrancescoSaverioZuppichini changed the title from "[WIP] Visual Attention Network (VAN)" to "Visual Attention Network (VAN)" Mar 14, 2022
@FrancescoSaverioZuppichini FrancescoSaverioZuppichini merged commit 0a05720 into master Mar 15, 2022
@FrancescoSaverioZuppichini FrancescoSaverioZuppichini deleted the modeling_van branch March 15, 2022 07:47