@FrancescoSaverioZuppichini
Contributor

@FrancescoSaverioZuppichini FrancescoSaverioZuppichini commented Feb 22, 2022

What does this PR do?

This WIP PR adds ResNet.

Currently, the model can be used as follows:

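import requests
import torch
from io import BytesIO
from PIL import Image
from transformers import AutoFeatureExtractor, ResNetForImageClassification

# Download a sample image from the test fixtures
res = requests.get('https://github.com/huggingface/transformers/blob/master/tests/fixtures/tests_samples/COCO/000000039769.png?raw=true')
image = Image.open(BytesIO(res.content))

feature_extractor = AutoFeatureExtractor.from_pretrained("Francesco/resnet50")
model = ResNetForImageClassification.from_pretrained("Francesco/resnet50").eval()

inputs = feature_extractor(image, return_tensors="pt")
with torch.no_grad():  # inference only, so no gradients are needed
    outputs = model(**inputs)
# Map the highest-scoring logit to its class name
print(model.config.id2label[torch.argmax(outputs.logits).item()])
# tiger cat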

@HuggingFaceDocBuilder

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Collaborator

@sgugger sgugger left a comment

Thanks for working on this model! It's very clean already; I just have one naming suggestion.
Make sure that the conversion script can work for someone other than yourself. I also think we could expand it to work for more checkpoints?

@FrancescoSaverioZuppichini
Contributor Author

FrancescoSaverioZuppichini commented Feb 25, 2022

I've removed the regression loss from ForImageClassification since it is not a loss used for image classification. This is open for discussion.

Collaborator

@sgugger sgugger left a comment

The regression part should be added back. There are problems that can be framed as regression from images (guessing some coordinates inside the image, the age of a subject in a picture, the angle at which a picture was taken etc.).
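
To illustrate the point, here is a minimal sketch of how a single head can cover both cases by choosing the loss from the label setup. It is written in pure Python so it runs without torch. The helper names `resolve_problem_type` and `mse_loss`, and the `labels_are_integer` flag, are hypothetical and not taken from this PR.

```python
from typing import Optional, Sequence


def resolve_problem_type(num_labels: int, labels_are_integer: bool,
                         problem_type: Optional[str] = None) -> str:
    """Pick a loss family (illustrative names, not code from this PR)."""
    if problem_type is not None:
        return problem_type  # an explicit setting always wins
    if num_labels == 1:
        return "regression"  # e.g. predicting an age or an angle
    if labels_are_integer:
        return "single_label_classification"
    return "multi_label_classification"


def mse_loss(predictions: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error, a common choice of regression loss."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Hypothetical age-regression batch: one output value per image.
kind = resolve_problem_type(num_labels=1, labels_are_integer=False)
loss = mse_loss([31.0, 58.5], [30.0, 60.0])
print(kind, loss)  # regression 1.625
```

With `num_labels=1` the head is treated as a regressor; any larger value falls back to a classification loss, so removing the regression branch would only drop the first case.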

@FrancescoSaverioZuppichini
Contributor Author

I've removed the D variant code; it will be added in a different PR. Three new model outputs have been added: BaseModelOutputWithNoAttention, BaseModelOutputWithNoAttentionAndWithPooling and ImageClassificationModelOutput.
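
As a rough sketch of what these three outputs might look like: the class names come from the comment above, but the field names are assumptions based on the usual `last_hidden_state` / `pooler_output` / `hidden_states` / `logits` convention, and plain dataclasses stand in for `ModelOutput` so the snippet runs without torch or transformers.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional, Tuple


@dataclass
class BaseModelOutputWithNoAttention:
    # Assumed fields; tensors are typed as Any to avoid a torch dependency.
    last_hidden_state: Any = None
    hidden_states: Optional[Tuple[Any, ...]] = None


@dataclass
class BaseModelOutputWithNoAttentionAndWithPooling:
    last_hidden_state: Any = None
    pooler_output: Any = None  # assumed name for the pooled feature vector
    hidden_states: Optional[Tuple[Any, ...]] = None


@dataclass
class ImageClassificationModelOutput:
    loss: Any = None
    logits: Any = None
    hidden_states: Optional[Tuple[Any, ...]] = None


def field_names(cls) -> List[str]:
    """Return the declared field names of a dataclass, in order."""
    return [f.name for f in fields(cls)]


# None of the three carries an `attentions` field, which is the point of
# the "NoAttention" naming for a convolutional model.
print(field_names(ImageClassificationModelOutput))
```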



@dataclass
class BaseModelOutputWithNoAttentionAndWithPooling(ModelOutput):
Contributor

@NielsRogge NielsRogge Mar 3, 2022

This is a horrible name... but OK I guess, since BaseModelOutputWithPooling is already taken? cc @sgugger @LysandreJik

Contributor

@NielsRogge NielsRogge left a comment

LGTM

@FrancescoSaverioZuppichini FrancescoSaverioZuppichini merged commit e3008c6 into master Mar 14, 2022
@sgugger sgugger changed the title from "[WIP] Resnet" to "Resnet" Mar 14, 2022