Using default transforms when creating embeddings index - confusion on resize and crop transform #326

Closed
lehigh123 opened this issue Nov 26, 2023 · 2 comments

Comments

@lehigh123

I am using DINOv2 with FAISS to do similarity search across a database of images. See the related DINOv2 issues on this topic.

In that thread, @patricklabatut mentions:

> @Suhail To generate features from the pre-trained backbones, just use a transform similar to the standard one used for evaluating on image classification, with the typical ImageNet normalization mean and std (see what's used in the code). Also, as noted in the model card, the model can use image sizes that are multiples of the patch size.

The transforms linked in the corresponding code apply a resize to 256 followed by a center crop of 224. Does this mean that, given a 256x256 image, 32 pixels (16 from each side) are discarded both horizontally and vertically, because the center crop is smaller than the resize?

If the images in my dataset have distinctive imagery on their borders, does this mean these default transforms will crop that information out? For concreteness, here is my understanding of that default pipeline as a torchvision sketch (the interpolation mode and normalization values are my reading of the linked code):
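
```python
from torchvision import transforms

# My reading of the linked default transform: resize the short side to 256,
# then keep only the central 224x224 region.
default_transform = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406),  # ImageNet mean
                         std=(0.229, 0.224, 0.225)),  # ImageNet std
])
# For a 256x256 input, CenterCrop(224) discards (256 - 224) / 2 = 16 pixels
# from each border, so 32 pixels per dimension never reach the model.
```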

@qasfb
Contributor

qasfb commented Nov 27, 2023

Yes, that is what it means; you can try without cropping. Maybe check what gives the best results for your case? For example, something like this untested sketch, which resizes both dimensions to a multiple of the patch size (14 for DINOv2) instead of cropping. Note that it distorts the aspect ratio rather than discarding border pixels:
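
```python
from torchvision import transforms

# Crop-free variant: resize both dimensions to 224 (a multiple of the
# DINOv2 patch size, 14), so no border pixels are dropped. This changes
# the aspect ratio instead of cropping.
no_crop_transform = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406),
                         std=(0.229, 0.224, 0.225)),
])
```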

@lehigh123
Author

Thanks! Removing the cropping gave better results for my use case (a library of book cover images). For anyone with a similar setup, here is a minimal sketch of the indexing flow (the dinov2_vits14 model and its 384-dimensional output are illustrative choices, not prescribed by this thread):
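
```python
import faiss
import torch

# Illustrative sketch: embed images with a DINOv2 backbone and index the
# L2-normalized features in FAISS so inner product == cosine similarity.
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

@torch.no_grad()
def embed(batch):
    # batch: (N, 3, H, W) tensor, already passed through the transform above
    feats = model(batch)                                  # (N, 384) for ViT-S/14
    feats = torch.nn.functional.normalize(feats, dim=-1)  # unit-length rows
    return feats.cpu().numpy().astype("float32")

index = faiss.IndexFlatIP(384)
# index.add(embed(image_batch))                  # build the index once
# scores, ids = index.search(embed(queries), 5)  # top-5 nearest covers
```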

@qasfb closed this as completed Nov 30, 2023