img has been cut off abnormally #119

Merged
merged 2 commits into from Apr 12, 2022

@TITC TITC commented Apr 12, 2022

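Assume we have an image with a width of 352 and a height of 31, and `min_dimensions` is 32 $\times$ 32. The image then satisfies the `any` condition, because its height (31) is below the minimum (32). As a result, the 352 $\times$ 31 image is pasted onto a new blank image whose size is only 32 $\times$ 32, so everything beyond the first 32 columns is cut off.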

```python
        if any([s < min_dimensions[i] for i, s in enumerate(img.size)]):
            padded_im = Image.new('L', min_dimensions, 255)
            padded_im.paste(img, img.getbbox())
            img = padded_im
```
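
To make the cut-off concrete, here is a minimal sketch that reproduces it and shows one possible way to avoid it. It assumes Pillow is installed; the helper name `pad_to_min` and the grey test image are made up for illustration, and the approach (taking the per-axis maximum of the image size and `min_dimensions`) is not necessarily the change merged in this pull request.

```python
from PIL import Image


def pad_to_min(img, min_dimensions):
    """Pad img with white so that each side is at least the given minimum.

    Hypothetical helper for illustration only.
    """
    if any(s < m for s, m in zip(img.size, min_dimensions)):
        # Per-axis maximum: a side that is already large enough keeps its size.
        padded_size = tuple(max(s, m) for s, m in zip(img.size, min_dimensions))
        padded_im = Image.new('L', padded_size, 255)
        padded_im.paste(img, (0, 0))
        img = padded_im
    return img


min_dimensions = (32, 32)
# Uniform grey stand-in for the real 352x31 input image.
img = Image.new('L', (352, 31), 128)

# Behaviour described above: the canvas is exactly min_dimensions,
# so the paste is clipped to the 32x32 canvas.
cropped = Image.new('L', min_dimensions, 255)
cropped.paste(img, img.getbbox())

# Sketch of a fix: only the height grows, the width is preserved.
fixed = pad_to_min(img, min_dimensions)
```

With these inputs, `cropped` ends up 32 pixels wide, while `fixed` keeps the full width of 352 and gains one row of white padding at the bottom.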

TITC commented Apr 12, 2022

Here is the picture we have:
[image: 163]

and the output is as below:
[image]

TITC commented Apr 12, 2022

I also wonder why we need the `minmax_size` function. Is it used to prevent the image dimensions from failing to meet the requirements of the downsampling process?
image

@lukas-blecher lukas-blecher merged commit 65b38b8 into lukas-blecher:main Apr 12, 2022
lukas-blecher commented

Thank you very much!
The idea behind the `minmax_size` function is that the ViT can only accept a discrete set of image sizes. The formula you showed above is for CNNs, right?
That's what `pad` is for, but it doesn't cover the edge case of an image that is too large.
Come to think of it, the function would be fine without any of the `min_dimensions` stuff.
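
As a rough illustration of "a discrete set of image sizes", the sketch below snaps a size up to the next multiple of a patch size and caps it at a maximum. Everything here is an assumption made for the example: the function name `snap_size`, the patch size of 16, and the maximum of 672 by 192 are hypothetical and are not taken from the repository, and the thread does not state that the accepted sizes are patch multiples.

```python
import math


def snap_size(size, patch=16, max_dimensions=(672, 192)):
    """Return the target (width, height) on a grid of patch multiples.

    Hypothetical helper: patch and max_dimensions are illustrative values.
    """
    snapped = []
    for s, m in zip(size, max_dimensions):
        # Round up so that padding alone can reach the target size.
        up = math.ceil(s / patch) * patch
        # Largest allowed multiple of the patch size on this axis.
        limit = (m // patch) * patch
        snapped.append(min(up, limit))
    return tuple(snapped)


small = snap_size((352, 31))    # height is padded up to the next multiple
large = snap_size((1000, 31))   # width exceeds the cap and must be scaled down
```

When the target is smaller than the input on some axis, as with the width of the second example, padding cannot help and the image has to be resized, which is the too-large edge case mentioned above.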

TITC commented Apr 12, 2022

Thanks for the explanation. It seems I need to carefully reread the ViT paper and its GitHub repository.

@TITC TITC deleted the patch-1 branch April 12, 2022 11:59