transforms: Invert #547
@sebastianberns looks good, but could you please provide more background on the benefits of applying invert to RGB images?
Whenever the information of interest in an image sits on a white background, this function is helpful for data preprocessing. We want a black background because a convolution pads with zeros all around by default. While this is obvious for grayscale (MNIST), it holds equally for any number of channels. (Thinking about this now, the invert function could be rewritten to simply invert all channels except the alpha channel.) That being said, I personally use the invert function to manipulate the background rather than the foreground.
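To make the zero-padding argument concrete, here is a minimal pure-Python sketch. It is not code from this PR: `conv1d_zero_padded` is a hypothetical helper, and the one-dimensional "image" and the edge kernel are assumptions chosen only for illustration. It applies a sliding-window filter (cross-correlation, as used in convolutional layers) to a row of uniform background pixels after zero padding.

```python
def conv1d_zero_padded(signal, kernel, pad=1):
    """Hypothetical helper: zero-pad a 1-D signal, then slide the kernel over it."""
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    k = len(kernel)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(padded) - k + 1)
    ]

# A row that contains nothing but background, scaled to [0, 1].
white_background = [1.0, 1.0, 1.0, 1.0]
# Inverting turns the white background black.
black_background = [1.0 - v for v in white_background]

# A simple difference filter that responds to intensity steps.
edge_kernel = [-1.0, 1.0]

white_response = conv1d_zero_padded(white_background, edge_kernel)
black_response = conv1d_zero_padded(black_background, edge_kernel)

print(white_response)  # [1.0, 0.0, 0.0, 0.0, -1.0]
print(black_response)  # [0.0, 0.0, 0.0, 0.0, 0.0]
```

With a white background, the filter responds at both borders even though the row contains no object, because the zero padding creates an artificial step. With a black background, the padding matches the background and the response is zero everywhere.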
@sebastianberns sorry for the delayed answer. I agree that this could make sense for gray-level objects where the background is homogeneous and white. I still have doubts about inverting RGB images. Take your demo as an example, where a frog is inverted. In such cases the network will see green and brownish frogs as well as blue and magenta ones, and it will probably learn only geometric features rather than color features to distinguish frogs.
Please take as much time as you need for reviews and replies. Also, please understand this PR as my way of sharing what I thought useful. Ultimately it is the maintainers' decision whether to add a feature. I'm not familiar with the corresponding policies. My demo is an illustration of the feature's capability, not an exposition of its purpose. Your doubts about inverting color images with bleed (that is, a photo that covers the whole image area with no uniform margin) are reasonable, of course: inverting in such a case is not necessary, and it is not the use case I have in mind. Since you seem to accept this feature's usefulness for grayscale images without question, let me insist that inversion is useful because of a uniform (usually white) background, independent of the image mode or number of channels (1, L, RGB, LAB, …). Even if the frog image you reference were black and white, inverting it would still make no difference. But imagine the same frog sitting on a white background instead of the forest floor (similar to the data in Fashion-MNIST and ShapeNet that I have mentioned). Then, I argue, you'd want to invert it. If you were to reject color inversion, I can offer to rewrite the transform to allow only single-channel images.
In the meantime, I have set up a separate repository to save this custom transform for future reference.
I'm not convinced about this transformation. If the main use case is to turn a white background black, this would be better as a user layer for a particular application, and I'm not 100% sure that it's general enough to be merged into torchvision. Thanks for the PR though!
Hi!
I wrote a transform that inverts grayscale images from a custom dataset for my own use.
Since I believe this might be useful to others, I wanted to share this extended version, which accepts PIL images of modes "L", "LA", "RGB" and "RGBA".
I have further written a unit test which is included in the PR.
A simple demo is available here.
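For readers without access to the diff, the following is a rough sketch of what such a transform could look like. It is not the code from this PR: the class body is an assumption built on Pillow's `ImageOps.invert`, `Image.split` and `Image.merge`, and it leaves the alpha channel untouched, as suggested in the discussion, which may differ from the submitted version.

```python
from PIL import Image, ImageOps


class Invert:
    """Illustrative sketch: invert the color channels of a PIL image.

    Supports modes "L", "LA", "RGB" and "RGBA". Alpha is left unchanged
    (an assumption; the submitted transform may behave differently).
    """

    def __call__(self, img):
        if img.mode in ("L", "RGB"):
            # ImageOps.invert handles these modes directly.
            return ImageOps.invert(img)
        if img.mode in ("LA", "RGBA"):
            # Split off the alpha band, invert the rest, then reassemble.
            *color_bands, alpha = img.split()
            inverted = [ImageOps.invert(band) for band in color_bands]
            return Image.merge(img.mode, inverted + [alpha])
        raise ValueError(f"Unsupported image mode: {img.mode}")


# Usage: a white, half-transparent image becomes black with the same alpha.
white = Image.new("RGBA", (2, 2), (255, 255, 255, 128))
black = Invert()(white)
```

Because it is a plain callable, a class like this could be placed in a `torchvision.transforms.Compose` pipeline ahead of `ToTensor`, which is also how it can be used as a user-side transform outside of torchvision.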