Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iaa.Grayscale may be more intuitive if it defaulted to alpha=1 #315

Closed
Erotemic opened this issue May 16, 2019 · 7 comments · Fixed by #582
Closed

iaa.Grayscale may be more intuitive if it defaulted to alpha=1 #315

Erotemic opened this issue May 16, 2019 · 7 comments · Fixed by #582
Labels
TODO Planned to be added to the library or changed in the future

Comments

@Erotemic
Copy link
Contributor

I'm writing this issue to propose a minor API change.

When using iaa.Grayscale, I was initially calling it with all default parameters and I was confused as to why my images still had color. Upon inspection of the docstring, I discovered that alpha defaults to zero, which explained the behavior I was seeing.

While this is all perfectly sensible and consistent, my feeling is that it may be more intuitive to have the default parameters for iaa.Grayscale actually perform the de-saturation operation rather than defaulting to a no-op.

While this is not a huge API change, it is still and API change, and this library is widely used, so I understand concerns about that. However, the library hasn't hit 1.0, yet so if the maintainer(s) think this proposal is sensible, now is the time to do it.

@aleju
Copy link
Owner

aleju commented May 16, 2019

I guess this is similar to the request in aleju/imgaug-doc#7 .
Currently, most of the augmenters have default argument values that initialize the augmenters to do nothing. I agree that this isn't very reasonable, so one of the future versions will likely change these defaults by either

  1. removing them (which forces the library user to set them manually)
  2. or by using defaults that are optimal for augmentation (in the case of Grayscale that could be alpha=(0.0, 1.0) or in the case of Fliplr it could be p=0.5)
  3. or by using defaults that are sensible outside of augmentation (in the case of Grayscale that could be alpha=1.0 or in the case of Flipr it could be p=1.0).

Note sure yet which one of these choices is the best one in the long run. (1) Removing them is annoying for the user, (2) setting defaults that are sensible for augmentation can lead to very confusing bugs, and (3) setting the defaults for non-augmentation looks elegantly, but incurs risk that people train accidentally with bad augmentations.
Maybe (3) would work with better debugging tooling to e.g. regularly save the generated augmentations to image files for visual inspection. Then again, many users would probably not use such tooling.

@Erotemic
Copy link
Contributor Author

(2) setting defaults that are sensible for augmentation can lead to very confusing bugs

How so? I feel like if a user adds an augmenter they intent it to do something. Of course not all augmentations are appropriate for all images, and "sensible values" may just clobber all the image information in some cases. However, the user should expect that augmenters are doing something and if they see weird bugs I would think the natural thing to do is turning off augmenters to find the offending one.

I personally like option 2 the best.

@aleju
Copy link
Owner

aleju commented May 17, 2019

(2) can be confusing if you use augmenters on their own.
E.g. if you only want to train on grayscale images and do something like

image = imageio.imread(image_path)
image_gray = Grayscale()(image=image)

then you might expect your loaded image to be in grayscale, assuming that the grayscale augmenter is defined as Grayscale(alpha=1.0). But if that augmenter is defined as Grayscale(alpha=(0.0, 1.0)) you may sometimes end up with high alpha values (i.e. images that look like they were grayscaled) and sometimes with low alpha values (i.e. images that look like they were not grayscaled). Depending on where the operation is placed, it could seem like the operation was not executed at all, so you make some small changes to fix the problem, run the script again and see that now the image is in grayscale, leading you to believe that your changes fixed the problem. Of course it still remains, because you just randomly hit a grayscaled image. Later on you see that the image again is not in grayscale, so you assume that you broke your "fixed" code again, so you try to re-apply the fix and end up confused why it doesn't work anymore, when it did in the past. So you try something else and seemingly it is fixed again, and so on...

For grayscale that isn't that bad as there is a high chance of hitting an alpha around 0.5, giving away that there is something "wrong" with the augmenter. But e.g. Fliplr at 50% probability does either flip or not flip, making it harder to debug when you aren't aware that it has an associated probability of being active.

@Erotemic
Copy link
Contributor Author

That makes sense. I'm familiar with this use case because I do sometimes use augmenters on their own. I can see how a non-noop defaults may cause confusion for developers. However, this is a catch-22 situation, because noop defaults also cause confusion for developers (i.e. "I used an augmenter, why didn't it do anything?"). Of course, the ultimate solution to both of these issues is documentation (which imgaug does an excellent job at providing), but I think tweaking the exiting defaults may result in a better tradeoff and reduce the total confusion encountered when using a new library.

I've been giving this some thought, and I think the best choice might be on a per-augmenter basis. For instance, I don't think there is any "reasonable" default setting for something like iaa.Affine, which can have lots of different effects. However, for Grayscale I do think a default where it does something (whether that be alpha=1 or alpha=(0, 1)) makes the most sense. I think the main distinction is in the number of different effects an augmenter can have --- i.e. the parameter dimensional of the augmenter.

All this is well and good, but just stating opinions isn't very useful for the growth of this library. Instead, perhaps its best to move this discussion towards creating a guideline for setting augmenter defaults.

Here is my proposal:


Modivation: imgaug is a library for stochastic on-the-fly image augmentation targeted towards training deep neural networks. It should be as simple as possible to create augmenters and compose custom augmentation schemes. As a guiding principle most augmenters are given defaults such that they have some small effect. It should be emphasized that while augmenters can be seeded to behave deterministically, they are implicitly random, and thus without explicit intervention, are likely to behave differently in each subsequent run.

  • When to use non-identity default values: For augmenters that either have one very specific effect (like FlipLR, Invert, Rot90), or an effect on a 1-dimensional spectrum (like Grayscale, GaussianBlur), it makes more sense to default to a value where they are doing something. This allows the user to write simpler coder that more effectively communicates high level ideas of what the augmentation strategy is.

  • When to use noop default values: For augmenters that have multiple important parameters (like Affine), it makes the most sense to use defaults that cause no-effect. This prevents the user from having to zero out parameters in the cases where the user does not want them (e.g. the user wants translation, and rotation, but no scaling).

// Note: Affine is the only augmenter I can think of that really think of that meets this criterion. PerspectiveTransform seems to me like it should actually have a small perturbing default.

  • When to not provide any default values: For augmenters that require specific information about a use case to be useful (like Resize, Convolve, ChangeColorspace). In these cases sensible defaults cannot be chosen and it is always up to the user to provide some arguments.

My main point is that I think a user should simply be able to more-or-less blindly construct a list of augmenters without setting any parameters and (1) be able to immediately see them do something and (2) have that something be potentially useful for training a network.

I'm currently doing this with the Clouds, Fog, and FastSnowyLandscape augmenters. I have only a very vague idea of how they might be working under the hood, but I was able to throw them into a Sequential augmenter and see that they produced very reasonable results (while also impressing the customer).

Anyway, let me know what you think.

@aleju
Copy link
Owner

aleju commented May 18, 2019

These proposals seem very sensible to me. I guess I will try to improve the defaults for the next version according to these guidelines.

My main point is that I think a user should simply be able to more-or-less blindly construct a list of augmenters without setting any parameters and (1) be able to immediately see them do something and (2) have that something be potentially useful for training a network.

So I assume that means that you would be more in favor of gearing the defaults towards augmentation, e.g. Grayscale(alpha=(0.0, 1.0)) instead of Grayscale(alpha=1.0)? I guess there is no way to set the defaults optimally for all use cases.

I'm currently doing this with the Clouds, Fog, and FastSnowyLandscape augmenters. I have only a very vague idea of how they might be working under the hood, but I was able to throw them into a Sequential augmenter and see that they produced very reasonable results (while also impressing the customer).

For FastSnowyLandscape there is a link in the docstring to an article that explains the arguments. As far as I recall the algorithm was fairly simple. For the other two the arguments are currently on the wrong abstraction level and go way too much into detail. I also always use the defaults because of that.

@Erotemic
Copy link
Contributor Author

So I assume that means that you would be more in favor of gearing the defaults towards augmentation, e.g. Grayscale(alpha=(0.0, 1.0)) instead of Grayscale(alpha=1.0)? I guess there is no way to set the defaults optimally for all use cases.

Tradeoffs are the name of the game in Computer Science 😄. And yes, I think the alpha=(0.0, 1.0) makes the most sense according to these guidelines. I also have a hunch that it is Pareto efficient for the users of this library, which would be nice.

For FastSnowyLandscape there is a link in the docstring to an article that explains the arguments. As far as I recall the algorithm was fairly simple. For the other two the arguments are currently on the wrong abstraction level and go way too much into detail. I also always use the defaults because of that.

I looked into the link in the docstring, and I think I have a decent understanding of the algorithm. My main point was that its nice that I don't have to really understand them to use them. While it only took me 5-10 minutes to click through the links, read the article, and grok it, its not always possible to devote that much time. I appreciate the design where I can simply drop it into my Sequential and see immediate results.

@aleju
Copy link
Owner

aleju commented May 20, 2019

Tradeoffs are the name of the game in Computer Science 😄. And yes, I think the alpha=(0.0, 1.0) makes the most sense according to these guidelines. I also have a hunch that it is Pareto efficient for the users of this library, which would be nice.

Yes, it's probably the best possible design choice. Will change it that way.

I looked into the link in the docstring, and I think I have a decent understanding of the algorithm. My main point was that its nice that I don't have to really understand them to use them. While it only took me 5-10 minutes to click through the links, read the article, and grok it, its not always possible to devote that much time. I appreciate the design where I can simply drop it into my Sequential and see immediate results.

I see, makes sense!

@aleju aleju added the TODO Planned to be added to the library or changed in the future label Nov 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TODO Planned to be added to the library or changed in the future
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants