Add uniform noise op #554

gselzer · 2018-04-19T20:39:42Z

This PR adds an op to add uniformly distributed noise throughout an image, as an alternative to the non-uniform AddGaussianNoise op, renamed in this PR from AddNoise.

imagejan · 2018-04-27T07:27:26Z

src/main/java/net/imagej/ops/filter/addUniformNoise/AddUniformNoiseRealType.java

+		do {
+			final double newVal = (rangeMax - rangeMin) * rng.nextDouble(true, true) +
+				rangeMin + input.getRealDouble();
+			if (newVal <= input.getMaxValue() && newVal >= input.getMinValue()) {


I wonder what would be the least surprising behavior here. When I have an UnsignedByteType image with all pixels at 255, this would create negative noise only, thereby shifting the image mean to be <255. When adding any noise, I'd expect the image mean to be essentially unchanged. Although that's a bit special case, how about making it optional if the result should be 1) clipped, 2) overflowing, or 3) some smarter way of adding noise, such as the one here?

Thanks for the reply!

I think that clipping can be pretty easily done by manipulating the rangeMin and rangeMax parameters. rangeMax just has to be set to 0 in this scenario.

I do not understand what you mean by overflow. Are you suggesting that you would like to have values in the ouptut that are greater than 255? If so you can just convert your image to ShortType or something else that allows larger values. Or are you meaning that it would be desired to wrap values with positive noise values back around to the lower values of the type (i.e. if given rangeMin = -4 and rangeMax = 4 we generate a noise value of 4 for your input of 255, thus the output is set to 3)? If so this is not correct, seeing as output < rangeMin. Can you elaborate on what you mean, if you mean something different?

I guess by this I mean that I do not see any reason to add these options, since it all can be controlled by the type of the input and the rangeMin and rangeMax parameters. And you are right in that the mean of your example will be less than 255. However in this scenario I think that is expected, because there is no way to go above the starting value.

Yes, sorry, when I said "overflow", I meant wrapping values from max to min and vice versa, e.g. 255 + 4 -> 3. When working with ImgLib2 types (such as UnsignedByteType), this is the observed behavior, and it actually comes in handy when working with cyclic colormaps (see also http://peterkovesi.com/projects/colourmaps/) to represent e.g. directionality. For that special case, it would be nice to be able to choose between wrapping, clipping, or possibly type-widening behavior for ops in general (from a user perspective).

ctrueden · 2018-05-01T18:38:59Z

When working with ImgLib2 types (such as UnsignedByteType), this is the observed behavior, and it actually comes in handy when working with cyclic colormaps

I suggest caution: there is no guarantee that all ImgLib2 types will behave in this way. (@tpietzsch @axtimwalde please correct me if I am wrong. ) Behavior in this case is left up to the implementation. E.g., IIRC, there are some types that when you setReal(value) them with an out-of-bounds value, they get corrupted somehow. @awalter17 do you remember?

awalter17 · 2018-05-01T21:00:58Z

there are some types that when you setReal(value) them with an out-of-bounds value, they get corrupted somehow. @awalter17 do you remember?

I don't remember a problem with setReal (sorry!), but there was inconsistency with the out-of-bounds behavior (some wrapped, others clamped). I think most of them have masking/wrapping behavior now (imglib/imglib2#96 - this was for constructors but I think it affected the setters as well). That being said, I don't think RealType guarantees this wrapping behavior for setReal or any of the setters.

imagejan · 2018-05-02T06:55:10Z

I suggest caution: there is no guarantee that all ImgLib2 types will behave in this way.

Yes, I figured it would be undefined behavior (as always in ImgLib2, it's the callers responsibility to conform to types and intervals...), that's why i said "observed" behavior 😄 Ok, I will not rely on that, but the "cyclic scale" use case came to my mind as something where this would be really useful.

when you setReal(value) them with an out-of-bounds value

BTW, are there terms to distinguish between out-of-bounds for intervals (i.e. x,y,z,...) and out-of-bounds for types (i.e. values)?

Anyhow, this is now far off-topic for this PR. I suggest to simply ignore my comments (and maybe just keep in mind that it would be useful to have consistent clipping/wrapping behavior across ops in general).

This op was renamed to make it clear that the noise generated by this op is NON-uniform. This distinction is being made to separate this op from a uniform noise generation op (coming in a later commit)

In previous commits the op did not allow input image data to influence the output values. Now the algorithm allows, given a nonzero input image (in which all of the pixels do not have to be the same value), the input image to influence the output noise, i.e. if it is desired to add uniform noise with range [-4, 4] upon an imported image, the algorithm will now add a value x such that -4 <= x <= 4 to each pixel in the image, whatever value that may be, as long as it does not extend beyond the range of the type.

Previously the Op would recompute the calculation whenver the Mersenne Twister provided a random value that would result in the input going outside of the type bounds, in the hopes that another computation would provide a result that was inside the type bounds. Not only can this be slow, but also does not line up with what the user might expect. Suppose that the input to the Op is an UnsignedByteType of value 254, with rangeMin=0 and rangeMax=4. Since we have uniform noise, we would expect that the probability that the output has an equal probability of being 254, 255, 256, 257, 258 (with a probability of 0.2 for each). But we know that this UnsignedByteType cannot hold a value greater than 255. If the Op redoes the calculation every time the output is greater than 255, then we know that the output will be 254 half of the time and 255 half of the time. Therefore no probability matches what might be expected. Consider the same example, but instead of a recomputation we clamp the output when it is outside the type bounds. We now have P(output=254)=0.2, P(output=255)=0.8. In this case, all outputs within the type range will have the expected probability of occurring, which makes more sense than the scenario explained above. This is why the change was made.

gselzer · 2020-05-19T16:53:43Z

Alright, I gave this PR some TLC. @imagejan, after some thinking I realized that I liked your solution a bit better, as I described in 5440061. I found it difficult to treat floating point and integer types the same, however (see 04af3fe), so I created two different Ops.

I did, however, leave the computations for the integer type Op in doubles, however, to allow for some support for some of our larger IntegerTypes. The only other option that I saw was to convert everything to BigIntegers, but I did not want to do that for performance reasons. (Note that we cannot simply do this all using Imglib math because that would leave us open to all of the issues of the edge behavior of each type) Let me know what you think.

This allows us to handle integer types and realTypes separately. The reason for this is as follows: It seems clear that we would want to allow floating point randomness when our type allows floating points If we always base our calculations on floating points, this results in edge effects when we add noise to integer types. This is due to rounding: suppose we have a pseudorandomly generated double between 0 and 2. Most IntegerTypes simply round that double to a long, integer, etc in setReal. Thus the probability we add 0 to the input is 0.25, 1 to the input is 0.5, and 2 to the input is 0.25. To get around this case we want to generate the random number differently when we have an integer type. For these reasons we have two different Ops to perform the different noise generation and have shared logic for setting the output. N.B. We do not allow wrapping for floating point types. This is because we operate within the following constraint: the number after the max of the type should map to the minimum of the type. In the case of floating points, what then should happen to a number equal to 0.5 + the maximum of the type? In this paradigm, we would expect it to be the minimum of the type - 0.5; this is impossible, however, since this would be outside of the range of the type. For this reason, we always clamp.

The default seed resulted in non-random noise. Removing it ensures that the result is different every time (unless the user passes a seed)

gselzer · 2020-05-19T19:05:45Z

I think the way setting the seed parameter works should be flipped, i.e. don't use a constant seed unless otherwise specified.

Thanks for catching that @rimadoma! Should be updated in the uniform noise Ops. It looks like AddGaussianNoise and AddPoissonNoise do this as well, should we alter them with the risk of breaking backwards compatibility?

Also Random is not thread safe.

Sure, yes. I used MersenneTwisterFast at @ctrueden's recommendation (which is thread safe according to the Javadoc). We can switch this in the other noise Ops as well if we decide it is okay to do so.

ctrueden · 2020-05-20T02:42:21Z

The previous default seed parameter was correct. One rule of ops is that they are deterministic. If you want TRULY RANDOM noise in your script, then you must generate a random number (externally to ops) and pass that as the seed. I know it's annoying, but it's one of the guarantees of ops.

Therefore, we also need to be careful about multithreaded pseudorandomness. Just because MersenneTwisterFast is thread-safe doesn't mean the random order will be deterministic per thread. You need to use a separate instance of the RNG per thread. Happy to discuss more tomorrow during our meeting, @gselzer.

imagejan reviewed Apr 27, 2018

View reviewed changes

ctrueden added the to do label Oct 3, 2018

gselzer and others added 6 commits May 18, 2020 11:36

AddNoise: Rename to AddGaussianNoise

dc92582

This op was renamed to make it clear that the noise generated by this op is NON-uniform. This distinction is being made to separate this op from a uniform noise generation op (coming in a later commit)

Add AddUniformNoise Op.

e2895d9

Cleanup / format

7121444

Update Javadoc for cleaner explanation

4aa53c6

gselzer force-pushed the addUniformNoise branch 2 times, most recently from 67f7e97 to 04af3fe Compare May 19, 2020 16:46

gselzer added 2 commits May 19, 2020 12:24

Add tests for Uniform Noise op

0bc3d3d

gselzer force-pushed the addUniformNoise branch from 90e1436 to 0bc3d3d Compare May 19, 2020 17:36

Remove default seed

d134a6c

The default seed resulted in non-random noise. Removing it ensures that the result is different every time (unless the user passes a seed)

gselzer mentioned this pull request Jan 9, 2024

Uniform Noise Adder using LoopBuilder scijava/incubator#115

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add uniform noise op #554

Add uniform noise op #554

gselzer commented Apr 19, 2018 •

edited

imagejan Apr 27, 2018

gselzer Apr 30, 2018 •

edited

imagejan May 1, 2018

ctrueden commented May 1, 2018

awalter17 commented May 1, 2018

imagejan commented May 2, 2018 •

edited

gselzer commented May 19, 2020

gselzer commented May 19, 2020 •

edited

ctrueden commented May 20, 2020 •

edited

Add uniform noise op #554

Are you sure you want to change the base?

Add uniform noise op #554

Conversation

gselzer commented Apr 19, 2018 • edited

imagejan Apr 27, 2018

Choose a reason for hiding this comment

gselzer Apr 30, 2018 • edited

Choose a reason for hiding this comment

imagejan May 1, 2018

Choose a reason for hiding this comment

ctrueden commented May 1, 2018

awalter17 commented May 1, 2018

imagejan commented May 2, 2018 • edited

gselzer commented May 19, 2020

gselzer commented May 19, 2020 • edited

ctrueden commented May 20, 2020 • edited

gselzer commented Apr 19, 2018 •

edited

gselzer Apr 30, 2018 •

edited

imagejan commented May 2, 2018 •

edited

gselzer commented May 19, 2020 •

edited

ctrueden commented May 20, 2020 •

edited