Fix cellnoise bug #912

lgritz · 2018-08-29T05:48:48Z

WARNING: possible visible change in procedural patterns

It's expected that cellnoise(x) == cellnoise(floor(x)), but we found an
ugly bug wherein cellnoise internally used a helper function
quick_floor, which is supposed to take the floor of a float argument and
return its integer equivalent. But unfortunately it had a really odd
error: when x was an exact negative integer, it gave the value for the
next smaller integer.

quick_floor( 1.00) ==  1
quick_floor( 0.01) ==  0
quick_floor( 0.00) ==  0
quick_floor(-0.01) == -1
quick_floor(-0.99) == -1
quick_floor(-1.00) == -2    // WRONG!!! Should be -1
quick_floor(-1.01) == -2

This in turn caused cellnoise lookups of exact negative integer values
to be incorrect lookups of the next lower-numbered cell, i.e.,
cellnoise(i) having the same value as cellnoise(i-epsilon), but again,
only for i<0 and i==floor(i) exact integer values.

Now, if your shader made a visual representation of callnoise(x,y), it
would look just fine, unless a particular x,y fell exactly on integer
coordinates, in which case it would be ever so slightly off (by being
the color of the square immedately to the left instead of to the right,
as it should be, say). Nobody would ever notice that.

Also, if your shader took the floor before passing to cellnoise, like:
cellnoise(floor(x),floor(y)), it would also look correct, though the
negative values would be wrong, since cellnoise(-1,-1) would have
returned the value that should have been cellnoise(-2,-2), etc., but
you never would have noticed that.

But... there were some possible uses of cellnoise where it was behaving
very strangely, and the main place you would notice this is certain
types of "voronoi" noises where you are lookup up cellnoise values while
looping over adjacent lattice points. (Thanks, Zap, for noticing this,
and so sorry that there was a legit bug here!)

It was a very slight possible problem for perlin style noise as well,
but since those noises are "continuous" and defined by gradients, you
really, really, would not have known that the occasional pixel had noise
results that were ever so slightly off in its derivatives only at those
isolated integer lattice points. I think no visible artifacts were
discernable under real-world conditions.

The solution is to bite the bullet and replace the clever (but extremely
minor) speedup for computing floor, with a legit floor call. It's ever
so slightly more expensive if you isolate the cost of just raw cellnoise
calls, but in the context of a real render I don't think you'll be able
to measure it.

WARNING: This may change the appearance of certain "Voronoi" patterns,
or possibly other uses of cellnoise as random number sequences. It also
changes the appearance of "Gabor" noise, which internally also used the
faulty quick_floor in such a way that it affects the whole pattern.

Note that this should NOT change the appearance of any of the
perlin-style noises, and you may not notice any change for most ordinary
uses of cellnoise, either.

I changed the oslnoise unit tests to check that this is working
properly, and updated all of the noise unit tests that generate images
to "center" the patterns so that they visibly display the negative
regions of the noise domain (they had all shown only positive domain,
which is probably inviting trouble if there was ever a noise bug that
only affects what happens for negative values or right on the axes).
This caused a lot of the reference images to change, even for the tests
of noise patterns that were unaffected. Also, I took the chance to merge
a few of the signed/unsigned tests for less repetition.

I'm very sorry about this; changing the appearance of noise functions is
not something we take lightly, we do it only as a last resort. It was
clearly wrong before and had some very unexpected failure modes. But I'm
only making this fix in master, which is destined to be "OSL 2.0", so I
figure it's as good a time as any to make a change that could possibly
make visible changes in some shaders.

If you see this as some kind of major betrayal that will screw up your
productions, let me know and if pressed I can perhaps make some kind of
emulation of the old buggy behavior for people who need to preserve it.
But I'd rather not do that if it's not absolutely necessary.

fpsunflower · 2018-08-29T16:39:50Z

src/include/OSL/oslnoise.h

@@ -145,24 +145,13 @@ namespace {

 // return the greatest integer <= x
 inline OSL_HOSTDEVICE int quick_floor (float x) {


Is there any point in having this wrapper now?

fpsunflower · 2018-08-29T16:40:30Z

src/include/OSL/oslnoise.h

@@ -145,24 +145,13 @@ namespace {

 // return the greatest integer <= x
 inline OSL_HOSTDEVICE int quick_floor (float x) {
-    return (int) x - ((x < 0) ? 1 : 0);
+    return (int)floorf(x);
 }

 #ifndef __CUDA_ARCH__
 // return the greatest integer <= x, for 4 values at once
 OIIO_FORCEINLINE int4 quick_floor (const float4& x) {


Same here ... why not just call ifloor directly everywhere?

OK, updated the PR with this rename as requested. The reason I didn't include it at first was that it swamps and obscures the actual bug fix, making the review confusing. But if you like the rename, I'll merge it together.

fpsunflower · 2018-08-29T16:42:24Z

Otherwise LGTM!

fpsunflower · 2018-08-30T15:33:24Z

LGTM!

I like this way better not just because it tidies up the redundant name, but it also makes sure you saw all the places that were actually using the buggy function.

lgritz · 2018-08-30T19:38:12Z

Hmmm... I guess the other potential reason to leave it as it was is if somebody chimes in later with a "holy crap, I need the old pattern", it would make it very easy to have a build-time switch that restores the old behavior. Not so easy once everything is changed to ifloor.

WARNING: possible visible change in procedural patterns It's expected that cellnoise(x) == cellnoise(floor(x)), but we found an ugly bug wherein cellnoise internally used a helper function quick_floor, which is supposed to take the floor of a float argument and return its integer equivalent. But unfortunately it had a really odd error: when x was an exact negative integer, it gave the value for the next smaller integer. quick_floor( 1.00) == 1 quick_floor( 0.01) == 0 quick_floor( 0.00) == 0 quick_floor(-0.01) == -1 quick_floor(-0.99) == -1 quick_floor(-1.00) == -2 // WRONG!!! Should be -1 quick_floor(-1.01) == -2 This in turn caused cellnoise lookups of exact negative integer values to be incorrect lookups of the next lower-numbered cell, i.e., cellnoise(i) having the same value as cellnoise(i-epsilon), but again, only for i<0 and i==floor(i) exact integer values. Now, if your shader made a visual representation of callnoise(x,y), it would look just fine, unless a particular x,y fell *exactly* on integer coordinates, in which case it would be ever so slightly off (by being the color of the square immedately to the left instead of to the right, as it should be, say). Nobody would ever notice that. Also, if your shader took the floor before passing to cellnoise, like: cellnoise(floor(x),floor(y)), it would also *look* correct, though the negative values would be wrong, since cellnoise(-1,-1) would have returned the value that should have been cellnoise(-2,-2), etc., but you never would have noticed that. But... there were some possible uses of cellnoise where it was behaving very strangely, and the main place you would notice this is certain types of "voronoi" noises where you are lookup up cellnoise values while looping over adjacent lattice points. (Thanks, Zap, for noticing this, and so sorry that there was a legit bug here!) It was a very slight possible problem for perlin style noise as well, but since those noises are "continuous" and defined by gradients, you really, really, would not have known that the occasional pixel had noise results that were ever so slightly off in its derivatives only at those isolated integer lattice points. I think no visible artifacts were discernable under real-world conditions. The solution is to bite the bullet and replace the clever (but extremely minor) speedup for computing floor, with a legit floor call. It's ever so slightly more expensive if you isolate the cost of just raw cellnoise calls, but in the context of a real render I don't think you'll be able to measure it. WARNING: This may change the appearance of certain "Voronoi" patterns, or possibly other uses of cellnoise as random number sequences. It also changes the appearance of "Gabor" noise, which internally also used the faulty quick_floor in such a way that it affects the whole pattern. Note that this should NOT change the appearance of any of the perlin-style noises, and you may not notice any change for most ordinary uses of cellnoise, either. I changed the oslnoise unit tests to check that this is working properly, and updated all of the noise unit tests that generate images to "center" the patterns so that they visibly display the negative regions of the noise domain (they had all shown only positive domain, which is probably inviting trouble if there was ever a noise bug that only affects what happens for negative values or right on the axes). This caused a lot of the reference images to change, even for the tests of noise patterns that were unaffected. Also, I took the chance to merge a few of the signed/unsigned tests for less repetition. I'm very sorry about this; changing the appearance of noise functions is not something we take lightly, we do it only as a last resort. It was clearly wrong before and had some very unexpected failure modes. But I'm only making this fix in master, which is destined to be "OSL 2.0", so I figure it's as good a time as any to make a change that could possibly make visible changes in some shaders. If you see this as some kind of major betrayal that will screw up your productions, let me know and if pressed I can perhaps make some kind of emulation of the old buggy behavior for people who need to preserve it. But I'd rather not do that if it's not absolutely necessary.

lgritz force-pushed the lg-noisebug branch from ffd427b to 23ff5fa Compare August 29, 2018 05:48

fpsunflower reviewed Aug 29, 2018

View reviewed changes

lgritz force-pushed the lg-noisebug branch from 23ff5fa to e3f6ebd Compare August 29, 2018 19:50

lgritz force-pushed the lg-noisebug branch from ebe1b6c to dd6d85a Compare September 4, 2018 23:47

lgritz added 2 commits September 5, 2018 09:55

Replace quick_floor with ifloor (912)

5ec807c

lgritz force-pushed the lg-noisebug branch from dd6d85a to 5ec807c Compare September 5, 2018 16:55

lgritz merged commit 5ec807c into AcademySoftwareFoundation:master Sep 5, 2018

lgritz deleted the lg-noisebug branch September 14, 2018 19:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix cellnoise bug #912

Fix cellnoise bug #912

lgritz commented Aug 29, 2018

fpsunflower Aug 29, 2018

fpsunflower Aug 29, 2018

lgritz Aug 30, 2018

fpsunflower commented Aug 29, 2018

fpsunflower commented Aug 30, 2018

lgritz commented Aug 30, 2018

		@@ -145,24 +145,13 @@ namespace {

		// return the greatest integer <= x
		inline OSL_HOSTDEVICE int quick_floor (float x) {

Fix cellnoise bug #912

Fix cellnoise bug #912

Conversation

lgritz commented Aug 29, 2018

fpsunflower Aug 29, 2018

Choose a reason for hiding this comment

fpsunflower Aug 29, 2018

Choose a reason for hiding this comment

lgritz Aug 30, 2018

Choose a reason for hiding this comment

fpsunflower commented Aug 29, 2018

fpsunflower commented Aug 30, 2018

lgritz commented Aug 30, 2018