Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cellnoise bug #912

Merged
merged 2 commits into from
Sep 5, 2018
Merged

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Aug 29, 2018

WARNING: possible visible change in procedural patterns

It's expected that cellnoise(x) == cellnoise(floor(x)), but we found an
ugly bug wherein cellnoise internally used a helper function
quick_floor, which is supposed to take the floor of a float argument and
return its integer equivalent. But unfortunately it had a really odd
error: when x was an exact negative integer, it gave the value for the
next smaller integer.

quick_floor( 1.00) ==  1
quick_floor( 0.01) ==  0
quick_floor( 0.00) ==  0
quick_floor(-0.01) == -1
quick_floor(-0.99) == -1
quick_floor(-1.00) == -2    // WRONG!!! Should be -1
quick_floor(-1.01) == -2

This in turn caused cellnoise lookups of exact negative integer values
to be incorrect lookups of the next lower-numbered cell, i.e.,
cellnoise(i) having the same value as cellnoise(i-epsilon), but again,
only for i<0 and i==floor(i) exact integer values.

Now, if your shader made a visual representation of callnoise(x,y), it
would look just fine, unless a particular x,y fell exactly on integer
coordinates, in which case it would be ever so slightly off (by being
the color of the square immedately to the left instead of to the right,
as it should be, say). Nobody would ever notice that.

Also, if your shader took the floor before passing to cellnoise, like:
cellnoise(floor(x),floor(y)), it would also look correct, though the
negative values would be wrong, since cellnoise(-1,-1) would have
returned the value that should have been cellnoise(-2,-2), etc., but
you never would have noticed that.

But... there were some possible uses of cellnoise where it was behaving
very strangely, and the main place you would notice this is certain
types of "voronoi" noises where you are lookup up cellnoise values while
looping over adjacent lattice points. (Thanks, Zap, for noticing this,
and so sorry that there was a legit bug here!)

It was a very slight possible problem for perlin style noise as well,
but since those noises are "continuous" and defined by gradients, you
really, really, would not have known that the occasional pixel had noise
results that were ever so slightly off in its derivatives only at those
isolated integer lattice points. I think no visible artifacts were
discernable under real-world conditions.

The solution is to bite the bullet and replace the clever (but extremely
minor) speedup for computing floor, with a legit floor call. It's ever
so slightly more expensive if you isolate the cost of just raw cellnoise
calls, but in the context of a real render I don't think you'll be able
to measure it.

WARNING: This may change the appearance of certain "Voronoi" patterns,
or possibly other uses of cellnoise as random number sequences. It also
changes the appearance of "Gabor" noise, which internally also used the
faulty quick_floor in such a way that it affects the whole pattern.

Note that this should NOT change the appearance of any of the
perlin-style noises, and you may not notice any change for most ordinary
uses of cellnoise, either.

I changed the oslnoise unit tests to check that this is working
properly, and updated all of the noise unit tests that generate images
to "center" the patterns so that they visibly display the negative
regions of the noise domain (they had all shown only positive domain,
which is probably inviting trouble if there was ever a noise bug that
only affects what happens for negative values or right on the axes).
This caused a lot of the reference images to change, even for the tests
of noise patterns that were unaffected. Also, I took the chance to merge
a few of the signed/unsigned tests for less repetition.

I'm very sorry about this; changing the appearance of noise functions is
not something we take lightly, we do it only as a last resort. It was
clearly wrong before and had some very unexpected failure modes. But I'm
only making this fix in master, which is destined to be "OSL 2.0", so I
figure it's as good a time as any to make a change that could possibly
make visible changes in some shaders.

If you see this as some kind of major betrayal that will screw up your
productions, let me know and if pressed I can perhaps make some kind of
emulation of the old buggy behavior for people who need to preserve it.
But I'd rather not do that if it's not absolutely necessary.

@@ -145,24 +145,13 @@ namespace {

// return the greatest integer <= x
inline OSL_HOSTDEVICE int quick_floor (float x) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any point in having this wrapper now?

@@ -145,24 +145,13 @@ namespace {

// return the greatest integer <= x
inline OSL_HOSTDEVICE int quick_floor (float x) {
return (int) x - ((x < 0) ? 1 : 0);
return (int)floorf(x);
}

#ifndef __CUDA_ARCH__
// return the greatest integer <= x, for 4 values at once
OIIO_FORCEINLINE int4 quick_floor (const float4& x) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here ... why not just call ifloor directly everywhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, updated the PR with this rename as requested. The reason I didn't include it at first was that it swamps and obscures the actual bug fix, making the review confusing. But if you like the rename, I'll merge it together.

@fpsunflower
Copy link
Contributor

Otherwise LGTM!

@fpsunflower
Copy link
Contributor

LGTM!

I like this way better not just because it tidies up the redundant name, but it also makes sure you saw all the places that were actually using the buggy function.

@lgritz
Copy link
Collaborator Author

lgritz commented Aug 30, 2018

Hmmm... I guess the other potential reason to leave it as it was is if somebody chimes in later with a "holy crap, I need the old pattern", it would make it very easy to have a build-time switch that restores the old behavior. Not so easy once everything is changed to ifloor.

WARNING: possible visible change in procedural patterns

It's expected that cellnoise(x) == cellnoise(floor(x)), but we found an
ugly bug wherein cellnoise internally used a helper function
quick_floor, which is supposed to take the floor of a float argument and
return its integer equivalent. But unfortunately it had a really odd
error: when x was an exact negative integer, it gave the value for the
next smaller integer.

    quick_floor( 1.00) ==  1
    quick_floor( 0.01) ==  0
    quick_floor( 0.00) ==  0
    quick_floor(-0.01) == -1
    quick_floor(-0.99) == -1
    quick_floor(-1.00) == -2    // WRONG!!! Should be -1
    quick_floor(-1.01) == -2

This in turn caused cellnoise lookups of exact negative integer values
to be incorrect lookups of the next lower-numbered cell, i.e.,
cellnoise(i) having the same value as cellnoise(i-epsilon), but again,
only for i<0 and i==floor(i) exact integer values.

Now, if your shader made a visual representation of callnoise(x,y), it
would look just fine, unless a particular x,y fell *exactly* on integer
coordinates, in which case it would be ever so slightly off (by being
the color of the square immedately to the left instead of to the right,
as it should be, say).  Nobody would ever notice that.

Also, if your shader took the floor before passing to cellnoise, like:
cellnoise(floor(x),floor(y)), it would also *look* correct, though the
negative values would be wrong, since cellnoise(-1,-1) would have
returned the value that should have been cellnoise(-2,-2), etc., but
you never would have noticed that.

But... there were some possible uses of cellnoise where it was behaving
very strangely, and the main place you would notice this is certain
types of "voronoi" noises where you are lookup up cellnoise values while
looping over adjacent lattice points. (Thanks, Zap, for noticing this,
and so sorry that there was a legit bug here!)

It was a very slight possible problem for perlin style noise as well,
but since those noises are "continuous" and defined by gradients, you
really, really, would not have known that the occasional pixel had noise
results that were ever so slightly off in its derivatives only at those
isolated integer lattice points. I think no visible artifacts were
discernable under real-world conditions.

The solution is to bite the bullet and replace the clever (but extremely
minor) speedup for computing floor, with a legit floor call.  It's ever
so slightly more expensive if you isolate the cost of just raw cellnoise
calls, but in the context of a real render I don't think you'll be able
to measure it.

WARNING: This may change the appearance of certain "Voronoi" patterns,
or possibly other uses of cellnoise as random number sequences. It also
changes the appearance of "Gabor" noise, which internally also used the
faulty quick_floor in such a way that it affects the whole pattern.

Note that this should NOT change the appearance of any of the
perlin-style noises, and you may not notice any change for most ordinary
uses of cellnoise, either.

I changed the oslnoise unit tests to check that this is working
properly, and updated all of the noise unit tests that generate images
to "center" the patterns so that they visibly display the negative
regions of the noise domain (they had all shown only positive domain,
which is probably inviting trouble if there was ever a noise bug that
only affects what happens for negative values or right on the axes).
This caused a lot of the reference images to change, even for the tests
of noise patterns that were unaffected. Also, I took the chance to merge
a few of the signed/unsigned tests for less repetition.

I'm very sorry about this; changing the appearance of noise functions is
not something we take lightly, we do it only as a last resort. It was
clearly wrong before and had some very unexpected failure modes. But I'm
only making this fix in master, which is destined to be "OSL 2.0", so I
figure it's as good a time as any to make a change that could possibly
make visible changes in some shaders.

If you see this as some kind of major betrayal that will screw up your
productions, let me know and if pressed I can perhaps make some kind of
emulation of the old buggy behavior for people who need to preserve it.
But I'd rather not do that if it's not absolutely necessary.
@lgritz lgritz merged commit 5ec807c into AcademySoftwareFoundation:master Sep 5, 2018
@lgritz lgritz deleted the lg-noisebug branch September 14, 2018 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants