Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Halide floating point behavior #3813

Open
LazyDodo opened this issue Apr 12, 2019 · 13 comments
Open

Halide floating point behavior #3813

LazyDodo opened this issue Apr 12, 2019 · 13 comments
Labels

Comments

@LazyDodo
Copy link
Contributor

I'm having issues with a halide being a little too aggressive in it's handling of floating point calculations.

The issue at its core is that halide thinks that transforming (x+1)*n into (x*n)+n is a valid thing to do, and while that holds true for integer operations, floating point is less forgiving about this and will introduce small differences in the calculation, nvidia warns about this behavior of floating point math here .

Now this issue has come up before in issue #3549 which suggested the use of strict_float which did seem to initially solve the issue, it seems it was just better at masking it.

Even with strict_float applied, halide still seemingly randomly decides to mess with the order of things. I attached a heavily commented repro case in python (pastebin version available here).

Due to its use of strict_float it needs PR #3811 to function so make sure you have a recent build.

noise_cpu.zip

@LazyDodo
Copy link
Contributor Author

Took a stab at resolving this and postulated removing the AllowReassoc flag from the fastmath options here would resolve the issue. Kinda stumped it didn't, any suggestions on where to look next?

@LazyDodo
Copy link
Contributor Author

Found the cause, the solution is still a little unclear though, no_float_simplify is initialized to false here however nothing ever checks if the strict flag is on and changes this value, hence unsafe float optimizations are always allowed.

@abadams abadams added the bug label Apr 15, 2019
@abadams
Copy link
Member

abadams commented Apr 15, 2019

Hrm, it's supposed to be set to true when visiting the strict_float intrinsic in Simplify_Call.cpp

@abadams
Copy link
Member

abadams commented Apr 15, 2019

So maybe the bug is that strict_float wasn't injected somewhere it should have been?

@LazyDodo
Copy link
Contributor Author

yeah just saw that flag as well and it does seem to work, turning up the debug level in the codegen gives me

With lets: (let t0 = sin_f32(((float32(y)*78.233002f) + ((float32(x) + 1.000000f)*12.989800f))) in strict_float(((t0*43758.546875f) - floor_f32((t0*43758.546875f)))))

You may be onto something there, it seems to only partially mark the expression as strict float

@abadams
Copy link
Member

abadams commented Apr 15, 2019 via email

@zvookin
Copy link
Member

zvookin commented Apr 15, 2019

Part of the reason for wrapping the strict_float intrinsic around all Expr of floating-point type is to ensure all IR transformations see it. (The other big reason is to make sure simplification doesn't do any simplifying as the intrinsic will never match any of its patterns. It also cuts off visiting those parts of the tree, but even if it visits inside of them, it shouldn't do anything.) It is probably ok to CSE something so long as a) the strict_float occurrences inside it are preserved, and b) the substitution is also wrapped in strict_float. The latter would seem unlikely I guess.

@zvookin
Copy link
Member

zvookin commented Jun 18, 2019

I'm going to guess what is happening here is that x + 1 is an integer expression and with strict float the IR gets simplified from (x + 1) * strict_float(val) to x * strict_float(val) + strict_float(val) (or vice-versa). The work around would be to write e.g. (x + 1.0f) * val in the source code.

@LazyDodo
Copy link
Contributor Author

You actually had it right in your April 15 post, CSE is losing the strict_float attribute during its substitutions, debugged into it to the point where I could validate your suspicion but have lacked time since to actually sit down and fix it.

zvookin added a commit that referenced this issue Jul 15, 2019
@zvookin
Copy link
Member

zvookin commented Jul 15, 2019

Please see #4012 . I haven't been able to get the reported test failure to reproduce however. @LazyDodo can you try this and see if it has any effect.

@abadams
Copy link
Member

abadams commented Jul 15, 2019

Is this being strictified twice? Generating nested strict_floats: strict_float(strict_float(x)) seems suboptimal

@LazyDodo
Copy link
Contributor Author

Guess this comment was for #4012 ?

The test only applies strict_float once at the very outer level

import halide as hl

def hlrand(x,y):
    tx = x * 12.9898
    ty = y * 78.233
    dot = ty + tx
    sval = hl.sin(dot)
    sval_mul = sval * 43758.5453123
    fract = sval_mul - hl.floor(sval_mul)
    Res = fract
    return Res

def noise_strict(x,y, offset):
    noise = hl.Func("noise")
    noise[x,y] = hl.strict_float(hlrand( x + offset, y))
    return noise
    
def main():
    x,y = hl.Var("x"), hl.Var("y")
 
    cpunoise_offset = noise_strict(x,y,1.0 )
    cpunoise_offset.compile_to_lowered_stmt("cpunoise_offs.txt", [])

 
if __name__ == '__main__':
    main()

@abadams
Copy link
Member

abadams commented Jul 15, 2019

Oops, yeah. I'll make the comment on that thread instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants