On line 124 of math.jl, we set up transcendental functions of arbitrary reals so that they call `f(float(x))`. Because `float(::Int8)` and `float(::Int16)` return a Float32, this means e.g. `log(uint8(2))` returns a Float32. This is consistent, but I'm not sure it's desirable. The input is exact, so precision of the result seems just as important as if the input were an `Int`.
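To make the precision concern concrete, here is a minimal sketch that contrasts the two promotion choices. It is written in current Julia syntax (`UInt8(2)` rather than `uint8(2)`), and the helper names `narrow_log` and `wide_log` are hypothetical stand-ins for illustration, not the actual definitions in math.jl. Extending the narrow path to `UInt8` is an assumption based on the `log(uint8(2))` example.

```julia
# Hypothetical stand-in for the behavior described above: small integer
# types are converted to Float32 before the transcendental call.
narrow_log(x::Union{Int8,Int16,UInt8}) = log(Float32(x))

# Hypothetical alternative: always convert integers to Float64 first.
wide_log(x::Integer) = log(Float64(x))

x = UInt8(2)
a = narrow_log(x)   # computed in single precision
b = wide_log(x)     # computed in double precision

println(typeof(a))            # Float32
println(typeof(b))            # Float64
println(abs(Float64(a) - b))  # gap caused by rounding the result to Float32
```

The input `2` is exact in both types, so the entire gap comes from the precision of the result, which is the point being raised.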
Sounds reasonable to me. One of Kahan's standard "misconceptions of floating-point" is that "Arithmetic much more precise than the data it operates upon is needless, and wasteful," and it looks like we fell into that trap here by promoting integers by default to the narrowest fp type that can represent them exactly.