
Dividing x ≠ 0 by zero returns NaN instead of ±Inf #12

Closed
mtanneau opened this issue Dec 24, 2020 · 4 comments


@mtanneau

The convention for dividing by zero is not uniform across all levels of precision:

```julia
one(Float64) / zero(Float64)      # Inf
one(Float64x1) / zero(Float64x1)  # Inf
one(Float64x2) / zero(Float64x2)  # NaN
```

Is it possible to have Float64x2, etc. follow the same convention as Float64?
I have parts of my code in Tulip.jl that rely on this convention.

I think this should cover all cases (following IEEE 754, where the sign of the result is the XOR of the operands' signs):

| `x` | `y` | `x / y` |
|-----|-----|---------|
| >0  | +0  | +Inf    |
| >0  | -0  | -Inf    |
| ±0  | ±0  | NaN     |
| <0  | +0  | -Inf    |
| <0  | -0  | +Inf    |
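The `Float64` side of that convention is just IEEE 754 behavior, which can be checked directly:

```julia
# IEEE 754 division-by-zero conventions, as implemented by Float64:
@assert  1.0 /  0.0 ==  Inf
@assert  1.0 / -0.0 == -Inf
@assert isnan(0.0 / 0.0)
@assert -1.0 /  0.0 == -Inf
@assert -1.0 / -0.0 ==  Inf
```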
@dzhang314
Owner

Hey @mtanneau , thank you for your interest in my project! You are absolutely correct that the types provided by MultiFloats.jl do not have proper IEEE Inf and NaN propagation semantics. Unfortunately, this is a rather difficult issue to fix. You see, arithmetic operations on MultiFloat values are implemented using indeterminate forms such as `((a + b) - a) - b`, which would evaluate to zero in exact arithmetic. When evaluated in floating-point arithmetic with round-to-nearest mode, these expressions allow me to compute the round-off error incurred in (say) the floating-point addition `a + b`, so that the error can be propagated from one component of a MultiFloat to the next.
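For concreteness, this is the standard TwoSum error-free transformation that this style of arithmetic is built on (an illustrative sketch, not MultiFloats.jl's exact code; `two_sum` is just a name for this example):

```julia
# TwoSum (Knuth): s = fl(a + b), and e recovers the exact round-off
# error, so that a + b == s + e holds exactly.
function two_sum(a::Float64, b::Float64)
    s  = a + b
    a1 = s - b          # equals a in exact arithmetic
    b1 = s - a1         # equals b in exact arithmetic
    e  = (a - a1) + (b - b1)
    return s, e
end

two_sum(1.0, 1.0e-20)   # (1.0, 1.0e-20): the tiny addend survives in e
```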

These expressions are inherently unfriendly to Inf values, since they produce indeterminate forms like Inf - Inf and Inf / Inf all over the place. Therefore, I completely ignored the issue of correctly propagating these values, in the interest of making the common case (arithmetic with finite values) as fast as possible. If you disassemble, for example, the implementation of the + operator for Float64x4, you'll notice that it's a straight line of native Float64 operations -- totally branchless and vectorizable. If we were to insert checks for Inf and NaN in all MultiFloat arithmetic operators, I fear we would lose much of the performance uplift that MultiFloats.jl was designed to provide.
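To make the failure mode concrete, here is what happens when one of those error-recovery expressions meets an infinity (an illustrative sketch of the same TwoSum steps):

```julia
# With an infinite operand, the error-recovery steps collapse into the
# indeterminate form Inf - Inf, which is NaN:
a, b = Inf, 1.0
s  = a + b              # Inf
a1 = s - b              # Inf
b1 = s - a1             # Inf - Inf → NaN
e  = (a - a1) + (b - b1)
isnan(e)                # true: the "error term" is poisoned by NaN
```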

This isn't to say the situation is hopeless, though! I am certainly interested in making MultiFloats.jl compatible with as many libraries as I can. Does Tulip.jl crucially depend on Inf/NaN propagation everywhere, or only in select parts of the codebase? If the latter, it would be possible for me to provide a macro, like

```julia
@MultiFloats.IEEE begin
    c = a * a + b * b
    # ... code that depends on Inf/NaN propagation
end
# ... code that doesn't depend on Inf/NaN propagation
```

where we could just take the performance hit for the bits that matter.

In any case, thanks for bringing this issue to my attention. Violation of IEEE semantics, even if intentional, is a design deficiency that should be documented loud and clear in the MultiFloats.jl README, so I'll go ahead and add a note about it.

@mtanneau
Author

Thanks for the answer :)
TL;DR: I'll handle the edge cases manually where they happen, so the arithmetic code can stay branch-free.

I dug into my code a bit more. As far as I can tell, my issue comes from updates of the form `l += a`, where `l` may be infinite (`a` is always finite). This is where NaNs appear and things break.
It's not too widespread (and not in performance-critical parts), so it's fine for me to handle the infinite cases manually.
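One way to handle this manually (a hypothetical sketch with plain `Float64` standing in for `Float64x2`; `guarded_add` is an illustrative name, not part of either package):

```julia
# Skip the update when the accumulator is already infinite, so the
# error-free transformations inside MultiFloat `+` never see an Inf.
function guarded_add(l, a)
    return isinf(l) ? l : l + a
end

guarded_add(Inf, 1.0)  # Inf (instead of a NaN-producing path)
guarded_add(2.0, 1.0)  # 3.0
```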

(FYI, there are other parts where a division by zero might happen, but in that case the result would be multiplied by `false` afterwards, so the NaNs are not a problem, as long as the convention `Inf * false == NaN * false == 0` holds.)
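For the record, Julia's `Bool` does act as a strong zero in multiplication, so that convention holds (a quick check):

```julia
# Base Julia defines Bool * AbstractFloat with `false` as a strong zero,
# annihilating even Inf and NaN:
@assert Inf * false == 0.0
@assert NaN * false == 0.0
@assert (1.0 / 0.0) * false == 0.0   # a guarded-away division is harmless
```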

@dzhang314
Owner

Excellent! I'm glad to hear that the issue could be handled on your side.

By the way, since you're using MultiFloats.jl for optimization, you'll be glad to hear that I fixed a bug today where needless destructive cancellation would occur when subtracting two MultiFloats that are very close in magnitude. I actually have my own collection of nonlinear optimization routines for use with MultiFloats.jl, and this was driving me crazy. ("I'm turning up the precision, why isn't the objective going down???")

I'll publish the v1.0 release of MultiFloats.jl imminently, with this fix and a couple of enhancements.

@dzhang314
Owner

v1.0 of MultiFloats.jl is now released! Closing this issue, as I've added a "Caveats" section to the README which documents violation of strict IEEE Inf/NaN semantics.
