Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-thread safe use of setrounding #612

Closed
Joel-Dahne opened this issue Jan 9, 2024 · 11 comments · Fixed by #615
Closed

Non-thread safe use of setrounding #612

Joel-Dahne opened this issue Jan 9, 2024 · 11 comments · Fixed by #615

Comments

@Joel-Dahne
Copy link
Contributor

The slow versions of the _round methods in src/interval/rounding.jl use BigFloat together with setrounding in a way that is not thread-safe. For setprecision there is a precision_lock defined at the top of the file that is used to make it thread-safe, but not such look is used for the setrounding. Starting Julia with two threads I could reliably produce errors with the following code

x = interval(BigFloat, 1)
y = sin(x)
Threads.@threads for _ in 1:1000
    z = sin(x)
    @assert isequal_interval(y, z)
end

In many cases (though not all) I also found that after running this loop even the non-threaded behavior is wrong. Most of the time this fails:

setrounding(BigFloat, RoundNearest) # Reset rounding in case this would be an issue
z = sin(x)
@assert isequal_interval(y, z)

So there is some global state that gets messed up.

I made a quick attempt at fixing this by adding locks around setround, but I could still not get it to work properly. So there might be something else that I'm missing as well.

@Joel-Dahne
Copy link
Contributor Author

Some more details, maybe it helps someone with the debugging...

I can reproduce it also with the code

x = bareinterval(BigFloat, 1)
y = sin(x)
Threads.@threads for _ in 1:100000
    z = IntervalArithmetic._unsafe_bareinterval(BigFloat, IntervalArithmetic._sin_round(x.lo, RoundDown), IntervalArithmetic._sin_round(x.hi, RoundUp))
    if !isequal_interval(y, z)
        @show y z
    end
end

BUT, for this code my attempt at adding locks for the use of setrounding seems to work. I don't know why this is the case, as far as I understand the codes should do more or less the same.

@dpsanders
Copy link
Member

Ugh, setrounding has always been a mess, and BigFloats have always had too much global state.

I suggest using the accurate mode (or whatever it is now called) that just uses prevfloat and nextfloat for rounding.

@dpsanders
Copy link
Member

But of course it would be great to get this fixed too.

@Joel-Dahne
Copy link
Contributor Author

Joel-Dahne commented Jan 10, 2024

Thinking more about it, it would not be enough to use locks anyway since any other part of the program could still change the rounding mode. The same holds true for the precision_lock that guards the setprecision.

I think the simplest solution is to just call MPFR directly and not use the BigFloat interface. So for example _add_round would become

function _add_round(::IntervalRounding{:slow}, x::BigFloat, y::BigFloat, r::RoundingMode)
    @assert precision(x) == precision(y) # Should this be required?
    z = BigFloat(precision = precision(x))
    @ccall Base.MPFR.libmpfr.mpfr_add(
        z::Ref{BigFloat},
        x::Ref{BigFloat},
        y::Ref{BigFloat},
        r::Base.MPFR.MPFRRoundingMode,
    )::Int32
    return z
end

Regarding the @assert for the precision this is something we had to decide how to handle in Arblib.jl as well. In our case we most of the time use the maximum precision of the input arguments, this makes sense in our context. Here I'm not so sure, what does "correct rounding" mean when mixing precisions I guess is the question.

@Joel-Dahne
Copy link
Contributor Author

Ugh, setrounding has always been a mess, and BigFloats have always had too much global state.

I suggest using the accurate mode (or whatever it is now called) that just uses prevfloat and nextfloat for rounding.

The :fast rounding seems to be deprecated. The docstring for IntervalRounding mentions it as unsupported and the implementations of it are commented our in the source code.

But I can mention that this bug is not an issue for me in practice. I'm currently looking at to what extent it would be possible to have Arblib.jl and IntervalArithmetic.jl work together. For that I have needed to look at a lot of the implementations to see what is really required. That's when I spotted this bug!

@OlivierHnt
Copy link
Member

OlivierHnt commented Jan 10, 2024

Ah we need to fix this asap. I will look into the proposed solution #612 (comment)

Indeed, the :accurate mode (now named :fast) is not supported due to #374. Since the number of ulp also depends on the system's architecture, I did not take the time to fix this yet.

@Joel-Dahne
Copy link
Contributor Author

Joel-Dahne commented Jan 10, 2024

For _bigequiv I think it is enough to just define

_bigequiv(x::AbstractFloat) = BigFloat(x, precision = precision(x))

For Rational this function doesn't really make sense since you would need to control the rounding.

Looking closer it does however seem like rationals are not treated correctly currently? For example IntervalArithmetic._sin_round(1 // 1, RoundDown) starts by converting 1 // 1 using float, but this is not guaranteed to give the correct rounding?

@OlivierHnt
Copy link
Member

Yes, I noticed this as well. Presumably, we ought to do T(1//10, RoundDown) and T(1//10, RoundUp) where T<:AbstractFloat

@Joel-Dahne
Copy link
Contributor Author

You would need to take into account if the function you are computing is increasing or decreasing though, and pick which way to round depending on that. This seems like a lot of work to do for all functions, it would more or less be implementing everything from scratch. I don't see a reasonable solution to this.

The only solution I can see is to only allow exact arithmetic for rational intervals. It doesn't really make much sense to use them in any other case as far as I can tell. But of course people could have use cases I haven't thought about.

@OlivierHnt
Copy link
Member

OlivierHnt commented Jan 10, 2024

I see what you mean, yes it is a bit more intricate.

The only solution I can see is to only allow exact arithmetic for rational intervals. It doesn't really make much sense to use them in any other case as far as I can tell. But of course people could have use cases I haven't thought about.

Yea I do not use it myself, and I would agree that the use for rational interval bounds seem mainly that some operations are exact. I'll keep a note on this.

@OlivierHnt
Copy link
Member

OlivierHnt commented Jan 11, 2024

Mhm indeed, adding a lock before calling CRlibm functions seems to correctly fix the error described in #612 (comment)

While executing

x = interval(BigFloat, 1)
y = sin(x)
Threads.@threads for _ in 1:1000
    z = sin(x)
    @assert isequal_interval(y, z)
end

still errors...
In fact, the precision of BigFloat changes after! E.g.

julia> Threads.@threads for _ in 1:100000
           z = sin(x)
           # @assert isequal_interval(y, z)
           if !isequal_interval(y, z)
               @show y z
               error()
           end
       end
y = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912398)
y = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912398)
y = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912398)
z = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486415)
y = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912398)
y = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912398)
z = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486415)
z = BareInterval{BigFloat}(0.84147098480789650665250232163029899962256306079837106567275170999191040439123966894863974354305248, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486415)
z = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486395, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486415)
z = BareInterval{BigFloat}(0.8414709848078965066525023216302989996225630607983710656727517099919104043912312, 0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486415)
ERROR: TaskFailedException

    nested task error: 
    Stacktrace:
     [1] error()
       @ Base ./error.jl:44
     [2] macro expansion
       @ ./REPL[78]:6 [inlined]
     [3] (::var"#287#threadsfor_fun#53"{var"#287#threadsfor_fun#52#54"{UnitRange{Int64}}})(tid::Int64; onethread::Bool)
       @ Main ./threadingconstructs.jl:214
     [4] #287#threadsfor_fun
       @ Main ./threadingconstructs.jl:181 [inlined]
     [5] (::Base.Threads.var"#1#2"{var"#287#threadsfor_fun#53"{var"#287#threadsfor_fun#52#54"{UnitRange{Int64}}}, Int64})()
       @ Base.Threads ./threadingconstructs.jl:153

...and 7 more exceptions.

Stacktrace:
 [1] threading_run(fun::var"#287#threadsfor_fun#53"{var"#287#threadsfor_fun#52#54"{UnitRange{Int64}}}, static::Bool)
   @ Base.Threads ./threadingconstructs.jl:171
 [2] macro expansion
   @ ./threadingconstructs.jl:219 [inlined]
 [3] top-level scope
   @ ./REPL[78]:1

julia> precision(BigFloat)
288

Not sure what is going on; our implementation of sin is not thread safe and somehow implicitly triggers a change in the global precision of BigFloat 😞

A quick search in Julia's repo pointed me to the PR JuliaLang/julia#51362. Perhaps there is something to learn from it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants