add floating point environment to task state #51277

Open
simonbyrne opened this issue Sep 12, 2023 · 9 comments · May be fixed by #51288
Labels
domain:maths Mathematical functions domain:multithreading Base.Threads and related functionality

Comments

@simonbyrne
Contributor

I'd like to make the floating point environment (which controls rounding mode and floating point exceptions) a bit more useful. Currently we use the C behavior where the floating point environment is set per-thread, which doesn't really make sense in the context of tasks which can switch threads.

I'd like to propose adding the floating point environment (fenv_t object) to the task state. The simplest approach I can think of is that every time a thread switches tasks, we capture the current floating point environment of the current task and save it to the struct (via fegetenv), and set the environment to that of the new task (via fesetenv). In this way, the floating point environment should follow the task (and be inherited by child tasks).

@vchuravy
Sponsor Member

LLVM assumes the fenv is set to the default [1]. For code to be correct under a different fenv, LLVM provides constrained FP ops [2].

I am not convinced we should expose the fenv to the user, let alone save it and restore it on task switch. (What about exceptions?) This is before considering the cost of setting and saving fenv on task switch.

There is the additional issue of the Julia optimizer performing constant propagation, which, like LLVM, assumes a specific fenv.

So I think that to provide sensible semantics we should follow LLVM's lead and provide constrained FP semantics (codegen thereof is a bit tricky). An added complication is that our math functions are implemented in Julia; LLVM also has constrained libm functions [3].

[1] https://llvm.org/docs/LangRef.html#floating-point-environment
[2] https://llvm.org/docs/LangRef.html#constrainedfp
[3] https://llvm.org/docs/LangRef.html#id2264

@simonbyrne
Contributor Author

simonbyrne commented Sep 12, 2023

To clarify, my eventual aim would be to support (in order of priority)

  1. trap on floating point exceptions (e.g. throw a Julia exception at the location a NaN occurs)
  2. check floating point sticky bits (e.g. check if a NaN occurred anywhere in a block)
  3. support for floating point rounding modes (this is a sort of "nice to have", but far less useful than the others)

There are effectively two things we would need to change:

  1. Runtime support
    a. A coherent model for the floating point environment: I think it would make sense to do this at the task level, and have child tasks inherit their parent's environment at the time they were created (which is what this issue proposes). Currently we just inherit the C behaviour, which is to do it at the thread level (which doesn't really make any sense for Julia, since we have little control over which threads execute which tasks).

    b. Interface for controlling the floating point environment (setting/querying different flags)

    c. Handling of floating point exceptions (see Trap floating point exceptions #27705 and Add support for floating point exceptions #47930)

  2. Compiler support: make sure that the compiler is aware of the floating point environment. Historically LLVM had pretty bad support for this, but it looks like the situation has improved recently. I admit I have less of an idea what would be required here (perhaps a method overlay for non-default environments?), but I concede it is definitely not trivial.

However, I don't think 2 is actually a necessary precondition for 1: personally I would find it useful to be able to trap on NaN even if it wasn't 100% correct all of the time (i.e. it missed some, or caught the occasional spurious exception). Moreover, I think we probably do need runtime support before we can discuss what changes to the compiler are needed.

What I would like to propose is to add the minimum necessary features to support 1 in Julia itself (basically what I propose above, along with some variant of #47930); the actual interface itself could then live in either a package or an experimental module.

@simonbyrne
Contributor Author

This is before considering the cost of setting and saving fenv on task switch.

This is a fair point: I would be curious to know what this would add to the cost of a task switch.

@vchuravy
Sponsor Member

If we decide to add constrained ops to the language and thus have proper semantics for this in the optimizer and code-generator we may have to consider the fenv state that we need to store and restore on Task-Switch and exception handling.

Otherwise access to the fenv is not yet defined behavior (even though we use it for rounding semantics, but we may get away with that since it's unlikely we will switch tasks). We should probably replace the rounding code with calls to constrained intrinsics.

@simonbyrne
Contributor Author

(even though we use it for rounding semantics, but we may get away with that since it's unlikely we will switch tasks).

We actually deprecated that a while back. Currently setrounding only applies to BigFloat.

@vchuravy
Sponsor Member

Ah! Just grepped and didn't see that the only usage was in the tests https://github.com/JuliaLang/julia/blob/4af6be80f238fce9cd124925d488a4c7171cf74a/test/rounding.jl#L80C1-L80C1

@Keno
Member

Keno commented Sep 13, 2023

I think having the floating point environment be global (or task local) state is just a design mistake. Rather, the operations should take explicit floating point controls (possibly hidden behind abstractions) and if the hardware has global FP state, the compiler should mux it. Of course, dynamically scoped floating point state can be re-built on top of that if really desired, but I'm not sure it would be.

simonbyrne added a commit that referenced this issue Sep 13, 2023
@simonbyrne simonbyrne linked a pull request Sep 13, 2023 that will close this issue
@brenhinkeller brenhinkeller added domain:maths Mathematical functions domain:multithreading Base.Threads and related functionality labels Sep 14, 2023
@mikmoore
Contributor

mikmoore commented Nov 8, 2023

Just here to say that something that gets setrounding to be usable on native types would be useful.

As part of a long-running thread on the precision of rand, it's come up that, in the absence of native rounding, emulating operations like convert(Float32, u::UInt32) with RoundDown/RoundToZero rounding costs significant performance. If somebody knows better ways to emulate this specific operation, I'd welcome your contribution in that thread.
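For readers unfamiliar with what such an emulation involves, here is one straightforward way to write the unsigned 32-bit to single-precision conversion with round-toward-zero in C, without touching the rounding mode. This is an illustrative sketch, not the code from the linked thread, and `u32_to_float_rz` is a hypothetical name. For unsigned inputs, rounding down and rounding toward zero coincide.

```c
#include <stdint.h>

/* Convert u to float, rounding toward zero, independent of the current
 * rounding mode. A float has a 24-bit significand, so any bits below
 * the top 24 significant bits are cleared first; the remaining value is
 * exactly representable and the final conversion cannot round. */
float u32_to_float_rz(uint32_t u)
{
    /* Number of significant bits in u (0 for u == 0). */
    int width = 0;
    for (uint32_t v = u; v != 0; v >>= 1)
        width++;

    if (width > 24)
        u &= ~((UINT32_C(1) << (width - 24)) - 1);

    return (float)u;
}
```

The bit-counting loop is written for portability; a count-leading-zeros instruction would replace it in practice. Even so, the extra masking step compared with a plain conversion gives a sense of where the performance cost mentioned above comes from.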

simonbyrne added a commit that referenced this issue Dec 2, 2023
@vtjnash
Sponsor Member

vtjnash commented Feb 1, 2024

just a design mistake

It is a platform ABI mistake in most existing compilers (since the ABI fails to specify whether it is a callee or caller saved register, resulting in compilers failing to do either). But is that ABI bug possible for us to fix, given the legacy of compilers implementing this badly? It seems like we would need to define it as callee-saved to be compatible, and declare it is an LLVM bug for any of our function calls (e.g. fesetenv) to fail to restore registers, unless they are intrinsic functions (e.g. llvm.set.fpmode). In theory this is fixable in LLVM, because the compiler already implements this for every other register except for this one, but it might require significant work.

simonbyrne added a commit that referenced this issue Feb 27, 2024
6 participants