-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hypot
performance questionable
#36353
Comments
I like this fast track, the only thing that comes to my mind Is if the ulp error is enough |
I am concerned about the degradation of accuracy, too. However, as for the hypot_f32(a::Float32, b::Float32) = Float32(sqrt(Float64(a)^2 + Float64(b)^2)) (I have not identified the cases which would lead to inaccuracy yet.) |
BTW, the current implementation of |
Can anybody run the benchmarks on a machine, with |
This comment has been minimized.
This comment has been minimized.
would also be interested in the times on that machine, using |
Please wait a minute. 🤔 Does this Line 662 in d6d5208
cf. #31922 (comment) |
It looks like it does... julia> f(T) = eps(sqrt(floatmin(T)))
f (generic function with 1 method)
julia> @code_llvm f(Float64)
; @ REPL[9]:1 within `f'
define double @julia_f_17254(%jl_value_t addrspace(10)*) {
top:
; ┌ @ float.jl:745 within `eps'
%1 = call double @julia_ldexp_918(double 0x3CB0000000000000, i64 -511)
; └
ret double %1
} |
This seems to work correctly: julia> g(T) = eps(T)*sqrt(floatmin(T))
g (generic function with 1 method)
julia> @code_llvm g(Float64)
; @ REPL[21]:1 within `g'
define double @julia_g_17323(%jl_value_t addrspace(10)*) {
top:
ret double 0x1CC0000000000000
} |
The scaling constant in `hypot` wasn't being constant folded. Fixes #36353.
PR #36365 improves the relation from 10-fold to 7-fold on my machine (without FMA_NATIVE). That may go down to 5-fold on FMA machines. |
I had noticed that evaluating those constants seems to slow things down but did not know why. Glad someone has a notion about that. |
micro optimization... @inline function hypot(x::T, y::T) where T<:AbstractFloat
absx, absy = abs(x), abs(y)
# Return Inf if either or both inputs is Inf (Compliance with IEEE754)
isinf(absx) && return absx
isinf(absy) && return absy
# Order the operands (ax >= ay)
ax = ifelse(absx < absy, absy, absx)
ay = ifelse(absx < absy, absx, absy)
... Edit: |
@kimikage can you open a pull request? |
Perhaps I can, but I don't think it's worth the separate PR now. We still need the discussion about #33011, the approach of OP and the |
One minor comment I think
would actually need to be
Since the goal is to have x > y. |
The scaling constant in `hypot` wasn't being constant folded. Fixes #36353.
I'll leave this open since my fix was only partial |
The scaling constant in `hypot` wasn't being constant folded. Fixes JuliaLang#36353.
@kimikage I like the julia> a = sqrt(typemax(Float32) |> prevfloat)
1.8446743f19
julia> hypot(a,a)
2.6087633f19
julia> hypot_f32(a,a)
2.6087633f19 |
yeah. I would approve that PR. Trying to do fancy stuff with |
|
I just made massive use of
hypot(a, b)
with real arguments. I discovered an interesting performance bottleneck.I am aware of #31922 and #33224, which both focus on accuracy.
I want to present the following measurements:
That is a runtime factor of 10 compared to the (maybe naive) implementation:
Comparing the accuracy of both implementations gives:
The return values of
foo
are the mean values of the relative deviations from the accurate values in 1-norm, 2-norm, and Inf-norm.I want to discuss, if those deviations in accuracy justify the performance burden.
The text was updated successfully, but these errors were encountered: