Revisit the algorithm used to compute ranges with floating points #33677

samo-lin · 2019-10-25T17:03:25Z

We have observed unexpected results for some ranges with floating point numbers. Here is a short example:

Let's construct a vector with x-coordinates (xv) and with values in-between (xc).

$>julia

julia> xmin = 0.0;

julia> xmax = 1.86038;

julia> Lx = xmax - xmin;

julia> nx = 7;

julia> dx = Lx/nx;

julia> xv  = [xmin:dx:xmax...]
8-element Array{Float64,1}:

 0.0               
 0.2657685714285714
 0.5315371428571428
 0.7973057142857143
 1.0630742857142856
 1.3288428571428572
 1.5946114285714286
 1.86038           

julia> xc  = [xmin+dx/2:dx:xmax-dx/2...]
6-element Array{Float64,1}:
 0.1328842857142857 
 0.39865285714285714
 0.6644214285714285 
 0.93019            
 1.1959585714285714 
 1.4617271428571428

We can observe that xc contains 6 elements rather than 7 elements as expected. Concretely, we would expect xc to contain a 7th value being approximately xmax-dx/2

julia> xmax-dx/2
1.7274957142857141

in particularly given that we have:

julia> xmax-dx/2 - xc[end] ≈ dx
true

I suppose that xc contains no 7th value, because xmin+dx/2 + (nx-1)*dx

julia> xmin+dx/2 + (nx-1)*dx
1.7274957142857144

is greater than the end value of the range (xmax-dx/2):

julia> xmin+dx/2 + (nx-1)*dx > xmax-dx/2

Finally, compare with the result obtained with linspace where the analogically computed xc contains 7 values:

julia> xc = LinRange(xmin+dx/2, xmax-dx/2, nx)
7-element LinRange{Float64}:
 0.132884,0.398653,0.664421,0.93019,1.19596,1.46173,1.7275

I have seen that the algorithm used to compute ranges with floating points has been discussed a lot in 2013. However, I believe the algorithm needs to be revisited one more time following the above example.

Thanks!

The text was updated successfully, but these errors were encountered:

timholy · 2019-10-25T17:16:25Z

Nope, it's already perfect. Never wrong.

It will always be possible to perform a sufficient number of roundoff-prone calculations and get something that will surprise someone. Fortunately, you can use range(first, last, length=n) to get exactly what you want.

samo-lin · 2019-10-25T17:33:35Z

Maybe a solution could then be to swap automatically to the LinRange algorithm if (<range end> - <range start>)/<range step> boils down to an integer as in this case:

julia> nsteps = ((xmax-dx/2) - (xmin+dx/2))/dx
6.0

In this example we would get as already noted above:

julia> LinRange(xmin+dx/2, xmax-dx/2, Int(nsteps)+1)
7-element LinRange{Float64}:
 0.132884,0.398653,0.664421,0.93019,1.19596,1.46173,1.7275

mbauman · 2019-10-25T18:31:15Z

We wouldn't want to return a LinRange — as that would be type-unstable — but we could perhaps return a differently-computed StepRangeLen (that is, what you get from range(start, stop, Int(nsteps)+1)). Not sure how this would interact with the rationalization step and performance, but it might be worth doing.

The construction of ranges is most robust when you don't do too many independent floating point computations on the endpoints/steps — each might introduce some precision loss, pushing you farther away from an exact solution. What you can do instead is to do the computation on the range itself — like:

julia> xv = (xmin:dx:xmax)
0.0:0.2657685714285714:1.86038

julia> xc = (xv .+ dx/2)[1:end-1]
0.1328842857142857:0.2657685714285714:1.7274957142857144

julia> collect(xc)
7-element Array{Float64,1}:
 0.1328842857142857
 0.39865285714285714
 0.6644214285714286
 0.93019
 1.1959585714285714
 1.4617271428571428
 1.7274957142857144

Note that the last endpoint there is not the same as xmax - dx/2! It is, however, the same as xc[end-1] + dx/2! Which value do you really want? This is the crux of the problem.

julia> xmax - dx/2
1.7274957142857141

julia> xv[end-1]+dx/2
1.7274957142857144

samo-lin · 2019-10-26T09:24:09Z

Thanks to Tim and Matt for the comments!

We wouldn't want to return a LinRange — as that would be type-unstable — but we could perhaps return a differently-computed StepRangeLen

When I wrote that one could maybe swap automatically to the LinRange algorithm, I really meant to refer only to the algorithm. I don't want to suggest any implementation as I have no knowledge about how the different range functions are implemented.
In what differs the algorithm used for StepRangeLen from the algorithm used for LinRange?

Which value do you really want? This is the crux of the problem.

I am aware that there is no right or wrong when choosing between different algorithms used to compute ranges with floating point numbers (there was already a lot of discussion on that topic in 2003 as noted above). All that is are some expectations a common user has about what he will obtain when calling a range function for floating point numbers and I believe we should try to match as much as possible these expectations. I am glad to see that you agree that "it might be worth doing" for the case brought up in this issue.

samo-lin · 2019-10-28T10:18:43Z

Not sure how this would interact with the rationalization step and performance, but it might be worth doing.

I believe the algorithm swap should not be performance relevant if the decision which algorithm to use is taken only once per range object (at the moment of its creation); this information could maybe be stored in some way in the range object (ideally in a way that no branching is required when computing the range elements)?

It seems that previously the compensated arithmetic code was all designed ad-hoc instead of using the standard algorithms. Some of the introduced code will also be used in `base/div.jl` to fix I didn't check how much this improves the situation. In particular, the example in JuliaLang#33677 still gives the same result, and I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime.

It seems that previously the compensated arithmetic code was all designed ad-hoc instead of using the standard algorithms. Some of the introduced code will also be used in `base/div.jl` to fix issue JuliaLang#49450. I didn't check how much this improves the situation. In particular, the example in JuliaLang#33677 still gives the same result, and I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime.

Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616 gets merged Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. TODO: check whether there are improvements for JuliaLang#26553 Updates JuliaLang#49589

Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent some harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: write tests TODO: some tests fail sometimes: ``` Test Failed at /home/nsajko/tmp/julia/test/ranges.jl:223 Expression: cmp_sn2(Tw(xw / yw), astuple(x / y)..., slopbits) ``` Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. Proof that JuliaLang#26553 is fixed: ```julia-repl julia> const TwicePrecision = Base.TwicePrecision Base.TwicePrecision julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17) Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17) julia> a_minus_b_inexact = Tuple(a - b) (-2.9714285714285715, 1.0150610510858574e-16) julia> BigFloat(a) == sum(map(BigFloat, Tuple(a))) true julia> BigFloat(b) == sum(map(BigFloat, Tuple(b))) true julia> a_minus_b = BigFloat(a) - BigFloat(b) -2.971428571428571428571428571428577679589762353999797347402693647078382455095635 julia> Float64(a_minus_b) == first(a_minus_b_inexact) true julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1] true ``` Fixes JuliaLang#26553 Updates JuliaLang#49589

Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent some harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. Proof that JuliaLang#26553 is fixed: ```julia-repl julia> const TwicePrecision = Base.TwicePrecision Base.TwicePrecision julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17) Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17) julia> a_minus_b_inexact = Tuple(a - b) (-2.9714285714285715, 1.0150610510858574e-16) julia> BigFloat(a) == sum(map(BigFloat, Tuple(a))) true julia> BigFloat(b) == sum(map(BigFloat, Tuple(b))) true julia> a_minus_b = BigFloat(a) - BigFloat(b) -2.971428571428571428571428571428577679589762353999797347402693647078382455095635 julia> Float64(a_minus_b) == first(a_minus_b_inexact) true julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1] true ``` Fixes JuliaLang#26553 Updates JuliaLang#49589

brenhinkeller added the domain:ranges Everything AbstractRange label Nov 20, 2022

nsajko mentioned this issue Apr 29, 2023

Base: make TwicePrecision more accurate using standard algorithms #49569

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revisit the algorithm used to compute ranges with floating points #33677

Revisit the algorithm used to compute ranges with floating points #33677

samo-lin commented Oct 25, 2019 •

edited

Loading

timholy commented Oct 25, 2019

samo-lin commented Oct 25, 2019 •

edited

Loading

mbauman commented Oct 25, 2019 •

edited

Loading

samo-lin commented Oct 26, 2019

samo-lin commented Oct 28, 2019 •

edited

Loading

Revisit the algorithm used to compute ranges with floating points #33677

Revisit the algorithm used to compute ranges with floating points #33677

Comments

samo-lin commented Oct 25, 2019 • edited Loading

timholy commented Oct 25, 2019

samo-lin commented Oct 25, 2019 • edited Loading

mbauman commented Oct 25, 2019 • edited Loading

samo-lin commented Oct 26, 2019

samo-lin commented Oct 28, 2019 • edited Loading

samo-lin commented Oct 25, 2019 •

edited

Loading

samo-lin commented Oct 25, 2019 •

edited

Loading

mbauman commented Oct 25, 2019 •

edited

Loading

samo-lin commented Oct 28, 2019 •

edited

Loading