Switch float printing from grisu to ryu algorithm #32799

quinnj · 2019-08-05T17:47:31Z

Because it's faster and simpler.

TODO:

Update the license of ryu code
remove grisu code (might need to chat w/ @StefanKarpinski about printf here, but I think that's the only place we were using grisu internals)
Currently, this prints Float16/Float32/Float64 the same all the time (like we do for Signed); personally, I'm not a huge fan of the current 1.0, 1.0f0, Float16(1.0) inconsistency, so I'd rather just treat them like signed integers and print them all the same. If people think that's too heretical, I can look into keeping the same "typed" logic

Will fix #14698 and #22137

JeffBezanson · 2019-08-05T17:52:48Z

so I'd rather just treat them like integers and print them all the same

Unsigned integers are printed differently by show.

quinnj · 2019-08-05T19:14:04Z

Unsigned integers are printed differently by show.

Yes, I should clarify that I was referring to Signed and how Int8, Int16, Int32, and Int64 all show the same, regardless of size.

JeffreySarnoff · 2019-08-07T10:52:22Z

@quinnj something in the transition is reaching out from Base and grabbing DoubleFloats.
I am sure this is not intended, but I am not familiar with the flow of control and fallbacks.
If you shine some light on what is happening and what, added into DoubleFloats, would
preclude crashing -- I am happy to do that.
https://discourse.julialang.org/t/base-printf-ini-dec-has-a-fall-thru-issue/27247

quinnj · 2019-08-07T13:37:29Z

Ok, I've pushed some updates here, notably adding some enhancements to Ryu.writeshortest and Ryu.writefixed to allow better matching of how we print things in Base (specifically, embedding a bunch of the logic from this function into the float-printing algorithms themselves).

So to give an overall status update on everything going on here:

Switching from using grisu to ryu algorithm for float printing; the ryu code is simpler and more performant (typically in the range of 2x-3x, but in some cases more)
For Float64, we pretty much print things exactly how we used to, the sole difference I'm aware of is (current) 1.0e-45 vs. (ryu) 1e-45; that is, for integral scientific notation printing, ryu avoids the extra .0. Both are valid floats for any parsers out there. If people think this would be too disruptive a change, I can try to dig into the ryu code again to see where this could be added, but personally, I like the more concise notation while the e-45 still has a strong "this is a float thing" connotation
For Float16/Float32, this PR removes their special display, i.e. Float16(1.0) and 1.0f0 both print as 1.0; this matches what we do for Int8/Int16/Int32/Int64/BigInt printing. I personally like the consistency, while noting that the implementation code is much simpler, and the outputting is faster (though admittedly, show performance isn't necessarily the most important).

The remaining TODO item is switching over the usages of Grisu in printf.jl to use Ryu. I've looked into it a little, and part of the difficulty is that Grisu is "lower level" than Ryu, with Grisu.grisu populating just plain digits in a buffer, and returning the # of digits, the position of the decimal point, and whether the input was negative. It's then left up to the user to take the buffer of digits and do the actual printing. Ryu, on the other hand, takes a buffer, and populates everything, negative sign, digits, decimal point, scientific notation, etc. So in my mind, we have two options, copy-paste some parts of Ryu into a new Ryu.printf function that only does what the Grisu.grisu function does, populating digits and returning len/decimal point. That requires no changes in printf.jl, but has some non-ideal code duplication in Ryu. Otherwise, we adapt printf.jl to be able to use Ryu directly, probably w/o needing to do as much work (since Ryu can do more). Probably depending on what @StefanKarpinski thinks, I can take one approach or the other.

simonbyrne · 2019-08-08T23:02:44Z

This looks really nice.

It would be nice if it was somewhat extensible (allowing digit separators, custom spacing, exponent control, etc.) I'm thinking in particular of WIP: string formatting Julep Juleps#49 (yes I know it is old...).
I'd also love a way to print the exact decimal value of a float.

quinnj · 2019-08-13T17:37:56Z

Alright, I think this is good to go; the 32-bit and windows 64-bit CI failures look unrelated. As a quick update/recap:

The commit in this PR add the the base/ryu directory w/ functions Ryu.writeshortest, Ryu.writefixed and Ryu.writeexp, which correspond to the printf format options %g, %f, and %e, respectively
This PR doesn't remove the grisu code, since the Printf stdlib relies heavily on grisu; see Rewrite printf functionality #32859
This PR switches core show for Float16, Float32, and Float64to use the ryu writeshortest algorithm instead of grisu
The only functional change here should be dropping the Float16 and Float32 type-specific printing; e.g. Float16(1.0) => 1.0 and 1.0f6 => 1.0e6. Note that I previously considered also having 1.0e6 => 1e6, but have since change it back to printing at least one decimal digit on integral scientific notation prints
Performance is great! On the order of 2-5x faster across the board, here are a couple examples to whet the appetite of your merge finger:

julia> @btime Ryu.writeshortest(2.2250738585072014e-308)
  61.852 ns (2 allocations: 496 bytes)
"2.2250738585072014e-308"

julia> @btime string(2.2250738585072014e-308)
  299.455 ns (7 allocations: 352 bytes)
"2.2250738585072014e-308"

julia> @btime Ryu.writeshortest(1.0)
  46.744 ns (2 allocations: 480 bytes)
"1.0"

julia> @btime string(1.0)
  162.301 ns (4 allocations: 192 bytes)
"1.0"

julia> @btime Ryu.writeshortest(512.0e+3)
  50.171 ns (2 allocations: 480 bytes)
"512000.0"

julia> @btime string(512.0e+3)
  189.952 ns (4 allocations: 192 bytes)
"512000.0"

Note that I didn't add a NEWS entry because I'm not sure where I'd put it; we seem to have a 1.3 section, but I think 1.3 is frozen? So this would be 1.4? Not sure.

StefanKarpinski · 2019-08-13T18:14:01Z

1.3 is not yet frozen, you've got a couple of days. Then again, I'm not sure if we want to merge this right before 1.3 feature freeze or very early in the 1.4 release cycle.

quinnj · 2019-08-13T19:38:46Z

Ah, good to know. I mean, speaking somewhat selfishly, I'd rather merge sooner than wait for the next release cycle, just because I've kind of got the momentum going here and can respond quickly to things that crop up over the release candidates. But if people feel like it's too risky, then I guess we can wait. I'm not currently planning on doing any more here or w/ Printf now that the tests are all passing (there's the one weird generate_precompile.jl thing that @KristofferC says should be fixed soon), but I am going to incorporate the code, probably into Parsers.jl that can use the code on older versions.

StefanKarpinski · 2019-08-13T20:50:19Z

Yeah, I'd say let's go for it and get lots of testing in the 1.3 release.

vtjnash · 2019-08-14T04:13:36Z

base/ryu/Ryu.jl

+        decchar::UInt8=UInt8('.')) where {T <: Base.IEEEFloat}
+    buf = Vector{UInt8}(undef, neededdigits(T))
+    pos = writeshortest(buf, 1, x)
+    return unsafe_string(pointer(buf), pos-1)


forcing an (unsafe) copy here seem unnecessary. above do _string_n(neededdigits(T)) to make the Vector, then use String(resize!(buf, pos - 1)) here

What if we just add an @assert that pos <= length(buf)? It seems like that would actually ensure more safety (in the case pos becomes something unsafe-ly large)

Yes, but that would still be slow. I'm suggesting you don't make an extra copy of the data here by sticking with the safe API.

Oh.......got it. Yeah, we can do that.

Well, we actually want StringVector(n) instead of _string_n, right? The underlying functions all expect a Vector{UInt8}.

vtjnash · 2019-08-14T04:14:30Z

base/ryu/exp.jl

+        i = len - 1
+        while i >= 0
+            j = p10bits - e2
+            #=@inbounds=# mula, mulb, mulc = POW10_SPLIT[POW10_OFFSET[idx + 1] + i + 1]


Suggested change

#=@inbounds=# mula, mulb, mulc = POW10_SPLIT[POW10_OFFSET[idx + 1] + i + 1]

mula, mulb, mulc = POW10_SPLIT[POW10_OFFSET[idx + 1] + i + 1]

vtjnash · 2019-08-14T04:17:39Z

base/ryu/exp.jl

+    end
+    if e >= 100
+        c = e % 10
+        unsafe_copyto!(buf, pos, DIGIT_TABLE, 2 * div(e, 10) + 1, 2)


Suggested change

unsafe_copyto!(buf, pos, DIGIT_TABLE, 2 * div(e, 10) + 1, 2)

copyto!(buf, pos, DIGIT_TABLE, 2 * div(e, 10) + 1, 2)

vtjnash · 2019-08-14T04:18:32Z

base/ryu/shortest.jl

+    exp_form = true
+    pt = nexp + olength
+    prec = precision
+    if -4 < pt <= (precision == -1 ? 6 : precision) #&& !(pt >= olength && abs(mod(x + 0.05, 10^(pt - olength)) - 0.05) > 0.05)


Suggested change

if -4 < pt <= (precision == -1 ? 6 : precision) #&& !(pt >= olength && abs(mod(x + 0.05, 10^(pt - olength)) - 0.05) > 0.05)

if -4 < pt <= (precision == -1 ? 6 : precision)

vtjnash · 2019-08-14T04:20:29Z

base/ryu/utils.jl

+const MANTISSA_MASK = 0x000fffffffffffff
+const EXP_MASK = 0x00000000000007ff
+
+memcpy(d, doff, s, soff, n) = ccall(:memcpy, Cvoid, (Ptr{UInt8}, Ptr{UInt8}, Int), d + doff - 1, s + soff - 1, n)


wrong method signature for memcpy and memmove—also I think this is already unsafe_copyto!

vtjnash · 2019-08-14T04:25:24Z

base/ryu/Ryu.jl

+    end
+    buf = Vector{UInt8}(undef, neededdigits(T))
+    pos = writeshortest(buf, 1, x)
+    GC.@preserve buf unsafe_write(io, pointer(buf), pos - 1)


Suggested change

GC.@preserve buf unsafe_write(io, pointer(buf), pos - 1)

write(io, resize!(bu, pos - 1))

although it also seems like write(io, writeshortest(x)) would give the same result?

write(io, writeshortest(x)) would make an extra copy of the buffer though (when creating the string)

vtjnash · 2019-08-14T04:28:42Z

base/ryu/shortest.jl

@@ -0,0 +1,362 @@
+@inline function writeshortest(buf::Vector{UInt8}, pos, x::T,


as the only function which uses unsafe operations, should we add some bounds-checking for sanity?

vtjnash · 2019-08-14T04:36:54Z

I'm not really qualified to comment on the algorithm, so I just made a few comments on other stuff.

printf.jl to be able to use Ryu directly

you could see how we handled BigFloat, which had the same problem, but I know you're already looking into this.

vtjnash

seems to have always failed on linux32 CI?

quinnj · 2019-08-14T12:58:21Z

On the 32-bit CI: both windows & linux 32-bit have seemed to be really flaky. The windows CI seems to often have some kind of error building julia dependencies, and the linux 32-bit times out during the "generate precompile statements" step. I could look further into the latter, but at first glance it didn't seem like the changes here would have caused any issues.

vtjnash · 2019-08-14T14:24:31Z

I think you're making that up: I don't see any other time the linux32 bot run into an error like that in at least the past week, except every time it ran this PR. There were a couple configuration errors seen earlier this week so it would fail outside of the build, but those have gotten fixed.

quinnj · 2019-08-14T17:05:29Z

Sorry, I didn't mean they've been flaky in general, just on this PR; I inspected the 32-bit results a few times and the errors seems different enough from run to run that it didn't seem likely that anything here was a problem.

I've since identified indeed a bug on 32-bit and pushed a fix; along with updates to address @vtjnash's review comments.

quinnj · 2019-08-27T15:57:13Z

I'll merge this in 12 hours or so unless there are other comments/concerns.

not applicable: passes on 32-bit now

Keno · 2019-08-28T12:42:02Z

For Float16/Float32, this PR removes their special display, i.e. Float16(1.0) and 1.0f0 both print as 1.0; this matches what we do for Int8/Int16/Int32/Int64/BigInt printing. I personally like the consistency, while noting that the implementation code is much simpler, and the outputting is faster (though admittedly, show performance isn't necessarily the most important).

I really don't think I'm ok with this (sorry for being late to the party). For T,S <: Signed (with T of larger bit size than S), we have the property that for some string representation x, parse(T, x) == T(parse(S, x)). However, in floating point this may not necessarily be true. Example:

julia> x = rand(Float16)
0.907

julia> x == 0.907
false

julia> Float64(x)
0.9072265625

Keno · 2019-08-28T12:52:08Z

base/essentials.jl


 julia> reinterpret(Float32, UInt32[1 2 3 4 5])
 1×5 reinterpret(Float32, ::Array{UInt32,2}):
- 1.4013e-45  2.8026e-45  4.2039e-45  5.60519e-45  7.00649e-45


What happened here? These are not the same floating point value:

julia> 1.4013e-45 == 1.0e-45 false

Ignore me, they are in Float32:

julia> 1.4013f-45 == 1.0f-45 true

JeffBezanson · 2019-08-28T15:24:29Z

I agree we should restore the type info in show.

quinnj · 2019-08-28T17:06:21Z

So in my mind, this is a key difference between show and repr: show is just that, showing things, useful, pragmatic defaults. repr on the other hand has one additional requirement, the output can be input to get the same "thing". Do I have this wrong? My proposal then, would be to have repr print the Float16/1.0f0 junk, but have show be the more consistent, clean output as changed in this PR. Happy to go with whatever people want, but just wanted to give some context from where I came from w/ the change here.

JeffBezanson · 2019-08-28T17:43:51Z

  repr(x; context=nothing)

  Create a string from any value using the show function. You should not add methods to repr; define a
  show method instead.

quinnj · 2019-08-29T15:49:47Z

@Keno, I understand the example you share of the difference between Signed and floats (w/ floats essentially adding extra precision when converted "up"), but I also don't necessarily feel that printing additional type information really solves anything, as that "issue" exists regardless. It's just a fundamental property of IEEE floats there that the up conversion adds extra precision; does printing type information help a user somehow be more aware of that? I don't think that's certain at all.

Frankly, I think

julia> Float16(1)
Float16(1.0)

julia> Float32(1)
1.0f0

julia> Float64(1)
1.0

is inconsistent and ugly. While we're making changes to float printing code internals, I think it's a good time to try and address this and come up with something better. Maybe we introduce 1.0h0 for Float16 literal syntax? Maybe we print Float32 as Float32(1.0)? I think there are a few different things we can consider, but I feel like we should at least improve the situation here.

JeffBezanson · 2019-08-29T15:55:57Z

The immediate problem is this:

julia> repr(Any[1.0f0, 1.0])
"Any[1.0, 1.0]"

because show == repr. We could restore the type info for show, but remove it for show with a MIME type. That way they'd all print as 1.0 on their own in the REPL.

jmkuhn · 2019-08-29T18:45:14Z

Even before this PR we have:

julia> repr(Any[1.0f0, 1.0, big"1.0"])
"Any[1.0f0, 1.0, 1.0]"

StefanKarpinski · 2019-08-29T18:47:06Z

Yeah, the BigFloat printing should probably be fixed to print as big"1.0". A major issue there, of course, is that big"1.0" means different things depending on what the precision is set to.

simonbyrne · 2019-08-29T19:21:53Z

What if we were to break numerical juxtaposition for just macros? We could then allow something like

1.0@big

to be parsed as

@val_big("1.0")

(and similar for @val_F16, @val_F32, etc.)

chethega · 2019-08-29T20:11:53Z

I am a really big fan if the 1.0f0 printing and syntax.

I would actually prefer to extend it, i.e. print signed integers like 0i8, 14i16, 5i32 (on 64 bit systems, and just 5 on 32 bit), 7 (on 64 bit systems, and 7i64 on 64 bit) and -6i128. Unfortunately breaking for people who have variables named i8 or the like, and use juxtaposition for multiplication.

simonbyrne · 2020-02-14T22:13:05Z

microsoft/STL#498 includes a fairly extensive test suite of Ryu functionality that would be good to incorporate.

JeffreySarnoff · 2020-02-16T18:24:41Z

q.v. microsoft/STL@999ccb9#diff-114a6cc59bd499c417b11cd3355bfdf3

Switch float printing from grisu to ryu algorithm

fredrikekre added the domain:display and printing Aesthetics and correctness of printed representations of objects. label Aug 5, 2019

quinnj force-pushed the jq/ryu branch from cc5df43 to a46aba3 Compare August 7, 2019 12:55

quinnj mentioned this pull request Aug 7, 2019

@sprintf throws a bound error - but only on some inputs. #22137

Closed

quinnj force-pushed the jq/ryu branch from b788d6f to 0a2c30e Compare August 10, 2019 17:41

quinnj mentioned this pull request Aug 10, 2019

Rewrite printf functionality #32859

Merged

quinnj force-pushed the jq/ryu branch 3 times, most recently from 937e8ae to ca2b3fd Compare August 13, 2019 16:12

vtjnash reviewed Aug 14, 2019

View reviewed changes

vtjnash previously requested changes Aug 14, 2019

View reviewed changes

Switch float printing from grisu to ryu algorithm

869090b

quinnj force-pushed the jq/ryu branch from ca2b3fd to 869090b Compare August 14, 2019 16:11

quinnj mentioned this pull request Aug 22, 2019

Scientific notation write issue in v0.5 JuliaData/CSV.jl#480

Closed

mbauman requested a review from vtjnash August 27, 2019 16:12

quinnj merged commit ef75269 into master Aug 28, 2019

delete-merged-branch bot deleted the jq/ryu branch August 28, 2019 02:25

Keno added the status:triage This should be discussed on a triage call label Aug 28, 2019

Keno reviewed Aug 28, 2019

View reviewed changes

JeffBezanson mentioned this pull request Aug 29, 2019

restore type info in show of float types? #33115

Closed

quinnj mentioned this pull request Sep 17, 2019

Taking Juleps seriously #33239

Closed

Keno removed the status:triage This should be discussed on a triage call label Feb 27, 2020

Keno pushed a commit that referenced this pull request Jun 5, 2024

Merge pull request #32799 from JuliaLang/jq/ryu

26fa26e

Switch float printing from grisu to ryu algorithm

	#=@inbounds=# mula, mulb, mulc = POW10_SPLIT[POW10_OFFSET[idx + 1] + i + 1]
	mula, mulb, mulc = POW10_SPLIT[POW10_OFFSET[idx + 1] + i + 1]

	unsafe_copyto!(buf, pos, DIGIT_TABLE, 2 * div(e, 10) + 1, 2)
	copyto!(buf, pos, DIGIT_TABLE, 2 * div(e, 10) + 1, 2)

	if -4 < pt <= (precision == -1 ? 6 : precision) #&& !(pt >= olength && abs(mod(x + 0.05, 10^(pt - olength)) - 0.05) > 0.05)
	if -4 < pt <= (precision == -1 ? 6 : precision)

	GC.@preserve buf unsafe_write(io, pointer(buf), pos - 1)
	write(io, resize!(bu, pos - 1))

		@@ -0,0 +1,362 @@
		@inline function writeshortest(buf::Vector{UInt8}, pos, x::T,

Switch float printing from grisu to ryu algorithm #32799

Switch float printing from grisu to ryu algorithm #32799

Conversation

quinnj commented Aug 5, 2019 • edited Loading

JeffBezanson commented Aug 5, 2019

quinnj commented Aug 5, 2019

JeffreySarnoff commented Aug 7, 2019

quinnj commented Aug 7, 2019

simonbyrne commented Aug 8, 2019

quinnj commented Aug 13, 2019

StefanKarpinski commented Aug 13, 2019

quinnj commented Aug 13, 2019

StefanKarpinski commented Aug 13, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vtjnash commented Aug 14, 2019

vtjnash left a comment

Choose a reason for hiding this comment

quinnj commented Aug 14, 2019

vtjnash commented Aug 14, 2019

quinnj commented Aug 14, 2019

quinnj commented Aug 27, 2019

Keno commented Aug 28, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JeffBezanson commented Aug 28, 2019

quinnj commented Aug 28, 2019

JeffBezanson commented Aug 28, 2019

quinnj commented Aug 29, 2019

JeffBezanson commented Aug 29, 2019

jmkuhn commented Aug 29, 2019

StefanKarpinski commented Aug 29, 2019

simonbyrne commented Aug 29, 2019

chethega commented Aug 29, 2019

simonbyrne commented Feb 14, 2020 • edited Loading

JeffreySarnoff commented Feb 16, 2020 • edited Loading

quinnj commented Aug 5, 2019 •

edited

Loading

Keno commented Aug 28, 2019 •

edited

Loading

simonbyrne commented Feb 14, 2020 •

edited

Loading

JeffreySarnoff commented Feb 16, 2020 •

edited

Loading