Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve performance for number to string conversion functions #28661

Merged
merged 2 commits into from
Aug 15, 2018

Conversation

KristofferC
Copy link
Sponsor Member

This is kinda silly but it is what it is...

Benchmark:

for base in (2, 8, 10, 16)
    @btime begin
        u = rand(UInt32)
        string(u; base=$base)
    end
end

Before

  385.406 ns (2 allocations: 118 bytes)
  168.925 ns (2 allocations: 96 bytes)
  131.273 ns (2 allocations: 96 bytes)
  149.896 ns (2 allocations: 96 bytes)

After

  72.232 ns (2 allocations: 118 bytes)
  49.118 ns (2 allocations: 96 bytes)
  61.106 ns (2 allocations: 96 bytes)
  68.188 ns (2 allocations: 96 bytes)

@KristofferC
Copy link
Sponsor Member Author

Noticed this when looking into the parse_int regression reported in JuliaLang/www.julialang.org#139.

@KristofferC
Copy link
Sponsor Member Author

The following patch also manages to speed it up:

diff --git a/base/char.jl b/base/char.jl
index 700c61db27c..d071a3988bd 100644
--- a/base/char.jl
+++ b/base/char.jl
@@ -108,10 +108,14 @@ and [`show_invalid`](@ref).
 """
 isoverlong(c::AbstractChar) = false

-function UInt32(c::Char)
+@inline function UInt32(c::Char)
     # TODO: use optimized inline LLVM
     u = reinterpret(UInt32, c)
     u < 0x80000000 && return u >> 24
+    return _uint32(u)
+end
+
+@noinline function _uint32(u::UInt32)
     l1 = leading_ones(u)
     t0 = trailing_zeros(u) & 56
     (l1 == 1) | (8l1 + t0 > 32) |
@@ -142,8 +146,12 @@ that support overlong encodings should implement `Base.decode_overlong`.
 """
 decode_overlong

-function Char(u::UInt32)
+@inline function Char(u::UInt32)
     u < 0x80 && return reinterpret(Char, u << 24)
+    return _char(u)
+end
+
+@noinline function _char(u::UInt32)
     u < 0x00200000 || code_point_err(u)::Union{}
     c = ((u << 0) & 0x0000003f) | ((u << 2) & 0x00003f00) |
         ((u << 4) & 0x003f0000) | ((u << 6) & 0x3f000000)

It is a bit more general but I guess might slow things down if the old Char function inlined and you are dealing with non ASCII a lot?

@johnfgibson
Copy link
Contributor

That is great. Any chance there's a similar fix for a performance regression in the print_to_file microbenchmark? It went from 6.86 in julia-0.6.4 to 10.8 in julia-0.7.0 and 1.0.0. The core of the benchmark is

for i = 1:n
    @printf(io, "%d %d\n", i, i + 1)
end

@jebej
Copy link
Contributor

jebej commented Aug 15, 2018

Maybe use Int('0') for clarity?

@KristofferC
Copy link
Sponsor Member Author

That is great. Any chance there's a similar fix for a performance regression in the print_to_file microbenchmark? It went from 6.86 in julia-0.6.4 to 10.8 in julia-0.7.0 and 1.0.0. The core of the benchmark is

There are, in general, String regressions a bit here and there. I'll see if there is some simple fix that can be applied like here.

@KristofferC
Copy link
Sponsor Member Author

Maybe use Int('0') for clarity?

That kills performance.

@KristofferC
Copy link
Sponsor Member Author

For now, I think we should go with this "local" performance improvement and then we can look at a perhaps more general but more intrusive change like #28661 (comment).

@jebej
Copy link
Contributor

jebej commented Aug 15, 2018

That kills performance.

Argh, I've been using this type of pattern. Isn't the conversion done at compile time? Why would it be worse than just '0'?

@KristofferC
Copy link
Sponsor Member Author

Argh, I've been using this type of pattern. Isn't the conversion done at compile time? Why would it be worse than just '0'?

It should be and in some cases it does.

julia> f() = UInt8('0')
f (generic function with 1 method)

julia> @code_llvm f()

; Function f
; Location: REPL[4]:1
define i8 @julia_f_34955() {
top:
  ret i8 48
}

But for some reason, it doesn't seem to happen here. Not sure why.

@KristofferC KristofferC merged commit 472fe5f into master Aug 15, 2018
@KristofferC KristofferC deleted the kc/perf_parse branch August 15, 2018 19:04
@vtjnash
Copy link
Sponsor Member

vtjnash commented Aug 15, 2018

Typically, we handle that case by writing it thusly:

@eval function f() = begin
    some_computation($(some_constant_computation(some_constants)))
end

KristofferC added a commit that referenced this pull request Aug 19, 2018
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
KristofferC added a commit that referenced this pull request Aug 19, 2018
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
@KristofferC KristofferC mentioned this pull request Aug 19, 2018
KristofferC added a commit that referenced this pull request Aug 19, 2018
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
KristofferC added a commit that referenced this pull request Sep 8, 2018
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
KristofferC added a commit that referenced this pull request Sep 8, 2018
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
KristofferC added a commit that referenced this pull request Feb 11, 2019
* improve performance for number to string conversion functions

(cherry picked from commit 472fe5f)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:strings "Strings!" performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants