
Performance regression in "spellcheck" string processing benchmark #50458

Closed
KristofferC opened this issue Jul 7, 2023 · 7 comments
Labels: domain:strings, kind:regression, performance

KristofferC commented Jul 7, 2023

The benchmark at https://github.com/JuliaCI/BaseBenchmarks.jl/blob/master/src/problem/SpellCheck.jl shows a ~4x regression on 1.10 compared to 1.9. A copy-pasteable repro:

using Downloads
spellcheck = "https://raw.githubusercontent.com/JuliaCI/BaseBenchmarks.jl/master/src/problem/data/norvig_spellcheck.txt"
mkpath("data")
Downloads.download(spellcheck, joinpath("data", "norvig_spellcheck.txt"))

module ProblemBenchmarks
    using Downloads
    const PROBLEM_DATA_DIR = joinpath(dirname(@__FILE__), "data")
    file = Downloads.download("https://raw.githubusercontent.com/JuliaCI/BaseBenchmarks.jl/master/src/problem/SpellCheck.jl")
    include(file)
end

using BenchmarkTools

@btime ProblemBenchmarks.SpellCheck.perf_spellcheck()

This gives

1.324 s (23983215 allocations: 1.49 GiB) # 1.9
4.900 s (133224596 allocations: 4.76 GiB) # 1.10

Quickly looking at a profile, this looks suspicious:

 13╎    ╎    ╎    ╎    ╎    ╎    ╎   2154 none:0; (::Main.ProblemBenchmarks.SpellCheck.var"#4#9")(::Tuple{Tuple{String, String}, Char})
 18╎    ╎    ╎    ╎    ╎    ╎    ╎    2057 @Base/strings/substring.jl:225; string
489╎    ╎    ╎    ╎    ╎    ╎    ╎     489  @Base/strings/substring.jl:229; _string(::String, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
156╎    ╎    ╎    ╎    ╎    ╎    ╎     156  @Base/strings/substring.jl:231; _string(::String, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
  3╎    ╎    ╎    ╎    ╎    ╎    ╎     555  @Base/strings/substring.jl:243; _string(::String, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
551╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 552  @Base/tuple.jl:72; iterate(t::Tuple{String, Vararg{Union{Char, SubString{String}, String, Symbol}}}, i::Int64)
165╎    ╎    ╎    ╎    ╎    ╎    ╎     165  @Base/strings/substring.jl:246; _string(::String, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
  1╎    ╎    ╎    ╎    ╎    ╎    ╎     593  @Base/strings/substring.jl:254; _string(::String, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
586╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 592  @Base/tuple.jl:72; iterate(t::Tuple{String, Vararg{Union{Char, SubString{String}, String, Symbol}}}, i::Int64)

In 1.9, far less time seems to be spent in that part:

   24╎    ╎    ╎    ╎    ╎    ╎    ╎   201   @Base/array.jl:0; (::Main.ProblemBenchmarks.SpellCheck.var"#4#9")(::Tuple{Tuple{String, String}, Char})
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    63    @Base/strings/substring.jl:237; string(::String, ::Char, ::Vararg{Union{Char, SubString{String}, String, Symbol}})
   63╎    ╎    ╎    ╎    ╎    ╎    ╎     63    @Base/strings/string.jl:90; _string_n

KristofferC added the performance, kind:regression, and domain:strings labels Jul 7, 2023
KristofferC added this to the 1.10 milestone Jul 7, 2023

gbaraldi commented

I'm pretty sure #49249 is the same as this?

gbaraldi commented

[image]
This is very similar to what I posted on the other issue; specifically,
[image]
appears in both.

oscardssmith commented

Is this fixed by #50444?

gbaraldi commented

Also no ;)

oscardssmith commented Aug 11, 2023

reduced to

@btime string("a", 'b')

On 1.9: 29.058 ns (3 allocations: 88 bytes)
On 1.10: 175.154 ns (6 allocations: 184 bytes)
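
For anyone without BenchmarkTools installed, the reduced case can be checked roughly with Base alone. This is only a sketch: `concat` and `measure` are hypothetical helper names, not part of the issue or of BaseBenchmarks, and the numbers depend on the Julia version and machine.

```julia
# Hypothetical helpers (not from the issue): wrap the reduced call so that it
# is compiled once and then measured inside a function, not at global scope.
concat(s, c) = string(s, c)

function measure(s, c, n)
    concat(s, c)                      # warm-up call so compilation is excluded
    total = 0
    bytes = @allocated for _ in 1:n
        # Use the result so the call cannot be dropped as dead code.
        total += ncodeunits(concat(s, c))
    end
    return bytes / n, total
end

bytes_per_call, total = measure("a", 'b', 100_000)
println("approx. bytes allocated per call: ", bytes_per_call)
```

Comparing the printed value between 1.9 and 1.10 should show the same direction of change as the `@btime` allocation counts above, though the exact figures will not match.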

oscardssmith commented

reduced further to

julia> Core.Compiler.tmerge(Tuple{Base.BitSigned, Int}, Nothing)
Union{Nothing, Tuple{Any, Any}}

julia> Core.Compiler.tmerge(Tuple{Any, Int}, Nothing)
Union{Nothing, Tuple{Any, Int64}}

fixed by #50927.
That said I still think we might want to merge #50891. Thoughts?
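
In the two `tmerge` calls above, the first result widens the second tuple element from `Int` to `Any`, while the second keeps `Int64`. A small sketch for checking this on a given Julia build follows. `second_field_after_merge` is a hypothetical helper, and `Core.Compiler.tmerge` is a compiler internal whose two-argument form and output are assumed to match the session shown above; both may differ between versions.

```julia
# Hypothetical helper (not from the issue): merge T with Nothing using the
# compiler-internal tmerge and report the type of the tuple's second field.
function second_field_after_merge(T)
    merged = Core.Compiler.tmerge(T, Nothing)
    for part in Base.uniontypes(merged)
        # Pick out the tuple component of the resulting Union, if there is one.
        part isa DataType && part <: Tuple && return fieldtype(part, 2)
    end
    return nothing  # the merge widened past a concrete tuple shape
end

println(second_field_after_merge(Tuple{Base.BitSigned, Int}))
println(second_field_after_merge(Tuple{Any, Int}))
```

If the first line prints `Any` while the second prints `Int64`, the build shows the widening reported in this comment.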

oscardssmith commented

fixed by #50929 (once that gets backported)
