inference regression on afoldl #39175

Closed
jmichel7 opened this issue Jan 10, 2021 · 11 comments · Fixed by #39414
Labels
kind:regression Regression in behavior compared to a previous version performance Must go faster

Comments

@jmichel7

I tried my big algebra package on 1.6.0-beta1. Many parts are slightly faster, but one important part is many times slower.
I found several problems; the shortest MWE is:

struct Perm
  d::Vector{Int}
end
function Perm(x::Integer...)
  if isempty(x) return Perm(Int[]) end
  d=Int.(1:max(x...))
  for i in 1:length(x)-1
    d[x[i]]=x[i+1]
  end
  d[x[end]]=x[1]
  Perm(d)
end
struct SymmetricGroup
  gens::Vector{Perm}
end
SymmetricGroup(n::Int)=SymmetricGroup([Perm(i,i+1) for i in 1:n-1])

using BenchmarkTools

# on julia 1.5.3
@btime SymmetricGroup(20)
  1.080 μs (39 allocations: 3.72 KiB)

# on julia 1.6.0-beta1
@btime SymmetricGroup(20)
  7.658 μs (115 allocations: 6.39 KiB)
@sostock
Contributor

sostock commented Jan 10, 2021

A major contribution seems to be the operation Int.(1:max(x...)).

Julia 1.5.3:

julia> f(x...) = Int.(1:max(x...))
f (generic function with 1 method)

julia> @btime f(1,2)
  50.888 ns (1 allocation: 96 bytes)

master:

julia> f(x...) = Int.(1:max(x...))
f (generic function with 1 method)

julia> @btime f(1,2)
  444.369 ns (5 allocations: 240 bytes)

@simeonschaub simeonschaub added kind:regression Regression in behavior compared to a previous version performance Must go faster labels Jan 10, 2021
@cafaxo
Contributor

cafaxo commented Jan 10, 2021

The current master does not specialize the function.
Forcing specialization by doing f(x::Vararg{Int, N}) where N = Int.(1:max(x...)) works around this.
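The workaround above can be sketched side by side (hypothetical names f_slow/f_fast, assuming the MWE from the original report):

```julia
# Unspecialized varargs: under Julia's heuristic, a single generic body
# may be compiled and shared across all argument counts.
f_slow(x...) = Int.(1:max(x...))

# Naming the tuple length N in the signature forces a separate
# specialization to be compiled per argument count.
f_fast(x::Vararg{Int,N}) where {N} = Int.(1:max(x...))
```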

@jmichel7
Author

jmichel7 commented Jan 10, 2021

It is not only the line you are focusing on. The following version

function Perm(x::Int...)
  if isempty(x) return Perm(Int[]) end
  y=collect(x)
  d=collect(1:maximum(y))
  for i in 1:length(y)-1
    d[y[i]]=y[i+1]
  end
  d[y[end]]=y[1]
  Perm(d)
end

shows a factor-of-two difference between 1.5.3 and 1.6.0-beta1.

@cafaxo
Contributor

cafaxo commented Jan 10, 2021

If specialization of Perm is forced by doing
function Perm(x::Vararg{Int, N}) where N,
the performance difference is gone.

@jebej
Contributor

jebej commented Jan 10, 2021

Specializing with function Perm(x::Vararg{Int, N}) where N works for me too. It speeds up both julia versions to be twice as fast as the original on 1.5.

@jmichel7
Author

Is this going to be fixed in 1.6? It seems to me that splatting (...) is frequently used; needing to replace it with Vararg to get back the performance is bothersome.

@cafaxo
Contributor

cafaxo commented Jan 13, 2021

The performance tips page says

As a heuristic, Julia avoids automatically specializing on argument type parameters in three specific cases: Type, Function, and Vararg. Julia will always specialize when the argument is used within the method, but not if the argument is just passed through to another function.

Maybe it is not clear that the second sentence only refers to Function arguments.

@jmichel7
Author

I do not understand. Do you find it normal that performance is degraded in 1.6 relative to 1.5?

@timholy
Sponsor Member

timholy commented Jan 13, 2021

@jmichel7, first, please be patient. You only filed this 3 days ago; there are not legions of people available to fix these issues, and all of them have big TODO lists.

I am not certain this will be fixed, though; Julia's specialization heuristics have always influenced performance, both runtime and compile-time (latency). We've long recommended for getindex to use the Vararg{Int,N} format (see https://docs.julialang.org/en/v1/manual/interfaces/#man-interface-array). Julia 1.6 has worked very hard to get the latency down, and this might be a by-product. Let's see if someone else chimes in.
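The getindex format recommended in the interfaces manual looks like this (a minimal sketch using a hypothetical wrapper type MyVec):

```julia
struct MyVec <: AbstractVector{Int}
    data::Vector{Int}
end

Base.size(v::MyVec) = size(v.data)

# Vararg{Int,N} in the signature makes Julia specialize on the number of
# indices rather than compiling one unspecialized varargs method.
Base.getindex(v::MyVec, I::Vararg{Int,N}) where {N} = v.data[I...]
```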

@vtjnash vtjnash changed the title This function is 10 times slower on julia1.6.0-beta1 than on 1.5.3 inference regression on afoldl Jan 25, 2021
@vtjnash
Sponsor Member

vtjnash commented Jan 25, 2021

For some reason we're no longer able or willing to infer through afoldl in code_typed(max, (Int, Vararg{Int}))[end]

@vtjnash
Sponsor Member

vtjnash commented Jan 26, 2021

Ah, we changed this intentionally in the new version:
=> v1.5

julia> Base._methods_by_ftype(Tuple{typeof(Base.afoldl), typeof(max), Int64, Vararg{Int64}},
    4, Base.get_world_counter())
4-element Vector{Any}:
 Core.MethodMatch(Tuple{typeof(Base.afoldl), typeof(max), Int64}, svec(), afoldl(op, a) in Base at operators.jl:583, false)
 Core.MethodMatch(Tuple{typeof(Base.afoldl), typeof(max), Int64, Int64}, svec(), afoldl(op, a, b) in Base at operators.jl:584, false)
 Core.MethodMatch(Tuple{typeof(Base.afoldl), typeof(max), Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Int64, Vararg{Int64}}, svec(), afoldl(op, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, qs...) in Base at operators.jl:586, false)
 Core.MethodMatch(Tuple{typeof(Base.afoldl), typeof(max), Int64, Int64, Vararg{Int64}}, svec(), afoldl(op, a, b, c...) in Base at operators.jl:585, false)

=> v1.6

julia> Base._methods_by_ftype(Tuple{typeof(Base.afoldl), typeof(max), Int64, Vararg{Int64}},
    3, Base.get_world_counter())
false

vtjnash added a commit that referenced this issue Jan 27, 2021
With constant-propagation, inference (and Varargs runtime) is likely
better able to handle this version now (and it removes the n^2 behavior
definitions for semi-low argument counts).

Now we also need to force Vararg specialization up to 16 arguments
manually, so we do that explicitly (and slightly hack-y).

Fixes regression in #39175
KristofferC pushed a commit that referenced this issue Jan 29, 2021 (cherry picked from commit 5cd1e3e)
KristofferC pushed a commit that referenced this issue Feb 1, 2021 (cherry picked from commit 5cd1e3e)
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
staticfloat pushed a commit that referenced this issue Dec 23, 2022 (cherry picked from commit 5cd1e3e)
7 participants