Massive performance degradation between v0.2.8 and v0.2.9 with LoopVectorization.jl #50

@MasonProtter

I think something is not being communicated correctly to LoopVectorization.jl in Tullio v0.2.9. For comparison, here are the timings on v0.2.8 first:

julia> using Tullio, LoopVectorization, BenchmarkTools, LinearAlgebra
[ Info: Precompiling Tullio [bc48ee85-29a4-5162-ae0b-a64e1601d4bc]

julia> tmul!(C, A, B) = @tullio C[i, j] = A[i, k] * B[k, j]
tmul! (generic function with 1 method)

julia> foreach((2, 10, 50, 100)) do N
           A, B = rand(N, N + 1), rand(N + 1, N + 2)
           @show N
           @btime tmul!(C, $A, $B) setup=(C=zeros($N, $N+2)) # Matmul with Tullio.jl
           @btime  mul!(C, $A, $B) setup=(C=zeros($N, $N+2)) # Matmul with OpenBLAS
       end
N = 2
  51.804 ns (0 allocations: 0 bytes)
  123.709 ns (0 allocations: 0 bytes)
N = 10
  210.035 ns (0 allocations: 0 bytes)
  371.549 ns (0 allocations: 0 bytes)
N = 50
  10.550 μs (0 allocations: 0 bytes)
  13.939 μs (0 allocations: 0 bytes)
N = 100
  25.340 μs (49 allocations: 3.19 KiB)
  39.860 μs (0 allocations: 0 bytes)

(@v1.5) pkg> st Tullio LoopVectorization
Status `~/.julia/environments/v1.5/Project.toml`
  [bdcacae8] LoopVectorization v0.8.26
  [bc48ee85] Tullio v0.2.8
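
For anyone unfamiliar with the notation: the `@tullio` expression in `tmul!` sums over the repeated index `k`, so it is an ordinary matrix multiplication written into `C`. A minimal plain-loop sketch of the same computation follows. The name `naive_mul!` is hypothetical and not part of Tullio, and this is not the code Tullio generates, only a reference for what the result should be.

```julia
# Hypothetical reference implementation, not part of Tullio:
# computes C[i, j] = sum over k of A[i, k] * B[k, j], overwriting C.
function naive_mul!(C, A, B)
    @assert size(A, 2) == size(B, 1)
    @assert size(C) == (size(A, 1), size(B, 2))
    fill!(C, zero(eltype(C)))
    # The last iterator (i) is the innermost loop, which suits
    # Julia's column-major array layout.
    for j in axes(B, 2), k in axes(A, 2), i in axes(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end

# Same shapes as in the benchmark above
N = 10
A, B = rand(N, N + 1), rand(N + 1, N + 2)
C = naive_mul!(zeros(N, N + 2), A, B)
println(C ≈ A * B)
```

Such a loop is only useful as a correctness check here; it says nothing about where the time goes in either Tullio version.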

Now, after restarting Julia and upgrading to v0.2.9:

(@v1.5) pkg> add Tullio@v0.2.9
   Updating registry at `~/.julia/registries/General`
   Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Resolving package versions...
Updating `~/.julia/environments/v1.5/Project.toml`
  [bc48ee85] ↑ Tullio v0.2.8 ⇒ v0.2.9
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [bc48ee85] ↑ Tullio v0.2.8 ⇒ v0.2.9

julia> using Tullio, LoopVectorization, BenchmarkTools, LinearAlgebra
[ Info: Precompiling Tullio [bc48ee85-29a4-5162-ae0b-a64e1601d4bc]

julia> tmul!(C, A, B) = @tullio C[i, j] = A[i, k] * B[k, j]
tmul! (generic function with 1 method)

julia> foreach((2, 10, 50, 100)) do N
           A, B = rand(N, N + 1), rand(N + 1, N + 2)
           @show N
           @btime tmul!(C, $A, $B) setup=(C=zeros($N, $N+2)) # Matmul with Tullio.jl
           @btime  mul!(C, $A, $B) setup=(C=zeros($N, $N+2)) # Matmul with OpenBLAS
       end
N = 2
  51.125 ns (0 allocations: 0 bytes)
  129.749 ns (0 allocations: 0 bytes)
N = 10
  847.338 ns (0 allocations: 0 bytes)
  372.568 ns (0 allocations: 0 bytes)
N = 50
  111.719 μs (0 allocations: 0 bytes)
  13.920 μs (0 allocations: 0 bytes)
N = 100
  261.787 μs (49 allocations: 3.19 KiB)
  38.220 μs (0 allocations: 0 bytes)

(@v1.5) pkg> st Tullio LoopVectorization
Status `~/.julia/environments/v1.5/Project.toml`
  [bdcacae8] LoopVectorization v0.8.26
  [bc48ee85] Tullio v0.2.9
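
To put a number on the regression, the slowdown can be computed directly from the Tullio timings in the two runs. This is a small sketch with the values copied from the transcripts in this report; the variable names are only for illustration.

```julia
# Tullio (tmul!) timings in nanoseconds, copied from the two runs in this report
Ns     = (2, 10, 50, 100)
t_v028 = (51.804, 210.035, 10_550.0, 25_340.0)    # Tullio v0.2.8
t_v029 = (51.125, 847.338, 111_719.0, 261_787.0)  # Tullio v0.2.9

# Slowdown factor per problem size: time on v0.2.9 divided by time on v0.2.8
ratios = map((old, new) -> round(new / old, digits = 2), t_v028, t_v029)

for (N, r) in zip(Ns, ratios)
    println("N = ", N, ": v0.2.9 / v0.2.8 = ", r)
end
```

N = 2 is essentially unchanged, N = 10 is about 4x slower, and N = 50 and N = 100 are each roughly 10x slower, while the `mul!` timings stay the same across both runs.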
