Closed as duplicate of #850
Description
I noticed that matrix-matrix multiplication between H::Hermitian{Float64, Matrix{Float64}} and A::Adjoint{Float64, Matrix{Float64}} is orders of magnitude slower than the corresponding multiplications H*H and A*A, or than multiplications involving an ordinary matrix M::Matrix{Float64}, that is, M*M, M*H and M*A. Below is a concrete example with some sample timings on my machine.
using LinearAlgebra, BenchmarkTools
H = Hermitian(randn(1000, 1000))
M = Matrix(H)
A = M'
@btime M*M;
# 6.248 ms (3 allocations: 7.63 MiB)
@btime H*H;
# 8.359 ms (6 allocations: 15.26 MiB)
@btime A*A;
# 6.435 ms (3 allocations: 7.63 MiB)
@btime M*A;
# 4.270 ms (3 allocations: 7.63 MiB)
@btime M*H;
# 6.350 ms (3 allocations: 7.63 MiB)
### AND HERE IS THE PROBLEM
@btime H*A;
# 1.025 s (3 allocations: 7.63 MiB)
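As a sketch of a possible workaround (not benchmarked here), materializing either wrapper into a plain Matrix seems to route the product back to a BLAS kernel, at the cost of one extra allocation. Continuing from the snippet above:
# copy(A) materializes the Adjoint into a plain Matrix, so the product should
# dispatch to the fast Hermitian-times-Matrix path rather than the generic fallback.
@btime H*copy(A);
# Alternatively, drop the Hermitian wrapper instead; Matrix times Adjoint hits BLAS gemm.
@btime Matrix(H)*A;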
I think that there may be some problem with method dispatch, as it seems that the H*A case uses a very generic, non-specialized method for vanilla AbstractArrays:
which(*, (typeof(H), typeof(A)))
# *(A::AbstractMatrix, B::AbstractMatrix)
# @ LinearAlgebra ~/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/share/julia/stdlib/v1.11/LinearAlgebra/src/matmul.jl:112
That method is simply this:
function (*)(A::AbstractMatrix, B::AbstractMatrix)
    TS = promote_op(matprod, eltype(A), eltype(B))
    mul!(matprod_dest(A, B, TS), A, B)
end
Version Info
Julia Version 1.11.3
Commit d63adeda50d (2025-01-21 19:42 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 16 × AMD Ryzen 9 5900HX with Radeon Graphics
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)
Environment:
JULIA_PROJECT = @.
JULIA_PKG_PRESERVE_TIERED_INSTALLED = true
JULIA_REVISE = manual