
missing sparse matrix transpose methods (performance regression in 0.7.0beta2 vs 0.6) #28432

Closed
StephenVavasis opened this issue Aug 3, 2018 · 13 comments
Labels
domain:arrays:sparse (Sparse arrays), domain:linear algebra (Linear algebra), performance (Must go faster)

Comments

@StephenVavasis
Contributor

The following statements are all slow in 0.7.0-beta2 and are regressions from 0.6. They are slow because of missing sparse-matrix-transpose methods, so with this issue I'm requesting that someone write the missing methods (and, even better, develop a tool or technique to automatically identify missing methods).

julia> x = spzeros(50000,50000)
50000×50000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time y = sparse(x')
  7.527323 seconds (608.47 k allocations: 30.442 MiB, 0.17% gc time)
50000×50000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time y = convert(SparseMatrixCSC{Float64,Int},x')
  7.351817 seconds (294 allocations: 406.281 KiB)
50000×50000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time y = hcat(x,x')
  7.672178 seconds (944.88 k allocations: 47.971 MiB, 0.61% gc time)
50000×100000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time y = vcat(x,x')
  7.634333 seconds (588.12 k allocations: 29.708 MiB, 0.05% gc time)
100000×50000 SparseMatrixCSC{Float64,Int64} with 0 stored entries

julia> @time is,js,es = findnz(x')
┌ Warning: `findnz(A::AbstractMatrix)` is deprecated, use `begin
│     I = findall(!iszero, A)
│     (getindex.(I, 1), getindex.(I, 2), A[I])
│ end` instead.
│   caller = top-level scope at util.jl:156
└ @ Core util.jl:156
  7.399162 seconds (520 allocations: 31.750 KiB)
(Int64[], Int64[], 0-element SparseArrays.SparseVector{Float64,Int64} with 0 stored entries)
@andreasnoack
Member

CuArrays has a way of detecting this issue. See https://github.com/JuliaGPU/CuArrays.jl/blob/master/src/indexing.jl#L3-L8. It was recently discussed on Slack. The trick is to introduce a global flag to make scalar indexing throw an error. That can be used during tests to detect when the slow fallback methods are being used.
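
Roughly, the trick looks like this (just a minimal sketch with made-up names, not the actual CuArrays code):

const scalar_allowed = Ref(true)

# Tests flip this off so that hitting a scalar fallback raises an error
function assertscalar(op::String)
    scalar_allowed[] || error("scalar $op is disallowed; a slow fallback method was hit")
    return nothing
end

# A wrapper array type would then call the guard from its scalar getindex, e.g.
# Base.getindex(A::MyArray, i::Int) = (assertscalar("getindex"); A.data[i])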

However, we might also need to consider the scope of defining methods that simply materialize the lazy transpose. In some cases, it might be more reasonable to ask the user to explicitly materialize the lazy transpose.

@StephenVavasis
Contributor Author

I think that materializing is a reasonable option, but wouldn't it make sense for materializing to be the default (i.e., the output of the apostrophe operator) and lazy to be some more specialized operator? The performance hit for needless materialization is not nearly as big as the performance hit the other way around (not materializing, and then falling back to a dense matrix implementation).

@ararslan added the performance, domain:linear algebra, and domain:arrays:sparse labels on Aug 4, 2018
@StephenVavasis
Contributor Author

I have a suggestion on how to fix these and similar problems. Right now, there is an abstract type AbstractSparseMatrix. Require that every type that subtypes it define a method convert(SparseMatrixCSC{T,Int}, A). Then, finally, for operations like matvec-mul, findnz, and so on, make the following definition:

function Base.:*(A::AbstractSparseMatrix{T}, x::AbstractVector{T}) where {T}
    return convert(SparseMatrixCSC{T,Int}, A) * x
end

In this manner, the fallback for unimplemented sparse methods will be SparseMatrixCSC operations instead of dense matrix operations.
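
For example, findnz could get the same kind of fallback (just a sketch along the same lines, assuming every subtype really does define the convert method):

import SparseArrays: AbstractSparseMatrix, SparseMatrixCSC, findnz

function findnz(A::AbstractSparseMatrix{T}) where {T}
    # fall back to the CSC implementation instead of a dense one
    return findnz(convert(SparseMatrixCSC{T,Int}, A))
end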

@oscardssmith
Member

If that were the path we wanted, we wouldn't need to require the convert methods; we could just convert manually by iterating over the nonzeros.

@StephenVavasis
Contributor Author

Sorry, I didn't follow the previous comment. Could you explain in more detail? The fallbacks are not "required"; they are happening by accident because of unimplemented methods.

@oscardssmith
Member

My point is that to get this behavior, sparse matrix types don't need to be able to convert to CSC; f(sparse, vector) can do the transformation itself.

@StephenVavasis
Contributor Author

Sorry, I'm still not getting it. Are you saying that every subtype of AbstractSparseMatrix should provide a method for iterating over its nonzero entries? And then fallbacks for AbstractSparseMatrix for matvec-mul, findnz, hcat, etc can be implemented in terms of that method? If this is what you mean, then I agree, this would be a good solution to the issue I am raising.
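
For concreteness, for SparseMatrixCSC such a protocol might look like this (nziteration is a placeholder name; rowvals, nonzeros, and nzrange are the real SparseArrays accessors):

using SparseArrays: SparseMatrixCSC, rowvals, nonzeros, nzrange

# Yield (i, j, v) for every stored entry, column by column
nziteration(A::SparseMatrixCSC) =
    ((rowvals(A)[k], j, nonzeros(A)[k]) for j in 1:size(A, 2) for k in nzrange(A, j))

A generic matvec-mul, findnz, hcat, etc. for AbstractSparseMatrix could then be written against nziteration alone.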

@oscardssmith
Member

Yes. Part of the reason for implementing it this way is that findnz is already implemented for many sparse structures, and it is a much more general operation.

@StephenVavasis
Contributor Author

OK, I'm glad we're in agreement. Is there already a name for the protocol for iterating over nonzero entries of an abstract sparse matrix? Something analogous to iterate in stdlib?

@oscardssmith
Member

Yes: findnz (it might have been renamed to findstored; I forget). I believe it is already used for most sparse structures.

@StephenVavasis
Contributor Author

The function that seems to iterate over nonzero entries is _sparse_findnextnz in sparsematrix.jl. This function is apparently never invoked, which is strange, because the Julia convention is that functions whose names start with underscores are helpers for nearby exported functions.

Another thing to note: the transpose of a sparse matrix is not an abstract sparse matrix in 0.7.0-beta2:

julia> import SparseArrays.sparse

julia> import SparseArrays.AbstractSparseMatrix

julia> eye1 = sparse(1:3,1:3,ones(3));

julia> isa(eye1, AbstractSparseMatrix)
true

julia> isa(eye1', AbstractSparseMatrix)
false

So the types would have to be adjusted somehow for your scheme to work. Perhaps this requires multiple inheritance. If I recall correctly, there is a way to simulate multiple inheritance in Julia called "Holy traits", but I don't remember right now how they work.

@StephenVavasis
Contributor Author

I looked up Holy traits, which are described here:

#2345 (comment)

and indeed, they can solve the above problems, albeit with a substantial refactoring of Base and stdlib. Consider, for example, how to define * so that it works properly for all of Julia's sparse matrices even though they don't belong to the same abstract type. First, every type of sparse matrix needs to define nziteration(), which is an iteration over its nonzero entries (in some order). Then we make definitions roughly as follows:

has_nziteration(::Any) = Val{false}
has_nziteration(::LinearAlgebra.Adjoint{F,SparseArrays.SparseMatrixCSC{F,I}}) where {F,I} = Val{true}
# and so on for all other sparse matrix types
*(A, v) = _internal_multiply(A, v, has_nziteration(A))
function _internal_multiply(A, v, ::Type{Val{true}})
    y = zeros(eltype(A), size(A, 1))
    for (i, j, e) in nziteration(A)
        y[i] += e * v[j]
    end
    return y
end
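
Presumably the trait's other branch would just hand off to the generic method, something like (again only a sketch):

# hypothetical Val{false} branch: no nonzero iteration available,
# so defer to the existing generic matrix-vector product
function _internal_multiply(A, v, ::Type{Val{false}})
    return invoke(*, Tuple{AbstractMatrix,AbstractVector}, A, v)
end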

I am willing to contribute some code to make this happen, but I don't have sufficient knowledge of Julia or clout in the community to start a PR that requires this much rewriting of Base and stdlib.

@ViralBShah
Member

I note that performance is significantly improved in all the reported cases. Let's file new issues for cases that still need work.
