permutedims(::SparseMatrixCSC) calls a fallback method #28324

jlapeyre · 2018-07-28T14:30:03Z

The documentation recommends permutedims to get a material transpose. There is an efficient function, but it is not called:

julia> size(S)
(20, 10)

julia> typeof(S)
SparseMatrixCSC{Float64,Int64}

julia> @btime permutedims($S);
  2.860 μs (12 allocations: 2.38 KiB)

julia> @btime copy(Transpose($S));
  188.709 ns (5 allocations: 720 bytes)

julia> copy(Transpose(S)) == permutedims(S)
true

The text was updated successfully, but these errors were encountered:

andreasnoack · 2018-07-30T09:48:58Z

We'll need a permutedims methods for SparseMatrixCSC that just calls copy(transpose(A)). Would be great if you could prepare a PR.

jlapeyre · 2018-07-31T09:25:57Z

Would be great if you could prepare a PR.

OK. In case someone cares to comment first, these are the relevant lines in sparsematrix.jl

adjoint(A::SparseMatrixCSC) = Adjoint(A)
transpose(A::SparseMatrixCSC) = Transpose(A)
Base.copy(A::Adjoint{<:Any,<:SparseMatrixCSC}) = ftranspose(A.parent, conj)
Base.copy(A::Transpose{<:Any,<:SparseMatrixCSC}) = ftranspose(A.parent, identity)

The following should provide a minimal fix

Base.permutedims(A::SparseMatrixCSC) = ftranspose(A.parent, identity)
function Base.permutedims(A::SparseMatrixCSC, (a,b))
    (a, b) == (2, 1) && return ftranspose(A.parent, identity)
    (a, b) == (1, 2) && return copy(A)
    throw(ArgumentError("no valid permutation of dimensions"))
end

The error message could be better, but it is consistent with existing code.
I don't like ftranspose(A.parent, identity) repeated three times.
Note that the material transpose, which is supposed be called recursively, is exactly the same as permutedims, which is supposed to act on the top-level only. (This suggested PR would only do this more efficiently). In fact, the material transpose appears to act at the top-level only (Inconsistency in descending transpose between sparse and dense arrays #28369).
Using copy(transpose(M)) to get a material transpose is so obviously wrong at first glance that I guess it is not accidental. I searched for arguments in its favor, but I have not been able to find a relevant discussion. This use is semantically orthogonal to the other uses of copy. Whatever the choice, it should be generally applicable to wrappers meant to implement lazy evaluation. This precludes using copy to.... copy such wrapped objects. The assertion "That will never be needed" presumes an unusual degree of clairvoyance.

copy(1:3) returns a UnitRange. And, collect(1:3) returns a dense array. I seems that collect is a better choice than copy to materialize a lazy operation. But, I'd like to read a discussion, if one exists.

andreasnoack · 2018-07-31T10:19:46Z

You are right that permutedims is only supposed to act on the top-level so my proposed solution wasn't correct.

Using copy(transpose(M)) to get a material transpose is so obviously wrong...This use is semantically orthogonal to the other uses of copy.

This has indeed been discussed. Maybe @Sacha0 remembers which are the relevant issues. Would you prefer to get a Transpose when using copy? Would you copy the underlying array or should it just be a noop? A similar example where it might be more obvious that you don't want copy to return the same type is for SubArray where it would be unfortunate if copy(view(randn(1000,1000), 1:2, 1:2)) copied the underlying arrays and it is redundant to wrap the copied 2x2 matrix in a SubArray.

jlapeyre · 2018-07-31T12:52:21Z

it would be unfortunate if copy(view(randn(1000,1000), 1:2, 1:2)) copied the underlying arrays and it is redundant to wrap the copied 2x2 matrix in a SubArray

If it comes down to a choice between preserving the current consensus for copy(transpose(.... and disasters like your example (or the absurdity of copying just a wrapper), then clearly the former is better. It may be non-orthogonal design (it certainly was not "discoverable" for me), but it is performant.

Would you prefer to get a Transpose when using copy? Would you copy the underlying array or should it just be a noop?

That's a good question. But, maybe better questions are what do you want and what is the best interface for providing it ? Currently, we have

julia> v = view(randn(1000, 1000), 1:2, 1:2);

julia> copy(v)
2×2 Array{Float64,2}:
 -0.981013  1.06454
 -1.69617   1.73077

julia> collect(v)
2×2 Array{Float64,2}:
 -0.981013  1.06454
 -1.69617   1.73077

The same holds for collect(transpose(m)). It seems to me (again, I'd like to see more counter examples) that recommending collect for materializing the view is better precisely because it is less ambiguous. So, the current behavior of copy might be correct for avoiding unintended consequences. But, that doesn't mean it has to be the recommended interface for materializing lazy views. If the behavior of collect above holds more or less generally, then it shouldn't be a huge effort to switch to recommending it. Has using collect this way been discussed ?

andreasnoack · 2018-07-31T13:14:17Z

Has using collect this way been discussed ?

Yes and for some reason it has fallen in dismay. I'm pretty neutral wrt collect but the general view seems to be that it should go away and in favor of using constructors. Most often that would be Array. However, in this case, and possibly other cases, something generic is needed since the result should depend on whatever is being wrapped by Transpose. Again, @Sacha0 has a good memory and will probably be able to fill in the missing details here.

jlapeyre · 2018-07-31T14:09:40Z

Ok thanks for the tip. Here are some issues to start with: #16029, #24595, #12251. Looks like it started with more or less "collect can always be replaced with Vector". But it turns out it's more complicated than that.

Sacha0 · 2018-07-31T19:14:27Z

Regarding the semantics of copy, collect, materialize and friends, I

to the inimitable @mbauman. Best!

jlapeyre · 2018-07-31T19:38:13Z

I punt

I'm preparing a what I hope is a non-controversial PR :)

Sacha0 · 2018-07-31T19:42:14Z

I'm preparing a what I hope is a non-controversial PR :)

Famous last words 😉.

mbauman · 2018-07-31T19:54:45Z

Yes, copy vs. collect vs. materialize goes down a big rabbit hole very quickly. If you're curious about reading more, #16029 is a good starting point for the latest thoughts. From there leads a labyrinth of linked issues.

For now, we're living with the fact that copy plays two roles: it copies elements into a new outer structure and in doing so it often "materializes" laziness. This is something I know both Sacha and I have wanted to address, but it's a big task that's been on hold as we've finalized 0.7/1.0.

ViralBShah · 2019-07-06T22:25:55Z

This appears resolved to me. I get the same performance for both operations reported above.

andreasnoack added the domain:arrays:sparse Sparse arrays label Jul 30, 2018

ViralBShah closed this as completed Jul 6, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

permutedims(::SparseMatrixCSC) calls a fallback method #28324

permutedims(::SparseMatrixCSC) calls a fallback method #28324

jlapeyre commented Jul 28, 2018

andreasnoack commented Jul 30, 2018

jlapeyre commented Jul 31, 2018

andreasnoack commented Jul 31, 2018

jlapeyre commented Jul 31, 2018

andreasnoack commented Jul 31, 2018 •

edited

Loading

jlapeyre commented Jul 31, 2018

Sacha0 commented Jul 31, 2018

jlapeyre commented Jul 31, 2018

Sacha0 commented Jul 31, 2018

mbauman commented Jul 31, 2018 •

edited

Loading

ViralBShah commented Jul 6, 2019

permutedims(::SparseMatrixCSC) calls a fallback method #28324

permutedims(::SparseMatrixCSC) calls a fallback method #28324

Comments

jlapeyre commented Jul 28, 2018

andreasnoack commented Jul 30, 2018

jlapeyre commented Jul 31, 2018

andreasnoack commented Jul 31, 2018

jlapeyre commented Jul 31, 2018

andreasnoack commented Jul 31, 2018 • edited Loading

jlapeyre commented Jul 31, 2018

Sacha0 commented Jul 31, 2018

jlapeyre commented Jul 31, 2018

Sacha0 commented Jul 31, 2018

mbauman commented Jul 31, 2018 • edited Loading

ViralBShah commented Jul 6, 2019

andreasnoack commented Jul 31, 2018 •

edited

Loading

mbauman commented Jul 31, 2018 •

edited

Loading