-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
more generic broadcast over a single sparse matrix #19239
Conversation
# Bcolptr = similar(A.colptr, A.n + 1) | ||
# copy!(Bcolptr, 1, A.colptr, 1, A.n + 1) | ||
# Browval = similar(A.rowval, nnz(A)) | ||
# copy!(Browval, 1, A.rowval, 1, nnz(A)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These alternative lines for constructing Bcolptr
and Browval
might be safer than the copy
/ resize!
lines above: If A.colptr
and/or A.rowval
are shorter than their expected lengths, the copy!
s might catch the error, whereas the former copy
/resize!
combinations won't. The copy
/resize!
lines are simpler/cleaner though. Thoughts? Thanks!
# NOTE: The following two method definitions work around #19096. These definitions should | ||
# be folded into the two preceding definitions on resolution of #19096. | ||
broadcast{Tf,T}(f::Tf, ::Type{T}, A::SparseMatrixCSC) = broadcast(y -> f(T, y), A) | ||
broadcast{Tf,T}(f::Tf, A::SparseMatrixCSC, ::Type{T}) = broadcast(x -> f(x, T), A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The four method definitions above for handling common combinations of sparse matrices with broadcast
scalars is not satisfying: A better solution would achieve the same for (1) all broadcast
scalar types (Base.Broadcast.containertype(x) == Any
?) and (2) any combination (number, order, type mixture) of broadcast
scalars. One might achieve (2) via a generated function. But how one might simultaneously and gracefully achieve (1) eludes me. Thoughts much appreciated. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need a generated function. Maybe we can brainstorm with @pabloferz about this at some point soon? Maybe start a discourse thread? @pabloferz, are you on discourse.julialang.org yet?
In any case, it seems like that should be separate from this PR, which is only about the single-argument case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Al least for this case you might want to add another trait layer. E.g.
import Base.Broadcast.containertype
broadcast(f, A, B::SparseMatrixCSC) = broadcast_sp(f, containertype(A), containertype(B), A, B)
broadcast(f, A::SparseMatrixCSC, B) = broadcast_sp(f, containertype(A), containertype(B), A, B)
broadcast_sp(f, ::Type{Any}, ::Type{Array} A, B) = broadcast(x -> f(A, x), B)
broadcast_sp(f, ::Type{Array}, ::Type{Any} A, B) = broadcast(x -> f(x, B), A)
broadcast_sp(f, ::Type, ::Type, A, B) = sparse(Base.Broadcast.broadcast_c(f, Array, A, B))
@stevengj I'm already on discourse, so feel free to ping me (@pabloferz
) if you'd like to start a thread overe there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following snippet works for arbitrary combinations of broadcast scalars with a single sparse matrix, but the (y...) = f(a, y...)
kills inference.
import Base.Broadcast: broadcast_c, containertype, promote_containertype
containertype{T<:SparseMatrixCSC}(::Type{T}) = SparseMatrixCSC
# Combinations of sparse arrays with broadcast scalars should yield sparse arrays
promote_containertype(::Type{Any}, ::Type{SparseMatrixCSC}) = SparseMatrixCSC
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Any}) = SparseMatrixCSC
# Combinations of sparse arrays with anything else should fall back to dense array methods
promote_containertype(::Type{Array}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Tuple}) = Array
broadcast_c(f, ::Type{SparseMatrixCSC}, A) = broadcast(f, A)
broadcast_c(f, ::Type{SparseMatrixCSC}, A, Bs...) = broadcast_sp(f, containertype(A), containertype(Bs...), A, Bs...)
broadcast_sp(f, ::Type{Any}, ::Type{SparseMatrixCSC}, a, Bs...) = broadcast_c((y...) -> f(a, y...), SparseMatrixCSC, Bs...)
broadcast_sp(f, ::Type{SparseMatrixCSC}, ::Type{Any}, A, bs...) = broadcast_c(x -> f(x, bs...), SparseMatrixCSC, A)
broadcast_sp(f, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, Bs...) = broadcast_c(f, Array, A, Bs...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible solution would be to have (additionally to what you have above)
EDIT: There was a typo. This works with one SparseMatrixCSC
and up to ten scalars but won't be inferable with two or more SparseMatrixCSC
s.
EDIT 2: Added a method, and it appears to be inferable with up to eight arguments (I'd say that should be enough).
immutable FArgs{F,T}
f::F
args::T
end
(f::FArgs)(x...) = f.f(f.args..., x...)
broadcast_sp(f, ::Type{Any}, ::Type{SparseMatrixCSC}, a, Bs...) = broadcast_c(FArgs(f,(a,)), SparseMatrixCSC, Bs...)
broadcast_sp(f::FArgs, ::Type{Any}, ::Type{SparseMatrixCSC}, a, Bs...) = broadcast_c(FArgs(f.f,(f.args...,a)), SparseMatrixCSC, Bs...)
broadcast_sp(f::FArgs, ::Type{SparseMatrixCSC}, ::Type{Any}, A, bs...) = broadcast_c(x -> f.f(f.args..., x, bs...), SparseMatrixCSC, A)
broadcast_sp(f::FArgs, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, Bs...) = broadcast_c(f.f, Array, f.args..., A, Bs...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hacky snippet below takes a different tack. In principle it should handle any combination of broadcast scalars and SparseMatrixCSC
s; in practice type inference fails when (1) the number of arguments exceeds foursix or (2) more than one SparseMatrixCSC
is involved. (edited:)
import Base: tail, front
import Base.Broadcast: broadcast_c, containertype, promote_containertype, broadcast_indices
broadcast_indices(::Type{SparseMatrixCSC}, A) = indices(A)
containertype{T<:SparseMatrixCSC}(::Type{T}) = SparseMatrixCSC
# Combinations of sparse arrays with broadcast scalars should yield sparse arrays
promote_containertype(::Type{Any}, ::Type{SparseMatrixCSC}) = SparseMatrixCSC
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Any}) = SparseMatrixCSC
# Combinations of sparse arrays with anything else should fall back to dense array methods
promote_containertype(::Type{Array}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Tuple}) = Array
broadcast_c(f, ::Type{SparseMatrixCSC}, As...) = foo((x, y) -> f(y...), (), As...)
foo(f, arrays, nextarg, remargs...) = foo((x, y) -> f(x, (nextarg, y...)), arrays, remargs...)
foo(f, arrays, nextarg::SparseMatrixCSC, remargs...) =
foo((x, y) -> f(front(x), (last(x), y...)), (arrays..., nextarg), remargs...)
foo(f, arrays, nextarg) = bar(x -> f(x, (nextarg,)), arrays...)
foo(f, arrays, nextarg::SparseMatrixCSC) = bar(x -> f(front(x), (last(x),)), arrays..., nextarg)
bar(f, A) = broadcast(x -> f((x,)), A)
bar(f, A, B) = broadcast_c((x,y) -> f((x,y)), Array, A, B)
bar(f, A, B, Cs...) = broadcast_c((x,y,zs...) -> f((x,y,zs...)), Array, A, B, Cs...)
I'm sure a better approach exists, possibly e.g. inverting the call chain. Having a look at @pabloferz
's suggestion now. Best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below are two evolutions of @pabloferz's suggestion. The first variation retains the helper type, which distills to a stack for leading scalar arguments. The second variation nixes the helper type and instead uses a tuple for the stack.
Both variations reduce the mutual recursion between broadcast_c
and broadcast_sp
to recursion simply in broadcast_sp
, and punt on argument lists involving more than one SparseMatrixCSC
.
Manually extending this approach to handle argument lists involving two (or some larger small integer N) SparseMatrixCSC
s seems straightforward, but incurs rapid code growth with N. How to extend this approach to an arbitrary number of (see followup post). But perhaps N = 2 or N = 3 would be good enough.SparseMatrixCSC
s isn't clear to me
I will play with the N = 2 case of this approach next, and then attempt to improve (inference of) the more general approach I posted above.
Code common to both variations
import Base: tail
import Base.Broadcast: broadcast_c, containertype, promote_containertype, broadcast_indices
broadcast_indices(::Type{SparseMatrixCSC}, A) = indices(A)
containertype{T<:SparseMatrixCSC}(::Type{T}) = SparseMatrixCSC
# Combinations of sparse arrays with broadcast scalars should yield sparse arrays
promote_containertype(::Type{Any}, ::Type{SparseMatrixCSC}) = SparseMatrixCSC
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Any}) = SparseMatrixCSC
# Combinations of sparse arrays with anything else should fall back to dense array methods
promote_containertype(::Type{Array}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Tuple}) = Array
Variation retaining helper type
immutable FuncArgs{F,T}; func::F; args::T; end
# Entrypoint to broadcast_sp
broadcast_c(f, ::Type{SparseMatrixCSC}, A, Bs...) =
broadcast_sp(FuncArgs(f, ()), containertype(A), containertype(Bs...), A, Bs...)
# Endpoints for argument lists containing a single SparseMatrixCSC
broadcast_sp(fa, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B) = broadcast(x -> fa.func(fa.args..., a, x), B)
broadcast_sp(fa, ::Type{SparseMatrixCSC}, ::Type{Any}, A, bs...) = broadcast(x -> fa.func(fa.args..., x, bs...), A)
# Endpoint for argument lists containing more than one SparseMatrixCSC (punt)
broadcast_sp(fa, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, Bs...) = broadcast_c(fa.func, Array, fa.args..., A, Bs...)
# Case requiring further reduction
broadcast_sp(fa, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B, Cs...) =
broadcast_sp(FuncArgs(fa.func, (fa.args..., a)), containertype(B), containertype(Cs...), B, Cs...)
Variation nixing helper type
# Entrypoint to broadcast_sp
broadcast_c(f, ::Type{SparseMatrixCSC}, A, Bs...) =
broadcast_sp(f, (), containertype(A), containertype(Bs...), A, Bs...)
# Endpoints for argument lists containing a single SparseMatrixCSC
broadcast_sp(f, as, ::Type{Any}, ::Type{SparseMatrixCSC}, b, C) = broadcast(x -> f(as..., b, x), C)
broadcast_sp(f, as, ::Type{SparseMatrixCSC}, ::Type{Any}, B, cs...) = broadcast(x -> f(as..., x, cs...), B)
# Endpoint for argument lists containing more than one SparseMatrixCSC (punt)
broadcast_sp(f, as, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, B, Cs...) = broadcast_c(f, Array, as..., B, Cs...)
# Case requiring further reduction
broadcast_sp(f, as, ::Type{Any}, ::Type{SparseMatrixCSC}, b, C, Ds...) =
broadcast_sp(f, (as..., b), containertype(C), containertype(Ds...), C, Ds...)
Best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New versions capable (in principle) of handling any number of SparseMatrixCSC
s follow below. None circumvent #19096. TL;DR The first version (A) is the best so far, perhaps good enough to incorporate for now? Thoughts? Any and all pointers much appreciated, particularly towards better understanding of why type inference works/fails where it does.
Code common to all versions:
import Base: front, tail
import Base.Broadcast: broadcast_c, containertype, promote_containertype, broadcast_indices
# Broadcast container type promotion for SparseMatrixCSCs
containertype{T<:SparseMatrixCSC}(::Type{T}) = SparseMatrixCSC
broadcast_indices(::Type{SparseMatrixCSC}, A) = indices(A)
# Combinations of sparse arrays with broadcast scalars should yield sparse arrays
promote_containertype(::Type{Any}, ::Type{SparseMatrixCSC}) = SparseMatrixCSC
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Any}) = SparseMatrixCSC
# Combinations of sparse arrays with anything else should fall back to dense array
promote_containertype(::Type{Array}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{SparseMatrixCSC}) = Array
promote_containertype(::Type{SparseMatrixCSC}, ::Type{Tuple}) = Array
(A) For the following version, type inference remains happy when the argument list (1) contains a single sparse matrix and up to six broadcast scalars, (2) contains two sparse matrices and up to twelve broadcast scalars (more depending on position, strangely), or (3) up to six sparse matrices as long as no broadcast scalars appear. This behavior is a mystery to me, insight much appreciated. This version seems the best in practice at the moment.
# closeoverscalars steps through the argument list. For each scalar argument,
# closeoverscalars generates a closure incorporating that scalar into the present version
# of the broadcast function, removes that scalar from the argument list, and steps to the
# next argument. For each matrix argument, closeoverscalars generates a closure which
# will later funnel that matrix argument to the right slot in the original broadcast
# function, collects that matrix in a tuple of matrices over which to eventually broadcast,
# removes that matrix from the argument list, and steps to the next argument. After
# traversing the argument list, closeoverscalars returns a closure incorporating
# all broadcast scalars in the argument list followed by the argument list's sparse matrices.
#
# Entrypoint to closeoverscalars
broadcast_c(f, ::Type{SparseMatrixCSC}, args...) = broadcast(closeoverscalars(f, args)...)
closeoverscalars(f, args) = closeoverscalars((x, y) -> f(y...), (), args...)
# Recursion cases for closeoverscalars
closescalar(f, s) = (x, y) -> f(x, (s, y...))
passmatrix(f) = (x, y) -> f(front(x), (last(x), y...))
closeoverscalars(f, mats, nextarg, remargs...) = closeoverscalars(closescalar(f, nextarg), mats, remargs...)
closeoverscalars(f, mats, nextarg::SparseMatrixCSC, remargs...) = closeoverscalars(passmatrix(f), (mats..., nextarg), remargs...)
# Base cases for closescalars
closelastscalar(f, s) = (x...) -> f(x, (s,))
passlastmatrix(f) = (x...) -> f(front(x), (last(x),))
closeoverscalars(f, mats, lastarg) = (closelastscalar(f, lastarg), mats...)
closeoverscalars(f, mats, lastarg::SparseMatrixCSC) = (passlastmatrix(f), mats..., lastarg)
Edit: Partial evaluation not correct. (B) For the following version (a descendant of @pabloferz's suggestion), type inference remains happy when the argument list (1) contains up to nine members including either one or two sparse matrices, or (2) contains three sparse matrices and no broadcast scalars. This behavior is also a mystery to me, insight much appreciated.
# broadcast_sp steps through the argument list, collecting each sequence of broadcast scalars
# between one sparse matrix (or start or end of the argument list) and the next in a tuple,
# storing the sequence of such tuples in stups ("scalar-tuples"), and storing the sparse
# matrices in mats. After traversing the argument list, broadcast_sp calls closeoverscalars
# which, given stups, mats, and the broadcast function, generates a closure incorporating
# all broadcast scalars in the argument list. broadcast_sp then passes that closure along
# with the argument list's sparse matrices on to broadcast.
#
# Entrypoint to broadcast_sp
broadcast_c(fn, ::Type{SparseMatrixCSC}, A, Bs...) =
broadcast_sp(fn, ((),), (), containertype(A), containertype(Bs...), A, Bs...)
# Recursion cases for broadcast_sp
broadcast_sp(fn, stups, mats, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B, Cs...) =
broadcast_sp(fn, (front(stups)..., (last(stups)..., a)), mats, containertype(B), containertype(Cs...), B, Cs...)
broadcast_sp(fn, stups, mats, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, B, Cs...) =
broadcast_sp(fn, (stups..., ()), (mats..., A), containertype(B), containertype(Cs...), B, Cs...)
# Base cases for broadcast_sp
broadcast_sp(fn, stups, mats, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, B) =
broadcast(closeoverscalars((x...) -> fn(front(front(x))..., last(stups)..., last(front(x)), last(x)), front(stups)), mats..., A, B)
broadcast_sp(fn, stups, mats, ::Type{SparseMatrixCSC}, ::Type{Any}, A, bs...) =
broadcast(closeoverscalars((x...) -> fn(front(x)..., last(stups)..., last(x), bs...), front(stups)), mats..., A)
broadcast_sp(fn, stups, mats, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B) =
broadcast(closeoverscalars((x...) -> fn(front(x)..., last(stups)..., a, last(x)), front(stups)), mats..., B)
# Recursively close over scalars
closeoverscalars(fn, stups) = closeoverscalars((x...) -> fn(front(x)..., last(stups)..., last(x)), front(stups))
closeoverscalars(fn, ::Tuple{}) = fn
Edit: Partial evaluation not correct. (C) The following version is a simpler realization of the preceding concept, but type inference does not seem to like it (nor a similar version using varags for groupargs
).
# Entrypoint
broadcast_c(f, ::Type{SparseMatrixCSC}, args...) =
(ga = groupargs(args); broadcast(closeoverscalars(f, ga[1]), ga[2]...))
# Entrypoint for groupargs
groupargs(args) = groupargs(((),), (), first(args), tail(args))
# Recursion cases for groupargs
groupargs(stups, mats, nextarg, remargs) =
groupargs((front(stups)..., (last(stups)..., nextarg)), mats, first(remargs), tail(remargs))
groupargs(stups, mats, nextarg::SparseMatrixCSC, remargs) =
groupargs((stups..., ()), (mats..., nextarg), first(remargs), tail(remargs))
# Base cases for groupargs
groupargs(stups, mats, lastarg, ::Tuple{}) = ((front(stups)..., (last(stups)..., lastarg)), mats)
groupargs(stups, mats, lastarg::SparseMatrixCSC, ::Tuple{}) = ((stups..., ()), (mats..., lastarg))
# Entrypoint and recursion and base cases for closeoverscalars
closeoverscalars(f, stups) = _closeoverscalars((x...) -> f(x..., last(stups)...), front(stups))
_closeoverscalars(f, stups) = _closeoverscalars((x...) -> f(front(x)..., last(stups)..., last(x)), front(stups))
_closeoverscalars(f, ::Tuple{}) = f
Best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sacha0 Inspired on your approach above that evolved from my suggestion I was able to write it in such a way that it is inferable up to eight arguments irrespectively of the arguments types. The maximum number of arguments, so that it keeps inferable, I believe, is a limitation of the inference mechanism, but I haven't looked this into detail.
The approach would be your 'common code' above plus
# Entrypoint to broadcast_sp
broadcast_c(f, ::Type{SparseMatrixCSC}, A, Bs...) =
broadcast_sp(f, (), nothing, (), containertype(A), containertype(Bs...), A, Bs...)
# Recursion cases for broadcast_sp
broadcast_sp(f, ftup, ::Void, ttup, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B, Cs...) =
broadcast_sp(f, (ftup..., a), nothing, ttup, containertype(B), containertype(Cs...), B, Cs...)
broadcast_sp(f, ftup, ::Void, ttup, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, B, Cs...) =
broadcast_sp(f, ftup, A, ttup, containertype(B), containertype(Cs...), B, Cs...)
broadcast_sp(f, ftup, A, ttup, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B, Cs...) =
broadcast_sp(f, ftup, A, (ttup..., a), containertype(B), containertype(Cs...), B, Cs...)
# Base cases for broadcast_sp
broadcast_sp(f, ftup, ::Void, ttup, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B) =
broadcast(x -> f(ftup..., a, x), B)
broadcast_sp(f, ftup, ::Void, ttup, ::Type{SparseMatrixCSC}, ::Type{Any}, A, bs...) =
broadcast(x -> f(ftup..., x, bs...), A)
broadcast_sp(f, ftup, ::Void, ttup, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, A, B) =
broadcast((x, y) -> f(ftup..., x, y), A, B)
broadcast_sp(f, ftup, A, ttup, ::Type{Any}, ::Type{SparseMatrixCSC}, a, B) =
broadcast((x, y) -> f(ftup..., x, ttup..., a, y), A, B)
broadcast_sp(f, ftup, A, ttup, ::Type{SparseMatrixCSC}, ::Type{Any}, B, cs...) =
broadcast((x, y) -> f(ftup..., x, ttup..., y, cs...), A, B)
broadcast_sp(f, ftup, A, ttup, ::Type{SparseMatrixCSC}, ::Type{SparseMatrixCSC}, B, Cs...) =
broadcast_c(f, Array, ftup..., A, ttup..., B, Cs...)
NOTE: When using the dot syntax, given that the parser can already reduce the effective number of arguments passed to broadcast the number of arguments can be even more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion continued on discourse. Best!
# behavior presently above (this would be consistent with existing binary sparse broadcast | ||
# behavior); or (2) the generic unary broadcast behavior should be nz2nz_z2z as above, and the | ||
# nz2z_z2z specializations below should ultimately go away in favor of `broadcast`+`dropzeros!` | ||
# (this would not be consistent with existing binary sparse broadcast behavior). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO concerns the primary decision described in the original post.
# behavior presently above (this would be consistent with existing binary sparse broadcast | ||
# behavior); or (2) the generic unary broadcast behavior should be nz2nz_z2z as above, and the | ||
# nz2z_z2z specializations below should ultimately go away in favor of `broadcast`+`dropzeros!` | ||
# (this would not be consistent with existing binary sparse broadcast behavior). | ||
|
||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the nz2z/nz2nz decision is made, most of the code below this line till ## Broadcasting kernels specialized for returning a SparseMatrixCSC
should go away.
My initial preference would be |
On the other hand, implementation would be simpler and faster if we used |
Cheers, my thinking as well. I appreciate your devil's-advocate arguments though; thoughts below.
The benchmarks in #18309 indicate that Another small argument for Best! |
(Absent additional input in the next couple of days I will revise this PR with |
function broadcast{Tf}(f::Tf, A::SparseMatrixCSC) | ||
fofzero = f(zero(eltype(A))) | ||
fpreszero = fofzero == zero(fofzero) | ||
return fpreszero ? _broadcast_zeropres(f, A) : _broadcast_notzeropres(f, A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you pass fofzero
to _broadcast_notzeropres
so that it doesn't need to be recomputed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally I passed fofzero
to _broadcast_notzeropres
for that reason, but later nixed the pass for simplicity. Happy to restore the pass if that strikes you as better though? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't make much difference, I guess, so whatever you prefer.
Switched to |
1bff9ff
to
f5bc714
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Big improvement in the sparse-broadcast benchmark, unsurprisingly. Odd regression in the array-comprehension collect benchmark, probably unrelated noise since that doesn't involve sparse matrices at all. |
Seeing the same |
Pushed two small modifications: (1) |
The Travis problem should now be fixed, if you want to rebase. |
…scalars) more generic.
Rebased. I've made some headway re. #19239 (review), but improvements on that front could be an independent PR (however you prefer). Thanks and best! |
@stevengj, additional thoughts, or is this in shape to merge? (Separating further work along #19239 (review) line seems the most effective path forward?) Thanks and best! |
Thanks for reviewing / merging! |
…scalars) more generic. (JuliaLang#19239)
In #18590, one proposal for a more generic
broadcast
over sparse matrices received broad support: At runtime, (1) determine whether thebroadcast
operation returns zero when all (relevant) arguments are zero (by samplingf(0...)
), (2) branch to code specialized for the case at hand, and (3) in either case return a sparse matrix for type stability.This pull request begins implementing that proposal. Specifically, this pull request starts with
broadcast
over a single sparse matrix (and also handles a few common combinations of a single sparse matrix withbroadcast
scalars).A few code (and corresponding PR) comments highlight some decisions. The primary decision concerns #18309: When the
broadcast
operation maps a (stored) nonzero entry to zero, should the result store that zero (nz2nz behavior) or not (nz2z behavior)? I now lean towards nz2z behavior for the following reason:Consider
C = A .* B
(and similar operations) whereA
andB
are sparse. Chances are you wantC
to the store the intersection (rather than union) of the structures ofA
andB
(the present behavior of.*
and friends). Returning the structural intersection rather than union in such cases requires the nz2z behavior whenbroadcast
ing over two or more sparse matrices (assuming you determine thebroadcast
operation's zero-preserving behavior solely via sampling, see #18590). Havingbroadcast
over a single sparse matrix behave differently seems like a startling inconsistency. That inconsistency seems more important than the performance and implementation complexity differences mentioned in #18309. Regarding the zero-purging consistency point in #18309, with more thought this part ofbroadcast
's behavior isn't really zero-purging in the sense discussed elsewhere (not so sure aboutbroadcast!
though).Thoughts? Best!