Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalize broadcast to handle tuples and scalars #16986

Merged
merged 4 commits into from
Sep 18, 2016

Conversation

pabloferz
Copy link
Contributor

@pabloferz pabloferz commented Jun 17, 2016

This addresses #16966 and should also help a little with #4883 and #16285.

EDITED: Here is a little demo tested locally (also, needs #17389 to work properly)

julia> parse.(Int, ["1", "2"])
2-element Array{Int64,1}:
 1
 2

julia> convert.(Float32, [1,2])
2-element Array{Float32,1}:
 1.0
 2.0

julia> trunc.((Int,), [1.2, 3.4])
2-element Array{Int64,1}:
 1
 3

julia> ceil.(UInt8, [1.2 3.4; 5.6 6.7])
2×2 Array{UInt8,2}:
 0x02  0x04
 0x06  0x07

julia> abs.((1, -2))
(1,2)

julia> broadcast(+, 1.0, (0, -2.0))
(1.0,-1.0)

julia> broadcast(+, 1.0, (0, -2.0), [1])
2-element Array{Float64,1}:
 2.0
 0.0

julia> broadcast(*, ["Look", "I'm", "...with"], " ", ["ma'", "broadcasting", "scalars"], "!")
3-element Array{String,1}:
 "Look ma'!"        
 "I'm broadcasting!"
"...with scalars!"

I just have a question though, can the following be written in a form that would make it pure? Or is it pure? (I have not completely clear the concept of pureness).

promote_eltype_op{T}(op, Ts::AbstractArray{DataType}, ::AbstractArray{T}) = typejoin([promote_op(op, S, T) for S in Ts]...)

TODO:

  • Tests pass
  • Add tests
  • Check performance
  • Docs

@StefanKarpinski
Copy link
Sponsor Member

Were you able to rebuild with this change? The builds all look like they failed to bootstrap.

@pabloferz
Copy link
Contributor Author

pabloferz commented Jun 17, 2016

I tested the changes on a julia script, I didn't try building with changes (so this should work after bootstrap). I'll try to find where the problem is.

The problem was a silly typo. I believe it should work now.

@JeffBezanson
Copy link
Sponsor Member

can the following be written in a form that would make it pure

No. The compiler cannot currently prove that a whole array is constant and won't be mutated.

Adding just Type seems a bit ad-hoc. What behavior do we really want? I would think maybe any non-AbstractArray should be treated as scalar?

@pabloferz
Copy link
Contributor Author

I would think maybe any non-AbstractArray should be treated as scalar

That seems a better idea. I'll see if I can make that work.

@KristofferC
Copy link
Sponsor Member

Tuples too?

@TotalVerb
Copy link
Contributor

This proposal is going to end up a mess of special cases. This should be done with something like setindex(Array{typeof(scalar),0}(), scalar) that is faster and has nicer syntax.

@@ -199,6 +199,17 @@ promote_rule(::Type{Float64}, ::Type{Float32}) = Float64
widen(::Type{Float16}) = Float32
widen(::Type{Float32}) = Float64

promote_op{T<:Union{Float32,Float64}}(::typeof(trunc), ::Type{Signed}, ::Type{T}) = Int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to do this for numbers. A generic definition like this seems to be enough:

promote_op{R<:Number,S<:Number}(op, ::Type{R}, ::Type{S}) = typeof(op(one(R), one(S)))

I'm currently working on it for PR #16996.

Copy link
Contributor Author

@pabloferz pabloferz Jun 18, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't work when one of the arguments is a type, which is the case here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I completely misread that line due to what I was working on at the same time, carry on.

@JeffBezanson
Copy link
Sponsor Member

I'm worried about the promote_op definitions that ignore op, e.g.

promote_op{R,S}(::Any, ::Type{R}, ::Type{S}) = (@_pure_meta; promote_type(R, S))

Calling something like op(one(T)) is not ideal, but at least it reflects the function's real behavior. I would like to use the same algorithm as map for determining the result type. For minimum disruption, it would be good to use promote_op in the cases where it's defined, and otherwise fall back to the map algorithm. A good way to do this would be to have promote_op default to returning Union{}, and then we can dispatch on Type{Union{}} to select the map algorithm. Is that workable, or is there too much code that depends on the current promote_op fallback? If we can't do it that way, we should just use the map algorithm in every case.

@nalimilan
Copy link
Member

I think most of the code uses promote_op on numbers, where using one(T) is indeed the best approach IMHO (cf. #16995). The fallback can be incorrect, so I'm not even sure it's used anywhere. Anyway I really agree we need something better, and it should be consistent with actual return types.

@pabloferz
Copy link
Contributor Author

As Milan said, most of the code that uses promote_op is on methods that work with numbers. I'll explore how hard would it be to use the current mechanism and fall back to the map one.

A couple of questions more. Should broadcast always return an Array? Should it work with Tuples and treat them as Array{T,1}? In particular, should broadcast(+, 1, 1) be 2 or a zero dimensional Array holding 2?

@nalimilan
Copy link
Member

A couple of questions more. Should broadcast always return an Array? Should it work with Tuples and treat them as Array{T,1}? In particular, should broadcast(+, 1, 1) be 2 or a zero dimensional Array holding 2?

I'd say it should definitely work on tuples, and it should return a tuple like map already does. broadcast(+, 1, 1) could return a scalar, so that it's a strict generalization of map (people have talked a few times about replacing map with broadcast).

@pabloferz pabloferz changed the title Treat types as scalars in broadcast Treat non-AbstractArrays as scalars in broadcast Jun 20, 2016
@pabloferz pabloferz mentioned this pull request Jun 20, 2016
@pabloferz pabloferz force-pushed the pz/broadcast branch 3 times, most recently from 31871b3 to 17da30d Compare June 20, 2016 23:11
@pabloferz pabloferz changed the title Treat non-AbstractArrays as scalars in broadcast Treat non-indexable types as scalars in broadcast Jun 20, 2016
@pabloferz pabloferz force-pushed the pz/broadcast branch 5 times, most recently from 59be7fd to a4d9436 Compare June 21, 2016 13:28
@pabloferz
Copy link
Contributor Author

Ok, I think it should be better now. Can we give it another spin?

@stevengj
Copy link
Member

Does a new version of BaseBenchmarks need to be tagged?

@yuyichao
Copy link
Contributor

Does a new version of BaseBenchmarks need to be tagged?

If you are talking about the ones affected by vectorization depwarn then no, it's already updated. There doesn't seem to be any new commit since then either.

@yuyichao
Copy link
Contributor

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@stevengj
Copy link
Member

I don't see how this PR could possibly have affected sorting or sparse-transpose performance. @jrevels, are those tests typically "noisy"?

@Sacha0
Copy link
Member

Sacha0 commented Sep 14, 2016

Those regressions appear within the noise margins I've recently seen elsewhere (for example #17900 (comment), which I could not reproduce locally and did not persist across nanosoldier runs).

@stevengj
Copy link
Member

In which case this should be good to merge?

@Sacha0
Copy link
Member

Sacha0 commented Sep 18, 2016

I locally built and benchmarked this branch against master, and could not reproduce any of the potential regressions the most recent nanosoldier run identified. So if nanosoldier's most recent run was the last remaining concern, this PR should be in good shape. Best!

@stevengj stevengj merged commit 2193638 into JuliaLang:master Sep 18, 2016
@pabloferz pabloferz deleted the pz/broadcast branch September 18, 2016 21:05
@Sacha0 Sacha0 mentioned this pull request Sep 19, 2016

@generated function broadcast_tup{AT,nargs}(f, As::AT, ::Type{Val{nargs}}, n)
quote
ntuple(n -> (@ncall $nargs f i->_broadcast_getindex(As[i], n)), Val{n})
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this generated function definition is illegal (they may not use closures / inner functions, as that generates a new type and corrupts inference)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be reverted then? is that restriction fully documented or enforced?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is a very strong restriction which includes @threads (possibly other macros) and comprehensions.

Copy link
Contributor Author

@pabloferz pabloferz Sep 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would changing this to (((@ncall $nargs f i->_broadcast_getindex(As[i], _)) for _ = 1:n)...) be enough to fix this?

EDIT: No, some of the changes from #18413 are making all variations of fixes I tried, fail.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, I believe we should fix the comprehension issue by introducing sealed anonymous functions (so that they get uniqued based upon their code contents rather than their definition position / gensym-name). But yes, that's currently invalid.

@mauro3
Copy link
Contributor

mauro3 commented Oct 25, 2016

Can this be backported to 0.5? I was getting all excited about the .-syntax, but this kills a fair few of the use cases I have. The workaround (to define size and getindex for all argument types of methods which should work with .-syntax) is pretty ugly.

@tkelman
Copy link
Contributor

tkelman commented Oct 25, 2016

This would be too big of a behavior change to backport.

@mauro3
Copy link
Contributor

mauro3 commented Oct 25, 2016

Bummer. But in that case this should either throw a useful error message and/or be added to the 0.5 docs. Currently this is:

julia> type A; a end

julia> f(x,a) = a.a+x
f (generic function with 1 method)

julia> f.([1,2], A(5))
ERROR: MethodError: no method matching size(::A)
Closest candidates are:
  size{N}(::Any, ::Integer, ::Integer, ::Integer...) at abstractarray.jl:48
  size(::BitArray{1}) at bitarray.jl:39
  size(::BitArray{1}, ::Any) at bitarray.jl:43
  ...
 in broadcast_shape(::Array{Int64,1}, ::A) at ./broadcast.jl:31
 in broadcast_t(::Function, ::Type{Any}, ::Array{Int64,1}, ::Vararg{Any,N}) at ./broadcast.jl:213
 in broadcast(::Function, ::Array{Int64,1}, ::A) at ./broadcast.jl:230

and after defining size the same again for getindex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:broadcast Applying a function over a collection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet