Reduce allocations in broadcast #19639

Merged
merged 3 commits into from
Dec 20, 2016

Conversation

pabloferz
Contributor

With this PR:

julia> function foo(x, n)
           for i = 1:n
               broadcast!(x -> 2x+1, x, x)
           end
           return x
       end
foo (generic function with 1 method)

julia> @time foo([0,0,0], 10^4); # warmup
  0.027883 seconds (25.78 k allocations: 1.108 MB)

julia> @time foo([0,0,0], 10^4);
  0.000121 seconds (6 allocations: 288 bytes)

julia> using BenchmarkTools

julia> @benchmark [1,2,3] .+ 1
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  2
  --------------
  minimum time:     63.186 ns (0.00% GC)
  median time:      67.704 ns (0.00% GC)
  mean time:        74.881 ns (7.55% GC)
  maximum time:     841.374 ns (89.36% GC)
  --------------
  samples:          10000
  evals/sample:     982
  time tolerance:   5.00%
  memory tolerance: 1.00%

julia> @benchmark broadcast(+, [1,2,3], 1)
BenchmarkTools.Trial: 
  memory estimate:  224.00 bytes
  allocs estimate:  2
  --------------
  minimum time:     65.979 ns (0.00% GC)
  median time:      71.068 ns (0.00% GC)
  mean time:        78.431 ns (7.44% GC)
  maximum time:     1.139 μs (92.01% GC)
  --------------
  samples:          10000
  evals/sample:     982
  time tolerance:   5.00%
  memory tolerance: 1.00%

Compare this with #19608 (comment) and #16285 (comment)
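
For readers who want to check the allocation behaviour themselves, here is a minimal sketch that measures only the second, already-compiled call with `@allocated`. It assumes a Julia 1.x session; the helper name `scale_shift!` is hypothetical and not part of this PR, and the exact byte count depends on the Julia version.

```julia
# Hypothetical helper (not from this PR): apply the in-place broadcast n times.
function scale_shift!(x, n)
    for _ in 1:n
        broadcast!(v -> 2v + 1, x, x)   # writes the result back into x
    end
    return x
end

x = zeros(Int, 3)
scale_shift!(x, 1)                          # warm-up call so compilation is not measured
bytes = @allocated scale_shift!(x, 10^4)    # bytes allocated by this call only
println("bytes allocated: ", bytes)
```

Comparing `bytes` before and after a change like this one is a quick sanity check; `@benchmark` from BenchmarkTools, as used above, gives more reliable timings.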

@kshyatt requested a review from Sacha0 on December 18, 2016
@kshyatt added the domain:broadcast (Applying a function over a collection) label on Dec 18, 2016
@martinholters
Member

Are the changes to the sparse matrix code related to the addressed problem?

@KristofferC
Sponsor Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@@ -1403,13 +1403,19 @@ sparse(S::UniformScaling, m::Integer, n::Integer=m) = speye_scaled(S.λ, m, n)
# map/map! entry points
function map!{Tf,N}(f::Tf, C::SparseMatrixCSC, A::SparseMatrixCSC, Bs::Vararg{SparseMatrixCSC,N})
    _checksameshape(C, A, Bs...)
    return map_nocheck!(f, C, A, Bs...)
Member

Maybe this could be tied to the bounds checking mechanism? Or would it be an abuse?
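
As a rough illustration of what tying a shape check to the bounds-checking mechanism could look like, here is a sketch using `@boundscheck` and `@inbounds`. It assumes Julia 1.x; `addinto!` and `add_checked_once!` are hypothetical names, and this is not the sparse code discussed here nor necessarily what the reviewer had in mind.

```julia
# Hypothetical example: the shape check lives in a @boundscheck block, so an
# inlined call site wrapped in @inbounds may skip it.
@inline function addinto!(dest::Vector, a::Vector, b::Vector)
    @boundscheck begin
        length(dest) == length(a) == length(b) ||
            throw(DimensionMismatch("arguments must have the same length"))
    end
    for i in eachindex(dest)
        @inbounds dest[i] = a[i] + b[i]
    end
    return dest
end

# A caller that has already validated the shapes opts out of the second check.
function add_checked_once!(dest, a, b)
    length(dest) == length(a) == length(b) ||
        throw(DimensionMismatch("arguments must have the same length"))
    @inbounds addinto!(dest, a, b)
    return dest
end

println(add_checked_once!(zeros(Int, 3), [1, 2, 3], [10, 20, 30]))
```

Whether a shape check counts as a "bounds check" is exactly the design question raised above; the sketch only shows that the mechanism would allow it.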

Member

@Sacha0 Sacha0 left a comment

This looks great!

Perhaps having @timholy sign off on the inlining changes would be prudent?

The broadcast-fusion and linalg-arithmetic benchmark improvements are lovely. The scalar-floatexp-ldexp, sparse-arithmetic-unary minus, and string-join regressions should be noise. Might the linalg-factorization and array regressions be real?

I agree with @martinholters, the sparse matrix changes are orthogonal to the other changes in this pull request. I would prefer those changes appear in a separate pull request. (I might advocate holding off with that pull request for now, having left that TODO outstanding for two reasons: I wasn't certain whether avoiding the redundant shape check is worth the extra code complexity, and I plan to restructure that code somewhat in the near future in any case.)

Thanks again @pabloferz!

@pabloferz
Contributor Author

pabloferz commented Dec 20, 2016

I removed the sparse-related changes. The initial changes somehow seemed to affect some of the svd and eigvecs methods for Diagonal and Bidiagonal, so I took the chance to improve those as well. It should be better now.

The reason there was a @noinline on the _broadcast! methods is no longer a concern, so I don't think there's any risk in changing them.

function broadcast_t(f, ::Type{Any}, T::Type, shape, iter, As...)
    if isempty(iter)
        return similar(Array{T}, shape)
    end
Member

Why move the code handling the empty case inside this method and add a second type argument?

Contributor Author

Oops. I was playing around with reorganizing the code and left this in, but it shouldn't be necessary. I'll put it back as it was.
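
For context on the empty-case branch in the snippet under review: `similar(Array{T}, shape)` builds an uninitialized array of element type `T` with the requested shape, which works even when a dimension is zero. A minimal sketch, assuming Julia 1.x and a made-up element type and shape:

```julia
# Assumed values for illustration only; broadcast_t receives T and shape from its caller.
shape = (0, 3)
A = similar(Array{Float64}, shape)   # uninitialized 0×3 Matrix{Float64}

println(size(A))      # dimensions of the result
println(eltype(A))    # element type of the result
println(isempty(A))   # whether the result has no elements
```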

@Sacha0
Member

Sacha0 commented Dec 20, 2016

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @jrevels

@stevengj stevengj merged commit 99b6a8c into JuliaLang:master Dec 20, 2016
@stevengj
Member

stevengj commented Dec 20, 2016

Combined with dot ops, we now have:

julia> function bar(x, n)
           for i = 1:n
               x .= 2 .* x .+ 1
           end
           return x
       end
bar (generic function with 1 method)

julia> @time bar([0,0,0], 10^4); # warmup
  0.020100 seconds (17.47 k allocations: 713.317 KB)

julia> @time bar([0,0,0], 10^4);
  0.000226 seconds (6 allocations: 288 bytes)
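
To make the connection with the `foo` example at the top explicit, here is a small sketch (assuming Julia 1.x; the variable names are illustrative) showing that the fused dot expression and the explicit `broadcast!` call compute the same in-place update:

```julia
x = [0, 0, 0]
y = [0, 0, 0]

x .= 2 .* x .+ 1                 # fused dot syntax, writes into x
broadcast!(v -> 2v + 1, y, y)    # explicit in-place broadcast, writes into y

println(x == y)   # prints "true"
println(x)        # prints "[1, 1, 1]"
```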

8 participants