
Preserve block structure in broadcasting #61

Merged: 8 commits merged into master from dl/broadcast on Nov 29, 2018

Conversation

dlfivefifty (Member)

This preserves the block structure under broadcasting.

One missing case is BlockMatrix .+ BlockVector.

This is step 1 in resolving #31
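
To illustrate the intended behaviour, here is a minimal sketch (not a test taken from the PR itself; it uses the blocksizes helper exercised in the tests below):

using BlockArrays, Test

x = BlockArray(rand(6), 1:3)           # block sizes 1, 2, 3
y = 2 .* x .+ 1                        # with this PR, broadcasting returns a BlockArray again
@test y isa BlockArray
@test blocksizes(y) == blocksizes(x)   # block structure is preserved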

codecov-io commented Nov 26, 2018

Codecov Report

Merging #61 into master will increase coverage by 0.94%.
The diff coverage is 83.33%.


@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
+ Coverage   62.74%   63.69%   +0.94%     
==========================================
  Files          10       11       +1     
  Lines         459      482      +23     
==========================================
+ Hits          288      307      +19     
- Misses        171      175       +4
Impacted Files Coverage Δ
src/blockarrayinterface.jl 42.85% <100%> (+22.85%) ⬆️
src/abstractblockarray.jl 76.27% <100%> (+0.4%) ⬆️
src/blocksizes.jl 66.66% <75%> (ø) ⬆️
src/blockbroadcast.jl 82.35% <82.35%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d92f0b4...7340610.

union.(([1,2,3],[4,5,6]), ([1,2,3],[4,5,6]))

Review comment (Member):

Is it a remnant of experiments or something?

tkf (Member) commented Nov 27, 2018

Are you planning to optimize the computation at some point? More precisely, what I have in mind is something like the following:

using BlockArrays
using BlockArrays: cumulsizes
using FillArrays

# Build a BlockVector whose blocks alias the given arrays.
function blocks_to_vector(blocks)
    R = eltype(blocks)
    T = eltype(R)
    @assert R <: AbstractVector
    ba = BlockArray{T, 1, R}(undef_blocks, size.(blocks, 1))
    ba.blocks .= blocks
    return ba
end

n = 100
x = blocks_to_vector([Fill(111.0, 4n), Fill(222.0, n)])
y = blocks_to_vector([Fill(333.0, 4n), Fill(444.0, n)])
z = randn(size(x, 1))

function naive(x, y, z)
    @. z = x * y * z
    return z
end

# Broadcast block-by-block: combine each block of x and y with the matching
# view of z, avoiding BlockArray's per-element block lookup.
function blocked(x, y, z)
    for (xb, yb, b, e) in zip(x.blocks, y.blocks,
                              cumulsizes(x, 1)[1:end-1],
                              cumulsizes(x, 1)[2:end])
        zb = @view z[b:e-1]
        @. zb = xb * yb * zb
    end
    return z
end

@assert naive(x, y, copy(z)) == blocked(x, y, copy(z))

using BenchmarkTools
using Statistics
(b1 = @benchmark naive($x, $y, z) setup=(z=randn(size(x, 1)))) |> display
(b2 = @benchmark blocked($x, $y, z) setup=(z=randn(size(x, 1)))) |> display
judge(median(b2), median(b1))

I want naive to be as fast as blocked.

Just FYI, the output of the script above is:

BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     2.638 μs (0.00% GC)
  median time:      2.798 μs (0.00% GC)
  mean time:        2.950 μs (0.00% GC)
  maximum time:     8.577 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     9
BenchmarkTools.Trial:
  memory estimate:  288 bytes
  allocs estimate:  4
  --------------
  minimum time:     276.241 ns (0.00% GC)
  median time:      287.697 ns (0.00% GC)
  mean time:        334.031 ns (10.72% GC)
  maximum time:     149.490 μs (99.72% GC)
  --------------
  samples:          10000
  evals/sample:     290
  4.404270 seconds (14.81 M allocations: 1.053 GiB, 25.97% gc time)
BenchmarkTools.TrialJudgement:
  time:   -89.72% => improvement (5.00% tolerance)
  memory: +Inf% => regression (1.00% tolerance)

dlfivefifty (Member, Author)

Yes, optimising is next.

dlfivefifty (Member, Author)

@tkf In the dl/fastbroadcast branch I've improved the speed (at least for block vectors/matrices). Unfortunately it's not quite working when the arguments have different block sizes.

dlfivefifty (Member, Author)

Let me know if you have any more comments; otherwise I'll merge.


A = randn(5)
@test blocksizes(A) == BlockArrays.BlockSizes([5])
A[Block(1)] == A

Review comment (Member):

Missing @test?
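(Presumably the last line was meant to read @test A[Block(1)] == A.)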

PseudoBlockStyle{M}(::Val{N}) where {N,M} = PseudoBlockStyle{N}()
BroadcastStyle(::Type{<:BlockArray{<:Any,N}}) where N = BlockStyle{N}()
BroadcastStyle(::Type{<:PseudoBlockArray{<:Any,N}}) where N = PseudoBlockStyle{N}()
BroadcastStyle(::DefaultArrayStyle{N}, b::AbstractBlockStyle{M}) where {M,N} = typeof(b)(_max(Val(M),Val(N)))

Review comment (Member):

Are you planning to "vendor" _max at some point? Otherwise it doesn't run on Julia master.
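
For reference, a vendored stand-in could be as small as the following (a sketch only, assuming _max is just meant to pick the larger of two Val-wrapped dimension counts; this is not necessarily the fix actually used):

# Hypothetical vendored _max: return the larger of two Val-wrapped dimension counts.
_max(::Val{M}, ::Val{N}) where {M,N} = Val(max(M, N))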

tkf (Member) commented Nov 29, 2018

I still haven't learned the BlockArrays internals, so please don't wait for me to do any non-nit-picking review :)

Unfortunately it's not quite working when the arguments have different block sizes.

Does "different block sizes" include a case like BlockArray(randn(6), 1:3) .+ ones(6) (which I tried, but it didn't work ATM)? Looking at the code, it looks like it's designed to handle this case as well via view.

dlfivefifty (Member, Author)

In this PR that works:

julia> BlockArray(randn(6), 1:3) .+ ones(6) 
6-element BlockArray{Float64,1,Array{Float64,1}}:
  1.630468442532147 
 ───────────────────
  2.1389068389040777
  1.0502081079839254
 ───────────────────
  1.4265672357406085
  0.3687548937045446
 -0.1642694693789739

In the other branch, dl/fastbroadcast, the issue is that we want to use block-wise access for arrays whose block sizes are compatible with the destination, and regular indexing for the other arrays. I think BlockSlice might make that easy, but I'll have to dig deeper.
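
For what it's worth, that block-compatible path could be sketched roughly like this (a hypothetical helper built on cumulsizes and the .blocks field used in the benchmark above; it is not the dl/fastbroadcast code):

using BlockArrays
using BlockArrays: cumulsizes

# Hypothetical sketch: dest and src are assumed to share block sizes and are
# walked block-by-block, while `other` (any AbstractVector) is only sliced
# with plain views of the matching global index range.
function blockwise_add!(dest::BlockVector, src::BlockVector, other::AbstractVector)
    cs = cumulsizes(dest, 1)
    for i in 1:length(dest.blocks)
        r = cs[i]:cs[i+1] - 1                            # global indices of block i
        dest.blocks[i] .= src.blocks[i] .+ view(other, r)
    end
    return dest
end

Arguments whose block structure doesn't match the destination would go through the view-based fallback, which is presumably where BlockSlice comes in.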

dlfivefifty (Member, Author)

The failure on nightly is now due to an unrelated change in lmul!, which can be handled separately.

@dlfivefifty dlfivefifty merged commit 9024edf into master Nov 29, 2018
@dlfivefifty dlfivefifty deleted the dl/broadcast branch November 29, 2018 10:59