Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use ArraysOfArrays to reduce allocation #65

Merged
merged 10 commits into from
Jul 27, 2021
Merged

use ArraysOfArrays to reduce allocation #65

merged 10 commits into from
Jul 27, 2021

Conversation

Moelf
Copy link
Member

@Moelf Moelf commented Jul 25, 2021

close #64

@codecov
Copy link

codecov bot commented Jul 25, 2021

Codecov Report

Merging #65 (3a2ba37) into master (2ca8c49) will decrease coverage by 0.87%.
The diff coverage is 75.00%.

❗ Current head 3a2ba37 differs from pull request most recent head cfafdcc. Consider uploading reports for the commit cfafdcc to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
- Coverage   83.41%   82.54%   -0.88%     
==========================================
  Files          10       10              
  Lines        1182     1203      +21     
==========================================
+ Hits          986      993       +7     
- Misses        196      210      +14     
Impacted Files Coverage Δ
src/UnROOT.jl 50.00% <ø> (-50.00%) ⬇️
src/custom.jl 54.28% <0.00%> (-6.04%) ⬇️
src/types.jl 92.40% <ø> (ø)
src/iteration.jl 66.08% <85.00%> (+1.27%) ⬆️
src/root.jl 89.47% <93.33%> (+0.46%) ⬆️
src/streamers.jl 88.57% <0.00%> (-1.79%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2ca8c49...cfafdcc. Read the comment docs.

@Moelf
Copy link
Member Author

Moelf commented Jul 25, 2021

manually requesting @aminnj 's review with his many battle tested workload

@aminnj
Copy link
Member

aminnj commented Jul 25, 2021

Hmm, I tried this out on a 2.4M event Z->mumu tree. Uploaded here (~400MB). I get 5x slower looping rate compared to master.

julia> const f = ROOTFile("doublemu.root") # 2.4M events

julia> const t = LazyTree(f, "t", [r"^Muon_(pt|eta|phi|mass)$","MET_pt"]);

julia> t.Muon_pt
2476431-element LazyBranch{Vector{Float32}, UnROOT.Nooffsetjagg}:
 [22.402422, 18.186892]
 [45.062744, 44.058678]
 [8.216898]
 
 [24.227955, 14.885574]
 [16.145634, 12.987432, 6.6345634, 4.7526903]

julia> struct DummyLV{T <: AbstractFloat}
    pt::T
    eta::T
    phi::T
    mass::T
end
julia> @time for (i,evt) in enumerate(t)
    length(evt.Muon_pt) < 2 && continue # at least 2 mu
    ((evt.Muon_pt[1] < 20) || (evt.Muon_pt[2] < 20)) && continue # leading 2 mu pt>20
    lvs = DummyLV.(evt.Muon_pt,evt.Muon_eta,evt.Muon_phi,evt.Muon_mass)
end

0.199550 seconds (511.96 k allocations: 56.539 MiB) # in master

1.161238 seconds (19.17 M allocations: 721.681 MiB, 11.49% gc time, 5.18% compilation time)
 # in this branch

@Moelf
Copy link
Member Author

Moelf commented Jul 25, 2021

hmm, I can imagine if loop is very tight already, getting views itself might be slower than having concrete numbers with very good locality.

@aminnj
Copy link
Member

aminnj commented Jul 25, 2021

Some more data. With this simpler test:

julia> @time for (i,evt) in enumerate(t)
    length(evt.Muon_pt) < 2 && continue # at least 2 mu
end

I got

master

cold: 2.859309 seconds (24.56 M allocations: 1.072 GiB, 20.29% gc time, 0.16% compilation time)
warm: 0.030027 seconds (5.90 k allocations: 767.688 KiB)

this PR

cold: 0.575812 seconds (4.92 M allocations: 373.350 MiB, 14.94% gc time, 1.65% compilation time)
warm: 0.234112 seconds (4.89 M allocations: 150.975 MiB, 7.24% gc time)

@aminnj
Copy link
Member

aminnj commented Jul 25, 2021

Time and allocations are reduced for cold runs in this PR wrt master, but subsequent runs are slow. I tried reverting the 3GB->1GB cache reduction and still see the slowness. So it must not be the cache.

@Moelf
Copy link
Member Author

Moelf commented Jul 26, 2021

I think I found the issue,
before:
image

after:
image

those gaps ....

@Moelf Moelf changed the title use ArraysOfArrays and Ref() to reduce allocation use ArraysOfArrays to reduce allocation Jul 26, 2021
@Moelf
Copy link
Member Author

Moelf commented Jul 26, 2021

I wonder if we can pre-fetch the first basket of each LazyBranch at creation time which will help us infer the buffer type without too much user interference

@Moelf
Copy link
Member Author

Moelf commented Jul 26, 2021

do we want to merge this? or people have ideas for polishing. I think there's still some small instability causing allocation, but since even with that @assert, none of the tests failed so I don't know what to look for....

@tamasgal
Copy link
Member

Yes I think we should merge and then see how it performs in the wild 😜

@Moelf Moelf merged commit 04a4487 into master Jul 27, 2021
@Moelf Moelf deleted the test_ArraysOfArrays branch September 6, 2021 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poor performance when weaving many branches together
3 participants