use ArraysOfArrays to reduce allocation #65

Moelf · 2021-07-25T21:04:21Z

close #64

codecov · 2021-07-25T21:07:07Z

Codecov Report

Merging #65 (3a2ba37) into master (2ca8c49) will decrease coverage by 0.87%.
The diff coverage is 75.00%.

❗ Current head 3a2ba37 differs from pull request most recent head cfafdcc. Consider uploading reports for the commit cfafdcc to get more accurate results

@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
- Coverage   83.41%   82.54%   -0.88%     
==========================================
  Files          10       10              
  Lines        1182     1203      +21     
==========================================
+ Hits          986      993       +7     
- Misses        196      210      +14

Impacted Files        Coverage Δ
src/UnROOT.jl         50.00% <ø> (-50.00%) ⬇️
src/custom.jl         54.28% <0.00%> (-6.04%) ⬇️
src/types.jl          92.40% <ø> (ø)
src/iteration.jl      66.08% <85.00%> (+1.27%) ⬆️
src/root.jl           89.47% <93.33%> (+0.46%) ⬆️
src/streamers.jl      88.57% <0.00%> (-1.79%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Moelf · 2021-07-25T21:08:50Z

Manually requesting @aminnj's review, given his many battle-tested workloads.

aminnj · 2021-07-25T23:06:17Z

Hmm, I tried this out on a 2.4M-event Z->mumu tree. Uploaded here (~400MB). I get a 5x slower looping rate compared to master.

julia> const f = ROOTFile("doublemu.root") # 2.4M events

julia> const t = LazyTree(f, "t", [r"^Muon_(pt|eta|phi|mass)$","MET_pt"]);

julia> t.Muon_pt
2476431-element LazyBranch{Vector{Float32}, UnROOT.Nooffsetjagg}:
 [22.402422, 18.186892]
 [45.062744, 44.058678]
 [8.216898]
 ⋮
 [24.227955, 14.885574]
 [16.145634, 12.987432, 6.6345634, 4.7526903]

julia> struct DummyLV{T <: AbstractFloat}
    pt::T
    eta::T
    phi::T
    mass::T
end

julia> @time for (i,evt) in enumerate(t)
    length(evt.Muon_pt) < 2 && continue # at least 2 mu
    ((evt.Muon_pt[1] < 20) || (evt.Muon_pt[2] < 20)) && continue # leading 2 mu pt>20
    lvs = DummyLV.(evt.Muon_pt,evt.Muon_eta,evt.Muon_phi,evt.Muon_mass)
end

0.199550 seconds (511.96 k allocations: 56.539 MiB) # in master

1.161238 seconds (19.17 M allocations: 721.681 MiB, 11.49% gc time, 5.18% compilation time) # in this branch

Moelf · 2021-07-25T23:19:51Z

Hmm, I can imagine that if the loop is already very tight, getting the views might itself be slower than working with concrete numbers that have very good locality.
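To make the trade-off concrete, here is a minimal sketch in plain Julia of the two layouts being compared. It deliberately does not use the ArraysOfArrays API; the names `nested`, `flat`, `offsets`, and `event_view` are hypothetical and chosen only for illustration.

```julia
# Jagged per-event data stored as one small Vector per event.
nested = [Float32[22.4, 18.2], Float32[45.1, 44.1], Float32[8.2]]

# Flat layout: one contiguous buffer plus offsets marking event boundaries.
flat = reduce(vcat, nested)
offsets = cumsum([1; length.(nested)])   # start index of each event, plus an end sentinel

# Event i is a view into the flat buffer, so the element data is not copied.
# (Hypothetical helper, not a function from UnROOT or ArraysOfArrays.)
event_view(flat, offsets, i) = view(flat, offsets[i]:offsets[i+1]-1)

ev2 = event_view(flat, offsets, 2)
```

The flat layout avoids allocating one array per event, but every access goes through a view object, which is the overhead suspected above.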

aminnj · 2021-07-25T23:21:04Z

Some more data. With this simpler test:

julia> @time for (i,evt) in enumerate(t)
    length(evt.Muon_pt) < 2 && continue # at least 2 mu
end

I got

master

cold: 2.859309 seconds (24.56 M allocations: 1.072 GiB, 20.29% gc time, 0.16% compilation time)
warm: 0.030027 seconds (5.90 k allocations: 767.688 KiB)

this PR

cold: 0.575812 seconds (4.92 M allocations: 373.350 MiB, 14.94% gc time, 1.65% compilation time)
warm: 0.234112 seconds (4.89 M allocations: 150.975 MiB, 7.24% gc time)

aminnj · 2021-07-25T23:24:39Z

Time and allocations are reduced for cold runs in this PR wrt master, but subsequent runs are slow. I tried reverting the 3GB->1GB cache reduction and still see the slowness. So it must not be the cache.

Moelf · 2021-07-26T14:39:34Z

I think I found the issue.

Before: [image not preserved]

After: [image not preserved]

Those gaps ....

Moelf · 2021-07-26T21:04:12Z

I wonder if we can pre-fetch the first basket of each LazyBranch at creation time, which would help us infer the buffer type without too much user interference.
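As a rough sketch of why knowing the buffer type at creation time matters, the plain-Julia example below contrasts an abstractly typed buffer field with one carried as a type parameter. `LooseCol`, `TypedCol`, `first_chunk`, and `total` are hypothetical stand-ins, not UnROOT's actual types or functions.

```julia
# Buffer field with an abstract type: the compiler cannot specialize on it.
struct LooseCol
    buffer::AbstractVector
end

# Buffer type as a parameter: concrete once the column is constructed.
struct TypedCol{B<:AbstractVector}
    buffer::B
end

# Stand-in for a pre-fetched first chunk; its type fixes the parameter B.
first_chunk = Float32[1.0, 2.0, 3.0]
col = TypedCol(first_chunk)

# Code reading col.buffer can now be compiled for Vector{Float32}.
total(c) = sum(c.buffer)
```

Here `typeof(col)` carries the buffer type, so no annotation from the user is needed.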

Moelf · 2021-07-26T21:38:19Z

Do we want to merge this, or do people have ideas for polishing? I think there's still some small instability causing allocations, but even with that @assert in place none of the tests failed, so I don't know what to look for.

tamasgal · 2021-07-27T06:23:12Z

Yes, I think we should merge and then see how it performs in the wild 😜

use ArraysOfArrays and Ref() to reduce allocation

344667f

Moelf added 2 commits July 25, 2021 23:51

use ArraysOfArrays in doubly jagged too

9969095

Let TLV also uses VectorsOfVectors

d2fd7ac

Moelf mentioned this pull request Jul 26, 2021

Poor performance when weaving many branches together #64

Closed

Fix type instability by adding a new parameter for buffer

d52e528

Moelf force-pushed the test_ArraysOfArrays branch from add0fff to d52e528 Compare July 26, 2021 20:58

Moelf changed the title ~~use ArraysOfArrays and Ref() to reduce allocation~~ use ArraysOfArrays to reduce allocation Jul 26, 2021

Use map ntoh to eliminate BitVector

3369d6f

Moelf added 3 commits July 26, 2021 23:42

Merge branch 'master' into test_ArraysOfArrays

9fae840

Clean up debuging code

4453355

Merge branch 'test_ArraysOfArrays' of github.com:tamasgal/UnROOT.jl i…

845748c

…nto test_ArraysOfArrays

tamasgal approved these changes Jul 27, 2021

Formatting

cfafdcc

Moelf force-pushed the test_ArraysOfArrays branch from 3a2ba37 to cfafdcc Compare July 27, 2021 08:04

Bump version, add compat

7101a2f

[skip ci]

Moelf merged commit 04a4487 into master Jul 27, 2021

Moelf deleted the test_ArraysOfArrays branch September 6, 2021 00:09
