Threading failure with @batch #133

Closed
aminnj opened this issue Oct 8, 2021 · 2 comments · Fixed by #134

Comments

@aminnj

aminnj commented Oct 8, 2021

(nondeterministic) MWE:

using UnROOT
using Base.Threads
using Polyester # make sure it is >=0.5.3

function foo_batch(t)
    @batch for evt in t
    # @threads for evt in t
        evt.nMuon == 4 || continue
    end
end

foo_batch(LazyTree(ROOTFile("lz4_Run2012BC_DoubleMuParked_Muons.root"),"Events", ["nMuon"]))

Note that it fails only about 30% of the time :( with something like this:

Task (failed) @0x00007f17e9f1d150
UndefRefError: access to undefined reference
Stacktrace:
 [1] load
   @ ~/.julia/packages/ManualMemory/J0yTE/src/ManualMemory.jl:60 [inlined]
 [2] macro expansion
   @ ~/.julia/packages/ManualMemory/J0yTE/src/ManualMemory.jl:160 [inlined]
 [3] load
   @ ~/.julia/packages/ManualMemory/J0yTE/src/ManualMemory.jl:160 [inlined]
 [4] (::Polyester.BatchClosure{var"#1#2", ManualMemory.Reference{Tuple{Int64, Static.StaticInt{1}, Polyester.NoLoop, LazyTree{TypedTables.Table{NamedTuple{(:nMuon,), Tuple{UInt32}}, 1, NamedTuple{(:nMuon,), Tuple{LazyBranch{UInt32, UnROOT.Nojagg, Vector{UInt32}}}}}}}}, false})(p::Ptr{UInt64})
   @ Polyester ~/.julia/packages/Polyester/G022G/src/batch.jl:5
 [5] _call
   @ ~/.julia/packages/ThreadingUtilities/IkkvN/src/threadtasks.jl:11 [inlined]
 [6] (::ThreadingUtilities.ThreadTask)()
   @ ThreadingUtilities ~/.julia/packages/ThreadingUtilities/IkkvN/src/threadtasks.jl:29
Task
  next: Nothing nothing
  queue: Nothing nothing
  storage: Nothing nothing
  donenotify: Base.GenericCondition{SpinLock}
    waitq: Base.InvasiveLinkedList{Task}
      head: Nothing nothing
      tail: Nothing nothing
    lock: SpinLock
      owned: Int64 0
  result: UndefRefError UndefRefError()
  logstate: Nothing nothing
  code: ThreadingUtilities.ThreadTask
    p: Ptr{UInt64} @0x00000000025cf000
  rngState0: UInt64 0xa0ac74b4dbd9be5c
  rngState1: UInt64 0xca657b522fb38e15
  rngState2: UInt64 0x80f16291609a20c5
  rngState3: UInt64 0x00380ae50f218885
  _state: UInt8 0x02
  sticky: Bool true
  _isexception: Bool true

@Moelf suggests it's the customization in UnROOT/src/polyester.jl. Commenting that out seems to eliminate these failures and @batch still works because of the recent PR that made LazyTree <: AbstractVector{LazyEvent{T}}.

So, since @threads/@batch have pretty much the same overhead when considering our problem/batch size [1], if we can do all of the following,

@threads for i in eachindex(t)
@threads for evt in t
@threads for (i,evt) in enumerate(t)

then we can drop all the polyester stuff.
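
As a rough illustration of the first two loop forms, here is a minimal sketch that uses a plain Vector of NamedTuples as a stand-in for the LazyTree (the data and both function names are hypothetical, and no ROOT file is involved). It only shows how results can be collected safely from a threaded loop; it says nothing about UnROOT's performance.

```julia
using Base.Threads

# Hypothetical stand-in for the LazyTree: a plain Vector of NamedTuples.
events = [(nMuon = UInt32(i % 6),) for i in 1:10_000]

# Form 1: iterate over indices; each iteration writes only to its own slot.
function count_four_muon_indexed(t)
    hits = zeros(Bool, length(t))   # Vector{Bool}, not BitVector (which is not thread-safe)
    @threads for i in eachindex(t)
        hits[i] = t[i].nMuon == 4
    end
    return count(hits)
end

# Form 2: iterate over events directly; an atomic counter avoids a data race.
function count_four_muon_direct(t)
    n = Atomic{Int}(0)
    @threads for evt in t
        evt.nMuon == 4 || continue
        atomic_add!(n, 1)
    end
    return n[]
end

n_indexed = count_four_muon_indexed(events)
n_direct = count_four_muon_direct(events)
```

Both functions should agree with a serial `count(e -> e.nMuon == 4, events)` regardless of the number of threads Julia was started with.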

[1]
Double muon file H->ZZ->4mu benchmark with polyester.jl commented out

@aminnj

aminnj commented Oct 8, 2021

julia> @time @batch for evt in t ; _ = evt.Muon_pt; end
  0.501378 seconds (149.80 k allocations: 3.466 GiB, 11.44% gc time, 12.94% compilation time)

julia> @time @threads for evt in t ; _ = evt.Muon_pt; end
  0.413486 seconds (24.45 k allocations: 3.385 GiB, 18.05% gc time, 4.27% compilation time)

julia> @time @batch for (i,evt) in enumerate(t) ; _ = evt.Muon_pt ; end
  2.157369 seconds (61.70 M allocations: 8.052 GiB, 56.97% gc time, 4.21% compilation time)

julia> @time @threads for (i,evt) in enumerate(t) ; _ = evt.Muon_pt ; end
  2.276005 seconds (61.57 M allocations: 7.970 GiB, 60.74% gc time, 1.21% compilation time)

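Whether we use @batch (with our customization) or @threads, the parallelized enumerate(t) is roughly 4-5x slower than iterating over t directly (about 2.2 s vs. 0.4-0.5 s, with ~61 M allocations), so I guess this means we can drop @batch/polyester.jl?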

And in the future it would be nice if we could get @threads for (i,evt) in enumerate(t) to have the same performance as @threads for evt in t.

@aminnj

aminnj commented Oct 8, 2021

Actually, this is an easy fix.

-Base.getindex(e::Iterators.Enumerate{LazyTree{T}}, row::Int) where T = (row, first(iterate(e.itr, row)))
+Base.getindex(e::Iterators.Enumerate{LazyTree{T}}, row::Int) where T = (row, LazyEvent(innertable(e.itr), row))

and then...

julia> @time @threads for evt in t ; _ = evt.Muon_pt ; end
  0.420194 seconds (24.55 k allocations: 3.460 GiB, 16.40% gc time, 0.00% compilation time)

julia> @time @threads for (i,evt) in enumerate(t) ; _ = evt.Muon_pt ; end
  0.401146 seconds (27.84 k allocations: 3.460 GiB, 16.16% gc time, 4.41% compilation time)

and then we drop polyester.
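
For readers unfamiliar with the pattern, here is a toy sketch of indexing into an `enumerate` wrapper directly, in the spirit of the one-line change above. `ToyTree` and `scaled` are hypothetical names, not UnROOT code. The sketch assumes that `@threads` only needs `length`, `firstindex`, and `getindex` on the iterable, which matches my reading of Base around Julia 1.6 but may differ in other versions.

```julia
using Base.Threads

# Hypothetical toy container standing in for LazyTree.
struct ToyTree <: AbstractVector{Float64}
    pt::Vector{Float64}
end
Base.size(t::ToyTree) = size(t.pt)
Base.getindex(t::ToyTree, row::Int) = t.pt[row]

# Let the enumerate wrapper be indexed by row, building the (row, value)
# pair directly instead of going through iterate.
Base.firstindex(e::Iterators.Enumerate{ToyTree}) = firstindex(e.itr)
Base.lastindex(e::Iterators.Enumerate{ToyTree}) = lastindex(e.itr)
Base.getindex(e::Iterators.Enumerate{ToyTree}, row::Int) = (row, e.itr[row])

function scaled(t::ToyTree)
    out = zeros(length(t))
    @threads for (i, pt) in enumerate(t)
        out[i] = 2 * pt   # each iteration writes only to its own slot
    end
    return out
end

tree = ToyTree(collect(1.0:100.0))
result = scaled(tree)
```

The methods are defined only for `Enumerate{ToyTree}`, so they do not change how `enumerate` behaves for other containers.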
