Bug in offsets readout, unexpected fall back to 0 #39
Comments
I just checked with km3io:
>>> import km3io
>>> f = km3io.OnlineReader("mcv6.1.mupage_10G.si
... rene.jterbr00007209.2548.root")
>>> f.events.snapshot_hits._offsets
array([ 0, 151, 198, ..., 783089, 783164, 783229]) |
Digging deeper, it's related to how we handle multiple entries. For the small file:
julia> f = ROOTFile("/home/tgal/Dev/UnROOT.jl/test/samples/km3net_online.root");
julia> f["KM3NET_EVENT/KM3NET_EVENT/snapshotHits"].fBasketSeek
┌ Warning: Can't automatically create LazyBranch for branch KM3NET_EVENT/KM3NET_EVENT/snapshotHits. Returning a branch object
└ @ UnROOT ~/.julia/packages/UnROOT/o16Nw/src/root.jl:96
10-element Vector{Int64}:
1606921
0
0
0
0
0
0
0
0
0
Whereas for the larger file we have two entries, in two different parts of the ROOT file:
julia> f["KM3NET_EVENT/KM3NET_EVENT/snapshotHits"].fBasketSeek
┌ Warning: Can't automatically create LazyBranch for branch KM3NET_EVENT/KM3NET_EVENT/snapshotHits. Returning a branch object
└ @ UnROOT ~/.julia/packages/UnROOT/o16Nw/src/root.jl:96
10-element Vector{Int64}:
324654693
531785725
0
0
0
0
0
0
0
0
It's just the stitching of the two different parts, which is buggy, so that will be an easy fix, I think. Let's see... |
https://github.com/tamasgal/UnROOT.jl/blob/74735c939e4127ee527270b21be9cd19626566f9/src/root.jl#L280 maybe this? But this should only push the last offset for each basket |
I am going through the code. The problem is also the extra entry. Are you handy with basket option tweaking in ROOT? We need a small test file which uses multiple branches to test this 😕 I am currently looking into how to set the basket size but have not found anything useful yet. |
If there are 10 elements in a basket, we do expect 11 offsets, because you index the raw bytes for the ith element with a pair of consecutive offsets. |
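To make the counting concrete, here is a minimal sketch with made-up data. It is not UnROOT's actual code: the helper name `element_bytes` and the assumption that offsets are 0-based byte positions into the payload are only for illustration.

```julia
# Hypothetical helper (not part of UnROOT): slice the raw bytes of element i.
# Assumes `offsets` holds 0-based byte positions, so N elements need N + 1 offsets.
element_bytes(data, offsets, i) = data[offsets[i]+1:offsets[i+1]]

data = UInt8[0x01, 0x02, 0x03, 0x04, 0x05, 0x06]  # made-up payload
offsets = [0, 2, 3, 6]                            # 3 elements -> 4 offsets

n_elements = length(offsets) - 1                  # one fewer than the offsets
sizes = diff(offsets)                             # bytes per element
first_element = element_bytes(data, offsets, 1)
last_element = element_bytes(data, offsets, n_elements)
```

With this layout, one extra entry in the offsets shows up as one phantom element, which would match the 10412 vs. 10411 mismatch discussed here.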
Yes that's true, but I have one more ;) |
Just to clarify: the number of offset-elements is correct (11) for the small file with 10 entries. For the large file it's 10412 instead of 10411 (corresponding to 10410 elements). |
ah ok, I think the fix is when we dump out the entire branch's raw bytes, for each basket we need to add a "global offset". and that's probably also where you want to remove the extra offset that is duplicate between two baskets |
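A rough sketch of that idea, under assumptions: each basket is taken to yield a `(data, local_offsets)` pair whose local offsets start at 0 and end at the length of its data. `stitch_baskets` is a hypothetical helper, not an UnROOT function, and real baskets may need extra header handling.

```julia
# Hypothetical helper, not part of UnROOT: merge per-basket (data, local_offsets)
# pairs into one byte vector and one global offsets vector.
function stitch_baskets(baskets)
    data = UInt8[]
    offsets = Int[0]
    for (basket_data, local_offsets) in baskets
        shift = length(data)               # global offset of this basket
        append!(data, basket_data)
        # Skip the leading local offset: once shifted, it would duplicate the
        # last offset already stored for the previous basket.
        append!(offsets, local_offsets[2:end] .+ shift)
    end
    return data, offsets
end

# Two made-up baskets, each with local offsets starting at 0.
basket1 = (UInt8[1, 2, 3], [0, 1, 3])      # 2 elements
basket2 = (UInt8[4, 5, 6, 7], [0, 2, 4])   # 2 elements
data, offsets = stitch_baskets([basket1, basket2])
```

Here four elements end up with five offsets, and the values keep increasing across the basket boundary instead of restarting at 0.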
Yep, we just need to go ahead and count up. I am trying to figure out where to do it, the code has changed a lot 😆 |
I think just this one function: if what I imagine happening is causing the issue. To enable accessing baskets out of order, each basket returns data and local offsets (i.e. always starts |
Yes, it should be ripped apart (divide and conquer). For now I'll do a quick fix and later we can discuss how to proceed. |
wait, I think we can just fix it by something along the lines of:
Seeks = branch.fBasketSeek;
for i in eachindex(Seeks)
    if i<length(Seeks) && Seeks[i+1]!=0
        pop!(offsets)
    end
    data, offset = readbasketseek(f, branch, Seeks[i])
    append!(datas, data)
    append!(offsets, offset)
end |
Oh well, look at my hack 😆 |
I just discovered a serious bug in the `data` and `offset` readout while I was working with larger ROOT files. I think it's mostly affecting large datasets, but I am not sure.

With the small sample file, we have no issues, everything is read out correctly and I can assemble the hits.

Just as a side note: a `UnROOT.KM3NETDAQHit` consists of 10 bytes and per event we have a 10-byte header.

Now comes the fun part. Here is a large file with `10410` events. In `UnROOT`, the length of the `offsets` indicates `10411` events, one more than expected!

I first compared the length of events (number of hits) at the beginning of the data and the end of the data and they all match! It's just this weird jump in the middle of the `offsets`.

Luckily the number of bytes in `data` matches the expected number of bytes when cross-checking with uproot (it's `n_events * 10bytes + n_hits * 10bytes == 7936390`).

Now I have to understand why the offset starts to count from 0 and how to solve it. For now, `UnROOT` simply iterates over the `diff(offsets)` and it worked, but apparently it's not the full story ;)
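
A small diagnostic sketch for the symptoms described above, using made-up offsets. `offset_resets` and `expected_bytes` are hypothetical helper names, and the hit count of 783229 is an assumption taken from the last offset in the km3io output quoted in the comments, presuming it refers to the same file.

```julia
# Hypothetical diagnostic: indices where the offsets fall back instead of growing.
# A correctly stitched offsets vector is non-decreasing, so this should be empty.
offset_resets(offsets) = findall(<(0), diff(offsets))

good = [0, 151, 198, 260]
bad = [0, 151, 198, 0, 75, 140]    # made-up: a second part restarting at 0

resets_good = offset_resets(good)  # empty
resets_bad = offset_resets(bad)    # [3], i.e. bad[4] < bad[3]

# Byte count from the issue: 10 header bytes per event plus 10 bytes per hit.
expected_bytes(n_events, n_hits) = 10 * (n_events + n_hits)
total = expected_bytes(10410, 783229)
```

Iterating over `diff(offsets)` alone hides such a reset, since a negative difference only appears at the one index where the second part begins.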