TH1_3 not defined #168

Open
PatriceLebrun opened this issue May 10, 2022 · 41 comments

@PatriceLebrun

I am using UnROOT.jl for the first time and trying to read a TH1F histogram produced with GEANT4,
but I get the following error:

julia> f = ROOTFile("/Users/lebrun/Desktop/dedx2COMET.root")
ROOTFile with 1 entry and 53 streamers.
/Users/lebrun/Desktop/dedx2COMET.root
└─ h1 (TH1D)

julia> f["h1"]
ERROR: UndefVarError: TH1_3 not defined

Did I make a trivial error?

@Moelf
Member

Moelf commented May 10, 2022

abstract type TH1 <: ROOTStreamedObject end
struct TH1_8 <: TH1 end
function readfields!(io, fields, T::Type{TH1_8}) end
abstract type TH2 <: ROOTStreamedObject end
struct TH2_4 <: TH2 end
function readfields!(io, fields, T::Type{TH2_4}) end

I think we just don't have a bootstrap definition for that type (TH1_3) here; only the versions listed above are defined.

@Moelf
Member

Moelf commented May 10, 2022

I don't know exactly how this works; it looks like it does literally nothing, but if you want, try adding these two lines:

 struct TH1_3 <: TH1 end 
 function readfields!(io, fields, T::Type{TH1_3}) end 
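
For context, the `UndefVarError` seems to come from the reader building a type name out of the class name and the version number stored in the file (compare the `Symbol(T, "_$(preamble.version)")` expression quoted elsewhere in this thread) and then looking that name up. A minimal sketch of this mechanism, assuming toy stand-in types and a hypothetical helper `lookup_streamer` (neither is UnROOT's actual code):

```julia
# Toy stand-ins for the UnROOT types; none of this is UnROOT's actual code.
abstract type ROOTStreamedObject end
abstract type TH1 <: ROOTStreamedObject end
struct TH1_8 <: TH1 end   # the only version this toy module knows about

# Hypothetical helper: build the versioned name and look it up in module `mod`.
# Returns `nothing` instead of throwing when the version has no definition.
function lookup_streamer(mod::Module, base::Symbol, version::Integer)
    name = Symbol(base, "_", version)
    return isdefined(mod, name) ? getfield(mod, name) : nothing
end

known   = lookup_streamer(@__MODULE__, :TH1, 8)   # the TH1_8 type
unknown = lookup_streamer(@__MODULE__, :TH1, 3)   # nothing: no TH1_3 is defined
```

Defining an empty `TH1_3` struct only makes the lookup succeed; it does not tell the reader how the version 3 bytes are laid out.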

@PatriceLebrun
Author

I added the two lines from Moelf to bootstrap.jl, which fixed this issue, but I had to do the same sort of thing for many other types (TAxis_, TAttAxis_, TNamed_, ...), and in the end I got the following error:
ERROR: Object 'UnROOT.TAttAxis' has 40 bytes; expected 3180234308

(geant4-v11.0.1, OS: Ubuntu 20.04.4 LTS)

@Moelf
Member

Moelf commented May 11, 2022

can you send an example file?

@PatriceLebrun
Author

PatriceLebrun commented May 11, 2022 via email

@Moelf
Member

Moelf commented May 11, 2022

sorry I can't see your file

@PatriceLebrun
Author

PatriceLebrun commented May 11, 2022 via email

@Moelf
Member

Moelf commented May 11, 2022

you're not sending email to me directly, see: #168

@PatriceLebrun
Author

Here is the file concerned

dedx2COMET.root.gz

@PatriceLebrun
Author

PatriceLebrun commented May 11, 2022 via email

@Moelf
Member

Moelf commented Jun 1, 2022

looks like this is the difference between

# this issue
In [3]: f = up.open("./dedx2COMET.root")

In [4]: f.keys()
Out[4]: ['h1;1']

In [5]: f["h1"]
Out[5]: <TH1D (version 1) at 0x7f07b281ce20>

In [6]: f["h1"]
Out[6]: <TH1D (version 1) at 0x7f07b281ce20>

# what UnROOT.jl knows how to handle
In [4]: f["myTH1D"]
Out[4]: <TH1D (version 2) at 0x7fa9a22892b0>

I don't have experience hunting down different internal versions of ROOT objects, so I'm asking @tamasgal.

@aminnj
Member

aminnj commented Jun 2, 2022

In case it helps, you could try to copy the current structs and modify them based on these differences

root [27] TH1::Class()->GetStreamerInfo(3)->ls()

StreamerInfo for class: TH1, version=3, checksum=0x5668b3d7
  TNamed         BASE            offset=  0 type=67 The basis for a named object (name, title)
  TAttLine       BASE            offset= 64 type= 0 Line attributes
  TAttFill       BASE            offset= 80 type= 0 Fill area attributes
  TAttMarker     BASE            offset= 96 type= 0 Marker attributes
  Int_t          fNcells         offset=112 type= 3 number of bins(1D), cells (2D) +U/Overflows
  TAxis          fXaxis          offset=120 type=61 X axis descriptor
  TAxis          fYaxis          offset=336 type=61 Y axis descriptor
  TAxis          fZaxis          offset=552 type=61 Z axis descriptor
  Short_t        fBarOffset      offset=768 type= 2 (1000*offset) for bar charts or legos
  Short_t        fBarWidth       offset=770 type= 2 (1000*width) for bar charts or legos
  Stat_t         fEntries        offset=776 type= 8 Number of entries
  Stat_t         fTsumw          offset=784 type= 8 Total Sum of weights
  Stat_t         fTsumw2         offset=792 type= 8 Total Sum of squares of weights
  Stat_t         fTsumwx         offset=800 type= 8 Total Sum of weight*X
  Stat_t         fTsumwx2        offset=808 type= 8 Total Sum of weight*X*X
  Double_t       fMaximum        offset=816 type= 8 Maximum value for plotting
  Double_t       fMinimum        offset=824 type= 8 Minimum value for plotting
  Double_t       fNormFactor     offset=832 type= 8 Normalization factor
  TArrayD        fContour        offset=840 type=62 Array to display contour levels
  TArrayD        fSumw2          offset=864 type=62 Array of sum of squares of weights
  TString        fOption         offset=888 type=65 histogram options
  TList*         fFunctions      offset=912 type=63 ->Pointer to list of functions (fits and user)
   i= 0, TNamed          type= 67, offset=  0, len=1, method=0
   i= 1, TAttLine        type=  0, offset= 64, len=1, method=0
   i= 2, TAttFill        type=  0, offset= 80, len=1, method=0
   i= 3, TAttMarker      type=  0, offset= 96, len=1, method=0
   i= 4, fNcells         type=  3, offset=112, len=1, method=0
   i= 5, fXaxis          type= 61, offset=120, len=1, method=0
   i= 6, fYaxis          type= 61, offset=336, len=1, method=0
   i= 7, fZaxis          type= 61, offset=552, len=1, method=0
   i= 8, fBarOffset      type= 22, offset=768, len=2, method=0 [optimized]
   i= 9, fEntries        type= 28, offset=776, len=8, method=0 [optimized]
   i=10, fContour        type= 62, offset=840, len=1, method=0
   i=11, fSumw2          type= 62, offset=864, len=1, method=0
   i=12, fOption         type= 65, offset=888, len=1, method=0
   i=13, fFunctions      type= 63, offset=912, len=1, method=0
root [28] TH1::Class()->GetStreamerInfo(8)->ls()

StreamerInfo for class: TH1, version=8, checksum=0x1c3740c4
  TNamed         BASE            offset=  0 type=67 The basis for a named object (name, title)
  TAttLine       BASE            offset= 64 type= 0 Line attributes
  TAttFill       BASE            offset= 80 type= 0 Fill area attributes
  TAttMarker     BASE            offset= 96 type= 0 Marker attributes
  int            fNcells         offset=112 type= 3 number of bins(1D), cells (2D) +U/Overflows
  TAxis          fXaxis          offset=120 type=61 X axis descriptor
  TAxis          fYaxis          offset=336 type=61 Y axis descriptor
  TAxis          fZaxis          offset=552 type=61 Z axis descriptor
  short          fBarOffset      offset=768 type= 2 (1000*offset) for bar charts or legos
  short          fBarWidth       offset=770 type= 2 (1000*width) for bar charts or legos
  double         fEntries        offset=776 type= 8 Number of entries
  double         fTsumw          offset=784 type= 8 Total Sum of weights
  double         fTsumw2         offset=792 type= 8 Total Sum of squares of weights
  double         fTsumwx         offset=800 type= 8 Total Sum of weight*X
  double         fTsumwx2        offset=808 type= 8 Total Sum of weight*X*X
  double         fMaximum        offset=816 type= 8 Maximum value for plotting
  double         fMinimum        offset=824 type= 8 Minimum value for plotting
  double         fNormFactor     offset=832 type= 8 Normalization factor
  TArrayD        fContour        offset=840 type=62 Array to display contour levels
  TArrayD        fSumw2          offset=864 type=62 Array of sum of squares of weights
  TString        fOption         offset=888 type=65 histogram options
  TList*         fFunctions      offset=912 type=63 ->Pointer to list of functions (fits and user)
  int            fBufferSize     offset=920 type= 6 fBuffer size
  double*        fBuffer         offset=928 type=48 [fBufferSize] entry buffer
  TH1::EBinErrorOpt fBinStatErrOpt  offset=968 type= 3 option for bin statistical errors
  TH1::EStatOverflows fStatOverflows  offset=972 type= 3 per object flag to use under/overflows in statistics
   i= 0, TNamed          type= 67, offset=  0, len=1, method=0
   i= 1, TAttLine        type=  0, offset= 64, len=1, method=0
   i= 2, TAttFill        type=  0, offset= 80, len=1, method=0
   i= 3, TAttMarker      type=  0, offset= 96, len=1, method=0
   i= 4, fNcells         type=  3, offset=112, len=1, method=0
   i= 5, fXaxis          type= 61, offset=120, len=1, method=0
   i= 6, fYaxis          type= 61, offset=336, len=1, method=0
   i= 7, fZaxis          type= 61, offset=552, len=1, method=0
   i= 8, fBarOffset      type= 22, offset=768, len=2, method=0 [optimized]
   i= 9, fEntries        type= 28, offset=776, len=8, method=0 [optimized]
   i=10, fContour        type= 62, offset=840, len=1, method=0
   i=11, fSumw2          type= 62, offset=864, len=1, method=0
   i=12, fOption         type= 65, offset=888, len=1, method=0
   i=13, fFunctions      type= 63, offset=912, len=1, method=0
   i=14, fBufferSize     type=  6, offset=920, len=1, method=84435080
   i=15, fBuffer         type= 48, offset=928, len=1, method=920
   i=16, fBinStatErrOpt  type= 23, offset=968, len=2, method=0 [optimized]

and similarly for TAxis version 6 vs 10. I didn't get further than that because of an EOF error :(
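
The two listings can be compared mechanically. The sketch below only diffs the member names copied by hand from the output above (base classes omitted); it is an illustration, not something UnROOT provides:

```julia
# Member names copied from the two StreamerInfo listings (base classes omitted).
th1_v3 = ["fNcells", "fXaxis", "fYaxis", "fZaxis", "fBarOffset", "fBarWidth",
          "fEntries", "fTsumw", "fTsumw2", "fTsumwx", "fTsumwx2", "fMaximum",
          "fMinimum", "fNormFactor", "fContour", "fSumw2", "fOption", "fFunctions"]
th1_v8 = vcat(th1_v3, ["fBufferSize", "fBuffer", "fBinStatErrOpt", "fStatOverflows"])

added   = setdiff(th1_v8, th1_v3)   # members that exist only in version 8
removed = setdiff(th1_v3, th1_v8)   # members that exist only in version 3

println("added in v8: ", join(added, ", "))
println("removed since v3: ", isempty(removed) ? "(none)" : join(removed, ", "))
```

For TH1 itself, version 8 only appends members after `fFunctions`. The base classes and the embedded TAxis objects carry their own versions, though, so matching the TH1 member list alone is probably not enough.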

@Momox357

I'm currently experiencing the same situation with "TH1_6 not defined" and wanted to check whether there has been any progress or new input on this.
As above, just copy-pasting the struct for TAxis gives me an EOF error as well. I've attached the .root file as a .zip just in case.
DataSample_search.zip

@Moelf
Member

Moelf commented Jun 17, 2022

Can you easily check how old those ROOT files are? I suspect we haven't run into this because we have been testing with newer files.

@Momox357

They date from 2013 or earlier, since they are part of the supplementary material of a statistics book we are working with, which was published that year. I can ask whether a more exact date is known, but I doubt it. Your guess is likely correct, since you mentioned that most tests were made with NanoAOD files.

@tamasgal
Member

Sorry for the very late response, I was on holidays and then had a lot of teaching to do.

As @aminnj wrote, GetStreamerInfo() is very helpful, but I have also used uproot successfully in the past to work out the parser logic and compare between versions. That being said, I'd propose checking how uproot handles it. There are still some unresolved mysteries regarding old files, so maybe we are hitting some higher barriers there ;) I will try to free up some time and look at your file.

@Momox357

Momox357 commented Jun 21, 2022 via email

@tamasgal tamasgal self-assigned this Jun 22, 2022
@tamasgal tamasgal added this to the Version 1.0 milestone Jun 22, 2022
@Moelf Moelf added the help wanted Extra attention is needed label Jun 24, 2022
@jstrube
Member

jstrube commented Nov 7, 2022

I have a similar issue that's probably related. I'm just attaching the file here.
t.root.gz

@tamasgal
Member

tamasgal commented Nov 7, 2022

Thanks Jan. We really need a few more contributors 🙈

I'll try to look at this tomorrow (I have another issue as draft for a week or so...)

@jstrube
Member

jstrube commented Nov 7, 2022

Thanks! I'd be interested in the workaround that @Momox357 has mentioned. At the moment I'm a bit stuck.

@Momox357

Momox357 commented Nov 7, 2022

You might be a bit disappointed in that case! Since I only had a few histograms that had this problem I wrote a ROOT script that exported the relevant data (bin content and error) to a .txt and imported that into a new histogram that I could continue working with.
"Workaround" might have been a generous term for that.

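The import half of such a text-based workaround could look roughly like the sketch below. The file layout (one bin per line with lower edge, upper edge, content and error) and the helper name `read_bins` are assumptions made for illustration, and the numbers are invented only to exercise the parser:

```julia
# Hypothetical export format: one bin per line, "low_edge high_edge content error".
# The numbers below are made up purely to exercise the parser.
sample = """
# low high content error
0.0 1.0 12.0 3.5
1.0 2.0 30.0 5.5
2.0 3.0  7.0 2.6
"""

function read_bins(io::IO)
    edges = Float64[]; counts = Float64[]; errors = Float64[]
    for line in eachline(io)
        s = strip(line)
        (isempty(s) || startswith(s, "#")) && continue   # skip blanks and comments
        lo, hi, c, e = parse.(Float64, split(s))
        isempty(edges) && push!(edges, lo)                # first lower edge
        push!(edges, hi); push!(counts, c); push!(errors, e)
    end
    return (; edges, counts, errors)
end

h = read_bins(IOBuffer(sample))
```

In practice `IOBuffer(sample)` would be replaced by an opened text file, and the three arrays handed to whichever histogram type is used downstream.
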
@tamasgal
Member

tamasgal commented Nov 7, 2022

I just had a look at the file @Momox357 has provided and I even run into a parsing error in uproot:

>>> f = uproot.open("dedx2COMET.root")

>>> h1 = f["h1"]

>>> h1.to_hist()
---------------------------------------------------------------------------
KeyInFileError                            Traceback (most recent call last)
Cell In [18], line 1
----> 1 h1.to_hist()

File ~/Dev/UnROOT.jl/venv/lib/python3.10/site-packages/uproot/behaviors/TH1.py:206, in Histogram.to_hist(self, metadata, axis_metadata)
    195 def to_hist(self, metadata=boost_metadata, axis_metadata=boost_axis_metadata):
    196     """
    197     Args:
    198         metadata (dict of str → str): Metadata to collect (keys) and
   (...)
    203     Converts the histogram into a ``hist`` object.
    204     """
    205     return uproot.extras.hist().Hist(
--> 206         self.to_boost(metadata=boost_metadata, axis_metadata=boost_axis_metadata)
    207     )

File ~/Dev/UnROOT.jl/venv/lib/python3.10/site-packages/uproot/behaviors/TH1.py:312, in TH1.to_boost(self, metadata, axis_metadata)
    309     else:
    310         storage = boost_histogram.storage.Double()
--> 312 xaxis = _boost_axis(self.member("fXaxis"), axis_metadata)
    313 out = boost_histogram.Histogram(xaxis, storage=storage)
    314 for k, v in metadata.items():

File ~/Dev/UnROOT.jl/venv/lib/python3.10/site-packages/uproot/behaviors/TH1.py:23, in _boost_axis(axis, metadata)
     20 fNbins = axis.member("fNbins")
     21 fXbins = axis.member("fXbins", none_if_missing=True)
---> 23 if axis.member("fLabels") is not None:
     24     out = boost_histogram.axis.StrCategory([str(x) for x in axis.member("fLabels")])
     26 elif fXbins is None or len(fXbins) != fNbins + 1:

File ~/Dev/UnROOT.jl/venv/lib/python3.10/site-packages/uproot/model.py:561, in Model.member(self, name, all, none_if_missing)
    559             return None
    560         else:
--> 561             raise uproot.KeyInFileError(
    562                 name,
    563                 because="""{}.{} has only the following members:
    564
    565     {}
    566 """.format(
    567                     type(self).__module__,
    568                     type(self).__name__,
    569                     ", ".join(repr(x) for x in self.all_members),
    570                 ),
    571                 file_path=getattr(self._file, "file_path", None),
    572             )

KeyInFileError: not found: 'fLabels' because uproot.dynamic.Model_TAxis_v6 has only the following members:

    '@fUniqueID', '@fBits', 'fName', 'fTitle', 'fNdivisions', 'fAxisColor', 'fLabelColor', 'fLabelFont', 'fLabelOffset', 'fLabelSize', 'fTickLength', 'fTitleOffset', 'fTitleSize', 'fTitleColor', 'fTitleFont', 'fNbins', 'fXmin', 'fXmax', 'fXbins', 'fFirst', 'fLast', 'fTimeDisplay', 'fTimeFormat'

in file dedx2COMET.root

I think uproot also simply tries to parse data with possibly incompatible streamer versions. Either it works -- which means it should be OK, since there are several sanity checks in the ROOT format itself -- or it produces errors which indicate that some versions are not compatible.

I tried to teach UnROOT that TH1_3 is TH1_8, but I think the problem runs much deeper.

I worked my way through all the streamer versions I encountered (like TAxis_1, TAxis_6, TAttMarker_1, TAttMarker_6 and so on), which is very weird in itself: why would a file use different versions of streamers? Was it opened for writing with different ROOT versions? After all that, I finally hit this:

julia> f["h1"]
ERROR: UndefVarError: TAttAxis_8321 not defined

Which can only be some parsing error, unless the file is from the post-alien era 😉

@tamasgal
Member

tamasgal commented Nov 7, 2022

OK I have a bit more information.

I am now working with the file @jstrube provided in #168 (comment)

When I add TAttMarker_1 as an alias to TAttMarker_2 I get a bit further and the parsing eventually chokes at the fIOFeatures step, which is simply missing since the preamble version is <20. I need to figure out how to parse earlier versions:

julia> using UnROOT

julia> f = ROOTFile("t.root");

julia> ENV["JULIA_DEBUG"] = "UnROOT";

julia> f["Primaries/px1"]
┌ Debug: Splitting path 'Primaries/px1' and getting items recursively
└ @ UnROOT ~/Dev/UnROOT.jl/src/root.jl:152
┌ Debug: Retrieving Primaries ('TTree')
└ @ UnROOT ~/Dev/UnROOT.jl/src/root.jl:157
┌ Debug: TTree: UnROOT.TKey32(2965, 2, 22758, 0x00000000, 73, 1, 5381164, 64, "TTree", "Primaries", "Step-by-step energy deposition")
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:899
┌ Debug: Compression type: ZL
└ @ UnROOT ~/Dev/UnROOT.jl/src/types.jl:154
┌ Debug: Compressed/uncompressed size in bytes: 2883 / 22758
└ @ UnROOT ~/Dev/UnROOT.jl/src/types.jl:155
┌ Debug: fIOFeatures is missing
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:955
┌ Debug: Reading branches
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:959
ERROR: EOFError: read end of file

@tamasgal
Member

tamasgal commented Nov 7, 2022

So I guess this is where to search first:

fields[:fIOFeatures] = missing

but this might not be the end of the story ;)
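
To illustrate why a version-gated field matters: whether the field is read determines how many bytes are consumed, so a wrong decision shifts every field after it. The sketch below uses a placeholder one-byte payload and a made-up follow-up field; neither reflects the real TTree layout, and only the threshold of 20 comes from the comment above:

```julia
# Sketch only: the one-byte payload used for fIOFeatures here is a placeholder,
# not the real on-disk layout; the threshold 20 is taken from the comment above.
function read_tree_header!(fields::Dict{Symbol,Any}, io::IO, version::Integer)
    if version >= 20
        fields[:fIOFeatures] = read(io, UInt8)   # placeholder payload
    else
        fields[:fIOFeatures] = missing           # field absent: consume no bytes
    end
    fields[:next] = ntoh(read(io, UInt16))       # hypothetical next field, big-endian
    return fields
end

new = read_tree_header!(Dict{Symbol,Any}(), IOBuffer(UInt8[0x01, 0x00, 0x2a]), 20)
old = read_tree_header!(Dict{Symbol,Any}(), IOBuffer(UInt8[0x00, 0x2a]), 19)
```

Both calls end up with the same value for the follow-up field because the reader skips the absent field without consuming bytes. If other fields also differ between versions, the same kind of gate would be needed for each of them.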

@tamasgal
Member

tamasgal commented Nov 7, 2022

Well, this looks like a different issue to me, somehow a recursion error when inside the TTree the same TTree is parsed again. Below you see the output with a bit more debug.

Maybe the problem is that these branches are sitting at the top of the ROOT file, without any TDirectory or so, but I am now guessing (I am close to 15 hours of work today with just a few tiny interruptions 🙈)

julia> f["pfates"]
┌ Debug: Retrieving pfates ('TTree')
└ @ UnROOT ~/Dev/UnROOT.jl/src/root.jl:157
┌ Debug: TTree: UnROOT.TKey32(575673, 2, 1258332, 0x00000000, 70, 1, 5844079, 64, "TTree", "pfates", "Step-by-step energy deposition")
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:899
┌ Debug: Compression type: ZL
└ @ UnROOT ~/Dev/UnROOT.jl/src/types.jl:154
┌ Debug: Compressed/uncompressed size in bytes: 575594 / 1258332
└ @ UnROOT ~/Dev/UnROOT.jl/src/types.jl:155
┌ Debug: fIOFeatures is missing
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:955
┌ Debug: Reading branches
└ @ UnROOT ~/Dev/UnROOT.jl/src/bootstrap.jl:959
┌ Debug: Unpacking: UnROOT.TKey32(575673, 2, 1258332, 0x00000000, 70, 1, 5844079, 64, "TTree", "pfates", "Step-by-step energy deposition")
└ @ UnROOT ~/Dev/UnROOT.jl/src/streamers.jl:312
ERROR: EOFError: read end of file
Stacktrace:

@Cornelius-G

Hi, is there maybe any update on this topic?
I have a (rather new) .root file and get the error UndefVarError: TAttMarker_1 not defined, which seems to be connected to this issue here.
(Wanted to show people how much nicer it would be to use Julia than ROOT/PyROOT. But this didn't work out since already reading the file fails 😖)

@tamasgal
Member

Hi @Cornelius-G ! I have not made any progress but if you could provide a small sample file, that would already increase the sample size greatly ;)

@Moelf
Member

Moelf commented Jan 26, 2023

hi @Cornelius-G, I think UnROOT.jl is mainly much more ergonomic when dealing with TTree or RNTuple.

Btw, you can try using uproot via PythonCall.jl or PyCall.jl for reading histograms, since that is just a one-time read of a small amount of data and doesn't involve looping.

oh and it would be nice to have a sample .root file with TAttMarker_1

@Cornelius-G

Unfortunately, the original file is quite large and should be looped over.
I attached a smaller sample of the file that also gives the TAttMarker_1 error when running:

f = ROOTFile("data.root")
f["proton"]

data.zip

@Moelf
Member

Moelf commented Jan 26, 2023

I'm a bit confused what is _1:
https://github.com/root-project/root/blob/4ef94f4432a39ef5542cc4f00d46f079115b263d/core/base/inc/TAttMarker.h#LL50C9-L50C9

https://github.com/scikit-hep/uproot5/blob/d6d311aef4e22cc73ae056665dbb41cc18b2d6ac/src/uproot/models/TAtt.py#L328

as far as I can tell the only known version is _2 although, of course, Uproot doesn't have trouble opening this file so there must be something we can do as well

@Cornelius-G

Cornelius-G commented Jan 27, 2023

Well, I have absolutely no idea about root files and the parser. So I'm sorry if I'm just saying things that you already know....
The _1 seems to be the preamble.version. (Is this something like the version of the Tsomething objects?)

I just played around a bit with the stream!(io, fields, ::Type{T}; check=true) function and simply replaced the TAttMarker_1 with TAttMarker_2 in this function (instead of adding a new TAttMarker_1 struct in bootstrap.jl as @tamasgal did above):

streamer_name = ( (T==UnROOT.TAttMarker) ? Symbol(T, "_2") : Symbol(T, "_$(preamble.version)"))

With this I end up with an EOFError like @tamasgal also mentioned above

ERROR: EOFError: read end of file
Stacktrace:
  [1] peek
    @ ./iobuffer.jl:180 [inlined]
  [2] read
    @ ./iobuffer.jl:190 [inlined]
  [3] readtype
    @ ~/julia/.julia/packages/UnROOT/snew4/src/io.jl:23 [inlined]
  [4] readobjany!(io::IOBuffer, tkey::UnROOT.TKey32, refs::Dict{Int32, Any})
    @ UnROOT ~/julia/.julia/packages/UnROOT/snew4/src/streamers.jl:198
  [5] unpack(io::IOBuffer, tkey::UnROOT.TKey32, refs::Dict{Int32, Any}, T::Type{UnROOT.TObjArray})
    @ UnROOT ~/julia/.julia/packages/UnROOT/snew4/src/streamers.jl:317
  [6] UnROOT.TTree(io::UnROOT.MmapStream, tkey::UnROOT.TKey32, refs::Dict{Int32, Any}; top::Bool)
    @ UnROOT ~/julia/.julia/packages/UnROOT/snew4/src/bootstrap.jl:947
  [7] UnROOT.TTree(io::UnROOT.MmapStream, tkey::UnROOT.TKey32, refs::Dict{Int32, Any})
    @ UnROOT ~/julia/.julia/packages/UnROOT/snew4/src/bootstrap.jl:888
  [8] macro expansion
    @ ~/julia/.julia/packages/UnROOT/snew4/src/root.jl:159 [inlined]
  [9] (::UnROOT.var"##getter#313#133"{ROOTFile, String})()
    @ UnROOT ~/julia/.julia/packages/Memoization/ut5GT/src/Memoization.jl:163
 [10] get!(default::UnROOT.var"##getter#313#133"{ROOTFile, String}, lru::LRUCache.LRU{Any, Any}, key::Tuple{Tuple{ROOTFile, String}, Tuple{}})
    @ LRUCache ~/julia/.julia/packages/LRUCache/8li9I/src/LRUCache.jl:116
 [11] _get!
    @ ~/julia/.julia/packages/Memoization/ut5GT/src/Memoization.jl:170 [inlined]
 [12] _getindex(f::ROOTFile, s::String)
    @ UnROOT ~/julia/.julia/packages/Memoization/ut5GT/src/Memoization.jl:165
 [13] getindex(f::ROOTFile, s::String)
    @ UnROOT ~/julia/.julia/packages/UnROOT/snew4/src/root.jl:138
 [14] top-level scope
    @ ~/readRoot.jl:35

Not sure how to get further from this point. But I guess that was also your problem 😅
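
The same substitution can be written as a small alias table instead of a hard-coded branch. This is only a sketch: `STREAMER_ALIASES` and `streamer_name` are hypothetical names, and the single entry mirrors the TAttMarker_1 to TAttMarker_2 experiment described in this thread:

```julia
# Hypothetical alias table: (class name, version found in file) => version to use.
# The single entry mirrors the TAttMarker_1 -> TAttMarker_2 experiment in this thread.
const STREAMER_ALIASES = Dict{Tuple{Symbol,Int},Int}((:TAttMarker, 1) => 2)

# Build the versioned streamer name, applying an alias when one is registered.
function streamer_name(class::Symbol, version::Integer)
    v = get(STREAMER_ALIASES, (class, Int(version)), Int(version))
    return Symbol(class, "_", v)
end

aliased   = streamer_name(:TAttMarker, 1)   # :TAttMarker_2
unchanged = streamer_name(:TH1, 8)          # :TH1_8
```

An alias only helps when the two versions really share a byte layout. The EOFError above suggests that either they do not, or the version number was misread in the first place.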

@tamasgal
Member

The problem probably occurs before the parser even reaches the TAttMarker. I think there is something systematic about the version being parsed as "1". Maybe something right before it is not parsed correctly.

The ROOT structure is fairly complex and the only source of information we have is code, more or less ;)

If uproot can parse the file, then we have some inconsistencies in the streamer parsing. The good news is that it's at the beginning of the chain. If you have time and are eager, you can look up what uproot does and compare with UnROOT. I have little time for this at the moment, very sorry ☹️

@rushabhgala

Is there an update on this?
I am trying to read some TH1_ histograms and encounter the same error

ERROR: UndefVarError: `TH1_7` not defined
Stacktrace:
 [1] stream!(io::IOBuffer, fields::Dict{Symbol, Any}, ::Type{UnROOT.TH1}; check::Bool)
   @ UnROOT ~/.julia/packages/UnROOT/1QDzX/src/streamers.jl:587
 [2] stream!
   @ ~/.julia/packages/UnROOT/1QDzX/src/streamers.jl:583 [inlined]
 [3] TH(io::UnROOT.MmapStream, tkey::UnROOT.TKey32, refs::Dict{Int32, Any})
   @ UnROOT ~/.julia/packages/UnROOT/1QDzX/src/bootstrap.jl:967
 [4] TH1F(io::UnROOT.MmapStream, tkey::UnROOT.TKey32, refs::Dict{Int32, Any})
   @ UnROOT ~/.julia/packages/UnROOT/1QDzX/src/bootstrap.jl:945
 [5] _getindex(f::ROOTFile, s::String)
   @ UnROOT ~/.julia/packages/UnROOT/1QDzX/src/root.jl:182
 [6] #146
   @ ~/.julia/packages/UnROOT/1QDzX/src/root.jl:167 [inlined]
 [7] get!(default::UnROOT.var"#146#147"{ROOTFile, String}, h::Dict{Any, Any}, key::String)
   @ Base ./dict.jl:468
 [8] getindex(f::ROOTFile, s::String)
   @ UnROOT ~/.julia/packages/UnROOT/1QDzX/src/root.jl:166
 [9] top-level scope
   @ REPL[5]:1

This is what I've done: I saved the file from https://scikit-hep.org/uproot3/examples/hepdata-example.root locally and tried to read the histograms using the following snippet

using UnROOT
f = ROOTFile("hepdata-example.root")
f["hpx"]

@Moelf
Member

Moelf commented Mar 4, 2024

yeah, sorry about that. We don't know what the internals of TH1_7 look like. Basically, the _7 is a version number used internally by ROOT, which is bumped whenever the layout of an object changes, and we don't have a good workflow to debug and add every revision.

@tamasgal
Member

tamasgal commented Mar 4, 2024

I will do another reverse engineering session to see if I can figure out something...

@tamasgal
Member

tamasgal commented Mar 5, 2024

So I am pretty sure we are missing something. TAxis_9 has fLabels, which we do not parse. I looked at TAxis_10 to see if it was removed, and it's still there. In addition to that, fModLabels was introduced in class version 10.

Now I am a bit surprised why UnROOT.jl reads TAxis_10 without any problems. Apparently if the labels and the modification of labels are empty, they can be safely "ignored", but I am not sure how that works. fLabels is a THashList*:

UnROOT.TStreamerObjectPointer
  version: UInt16 0x0004
  fOffset: Int64 0
  fName: String "fLabels"
  fTitle: String "List of labels"
  fType: Int32 64
  fSize: Int32 8
  fArrayLength: Int32 0
  fArrayDim: Int32 0
  fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
  fTypeName: String "THashList*"
  fXmin: Float64 0.0
  fXmax: Float64 0.0
  fFactor: Float64 0.0

which is a doubly linked list (TList):

  version: UInt16 0x0004
  fOffset: Int64 0
  fName: String "TList"
  fTitle: String "Doubly linked list"
  fType: Int32 0
  fSize: Int32 0
  fArrayLength: Int32 0
  fArrayDim: Int32 0
  fMaxIndex: Array{Int32}((5,)) Int32[0, 1774568379, 0, 0, 0]
  fTypeName: String "BASE"
  fXmin: Float64 0.0
  fXmax: Float64 0.0
  fFactor: Float64 0.0
  fBaseVersion: Int32 5

@tamasgal
Member

tamasgal commented Mar 6, 2024

I got further (cf. #309) but am still struggling to fully parse the TH1 with class version 7. There are a bunch of graphics formatting things saved in this file, so it's not as easy as just parsing the histogram. There are TLatex instances and other attributed texts, which we had no bootstrapping for.

@tamasgal
Member

tamasgal commented Mar 6, 2024

Coming back to the original issue (TH1 class version 3), I will now try to track down all the related base classes.

Btw, I added a debug function to compare streamers. It's very rudimentary but does the job. Here I compare the streamers for the file with TH1_3 and one of our sample files which works (TH1_8):

julia> f1 = ROOTFile("/Users/tamasgal/Downloads/dedx2COMET.root")
ROOTFile with 1 entry and 53 streamers.
/Users/tamasgal/Downloads/dedx2COMET.root
└─ h1 (TH1D)


julia> f2 = ROOTFile("test/samples/histograms1d2d.root")
ROOTFile with 5 entries and 14 streamers.
test/samples/histograms1d2d.root
├─ myTH1F (TH1F)
├─ myTH1D (TH1D)
├─ myTH2F (TH2F)
├─ myTH2D (TH2D)
└─ myTH1D_nonuniform (TH1D)


julia> UnROOT.Debug.streamerdiff(f1, f2, "TH1")
==========
A: class version 3 (checksum: 1449702359)
B: class version 8 (checksum: 473383108)

Common dependencies: TNamed, TAttMarker, TAttFill, TAttLine
--------------
A: TNamed
B: TNamed
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fMaxIndex:
    A: Int32[0, 0, 0, 0, 0]
    B: Int32[0, -541636036, 0, 0, 0]
--------------
A: TAttLine
B: TAttLine
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fMaxIndex:
    A: Int32[0, 0, 0, 0, 0]
    B: Int32[0, -1811462839, 0, 0, 0]
Mismatch of field values in fBaseVersion:
    A: 1
    B: 2
--------------
A: TAttFill
B: TAttFill
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fMaxIndex:
    A: Int32[0, 0, 0, 0, 0]
    B: Int32[0, -2545006, 0, 0, 0]
Mismatch of field values in fBaseVersion:
    A: 1
    B: 2
--------------
A: TAttMarker
B: TAttMarker
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fMaxIndex:
    A: Int32[0, 0, 0, 0, 0]
    B: Int32[0, 689802220, 0, 0, 0]
Mismatch of field values in fBaseVersion:
    A: 1
    B: 2
--------------
A: fNcells
B: fNcells
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Int_t
    B: int
--------------
A: fXaxis
B: fXaxis
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 216
--------------
A: fYaxis
B: fYaxis
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 216
--------------
A: fZaxis
B: fZaxis
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 216
--------------
A: fBarOffset
B: fBarOffset
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Short_t
    B: short
--------------
A: fBarWidth
B: fBarWidth
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Short_t
    B: short
--------------
A: fEntries
B: fEntries
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Stat_t
    B: double
--------------
A: fTsumw
B: fTsumw
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Stat_t
    B: double
--------------
A: fTsumw2
B: fTsumw2
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Stat_t
    B: double
--------------
A: fTsumwx
B: fTsumwx
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Stat_t
    B: double
--------------
A: fTsumwx2
B: fTsumwx2
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Stat_t
    B: double
--------------
A: fMaximum
B: fMaximum
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Double_t
    B: double
--------------
A: fMinimum
B: fMinimum
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Double_t
    B: double
--------------
A: fNormFactor
B: fNormFactor
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fTypeName:
    A: Double_t
    B: double
--------------
A: fContour
B: fContour
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 24
--------------
A: fSumw2
B: fSumw2
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 24
--------------
A: fOption
B: fOption
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 24
--------------
A: fFunctions
B: fFunctions
Mismatch of field values in version:
    A: 2
    B: 4
Mismatch of field values in fSize:
    A: 0
    B: 8

Missing in A: fBufferSize, fBuffer, fBinStatErrOpt, fStatOverflows

julia> 
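
The core of such a comparison is a field-by-field check. A minimal sketch follows, using two hand-copied descriptions of the fNcells element from the output above and a hypothetical helper `field_mismatches`; it is not the implementation of `UnROOT.Debug.streamerdiff`:

```julia
# Two hand-copied descriptions of the fNcells element, values from the diff above.
a = (version = 2, fName = "fNcells", fTypeName = "Int_t")
b = (version = 4, fName = "fNcells", fTypeName = "int")

# Hypothetical helper: list the fields whose values differ between two NamedTuples.
# Assumes both NamedTuples have the same field names.
function field_mismatches(a::NamedTuple, b::NamedTuple)
    return [(k, getfield(a, k), getfield(b, k))
            for k in keys(a) if getfield(a, k) != getfield(b, k)]
end

for (k, va, vb) in field_mismatches(a, b)
    println("Mismatch of field values in ", k, ":\n    A: ", va, "\n    B: ", vb)
end
```

Many of the mismatches reported above are of this cosmetic kind (`Int_t` vs `int`, `Stat_t` vs `double`), so a real tool would probably want to normalize type aliases before flagging a difference.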

@tamasgal
Member

tamasgal commented Mar 6, 2024

Also interesting: the file with the later version of TH1 contains no streamer for TH1D. I think we need to completely rewrite the histogram reader. Even better: make it automated so that everything else works 🙈

julia> f1 = ROOTFile("/Users/tamasgal/Downloads/dedx2COMET.root")
ROOTFile with 1 entry and 53 streamers.
/Users/tamasgal/Downloads/dedx2COMET.root
└─ h1 (TH1D)


julia> f2 = ROOTFile("test/samples/histograms1d2d.root")
ROOTFile with 5 entries and 14 streamers.
test/samples/histograms1d2d.root
├─ myTH1F (TH1F)
├─ myTH1D (TH1D)
├─ myTH2F (TH2F)
├─ myTH2D (TH2D)
└─ myTH1D_nonuniform (TH1D)


julia> UnROOT.streamerfor(f1, "TH1D")
UnROOT.StreamerInfo(UnROOT.TStreamerInfo{UnROOT.TObjArray}("TH1D", "", 0x0b288969, 1, UnROOT.TObjArray("", 0, Any[UnROOT.TStreamerBase
  version: UInt16 0x0002
  fOffset: Int64 0
  fName: String "TH1"
  fTitle: String "1-Dim histogram base class"
  fType: Int32 0
  fSize: Int32 0
  fArrayLength: Int32 0
  fArrayDim: Int32 0
  fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
  fTypeName: String "BASE"
  fXmin: Float64 0.0
  fXmax: Float64 0.0
  fFactor: Float64 0.0
  fBaseVersion: Int32 3
, UnROOT.TStreamerBase
  version: UInt16 0x0002
  fOffset: Int64 0
  fName: String "TArrayD"
  fTitle: String "Array of doubles"
  fType: Int32 0
  fSize: Int32 0
  fArrayLength: Int32 0
  fArrayDim: Int32 0
  fMaxIndex: Array{Int32}((5,)) Int32[0, 0, 0, 0, 0]
  fTypeName: String "BASE"
  fXmin: Float64 0.0
  fXmax: Float64 0.0
  fFactor: Float64 0.0
  fBaseVersion: Int32 1
])), Set(Any["TArrayD", "TH1"]))

julia> UnROOT.streamerfor(f2, "TH1D")
missing

@Moelf
Member

Moelf commented Mar 6, 2024

that's wild

@tamasgal
Member

tamasgal commented Mar 6, 2024

Yeah... Well, in the end it's just reading the streamers, composing and inheriting fields from the bases (which are superclasses) and then appending the remaining fields. After that it's just a recursive thing to read everything.

This really needs time and work 😬
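
As a rough illustration of that recipe, the sketch below flattens member names by walking the base classes recursively. The table is a toy: the base lists for TH1D and TH1 follow the output quoted in this thread, the member lists are heavily abbreviated, and `STREAMERS` and `flatten_fields` are hypothetical names:

```julia
# Toy streamer table: class name => (bases, own members). Member lists are abbreviated.
const STREAMERS = Dict(
    "TH1D"       => (bases = ["TH1", "TArrayD"], members = String[]),
    "TH1"        => (bases = ["TNamed", "TAttLine", "TAttFill", "TAttMarker"],
                     members = ["fNcells", "fXaxis", "fEntries"]),
    "TNamed"     => (bases = String[], members = ["fName", "fTitle"]),
    "TAttLine"   => (bases = String[], members = ["fLineColor"]),
    "TAttFill"   => (bases = String[], members = ["fFillColor"]),
    "TAttMarker" => (bases = String[], members = ["fMarkerColor"]),
    "TArrayD"    => (bases = String[], members = ["fN", "fArray"]),
)

# Depth-first: fields of each base (in declaration order) come first, own members last.
function flatten_fields(class::AbstractString, table = STREAMERS)
    info = table[class]
    fields = String[]
    for base in info.bases
        append!(fields, flatten_fields(base, table))
    end
    return append!(fields, info.members)
end

layout = flatten_fields("TH1D")
```

A real reader would also have to carry the type and version of every element and deal with pointers and nested objects, which is where most of the work lies.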
