This repository has been archived by the owner on Jan 7, 2023. It is now read-only.

support for object branches #3

Closed
ndawe opened this issue Sep 11, 2012 · 7 comments


@ndawe
Member

ndawe commented Sep 11, 2012

Say you have some branches of objects, like TLorentzVectors, and you want to include something about them in the output array. What if you could pass in a dictionary of [class name] -> [list of functions]? Then, for every branch encountered of type [class name], N=len([list of functions]) columns would be added to the output array, each filled with the output of one function called on the object. For example, you could pass in a function pt() for TLorentzVectors that returns the transverse momentum:

def pt(v):
    return v.Pt()

and a column with the name of the original object branch suffixed with _pt would be filled with the output of that function. Do you think this would be difficult to implement?
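To make the proposal concrete, here is a minimal pure-Python sketch of the [class name] -> [list of functions] idea. It does not use ROOT or root_numpy: FakeLorentzVector is a hypothetical stand-in for TLorentzVector, and objects_to_columns is an illustrative helper, not an existing API.

```python
import math
import numpy as np


class FakeLorentzVector:
    """Hypothetical stand-in for ROOT's TLorentzVector (Pt and Eta only)."""

    def __init__(self, px, py, pz, e):
        self.px, self.py, self.pz, self.e = px, py, pz, e

    def Pt(self):
        return math.hypot(self.px, self.py)

    def Eta(self):
        p = math.sqrt(self.px ** 2 + self.py ** 2 + self.pz ** 2)
        return 0.5 * math.log((p + self.pz) / (p - self.pz))


def pt(v):
    return v.Pt()


def eta(v):
    return v.Eta()


def objects_to_columns(events, converters):
    """Build a record array from object branches.

    events: list of dicts mapping branch name -> object (one dict per entry).
    converters: dict mapping class name -> list of functions.
    """
    # One output column per (object branch, function) pair,
    # named <branch>_<function name>.
    columns = []
    for branch, obj in events[0].items():
        for func in converters.get(type(obj).__name__, []):
            columns.append((branch, func))
    dtype = [(f"{branch}_{func.__name__}", "f8") for branch, func in columns]
    out = np.empty(len(events), dtype=dtype)
    for i, event in enumerate(events):
        for branch, func in columns:
            out[f"{branch}_{func.__name__}"][i] = func(event[branch])
    return out


events = [
    {"jet": FakeLorentzVector(3.0, 4.0, 0.5, 6.0)},
    {"jet": FakeLorentzVector(6.0, 8.0, 1.0, 11.0)},
]
arr = objects_to_columns(events, {"FakeLorentzVector": [pt, eta]})
```

Here the "jet" branch yields two columns, jet_pt and jet_eta, one per function registered for its class.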

@piti118
Contributor

piti118 commented Sep 11, 2012

Probably possible but would be lots and lots of work and I guess it will be really slow.

@ndawe
Member Author

ndawe commented Sep 11, 2012

OK, I'll put it on my "dream list". If you are willing to pay the price of including object info, then I still think it could be very useful, especially if only taking the hit once when converting a tree into HDF5 to play with your data in PyTables.

@piti118
Contributor

piti118 commented Nov 7, 2012

The new refactor allows for this kind of thing. See the converters array in _librootnumpy.pyx. You will need to know the inner workings of how ROOT dumps objects to get it to work, though.

@piti118
Contributor

piti118 commented Dec 16, 2012

Putting a note here about the internal workings of ROOT's object dump.

  • A branch that holds an object is a subclass of TBranchElement
  • The fields are stored in sub-branches (use Branch->GetListOfBranches())
  • The actual data lives in leaves that are subclasses of TLeafElement
  • If we check the LeafCounter and countval of a TLeafElement, it points to the leaf of the mother branch
    • This makes variable-length array detection very confusing
    • We can, however, use IsA to check the type of the leaf
  • Type of value inside a TLeafElement
    • For a SINGLE value stored in a leaf, GetLenStatic reports 1
      • A genuine VARIED array reports GetLenStatic 1 too
        • GetLen and GetNdata don't seem to report the correct length, though. What do they do?
        • We can check CounterLeaf->IsOnTerminalBranch(): if not, it's SINGLE; if yes, it's a genuine VARIED array.
    • For a FIXED array, GetLenStatic reports an integer >1
    • For some VARIED arrays, GetLenStatic reports <0
      • I do not know yet how to get the length of this kind of field.

So for now it is possible to dump objects, but only for SINGLE fields, FIXED fields, and some VARIED fields. Vectors will work transparently.
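The decision rules in the notes above can be summarized as a small pure-Python sketch. It takes the values ROOT reported as plain arguments rather than calling ROOT itself; classify_leaf and its return labels are hypothetical names for illustration, and treating a missing counter leaf as SINGLE is an assumption the notes do not state.

```python
def classify_leaf(len_static, counter_on_terminal_branch=None):
    """Classify a TLeafElement from values already read out of ROOT.

    len_static: the value GetLenStatic() reported for the leaf.
    counter_on_terminal_branch: result of IsOnTerminalBranch() on the
        counter leaf, or None if there is no counter leaf (assumed SINGLE).
    """
    if len_static < 0:
        # Some VARIED arrays report a negative static length; the notes
        # do not say how to recover the real length for these.
        return "VARIED_UNKNOWN_LENGTH"
    if len_static > 1:
        return "FIXED"
    if len_static == 1:
        # SINGLE and genuine VARIED both report 1; the counter leaf
        # is what tells them apart.
        if counter_on_terminal_branch:
            return "VARIED"
        return "SINGLE"
    # len_static == 0 is not covered by the notes.
    return "UNKNOWN"
```

For example, classify_leaf(1, False) gives "SINGLE", classify_leaf(1, True) gives "VARIED", and classify_leaf(3) gives "FIXED".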

@kratsg
Collaborator

kratsg commented Sep 23, 2014

Hey all,

So the idea of object-level information in the file is now going to be superseded by xAODs, which handle all of this work. https://svnweb.cern.ch/trac/atlasoff/browser/Event/xAOD

An example is a Tau Jet object: https://svnweb.cern.ch/trac/atlasoff/browser/Event/xAOD/xAODTau/trunk/Root/TauJet_v1.cxx

Now they did include backwards compatibility, so there is a way to dump an xAOD as an n-tuple. Going the other way is supposed to be hard, but I'm only just starting to dig into this new data format, so I don't know a whole lot about actually using it in practice.

@ndawe
Member Author

ndawe commented Sep 23, 2014

xAOD is an ATLAS-specific format.

The idea here is to be able to take a TTree containing branches of ROOT objects (or even user-defined objects) and provide a mechanism to convert the objects (instead of skipping them) into variable/fixed-length array representations in the numpy array. Like the example above, if I have TLorentzVectors and I care about the Eta() and Pt() of these objects, then there should be a way in tree2array to automatically grab these fields from the objects in these branches and generate the fields in the output record array.
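As a rough illustration of the variable-length case, the sketch below builds a record array whose column holds one float array per entry. It is plain NumPy with no ROOT dependency: FakeVector is a hypothetical stand-in for TLorentzVector, method_column is an illustrative helper, and none of this is what tree2array actually does internally.

```python
import math
import numpy as np


class FakeVector:
    """Hypothetical stand-in for a ROOT object exposing Pt()."""

    def __init__(self, px, py):
        self.px, self.py = px, py

    def Pt(self):
        return math.hypot(self.px, self.py)


def method_column(objects_per_entry, method_name):
    """Call one method on every object of every entry.

    Returns an object-dtype array holding one float64 array per entry,
    so entries may contain different numbers of objects.
    """
    out = np.empty(len(objects_per_entry), dtype=object)
    for i, objs in enumerate(objects_per_entry):
        out[i] = np.array([getattr(o, method_name)() for o in objs], dtype="f8")
    return out


# Two entries: the first has two objects, the second has none.
jets = [[FakeVector(3.0, 4.0), FakeVector(6.0, 8.0)], []]

rec = np.empty(len(jets), dtype=[("jets_n", "i4"), ("jets_pt", object)])
rec["jets_n"] = [len(objs) for objs in jets]
rec["jets_pt"] = method_column(jets, "Pt")
```

An object-dtype column is one simple way to hold per-entry arrays of differing length; a fixed-length branch could instead use a sub-array dtype such as ("f8", (4,)).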

@ndawe
Member Author

ndawe commented Apr 9, 2015

Fields that are expressions involving objects are now supported as of 4.0.1 (ref #184), so I'm closing this.
