-
Notifications
You must be signed in to change notification settings - Fork 67
Add support to write Jagged Arrays to TTrees #477
Conversation
int8 tests do not pass. Jagged arrays of type int8 created by uproot can be read by uproot but not by ROOT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of argument name changes.
The int8
thing is weird. It only seems to affect certain Python versions, which is rather disturbing. Can you try installing two Python versions with and without the bug using conda environments (e.g. 3.6 and 3.7 or 3.7 and 3.8). Be sure to use ROOT from conda in both cases, since a ROOT installation mismatched to Python version should not work.
@HDembinski, as long as you're not writing jagged arrays of Note that the argument names are likely to change. I suggested that "dependence" should become "lengths"; do you have other suggestions? (In Awkward0, this quantity is called "counts", but that's changing in Awkward1 to avoid confusion with the reducer "count". The Awkward1 name for this, "ak.num", might be a little confusing out of context as a "num" argument because what's needed here is the name of the branch with the number of values in each entry, which singular "num" doesn't suggest. It's okay for "ak.num" because that has an "axis" argument.) |
@jpivarski To debug this I set up a virtual environment running ROOT installed via conda for Python 3.7. But here I found that ROOT is unable to properly read TTrees containing
Here is how I created the file -
Reading it via my local system ROOT (master) for Python 3.8 -
Reading it on my virtual machine running ROOT (from conda) for Python 3.7 -
|
I can reproduce this. (My primary ROOT version comes from Conda.) Also, it's not the file—I could do the Note that we can do this: >>> import ROOT
>>> f = ROOT.TFile.Open("jagged.root")
>>> tree = f.Get("t")
>>> tree.Scan()
***********************************************
* Row * Instance * n * branch *
***********************************************
* 0 * 0 * 1 * 0 *
* 1 * 0 * 2 * 1 *
* 1 * 1 * 2 * 2 *
* 2 * 0 * 3 * 3 *
* 2 * 1 * 3 * 4 *
* 2 * 2 * 3 * 5 *
***********************************************
6 The data are definitely there. I'd ask @chrisburr about ROOT in Conda, but this is such a low-level manifestation of whatever the problem might be that I doubt the error could be discovered this way. I'll keep thinking about what would be a way to delve more deeply. Also, I misunderstood earlier; I thought you had been saying that 8-bit integers (i.e. Sometimes, there can be 32-bit vs 64-bit issues across platforms because of how the 32-bit → 64-bit transition happened about 10 years ago. I still have to treat Windows as a special case because of how NumPy works under Windows. The main issue here, though, is that the "count" seems to always be |
1736305
to
3f68423
Compare
I have replaced the earlier |
Length of jaggedarrays depending on the same lengths branch should be the same
Are you done making changes to this PR, and would it produce an error message for any data types that aren't correctly covered ("correct" = uproot writes it, ROOT reads it)? If so, I'll merge it. Thanks! |
I am done with the PR :) and yes, unsupported types will |
[10, 11, 12]]) | ||
|
||
with uproot.recreate(filename, compression=None) as f: | ||
f["t"] = uproot.newtree({"branch": uproot.newbranch(numpy.dtype(">i4"), counter="n")}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename "counter" to "size"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened PR #481
Thank you for adding this, this is a huge step forward for those of us who want to use uproot also to write hierarchical trees.
I looked into the tests, seems all good. Do you want me to try it out anyway? Regarding int8, I actually have a branch of that type, it stores bit flags. I may be able to work around this. I don't have a solution for this issue, but I may be able to share some insight what could be the problem... The issues you found could be related to the some messy details regarding int8, uint8, char, signed char, and unsigned char. For one, ROOT has trouble distinguishing char from int8, when doing TTree::Scan, it prints int8 as char, even if the type of the branch is int8. Apart from that, depending on the system, int8 may be just an alias to char, while on others these are two distinct types for the compiler. The different char types are somewhat special in C and C++ https://stackoverflow.com/questions/75191/what-is-an-unsigned-char |
Fixes #354
The user interface is described in the tests. In particular to be noted is the new
dependence
flag. For eg - https://github.com/scikit-hep/uproot/compare/write-jagged#diff-5f81591ed2ee25a817de2554fd35f608R1861Writing Jagged Arrays of the
int8
datatype is a little bit tricky and does not work yet - I am facing some difficulties in even getting the numbers out of a file created by ROOT. (This is for when using ROOT to read the values. uproot can read out the values correctly, even for the Jagged Arrays of typeint8
written by uproot) Also, since uproot reads the data correctly, that makes it trickier to debug.Weirdly, the travis test for reading a jagged array branch of type
>i8
created by uproot, using ROOT fails - https://travis-ci.org/github/scikit-hep/uproot/jobs/679767938#L1994. The weird part is that it always passes on my local system(ROOT master) whereas it always fails on Travis(ROOT from conda). I would really appreciate any help in debugging this.An update to the README should probably accompany this once the user interface has been finalised.