-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[ntuple] Binary format RC3 #16645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ntuple] Binary format RC3 #16645
Conversation
Test Results 18 files 18 suites 3d 18h 36m 23s ⏱️ For more details on these failures, see this check. Results for commit 2a90c5f. ♻️ This comment has been updated with latest results. |
hahnjo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two quick questions regarding RNTuple::Streamer()
tree/ntuple/v7/src/RNTuple.cxx
Outdated
| auto classVersion = buf.ReadVersion(&offClassBuf, &bcnt); | ||
| if (classVersion < 4) | ||
| throw RException(R__FAIL("unsupported RNTuple pre-release")); | ||
| buf.ReadVersion(&offClassBuf, &bcnt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? My understanding is that it was required for the custom streaming of class version 4. I didn't check how this interferes with the checksum below, but I believe that offCkData takes care of skipping the class version already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related question, how does the reading fail after the PR when reading old (unsupported) RNtuple files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fails to open à la "ROOT::Experimental::RNTuple is not an RNTuple". It's like trying to open an RNTuple from a TTree or and TH1F key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this? My understanding is that it was required for the custom streaming of class version 4. I didn't check how this interferes with the checksum below, but I believe that
offCkDatatakes care of skipping the class version already.
I thought we need this to get the byte count (bcnt).. @silverweed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... I believe we can reverse the operations: first read the "final" ReadClassBuffer, then compute the checksum and compare it to the value on disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I had in mind: hahnjo@3649c92 (feel free to just pick into this PR, or we can do it as follow-up) It compares the checksum only after reading the class, but this is likely fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now submitted as #16956
tree/ntuple/v7/src/RNTuple.cxx
Outdated
| // Rewind the version bytes, as ReadClassBuffer needs to read the version again. | ||
| buf.SetBufferOffset(offClassBuf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above; maybe we can remove this?
5b2f516 to
23e17c9
Compare
tree/ntuple/v7/src/RNTuple.cxx
Outdated
| buf.SetBufferOffset(offClassBuf); | ||
| buf.ReadClassBuffer(RNTuple::Class(), this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this the first time around, we should be using:
virtual Int_t ReadClassBuffer(const TClass *cl, void *pointer, Int_t version, UInt_t start, UInt_t count, const TClass *onfile_class = nullptr) = 0;
| buf.SetBufferOffset(offClassBuf); | |
| buf.ReadClassBuffer(RNTuple::Class(), this); | |
| buf.ReadClassBuffer(RNTuple::Class(), this, classVersion, bcnt); |
rather than do the version reading twice.
|
Draft for now. We don't want to merge it before CHEP in order to not impact benchmarks done on master. |
43cefee to
3397df9
Compare
Make the on-disk type name of the cardinality field spell `ROOT::RNTupleCardinality<T>` instead of `ROOT::Experimental::RNTupleCardinality<T>`.
| namespace Experimental { | ||
| class RNTuple; | ||
|
|
||
| namespace Experimental { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are those still in Experimental? Should they be in Internal and/or Detail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, at this point only type names that are part of the on-disk format move out of experimental.
| #include <vector> | ||
|
|
||
| using namespace ROOT::Experimental; | ||
| using namespace ROOT::Experimental::Internal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| using namespace ROOT::Experimental::Internal; | |
| using namespace ROOT; | |
| using namespace ROOT::Experimental::Internal; |
(and keeping the rest of the code as it was).
seems to match 'better' the spirit of this file coding style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea of using namespace ROOT::Experimental::Internal was avoiding long boilerplate strings throughout the file. However, I find the ROOT:: prefix short and concise and I'd have a preference of writing ROOT::RNTuple rather than RNTuple (similar to std::string over string).
5289735 to
294f9a6
Compare
|
Note: clang-format is fine with the changes except that it doesn't understand string literals. |
hahnjo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
enirolf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two small comments, LGTM otherwise!
tree/ntuple/v7/src/RMiniFile.cxx
Outdated
| offset = key.GetSeekKey() + key.fKeyLen; | ||
|
|
||
| constexpr size_t kMinNTupleSize = 70; // size of a RTFNTuple version 4 (min supported version) | ||
| constexpr size_t kMinNTupleSize = 78; // size of a RTFNTuple version 2 (min supported version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason RTFNuple::GetSizePlusChecksum is not used here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the idea is that the anchor size can grow in the future, so we have to maintain kMinNTupleSize independently of the RTFNuple structure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that contradicts the comment next to it then, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm not sure, for me it's quite clear:
- For future versions 3, 4, 5, ...
RTFNTuplegains more members that have a default initialization. That will increaseRTFNuple::GetSizePlusChecksum - This exact line will forever stay like this because it was the size of
RTFNTupleversion 2, when we declared RNTuple 1.0.0.0, and we will always be able to read it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll left the hard-coded 78 but added a static_assert, which makes it clear that this number is currently identical to RTFNuple::GetSizePlusChecksum(). If we grow the anchor in the future, the static assert will remind us of this spot.
Moves ROOT::Experimental::RNTuple to ROOT::RNTuple. Resets class version to 2 (general first class version recommendation for new classes). Removes previous compatibility code.
For historical reasons, the current column flags are 0x08 and 0x10. Reset to 0x01 and 0x02 for format RC3.
Make sure that we can easily extend the field meta-data by putting all the mandatory components first and the optional components last.
There is no (foreseen) use of a version range for the extra type info. Its only current purpose, storing streamer info records, does not need the version at all. A possible future use is storing enum constant names, but that would also be bound to a specific type version and not to a version range. If needed in a future version, a type version range can be added then.
| constexpr size_t kMinNTupleSize = 70; // size of a RTFNTuple version 4 (min supported version) | ||
| // size of a RTFNTuple version 2 (min supported version); future anchor versions can grow. | ||
| constexpr size_t kMinNTupleSize = 78; | ||
| static_assert(kMinNTupleSize == RTFNTuple::GetSizePlusChecksum()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
< static_assert(kMinNTupleSize == RTFNTuple::GetSizePlusChecksum());
static_assert(kMinNTupleSize <= RTFNTuple::GetSizePlusChecksum());
| | 0x08 | 32 | Real32 | IEEE-754 single precision float | | ||
| | 0x09 | 16 | Real16 | IEEE-754 half precision float | | ||
| | 0x16 | 64 | Int64 | Two's complement, little-endian 8-byte signed integer | | ||
| | 0x00 | 1 | Bit | Boolean value | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does this start at 0x00 instead of 0x01 now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was part of the format cleanup. It seemed unclear why we should not not start at 0.
Version leading into the 1.0 format. Final cleanups and removal of backward compatibility code for previous experimental anchors.