[ntuple] Fix read request coalescing when opening file from anchor #16659
jblomer merged 3 commits into root-project:master
When we open an RNTuple from an anchor, we don't read the anchor with the mini file reader and therefore we must initialize the max key size manually from the given anchor. Otherwise, page coalescing won't work.
} else {
   fReader.SetMaxKeySize(fAnchor->GetMaxKeySize());
}
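The two code paths touched by this change can be sketched with stand-in types. This is a minimal illustration, not the ROOT implementation: `Anchor`, `MockMiniFileReader`, `ReadAnchor` and `AttachMaxKeySize` are hypothetical names, and the 1 MiB value is made up. It only shows that the reader learns the max key size either as a side effect of reading the anchor or explicitly from an anchor handed in by the caller.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the anchor; not the ROOT class.
struct Anchor {
   std::uint64_t maxKeySize;
   std::uint64_t GetMaxKeySize() const { return maxKeySize; }
};

// Hypothetical stand-in for the mini file reader.
class MockMiniFileReader {
   std::uint64_t fMaxKeySize = 0; // 0 stands for "never initialized"
   Anchor fOnDisk{1024 * 1024};   // made-up anchor "stored in the file"

public:
   // Reading the anchor records the max key size as a side effect.
   Anchor ReadAnchor()
   {
      fMaxKeySize = fOnDisk.GetMaxKeySize();
      return fOnDisk;
   }
   void SetMaxKeySize(std::uint64_t size) { fMaxKeySize = size; }
   std::uint64_t GetMaxKeySize() const { return fMaxKeySize; }
};

// Same shape as the fix: if the caller already has an anchor, nothing is
// read from the file, so the max key size must be set by hand.
std::uint64_t AttachMaxKeySize(MockMiniFileReader &reader, std::optional<Anchor> anchor)
{
   if (!anchor) {
      anchor = reader.ReadAnchor(); // side effect initializes the reader
   } else {
      reader.SetMaxKeySize(anchor->GetMaxKeySize()); // explicit initialization
   }
   return reader.GetMaxKeySize();
}
```

Without the `else` branch, a reader opened from a given anchor would keep the uninitialized value in this sketch.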
I am a bit confused. Why is it not:
} else {
   fReader.SetMaxKeySize(fAnchor->GetMaxKeySize());
}
if (!fAnchor) {
   fAnchor = fReader.GetNTuple(fNTupleName).Unwrap();
   fReader.SetMaxKeySize(fAnchor->GetMaxKeySize());
}
and then the SetMaxKeySize also on line 305?
The mini file reader will set the max key size as a side effect of reading the anchor.
If we already have the anchor, we don't need to read it again.
Why the diverging code path? I.e., is there a reason to keep the mini file reader setting the value as a side effect rather than setting it in only one place?
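For illustration, the single-place alternative raised in this question could look roughly like the sketch below. It is not code from the PR: `AnchorInfo`, `PlainReader` and `AttachAnchor` are hypothetical names, and the sketch assumes a reader whose anchor read has no side effect.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the anchor; not the ROOT class.
struct AnchorInfo {
   std::uint64_t maxKeySize;
};

// Hypothetical reader whose anchor read does NOT touch the max key size.
class PlainReader {
   std::uint64_t fMaxKeySize = 0;

public:
   AnchorInfo ReadAnchor() const { return AnchorInfo{1 << 20}; } // made-up value
   void SetMaxKeySize(std::uint64_t size) { fMaxKeySize = size; }
   std::uint64_t GetMaxKeySize() const { return fMaxKeySize; }
};

// Obtain the anchor (given or read), then set the max key size in exactly
// one place, regardless of where the anchor came from.
AnchorInfo AttachAnchor(PlainReader &reader, std::optional<AnchorInfo> anchor)
{
   if (!anchor)
      anchor = reader.ReadAnchor();
   reader.SetMaxKeySize(anchor->maxKeySize);
   return *anchor;
}
```

The trade-off in this sketch is that every caller of the reader has to remember the explicit call, whereas the side-effect variant cannot be forgotten on the read path.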
@silverweed What do you think? I agree that the current situation is not ideal.
So this affected reading an RNTuple back from a TFile? What would be the visible change in behavior after this fix?
The effect is that ROOT uses far fewer read requests to get the data. At the moment, every page is a read request. When the fix is merged, I expect more or less […]. You'll be able to verify this in the metrics output, looking at the […]
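To make the "every page is a read request" effect concrete, here is a generic sketch of request coalescing under a size limit. It is an assumption that the real page source applies a rule of this shape; the thread only states that coalescing does not work when the max key size is not initialized. `ByteRange`, `Coalesce`, `maxGap` and `maxRequestSize` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

struct ByteRange {
   std::uint64_t offset;
   std::uint64_t size;
};

// Merge pages (sorted by offset, non-overlapping) into read requests.
// A page joins the current request if the gap to it is at most maxGap and
// the merged request does not exceed maxRequestSize.
std::vector<ByteRange> Coalesce(const std::vector<ByteRange> &pages, std::uint64_t maxGap,
                                std::uint64_t maxRequestSize)
{
   std::vector<ByteRange> requests;
   for (const auto &page : pages) {
      if (!requests.empty()) {
         auto &req = requests.back();
         const std::uint64_t reqEnd = req.offset + req.size;
         const std::uint64_t mergedSize = page.offset + page.size - req.offset;
         if (page.offset >= reqEnd && page.offset - reqEnd <= maxGap && mergedSize <= maxRequestSize) {
            req.size = mergedSize; // extend the current request over the gap and the page
            continue;
         }
      }
      requests.push_back(page); // start a new request
   }
   return requests;
}
```

In this sketch, four pages at offsets 0, 100, 200 and 1000 (100 bytes each) with `maxGap = 16` collapse into two requests when `maxRequestSize` is 1 MiB, but stay at four requests when `maxRequestSize` is 0, which mimics an uninitialized limit.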
Test Results: 17 files, 17 suites, 4d 2h 59m 31s. For more details on these failures, see this check. Results for commit f36963c. This comment has been updated with latest results.
hahnjo left a comment
LGTM as a fix, other improvements can be done in future PRs. Some comments on the added test inline.
std::optional<ROOT::Experimental::RNTupleView<void>> viewPy;
std::optional<ROOT::Experimental::RNTupleView<void>> viewPz;

float px, py, pz;
maybe initialize these variables to silence the (spurious) compiler warnings...
auto model = RNTupleModel::Create();
auto ptrPx = model->MakeField<float>("px");
auto ptrPy = model->MakeField<float>("py");
model->MakeField<bool>("trigger");
EXPECT_LT(reader->GetDescriptor().GetNClusters(),
          reader->GetMetrics().GetCounter("RNTupleReader.RPageSourceFile.nClusterLoaded")->GetValueAsInt());
Why do we expect to load more clusters than there are in the file?
@Dr15Jones FYI.