@enirolf
Contributor

@enirolf enirolf commented Sep 25, 2025

For snapshotting to RNTuple, users can now pass an RNTupleWriteOptions to RSnapshotOptions to configure the output RNTuple. Compression settings that have been set directly through RSnapshotOptions::fCompressionAlgorithm and RSnapshotOptions::fCompressionLevel are propagated to RSnapshotOptions::fNTupleWriteOpts, provided that they have not already been set there.

In addition, a check has been added to warn users when they set options that are specific to one output format while RSnapshotOptions::fOutputFormat has been set to the other.
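
As a rough illustration of the check described above, here is a minimal, self-contained sketch. The types `OutputFormat` and `SnapshotOpts`, the members `fTreeOnlyOption` and `fNTupleOnlyOption`, and the function `CheckFormatSpecificOptions` are hypothetical stand-ins, not the actual ROOT declarations. The sketch assumes an option counts as "set" when it differs from its default.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real ROOT types.
enum class OutputFormat { kTTree, kRNTuple };

struct SnapshotOpts {
   OutputFormat fOutputFormat = OutputFormat::kTTree;
   int fTreeOnlyOption = 0;   // assumed TTree-specific knob, default 0
   int fNTupleOnlyOption = 0; // assumed RNTuple-specific knob, default 0
};

// Collect one warning per option that was changed from its default
// but has no effect on the selected output format.
std::vector<std::string> CheckFormatSpecificOptions(const SnapshotOpts &opts)
{
   std::vector<std::string> warnings;
   if (opts.fOutputFormat == OutputFormat::kRNTuple && opts.fTreeOnlyOption != 0)
      warnings.push_back("fTreeOnlyOption is set, but the output format is RNTuple");
   if (opts.fOutputFormat == OutputFormat::kTTree && opts.fNTupleOnlyOption != 0)
      warnings.push_back("fNTupleOnlyOption is set, but the output format is TTree");
   return warnings;
}
```

Returning the messages instead of printing them keeps the sketch easy to test; the real code would presumably route them through ROOT's own warning facilities.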

Closes #19784.

@github-actions
github-actions bot commented Sep 25, 2025

Test Results

    21 files      21 suites   3d 17h 49m 34s ⏱️
 3 693 tests  3 692 ✅ 0 💤 1 ❌
75 689 runs  75 688 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit 52895b9.

♻️ This comment has been updated with latest results.

Member

@pcanal pcanal left a comment

LGTM.

// compression settings in fNTupleWriteOpts have not been changed, and the compression algorithm or level in fOptions
// have.
if (fOptions.fNTupleWriteOpts.GetCompression() == RCompressionSetting::EDefaults::kUseGeneralPurpose &&
(fOptions.fCompressionAlgorithm != RCompressionSetting::EAlgorithm::kZLIB ||

Contributor

I think that leaves the possibility that a user explicitly sets fCompression... to what happens to be the current default, but is then surprised because RNTuple doesn't pick it up and continues to use its zstd default. I don't have a good solution though; maybe that's ok.

Contributor Author

Hmm, good point. Perhaps to avoid situations like these it would be better to also warn users about fCompression..., and require compression to be set through fNTupleWriteOpts only.

Contributor Author

After the merge of #20030, the behavior is now such that fCompression[...] is propagated to fNTupleWriteOpts if and only if the compression setting there has not been changed from its default. I think that should (reasonably) remove possible ambiguities regarding which compression is used.
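
A minimal sketch of that propagation rule follows. `WriteOpts`, `SnapshotOpts`, `PropagateCompression` and the numeric defaults are placeholders chosen for illustration; they are not the real ROOT types or values.

```cpp
// Hypothetical stand-ins; the default values are placeholders, not ROOT's.
constexpr int kNTupleDefaultCompression = 505;
constexpr int kSnapshotDefaultCompression = 101;

struct WriteOpts {
   int fCompression = kNTupleDefaultCompression;
};

struct SnapshotOpts {
   int fCompression = kSnapshotDefaultCompression;
   WriteOpts fNTupleWriteOpts;
};

// Copy the snapshot-level compression into the write options only when the
// write options still hold their default compression.
void PropagateCompression(SnapshotOpts &opts)
{
   if (opts.fNTupleWriteOpts.fCompression == kNTupleDefaultCompression)
      opts.fNTupleWriteOpts.fCompression = opts.fCompression;
}
```

In this sketch, a value that was explicitly set to the default is indistinguishable from one that was never touched, so it is overwritten just like an untouched one.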

@enirolf enirolf force-pushed the rdf-snapshot-opts branch 2 times, most recently from 12b5c24 to cb535da Compare September 30, 2025 13:55
@enirolf enirolf requested a review from jblomer September 30, 2025 14:30
N.B., compression settings that have been set directly through the
snapshot options are propagated to the RNTuple write options, provided
that they haven't been set there already.
... and warn users when an option has been set that has no effect on the
chosen output format.
@enirolf
Contributor Author

enirolf commented Oct 23, 2025

After discussing offline with @vepadulano, we concluded that in the end it makes more sense to throw an exception when RSnapshotOptions::fCompression[...] and RNTupleWriteOptions::fCompression have both been set to non-default values that conflict with each other. I've updated the first commit accordingly.
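
One way to picture this stricter rule is the sketch below. The function `ResolveCompression`, the two default constants and the exception type are assumptions made for illustration. The fallback when neither value was changed is also a guess, not something stated in this thread.

```cpp
#include <stdexcept>

// Placeholder defaults for illustration only; not ROOT's actual values.
constexpr int kSnapshotDefault = 101;
constexpr int kNTupleDefault = 505;

// Decide which compression setting to use, or throw if both were changed
// from their defaults and disagree with each other.
int ResolveCompression(int snapshotSetting, int ntupleSetting)
{
   const bool snapshotChanged = snapshotSetting != kSnapshotDefault;
   const bool ntupleChanged = ntupleSetting != kNTupleDefault;

   if (snapshotChanged && ntupleChanged && snapshotSetting != ntupleSetting)
      throw std::invalid_argument("conflicting compression settings");

   if (ntupleChanged)
      return ntupleSetting;
   if (snapshotChanged)
      return snapshotSetting;
   return kNTupleDefault; // assumption: fall back to the RNTuple default
}
```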

#ifndef ROOT_RSNAPSHOTOPTIONS
#define ROOT_RSNAPSHOTOPTIONS

#include "ROOT/RNTupleWriteOptions.hxx"

Member

Apologies for bringing this up only now. Looking at the code more closely, I am reminded that a lot of work has been put into removing all header inclusions in RDataFrame of other parts of ROOT I/O (namely TTree, TChain, TFile and related headers). This change introduces another header inclusion that is logically similar to including something from TTree, and I'm afraid it goes against the rest of the work that was done. I don't yet have a clear idea of how to overcome this.

Member

There are two ways I can see:

  • Pass the RNTupleWriteOptions via std::unique_ptr and default to nullptr.
  • Unpack the various options from RNTupleWriteOptions flat into RSnapshotOptions, and then only internally construct the RNTupleWriteOptions in the .cxx.
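
The two approaches listed above can be sketched side by side in a single translation unit. All names here (`WriteOptions`, `SnapshotOptsPtr`, `SnapshotOptsFlat`, `MakeWriteOptions`, `fClusterSize`) and the default values are hypothetical. The comments mark which parts would live in the header and which in the .cxx.

```cpp
#include <cstddef>
#include <memory>

// --- "Header" part: a forward declaration replaces the header inclusion.
struct WriteOptions; // hypothetical stand-in for RNTupleWriteOptions

// Option 1: hold the write options behind a pointer that defaults to nullptr.
struct SnapshotOptsPtr {
   SnapshotOptsPtr();
   ~SnapshotOptsPtr(); // defined below, where WriteOptions is complete
   std::unique_ptr<WriteOptions> fWriteOpts; // nullptr: use the writer's defaults
};

// Option 2: expose the individual settings as flat members.
struct SnapshotOptsFlat {
   int fCompression = 505;          // placeholder default
   std::size_t fClusterSize = 1000; // placeholder default
};

// --- "Source" (.cxx) part: the full type is only needed here.
struct WriteOptions {
   int fCompression = 505;
   std::size_t fClusterSize = 1000;
};

SnapshotOptsPtr::SnapshotOptsPtr() = default;
SnapshotOptsPtr::~SnapshotOptsPtr() = default;

// Build the write options internally from the flat settings.
WriteOptions MakeWriteOptions(const SnapshotOptsFlat &flat)
{
   WriteOptions w;
   w.fCompression = flat.fCompression;
   w.fClusterSize = flat.fClusterSize;
   return w;
}
```

One trade-off visible in the sketch: a `std::unique_ptr` member makes the options struct non-copyable unless copy operations are written by hand, whereas the flat variant stays a plain copyable aggregate.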

Member

The second option would have the slight advantage of removing the need for the compression-setting checks, since we simply wouldn't create a second compression setting in RSnapshotOptions.

Contributor Author

Ah, thanks for bringing this up. I agree that the second option seems more favorable. That would also keep the struct as simple as possible. Perhaps it's not necessary to add all write options for the time being, only the ones we expect to be tweaked most often. On the other hand, it doesn't cost much more effort to add them all at once.

Contributor

@jblomer jblomer left a comment

In principle looks ok to me. I think I would prefer for the moment an internal function in RDF or RNTuple that compares the RNTuple write options. I'm not sure how useful this is as a general API, and once it is in we are bound to maintain it.
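
For illustration, an internal comparison helper of the kind suggested here might look like the sketch below. The `WriteOpts` struct, its members, the `Sketch::Internal` namespace and the name `HaveSameSettings` are all hypothetical and are not taken from the PR.

```cpp
#include <cstddef>

// Hypothetical stand-in for the write options; not the real ROOT class.
struct WriteOpts {
   int fCompression = 505;
   std::size_t fClusterSize = 1000;
   bool fUseBuffering = true;
};

namespace Sketch {
namespace Internal {

// Field-by-field comparison kept out of the public interface, so it can be
// changed or removed later without breaking user code.
bool HaveSameSettings(const WriteOpts &a, const WriteOpts &b)
{
   return a.fCompression == b.fCompression &&
          a.fClusterSize == b.fClusterSize &&
          a.fUseBuffering == b.fUseBuffering;
}

} // namespace Internal
} // namespace Sketch
```

Compared with a public `operator==`, a free function in an internal namespace signals that callers outside the library should not rely on it.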

Development

Successfully merging this pull request may close these issues.

[df] RNTuple snapshot + TTree-specific options

4 participants