Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ntuple, io] pass paths, names of ntuples for merging to RNTuple side #6101

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mxxo
Copy link
Collaborator

@mxxo mxxo commented Jul 29, 2020

We construct a vector of string pairs and pass it through to the
RNTuple::Merge function for merging. This implements a "legacy bridge"
between hadd/TFileMerger and the new RNTuple library.

We adhere to the Merge(TCollection*, TFileMergeInfo*) interface without
using either class in the implementation by carefully casting the
RNTuple information vector and output file string pointers.

An example of this technique is shown here, which passes UBSan, ASan,
and strict aliasing checks on gcc 10 and clang 10: https://godbolt.org/z/7ff3e8

For demonstration, the hadd output for merging two ntuple files looks like:

$ ./bin/hadd -f dir/merged.root ntuple1.root ntuple2.root
hadd Target file: dir/merged.root
hadd compression setting for all output: 1
hadd Source file 1: ntuple1.root
hadd Source file 2: ntuple2.root
hadd Sources and Target have different compression levels
hadd merging will be slower
hadd Target path: dir/merged.root:/
Warning in <TFileMerger::MergeRecursive>: merging RNTuples is experimental

# note: TFileMerger debug output begins 
examining ntuple1.root:/
        got key for type ROOT::Experimental::RNTuple with name ntuple
examining ntuple2.root:/
        got key for type ROOT::Experimental::RNTuple with name ntuple
got ntuple from file 'ntuple1.root:/' named 'ntuple'
got ntuple from file 'ntuple2.root:/' named 'ntuple'

# note: RNTuple::Merge debug output begins
RNTuple merger output file is dir/merged.root:/
merging ntuple from file 'ntuple1.root:/ named 'ntuple'
merging ntuple from file 'ntuple2.root:/ named 'ntuple'
RNTuple merging is unimplemented

# control is passed back to TFileMerger::MergeRecursive
Error in <TFileMerger::MergeRecursive>: error merging RNTuples
Error in <TFileMerger::Merge>: error during merge of your ROOT files

We construct a vector of string pairs and pass it through to the
RNTuple::Merge function for merging. This implements a "legacy bridge"
between hadd/TFileMerger and the new RNTuple library.

We adhere to the Merge(TCollection*, TFileMergeInfo*) interface without
using either class in the implementation by carefully casting the
RNTuple information vector and output file string pointers.

An example of this technique is shown here, which passes UBSan, ASan,
and strict aliasing checks on gcc 10 and clang 10: https://godbolt.org/z/7ff3e8
@mxxo mxxo requested a review from pcanal as a code owner July 29, 2020 23:00
@mxxo mxxo requested a review from jblomer July 29, 2020 23:00
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos7-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac1015/cxx17, windows10/cxx14
How to customize builds

@jblomer jblomer self-assigned this Jul 30, 2020
@jblomer
Copy link
Contributor

jblomer commented Jul 30, 2020

I think we should leave that open for information but open a separate PR with the merging parts that are RNTuple-internal. We can then merge the RNTuple-internal parts independently but still use the code in this PR to (manually) test if the merging still works as if called from hadd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants