Add a tool to merge several podio files into a single one #681
tmadlener
left a comment
This does a lot of heavy lifting that might not be strictly necessary, as it unpacks and then repacks every collection in every frame, whereas a simple hadd would be almost enough for merging TTree-based files. Additionally, it will do schema evolution (if applicable), so there could be subtle changes in the produced output file. On the other hand, this will correctly handle input files with different schema versions, so that might not be too bad.
How much slower is this than a C++ implementation? There is #620, after all, to solve a quite significant performance issue with podio-dump, and the difference is not only the long startup time in that case.
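For reference, the frame-by-frame approach described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: it assumes a reader object exposing `categories` and `get(category)` and a writer exposing `write_frame(frame, category)`. `ListReader` and `ListWriter` are hypothetical in-memory stand-ins so that the sketch runs without podio installed.

```python
from collections import defaultdict


def merge_frames(reader, writer):
    """Copy every frame of every category from `reader` to `writer`.

    Returns the number of frames written per category.
    """
    counts = defaultdict(int)
    for category in reader.categories:
        for frame in reader.get(category):
            # Each frame is read in full and then written again; with a real
            # I/O backend this is the unpack/repack step discussed above.
            writer.write_frame(frame, category)
            counts[category] += 1
    return dict(counts)


class ListReader:
    """Hypothetical stand-in for a reader spanning several input files."""

    def __init__(self, files):
        # `files` is a list of {category: [frame, ...]} dicts, one per file.
        self._files = files

    @property
    def categories(self):
        # Union of all categories, in order of first appearance.
        seen = []
        for content in self._files:
            for category in content:
                if category not in seen:
                    seen.append(category)
        return seen

    def get(self, category):
        # Yield the frames of one category, file after file.
        for content in self._files:
            yield from content.get(category, [])


class ListWriter:
    """Hypothetical stand-in for a writer; collects frames per category."""

    def __init__(self):
        self.frames = defaultdict(list)

    def write_frame(self, frame, category):
        self.frames[category].append(frame)
```

With a real reader and writer the loop stays the same; the cost comes from every frame being materialized once per copy, which is the overhead a plain hadd would avoid.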
Ah I didn't think about |
Can you specify what crashes in this case?
4118849 to
59d9dff
Compare
Maybe add a bit of metadata that keeps track of which files the merged file comes from.
8868a1e to
f11cd49
Compare
This should be ready now. It writes some metadata with the names of the input files as passed in the arguments (full paths are saved as full paths):
049efef to
5fddc2a
Compare
BEGINRELEASENOTES
ENDRELEASENOTES
Useful not to have to deal with many small files, even though the readers can read them fine.