[ntuple] Reduce memory usage of `RPageSinkBuf` #20425

hahnjo · 2025-11-14T10:02:41Z

When IMT is turned on and RPageSinkBuf has an RTaskScheduler, we previously buffered all pages and created tasks to seal / compress them. While this exposes the maximum amount of work, it wastes memory if other threads are not fast enough to process the tasks. Now we heuristically assume that there is enough work once we already buffer more uncompressed bytes than the approximate zipped cluster size; beyond that point, pages are compressed immediately.
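
As a rough illustration of the heuristic, here is a toy Python model. It is not the actual RPageSinkBuf code: the class, its attributes, and `commit_page` are names made up for this sketch, and the strict "more than" comparison is taken from the wording above.

```python
MiB = 1024 * 1024


class ToySinkBuf:
    """Toy model of the buffering decision; NOT ROOT's RPageSinkBuf."""

    def __init__(self, approx_zipped_cluster_size=128 * MiB):
        # Threshold named in the description: the approximate zipped cluster size.
        self.approx_zipped_cluster_size = approx_zipped_cluster_size
        self.buffered_uncompressed = 0  # bytes in pages waiting for a task to run
        self.deferred_pages = 0         # pages handed to a (hypothetical) task scheduler
        self.inline_pages = 0           # pages compressed immediately by the caller

    def commit_page(self, nbytes):
        """Decide whether a page is buffered for a task or compressed right away."""
        if self.buffered_uncompressed > self.approx_zipped_cluster_size:
            # Enough work is already queued: do not buffer more uncompressed data.
            self.inline_pages += 1
            return "inline"
        self.buffered_uncompressed += nbytes
        self.deferred_pages += 1
        return "deferred"


# One cluster of 270 pages of 1 MiB each, with no worker draining the tasks.
sink = ToySinkBuf()
for _ in range(270):
    sink.commit_page(1 * MiB)
print(sink.deferred_pages, sink.inline_pages)  # prints "129 141"
```

The model assumes no task ever completes while the cluster is being filled, which corresponds to the case with no extra worker thread. With workers running, the buffered amount would shrink as tasks finish and more pages would be deferred.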

In a small test, writing random data with ROOT::EnableImplicitMT(1) and therefore no extra worker thread, the application used 500 MB before this change for the default cluster size of 128 MiB. After this change, memory usage is reduced to around 430 MB (compared to 360 MB without IMT). The compression factor is around 2.1x in this case, which roughly checks out: instead of buffering the full uncompressed cluster (around compression factor * zipped cluster size = 270 MiB), we now buffer uncompressed pages up to the approximate zipped cluster size (128 MiB) and then start compressing pages immediately. The result of course also needs to be buffered, but is much smaller after compression: (1 - 1 / compression factor) * zipped cluster size = 67 MiB. Accordingly, the gain will be higher for larger compression factors.
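
To spell out the estimate above, a small Python sketch of the arithmetic. The function name is made up for illustration, and the model only counts page buffers, so it will not match measured process memory exactly.

```python
def buffered_estimate_mib(zipped_cluster_mib, compression_factor):
    """Return (before, after) buffer estimates in MiB for one cluster."""
    # Before: the full uncompressed cluster is buffered.
    before = compression_factor * zipped_cluster_mib
    # After: uncompressed pages up to the zipped cluster size, plus the
    # compressed form of the remaining pages.
    compressed_remainder = (1 - 1 / compression_factor) * zipped_cluster_mib
    after = zipped_cluster_mib + compressed_remainder
    return before, after


for factor in (2.1, 4.0):
    before, after = buffered_estimate_mib(128, factor)
    print(f"factor {factor}: before {before:.0f} MiB, after {after:.0f} MiB, "
          f"saved {before - after:.0f} MiB")
```

For a factor of 2.1 this gives roughly 269 MiB before and 195 MiB after, a saving of about 74 MiB, in line with the observed drop from 500 MB to 430 MB. For a factor of 4 the estimated saving grows to 288 MiB.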

Closes #18314

FYI @Dr15Jones @makortel as discussed previously

github-actions · 2025-11-14T11:43:03Z

Test Results

22 files 22 suites 3d 14h 15m 11s ⏱️
3 776 tests 3 776 ✅ 0 💤 0 ❌
81 133 runs 81 133 ✅ 0 💤 0 ❌

Results for commit e896cd5.

makortel · 2025-11-14T19:08:37Z

Thanks @hahnjo! Just thinking out loud how we would eventually test the impact of this PR, is this something you'd be comfortable in backporting to 6.36, or would you prefer to keep it in master only? (just to be clear, at this point I'm not asking to backport)

hahnjo · 2025-11-15T09:07:06Z

> Thanks @hahnjo! Just thinking out loud how we would eventually test the impact of this PR, is this something you'd be comfortable in backporting to 6.36, or would you prefer to keep it in master only? (just to be clear, at this point I'm not asking to backport)

Hi @makortel, in the past I rebuilt ROOT locally with the same configuration as in the IB (in particular CMAKE_CXX_STANDARD=20), with only some modifications applied on top. I then LD_PRELOAD the rebuilt RNTuple library, libROOTNTuple.so, which works as long as there is no ABI change. We briefly discussed backporting to v6-36-00-patches and I believe in general we would be open to it, provided it's tested in CMSSW and Athena (via dev3 LCG nightlies).
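
A minimal sketch of the preload step, written as a small Python launcher. The helper name, the library path, and the command are placeholders for illustration; only the LD_PRELOAD environment variable itself is standard dynamic-linker behaviour on Linux.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons; put the new library first.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


# Hypothetical location of the locally rebuilt library.
env = preload_env("/path/to/local-root/lib/libROOTNTuple.so", base_env={})
print(env["LD_PRELOAD"])
# To launch the application under test (placeholder command):
# subprocess.run(["your_application", "your_config"], env=env, check=True)
```

This only swaps in the rebuilt library at load time, so it relies on the local build being ABI-compatible with the ROOT installation the application was built against.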

makortel · 2025-11-17T16:00:02Z

Thanks @hahnjo. Sounds like we should evaluate the impact through the CMSSW IB that tracks ROOT master (can be done).

hahnjo · 2025-11-17T16:27:53Z

That testing will of course also get much easier once you move to ROOT 6.36 globally and merge the RNTuple modules into master...

hahnjo added 2 commits November 14, 2025 10:03

[ntuple] Delete move of RPageSinkBuf

d841342

Created tasks reference *this, so moving is not safe. It's also not needed because RPageSinkBuf is always inside a std::unique_ptr.

hahnjo requested review from enirolf, jblomer, pcanal, silverweed and vepadulano November 14, 2025 10:02

hahnjo self-assigned this Nov 14, 2025

hahnjo added the in:RNTuple label Nov 14, 2025

jblomer approved these changes Nov 14, 2025

makortel mentioned this pull request Nov 17, 2025

Test impact of RPageSinkBuf PR cms-sw/framework-team#1672

Open
