
[df] Manage resources created via TFile::Get #21964

Merged
dpiparo merged 2 commits into root-project:master from vepadulano:gh-21962
Apr 21, 2026

Conversation

@vepadulano
Member

Fixes #21962

@github-actions

github-actions Bot commented Apr 20, 2026

Test Results

22 files   22 suites   3d 9h 57m 40s ⏱️
3 838 tests   3 837 ✅   1 💤   0 ❌
76 662 runs   76 644 ✅   18 💤   0 ❌

Results for commit e651291.

♻️ This comment has been updated with latest results.

@vepadulano vepadulano requested a review from dpiparo April 21, 2026 06:56
@vepadulano vepadulano linked an issue Apr 21, 2026 that may be closed by this pull request
1 task
@vepadulano
Member Author

Also fixes #21987 since the two issues stem from the same pattern.

Collaborator

@ferdymercury ferdymercury left a comment


According to my local tests on my branch, the full valgrind memleak report only disappeared after I added:

7003f23

Maybe with your approach this is not needed anymore, but it would still be useful to add a comment in TFile.cxx explaining why the key is not deleted / purposely leaked if the StreamerInfo is not found.

* Simplify logic to ensure output file is ready for snapshot
* Avoid unnecessarily calling TFile::Get and prefer TFile::GetKey to check for presence of object
@vepadulano
Member Author

According to my local tests on my branch, the full valgrind memleak report only disappeared after I added:

7003f23

Maybe with your approach this is not needed anymore, but it would still be useful to add a comment in TFile.cxx explaining why the key is not deleted / purposely leaked if the StreamerInfo is not found.

I don't see the originally reported leaks with valgrind anymore. There's something weird I still see with the original reproducer:

==167542== 216 (184 direct, 32 indirect) bytes in 1 blocks are definitely lost in loss record 5,720 of 7,089
==167542==    at 0x48412AC: operator new(unsigned long) (vg_replace_malloc.c:488)
==167542==    by 0x49AA1FB: TStorage::ObjectAlloc(unsigned long) (TStorage.cxx:292)
==167542==    by 0x4904F54: TObject::operator new(unsigned long) (TObject.h:189)
==167542==    by 0x5026A13: TFile::Recover() (TFile.cxx:2174)
==167542==    by 0x502186B: TFile::Init(bool) (TFile.cxx:869)
==167542==    by 0x501FE9E: TFile::TFile(char const*, char const*, char const*, int) (TFile.cxx:583)
==167542==    by 0x502F668: TFile::Open(char const*, char const*, char const*, int, int) (TFile.cxx:3979)
==167542==    by 0x7B1727F: ROOT::Internal::RDF::UntypedSnapshotRNTupleHelper::Initialize() (RDFSnapshotHelpers.cxx:894)
==167542==    by 0x4193C3: ROOT::Internal::RDF::RActionSnapshot<ROOT::Internal::RDF::UntypedSnapshotRNTupleHelper, ROOT::Detail::RDF::RLoopManager>::Initialize() (RActionSnapshot.hxx:137)
==167542==    by 0x7B8B1BF: ROOT::Detail::RDF::RLoopManager::InitNodes() (RLoopManager.cxx:804)
==167542==    by 0x7B8CB51: ROOT::Detail::RDF::RLoopManager::Run(bool) (RLoopManager.cxx:943)
==167542==    by 0x41016B: ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager> >::TriggerRun() (RResultPtr.hxx:418)
==167542==

Which to me looks like a false positive, since that line uses standard unique_ptr semantics:

fOutputFile.reset(TFile::Open(fFileName.c_str(), fOptions.fMode.c_str()));

@ferdymercury
Collaborator

ferdymercury commented Apr 21, 2026

Which to me looks like a false positive, since that line uses standard unique_ptr semantics
==167542== by 0x5026A13: TFile::Recover() (TFile.cxx:2174)

Yep, saw the same. I don't think it's a false positive, since it got solved with:
7003f23

@vepadulano
Member Author

@ferdymercury now I understand what you mean, apologies for being slow! I see that's indeed another leak, good catch! I think it deserves a separate PR (it can probably be reproduced standalone, even without RDF).

@vepadulano
Member Author

@ferdymercury I have created a reproducer and issue for the TFile::Open leak at #21997

@vepadulano vepadulano added this to the 6.40.00 milestone Apr 21, 2026
@vepadulano
Member Author

@dpiparo this should go to 6.40

@dpiparo dpiparo modified the milestones: 6.40.00, 6.40.00-rc1 Apr 21, 2026
@dpiparo
Member

dpiparo commented Apr 21, 2026

Let's put it in 6.40.00-rc1

@dpiparo dpiparo merged commit 4d873d9 into root-project:master Apr 21, 2026
30 checks passed
} else {
outFile->Delete(treeName.c_str());
// Ensure deletion of object and all its cycles.
outFile->Delete((objName + ";*").c_str());
Member

That feels wrong. What if the object is semantically the same? In that case this deletes all previous 'backup' copies of the object; is that really the intent? Usually an over-write only over-writes the latest key and keeps the backups.

Relatedly, TTree::Delete("all") has no choice but to also delete the backup copies, since it removes from the file all the TBaskets that they share.



Development

Successfully merging this pull request may close these issues.

Memory leak when using snapshot to write RNTuples using RDataFrame
Memory leak when reading RNTuples using RDataFrame

4 participants