[PyROOT] Fix memory leak in TTree __getattr__ pythonization #15608

Merged 3 commits on May 23, 2024

Contributor

@guitargeek guitargeek commented May 22, 2024

Fixes #15610

As reported on the forum:
https://root-forum.cern.ch/t/memory-leak-in-pyroot-when-using-user-defined-c-macros-from-python-and-ttree-friends/59432

According to that post, the problem has been present since at least 6.26/04, but I suspect it has been there forever (at least since 6.22, with the cppyy upgrade in 2019).

I fixed the memory leak in the Pythonization in the usual way I fix problems with the PyROOT CPython extension: re-implementing the offending parts in Python and hoping that the problem is gone. Which it is!

Note that the code also had no obvious memory leak before, where it was
using the CPyCppyy::Instance_FromVoidPtr function from the CPyCppyy
API. If similar problems appear, this function might have to be
investigated in more detail.

But for this commit it is not relevant: moving more Pythonization logic
into the Python layer is better anyway.

The problem can be reproduced with a variation of the forum reproducer:

import ROOT
import numpy as np

ROOT.gInterpreter.Declare(
    """
template<class T>
void * MyGetAddress(T * b) {
   return *(void**)b->GetAddress();
}
"""
)


def macro(tree, *args, **kwargs):

    import cppyy.ll

    # manually
    v1 = cppyy.ll.cast[tree.GetBranch("value1").GetClassName() + "*"](ROOT.MyGetAddress(tree.GetBranch("value1")))
    # using the Pythonization
    v2 = getattr(tree, "value2")
    return None


pinfo = ROOT.ProcInfo_t()


def print_memory(i):
    ROOT.gSystem.GetProcInfo(pinfo)
    print(i, "memory usage", pinfo.fMemResident, pinfo.fMemVirtual)


class reTupler:
    def __init__(self, tree_name, new_file, src_file):
        self.src_file = ROOT.TFile.Open(src_file)
        self.src_tree = self.src_file.Get(tree_name)

        self.new_file = ROOT.TFile.Open(new_file, "recreate")
        self.new_tree = ROOT.TTree(tree_name, tree_name)

        # To access branches in 'src_tree' from 'new_tree':
        self.new_tree.AddFriend(self.src_tree)

        # To keep track of new branches and store values:
        self.new_branches = {}

    def add_branch(self, name, f, value_type="float"):
        self.new_branches[name] = {}
        self.new_branches[name]["f"] = f
        self.new_branches[name]["name"] = name
        self.new_branches[name]["value_type"] = value_type
        self.new_branches[name]["value"] = value = ROOT.std.vector(value_type)()
        self.new_branches[name]["tbranch"] = self.new_tree.Branch(name, value)

    def run(self):
        nentries = self.src_tree.GetEntries()
        for i in range(nentries):
            # Get entry and make sure src_tree and new_tree are synced
            self.src_tree.GetEntry(i)
            self.new_tree.GetEntry(i)

            # Now loop on all the branches that have been added:
            for branch_name, branch_dict in self.new_branches.items():
                branch_dict["value"].clear()
                # self.new_tree
                branch_dict["f"](self.new_tree)
                # [branch_dict['value'].push_back(result) for result in branch_dict['f'](self.new_tree)]

            # Fill entry with all computed branches
            self.new_tree.Fill()

            if i % 10000 == 0:
                print_memory(i)

        self.new_tree.Write()
        self.new_file.Close()
        self.src_file.Close()


file_path = "_src.root"

file = ROOT.TFile.Open(file_path, "recreate")
tree = ROOT.TTree("DDTree", "DDTree")

# Branch: value1
value1_value = ROOT.std.vector("float")()
value1_branch = tree.Branch("value1", value1_value)

# Branch: value2
value2_value = ROOT.std.vector("float")()
value2_branch = tree.Branch("value2", value2_value)


value_length = 20
nentries = 100000
for i in range(nentries):
    tree.GetEntry(i)

    value1_value.clear()
    [value1_value.push_back(result) for result in np.random.uniform(-999, 999, value_length)]

    value2_value.clear()
    [value2_value.push_back(result) for result in np.random.uniform(-99, 999, value_length)]

    tree.Fill()

file.Write()
file.Close()

tree_name = "DDTree"
src_path = "_src.root"
new_path = "_new.root"

retupler = reTupler("DDTree", new_path, src_path)
retupler.add_branch("new_branch", macro, "float")

retupler.run()

Output without this PR:

0 memory usage 361424 1544100
10000 memory usage 380580 1548912
20000 memory usage 386148 1554504
30000 memory usage 391332 1559540
40000 memory usage 396324 1565324
50000 memory usage 402084 1572740
60000 memory usage 407652 1577572
70000 memory usage 413028 1582796
80000 memory usage 418596 1588976
90000 memory usage 423780 1594012

________________________________________________________
Executed in    3.40 secs    fish           external
   usr time    3.33 secs  399.00 micros    3.33 secs
   sys time    1.96 secs  106.00 micros    1.96 secs

Output with this PR:

0 memory usage 361396 1544116
10000 memory usage 375848 1544304
20000 memory usage 375848 1544304
30000 memory usage 375848 1544304
40000 memory usage 375848 1544304
50000 memory usage 375848 1544304
60000 memory usage 375848 1544304
70000 memory usage 375848 1544304
80000 memory usage 375848 1544304
90000 memory usage 375848 1544304

________________________________________________________
Executed in    2.08 secs    fish           external
   usr time    2.06 secs  471.00 micros    2.06 secs
   sys time    1.99 secs  126.00 micros    1.99 secs

The time measurements exclude the toy data generation. The new implementation is also clearly faster than the old one (2.08 s versus 3.40 s wall time, about 1.6 times as fast), so a win-win!
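To put a number on the leak, the resident-memory column of the two outputs above can be turned into an average growth per entry. This is a small sketch rather than part of the PR: the helper name `growth_per_entry` is made up for illustration, the first sample is skipped as warm-up, and the unit is assumed to be KB, which is how ROOT documents `ProcInfo_t::fMemResident`.

```python
# Resident memory (fMemResident) copied from the two outputs above,
# sampled every 10000 entries. Unit assumed to be KB.
entries = list(range(0, 100000, 10000))
rss_without_pr = [361424, 380580, 386148, 391332, 396324,
                  402084, 407652, 413028, 418596, 423780]
rss_with_pr = [361396, 375848, 375848, 375848, 375848,
               375848, 375848, 375848, 375848, 375848]


def growth_per_entry(entries, rss, skip=1):
    """Average resident-memory growth per entry (hypothetical helper).

    The first `skip` samples are ignored, because the jump between entry 0
    and entry 10000 also contains one-time start-up allocations.
    """
    return (rss[-1] - rss[skip]) / (entries[-1] - entries[skip])


leak_without_pr = growth_per_entry(entries, rss_without_pr)
leak_with_pr = growth_per_entry(entries, rss_with_pr)

print(f"without PR: {leak_without_pr:.2f} KB per entry")
print(f"with PR:    {leak_with_pr:.2f} KB per entry")
```

Under these assumptions the old code grows by roughly half a kilobyte per tree entry, while the new code stays flat after the first sample.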

Note: I'm pretty sure there was also a JIRA issue about this problem, but I can't find it anymore...
Edit: no, the issue I had in mind was not really the same: https://its.cern.ch/jira/browse/ROOT-9875.

Member

@vepadulano vepadulano left a comment

Thank you so much for such a quick fix! I wonder what was the root cause of the memory leak, maybe the CPyCppyy::Instance_FromVoidPtr? Usually one possible culprit is excessive calls to the interpreter but I don't see any evident one in the modified code. Could you add some comment about this in the commit message?

Comment on lines 179 to 181
void * finalAddress = nullptr;
std::string finalType;
std::tie(finalAddress, finalType) = ResolveBranch(tree, name, branch);
Member

@vepadulano vepadulano May 22, 2024

Suggested change
- void * finalAddress = nullptr;
- std::string finalType;
- std::tie(finalAddress, finalType) = ResolveBranch(tree, name, branch);
+ const auto [finalAddress, finalType] = ResolveBranch(tree, name, branch);

Contributor Author

I don't think that's an improvement because it hides the types. It's nice to know when reading that it's void * and std::string, no? I'll try to replace the auto in your suggestion with something typed, if it's not too long

Contributor Author

Ok, it's only working with auto. I'll change the variable names to make the type clearer then

Member

It's just a suggestion, I see your point and you can leave it as it is if you find it not necessary 👍

PyObject *GetAttr(PyObject *self, PyObject *pyname)
// Returns a Python tuple where the first element is either the desired
// CPyCppyy proxy, or an address that still needs to be wrapped by the caller
// in a proxy using cppyy.ll.cast. In the latter laster, the second tuple
Member

I am unsure about this, but maybe you meant

Suggested change
- // in a proxy using cppyy.ll.cast. In the latter laster, the second tuple
+ // in a proxy using cppyy.ll.cast. In the latter case, the second tuple

Contributor Author

My thoughts were out-of-sync with the typing 🙂
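The contract described in the `GetAttr` comment quoted in this thread hints at how the Python side might consume the result. The following is a minimal sketch, not the code from this PR. It assumes that the second tuple element is a C++ type name (the quoted comment is cut off before saying so) and that it is empty when the first element is already a finished proxy. `wrap_getattr_result` and `FakeCast` are hypothetical names; `FakeCast` stands in for `cppyy.ll.cast` so the sketch runs without ROOT.

```python
from typing import Any, Optional, Tuple


class FakeCast:
    """Hypothetical stand-in for cppyy.ll.cast, which is indexed by a C++ type name."""

    def __getitem__(self, type_name: str):
        # Return a callable that "wraps" an address; here it only records both values.
        return lambda address: {"type": type_name, "address": address}


def wrap_getattr_result(result: Tuple[Any, Optional[str]], cast: Any) -> Any:
    """Turn a (value, type_name) tuple into the object handed back to the user.

    Assumption: type_name is None or empty when value is already a proxy.
    """
    value, type_name = result
    if not type_name:
        # The C++ side already produced the final proxy.
        return value
    # Otherwise value is a raw address that still has to be wrapped in Python,
    # mirroring the manual cast[ClassName + "*"](address) call in the reproducer.
    return cast[type_name + "*"](value)


proxy = wrap_getattr_result(("already-a-proxy", None), FakeCast())
wrapped = wrap_getattr_result((0x1234, "std::vector<float>"), FakeCast())
print(proxy)
print(wrapped)
```

With ROOT available, `cppyy.ll.cast` would presumably be passed in place of `FakeCast()`; keeping the cast object as a parameter is only a choice made here so the sketch can be exercised on its own.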

github-actions bot commented May 22, 2024

Test Results

    12 files      12 suites   2d 18h 50m 41s ⏱️
 2 637 tests  2 637 ✅ 0 💤 0 ❌
29 949 runs  29 949 ✅ 0 💤 0 ❌

Results for commit 3f1ed33.

♻️ This comment has been updated with latest results.

@guitargeek
Contributor Author

Thanks for the review! I have added something about the CPyCppyy API to the commit description and PR description as you requested:

Note that the code also had no obvious memory leak before, where it was
using the `CPyCppyy::Instance_FromVoidPtr` function from the CPyCppyy
API. If similar problems appear, this function might have to be
investigated in more detail.

But for this commit it is not relevant: moving more Pythonization logic
into the Python layer is better anyway.

@guitargeek
Contributor Author

By the way, I was so sure it would fail on Windows 32 bit! I think it's the first time I changed something with pointers in PyROOT and the CI was green first try 😆 So this time, there is for sure no "overfitting" to the tested platforms

As reported on the forum:
https://root-forum.cern.ch/t/memory-leak-in-pyroot-when-using-user-defined-c-macros-from-python-and-ttree-friends/59432

I fixed the memory leak in the Pythonization in the usual way I fix
problems with the PyROOT CPython extension: re-implementing the
offending parts in Python and hoping that the problem is gone. Which it is!

Note that the code also had no obvious memory leak before, where it was
using the `CPyCppyy::Instance_FromVoidPtr` function from the CPyCppyy
API. If similar problems appear, this function might have to be
investigated in more detail.

But for this commit it is not relevant: moving more Pythonization logic
into the Python layer is better anyway.
Member

@vepadulano vepadulano left a comment

Looks good! Can you apply the clang-format suggestions before merging?

@dpiparo
Member

dpiparo commented May 23, 2024

I agree with @vepadulano. Then we should backport to 6.32.

The ROOT code formatting style was almost respected already, and it
would be better to do so consistently.
@guitargeek
Contributor Author

I only formatted the code as requested, so no need to wait for the CI.

@guitargeek guitargeek merged commit 6df472e into root-project:master May 23, 2024
13 of 14 checks passed
@guitargeek guitargeek deleted the get_branch branch May 23, 2024 11:08