Unbounded memory use when iterating over RecordArray #2275

Closed
ponyisi opened this issue Mar 1, 2023 · 3 comments · Fixed by #2311

Labels
bug (unverified) The problem described would be a bug, but needs to be triaged

Comments

ponyisi commented Mar 1, 2023

Version of Awkward Array

1.10.2

Description and code to reproduce

When iterating over a RecordArray repeatedly, we see apparently unbounded growth of our jobs' RSS, which eventually causes them to be killed by the OOM killer. We're not entirely sure whether this comes from uproot or awkward, but the fact that it depends on how the RecordArray is accessed suggests that it's an awkward issue.

To reproduce:

import uproot
import psutil, gc

FILE='data_ceph/larlumi/root/run_00358031.root'

def itr(tree):
    # Array-at-a-time: operate on whole arrays, no per-record iteration
    records = tree.arrays(['PosInTrain', 'EventNumber'])
    records['PosInTrain'] * records['EventNumber']

def itr2(tree):
    # Record-at-a-time: iterate over the RecordArray in Python
    records = tree.arrays(['PosInTrain', 'EventNumber'])
    for q in records:
        pass

p = psutil.Process()
print('Array-at-a-time operations')
for j in range(10):
    print(p.memory_full_info().rss)
    with uproot.open(FILE) as f:
        t=f['myTree']
        itr(t)
    gc.collect()
print(p.memory_full_info().rss)

print()
print('Iteration')
for j in range(10):
    print(p.memory_full_info().rss)
    with uproot.open(FILE) as f:
        t=f['myTree']
        itr2(t)
    gc.collect()
print(p.memory_full_info().rss)

The test file can be obtained from root://utatlas.its.utexas.edu//data_ceph/onyisi/larlumi/root/run_00358031.root . The first set of iterations ("Array-at-a-time operations") increases RSS and then levels off, while "Iteration" grows by a similar amount on every pass through the loop. Explicitly switching off the caches in uproot.open() doesn't affect this.
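For completeness, this is the kind of call we used when switching off the caches (a sketch; object_cache=None and array_cache=None are the uproot keyword arguments that disable the object and array caches):

    with uproot.open(FILE, object_cache=None, array_cache=None) as f:
        t = f['myTree']
        itr2(t)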

The issue appears in awkward 1.10.2 / uproot 4.3.7. I can confirm that it seems to be fixed in awkward 2.0.8 / uproot 5.0.3; however, the coffea dependencies mean we can't upgrade to that version at this time.

(And yes we do have reasons why we can't use the array-at-a-time operations ... and numba makes them pretty fast anyway ...)

@ponyisi ponyisi added the bug (unverified) The problem described would be a bug, but needs to be triaged label Mar 1, 2023
jpivarski (Member) commented

It could be one of these: #567, #1127, #1280. In Awkward 1.x, the layout tree structures that back the arrays were implemented in C++, and after all the investigation in those three issues, I'm still not sure whether there is a memory leak on tree node structures (the small metadata objects, not the large array buffers). To see this leak, if there is one, you have to create a large number of small arrays, which is not the intended usage pattern—as you've pointed out.

You're also right that it goes away in Awkward 2.x. In fact, avoiding the Python/C++ border as much as possible was a fundamental motivation for the 2.x project (ACAT talk). One thing that I know leaks memory is a circular reference that crosses the Python/C++ border; in fact, this is also a NumPy issue that has been open since 2009 (numpy/numpy#6581 (comment)). But in your case, the reference is not circular.

Running your script with Awkward 1.10.2; layout nodes are C++ objects:

Array-at-a-time operations
57126912
108662784
110686208
118956032
119750656
121495552
122552320
121614336
121638912
122638336
122638336

Iteration
122638336
193118208
280481792
367898624
455299072
542711808
630079488
717500416
804909056
892297216
979660800

Running your script with the latest main (past 2.0.8); layout nodes are Python objects:

Array-at-a-time operations
38154240
75194368
78688256
79626240
79626240
79892480
79900672
80158720
80166912
80424960
80691200

Iteration
80691200
77701120
78237696
78495744
78499840
77971456
78499840
78495744
78499840
78233600
78499840

If you know that you're going to be iterating in Python (not Numba), then an up-front conversion to Python objects (to_list) would be beneficial for both speed and memory. Python iteration over built-in objects is some large factor (more than 10?) faster than any calls through __getitem__, especially if they have to cross the Python/C++ border with pybind11.
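For example, a minimal sketch of that up-front conversion (using the same PosInTrain/EventNumber fields as in the reproducer; ak.to_list is the Awkward 1.x conversion to plain Python objects, and itr2_tolist is just an illustrative name):

    import awkward as ak

    def itr2_tolist(tree):
        records = tree.arrays(['PosInTrain', 'EventNumber'])
        # convert once, then iterate over built-in dicts instead of layout nodes
        for q in ak.to_list(records):
            q['PosInTrain'] * q['EventNumber']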

If you're using Numba, the mechanism for iteration is so different that a memory leak here would be no indication of a memory leak in Numba. An itr2 that has been JIT-compiled with Numba (and has enough in the loop that it doesn't get optimized away) consists of stack-bound iterator objects that access the array contents as borrowed memory—so, no memory management at all. (The original array is forced to live as long as the Numba-compiled function runs.)

So if your final application will be in Numba, try memory-profiling that. I think the results will be very different.
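For instance, a Numba version of the loop might look like this (a sketch; itr2_numba is an illustrative name, and attribute access on record fields is the usual Awkward-in-Numba pattern):

    import numba as nb

    @nb.njit
    def itr2_numba(records):
        # enough work in the loop body that it isn't optimized away
        total = 0
        for q in records:
            total += q.PosInTrain * q.EventNumber
        return total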

jpivarski (Member) commented

Good news! A seemingly different memory leak in Coffea was traced back to leaking strings in calls to Form.parameters, and fixing that fixed this leak, too.

There will be a 1.10.3 bug-fix release with this in it (PR #2311).

>>> import awkward
>>> awkward.__version__
'1.10.2'
>>> import uproot
>>> uproot.__version__
'4.3.7'
Array-at-a-time operations
50438144
114077696
115261440
119037952
120647680
121069568
121069568
121069568
121069568
121069568
122101760

Iteration
122101760
122200064
122200064
122167296
121167872
122167296
121167872
121151488
122171392
121171968
121147392

(RSS no longer grows during iteration; it used to grow by about a factor of 10.)
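If you want to exercise that code path directly, something like this should do it (a sketch; it assumes Awkward 1.x, where array.layout.form is a pybind11-wrapped C++ Form object, and the field names are arbitrary):

    import awkward as ak
    import psutil

    arr = ak.Array([{'PosInTrain': 1, 'EventNumber': 2}] * 1000)
    p = psutil.Process()
    for j in range(5):
        for _ in range(100000):
            arr.layout.form.parameters  # each call crosses the Python/C++ border
        print(p.memory_full_info().rss)

With the string leak present, RSS should climb across the outer loop; with the fix, it should level off.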

@jpivarski jpivarski linked a pull request Mar 13, 2023 that will close this issue
ponyisi commented Mar 14, 2023

Does seem to be fixed in 1.10.3, thanks!

@ponyisi ponyisi closed this as completed Mar 14, 2023