memory leak caused by race conditions in parallel loop #7028

Open
evfro opened this issue May 14, 2021 · 2 comments

@evfro

evfro commented May 14, 2021

Reporting a bug

  • [x] I have tried using the latest released version of Numba (the most recent is
    visible in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG).
  • [x] I have included a self-contained code sample to reproduce the problem,
    i.e. it's possible to run it as 'python bug.py'.

Simple reproducing example:

# bug.py

from time import sleep
import numpy as np
from numba import njit, prange

IS_PARALLEL = True  # set to False to compare with the serial version

@njit(parallel=IS_PARALLEL)
def run_iters(n_tries, size):
    res = np.zeros(size, dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        res += tmp  # accumulate a fresh temporary into the shared array on every iteration
    return res


if __name__ == "__main__":
    n_tries = 200
    size = 6000
    for _ in range(100):
        res = run_iters(n_tries, size)
        sleep(0.1)  # give the memory profiler time to take samples

When IS_PARALLEL is False, no memory leak occurs. To check this I ran mprof run --include-children python bug.py using the standard memory_profiler package. The profile exported with mprof plot --output memory.png is shown below:

[memory profile plot for IS_PARALLEL = False]

However, when IS_PARALLEL is True, I get a completely different picture:

[memory profile plot for IS_PARALLEL = True]
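
For a quick in-process check without the mprof CLI, memory_profiler's documented Python API can sample RSS around the same loop (a sketch; the main wrapper is my addition):

from memory_profiler import memory_usage

def main():
    for _ in range(100):
        run_iters(200, 6000)

# sample RSS (in MiB) every 0.1 s while main() runs, including child processes
samples = memory_usage((main, (), {}), interval=0.1, include_children=True)
print("start: %.1f MiB, peak: %.1f MiB" % (samples[0], max(samples)))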

The leak gets worse as the size of the arrays and the number of iterations in the loop increase. In my experiments with real code I observed several hundred gigabytes of RAM consumed by the process, where normally it requires less than a gigabyte.

Note that when the race conditions are resolved by accumulating temporary results per iteration, everything works normally again, i.e. with this modification:

@njit(parallel=True)
def run_iters(n_tries, size):
    res = np.zeros((n_tries, size), dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        res[n, :] += tmp  # each iteration writes only its own row, so there is no race
    return res.sum(axis=0)

I get the following memory profile:

[memory profile plot for the workaround version]

The problem with this workaround is that n_tries in my real experiments is very large, so allocating an (n_tries, size) array to accumulate intermediate results is not an option.
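
One possible middle ground (my sketch, not from this thread) is a per-thread accumulator, so the buffer is bounded by the number of worker threads rather than by n_tries. This relies on numba.get_thread_id, which only exists in newer Numba releases, and whether it also avoids the leak reported here would need to be verified:

import numpy as np
from numba import njit, prange, get_num_threads, get_thread_id

@njit(parallel=True)
def run_iters_per_thread(n_tries, size):
    # one accumulator row per worker thread, not per iteration
    partial = np.zeros((get_num_threads(), size), dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        partial[get_thread_id(), :] += tmp  # each thread touches only its own row
    return partial.sum(axis=0)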

@stuartarchibald
Contributor

Thanks for the report. This is an MWR (minimal working reproducer):

import os
import numpy as np
from numba import njit, prange
import psutil

proc = psutil.Process(os.getpid())


def memusage(diff):
    print("mem: %.16f, diff %.16f" % (float(proc.memory_info().rss),
                                      float(diff)))


@njit(parallel=True)
def run_iters():
    n = 5000
    res = np.zeros(n)
    for i in prange(100):
        res += np.zeros(n)
    return res


if __name__ == "__main__":
    for i in range(20):
        smem = proc.memory_info().rss
        run_iters()
        emem = proc.memory_info().rss
        diff = emem - smem
        if i > 0:  # skip the first call, which includes JIT compilation
            memusage(diff)
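
To see whether the missing memory corresponds to Numba runtime (NRT) allocations that are never freed, the internal allocation counters can be compared across calls (a sketch; rtsys.get_allocation_stats is an internal API and may change between releases):

from numba.core.runtime import rtsys

run_iters()  # warm up / compile first
before = rtsys.get_allocation_stats()
run_iters()
after = rtsys.get_allocation_stats()
# a growing alloc - free gap per call indicates leaked NRT allocations
print("allocs not freed:", (after.alloc - after.free) - (before.alloc - before.free))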

CC @DrTodd13

@rsinda

rsinda commented Dec 1, 2021

I found that Numba does not seem to garbage collect either. I created an API and ran a function 3 to 5 times; it started at about 1 GB of RAM and eventually ended up consuming 8 GB.
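
To help narrow this down (my sketch): forcing Python's garbage collector and re-reading RSS shows whether the held memory is reclaimable Python objects at all; memory retained by Numba's runtime would not be expected to be released by gc.collect():

import gc
import os
import psutil

proc = psutil.Process(os.getpid())
before = proc.memory_info().rss
gc.collect()  # force a full collection of Python objects
after = proc.memory_info().rss
print("reclaimed by gc.collect(): %d bytes" % (before - after))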
