memory leak caused by race conditions in parallel loop #7028

Open
evfro opened this issue May 14, 2021 · 2 comments

@evfro

evfro commented May 14, 2021

Reporting a bug

  • [x] I have tried using the latest released version of Numba (the most recent is
    visible in the change log: https://github.com/numba/numba/blob/master/CHANGE_LOG).
  • [x] I have included a self-contained code sample to reproduce the problem,
    i.e. it's possible to run it as 'python bug.py'.

Simple reproducing example:

# bug.py

from time import sleep
import numpy as np
from numba import njit, prange

IS_PARALLEL = True  # set to False to compare with the serial version

@njit(parallel=IS_PARALLEL)
def run_iters(n_tries, size):
    res = np.zeros(size, dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        res += tmp  # accumulate a fresh temporary into the shared array on every iteration
    return res


if __name__ == "__main__":
    n_tries = 200
    size = 6000
    for _ in range(100):
        res = run_iters(n_tries, size)
        sleep(0.1)  # give the memory profiler time to take samples

When IS_PARALLEL is False, no memory leak occurs. To check this I ran mprof run --include-children python bug.py using the standard memory_profiler package. The profile exported with mprof plot --output memory.png is shown below:

[memory profile plot for IS_PARALLEL = False]

However, when IS_PARALLEL is True, I get a completely different picture:

[memory profile plot for IS_PARALLEL = True]
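
For a quick in-process check without the mprof CLI, memory_profiler's documented Python API can sample RSS around the same loop (a sketch; the main wrapper is my addition):

from memory_profiler import memory_usage

def main():
    for _ in range(100):
        run_iters(200, 6000)

# sample RSS (in MiB) every 0.1 s while main() runs, including child processes
samples = memory_usage((main, (), {}), interval=0.1, include_children=True)
print("start: %.1f MiB, peak: %.1f MiB" % (samples[0], max(samples)))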

The leak gets worse as the size of the arrays and the number of iterations in the loop increase. In my experiments with real code I observed several hundred gigabytes of RAM consumed by the process, where normally it requires less than a gigabyte.

Note that when the race conditions are resolved by accumulating temporary results per iteration, everything works normally again, i.e. with this modification:

@njit(parallel=True)
def run_iters(n_tries, size):
    res = np.zeros((n_tries, size), dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        res[n, :] += tmp  # each iteration writes only its own row, so there is no race
    return res.sum(axis=0)

I get the following memory profile:

[memory profile plot for the workaround version]

The problem with this workaround is that n_tries in my real experiments is very large, so allocating an (n_tries, size) array to accumulate intermediate results is not an option.
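
One possible middle ground (my sketch, not from this thread) is a per-thread accumulator, so the buffer is bounded by the number of worker threads rather than by n_tries. This relies on numba.get_thread_id, which only exists in newer Numba releases, and whether it also avoids the leak reported here would need to be verified:

import numpy as np
from numba import njit, prange, get_num_threads, get_thread_id

@njit(parallel=True)
def run_iters_per_thread(n_tries, size):
    # one accumulator row per worker thread, not per iteration
    partial = np.zeros((get_num_threads(), size), dtype=np.float64)
    for n in prange(n_tries):
        tmp = np.zeros(size, dtype=np.float64)
        partial[get_thread_id(), :] += tmp  # each thread touches only its own row
    return partial.sum(axis=0)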

@stuartarchibald
Contributor

Thanks for the report. This is an MWR (minimal working reproducer):

import os
import numpy as np
from numba import njit, prange
import psutil

proc = psutil.Process(os.getpid())


def memusage(diff):
    print("mem: %.16f, diff %.16f" % (float(proc.memory_info().rss),
                                      float(diff)))


@njit(parallel=True)
def run_iters():
    n = 5000
    res = np.zeros(n)
    for i in prange(100):
        res += np.zeros(n)
    return res


if __name__ == "__main__":
    for i in range(20):
        smem = proc.memory_info().rss
        run_iters()
        emem = proc.memory_info().rss
        diff = emem - smem
        if i > 0:  # skip the first call, which includes JIT compilation
            memusage(diff)
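
To see whether the missing memory corresponds to Numba runtime (NRT) allocations that are never freed, the internal allocation counters can be compared across calls (a sketch; rtsys.get_allocation_stats is an internal API and may change between releases):

from numba.core.runtime import rtsys

run_iters()  # warm up / compile first
before = rtsys.get_allocation_stats()
run_iters()
after = rtsys.get_allocation_stats()
# a growing alloc - free gap per call indicates leaked NRT allocations
print("allocs not freed:", (after.alloc - after.free) - (before.alloc - before.free))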

CC @DrTodd13

@rsinda

rsinda commented Dec 1, 2021

I found that Numba does not seem to garbage collect either. I created an API and ran a function 3 to 5 times; it started at about 1 GB of RAM and eventually ended up consuming 8 GB.
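
To help narrow this down (my sketch): forcing Python's garbage collector and re-reading RSS shows whether the held memory is reclaimable Python objects at all; memory retained by Numba's runtime would not be expected to be released by gc.collect():

import gc
import os
import psutil

proc = psutil.Process(os.getpid())
before = proc.memory_info().rss
gc.collect()  # force a full collection of Python objects
after = proc.memory_info().rss
print("reclaimed by gc.collect(): %d bytes" % (before - after))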
