Memory leak test: take fluctuations into account (#1757)
Preamble
=======

We have a [memory leak test suite](https://github.com/giampaolo/psutil/blob/e1ea2bccf8aea404dca0f79398f36f37217c45f6/psutil/tests/__init__.py#L897) which calls a function many times and fails if the process memory increased afterwards. We do this in order to detect missing `free()` or `Py_DECREF` calls in the C modules: if one is missing, we have a memory leak.
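
As a minimal illustration of the kind of measurement the suite relies on, here's a hedged sketch using psutil's public API (these are not the suite's actual helpers; the function under test is arbitrary):

```python
import psutil

proc = psutil.Process()  # the test process itself

def get_mem():
    # USS ("unique set size") is the closest thing to "real" per-process
    # memory usage, but it's not available everywhere; fall back to RSS.
    try:
        return proc.memory_full_info().uss
    except (AttributeError, psutil.AccessDenied):
        return proc.memory_info().rss

mem1 = get_mem()
for _ in range(1000):
    psutil.cpu_times()  # some C-implemented function under test
mem2 = get_mem()
print("memory diff after 1000 calls: %d bytes" % (mem2 - mem1))
```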

The problem
==========

A problem we've been having for probably over 10 years is false positives, because memory fluctuates. Sometimes it may increase (or even decrease!) due to how the OS handles memory, Python's garbage collector, the fact that RSS is an approximation, and who knows what else. So far we have tried to compensate for that with the following logic (sketched right after the list):
- warm up (call the function 10 times)
- call the function many times (1000)
- if memory increased between before and after those 1000 calls, keep calling the function for another 3 seconds
- if it still increased at all (> 0), fail
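
Roughly, that old approach corresponds to the sketch below (hypothetical names such as `old_check` and `get_mem`; the real code lived in `psutil/tests/__init__.py`):

```python
import time

def old_check(fun, get_mem, times=1000, warmup_times=10,
              tolerance=4096, retry_for=3.0):
    # Rough sketch of the pre-#1757 logic, not the literal implementation.
    for _ in range(warmup_times):   # warm up
        fun()
    mem1 = get_mem()
    for _ in range(times):          # call the function many times
        fun()
    if get_mem() - mem1 > tolerance:
        # Memory increased: keep calling fun() for a few more seconds
        # and fail if it increased again at all (> 0).
        mem1 = get_mem()
        stop_at = time.time() + retry_for
        while time.time() < stop_at:
            fun()
        if get_mem() - mem1 > 0:
            raise AssertionError("memory still increasing; possible leak")
```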

This logic didn't really solve the problem, as we still had occasional false positives, especially lately on FreeBSD. 

The solution
=========

This PR changes the internal algorithm so that, in case of failure (mem > 0 after calling `fun()` N times), the test is retried up to 5 times, increasing N (the number of repetitions) each time; it is considered a failure only if the memory **keeps increasing** between runs. So, for instance, here's a legitimate failure:

```
psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_disk_partitions ... 
Run #1: extra-mem=696.0K, per-call=3.5K, calls=200
Run #2: extra-mem=1.4M, per-call=3.5K, calls=400
Run #3: extra-mem=2.1M, per-call=3.5K, calls=600
Run #4: extra-mem=2.7M, per-call=3.5K, calls=800
Run #5: extra-mem=3.4M, per-call=3.5K, calls=1000
FAIL
```

If, on the other hand, the memory increased on one run (say, 200 calls) but decreased on the next run (say, 400 calls), then it's clearly a false positive: memory consumption may still be > 0 on the second run, but if it's lower than the previous run, which used fewer repetitions, then it cannot possibly represent a leak (it's just a fluctuation):

```
psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_net_connections ... 
Run #1: extra-mem=568.0K, per-call=2.8K, calls=200
Run #2: extra-mem=24.0K, per-call=61.4B, calls=400
OK
```
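
To make the retry logic concrete, here is a simplified, standalone rendition of what the new `TestMemoryLeak.execute()` does (hypothetical names; the real implementation is in the `psutil/tests/__init__.py` diff further down):

```python
def new_check(fun, call_ntimes, times=200, retries=5, tolerance=0):
    # call_ntimes(fun, n) is assumed to call fun() n times and return the
    # memory difference (in bytes) measured before and after.
    messages = []
    prev_mem = 0
    increase = times
    for idx in range(1, retries + 1):
        mem = call_ntimes(fun, times)
        messages.append("Run #%d: extra-mem=%d, calls=%d" % (idx, mem, times))
        if mem <= tolerance or mem < prev_mem:
            # Either no meaningful increase, or less memory than the
            # previous (shorter) run: a fluctuation, not a leak.
            return
        times += increase   # the next run performs more repetitions
        prev_mem = mem
    # Memory kept increasing across all runs: treat it as a leak.
    raise AssertionError(". ".join(messages))
```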

Note about mallinfo()
================

Aka #1275. `mallinfo()` on Linux is supposed to report how many bytes get allocated on the heap by `malloc()`, so it should be much more precise than RSS and also [USS](http://grodola.blogspot.com/2016/02/psutil-4-real-process-memory-and-environ.html). In another branch where I exposed it, I verified that fluctuations still occur even when using `mallinfo()`, though less often. So even `mallinfo()` would not guarantee 100% stability.
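
For reference, `mallinfo()` can be queried from Python via ctypes on glibc systems; this is a sketch under the assumption of the classic glibc `struct mallinfo` layout, not code that ships with psutil:

```python
import ctypes
import ctypes.util

class Mallinfo(ctypes.Structure):
    # Classic glibc `struct mallinfo`: ten plain int fields.
    _fields_ = [(name, ctypes.c_int) for name in (
        "arena", "ordblks", "smblks", "hblks", "hblkhd",
        "usmblks", "fsmblks", "uordblks", "fordblks", "keepcost")]

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.mallinfo.restype = Mallinfo

info = libc.mallinfo()
# uordblks = total bytes currently allocated with malloc().
print("heap bytes in use:", info.uordblks)
```
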
giampaolo committed May 12, 2020
1 parent e903726 commit 6adcca6
Showing 10 changed files with 127 additions and 206 deletions.
5 changes: 5 additions & 0 deletions .flake8
@@ -3,6 +3,11 @@
# T001 = print() statement

[flake8]
ignore =
# ambiguous variable name 'l'
E741,
# line break after binary operator
W504
per-file-ignores =
setup.py:T001
scripts/*:T001
1 change: 1 addition & 0 deletions HISTORY.rst
@@ -24,6 +24,7 @@ XXXX-XX-XX
>>> proc
psutil.Process(pid=12739, name='python3', status='terminated',
exitcode=<Negsigs.SIGTERM: -15>, started='15:08:20')
- 1757_: memory leak tests are now stable.

**Bug fixes**

1 change: 1 addition & 0 deletions appveyor.yml
@@ -69,6 +69,7 @@ build: off

test_script:
- "%WITH_COMPILER% %PYTHON%/python.exe scripts/internal/winmake.py test"
- "%WITH_COMPILER% %PYTHON%/python.exe scripts/internal/winmake.py test-memleaks"

after_test:
- "%WITH_COMPILER% %PYTHON%/python.exe scripts/internal/winmake.py wheel"
24 changes: 12 additions & 12 deletions docs/conf.py
@@ -267,21 +267,21 @@ def get_version():
# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',

# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',

# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',

# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
17 changes: 1 addition & 16 deletions psutil/_compat.py
@@ -8,7 +8,6 @@
"""

import collections
import contextlib
import errno
import functools
import os
@@ -23,7 +22,7 @@
# literals
"u", "b",
# collections module
"lru_cache", "redirect_stderr",
"lru_cache",
# shutil module
"which", "get_terminal_size",
# python 3 exceptions
@@ -423,17 +422,3 @@ def get_terminal_size(fallback=(80, 24)):
return (res[1], res[0])
except Exception:
return fallback


# python 3.4
try:
from contextlib import redirect_stderr
except ImportError:
@contextlib.contextmanager
def redirect_stderr(target):
original = sys.stderr
try:
sys.stderr = target
yield
finally:
sys.stderr = original
64 changes: 28 additions & 36 deletions psutil/arch/windows/net.c
@@ -16,42 +16,35 @@


static PIP_ADAPTER_ADDRESSES
psutil_get_nic_addresses() {
// Note: GetAdaptersAddresses() increase the handle count on first
// call (and not anymore later on).
// allocate a 15 KB buffer to start with
int outBufLen = 15000;
DWORD dwRetVal = 0;
ULONG attempts = 0;
PIP_ADAPTER_ADDRESSES pAddresses = NULL;

do {
pAddresses = (IP_ADAPTER_ADDRESSES *) malloc(outBufLen);
if (pAddresses == NULL) {
PyErr_NoMemory();
return NULL;
}

dwRetVal = GetAdaptersAddresses(AF_UNSPEC, 0, NULL, pAddresses,
&outBufLen);
if (dwRetVal == ERROR_BUFFER_OVERFLOW) {
free(pAddresses);
pAddresses = NULL;
}
else {
break;
}

attempts++;
} while ((dwRetVal == ERROR_BUFFER_OVERFLOW) && (attempts < 3));
psutil_get_nic_addresses(void) {
ULONG bufferLength = 0;
PIP_ADAPTER_ADDRESSES buffer;

if (GetAdaptersAddresses(AF_UNSPEC, 0, NULL, NULL, &bufferLength)
!= ERROR_BUFFER_OVERFLOW)
{
PyErr_SetString(PyExc_RuntimeError,
"GetAdaptersAddresses() syscall failed.");
return NULL;
}

if (dwRetVal != NO_ERROR) {
PyErr_SetString(
PyExc_RuntimeError, "GetAdaptersAddresses() syscall failed.");
buffer = malloc(bufferLength);
if (buffer == NULL) {
PyErr_NoMemory();
return NULL;
}
memset(buffer, 0, bufferLength);

if (GetAdaptersAddresses(AF_UNSPEC, 0, NULL, buffer, &bufferLength)
!= ERROR_SUCCESS)
{
free(buffer);
PyErr_SetString(PyExc_RuntimeError,
"GetAdaptersAddresses() syscall failed.");
return NULL;
}

return pAddresses;
return buffer;
}


@@ -191,11 +184,11 @@ psutil_net_if_addrs(PyObject *self, PyObject *args) {
for (i = 0; i < (int) pCurrAddresses->PhysicalAddressLength; i++) {
if (i == (pCurrAddresses->PhysicalAddressLength - 1)) {
sprintf_s(ptr, _countof(buff_macaddr), "%.2X\n",
(int)pCurrAddresses->PhysicalAddress[i]);
(int)pCurrAddresses->PhysicalAddress[i]);
}
else {
sprintf_s(ptr, _countof(buff_macaddr), "%.2X-",
(int)pCurrAddresses->PhysicalAddress[i]);
(int)pCurrAddresses->PhysicalAddress[i]);
}
ptr += 3;
}
@@ -325,8 +318,7 @@ psutil_net_if_addrs(PyObject *self, PyObject *args) {

/*
* Provides stats about NIC interfaces installed on the system.
* TODO: get 'duplex' (currently it's hard coded to '2', aka
'full duplex')
* TODO: get 'duplex' (currently it's hard coded to '2', aka 'full duplex')
*/
PyObject *
psutil_net_if_stats(PyObject *self, PyObject *args) {
135 changes: 59 additions & 76 deletions psutil/tests/__init__.py
@@ -476,12 +476,16 @@ def terminate(proc_or_pid, sig=signal.SIGTERM, wait_timeout=GLOBAL_TIMEOUT):
from psutil._psposix import wait_pid

def wait(proc, timeout):
try:
return psutil.Process(proc.pid).wait(timeout)
except psutil.NoSuchProcess:
# Needed to kill zombies.
if POSIX:
return wait_pid(proc.pid, timeout)
if isinstance(proc, subprocess.Popen) and not PY3:
proc.wait()
else:
proc.wait(timeout)
if WINDOWS and isinstance(proc, subprocess.Popen):
# Otherwise PID may still hang around.
try:
return psutil.Process(proc.pid).wait(timeout)
except psutil.NoSuchProcess:
pass

def sendsig(proc, sig):
# If the process received SIGSTOP, SIGCONT is necessary first,
@@ -899,22 +903,15 @@ class TestMemoryLeak(PsutilTestCase):
typically functions implemented in C which forgot to free() memory
from the heap. It does so by checking whether the process memory
usage increased before and after calling the function many times.
The logic:
call_fun_n_times()
if mem_diff > tolerance:
call_fun_for_3_secs()
if mem_diff > 0:
return 1 # failure
return 0 # success
Note that this is hard (probably impossible) to do reliably, due
to how the OS handles memory, the GC and so on (memory can even
decrease!). In order to avoid false positives you should adjust the
tolerance of each individual test case, but most of the times you
won't have to.
decrease!). In order to avoid false positives, in case of failure
(mem > 0) we retry the test for up to 5 times, increasing call
repetitions each time. If the memory keeps increasing then it's a
failure.
If available (Linux, OSX, Windows) USS memory is used for comparison,
If available (Linux, OSX, Windows), USS memory is used for comparison,
since it's supposed to be more precise, see:
http://grodola.blogspot.com/2016/02/psutil-4-real-process-memory-and-environ.html
If not, RSS memory is used. mallinfo() on Linux and _heapwalk() on
@@ -932,15 +929,15 @@ def test_fun(self):
self.execute(some_function)
"""
# Configurable class attrs.
times = 1000
times = 200
warmup_times = 10
tolerance = 4096 # memory
retry_for = 3.0 # seconds
retries = 5
tolerance = 0 # memory
verbose = True

def setUp(self):
self._thisproc = psutil.Process()
gc.collect()
@classmethod
def setUpClass(cls):
cls._thisproc = psutil.Process()

def _get_mem(self):
# USS is the closest thing we have to "real" memory usage and it
@@ -957,77 +954,63 @@ def _get_fds_or_handles(self):
def _call(self, fun):
return fun()

def _itercall(self, fun, iterator):
def _call_ntimes(self, fun, times):
"""Get 2 distinct memory samples, before and after having
called fun repeatedly, and return the memory difference.
"""
ncalls = 0
gc.collect()
gc.collect(generation=1)
mem1 = self._get_mem()
for x in iterator:
for x in range(times):
ret = self._call(fun)
ncalls += 1
del x, ret
gc.collect()
gc.collect(generation=1)
mem2 = self._get_mem()
self.assertEqual(gc.garbage, [])
diff = mem2 - mem1
if diff < 0:
self._log("negative memory diff -%s" % (bytes2human(abs(diff))))
return (diff, ncalls)

def _call_ntimes(self, fun, times):
return self._itercall(fun, range(times))[0]

def _call_for(self, fun, secs):
def iterator(secs):
stop_at = time.time() + secs
while time.time() < stop_at:
yield
return self._itercall(fun, iterator(secs))
diff = mem2 - mem1 # can also be negative
return diff

def _log(self, msg):
if self.verbose:
print_color(msg, color="yellow", file=sys.stderr)

def execute(self, fun, times=times, warmup_times=warmup_times,
tolerance=tolerance, retry_for=retry_for):
def execute(self, fun, times=None, warmup_times=None, retries=None,
tolerance=None):
"""Test a callable."""
if times <= 0:
raise ValueError("times must be > 0")
if warmup_times < 0:
raise ValueError("warmup_times must be >= 0")
if tolerance is not None and tolerance < 0:
raise ValueError("tolerance must be >= 0")
if retry_for is not None and retry_for < 0:
raise ValueError("retry_for must be >= 0")
times = times if times is not None else self.times
warmup_times = warmup_times if warmup_times is not None \
else self.warmup_times
retries = retries if retries is not None else self.retries
tolerance = tolerance if tolerance is not None else self.tolerance
try:
assert times >= 1, "times must be >= 1"
assert warmup_times >= 0, "warmup_times must be >= 0"
assert retries >= 0, "retries must be >= 0"
assert tolerance >= 0, "tolerance must be >= 0"
except AssertionError as err:
raise ValueError(str(err))

# warm up
self._call_ntimes(fun, warmup_times)
mem1 = self._call_ntimes(fun, times)

if mem1 > tolerance:
# This doesn't necessarily mean we have a leak yet.
# At this point we assume that after having called the
# function so many times the memory usage is stabilized
# and if there are no leaks it should not increase
# anymore. Let's keep calling fun for N more seconds and
# fail if we notice any difference (ignore tolerance).
msg = "+%s after %s calls; try calling fun for another %s secs" % (
bytes2human(mem1), times, retry_for)
if not retry_for:
raise self.fail(msg)
messages = []
prev_mem = 0
increase = times
for idx in range(1, retries + 1):
mem = self._call_ntimes(fun, times)
msg = "Run #%s: extra-mem=%s, per-call=%s, calls=%s" % (
idx, bytes2human(mem), bytes2human(mem / times), times)
messages.append(msg)
success = mem <= tolerance or mem < prev_mem
if success:
if idx > 1:
self._log(msg)
return
else:
if idx == 1:
print() # NOQA
self._log(msg)

mem2, ncalls = self._call_for(fun, retry_for)
if mem2 > mem1:
# failure
msg = "+%s memory increase after %s calls; " % (
bytes2human(mem1), times)
msg += "+%s after another %s calls over %s secs" % (
bytes2human(mem2), ncalls, retry_for)
raise self.fail(msg)
times += increase
prev_mem = mem
raise self.fail(". ".join(messages))

def execute_w_exc(self, exc, fun, **kwargs):
"""Convenience method to test a callable while making sure it
