
memory profiling

Oliver Beckstein edited this page Jun 12, 2019 · 3 revisions

As a developer or user you sometimes want to find out where your code consumes memory. Or you suspect that the code "leaks memory", i.e., memory is not marked as available even after the data structures that occupy it are no longer needed. In Python there are a number of so-called memory profilers available. Here we introduce memory-profiler.

Installation

See the memory-profiler home page. Install either from conda-forge

conda install -c conda-forge memory_profiler

or via pip

pip install -U memory_profiler

Quick-start

Collecting data

In short: Decorate functions with @profile and run your code with memory_profiler. Two convenient ways to do this:

  1. load the module during execution
python -m memory_profiler example.py
  2. use the included mprof script and run
mprof run example.py

The first command prints a line-by-line summary similar to line_profiler; mprof run records memory usage over time for later plotting.

For other ways to run see memory-profiler.
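As a minimal sketch, a decorated example.py might look like the following. The function name and list sizes are made up for illustration. memory_profiler provides profile as a builtin when the script is run through python -m memory_profiler or mprof run; the fallback at the top is only there so that the same file also runs with plain python.

```python
# example.py: hypothetical script for illustration
try:
    profile  # available as a builtin when run through memory_profiler
except NameError:
    def profile(func):
        """No-op stand-in so the script also runs with plain `python`."""
        return func


@profile
def build_lists(n=100_000):
    a = [0] * n            # allocate one list
    b = [1] * (2 * n)      # allocate a second, larger list
    total = len(a) + len(b)
    del b                  # drop the larger list again
    return total


if __name__ == "__main__":
    print(build_lists())
```

In the line-by-line report, the lines that allocate a and b should show an increase in memory and the del b line a decrease, although the exact numbers depend on the platform.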

Plot

You can also generate plots of the time series of memory usage with

mprof run example.py
mprof plot

Example: Does the MemoryReader leak memory?

I tested the memory reader (in 0.19.2 on Python 3.6 on macOS) to see if there were some obvious memory leaks.

Methods

I used the script xtc_vs_memreader.py. The script loads the NhaA equilibrium dataset, which is about 1 GB in size (as compressed XTC). It first runs a simple analysis three times with the normal XTCReader. It then runs the same analysis another three times but first loads the trajectory into memory. (I load the whole trajectory into memory to stress the memory and then analyze only every 10th frame; for performance one would instead read every 10th frame into memory and then analyze all frames.) This was run on a laptop with 8 GB of RAM and an SSD.

I used memory_profiler to obtain memory data:

mprof run xtc_vs_memreader.py

and

mprof plot

Results

One can look at the line-by-line report, but the time course is more interesting. The first trace was recorded with del u; gc.collect() enabled. The memory goes up when transfer_to_memory happens (in the run() function) and then goes down when I manually garbage collect.

Time trace of memory consumption with manual garbage collection.
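The effect of the manual clean-up step can be illustrated without any trajectory data. The following is a minimal sketch using only the standard library. FakeUniverse and FakeTrajectory are hypothetical stand-ins, not MDAnalysis classes, and the reference cycle between them is an assumption made for the demonstration. In CPython an object that is part of a reference cycle is not freed by del alone; it is reclaimed when the cyclic garbage collector runs, which gc.collect() forces.

```python
import gc
import weakref


class FakeTrajectory:
    """Hypothetical stand-in for a trajectory held in memory."""

    def __init__(self, n_bytes):
        self.data = bytearray(n_bytes)   # the "coordinates"
        self.owner = None


class FakeUniverse:
    """Hypothetical stand-in for an object that owns a trajectory."""

    def __init__(self, n_bytes):
        self.trajectory = FakeTrajectory(n_bytes)
        self.trajectory.owner = self     # reference cycle (assumed for the demo)


def release_demo(n_bytes=10_000_000):
    """Return (alive after del, alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()                     # keep the automatic collector out of the way
    try:
        u = FakeUniverse(n_bytes)
        alive = weakref.ref(u)       # lets us check whether the object still exists
        del u                        # drops our reference; the cycle keeps it alive
        alive_after_del = alive() is not None
        gc.collect()                 # force the cyclic collector to run
        alive_after_collect = alive() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, alive_after_collect


if __name__ == "__main__":
    print(release_demo())
```

Without the explicit gc.collect() the object would only be reclaimed whenever the automatic collector next runs, so the release shows up later in a time trace.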

The second trace, with automatic garbage collection, shows that Python releases memory when it wants to. I don't understand why the maximum total memory is a bit different.

Time trace of memory consumption with Python's automatic garbage collection.

In both cases, however, it looks as if it releases (almost) all memory. I assume that the very small difference at the end has more to do with the sampling interval of 0.1 s than with a memory leak.
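The size of that difference could also be read from the recorded data instead of the plot. The sketch below assumes that the data file written by mprof run contains one record per sample of the form "MEM <memory in MiB> <timestamp>"; this layout is an assumption that should be checked against the actual file. The function names are hypothetical and the sample values are made up, not taken from the run described here.

```python
def read_mem_samples(lines):
    """Return (timestamps, memory_in_MiB) from the lines of an mprof data file."""
    times, mem = [], []
    for line in lines:
        fields = line.split()
        # Assumed record layout: "MEM <memory in MiB> <timestamp>"
        if len(fields) == 3 and fields[0] == "MEM":
            mem.append(float(fields[1]))
            times.append(float(fields[2]))
    return times, mem


def residual_memory(mem):
    """Difference between the last and the first memory sample."""
    return mem[-1] - mem[0]


# Made-up sample data for illustration only
sample = """CMDLINE python xtc_vs_memreader.py
MEM 50.0 0.1
MEM 1200.5 0.2
MEM 52.5 0.3
""".splitlines()

times, mem = read_mem_samples(sample)
print(max(mem), residual_memory(mem))
```

For a real run one would pass an open file object for the data file in the working directory instead of the sample lines.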

My conclusion from the simple test is that there's not a massive memory leak in the MemoryReader.
