# Memory profiling

In [1]:
import sys 
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline

## Size of individual objects

###  `sys.getsizeof()`

`sys.getsizeof()` provides a simple way to estimate the size of an object in bytes. For example, create a `list` of the first ten cubes:

In [2]:
a = [x**3 for x in range(10)]
a

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

Determine the size of `a`:

In [3]:
size_a = sys.getsizeof(a)
print(f'Size of a: {size_a} bytes') 

Size of a: 192 bytes


`sys.getsizeof()` works well for simple Python objects. It reports only the memory attributed to the object itself, not to the objects it references, so it underestimates the footprint of containers and other complex objects.
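
To see the difference, compare the shallow size of a nested list with a recursive estimate. The helper `deep_getsizeof` below is a hypothetical name, not a standard-library function. It is a minimal sketch that only descends into the built-in containers and counts each object once:

```python
import sys

def deep_getsizeof(obj, seen=None):
    """Rough recursive size estimate (hypothetical helper, not in the stdlib)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # shallow size of this object
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size

# A list of three lists, each holding the first ten cubes
nested = [[x**3 for x in range(10)] for _ in range(3)]
shallow = sys.getsizeof(nested)
deep = deep_getsizeof(nested)
print(f'Shallow: {shallow} bytes, deep estimate: {deep} bytes')
```

The exact numbers depend on the Python version and platform, but the deep estimate is always larger because it includes the inner lists and their integers.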

### NumPy 
NumPy arrays report the size of their data through the `nbytes` attribute. Use this rather than `sys.getsizeof()`:

In [4]:
a = np.arange(10)**3
a

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])

In [5]:
a.nbytes

80
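
The value of `nbytes` is the number of elements multiplied by the size of one element, and it does not include the overhead of the array object itself. The sketch below makes that arithmetic explicit. The `dtype` is fixed to `np.int64` here as an assumption, so that the result does not depend on the platform's default integer type:

```python
import numpy as np

# dtype fixed for illustration; the default integer type is platform dependent
a = np.arange(10, dtype=np.int64)**3

print(a.size)                # number of elements
print(a.itemsize)            # bytes per element
print(a.size * a.itemsize)   # same value as a.nbytes
print(a.nbytes)
```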

## Memory profiler 

Much like `line_profiler`, `memory_profiler` examines code line by line and displays memory usage. Unfortunately, there is no convenient magic for doing this on code defined in notebook cells.

It does provide a useful `%memit` magic, though. Load the extension with:

In [6]:
%load_ext memory_profiler

Run without a statement, the `%memit` magic shows how much memory the notebook process has used so far:

In [7]:
%memit

peak memory: 71.61 MiB, increment: 0.08 MiB


In [8]:
# (run this cell twice and watch the output...)
%memit b = np.random.randn(1000,1000)

peak memory: 86.91 MiB, increment: 15.29 MiB


To show how `memory_profiler` works we first create a file with code to be profiled. Decorate the function(s) of interest with `@profile`:

In [9]:
%%file tmp2.py

import numpy as np
from scipy.stats import spearmanr

@profile
def spearman():
    data = np.random.randn(1000,10000)
    correlation = spearmanr(data, axis=1).correlation
    return correlation


if __name__ == "__main__":
    spearman()

Overwriting tmp2.py
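
Note that `tmp2.py` never imports `profile`: the name is made available when the script is run through `python -m memory_profiler`. Run directly with `python tmp2.py`, the script would stop with a `NameError`. One common workaround, sketched here as an optional addition rather than something `memory_profiler` requires, is a no-op fallback decorator. The function `work` is a hypothetical stand-in for the code being profiled:

```python
# Fall back to a do-nothing decorator when the script is not run
# under memory_profiler (where `profile` is provided automatically).
try:
    profile
except NameError:
    def profile(func):
        return func

@profile
def work():
    # Hypothetical workload: sum of the first ten cubes
    data = [x**3 for x in range(10)]
    return sum(data)

if __name__ == "__main__":
    print(work())
```

With this guard the same file runs both with and without the profiler.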


In [10]:
!python -m memory_profiler tmp2.py

Filename: tmp2.py

Line #    Mem usage    Increment   Line Contents
     5   55.848 MiB   55.848 MiB   @profile
     6                             def spearman():
     7  132.164 MiB   76.316 MiB       data = np.random.randn(1000,10000)
     8  367.398 MiB  235.234 MiB       correlation = spearmanr(data, axis=1).correlation
     9  367.398 MiB    0.000 MiB       return correlation
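
As a sanity check, the increment reported for line 7 can be estimated by hand. The calculation below assumes that `np.random.randn` returns 8-byte `float64` values, NumPy's default floating-point type:

```python
rows, cols = 1000, 10000
bytes_per_value = 8                      # float64 (assumed)
size_mib = rows * cols * bytes_per_value / 2**20   # 1 MiB = 2**20 bytes

print(f'Expected size of data: {size_mib:.3f} MiB')
```

The result, roughly 76.3 MiB, agrees well with the 76.316 MiB increment in the report.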




## mprof

The `mprof` command is part of `memory_profiler`. It records memory consumption over time and also works with non-Python programs. Create a file to profile:

In [16]:
%%file tmp3.py

import numpy as np

@profile
def f1():
    data = np.random.randn(1000,10000)
    return data

@profile
def f2():
    data = np.random.randn(1000,10000)
    return data
    
if __name__ == "__main__":
    a = f1()
    b = f2()

Overwriting tmp3.py


Use `mprof run` to execute and profile the script:

In [17]:
!mprof run tmp3.py

mprof: Sampling memory every 0.1s
running as a Python program...


Use `mprof plot` to generate a plot:

In [18]:
!mprof plot --output mem.png

Using last profile data.


![title](mem.png)

`mprof` writes its data to a text file named `mprofile_<timestamp>.dat`:

In [19]:
!ls mprofile_*

mprofile_20180321120435.dat mprofile_20180321120525.dat


In [20]:
!cat mprofile_20180321120525.dat

FUNC __main__.f1 41.6680 1521633926.1026 117.9922 1521633926.4258
FUNC __main__.f2 117.9922 1521633926.4259 194.2891 1521633926.7415
CMDLINE /Users/sthorn/anaconda_envs/profiling/bin/python tmp3.py
MEM 0.023438 1521633925.6028
MEM 12.722656 1521633925.7077
MEM 19.796875 1521633925.8110
MEM 28.984375 1521633925.9132
MEM 32.992188 1521633926.0176
MEM 46.675781 1521633926.1229
MEM 72.023438 1521633926.2267
MEM 95.410156 1521633926.3283
MEM 119.859375 1521633926.4335
MEM 145.738281 1521633926.5387
MEM 170.332031 1521633926.6439
MEM 194.304688 1521633926.7491
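
Because the file is plain text, it is easy to post-process without `mprof plot`. The sketch below parses the `MEM` lines, assuming each one has the form `MEM <usage in MiB> <Unix timestamp>` as in the output above. `parse_mem_lines` is a hypothetical helper, and `sample` is a shortened excerpt of that output:

```python
# Shortened excerpt of the .dat file shown above
sample = """\
CMDLINE /Users/sthorn/anaconda_envs/profiling/bin/python tmp3.py
MEM 0.023438 1521633925.6028
MEM 46.675781 1521633926.1229
MEM 119.859375 1521633926.4335
MEM 194.304688 1521633926.7491
"""

def parse_mem_lines(text):
    """Return (seconds since first sample, memory in MiB) pairs."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Skip FUNC, CMDLINE and blank lines
        if fields and fields[0] == 'MEM':
            samples.append((float(fields[2]), float(fields[1])))
    if not samples:
        return []
    t0 = samples[0][0]
    return [(t - t0, mem) for t, mem in samples]

series = parse_mem_lines(sample)
peak = max(mem for _, mem in series)
print(f'{len(series)} samples, peak {peak} MiB')
```

To process a real profile, read the file's contents and pass them to `parse_mem_lines` in place of `sample`.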
