# Introduction to Memory Profiling

> Objectives:
> * Be introduced to memory profiling using different tools
> * Get a brief introduction to time profiling in IPython too


## ipython_memwatcher

Our recommended way to profile memory consumption for this tutorial will be [ipython_memwatcher](https://pypi.python.org/pypi/ipython_memwatcher):


In [1]:
from ipython_memwatcher import MemWatcher
mw = MemWatcher()
mw.start_watching_memory()

In [1] used 0.020 MiB RAM in 0.001s, peaked 0.000 MiB above current, total RAM usage 31.703 MiB


In [2]:
# Let's create a big object
a = [i for i in range(1000*1000)]

In [2] used 31.199 MiB RAM in 0.106s, peaked 0.996 MiB above current, total RAM usage 62.902 MiB


In [3]:
# Get some measurements from the last executed cell:
meas = mw.measurements
meas

Measurements(memory_delta=31.19921875, time_delta=0.10560011863708496, memory_peak=0.99609375, memory_usage=62.90234375)

In [3] used 0.062 MiB RAM in 0.009s, peaked 0.000 MiB above current, total RAM usage 62.965 MiB


In [4]:
# MemWatcher.measurements is a named tuple.  We can easily get info out of it:
meas.memory_delta

31.19921875

In [4] used 2.020 MiB RAM in 0.105s, peaked 0.000 MiB above current, total RAM usage 64.984 MiB


In [5]:
# This takes between 32 ~ 35 bytes per element:
meas.memory_delta * (2**20) / 1e6

32.714752

In [5] used -1.965 MiB RAM in 0.002s, peaked 1.965 MiB above current, total RAM usage 63.020 MiB
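
The per-element arithmetic in cell 5 can also be reproduced outside IPython. The sketch below uses a stand-in named tuple, defined here only for illustration (it is not imported from `ipython_memwatcher`), with the field names and values printed in cell 3. `n_elements` is a name introduced for this example.

```python
from collections import namedtuple

# Stand-in for the named tuple returned by MemWatcher.measurements;
# field names and values are copied from the output of cell 3.
Measurements = namedtuple(
    "Measurements",
    ["memory_delta", "time_delta", "memory_peak", "memory_usage"],
)
meas = Measurements(
    memory_delta=31.19921875,
    time_delta=0.10560011863708496,
    memory_peak=0.99609375,
    memory_usage=62.90234375,
)

n_elements = 1000 * 1000

# memory_delta is in MiB: convert to bytes, then divide by the element count.
bytes_per_element = meas.memory_delta * 2**20 / n_elements
print(bytes_per_element)
```

Fields of a named tuple can be read by name (`meas.memory_delta`) or by position (`meas[0]`).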


In [6]:
# What are these elements made from?
type(a[0])

int

In [6] used 0.000 MiB RAM in 0.012s, peaked 0.000 MiB above current, total RAM usage 63.020 MiB


In [7]:
# How much memory does an int take?
# Beware: the size below depends on whether you are using a 32-bit or a 64-bit Python
import sys
sys.getsizeof(a[0])

24

In [7] used 0.008 MiB RAM in 0.003s, peaked 0.000 MiB above current, total RAM usage 63.027 MiB


OK.  On a 64-bit Python, this means that the int object uses 8 bytes for the integer value and 16 bytes for the Python object metadata.  But 24 is noticeably less than 32~35.  Where does this overhead come from?

Well, it turns out that the list itself stores a pointer to every element (8 bytes each on a 64-bit platform), so each entry costs at least 24 + 8 = 32 bytes. [Explain with some diagrams]
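
A rough way to check this breakdown with only the standard library is sketched below. It assumes Python 3, where an int object is larger than the 24 bytes shown above, so the totals will not match the ones in this notebook. The names `slot_bytes` and `int_bytes` are introduced here for illustration.

```python
import struct
import sys

n = 1000 * 1000
a = [i for i in range(n)]

pointer_size = struct.calcsize("P")  # 8 on a 64-bit build, 4 on a 32-bit one

# sys.getsizeof(a) covers only the list structure (header + pointer array),
# not the int objects that the pointers refer to.
slot_bytes = (sys.getsizeof(a) - sys.getsizeof([])) / n

# Size of one typical element; the exact value depends on the Python version.
int_bytes = sys.getsizeof(a[-1])

print("pointer size:      ", pointer_size)
print("list bytes/element:", slot_bytes)
print("int object bytes:  ", int_bytes)
print("total per element: ", slot_bytes + int_bytes)
```

The list usually reserves a little more room than it strictly needs, so the bytes per element in the list structure tend to come out slightly above the pointer size.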

## memory_profiler

[memory_profiler](https://pypi.python.org/pypi/memory_profiler) is a basic module for memory profiling that many other tools build on (like the `ipython_memwatcher` above), and it interacts well with IPython, so it is worth seeing how it works:

In [8]:
%load_ext memory_profiler

In [8] used 0.004 MiB RAM in 0.001s, peaked 0.000 MiB above current, total RAM usage 63.031 MiB


In [9]:
# Use %memit magic command exposed by memory_profiler
%memit b = [i for i in range(1000*1000)]

peak memory: 134.00 MiB, increment: 70.97 MiB
In [9] used 71.121 MiB RAM in 0.330s, peaked 0.000 MiB above current, total RAM usage 134.152 MiB


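Please note that the `peak memory` reported by `%memit` is not the same quantity as the `peaked` value reported by the `ipython_memwatcher` package: `%memit` prints the absolute peak (134.00 MiB here), whereas `ipython_memwatcher` prints how far the peak rose above the memory in use at the end of the cell (0.000 MiB here).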
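
The difference between an absolute peak and a peak measured above the current usage can be illustrated with the standard library `tracemalloc` module. This is a different tool from both packages above: it only tracks memory allocated through Python's allocators, not the RAM usage of the whole process, so treat it as a sketch of the two definitions rather than a reproduction of either report.

```python
import tracemalloc

tracemalloc.start()

# Allocate a temporary list and release it again, so that the peak is
# well above what remains allocated afterwards.
tmp = [0] * (1000 * 1000)
del tmp

current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

peak_above_current = peak - current
print("absolute peak:     ", peak)
print("peak above current:", peak_above_current)
```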

## Guppy

Guppy gives a good overview of how much memory the different kinds of objects are using:

In [10]:
from guppy import hpy; hp=hpy()
hp.heap()

Partition of a set of 2128731 objects. Total size = 83076464 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 2002852  94 48068448  58  48068448  58 int
     1   1231   0 16443512  20  64511960  78 list
     2  58644   3  5120424   6  69632384  84 str
     3  31176   1  2673888   3  72306272  87 tuple
     4   1606   0  1696144   2  74002416  89 dict (no owner)
     5    463   0  1331560   2  75333976  91 dict of module
     6   8024   0  1027072   1  76361048  92 types.CodeType
     7      2   0   983968   1  77345016  93 guppy.heapy.heapyc.NodeGraph
     8   7865   0   943800   1  78288816  94 function
     9    941   0   847816   1  79136632  95 type
<572 more rows. Type e.g. '_.more' to view.>

In [10] used 68.035 MiB RAM in 3.065s, peaked 0.000 MiB above current, total RAM usage 202.188 MiB


In [11]:
del b
hp.heap()

Partition of a set of 1128950 objects. Total size = 50953432 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 1003107  89 24074568  47  24074568  47 int
     1   1222   0  8316400  16  32390968  64 list
     2  58646   5  5120600  10  37511568  74 str
     3  31177   3  2673976   5  40185544  79 tuple
     4   1609   0  1698520   3  41884064  82 dict (no owner)
     5    463   0  1331560   3  43215624  85 dict of module
     6   8024   1  1027072   2  44242696  87 types.CodeType
     7      2   0   983600   2  45226296  89 guppy.heapy.heapyc.NodeGraph
     8   7865   1   943800   2  46170096  91 function
     9    941   0   847816   2  47017912  92 type
<568 more rows. Type e.g. '_.more' to view.>

In [11] used 2.707 MiB RAM in 0.704s, peaked 0.000 MiB above current, total RAM usage 204.895 MiB


In [12]:
# Size of the list (beware, this does not include the contents!)
hp.iso(a)

Partition of a set of 1 object. Total size = 8126536 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100  8126536 100   8126536 100 list

In [12] used 0.027 MiB RAM in 0.014s, peaked 0.000 MiB above current, total RAM usage 204.922 MiB


## %time and %timeit

In [13]:
# IPython provides a magic command to see how much time a command takes to run
%time asum = sum(a)

CPU times: user 4 ms, sys: 0 ns, total: 4 ms
Wall time: 7.32 ms
In [13] used 0.012 MiB RAM in 0.009s, peaked 0.000 MiB above current, total RAM usage 204.934 MiB


Note that `%time` offers quite detailed statistics on the time spent.

Also, the time reported by `MemWatcher` typically carries an overhead of 1~5 ms over the time reported by `%time`, so when the times to measure are of that order it is better to rely on the `%time` (or `%timeit`, below) values.

In [14]:
# Another way to measure timings: %timeit runs the statement in several loops and reports the best time
%timeit bsum = sum(a)
# Note, however, that %timeit does not keep the result: bsum is not defined afterwards

100 loops, best of 3: 7.04 ms per loop
In [14] used 0.035 MiB RAM in 2.931s, peaked 0.000 MiB above current, total RAM usage 204.969 MiB


Interestingly, `%timeit` allows retrieving the measured times with the `-o` flag:

In [15]:
t = %timeit -o sum(a)
print(t.all_runs)
print(t.best)

100 loops, best of 3: 6.94 ms per loop
[0.7369999885559082, 0.6935329437255859, 0.6994690895080566]
0.00693532943726
In [15] used 0.008 MiB RAM in 2.984s, peaked 0.000 MiB above current, total RAM usage 204.977 MiB


And one can specify the number of loops (-n) and the number of repetitions (-r):

In [16]:
t = %timeit -r1 -n1 -o sum(a)
print(t.all_runs)
print(t.best)

1 loops, best of 1: 7.28 ms per loop
[0.007275104522705078]
0.00727510452271
In [16] used 0.000 MiB RAM in 0.026s, peaked 0.000 MiB above current, total RAM usage 204.977 MiB
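
Outside IPython, similar numbers can be collected with the standard library `timeit` module. The sketch below is not what `%timeit` runs internally; it only mirrors the meaning of `-n`, `-r`, `all_runs` and `best` from the cells above. The names `n_loops` and `n_repeats` are introduced for this example.

```python
import timeit

a = list(range(1000 * 1000))

n_loops = 10    # plays the role of -n
n_repeats = 3   # plays the role of -r

# Each entry is the total time in seconds for n_loops calls of sum(a).
all_runs = timeit.repeat(lambda: sum(a), repeat=n_repeats, number=n_loops)

# Best time per loop, comparable to t.best in the cells above.
best = min(all_runs) / n_loops
print(all_runs)
print(best)
```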


### Exercise 1

Given a dictionary like:

```
d = dict(("key: %i"%i, i*2) for i in a)
```

Try to estimate how much RAM it uses with the techniques introduced above.

Why do you think it takes more space than a list?

*Hint*: Every entry in a dictionary has pointers to two objects: key and value. 

### Solution

In [17]:
d = dict(("key: %i"%i, i*2) for i in a)

In [17] used 94.512 MiB RAM in 0.906s, peaked 0.000 MiB above current, total RAM usage 299.488 MiB


In [18]:
# Compute the size of key + value
sys.getsizeof("key: 100000") + sys.getsizeof(1)

72

In [18] used 0.004 MiB RAM in 0.002s, peaked 0.000 MiB above current, total RAM usage 299.492 MiB


In [19]:
# Compute the size of a Python object + pointers for all elements in MiB
(16 + 8 + 8) * 1000*1000 / 2**20.

30.517578125

In [19] used 0.000 MiB RAM in 0.012s, peaked 0.000 MiB above current, total RAM usage 299.492 MiB


In [20]:
# Using guppy
hp.iso(d)

Partition of a set of 1 object. Total size = 50331928 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1 100 50331928 100  50331928 100 dict (no owner)

In [20] used 0.000 MiB RAM in 0.588s, peaked 0.000 MiB above current, total RAM usage 299.492 MiB


So, guppy is telling us that the dictionary structure alone takes ~50 MB, whereas the contents (72 bytes per key/value pair, times one million entries) take ~70 MB, so we should have expected the dictionary to consume ~120 MB.  However, our `MemWatcher` instance reports just ~100 MB.  Takeaway lesson: measuring memory consumption in Python is tricky, but with the proper tools we can still get valuable hints!
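
The "structure plus contents" estimate can also be sketched with `sys.getsizeof` alone. The example assumes Python 3 and uses a smaller dictionary than the exercise to keep it fast, so its numbers will differ from those in this notebook. It also counts every key and value as a separate object, which can overestimate slightly when Python shares objects between entries.

```python
import sys

n = 100 * 1000  # smaller than in the exercise, to keep the sketch fast
d = dict(("key: %i" % i, i * 2) for i in range(n))

# Shallow size: the hash table only, without the keys and values.
structure_bytes = sys.getsizeof(d)

# Sum of the sizes of every key and value object.
contents_bytes = sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in d.items())

total_mib = (structure_bytes + contents_bytes) / 2**20
print("structure: %.1f MiB" % (structure_bytes / 2**20))
print("contents:  %.1f MiB" % (contents_bytes / 2**20))
print("estimate:  %.1f MiB" % total_mib)
```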