# Introduction to Memory Profiling

> Objectives:
> * Get introduced to memory profiling with several different tools
> * Get a brief introduction to time profiling in IPython


## ipython_memwatcher

Our recommended way to profile memory consumption for this tutorial will be [ipython_memwatcher](https://pypi.python.org/pypi/ipython_memwatcher):


In [None]:
from ipython_memwatcher import MemWatcher
mw = MemWatcher()
mw.start_watching_memory()

In [None]:
# Let's create a big object
a = [i for i in range(1000*1000)]

In [None]:
# Get some measurements from the last executed cell:
meas = mw.measurements
meas

In [None]:
# MemWatcher.measurements is a named tuple.  We can easily get info out of it:
meas.memory_delta

In [None]:
# This takes between 32 and 35 bytes per element (memory_delta is in MiB):
meas.memory_delta * (2**20) / 1e6

In [None]:
# What are these elements made from?
type(a[0])

In [None]:
# How much memory does an int take?
# Beware: the size below will depend on whether you are using a 32-bit or 64-bit Python
import sys
sys.getsizeof(a[0])

But 24 is considerably less than 32~35.  Where does this overhead come from?

## objgraph

In [None]:
# Let's introduce the objgraph package and look at the structure of a short list
import objgraph
from IPython.display import Image

b = [1, 2, 3]
objgraph.show_refs([b], filename='simple-list.png')
Image('simple-list.png')

So, a list is a structure that stores one pointer (8 bytes on 64-bit platforms) for every element it holds.  If we add this to the 24 bytes per int, we get 32 bytes per element, which is close to the 32~35 bytes computed above.  The remaining difference is probably due to how Python handles memory internally (over-allocation).
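The arithmetic above can be cross-checked with the standard library alone.  The following is a minimal sketch, not part of the original tutorial: it assumes CPython, uses `struct.calcsize("P")` to obtain the pointer size, and the exact numbers will vary with the Python version and platform.

```python
import struct
import sys

n = 1000 * 1000
a = [i for i in range(n)]

# Size of a C pointer on this platform (8 on 64-bit, 4 on 32-bit)
pointer_size = struct.calcsize("P")

# sys.getsizeof(a) covers the list header and its array of pointers,
# but not the int objects those pointers refer to
list_bytes_per_element = sys.getsizeof(a) / n

# Size of a single int object (taken from the end of the list)
int_bytes = sys.getsizeof(a[-1])

# Rough per-element estimate: one pointer slot plus one int object
estimate = list_bytes_per_element + int_bytes

print("pointer size:", pointer_size)
print("list bytes per element:", list_bytes_per_element)
print("int bytes:", int_bytes)
print("estimated bytes per element:", estimate)
```

If the list bytes per element come out slightly above the pointer size, that is consistent with the over-allocation mentioned above.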

## memory_profiler

[memory_profiler](https://pypi.python.org/pypi/memory_profiler) is a basic module for memory profiling that many others build on (like the `ipython_memwatcher` above), and it interacts well with IPython, so it is worth seeing how it works:

In [None]:
%load_ext memory_profiler

In [None]:
# Use the %memit magic command exposed by memory_profiler
%memit b = [i for i in range(1000*1000)]

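Please note that the `peak_memory` in this case is different from the `peaked_memory` reported by the ipython_memwatcher package: `%memit` reports the peak memory of the whole process, whereas ipython_memwatcher reports how far memory peaked above the current usage.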

## Guppy

Guppy gives a nice overview of how different structures are using our memory:

In [None]:
from guppy import hpy
hp = hpy()
hp.heap()

In [None]:
# Size of the list (beware, this does not include the contents!)
hp.iso(a)
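
To illustrate why the container size does not include the contents, here is a small sketch that uses `sys.getsizeof` from the standard library instead of Guppy.  The helper `size_with_contents` is a hypothetical name introduced only for this example, and it only looks one level deep.

```python
import sys

# Two lists of the same length with very different contents
ints = [0] * 1000
strings = ["x" * 1000] * 1000

# The shallow size only accounts for the list header and its pointers
shallow_ints = sys.getsizeof(ints)
shallow_strings = sys.getsizeof(strings)

def size_with_contents(lst):
    """Shallow size of the list plus the size of each distinct element."""
    # Count every distinct object once, even if the list refers to it many times
    unique = {id(x): x for x in lst}
    return sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in unique.values())

print(shallow_ints, shallow_strings)
print(size_with_contents(ints), size_with_contents(strings))
```

Both lists report the same shallow size, because each one just holds 1000 pointers; the difference only shows up once the referenced objects are counted as well.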

## %time and %timeit

In [None]:
# IPython provides a magic command to see how much time a command takes to run
%time asum = sum(a)

Note that `%time` offers quite detailed statistics on the time spent.

Also, the time reported by MemWatcher has a typical overhead of 3~5 ms over the time reported by %time, so when the times to measure are of this order it is better to rely on the %time (or %timeit below) values.

In [None]:
# Another way to measure timings is to run several loops and take the mean
%timeit bsum = sum(a)

In [None]:
# Note, however, that %timeit does not keep variables assigned in the timed
# statement, so `bsum` is not defined here
bsum

Interestingly, %timeit lets us retrieve the measured times with the -o flag:

In [None]:
t = %timeit -o sum(a)
print(t.all_runs)
print(t.best)

And one can specify the number of loops (-n) and the number of repetitions (-r):

In [None]:
t = %timeit -r1 -n1 -o sum(a)
print(t.all_runs)
print(t.best)
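
Outside IPython, a similar measurement can be sketched with the standard `timeit` module.  This is an illustrative approximation rather than an exact reproduction of what `%timeit` does: here `number` plays the role of `-n`, `repeat` plays the role of `-r`, and the values 5 and 3 are arbitrary choices for the example.

```python
import timeit

a = list(range(1000 * 1000))

number = 5   # loops per repetition (like -n)
repeat = 3   # number of repetitions (like -r)

# Each entry is the total time, in seconds, for `number` loops
all_runs = timeit.repeat("sum(a)", globals={"a": a}, number=number, repeat=repeat)

# Best time per single loop
best = min(all_runs) / number

print(all_runs)
print(best)
```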

### Exercise 1

Given a dictionary like:

```
d = dict(("key: %i"%i, i*2) for i in a)
```

Try to guess how much RAM it uses.

Why do you think it takes more space than a list?

*Hint*: Use the `objgraph` package on a short dictionary to see the data structure more clearly.  If you cannot get `objgraph` to work, note that every entry in a dictionary holds pointers to two objects: the key and the value.