#Timing

Now that we are sure our program is correct, we can look at performance. Given two different algorithms that solve the same problem and produce the same output, we can say that one performs better than the other by comparing their processing times.

The `time.time()` function gives us a time mark. It is straightforward to measure a function's processing time by taking a time mark before and after the call and subtracting one from the other.

In [1]:
import sieveExample
import time
t1 = time.time()
result = sieveExample.sieveOfEratosthenes(100000)
t2 = time.time()
print('sieveOfEratosthenes took {} seconds'.format(t2 - t1))

ImportError: No module named sieveExample
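
The `sieveExample` module is not shown here, so the following is a self-contained sketch of the same before-and-after pattern. `simple_sieve` is a hypothetical stand-in, not the original implementation, and the sketch assumes Python 3, where `time.perf_counter()` is a clock better suited to short intervals than `time.time()`:

```python
import time


def simple_sieve(limit):
    """Return all primes <= limit (hypothetical stand-in for sieveExample)."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


t1 = time.perf_counter()   # time mark before the call
primes = simple_sieve(100000)
t2 = time.perf_counter()   # time mark after the call
elapsed = t2 - t1
print('simple_sieve took {:.4f} seconds'.format(elapsed))
```

A single measurement like this is noisy; repeating it and keeping the best time gives a more stable figure.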

We can also time a whole Python script by passing the `-t` option to the `%run` magic command.
This prints timing information at the end of the run. IPython gives you an estimate of the CPU time consumed and the wall clock time for your script. Under Unix, an estimate of the time spent on system tasks is also given (on Windows this is reported as 0.0, since it cannot be measured). An additional `-N<N>` option can be given, where <N> must be an integer indicating how many times you want the script to run. The final timing report will then include total and per-run results.


In [None]:
%run?
%run -t -N5 sieveExampleArgParse.py 1000000
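
To make the difference between CPU time and wall clock time concrete, here is a minimal sketch using the standard library (Python 3 assumed). It is not how `%run -t` is implemented; the helper `measure` is a made-up name that only illustrates the two quantities:

```python
import time


def measure(func):
    """Run func once and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall clock
    cpu_start = time.process_time()    # CPU time of this process (user + system)
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


# Sleeping consumes wall clock time but almost no CPU time.
wall_sleep, cpu_sleep = measure(lambda: time.sleep(0.2))
# A busy loop consumes both.
wall_busy, cpu_busy = measure(lambda: sum(i * i for i in range(200000)))

print('sleep: wall {:.3f}s, cpu {:.3f}s'.format(wall_sleep, cpu_sleep))
print('busy:  wall {:.3f}s, cpu {:.3f}s'.format(wall_busy, cpu_busy))
```

A script that waits on disk or network shows a large gap between the two numbers, while a purely computational script shows them close together.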

However, if we want more precise timing, we can use the magic command %timeit.

**%timeit** executes a statement several times and returns the best time obtained.
It limits the number of runs depending on how long the statement takes to execute.

The number of runs can be set with -n; for example, -n 1000 limits %timeit to a thousand iterations per round.

The number of rounds can also be changed with -r. For example, -r 5 reports the best result of 5 rounds; the default is 3.

In [None]:
import sieveExample
%timeit sieveExample.sieveOfEratosthenes(1000000)
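
Outside IPython, roughly the same measurement can be made with the standard `timeit` module. The sketch below assumes a made-up workload, `sum_of_squares`; the `number` and `repeat` arguments play the roles of `-n` and `-r`:

```python
import timeit


def sum_of_squares(n):
    """Hypothetical workload used only for this illustration."""
    return sum(i * i for i in range(n))


# number ~ %timeit -n : calls per round
# repeat ~ %timeit -r : number of rounds
round_totals = timeit.repeat(lambda: sum_of_squares(1000), number=1000, repeat=5)

# Each entry is the total time of one round; divide by number for a per-call time.
best_per_call = min(round_totals) / 1000
print('best of 5 rounds: {:.2e} s per call'.format(best_per_call))
```

Taking the minimum rather than the average is deliberate: slower rounds are usually caused by other activity on the machine, not by the code under test.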

**Comparing two different implementations**

Now let's compare two different implementations of the same operation. NumPy and SciPy both provide an interpolation function. We can use %timeit to find out which one is faster.

In [None]:
import numpy as np
import scipy.interpolate as spip 
np.interp?

In [None]:
spip.interp1d?

In [None]:
x = np.linspace(0, 2*np.pi, 10)
y = np.sin(x)
xvals = np.linspace(0, 2*np.pi, 50)
#scipy
f = spip.interp1d(x, y)
scipy_vals = f(xvals)
# numpy
numpy_vals = np.interp(xvals, x, y)
#assert if values are close!
assert np.allclose(scipy_vals, numpy_vals) 
#np.allclose?
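
The `np.allclose` check above combines an absolute and a relative tolerance. As a rough plain-Python sketch of that test for scalars and equal-length sequences (the helper names `is_close` and `all_close` are made up for this illustration, and NaN handling and broadcasting are ignored):

```python
def is_close(a, b, rtol=1e-05, atol=1e-08):
    """Scalar version of the test |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


def all_close(xs, ys, rtol=1e-05, atol=1e-08):
    """True if the sequences have equal length and every pair passes is_close."""
    return len(xs) == len(ys) and all(
        is_close(a, b, rtol, atol) for a, b in zip(xs, ys)
    )


print(is_close(1.0, 1.000001))   # difference is inside the relative tolerance
print(is_close(1.0, 1.001))      # difference is outside both tolerances
print(is_close(0.0, 1e-9))       # near zero, the absolute tolerance decides
print(all_close([1.0, 2.0], [1.0, 2.00001]))
```

Note that the test is not symmetric: the relative term is scaled by `b` only, so swapping the arguments can change the outcome in borderline cases.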

Once we have checked that both functions give approximately the same result
(absolute(`a` - `b`) <= (`atol` + `rtol` * absolute(`b`)); `rtol`=1e-05, `atol`=1e-08),

we can compare their performance:

In [None]:
print('scipy:')
%timeit -n 10000 -r5 f(xvals)
print('numpy:')
%timeit -n 10000 -r5 np.interp(xvals, x, y)

#Profiling

Now we know how much time an implementation takes, but can that time be shortened? And, the key question, how? The first step is to find out where our code spends most of its time. Profiling tells us how long each method or function call takes. For every function we can see the number of calls, the total time, the time per call, and the cumulative time spent in it and in everything it calls.

At the command line:

In [None]:
%run -m cProfile sieveExampleArgs.py 1000000

Store the profile results in a file and inspect them with [`pstats`](http://docs.python.org/library/profile.html#module-pstats). From the command line:

In [None]:
!python -m cProfile -o sieveExample.prof sieveExampleArgs.py 100000

In [None]:
import pstats
stats = pstats.Stats('sieveExample.prof')
stats.print_stats()

Sorting stats:

In [None]:
stats.sort_stats('cumtime').print_stats()

In [None]:
stats.sort_stats('tottime').print_stats(5) #five rows

In [None]:
stats.sort_stats('cumtime').print_stats(r'range') #filter using Regular Expression 
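
The same workflow can be driven entirely from Python, without a `.prof` file. The sketch below is one possible way to do it, assuming Python 3 and two made-up functions, `inner` and `outer`; it profiles a call, sorts by cumulative time, and filters the report with a regular expression:

```python
import cProfile
import io
import pstats


def inner(n):
    return sum(i * i for i in range(n))


def outer(n, repeats):
    # inner is called `repeats` times; the profile should show that count.
    return [inner(n) for _ in range(repeats)]


profiler = cProfile.Profile()
profiler.enable()
outer(1000, 50)
profiler.disable()

# Write the report to a string so it can be printed or searched.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumtime').print_stats('inner|outer')  # regex filter
report = buffer.getvalue()
print(report)
```

In the printed table, the `ncalls` column for `inner` should read 50, which is the kind of detail that exposes repeated work.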

We can also use the **magic command** %prun for profiling:

In [None]:
%prun?

In [None]:
import sieveExample
%prun sieveExample.sieveOfEratosthenes(100000) #Using magic command

In [None]:
%prun -D sieveExample_sieve.prof sieveExample.sieveOfEratosthenes(100000)

In [None]:
%prun -q -D scipy_interp.prof f(xvals)

In [None]:
%prun -q -D numpy_interp.prof np.interp(xvals, x, y)

Show stats:

In [None]:
import pstats
stats = pstats.Stats('scipy_interp.prof')
stats.sort_stats('tottime').print_stats(3) #three rows

In [None]:
import pstats
stats = pstats.Stats('numpy_interp.prof')
stats.sort_stats('cumtime').print_stats(3) #three rows

Profiling can help to find useless calculations:

In [None]:
import pandas as pd
import numpy as np
unames = ['user_id', 'gender', 'age', 'occupation', 'zip']
users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=unames, engine='python')
rnames = ['user_id', 'movie_id', 'rating', 'timestamp']
ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=rnames,  engine='python')
mnames = ['movie_id', 'title', 'genres']
movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=mnames,  engine='python')
data = pd.merge(pd.merge(ratings, users), movies)

def top_movies(dataFrame,usr):
    user= dataFrame[dataFrame.user_id == usr]
    max_i = user.rating.max()
    return user[user.rating == max_i].title

def compareTopMovies(data,usr1, usr2):
    movi1= top_movies(data,usr1).values
    movi2 = top_movies(data,usr2).values
    hits=np.intersect1d(movi1,movi2)
    return hits

#Top movies for user 1
print(top_movies(data,1))
#Compare top movies shared by two users:
print(compareTopMovies(data,1,2))

In [None]:
#Compare all users between them. Profiling
%prun -D compare.prof {x:compareTopMovies(data,1,x) for x in users.user_id if x!=1}

In [None]:
import pstats
stats = pstats.Stats('compare.prof')
stats.sort_stats('cumtime').print_stats(50) #50 rows

We can see that `top_movies` is called twice for each user in the table, inside `compareTopMovies`.
Let's look at these functions line by line.

###Line Profiler

See how long each line of a function took to run. Functions to profile this way must be passed by name with -f.

In [None]:
##pip install line-profiler
%load_ext line_profiler

In [None]:
%lprun?
%lprun -f top_movies top_movies(data,1) 

In [None]:
%lprun -f compareTopMovies compareTopMovies(data,1,2)

###Memory profiler


Now let's take a look into memory profiling. 

In [None]:
##pip install psutil
##pip install memory-profiler
%load_ext memory_profiler

See how much memory a script uses line by line. Let’s take a look at the sieveOfEratosthenes function that we profiled with %prun - except this time we’re interested in incremental memory usage and not execution time. NOTE: %mprun can only be used on functions defined in physical files, and not in the IPython environment.

In [None]:
%mprun?
#clear all variables
#%reset 
import pandasExample
%mprun -f pandasExample.test pandasExample.test()

See how much memory a script uses overall. %memit works a lot like %timeit except that the number of iterations is set with -r instead of -n.

In [None]:
%memit -r 3  pandasExample.test()
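
If `memory_profiler` is not available, Python 3's standard `tracemalloc` module offers a rough alternative. It is a different tool: it tracks memory allocated by Python objects, not the total memory of the process, so its numbers will not match `%memit`. A minimal sketch with a made-up workload, `build_table`:

```python
import tracemalloc


def build_table(rows):
    """Hypothetical memory-hungry workload: a list of small lists."""
    return [[i, i * 2, i * 3] for i in range(rows)]


tracemalloc.start()
table = build_table(100000)
current, peak = tracemalloc.get_traced_memory()  # bytes allocated now / at peak
tracemalloc.stop()

print('current: {:.1f} MiB, peak: {:.1f} MiB'.format(
    current / 2**20, peak / 2**20))
```

The peak value is the one to compare between implementations, since temporary objects may already have been freed by the time the current value is read.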

##Challenges

1. Change the sieve of Eratosthenes implementation so that it performs better. Hint: use NumPy arrays and boolean filters.

2. Change the function compareTopMovies to get better performance by removing redundant code. Hint: reuse before recalculating.