## Code Profiling and Timing

This notebook walks you through the functionality that IPython offers for profiling and timing your code.

Motivation: Once your code works, it is good practice to examine its efficiency. To do that, measure the execution time of your operations and determine where the bottleneck is.

### Time the execution of code using %time

In [None]:
import pandas as pd
%time pd.read_csv("../data/advertising.csv")


### Learn about time of execution using %timeit

%timeit measures the execution time of a single line of code over repeated runs. Because it runs the statement many times and reports the mean and standard deviation, it is usually a more reliable measure than a single run.

It automatically chooses the number of loops based on how long the statement takes to execute.

In [46]:
%timeit sum(range(100))

1.94 µs ± 102 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
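Outside a notebook, a similar measurement can be approximated with the standard library `timeit` module. The following is a minimal sketch, not a description of how %timeit is implemented: `Timer.autorange()` picks a loop count so that one batch takes at least 0.2 seconds, and the repeat count of 5 is an arbitrary choice for illustration.

```python
import statistics
import timeit

# Build a timer for the same statement used in the %timeit cell.
timer = timeit.Timer("sum(range(100))")

# autorange() raises the loop count until one batch takes at least 0.2 s.
loops, batch_seconds = timer.autorange()

# Repeat the batch a few times (5 is an arbitrary choice for this sketch).
batches = timer.repeat(repeat=5, number=loops)
per_loop = [seconds / loops for seconds in batches]

print(f"{loops} loops per run")
print(
    f"mean {statistics.mean(per_loop):.3e} s, "
    f"std. dev. {statistics.stdev(per_loop):.3e} s per loop"
)
```

Fast statements end up with a large loop count and slow statements with a small one, which matches the "loops each" figure that %timeit prints.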


In [50]:
import random
a_list = [random.random() for i in range(1000)]
%timeit a_list.sort()

4.61 µs ± 349 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


Repeating an operation is not always the best option. Sorting an unsorted list is different from sorting an already sorted list: because `a_list.sort()` sorts in place, every loop after the first one sorts a list that is already sorted, so the repetitions skew the result.
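The effect can be checked with the standard library `timeit` module. This is a sketch under stated assumptions: the names `fresh_runs` and `presorted_runs`, the seed, and the repeat count of 200 are illustrative choices, and each measurement sorts a new copy of the list so that no run benefits from an earlier one.

```python
import random
import timeit

random.seed(0)  # fixed seed so the input is reproducible
a_list = [random.random() for _ in range(1000)]
sorted_list = sorted(a_list)

# The setup statement runs once per repeat, so every timed sort gets a fresh copy.
fresh_runs = timeit.repeat(
    "data.sort()",
    setup="data = a_list.copy()",
    repeat=200,
    number=1,
    globals={"a_list": a_list},
)

# Same measurement, but the input is already sorted before each timed call.
presorted_runs = timeit.repeat(
    "data.sort()",
    setup="data = sorted_list.copy()",
    repeat=200,
    number=1,
    globals={"sorted_list": sorted_list},
)

print(f"unsorted input: best {min(fresh_runs):.2e} s")
print(f"sorted input:   best {min(presorted_runs):.2e} s")
```

Comparing the best time of each group shows how much cheaper the already sorted case is, which is the case that repeated in-place sorting ends up measuring.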

When a command is slow, or when system delays are unlikely to have a significant effect on the result, it is better to use the **%time** magic command, which runs the statement only once.

The better choice for the sorting example:

In [51]:
#Use %time for slow performing commands
a_list = [random.random() for i in range(1000)]
%time a_list.sort()

CPU times: user 225 µs, sys: 270 µs, total: 495 µs
Wall time: 292 µs
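The output separates CPU time from wall time. A rough equivalent in plain Python, assuming a hypothetical helper named `timed_call`, uses `time.perf_counter()` for wall time and `time.process_time()` for the combined user and system CPU time of the process. Unlike %time, this sketch does not report user and system time separately.

```python
import random
import time


def timed_call(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


a_list = [random.random() for _ in range(1000)]
sorted_copy, wall_seconds, cpu_seconds = timed_call(sorted, a_list)

print(f"CPU time: {cpu_seconds:.6f} s")
print(f"Wall time: {wall_seconds:.6f} s")
```

Wall time includes any waiting (I/O, scheduling), while CPU time counts only the time the processor spent on the process, so a large gap between the two hints at waiting rather than computation.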


In [52]:
%%timeit
# %%timeit times the whole cell, so it handles more than one line of code
result = 0
for i in range(1000):
    for j in range(1000):
        result += i * 2 ** j

1.34 s ± 129 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
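In a plain script, a multi-line block can be timed by wrapping it in a function and passing a callable to `timeit.repeat`. This is a sketch only: the function name `nested_loop` is hypothetical, and the loop size is reduced to 100 so the example finishes quickly.

```python
import timeit


def nested_loop(n):
    """Same computation as the cell above, parameterised by the loop size."""
    result = 0
    for i in range(n):
        for j in range(n):
            result += i * 2 ** j
    return result


# timeit accepts a zero-argument callable; run it once per repeat, three times.
timings = timeit.repeat(lambda: nested_loop(100), repeat=3, number=1)

print(f"best of {len(timings)}: {min(timings):.4f} s per loop")
```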


### Run code with a profiler using %prun

This command runs the code under a profiler and reports the total number of function calls, the time spent in each function, and the cumulative time.

We define a simple function that creates a square (equal number of rows and columns) 2D list.

In [43]:
#function that creates a square 2D list
import random
import math
def create_sq_matrices(n_rows):
    out_arr = []
    for i in range(n_rows):
        out_arr.append([math.ceil(random.random()*n_rows) for k in range(n_rows)])
    return out_arr

In [44]:
#Profile the defined function for 1000 rows using %prun.
%prun create_sq_matrices(1000)
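A comparable report can be produced outside IPython with the standard library `cProfile` and `pstats` modules. The sketch below repeats the function definition so that it is self-contained. It uses 200 rows instead of 1000 and prints only the top five entries, both arbitrary choices to keep the output short.

```python
import cProfile
import io
import math
import pstats
import random


def create_sq_matrices(n_rows):
    """Create a square 2D list of random integers (same function as above)."""
    out_arr = []
    for i in range(n_rows):
        out_arr.append([math.ceil(random.random() * n_rows) for k in range(n_rows)])
    return out_arr


# Run the function under the profiler and keep its return value.
profiler = cProfile.Profile()
matrix = profiler.runcall(create_sq_matrices, 200)

# Collect the statistics into a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists the number of calls (`ncalls`), the time spent in each function itself (`tottime`), and the time including the functions it calls (`cumtime`).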

 

In [2]:
sum(range(100))

4950