In [1]:
import timeit
import cProfile

This is the function that was given in the exercise text. The goal of the exercise is to try to optimize it and to assess the impact with both timeit and cProfile.

In [2]:
def approx_pi2(n=10000000):
    val = 0.
    for k in range(1,n+1):
        val += 1./k**2
    return (6 * val)**.5

This is how I've optimized the function: the explicit loop becomes a list comprehension passed to the built-in sum, and k**2 becomes k * k.

In [3]:
def approx_pi2_opt(n=10000000):
    val = sum([1./(k * k) for k in range(1, n+1)])
    return (6 * val)**.5
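
Before timing anything, it seems worth checking that both versions return the same value. The sketch below is my own addition, not part of the exercise: it copies the two functions with a smaller default n (an assumption, so the check runs quickly) and compares the results with a tolerance, since the two summation paths may round slightly differently.

```python
import math

def approx_pi2(n=100_000):
    # Original loop version (smaller default n than in the notebook).
    val = 0.
    for k in range(1, n + 1):
        val += 1. / k**2
    return (6 * val) ** .5

def approx_pi2_opt(n=100_000):
    # Optimized version: list comprehension fed to the built-in sum.
    val = sum([1. / (k * k) for k in range(1, n + 1)])
    return (6 * val) ** .5

original = approx_pi2()
optimized = approx_pi2_opt()

# Compare with a tolerance rather than ==, because floating-point
# summation details can differ between the two implementations.
agree = math.isclose(original, optimized, rel_tol=1e-9)
print(original, optimized, agree)
```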

This is the optimized calculation, as a string rather than as a function. It should run the same, but without the function call overhead. 

In [4]:
approx_pi2_string = "(6*sum([1./(k * k) for k in range(1, 10000000+1)]))**.5"

Here is a comparison of runtimes with timeit.

In [5]:
num_tests = 1
num_repeats = 20

time_vector_original = timeit.repeat("approx_pi2()", number = num_tests, repeat = num_repeats, globals=globals())
time_best_original = min(time_vector_original)
print("Original Function, Best Time: {:.3f}".format(time_best_original))

time_vector_opt = timeit.repeat("approx_pi2_opt()", number = num_tests, repeat = num_repeats, globals=globals())
time_best_opt = min(time_vector_opt)
print("Optimized Function, Best Time: {:.3f}".format(time_best_opt))

time_vector_string = timeit.repeat(approx_pi2_string, number = num_tests, repeat = num_repeats, globals=globals())
time_best_string = min(time_vector_string)
print("Optimized String, Best Time: {:.3f}".format(time_best_string))

speedup_opt = time_best_original / time_best_opt
speedup_string = time_best_original / time_best_string

print("Optimized Function, Speedup: {:.2f}".format(speedup_opt))
print("Optimized String, Speedup: {:.2f}".format(speedup_string))

Original Function, Best Time: 6.786
Optimized Function, Best Time: 2.680
Optimized String, Best Time: 2.607
Optimized Function, Speedup: 2.53
Optimized String, Speedup: 2.60


It looks like my optimized version runs about 2.5x as fast (6.786 s versus 2.680 s best time), which is cool. The string is slightly faster than the function (2.607 s versus 2.680 s, a difference of about 3%), although I'm not sure that result would hold up to more rigorous testing.
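
One way to test the function-versus-string difference a bit more rigorously is to look at the spread of the timings, not just the best one. This is only a sketch: the helper name summarize, the smaller n, and the repeat/number values are my own assumptions, chosen so it finishes quickly.

```python
import statistics
import timeit

def approx_pi2_opt(n=100_000):
    # Same optimized function as above, with a smaller default n.
    val = sum([1. / (k * k) for k in range(1, n + 1)])
    return (6 * val) ** .5

def summarize(stmt, repeat=10, number=5, **kwargs):
    """Return (best, mean, stdev) of the per-call time in seconds."""
    totals = timeit.repeat(stmt, repeat=repeat, number=number, **kwargs)
    per_call = [t / number for t in totals]
    return min(per_call), statistics.mean(per_call), statistics.stdev(per_call)

expr = "(6*sum([1./(k * k) for k in range(1, 100000+1)]))**.5"

# Pass the function explicitly so the statement can find it.
func_stats = summarize("approx_pi2_opt()",
                       globals={"approx_pi2_opt": approx_pi2_opt})
str_stats = summarize(expr)

for label, (best, mean, stdev) in [("function", func_stats),
                                   ("string", str_stats)]:
    print("{:8s} best {:.5f}  mean {:.5f}  stdev {:.5f}".format(
        label, best, mean, stdev))
```

If the gap between the two means is smaller than the standard deviations, I would treat the difference as noise rather than as function call overhead.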

Here is the profile for the original function.

In [6]:
cProfile.run("approx_pi2()")

         4 function calls in 9.133 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    9.133    9.133    9.133    9.133 <ipython-input-2-ebb0c387764c>:1(approx_pi2)
        1    0.000    0.000    9.133    9.133 <string>:1(<module>)
        1    0.000    0.000    9.133    9.133 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




I don't understand how I would use this to optimize the function. I can see that all of the time is spent inside the function, which makes sense since it's the only thing I'm profiling. That doesn't give me any insight into how to make the function more efficient; I just did the optimization based on outside knowledge.
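
As far as I can tell, cProfile only reports time per function call, so a function whose body is a single loop shows up as a single row. To compare the individual expressions inside the loop, one option is to time them directly with timeit. The sketch below does that for the two ways of squaring k; the value of k and the repeat counts are arbitrary choices of mine.

```python
import timeit

setup = "k = 12345"  # arbitrary sample value for the loop variable

# Time each candidate expression on its own, taking the best of 5 runs.
t_pow = min(timeit.repeat("1./k**2", setup=setup, number=200_000, repeat=5))
t_mul = min(timeit.repeat("1./(k * k)", setup=setup, number=200_000, repeat=5))

# Both expressions compute the same value for integer k.
same_value = (1. / 12345**2) == (1. / (12345 * 12345))

print("k**2 : {:.4f} s".format(t_pow))
print("k * k: {:.4f} s".format(t_mul))
print("same value:", same_value)
```

Which expression wins, and by how much, may depend on the Python version, so I would rerun this on the interpreter actually used for the exercise.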

Here is the profile for my optimized function.

In [7]:
cProfile.run("approx_pi2_opt()")

         6 function calls in 3.356 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.181    0.181    3.356    3.356 <ipython-input-3-fafa65d2d229>:1(approx_pi2_opt)
        1    3.098    3.098    3.098    3.098 <ipython-input-3-fafa65d2d229>:2(<listcomp>)
        1    0.000    0.000    3.356    3.356 <string>:1(<module>)
        1    0.000    0.000    3.356    3.356 {built-in method builtins.exec}
        1    0.077    0.077    0.077    0.077 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




This one at least has a bit of a breakdown. Looking at the tottime, I can see that the function spends the most time on the list comprehension, which makes sense. I guess if I wanted to optimize further, that would be the area to target. 

Is the difference between the cumtime and the tottime for the function representative of the function call overhead? 
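
My reading of the columns is that tottime is the time spent in a function's own code, while cumtime also includes everything it calls, so the difference is the time spent in callees rather than the call overhead. The numbers above fit that reading: 0.181 + 3.098 + 0.077 = 3.356. The sketch below checks it on a tiny example; the functions inner and outer are hypothetical stand-ins, and it reads the raw table through the stats attribute of pstats.Stats.

```python
import cProfile
import pstats

def inner(n):
    # Does all the real work.
    total = 0.
    for k in range(1, n + 1):
        total += 1. / (k * k)
    return total

def outer(n=200_000):
    # Does almost nothing itself; it only calls inner once.
    val = inner(n)
    return (6 * val) ** .5

profiler = cProfile.Profile()
profiler.enable()
outer()
profiler.disable()

# Keys are (filename, lineno, funcname); values hold tottime and cumtime.
rows = pstats.Stats(profiler).stats
timings = {key[2]: (tt, ct) for key, (cc, nc, tt, ct, callers) in rows.items()}

outer_tt, outer_ct = timings["outer"]
inner_tt, inner_ct = timings["inner"]

print("outer: tottime {:.4f}  cumtime {:.4f}".format(outer_tt, outer_ct))
print("inner: tottime {:.4f}  cumtime {:.4f}".format(inner_tt, inner_ct))
print("outer cumtime - tottime = {:.4f}".format(outer_ct - outer_tt))
```

Here the difference between cumtime and tottime for outer should match the cumtime of inner, its only callee.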

In [8]:
cProfile.run(approx_pi2_string)

         5 function calls in 3.034 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    2.693    2.693    2.693    2.693 <string>:1(<listcomp>)
        1    0.232    0.232    3.034    3.034 <string>:1(<module>)
        1    0.000    0.000    3.034    3.034 {built-in method builtins.exec}
        1    0.109    0.109    0.109    0.109 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




Again, this one spends most of its time on the list comprehension. I'm noticing that this version spends more time on the <string>:1(<module>) entry (tottime 0.232). What does that mean? It was negligible for the function versions.
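
My best guess is that <string> is the placeholder filename for code compiled from a string, and <module> is that code's top level. In the function versions the top level only calls the function, so its own time is near zero. In the string version the whole expression is the top level, so the own time that previously showed up on the approx_pi2_opt row (tottime 0.181) lands on <string>:1(<module>) instead. The sketch below is a rough check of that reading with a smaller n. Note that newer Python versions may inline list comprehensions, in which case the <listcomp> row disappears and its time also lands on <module>.

```python
import cProfile
import pstats

# Same expression as approx_pi2_string, with a smaller n.
expr = "(6*sum([1./(k * k) for k in range(1, 100000+1)]))**.5"

profiler = cProfile.Profile()
profiler.run(expr)  # compiles and executes the string while profiling

rows = pstats.Stats(profiler).stats

# The top-level code of the string is filed under this key.
module_key = ("<string>", 1, "<module>")
cc, nc, tt, ct, callers = rows[module_key]

print("<string>:1(<module>) tottime {:.4f}  cumtime {:.4f}".format(tt, ct))
for filename, lineno, name in sorted(rows):
    print(filename, lineno, name)
```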