# Objectives

1. To understand why algorithm analysis is important.

2. To be able to use “Big-O” to describe execution time.

3. To understand the “Big-O” execution time of common operations on Python lists and dictionaries.

4. To understand how the implementation of Python's data structures impacts algorithm analysis.

5. To understand how to benchmark simple Python programs.

### When two programs solve the same problem but look different, is one program better than the other?

In order to answer this question, we need to remember that there is an important difference between a program and the underlying algorithm that the program is representing. <br>**An algorithm** is a generic, step-by-step list of instructions for solving a problem. It is a method for solving any instance of the problem such that given a particular input, the algorithm produces the desired result. <br>**A program**, on the other hand, is an algorithm that has been encoded into some programming language. There may be many programs for the same algorithm, depending on the programmer and the programming language being used.
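As a small illustration of this distinction, the two programs below encode the same algorithm (add the integers one at a time) in different ways. The function names are made up for this sketch and do not appear elsewhere in the notebook.

```python
def sum_with_for(n):
    """Add 1..n one value at a time using a for loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_with_while(n):
    """Same algorithm, written with an explicit counter and a while loop."""
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


print(sum_with_for(10), sum_with_while(10))  # 55 55
```

The two programs look different, but they perform the same sequence of additions, so an analysis of the algorithm applies to both.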

**Algorithm analysis** is concerned with comparing algorithms based upon the amount of computing resources that each algorithm uses. We want to be able to consider two algorithms and say that one is better than the other because it is more efficient in its use of those resources or perhaps because it simply uses fewer.

In [2]:
import time

def sumOfN(n):
    start = time.time()

    theSum = 0
    for i in range(1,n+1):
        theSum = theSum + i

    end = time.time()

    return theSum, end-start

In [6]:
# Averaging the run time over a series of runs
# gives us a fairer idea of the function's running time

times = 0.0
n = 10000
k = 10

for i in range(k):
    summ, time_ = sumOfN(n)
    print('Sum is %i, Time is %10.7f' %(summ, time_))  # For n = 10,000
    times+=time_
    
print(f'average time is {round(times/k, 7)} secs!')

Sum is 50005000, Time is  0.0009966
Sum is 50005000, Time is  0.0019960
Sum is 50005000, Time is  0.0019941
Sum is 50005000, Time is  0.0009971
Sum is 50005000, Time is  0.0010319
Sum is 50005000, Time is  0.0000000
Sum is 50005000, Time is  0.0000000
Sum is 50005000, Time is  0.0009463
Sum is 50005000, Time is  0.0010309
Sum is 50005000, Time is  0.0009637
average time is 0.0009957 secs!


In [13]:
# Let's increase n to 100K

times = 0.0
n = 100000

for i in range(k):
    summ, time_ = sumOfN(n)
    print('Sum is %i, Time is %10.7f' %(summ, time_))  # For n = 100,000
    times+=time_
    
print(f'average time is {round(times/k, 7)} secs!')

Sum is 5000050000, Time is  0.0099721
Sum is 5000050000, Time is  0.0129979
Sum is 5000050000, Time is  0.0060227
Sum is 5000050000, Time is  0.0089409
Sum is 5000050000, Time is  0.0099726
Sum is 5000050000, Time is  0.0109708
Sum is 5000050000, Time is  0.0090284
Sum is 5000050000, Time is  0.0000000
Sum is 5000050000, Time is  0.0000000
Sum is 5000050000, Time is  0.0156231
average time is 0.0083529 secs!


In [14]:
# Let's increase n to 1M

times = 0.0
n = 1000000

for i in range(k):
    summ, time_ = sumOfN(n)
    print('Sum is %i, Time is %10.7f' %(summ, time_))  # For n = 1,000,000
    times+=time_
    
print(f'average time is {round(times/k, 7)} secs!')

Sum is 500000500000, Time is  0.0787928
Sum is 500000500000, Time is  0.0612891
Sum is 500000500000, Time is  0.0624540
Sum is 500000500000, Time is  0.0624838
Sum is 500000500000, Time is  0.0624881
Sum is 500000500000, Time is  0.0781412
Sum is 500000500000, Time is  0.0656111
Sum is 500000500000, Time is  0.0781119
Sum is 500000500000, Time is  0.0624790
Sum is 500000500000, Time is  0.0624845
average time is 0.0674335 secs!


In [15]:
# Let's increase n to 10M

times = 0.0
n = 10000000

for i in range(k):
    summ, time_ = sumOfN(n)
    print('Sum is %i, Time is %10.7f' %(summ, time_))  # For n = 10,000,000
    times+=time_
    
print(f'average time is {round(times/k, 7)} secs!')

Sum is 50000005000000, Time is  0.6789792
Sum is 50000005000000, Time is  1.0012732
Sum is 50000005000000, Time is  0.7957108
Sum is 50000005000000, Time is  0.6825488
Sum is 50000005000000, Time is  0.7347000
Sum is 50000005000000, Time is  0.7320881
Sum is 50000005000000, Time is  0.7973578
Sum is 50000005000000, Time is  0.7608554
Sum is 50000005000000, Time is  0.8444207
Sum is 50000005000000, Time is  0.7549961
average time is 0.778293 secs!


**For the iterative algorithm above, the measured times are fairly consistent at each size, and the average grows roughly tenfold each time `n` is multiplied by 10.**
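The same growth can be checked with the standard-library `timeit` module, which uses a high-resolution clock. This is only a sketch: `sum_of_n` and `best_time` are names introduced here, and the exact numbers will differ from machine to machine.

```python
import timeit


def sum_of_n(n):
    """Iterative sum of 1..n, without any timing code inside."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def best_time(n, repeats=5):
    """Return the fastest of `repeats` single runs, in seconds."""
    return min(timeit.repeat(lambda: sum_of_n(n), number=1, repeat=repeats))


previous = None
for n in (10_000, 100_000, 1_000_000):
    t = best_time(n)
    # Compare each size with the one before it to see the growth factor.
    ratio = "" if previous is None else f", {t / previous:.1f}x the previous size"
    print(f"n = {n:>9,}: {t:.6f} s{ratio}")
    previous = t
```

Taking the minimum of several runs is a common way to reduce the influence of other activity on the machine. If the algorithm is linear in `n`, the printed ratios should be close to 10.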

Let's try a new algorithm for solving the sum-of-n problem. This function, `sumOfN2`, takes advantage of a closed-form equation:

$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$, to compute the sum of the first `n` integers without iterating.
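One way to see where the closed form comes from is the standard pairing argument, added here as a sketch: write the sum $S$ forwards and backwards, then add the two lines term by term.

$$
\begin{aligned}
S  &= 1 + 2 + \cdots + (n-1) + n \\
S  &= n + (n-1) + \cdots + 2 + 1 \\
2S &= \underbrace{(n+1) + (n+1) + \cdots + (n+1)}_{n \text{ terms}} = n(n+1) \\
S  &= \frac{n(n+1)}{2}
\end{aligned}
$$

For $n = 10$ this gives $\frac{10 \cdot 11}{2} = 55$.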

In [16]:
def sumOfN2(n):
    start = time.time()
    summ = (n*(n+1))//2
    duration = time.time() - start
    
    return summ, duration

print(sumOfN2(10))

(55, 0.0)


In [17]:
for i in range(10):
    print('Sum is %i, Time is %10.10f' %sumOfN2(10000))  # For 10,000

Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000
Sum is 50005000, Time is 0.0000000000


In [18]:
for i in range(10):
    print('Sum is %i, Time is %10.10f' %sumOfN2(100000)) # For 100,000

Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000
Sum is 5000050000, Time is 0.0000000000


In [19]:
for i in range(10):
    print('Sum is %i, Time is %10.10f' %sumOfN2(1000000)) # For 1,000,000

Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000
Sum is 500000500000, Time is 0.0000000000


In [20]:
for i in range(10):
    print('Sum is %i, Time is %10.10f' %sumOfN2(10000000)) # For 10,000,000

Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000
Sum is 50000005000000, Time is 0.0000000000


In [21]:
for i in range(10):
    print('Sum is %i, Time is %10.10f' %sumOfN2(100000000))  # For 100,000,000

Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000
Sum is 5000000050000000, Time is 0.0000000000


There are two important things to notice about this output.
* First, the times recorded above are shorter than in any of the previous examples. In fact, every call finished faster than `time.time()` could resolve on this machine, so each measurement prints as zero.
* Second, they are very consistent no matter what the value of `n`, from 10K to 100M. It appears that `sumOfN2` is hardly impacted by the number of integers being added.
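To get a non-zero reading for something this fast, one option is to time many calls with a higher-resolution clock and divide by the number of calls. The sketch below uses `time.perf_counter_ns` from the standard library; `sum_closed_form` and `average_ns` are names introduced here for illustration, and the reported figures include the small overhead of the timing loop itself.

```python
import time


def sum_closed_form(n):
    """Sum of 1..n using the closed-form expression n(n+1)/2."""
    return n * (n + 1) // 2


def average_ns(n, calls=100_000):
    """Average time per call in nanoseconds, measured over many calls.

    The result includes the overhead of the for loop that drives the calls.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        sum_closed_form(n)
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


for n in (10_000, 1_000_000, 100_000_000):
    print(f"n = {n:>11,}: about {average_ns(n):.0f} ns per call")
```

The per-call time should stay roughly the same across all three sizes, which matches the observation that `sumOfN2` is hardly affected by `n`.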

But what does this benchmark really tell us? Intuitively, we can see that the iterative solution seems to be doing more work, since some program steps are being repeated. This is likely the reason it takes longer. Also, the time required for the iterative solution seems to increase as we increase the value of `n`. However, there is a problem. If we ran the same function on a different computer or used a different programming language, we would likely get different results. It could take even longer to perform `sumOfN2` if the computer were older.

Let's see a recursive solution...

In [22]:
def sumOfN3(n):
    # base case: the sum of the first 1 integer is 1; for n <= 0 it is 0
    if n <= 1:
        return max(n, 0)
    
    return n + sumOfN3(n-1)

print(sumOfN3(10))

55


In [23]:
times = 0.0
n = 1000

for i in range(10):
    start = time.time()
    summ = sumOfN3(n)
    duration = time.time() - start
    print('Sum is %i, Time is %10.7f' %(summ, duration)) # For n = 1,000 
    times+=duration

print(f'average time is {round(times/10,7)} secs!')

Sum is 500500, Time is  0.0020301
Sum is 500500, Time is  0.0000000
Sum is 500500, Time is  0.0009940
Sum is 500500, Time is  0.0009644
Sum is 500500, Time is  0.0000000
Sum is 500500, Time is  0.0000000
Sum is 500500, Time is  0.0000000
Sum is 500500, Time is  0.0010278
Sum is 500500, Time is  0.0000000
Sum is 500500, Time is  0.0000000
average time is 0.0005016 secs!


In [30]:
times = 0.0
n = 2000

for i in range(10):
    start = time.time()
    summ = sumOfN3(n)
    duration = time.time() - start
    print('Sum is %i, Time is %10.7f' %(summ, duration)) # For n = 2,000 
    times+=duration

print(f'average time is {round(times/10,7)} secs!')

Sum is 2001000, Time is  0.0009964
Sum is 2001000, Time is  0.0009995
Sum is 2001000, Time is  0.0009980
Sum is 2001000, Time is  0.0010300
Sum is 2001000, Time is  0.0009704
Sum is 2001000, Time is  0.0009916
Sum is 2001000, Time is  0.0010314
Sum is 2001000, Time is  0.0000000
Sum is 2001000, Time is  0.0009968
Sum is 2001000, Time is  0.0009966
average time is 0.0009011 secs!


**The recursive version is slower still: for `n = 2000` it averages about 0.0009 secs, roughly what the iterative version needed for `n = 10000`. Recursion is also expensive in space, because every call adds a frame to the call stack. As seen below, trying to add 5000 numbers recursively raises a `RecursionError` with the message `maximum recursion depth exceeded`.**

In [35]:
try:
    times = 0.0
    n = 5000

    for i in range(10):
        start = time.time()
        summ = sumOfN3(n)
        duration = time.time() - start
        print('Sum is %i, Time is %10.7f' %(summ, duration)) # For n = 5,000 
        times+=duration

    print(f'average time is {round(times/10,7)} secs!')
except RecursionError as e:
    print(e)

maximum recursion depth exceeded in comparison
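The depth at which this error appears is set by the interpreter's recursion limit, which can be read with `sys.getrecursionlimit()` and varies between environments. The sketch below is an illustration only: `sum_recursive` mirrors the recursive function above, and `fits_in_stack` is a helper name introduced here.

```python
import sys


def sum_recursive(n):
    """Recursive sum of 1..n; uses one stack frame per value of n."""
    if n <= 1:
        return max(n, 0)
    return n + sum_recursive(n - 1)


def fits_in_stack(n):
    """Return True if sum_recursive(n) completes without a RecursionError."""
    try:
        sum_recursive(n)
        return True
    except RecursionError:
        return False


limit = sys.getrecursionlimit()
print("recursion limit in this environment:", limit)
print("n = limit // 2 succeeds:", fits_in_stack(limit // 2))
print("n = limit * 2 succeeds:", fits_in_stack(limit * 2))
```

The iterative and closed-form versions have no such restriction, because neither of them grows the call stack with `n`.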


**Therefore, we need a way to evaluate an algorithm's performance consistently, independent of measured running time. Running time also depends on the hardware and software used to run the program, so benchmarks taken on different machines can give conflicting results.**
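One machine-independent alternative is to count basic steps instead of seconds. The sketch below counts assignments to the running total in the iterative version. Both the choice of "assignment" as the unit of work and the name `count_steps_iterative` are assumptions made for this illustration.

```python
def count_steps_iterative(n):
    """Return (sum, number of assignments to `total`) for the loop version."""
    total = 0
    steps = 1                  # the initial assignment total = 0
    for i in range(1, n + 1):
        total += i
        steps += 1             # one assignment per loop iteration
    return total, steps


for n in (10, 100, 1000):
    print(n, count_steps_iterative(n))
# 10 (55, 11)
# 100 (5050, 101)
# 1000 (500500, 1001)
```

The step count is `n + 1` on any computer, whereas the closed-form version performs a fixed number of arithmetic operations regardless of `n`.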

### Big-O Notation