# 1. What is Algorithm Analysis? 

An algorithm is a generic, step-by-step list of instructions for solving a problem. It is a method for solving any instance of the problem such that given a particular input, the algorithm produces the desired result. 

Algorithm analysis is concerned with comparing algorithms based on the amount of computing resources that each algorithm uses. There are two different ways to look at computing resources. 

- The amount of space or memory an algorithm requires (typically dictated by the problem instance itself)
- The amount of time an algorithm requires to execute (its execution time) 

```Python
# Sum of N : Algorithm1 
import time 

def sumofN(n): 
    start = time.time() 
    
    theSum = 0
    for i in range(1, n+1): 
        theSum += i
        
    end = time.time() 
    return theSum, end - start 

# Sum of N : Algorithm2 
def sumofN2(n): 
    start = time.time() 
    theSum = (n * (n + 1)) // 2  # integer division keeps the result an exact int
    end = time.time() 
    return theSum, end - start 

# For n = 10,000,000, the measured execution times of the two algorithms were as follows:
# Algorithm 1 : 0.1729250 seconds
# Algorithm 2 : 0.00000095 seconds
```

However, benchmarking does not give us a really useful measurement, because the result depends on the particular machine, program, time of day, compiler, and programming language. 
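A machine-independent alternative is to count basic steps instead of seconds. The sketch below assumes that each assignment statement counts as one step; the function name `sum_of_n_steps` is made up for this illustration.

```python
def sum_of_n_steps(n):
    """Return (sum of 1..n, number of assignment steps executed)."""
    the_sum = 0
    steps = 1  # the initial assignment the_sum = 0
    for i in range(1, n + 1):
        the_sum += i
        steps += 1  # one assignment per loop iteration
    return the_sum, steps


print(sum_of_n_steps(10))  # (55, 11)
```

Under this counting rule the iterative algorithm takes $T(n) = 1 + n$ steps on any machine, while the closed-form version always takes a single step.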

# 2. Big-O Notation

The **Big-O** notation provides a useful approximation to the actual number of steps in the computation. 

Suppose that for some algorithm, the exact number of steps is $T(n) = 5n^2 + 27n + 1005$. When n is small, the constant 1005 seems to be the dominant part of the function. However, as n gets larger, the $n^2$ term becomes the most important. So, if n gets large, we can ignore the other terms and focus on $5n^2$. The coefficient 5 does not change how fast the function grows either, so we say that $T(n)$ is $O(n^2)$. 
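To see the dominant term take over, we can evaluate $T(n)$ for growing n and check what fraction of the total comes from $5n^2$. This is only a numerical illustration; the helper name `T` mirrors the formula above.

```python
def T(n):
    """Exact step count from the example: 5n^2 + 27n + 1005."""
    return 5 * n**2 + 27 * n + 1005


for n in (1, 10, 100, 1000, 10000):
    share = 5 * n**2 / T(n)  # fraction of T(n) contributed by the quadratic term
    print(f"n = {n:>6}   T(n) = {T(n):>12}   share of 5n^2 = {share:.4f}")
```

For n = 1 the quadratic term contributes less than 1% of the total, while for n = 10000 it contributes more than 99.9%.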

Sometimes, the performance of an algorithm depends on the exact values of the data rather than simply the size of the problem. The worst case performance refers to a particular data set where the algorithm performs especially poorly. 
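A sequential search is a simple case where the data, not just its size, decides the cost. The sketch below (the name `sequential_search` is hypothetical) counts comparisons so the best and worst cases can be compared directly.

```python
def sequential_search(items, target):
    """Return (found, comparisons) for a left-to-right scan of items."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            return True, comparisons
    return False, comparisons


data = list(range(1, 1001))
print(sequential_search(data, 1))     # best case: (True, 1)
print(sequential_search(data, 1000))  # worst case for a hit: (True, 1000)
print(sequential_search(data, -5))    # target absent: (False, 1000)
```

The list has the same size in all three calls, yet the number of comparisons ranges from 1 to n.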

<center><strong>Common Functions for Big-O</strong></center>

|f(n)|Name| 
|:---:|:---:| 
|1|Constant|
|log(n)|Logarithmic|
|n|Linear|
|nlog(n)|Log Linear|
|$n^2$|Quadratic|
|$n^3$|Cubic|
|$2^n$|Exponential|

# 3. An Anagram Detection Example

Anagram : One string is an anagram of another if the second is simply a rearrangement of the first. For example, 'heart' and 'earth' are anagrams.  

## 3.1 Solution 1 : Checking Off 

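First check that the two strings have the same length, then check that each character in the first string actually occurs in the second. Each matched character in the second string is "checked off" by replacing it with `None`, so it cannot be matched twice. 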

```Python
def anagramSolution1(s1, s2): 
    stillOK = True
    if len(s1) != len(s2): 
        stillOK = False 
        
    alist = list(s2) 
    pos1 = 0 
    
    while pos1 < len(s1) and stillOK: 
        pos2 = 0
        found = False 
        while pos2 < len(alist) and not found: 
            if s1[pos1] == alist[pos2]: 
                found = True 
            else: 
                pos2 += 1
                
        if found: 
            alist[pos2] = None 
        else: 
            stillOK = False 
            
        pos1 += 1
    
    return stillOK
```

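- Each of the n characters in s1 causes an iteration through up to n characters in the list built from s2. 
- Each of the n positions in that list is visited once to match a character from s1, so in the worst case the number of visits is the sum of the integers from 1 to n. 
- $T(n) = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2}n \to O(n^2)$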
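The visit count can be checked empirically. The sketch below is an instrumented variant of the checking-off idea (the name `count_visits` is made up here), run on a reversed string, which is a worst case because each character of s1 is found as late as possible.

```python
def count_visits(s1, s2):
    """Return (is_anagram, visits) using the checking-off strategy."""
    if len(s1) != len(s2):
        return False, 0
    remaining = list(s2)
    visits = 0
    for ch in s1:
        found = False
        for pos, candidate in enumerate(remaining):
            visits += 1  # one visit per inspected list position
            if candidate == ch:
                remaining[pos] = None  # check the character off
                found = True
                break
        if not found:
            return False, visits
    return True, visits


n = 6
print(count_visits("abcdef", "fedcba"))  # (True, 21)
print(n * (n + 1) // 2)                  # 21
```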

## 3.2 Solution 2 : Sort and Compare 

Sort the characters of both strings alphabetically. If the original two strings are anagrams, the two sorted results will be identical. 

```Python
def anagramSolution2(s1, s2): 
    alist1 = list(s1) 
    alist2 = list(s2) 
    
    alist1.sort()
    alist2.sort() 
    
    pos = 0
    matches = True 
    
    while pos < len(s1) and matches: 
        if alist1[pos] == alist2[pos]: 
            pos += 1
        else: 
            matches = False 
            
    return matches 
```

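- The sorting process dominates the running time: typically $O(n^2)$ or $O(n \log{n})$, depending on the sorting algorithm. The comparison loop that follows is only $O(n)$.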
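The same idea can be written more compactly with the built-in `sorted()`, which returns a sorted list of a string's characters. This is a sketch of an equivalent formulation, not a replacement for the step-by-step version above; the function name is hypothetical.

```python
def is_anagram_sorted(s1, s2):
    """Sort-and-compare: anagrams produce identical sorted character lists."""
    # The length check is a cheap early exit before paying for two sorts.
    return len(s1) == len(s2) and sorted(s1) == sorted(s2)


print(is_anagram_sorted("abcde", "edcba"))  # True
print(is_anagram_sorted("abc", "abd"))      # False
```

Python's built-in sort runs in $O(n \log{n})$, so that is the overall cost of this version.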

## 3.3 Solution 3 : Brute Force 

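A **brute force** technique for solving a problem typically tries to exhaust all possibilities. For anagram detection, that means generating every possible ordering of the characters in s1 and checking whether s2 is among them. A string of n distinct characters has $n!$ orderings, and $n!$ grows even faster than $2^n$, so this is not going to be a good solution. 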
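For illustration only, a brute force check can be sketched with `itertools.permutations`. The function name is hypothetical, and the approach is usable only for very short strings.

```python
from itertools import permutations
from math import factorial


def is_anagram_brute_force(s1, s2):
    """Try every ordering of s1 until one equals s2 (factorial time)."""
    if len(s1) != len(s2):
        return False
    return any("".join(p) == s2 for p in permutations(s1))


print(is_anagram_brute_force("heart", "earth"))  # True
print(factorial(20))  # 2432902008176640000 orderings for 20 characters
```

Even at a billion orderings per second, working through all orderings of a 20-character string would take decades.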

## 3.4 Solution 4 : Count and Compare 

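This solution uses the fact that any two anagrams have the same number of a's, the same number of b's, and so on. We count how often each of the 26 lowercase letters occurs in each string, then compare the two lists of counts. 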

```Python
def anagramSolution4(s1, s2): 
    c1 = [0] * 26
    c2 = [0] * 26 
    for i in range(len(s1)): 
        pos = ord(s1[i]) - ord('a')
        c1[pos] += 1
        
    for i in range(len(s2)): 
        pos = ord(s2[i]) - ord('a') 
        c2[pos] += 1
        
    j = 0
    stillOK = True 
    while j < 26 and stillOK: 
        if c1[j] == c2[j]: 
            j += 1
        else: 
            stillOK = False 
            
    return stillOK 
```

- The first two loops, which count the characters, are both based on n. 
- The third loop always takes 26 steps, since there are 26 possible characters in the strings. 
- $T(n) = 2n + 26 \to O(n)$ 
- Although solution 4 runs in linear time, it can only do so by using additional storage to keep the two lists of character counts. 

This is a common occurrence. On many occasions we will need to make decisions about time and space **trade-offs**. As computer scientists, when given a choice of algorithms, it will be up to us to determine the best use of computing resources for a particular problem. 
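The count-and-compare idea can also be sketched with `collections.Counter` from the standard library, which keeps the counts in a dictionary instead of two fixed 26-slot lists. The function name is hypothetical; unlike the list-based version, this variant is not limited to lowercase a to z.

```python
from collections import Counter


def is_anagram_counter(s1, s2):
    """Count-and-compare: anagrams have identical character counts."""
    # Building each Counter is one pass over its string.
    return Counter(s1) == Counter(s2)


print(is_anagram_counter("apple", "pleap"))  # True
print(is_anagram_counter("apple", "apply"))  # False
```

The same trade-off applies: the running time stays linear in the string length, at the price of extra storage for the counts.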

# 4. Lists

- Indexing and assigning to an index position : $O(1)$
- Append method : $O(1)$ 

## 4.1 Four ways to generate a list of n numbers

```Python
def test1(): 
    l = [] 
    for i in range(1000): 
        l += [i] 
        
def test2(): 
    l = [] 
    for i in range(1000): 
        l.append(i) 
        
def test3(): 
    l = [i for i in range(1000)] 
    
def test4(): 
    l = list(range(1000)) 
```
The **timeit** module is designed to allow Python developers to make cross-platform timing measurements by running functions. 

**timeit.Timer(stmt='pass', setup='pass')** 

`Timer` is the class for timing the execution speed of small code snippets. Constructing it returns a `Timer` object, whose `timeit()` method measures the execution time.

- stmt : The code or function call to be executed and timed. 
- setup : Code that must run beforehand so that stmt can execute (for example, imports). The run time of setup is excluded from the overall measurement. 

**Timer.timeit(number=1000000)** 

Times `number` executions of the main statement. This executes the setup statement once, and then returns the total time, in seconds, that it takes to execute the main statement `number` times. 

```Python
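from timeit import Timer

# timeit() returns the total seconds for `number` runs; with number=1000,
# that total equals the average number of milliseconds per run.
t1 = Timer("test1()", "from __main__ import test1")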
print("concat ",t1.timeit(number=1000), "milliseconds")
t2 = Timer("test2()", "from __main__ import test2")
print("append ",t2.timeit(number=1000), "milliseconds")
t3 = Timer("test3()", "from __main__ import test3")
print("comprehension ",t3.timeit(number=1000), "milliseconds")
t4 = Timer("test4()", "from __main__ import test4")
print("list range ",t4.timeit(number=1000), "milliseconds")

# concat  6.54352807999 milliseconds
# append  0.306292057037 milliseconds
# comprehension  0.147661924362 milliseconds
# list range  0.0655000209808 milliseconds
```

## 4.2 Big-O Efficiency of Python List Operators 

|Operation|Big-O Efficiency|
|:---:|:---:|
|index []| O(1) |
|index assignment| O(1) | 
|append|O(1)|
|pop()|O(1)|
|pop(i)|O(n)|
|insert(i, item)|O(n)|
|del operator|O(n)|
|iteration|O(n)|
|contains (in) | O(n)| 
|get slice [x:y]| O(k)|
|del slice| O(n)|
|set slice| O(n+k)|
|reverse| O(n)| 
|concatenate|O(k)|
|sort|O(nlogn)|
|multiply|O(nk)| 

## 4.3 Comparison of time efficiency of list.pop() method 

What we would expect to see is that the time required to pop from the end of the list stays constant even as the list grows in size, while the time to pop from the beginning of the list continues to increase as the list grows. 

```Python
from timeit import Timer

popzero = Timer("x.pop(0)", "from __main__ import x")
popend = Timer("x.pop()", "from __main__ import x")
print("pop(0)   pop()")
for i in range(1000000,100000001,1000000):
    x = list(range(i))
    pt = popend.timeit(number=1000)
    x = list(range(i))
    pz = popzero.timeit(number=1000)
    print(f"{pz:.5f}, {pt:.5f}")
```

We can see that as the list gets longer, the time for `pop(0)` also increases ($O(n)$), while the time for `pop()` stays very flat ($O(1)$).  

![](Img/listpop.png)

# 5. Dictionaries

## 5.1 Big-O Efficiency of Python Dictionary Operators 

One important side note on dictionary performance is that the efficiencies below are for average performance. In rare cases, the get item, set item, and contains operations can degenerate to $O(n)$. 

|Operation|Big-O Efficiency|
|:---:|:---:|
|copy|O(n)| 
|get item|O(1)|
|set item|O(1)|
|delete item|O(1)|
|contains (in)|O(1)| 
|iteration|O(n)| 

## 5.2 Comparison of time efficiency of in operator between list and dictionary 

We will compare the performance of the contains operation between lists and dictionaries. We will confirm that the contains operator for lists is $O(n)$ and the contains operator for dictionaries is $O(1)$. 

```Python
import timeit 
import random 

for i in range(10000, 1000001, 20000): 
    t = timeit.Timer(f"random.randrange({i}) in x", "from __main__ import random, x")
    x = list(range(i)) 
    lst_time = t.timeit(number=1000)
    x = {j:None for j in range(i)} 
    d_time = t.timeit(number=1000)
    print(f"{i}, {lst_time:.3f}, {d_time:.3f}")
```

We can see that the dictionary is consistently faster. We can also see that the time for the contains operator on a list grows linearly with the size of the list, while the time for the contains operator on a dictionary stays constant even as the dictionary grows. 

![](Img/ldcontains.png)