# Asymptotic Analysis & Data Structures

### Topics to discuss today:

<ul>
    <li>What is Asymptotic Analysis?</li>
    <li>Classifying time complexities</li>
    <li>Classifying space complexities</li>
    <li>Implementing a LinkedList</li>
</ul>

### What is Asymptotic Analysis?

Asymptotic analysis sets mathematical bounds on an algorithm's run-time performance. It is used to estimate both time and space complexity.

There are three metrics we measure:
<ul>
<li><b>Best Case</b> − Minimum time required for running.</li>
<li><b>Average Case</b> − Average time required for running.</li>
<li><b>Worst Case</b> − Maximum time required for running.</li>
</ul>

Here are the two major asymptotic notations that we'll be focusing on today:
<ul>
<li>Ο Notation (Big O Notation)</li>
<li>Ω Notation (Omega Notation)</li>
</ul>

#### Big O Notation
Big O notation expresses the <b>upper bound</b> of an algorithm's execution time. It is most often used to describe the <b>worst case</b> time complexity.

#### Omega Notation
Omega notation expresses the <b>lower bound</b> of an algorithm's execution time. It is most often used to describe the <b>best case</b> time complexity.
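To make the best case and worst case concrete, here is a minimal sketch that counts comparisons in a linear search. The function name `linear_search_comparisons` and the sample data are illustrative assumptions, not part of the original lesson.

```python
def linear_search_comparisons(a_list, target):
    """Return how many elements are compared before the search stops."""
    comparisons = 0
    for value in a_list:
        comparisons += 1
        if value == target:
            break
    return comparisons


data = [7, 3, 9, 4, 1]
best = linear_search_comparisons(data, 7)    # target is the first element
worst = linear_search_comparisons(data, 42)  # target is absent, every element is checked
print(best, worst)  # 1 5
```

The best case needs one comparison regardless of the list length, while the worst case needs one comparison per element, so the same algorithm has a constant lower bound and a linear upper bound.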



<table style="text-align:left;" class="table table-bordered">
    <thead>
        <tr>
            <th>Name</th>
            <th>Time Complexity</th>
        </tr>
    </thead>

  <tr>
<td>constant</td>
<td>Ο(1)</td>
</tr>
<tr>
<td>logarithmic</td>
<td>Ο(log n)</td>
</tr>
<tr>
<td>linear</td>
<td>Ο(n)</td>
</tr>
<tr>
<td>n log n</td>
<td>Ο(n log n)</td>
</tr>
<tr>
<td>quadratic</td>
<td>Ο(n<sup>2</sup>)</td>
</tr>
<tr>
<td>cubic</td>
<td>Ο(n<sup>3</sup>)</td>
</tr>
<tr>
<td>polynomial</td>
<td>n<sup>Ο(1)</sup></td>
</tr>
<tr>
<td>exponential</td>
<td>2<sup>Ο(n)</sup></td>
</tr>
</table>
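As a rough illustration of how quickly the classes in the table diverge, the sketch below evaluates a representative function from each class for a few input sizes. The helper name `growth_row` is an assumption made for this example, and the general polynomial row is omitted because it describes a whole family of functions.

```python
import math


def growth_row(n):
    """Return rough operation counts for input size n, one per complexity class."""
    return {
        "constant": 1,
        "logarithmic": math.log2(n),
        "linear": n,
        "n log n": n * math.log2(n),
        "quadratic": n ** 2,
        "cubic": n ** 3,
        "exponential": 2 ** n,
    }


for n in (8, 16, 32):
    print(n, growth_row(n))
```

Doubling `n` adds only one to the logarithmic entry, doubles the linear entry, quadruples the quadratic entry, and squares the exponential entry.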

Extra resources:
https://www.youtube.com/watch?v=0oDAlMwTrLo

##### O(1) Example
No matter the size of the input data, the execution time will always be the same.

In [6]:
a = 10000000
b = 2000000

def is_smaller(a,b):
    return a < b

print(is_smaller(a,b))

False


In [11]:
%timeit is_smaller(10,20)

72.4 ns ± 1.36 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [12]:
%timeit is_smaller(10000000,2000000)

71.4 ns ± 0.354 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [10]:
a_list = [1,2,3,4,5,6,7,8,9,10]

i = 0
for num in a_list:
    print(a_list[0])
    i+= 1
    if i >= 3:
        break

1
1
1


##### O(n) Example
The execution time increases linearly with the length of the input: each additional element adds roughly the same amount of time.

In [13]:
def find_sum(a_list):
    curr_sum = 0
    for num in a_list:
        curr_sum += num
        
    return curr_sum

In [14]:
%timeit find_sum([1,42,12])

216 ns ± 0.998 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [15]:
%timeit find_sum([1,42,12,14])

252 ns ± 0.678 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [16]:
%timeit find_sum([1,42,12,14,1,42,12,14])

350 ns ± 1.82 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [17]:
def find_sum_mean(a_list):
    curr_sum = 0
    # O(n)
    for num in a_list:
        curr_sum += num
        
    curr_avg = 0
    # O(n)
    for num in a_list:
        curr_avg = (num + curr_avg) / 2
        
    return (curr_sum, curr_avg)

# O(n + n) -> O(2n) -> O(n)

In [None]:
def find_sum_mean(a_list):
    curr_sum = 0
    # O(n^2)
    for num in a_list:
        for num2 in a_list:
            pass
        curr_sum += num
        
    curr_avg = 0
    # O(n)
    for num in a_list:
        curr_avg = (num + curr_avg) / 2
        
    return (curr_sum, curr_avg)

# O(n^2 + n) -> O(n^2)

In [18]:
%timeit find_sum_mean([1,42,12])

507 ns ± 4.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [19]:
%timeit find_sum_mean([1,42,12,14])

658 ns ± 4.83 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [20]:
%timeit find_sum_mean([1,42,12,14,1,42,12,14])

962 ns ± 8.62 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


##### O(log(n))
With logarithmic time complexity, the execution time increases linearly as the input size increases exponentially. This usually occurs when the algorithm shrinks the remaining input by a constant factor at each step, as divide-and-conquer algorithms such as binary search do.

Additional Explanations:
https://www.youtube.com/watch?v=wjDY5RbILno
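One way to see the logarithmic growth described above is to count the loop iterations of a binary search rather than time it. The sketch below is an assumption-laden illustration: `binary_search_steps` is a hypothetical helper that searches `range(n)` for a value larger than every element, which forces the worst case.

```python
def binary_search_steps(n):
    """Count loop iterations of a worst-case binary search over range(n)."""
    a_list = range(n)  # sorted values 0..n-1, indexable without building a list
    target = n         # larger than every element, so the search never succeeds
    lower_bound, upper_bound = 0, n - 1
    steps = 0
    while lower_bound <= upper_bound:
        steps += 1
        mid = (lower_bound + upper_bound) // 2
        if a_list[mid] < target:
            lower_bound = mid + 1
        elif a_list[mid] > target:
            upper_bound = mid - 1
        else:
            break
    return steps


for n in (8, 16, 32, 64, 1024):
    print(n, binary_search_steps(n))
```

Each time `n` doubles, the step count grows by exactly one, which is the behaviour O(log(n)) describes.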


In [12]:
def log_func(n):
    curr_product = 1
    iterations = 0
    while curr_product < n:  # also terminates when n is not a power of 2
        curr_product *= 2
        iterations += 1
        
    return iterations

    
        
%timeit log_func(8)

128 ns ± 1.03 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [13]:
def log_func_recursive(n):
    if n <= 1:
        return 0
    
    n = n / 2
    return 1 + log_func_recursive(n)

%timeit log_func_recursive(8)

240 ns ± 4.53 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [30]:
%timeit log_func_recursive(32)

763 ns ± 5.59 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [31]:
%timeit log_func_recursive(33)

889 ns ± 7.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [32]:
%timeit log_func_recursive(64)

892 ns ± 2.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


##### O(n^2) Example
Quadratic time arises when an algorithm performs a linear-time operation for each value in the input data.

In [22]:
a_list = [1,2,3,4]
b_list = [1,2,3]

for num1 in a_list:
    for num2 in b_list:
        print(num1, num2)

1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
4 1
4 2
4 3


In [23]:
def make_pair(a_list, b_list):
    res_list = []
    for num1 in a_list:
        for num2 in b_list:
            res_list.append((num1, num2))
            
    return res_list

In [24]:
%timeit make_pair([1,2,3], [4,5,6])

996 ns ± 2.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [25]:
%timeit make_pair([1,2,3,4], [4,5,6,7])

1.45 µs ± 5.03 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [26]:
%timeit make_pair([1,2,3,4,5,6], [4,5,6,7,8,9])

2.74 µs ± 16.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


### In-Class Exercise
In a comment in the following three cells, classify each algorithm into one of the time complexities discussed above.

In [2]:
def two_sum_loops(nums, target):
    for i, num in enumerate(nums):
        for j, num2 in enumerate(nums[i + 1:]):
            if target - num == num2:
                return [i,j+i+1]

In [3]:
def two_sum(nums, target):
    d={}
    for i, num in enumerate(nums):
        if target - num in d:
            return [d[target-num],i]
        d[num]=i
    return -1

In [4]:
def check_if_num_in_list(a_list, value):
    return value in a_list

## Space Complexity
Space complexity refers to the total amount of memory consumed by an algorithm. This includes both the input values and any new values the algorithm creates.

We'll use Big O notation for space complexity as well. In this case, Big O gives the worst case of an algorithm's growth rate.

"The space this algorithm takes will grow no more quickly than this, but it could grow more slowly."

###### O(1) Example

In [33]:
# O(1) + O(1) => O(1 + 1) => O(1)
def make_sum(a,b):
    return a + b

In [34]:
input1 = 10 # O(1)
input2 = 35 # O(1)

make_sum(input1, input2)

45

###### O(n) Example


In [38]:
#             O(1)
def make_list(number_to_add):
    # Start of Auxiliary Space
    res_list = [] # O(n)
    
    for num in range(number_to_add):
        res_list.append(num)
        
    return res_list
    # End of Auxiliary Space

In [40]:
# Input space: O(1)
# Auxiliary space: O(n)

# Total space: O(1 + n) => O(n)

In [36]:
print(make_list(25))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]


In [37]:
print(make_list(35))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
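The O(n) auxiliary space of `make_list` can also be observed empirically. The following is a minimal sketch using the standard-library `tracemalloc` module; the helper name `peak_bytes` is an assumption for this example, and the exact byte counts depend on the Python version and platform, so only the trend matters.

```python
import tracemalloc


def make_list(number_to_add):
    res_list = []
    for num in range(number_to_add):
        res_list.append(num)
    return res_list


def peak_bytes(number_to_add):
    """Return the peak number of bytes allocated while make_list runs."""
    tracemalloc.start()
    make_list(number_to_add)
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return peak


for n in (1_000, 10_000, 100_000):
    print(n, peak_bytes(n))
```

The peak should grow roughly in proportion to `number_to_add`, even though the input itself is a single integer.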


Input Space: O(n) <- This comes from a_list in the input
Auxiliary Space: O(1) <- The only variables created in the function are integers

Total Space: O(n + 1) or O(n)

In [48]:
#                 O(n)    O(1)   => O(n + 1) => O(n)
def binary_search(a_list, target):
    lower_bound = 0 # O(1)
    upper_bound = len(a_list) - 1 # O(1)
    mid = None # O(1)
    found = False # O(1)
    
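    # Stop when the target is found or the search range is empty
    while not found and lower_bound <= upper_bound:
        mid = (lower_bound + upper_bound) // 2
        
        if a_list[mid] > target:
            upper_bound = mid - 1
        elif a_list[mid] < target:
            lower_bound = mid + 1
        else:
            found = True
            
    return mid if found else -1  # -1 signals that the target is absent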
    # Auxiliary Space => O(1 + 1 + 1 + 1) => O(1)

In [None]:
# O(n + 1) => O(n)

In [46]:
binary_search([1,2,3,4,5], 4)

3

In [None]:
[1,2,3,4,5] # Looking for 4

The recursive calls generate new function calls in the stack. Each call on the stack stores a separate copy of the variables defined in the function. The array is passed by reference so a separate copy of the array is not created for each function call. As we can have O(log(n)) calls to the function, the space complexity of the recursive version should include the O(log(n)) auxiliary space. Hence, the overall space complexity is:

Input space: O(n)
Auxiliary space: O(log n)

Total Space: O(n + log n) OR O(n)
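To check the O(log(n)) stack depth claimed above, the sketch below instruments a recursive binary search so that it returns the deepest call level reached instead of an index. The function `search_depth` and its extra parameters are assumptions made for this illustration; it passes index bounds rather than slices so that the list is never copied.

```python
def search_depth(a_list, target, lower_bound=0, upper_bound=None, depth=1):
    """Recursive binary search that returns the deepest call level reached."""
    if upper_bound is None:
        upper_bound = len(a_list) - 1
    if lower_bound > upper_bound:
        return depth  # search range is empty, target is absent
    mid = (lower_bound + upper_bound) // 2
    if a_list[mid] == target:
        return depth
    if a_list[mid] > target:
        return search_depth(a_list, target, lower_bound, mid - 1, depth + 1)
    return search_depth(a_list, target, mid + 1, upper_bound, depth + 1)


for n in (8, 64, 1024):
    # Searching for n, which is absent, forces the deepest recursion
    print(n, search_depth(list(range(n)), n))
```

Growing the list from 8 to 1024 elements multiplies the input size by 128 but adds only seven stack frames.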

In [42]:
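# Calls to the function: O(log(n)) stack frames
# Input: O(n)
def binary_search_recursive(a_list, target, lower_bound=0, upper_bound=None):
    # Index bounds are passed instead of slices so the list is never copied
    if upper_bound is None:
        upper_bound = len(a_list) - 1
    if lower_bound > upper_bound:
        return -1  # target is not in the list

    mid = (lower_bound + upper_bound) // 2 # O(1)
    
    if a_list[mid] == target:
        return mid
    elif a_list[mid] > target:
        return binary_search_recursive(a_list, target, lower_bound, mid - 1)
    else:
        return binary_search_recursive(a_list, target, mid + 1, upper_bound)
    
    
binary_search_recursive([1,5,8,9, 10], 10)

4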

In [41]:
# O(log(n) + n) => O(n)

In [None]:
#HOME WORK

In [20]:
def sum_factorial(lst):
  total = 0         #O(1)
  for num in lst:   #O(n)
    count = 1       #O(1)
    while num >= 1: #O(m), where m is the value of num
      count *= num  #O(1)
      num -= 1      #O(1)
    total +=count   #O(1)
  return total      #O(1)
%timeit sum_factorial([4, 6])
#Time Complexity O(n * m), where m is the largest value in lst

450 ns ± 24.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [22]:
def number_of_pairs(gloves):
    colors = {}  #O(1)
    for glove in gloves: # O(n)
        if glove not in colors.keys(): # O(1) on average (dictionary key lookup)
            colors[glove] = 1 #O(1)
        else:   
            colors[glove] += 1 #O(1)
    
    pair_count = 0 #O(1)
    for key in colors.keys(): #O(n)
        pair_count += colors[key]//2 #O(1)
    return pair_count #O(1)
%timeit number_of_pairs(["red", "green", "red", "blue", "blue"])
#Time Complexity O(n)

618 ns ± 22.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [24]:
def is_pangram(s):
    string = s.lower() #O(n)
    # Named letters rather than list to avoid shadowing the built-in
    letters =['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'] #O(1)
    for i in string: #O(n)
        if i in letters: #O(1), letters never holds more than 26 items
            letters.remove(i) #O(1), for the same reason
    if len(letters) == 0: #O(1)
        return True #O(1)
    else:
        return False #O(1)
%timeit is_pangram("hello everyone ele")
#Time Complexity O(n)

3.64 µs ± 40.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [5]:
def is_triangle(a, b, c):
    if a == 0 or b== 0 or c== 0: #O(1)
        return False #O(1)
    import math 
    cosA = (b**2 + c**2 - a**2) / (2 * b * c) #O(1)
    cosB = (a**2 + c**2 - b**2) / (2 * a * c) #O(1)
    cosC = (a**2 + b**2 - c**2) / (2 * a * b) #O(1)
    if cosA >= -1 and cosB >= -1 and cosC >= -1 and cosA <= 1 and cosB <= 1 and cosC <= 1 and a + b != c and a + c!= b and b + c!= a and math.acos(cosA) + math.acos(cosB) + math.acos(cosC) >= 3.1415926535897 and math.acos(cosA) + math.acos(cosB) + math.acos(cosC) <= 3.1415926535897999: #O(1)
        return True #O(1)
    else:
        return False #O(1)
%timeit is_triangle(3,4,5)
#Time Complexity O(1)

902 ns ± 10.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [9]:
def sum_two_smallest_numbers(l):
    list = [] #O(1)
    list.append(min(l))  #O(n)
    l.remove(min(l))    #O(n)
    list.append(min(l)) #O(n)
    total = sum(list)   #O(1), list always holds exactly two items
    return total        #O(1)
%timeit sum_two_smallest_numbers([2,5,2,6,43,15,62,0,-5])
#Time Complexity O(n)

583 ns ± 5.23 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [10]:
def find_outlier(l):
    even = 0 #O(1)
    odd = 0 #O(1)
    even_list = [] #O(1)
    odd_list = [] #O(1)
    for num in l: #O(n)
        if num % 2 == 0: #O(1)
            even += 1 #O(1)
            even_list.append(num) #O(1) amortized
        else:
            odd += 1    #O(1)
            odd_list.append(num)    #O(1) amortized
    if even > odd:  #O(1)
        for num in odd_list:    #O(n)
            return (num)    #O(1)
    else:   #O(1)
        for num in even_list:   #O(n)
            return (num)    #O(1)
%timeit find_outlier([5,3,7,2,8,93,24,52])
#Time Complexity O(n)

542 ns ± 14.5 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
