Problem: Suppose we search for the number 100 in a sorted list of 100 elements ranging from 1 to 100.
We could search from the start of the list to the end:

-  [1,2,3,4,...,100]
-  [2,3,4,...,100] 1. step (too low, go on)
-  [3,4,...,100] 2. step (too low, go on)
-  [4,...,100] 3. step (too low, go on)
- ...
- [99, 100] 99. step
- [100] 100. step (FOUND!)

We have an algorithm with a time complexity of $O(N)$: it takes 100 steps (operations) to find the number 100 in this list.

- $O(N)$ --> linear complexity       $f(N) = N$
- $O(1)$ --> constant complexity     $f(N) = 1$
- $O(N * N)$ --> quadratic complexity $f(N) = N^{2}$
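
The linear scan described above can be sketched as follows. This is only an illustration: the function name `simple_search` and the convention of counting one step per comparison are assumptions, not a fixed definition.

```python
def simple_search(sorted_list, item):
    """Check the elements one by one; return (found, steps)."""
    steps = 0
    for element in sorted_list:
        steps += 1              # one comparison per element
        if element == item:
            return True, steps
    return False, steps

numbers = list(range(1, 101))
print(simple_search(numbers, 100))   # worst case: every element is checked
print(simple_search(numbers, 1))     # best case: the first element matches
```

The number of steps grows in proportion to the length of the list, which is what $O(N)$ expresses.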


In [None]:
l1 = list(range(1,101))
length_step1 = len(l1)
length_step1

In [None]:
middle = int((len(l1))/2)
middle
print(l1[middle])
l2 = l1[:middle]
print(*l2)

In [None]:
middle = int(len(l2)/2)
middle

 Introduction

 - an algorithm (*algo*) is a set of instructions
 - there are different algos for the same task
 - each algo has a trade-off


In [None]:
middle = int((len(l2))/2)
middle
print(l2[middle])
l3 = l2[:middle]
print(*l3)

In [None]:
middle = int(len(l3)/2) + 1   # int(25 / 2) = 12  but the middle of 25 is 13 therefore +1
l3[middle]
l4 = l3[:middle]
print(*l4)

In [None]:
middle = int(len(l4)/2)
middle
l4[middle] 

In total it took us four steps:
1. step (100 -> 50)
2. step (50 -> 25)
3. step (25 -> 12)
4. step (found)

With a simple search it would take us 57 steps.

With each step of binary search, you cut the remaining numbers in half until you find your number.

In [None]:
#def binary_search(my_list_to_search, item_to_find):
    #algo
    #return True if found, otherwise False

## Exercise:

- Write the simple search algo to find an item in a sorted list.
- Now do the same with the binary search algo.

- ```simple_search([1,2,....,99,100], 3)  ->true, number of steps```

- ```simple_search([1,2,....,99,100], 101)  ->false, number of steps```

- ```binary_search([1,2,....,99,100], 3)  ->true, number of steps```

- ```binary_search([1,2,....,99,100], 101)  ->false, number of steps```

- Compare the number of steps of both algos.

- Run the binary_search algo on sorted lists of the following *sizes*: [100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]
- each list should start with 1
- search each list for its last element
- save the number of steps for each list size in a *results* list

Plot your results:
- the x-axis will represent your list size
- the y-axis will represent your number of steps

Result:
![](big_o_binary_search.png)

data: [(100, 7), (1000, 10), (10000, 14), (100000, 17), (1000000, 20)]

Binary search takes only 20 steps for 1 million items in a sequence.
In general, for any list of size n, binary search takes about $log_{2}n$ steps in the worst case, whereas simple search takes n steps.
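
A minimal sketch of binary search with a step counter, assuming one step per loop iteration and that the middle index is rounded down. The function name follows the exercise; the counting convention is an assumption.

```python
def binary_search(sorted_list, item):
    """Halve the search range each step; return (found, steps)."""
    low, high = 0, len(sorted_list) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        guess = sorted_list[mid]
        if guess == item:
            return True, steps
        if guess < item:
            low = mid + 1       # too low: discard the left half
        else:
            high = mid - 1      # too high: discard the right half
    return False, steps

for size in [100, 1_000, 10_000, 100_000, 1_000_000]:
    numbers = range(1, size + 1)                  # a range can be indexed like a sorted list
    found, steps = binary_search(numbers, size)   # search for the last element
    print(size, steps)
```

With this way of counting, the loop reports the same step counts as the data listed above.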

## Logarithms

$log_{10} 100$ is like asking 'How many times do we have to multiply 10 by itself to get 100?'
The answer is: 10 * 10, therefore 2 times. $log_{10} 100 = 2$, because $10^{2} = 100$.
* Logs are flips of exponentials. In other words, the log function is the inverse of the exponential function.
 

|exponential|log|
|:--:|:--:|
|$2^{3}=8$|3 = $log_{2}8$|
|$2^{4}=16$|4 = $log_{2}16$|
|$2^{5}=32$|5 = $log_{2}32$|

Therefore, for a list of 32 elements you'd have to check about 5 numbers at most (give or take one, depending on how the steps are counted).
-  A list of 32 elements can be halved at most 5 times. That is the worst case.
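
The link between halving and the base-2 logarithm can be checked with a few lines. The helper `count_halvings` is a hypothetical name introduced only for this illustration.

```python
import math

def count_halvings(n):
    """How many times can n be halved (integer division) before reaching 1?"""
    halvings = 0
    while n > 1:
        n //= 2
        halvings += 1
    return halvings

for n in [8, 16, 32]:
    print(n, count_halvings(n), math.log2(n))
```

For powers of two, the number of halvings and `math.log2` agree exactly.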

### Running time

In simple search the maximum number of guesses is the same as the size of the list. This is called *linear time*. Binary search runs in *logarithmic time*.

|Simple search|Binary search|
|:--:|:--:|
|4,000,000,000 items -> 4,000,000,000 guesses|4,000,000,000 items -> 32 guesses|
|$O(N)$|$O(log N)$|


- Big O notation tells you how fast an algo is (in the worst case)
- Big O notation lets you compare the number of operations
- In $O(N)$, the expression in parentheses is the number of operations as a function of the input size $N$.

## Selection Sort

In [None]:
l = [156, 141, 35, 94, 88, 61, 111]

def findSmallest(arr):
    smallest = arr[0]
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_index = i
    return smallest_index

def selectionSort(arr):
    newArr = []
    for i in range(len(arr)):
        smallest_index = findSmallest(arr)
        newArr.append(arr.pop(smallest_index))
    return newArr

#selectionSort(l)
findSmallest(l)


This takes $O(n * n)$ time or $O(n^{2})$.
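
To see where the quadratic growth comes from, the comparisons can be counted. The sketch below is a variant of the code above that works on a copy and returns the count; `selection_sort_count` is a name made up for this illustration.

```python
def selection_sort_count(values):
    """Selection sort on a copy; return (sorted_list, comparisons)."""
    remaining = list(values)    # copy so the caller's list is left untouched
    result = []
    comparisons = 0
    while remaining:
        smallest_index = 0
        for i in range(1, len(remaining)):
            comparisons += 1
            if remaining[i] < remaining[smallest_index]:
                smallest_index = i
        result.append(remaining.pop(smallest_index))
    return result, comparisons

print(selection_sort_count([156, 141, 35, 94, 88, 61, 111]))
```

For 7 elements this makes 6 + 5 + 4 + 3 + 2 + 1 + 0 = 21 comparisons; in general n(n-1)/2, which grows like $n^{2}$.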

# Recursion

Recursion is a coding technique used in many algos. For example in the bubblesort algo, you had to use recursion.

A common example to demonstrate recursion is the factorial function: 5! = 1 * 2 * 3 * 4 * 5

In [None]:
def fac(n):
    result = 1
    for i in range(1,n+1):
        result *= i
    return result

fac(5)

In [None]:
def fac(n):
    if n <= 1:     # stopping condition; base case (also covers n = 0)
        return 1
    else:           # recursive case
        result = n * fac(n-1)
        return result

fac(5)

fac(5) --> 5 * fac(4) --> 5 * ( 4 * fac(3) ) --> 5 * ( 4 * ( 3 * fac(2) ) ) --> 5 * ( 4 * ( 3 * ( 2 * fac(1) ) ) )
--> 5 * ( 4 * ( 3 * ( 2 * 1 ) ) ) = 5 * 4 * 3 * 2 * 1 = 5! = 120

### The Stack

A stack is like a stack of sticky notes. When you add an element, it gets added to the top of the sticky notes.
When you read an item, you only read the topmost item, and it's taken off the stack of sticky notes.

A stack works like the sticky notes and is a simple data structure.
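
In Python, a plain list can play the role of the stack of sticky notes. This is just one possible sketch: `append` pushes onto the top and `pop` takes the topmost item off.

```python
stack = []                  # an empty stack of sticky notes
stack.append("note 1")      # push: the note goes on top
stack.append("note 2")
stack.append("note 3")

top = stack.pop()           # pop: read the topmost note and take it off
print(top)                  # note 3
print(stack)                # ['note 1', 'note 2']
```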

### The Call Stack
The computer internally uses a stack called the *call stack*:


In [None]:
def greet2(name):
    print('greet2')
    return 'whatever'

def bye():
    print('bye')

def greet(name):
    greet2(name)
    bye()

greet('hannah')

 - The variable name with the value 'hannah' is saved in memory
 - every time you make a function call, the values of all variables for that call are stored in memory
 - after greet2() returns, you make a further function call (bye()); again, memory for this call and all associated variables is allocated
 - when greet2() or bye() is done, its box is popped off the top of the call stack
 - when greet2() was called, greet() was only *partially completed*

 When you call a function from within another function, the calling function is paused in a *partially completed state*.

 ![](call_stack_box.png)

fac(3) --> 3 * fac(2) --> 3 * ( 2 * fac(1) ) --> 3 * 2 * 1
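
The growth and shrinking of the call stack during this recursion can be made visible with a simulated stack. The real call stack is managed by the interpreter; the list `call_stack` and the function `fac_traced` below are stand-ins invented for this sketch.

```python
def fac_traced(n, call_stack, log):
    """Recursive factorial that records pushes and pops of a simulated call stack."""
    call_stack.append(f"fac({n})")          # a new box goes on top
    log.append(list(call_stack))
    result = 1 if n <= 1 else n * fac_traced(n - 1, call_stack, log)
    call_stack.pop()                        # the finished call is popped off
    log.append(list(call_stack))
    return result

log = []
print(fac_traced(3, [], log))
for snapshot in log:
    print(snapshot)
```

The snapshots first grow to three boxes, then shrink again as each partially completed call finishes.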

# Divide and Conquer

- D & C is a general technique used for algos

Suppose you want to divide a plot of land into smaller plots that are:

- square
- all the same size
- as big as possible

In [None]:
32 % 12

In [None]:
12 % 8

In [None]:
8 % 4

In [None]:
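def hcf(a, b):
    # Euclid's algorithm: highest common factor of a and b
    remainder = a % b
    if remainder == 0:        # base case: b divides a evenly
        return b
    else:                     # recursive case: continue with (b, remainder)
        print(remainder)      # show the intermediate remainder
        return hcf(b, remainder)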

hcf(32, 12)

hcf(32, 12) --> hcf(12, 8) --> hcf(8, 4) --> 4, since 8 % 4 == 0
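
The same idea can also be written as a loop. The sketch below assumes the plot is given by two positive integer side lengths; `largest_square_side` is a hypothetical name, and `math.gcd` from the standard library serves as a cross-check.

```python
import math

def largest_square_side(width, height):
    """Side of the largest square that tiles a width x height plot evenly."""
    while height != 0:
        width, height = height, width % height   # same step as the % cells above
    return width

print(largest_square_side(32, 12))    # 4
print(largest_square_side(168, 64))   # 8
print(math.gcd(168, 64))              # the standard library agrees
```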

In [None]:
hcf(168, 64)

![](divide_conquer.png)

## Quicksort

- uses D & C

[33, 15, 10]

Break this list down until you reach the base case.

1. Pick one element from the list. This element is called the pivot, for example 33.
2. Find the elements smaller than the pivot and the elements larger than the pivot.

smaller: ```[15,10]``` bigger: ```[]```

This is called partitioning. We have now:

- a sub-list of all numbers less than the pivot
- the pivot
- a sub-list of all numbers greater than pivot

If they were sorted we could write: *left list + pivot + right list* 
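
The partitioning step on its own might look like this. It is a sketch that assumes the first element is used as the pivot; `sorted()` stands in for the recursive sorting of the two sub-lists.

```python
def partition(values):
    """Split values into (smaller, pivot, larger), using the first element as pivot."""
    pivot = values[0]
    rest = values[1:]
    smaller = [v for v in rest if v <= pivot]
    larger = [v for v in rest if v > pivot]
    return smaller, pivot, larger

smaller, pivot, larger = partition([33, 15, 10])
print(smaller, pivot, larger)                        # [15, 10] 33 []
print(sorted(smaller) + [pivot] + sorted(larger))    # [10, 15, 33]
```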

In [None]:
[10, 15] + [33] + []

In [None]:
def quicksort():
    pass

#quicksort([15, 10]) + [33] + quicksort([])

In [None]:
def quicksort(array):
    print(len(array))
    if len(array) < 2:
        return array
    else:
        pivot = array[1]
        less = [ element for element in array[:1] + array[2:] if element <= pivot ]  #why do we need the <=?
        greater = [ element for element in array[:1] + array[2:] if element > pivot ]
        print(f"less: {less}")
        print(f"pivot:  {pivot}")
        print(f"greater:{greater}")
    return quicksort(less) + [pivot] + quicksort(greater)


quicksort([10,5,2,3,4,4])





Quicksort: worst case: $O(N^{2})$, average: $O(N * log N)$

Mergesort: worst case: $O(N * log N)$



In [None]:
import matplotlib.pyplot as plt
import math

def plot(sizes, results):
    plt.plot(sizes, results, marker='x')
    plt.xlabel('size of list')
    plt.ylabel('steps')
    plt.legend()
    plt.show()

sizes = [*range(1, 5)]
logN = [math.log(ele) for ele in sizes ]
NlogN = [ele * math.log(ele) for ele in sizes ]
N = [ele for ele in sizes ]
N_square = [ele**2 for ele in sizes]
N_exp =  [math.exp(ele) for ele in sizes]
N_fac = [fac(ele) for ele in sizes]

plt.plot(sizes, logN, label='logN')
plt.plot(sizes, N, label='N')
plt.plot(sizes, NlogN, label='NlogN')
plt.plot(sizes, N_square, label='N_square')
plt.plot(sizes, N_exp, label='N_exp')
plt.plot(sizes, N_fac, label='N!')
plt.xlabel('size of list')
plt.ylabel('steps')
plt.legend()
plt.show()

|Array size| log N | N | N log N | $N^{2}$|
|:--:|:--:|:--:|:--:|:--:|
|10|0.3s|1s|3.3s|10s|
|100|0.6s|10s|66.4s|16.6min|
|1000|1s|100s|996s|27.7hours|
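
The times in the table can be reproduced approximately under the assumption of 10 operations per second and base-2 logarithms. That rate is inferred from the row where N = 10 takes 1 s; it is not stated in the table itself.

```python
import math

OPS_PER_SECOND = 10   # assumption inferred from the table: N = 10 takes 1 s

def seconds(operations):
    """Running time for the given number of operations."""
    return operations / OPS_PER_SECOND

for n in [10, 100, 1000]:
    print(
        n,
        round(seconds(math.log2(n)), 1),        # log N
        round(seconds(n), 1),                   # N
        round(seconds(n * math.log2(n)), 1),    # N log N
        round(seconds(n ** 2), 1),              # N^2
    )
```

Small differences to the table are probably due to truncation instead of rounding.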

Suppose the two algorithms have these constants:

|Simple search|Binary search|
|:--:|:--:|
|10ms * N|1s * log(N)|

- Simple search with N = 4 billion --> 10ms * 4 billion = 463 days
- Binary search with N = 4 billion --> 1s * log(4 billion) = 32 seconds
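
The arithmetic behind these two numbers, assuming the logarithm is taken to base 2:

```python
import math

n = 4_000_000_000
simple_seconds = 0.010 * n              # 10 ms per element
binary_seconds = 1.0 * math.log2(n)     # 1 s per halving step, log taken base 2

print(simple_seconds / 86_400)   # in days (86,400 seconds per day)
print(binary_seconds)            # in seconds
```

Even with a constant that is 100 times larger, binary search is far faster for large N.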


Quicksort: worst case: $O(N^{2})$, average: $O(N * log N)$

Mergesort: worst case: $O(N * log N)$


For mergesort and quicksort the constant makes a difference.
Quicksort hits the average case far more often than the worst case.
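
The gap between the worst and the average case can be observed by counting pivot comparisons. The sketch assumes the first element is always chosen as pivot, which makes an already sorted list the worst case; `quicksort_count` is a name invented for this illustration.

```python
import random

def quicksort_count(values):
    """Quicksort with the first element as pivot; return (sorted_list, comparisons)."""
    if len(values) < 2:
        return list(values), 0
    pivot, rest = values[0], values[1:]
    less = [v for v in rest if v <= pivot]
    greater = [v for v in rest if v > pivot]
    sorted_less, c_less = quicksort_count(less)
    sorted_greater, c_greater = quicksort_count(greater)
    # count one pivot comparison per remaining element at this level
    return sorted_less + [pivot] + sorted_greater, len(rest) + c_less + c_greater

n = 200
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)          # fixed seed so the run is repeatable

print(quicksort_count(already_sorted)[1])   # n * (n - 1) / 2 comparisons
print(quicksort_count(shuffled)[1])         # far fewer on a shuffled list
```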


### The traveling salesperson: O(N!)

No fast exact algorithm is known, and some people believe that none exists.
It is one of the unsolved problems in computer science.
For large inputs, the best we can do is come up with an approximation.

#### Exact algorithms (see Wikipedia)
The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest (using brute-force search). The running time for this approach lies within a polynomial factor of $O(n!)$, the factorial of the number of cities, so this solution becomes impractical even for only 20 cities.
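
A brute-force search over all permutations could be sketched as follows. The distance matrix is made up for illustration, and the names `DIST`, `tour_cost` and `brute_force_tsp` are hypothetical.

```python
from itertools import permutations

# Hypothetical symmetric distances between 4 cities (made-up numbers).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_cost(tour):
    """Length of the round trip that visits the cities in order and returns home."""
    return sum(DIST[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def brute_force_tsp(n):
    # Fix city 0 as the start and try every ordering of the remaining cities.
    tours = ([0] + list(p) for p in permutations(range(1, n)))
    return min(tours, key=tour_cost)

best = brute_force_tsp(4)
print(best, tour_cost(best))
```

With 4 cities only 3! = 6 orderings are tried; with 20 cities it would already be 19!, which is why this approach does not scale.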

#### Non-exact algorithms
Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time; with high probability these solutions are just 2–3% away from the optimal one.

- nearest neighbour (NN) algorithm
- The Algorithm of Christofides and Serdyukov
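
As an illustration of the first heuristic, here is a minimal nearest-neighbour sketch. The distance matrix is made up, and the function name is hypothetical. The heuristic is fast but does not guarantee the shortest tour in general.

```python
# Hypothetical symmetric distances between 4 cities (made-up numbers).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def nearest_neighbour_tour(dist, start=0):
    """Greedy heuristic: always travel to the closest city not yet visited."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda city: dist[here][city])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

tour = nearest_neighbour_tour(DIST)
cost = sum(DIST[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))
print(tour, cost)
```

Each step only scans the cities that are still unvisited, so the running time is roughly quadratic in the number of cities instead of factorial.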