# Introduction to Algorithms

- search algorithm (binary search)
- run time of an algorithm (binary search)
- run time of algorithms (Big O notation)
                                        


## Introduction 

- an *algorithm* (*algo*) is a set of instructions for accomplishing a task
- there are different *algo*s for the same task
- each algo has trade-offs
- we will compare these trade-offs (for example: merge sort vs. quick sort)

Problem: Suppose we search for the number 100 in a sorted list of 100 elements ranging from 1 to 100.
We could check the elements one by one, from the start of the list to the end. Each line shows what is left to search after a step:
-  [1,2,3,4,...,100]
-  [2,3,4,...,100] step 1 (1 is too low, go on)
-  [3,4,...,100] step 2 (2 is too low, go on)
-  [4,...,100] step 3 (3 is too low, go on)
- ...
- [100] step 99 (99 is too low, go on)
- step 100 (100: FOUND!)

This algorithm has a time complexity of $O(N)$: in the worst case it checks every element, so it takes 100 steps to find 100 in this list.
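The element-by-element search described above can be sketched in a few lines of Python. This is only an illustrative version: the name `simple_search` is taken from the exercise further down, and the choice to return a `(found, steps)` tuple is an assumption about how the steps should be reported.

```python
def simple_search(items, target):
    """Check the items one by one; return (found, steps)."""
    steps = 0
    for item in items:
        steps += 1  # one step per element we look at
        if item == target:
            return True, steps
    return False, steps


numbers = list(range(1, 101))        # [1, 2, ..., 100]
print(simple_search(numbers, 100))   # (True, 100)
print(simple_search(numbers, 3))     # (True, 3)
print(simple_search(numbers, 101))   # (False, 100)
```

The number of steps grows in direct proportion to the position of the item, and a missing item always costs a full pass over the list.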



## A better way to search - Binary Search

- suppose we have the list [1,2,3,4,...,49,50,51,...,100] and we want to search for the number 57:

1. Start in the middle. Guess: 50. Too low, but you just eliminated *half* the numbers! Now you know that 1 to 50 are all too low.
[51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]
2. Next guess: 75. Too high, but you cut away half of the remaining numbers again!
[51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74]
3. Next guess: again the middle of the remaining numbers, and so on until you reach 57.

*With binary search, you guess the middle number and eliminate half the remaining numbers every time.*

The cells below carry out this halving in Python. Because they pick the middle with integer division on slices, their guesses differ slightly from the ones above (76, 63, 57).


In [45]:
l1 = [51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100]
length_step1 = len(l1)
length_step1

50

In [46]:
print(l1[len(l1)//2])  # middle element: 76 > 57, too high ---> keep only the lower half
l2 = l1[:len(l1)//2]
print(*l2)
length_step2 = len(l2)
length_step2

76
51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75


25

In [47]:
print(l2[len(l2)//2]) #63 > 57
l3 = l2[:len(l2)//2]
print(*l3)
length_step3 = len(l3)
length_step3

63
51 52 53 54 55 56 57 58 59 60 61 62


12

In [48]:
print(l3[len(l3)//2]) # 57 == 57  #found


57


In total it took us four steps: step 1 (100 -> 50), step 2 (50 -> 25), step 3 (25 -> 12), step 4 (**found**)! Simple search would have taken 57 steps.

With each step of binary search, you cut the remaining numbers in half until you find your number.
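The halving procedure can also be written as a function that tracks a `low` and a `high` index instead of slicing the list. The sketch below is one common way to do it, not necessarily the one intended by these notes; the `(found, steps)` return value is an assumption. Because it picks the middle by index, the exact number of steps for 57 differs from the four-step walkthrough above, but it stays within the same logarithmic bound.

```python
def binary_search(items, target):
    """Search a sorted sequence by halving; return (found, steps)."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2   # index of the middle element
        guess = items[mid]
        if guess == target:
            return True, steps
        if guess < target:
            low = mid + 1         # too low: discard the lower half
        else:
            high = mid - 1        # too high: discard the upper half
    return False, steps


numbers = list(range(1, 101))        # [1, 2, ..., 100]
print(binary_search(numbers, 57))    # (True, 6)
print(binary_search(numbers, 100))   # (True, 7)
print(binary_search(numbers, 101))   # (False, 7)
```

Working with indices avoids copying half of the list on every step, which slicing would do.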

## Exercise:

- Write the simple search algo to find an item in a sorted list.
- Now do the same with the binary search algo.

- ```simple_search([1,2,....,99,100], 3)  ->true, number of steps```

- ```simple_search([1,2,....,99,100], 101)  ->false, number of steps```

- ```binary_search([1,2,....,99,100], 3)  ->true, number of steps```

- ```binary_search([1,2,....,99,100], 101)  ->false, number of steps```

- Compare the number of steps of both algos.

- run the binary_search algo on sorted lists of the following *sizes*: [100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]
- each list should start with 1
- search each list for its last element
- save the number of steps for each list size in a *results* list

Plot your results:
- the x-axis will represent your list size
- the y-axis will represent your number of steps

Results:

![](big_o_binary_search.png)

data: [(100, 7), (1000, 10), (10000, 14), (100000, 17), (1000000, 20)]

Binary search takes only 20 steps for a list of 1 million elements.

In general, for a list of $n$ elements, binary search takes about $\log_{2} n$ steps in the worst case, whereas simple search takes $n$ steps.
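As a quick sanity check, the measured step counts can be compared with the logarithm. The sketch below assumes that each step checks exactly one middle element, in which case the worst case is $\lfloor \log_{2} n \rfloor + 1$ steps; the `data` list is copied from the results above.

```python
import math

# (list size, measured steps) pairs copied from the results above
data = [(100, 7), (1000, 10), (10000, 14), (100000, 17), (1000000, 20)]

for size, steps in data:
    predicted = math.floor(math.log2(size)) + 1
    print(f"n = {size:>7}: measured {steps:>2} steps, floor(log2 n) + 1 = {predicted}")
```

For every size in the data the measured value and the prediction agree.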

### Logarithms

$\log_{10} 100$ is like asking, 'How many 10s do we have to multiply together to get 100?'
The answer is 2, because $10 \cdot 10 = 100$. Therefore $\log_{10} 100 = 2$.
Logs are the flip of exponentials. You can also say that the log function is the inverse of the exponential function.
$100 = 10^{2}$


|exponential|log|
|:--:|:--:|
|$2^{3}=8$|$\log_{2} 8 = 3$|
|$2^{4}=16$|$\log_{2} 16 = 4$|
|$2^{5}=32$|$\log_{2} 32 = 5$|

Therefore, for a list of 32 elements you'd have to check about 5 numbers at most.
-  A list of 32 elements can be halved at most 5 times before only one element is left (checking that last element is one more step). That is the worst case.
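The link between halving and the logarithm can be checked directly. In this small sketch, `count_halvings` is a hypothetical helper that counts how often a number can be halved before it reaches 1; the result is compared with `math.log2`.

```python
import math


def count_halvings(n):
    """Return how many times n can be halved (integer division) before reaching 1."""
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count


for n in (8, 16, 32):
    print(n, count_halvings(n), math.log2(n))
# 8 3 3.0
# 16 4 4.0
# 32 5 5.0
```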

### Running time

In simple search the maximum number of guesses is the same as the size of the list. This is called *linear time*.
Binary search runs in logarithmic time.

|Simple search|Binary search|
|:--:|:--:|
|4,000,000,000 items -> 4,000,000,000 guesses|4,000,000,000 items -> 32 guesses|
|$O(N)$|$O(\log N)$|

- Big O notation tells you how fast an algo is.
- It does not give a time in seconds; it lets you compare the number of operations.
- It tells you how fast the number of operations grows as the input grows.
- $O(N)$ means the number of operations grows in proportion to the input size $N$.
- In these notes, Big O describes the worst-case scenario.
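To get a feel for how differently $O(N)$ and $O(\log N)$ grow, the worst-case guess counts can be tabulated for a few list sizes. This is a rough sketch: `worst_case_guesses` is a hypothetical helper, and rounding $\log_{2} N$ up is used as an approximation of the binary search worst case.

```python
import math


def worst_case_guesses(n):
    """Return (simple search, binary search) worst-case guesses, rounding log2 up."""
    return n, math.ceil(math.log2(n))


for n in (100, 1_000_000, 4_000_000_000):
    linear, logarithmic = worst_case_guesses(n)
    print(f"{n:>13,} items: simple search {linear:>13,} guesses, "
          f"binary search about {logarithmic} guesses")
```

Multiplying the list size by 40 million (from 100 to 4 billion) raises the simple search count by the same factor, while the binary search count only goes from 7 to 32.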


## Selection Sort

### How memory works

You can imagine memory as a chest of drawers: the computer is like a gigantic set of drawers, and each drawer has an address.
If you want to store an item in memory, your computer allocates memory space for your item.

- multiple items are stored as *arrays* or *linked lists*.

| array | linked list |
|:--:|:--:|
|stores items **contiguously** (next to each other)|items are linked: each item stores the address of the next one, so items can be anywhere in memory|
|looks up any element instantly|terrible if you keep jumping around between elements|
|Reading: $O(1)$|Reading: $O(N)$, because you have to follow the links from the first item|
|Insertion in the middle: $O(N)$|Insertion: $O(1)$, once you are at the position|
|Deletion: $O(N)$|Deletion: $O(1)$, once you are at the position|
|random access|sequential access|

Insertion: for arrays, you have to shift all the following elements down by one. And if there is no free space next to the array, you have to copy everything to a new location.
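The difference between the two structures can be illustrated with a tiny hand-written linked list. The `Node` class and the helpers `read_at`, `insert_after` and `to_list` are hypothetical names made up for this sketch; Python's built-in `list` stands in for the array, since it stores its elements contiguously.

```python
class Node:
    """One item of a singly linked list (hypothetical minimal class)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def read_at(head, index):
    """Reading is O(N): follow the links from the head, one node at a time."""
    node = head
    for _ in range(index):
        node = node.next
    return node.value


def insert_after(node, value):
    """Insertion is O(1) once you hold the node: only one link changes."""
    node.next = Node(value, node.next)


def to_list(head):
    """Collect all values into a Python list, for printing."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


# Linked list 1 -> 2 -> 4, then insert 3 after the node holding 2.
head = Node(1, Node(2, Node(4)))
insert_after(head.next, 3)
print(to_list(head))     # [1, 2, 3, 4]
print(read_at(head, 2))  # 3

# Array-like Python list: reading by index is O(1),
# but insert() has to shift every later element by one slot.
array = [1, 2, 4]
array.insert(2, 3)
print(array)             # [1, 2, 3, 4]
print(array[2])          # 3
```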

In [10]:
l = [156, 141,35, 94, 88, 61, 111]

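def findSmallest(arr):
    """Return the index of the smallest element in arr."""
    smallest = arr[0]
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < smallest:
            smallest = arr[i]
            smallest_index = i
    return smallest_index

def selectionSort(arr):
    """Return a new sorted list; the input list is left unchanged."""
    remaining = list(arr)  # work on a copy, because pop() removes elements
    newArr = []
    for _ in range(len(remaining)):
        # move the smallest remaining element to the end of the result
        smallest = findSmallest(remaining)
        newArr.append(remaining.pop(smallest))
    return newArr

selectionSort([5, 3, 6, 2, 10])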


[2, 3, 5, 6, 10]