# What is an Algorithm anyway?

An algorithm is an explicit, precise, unambiguous, mechanically-executable
sequence of elementary instructions, usually intended to accomplish a specific purpose.

Source: https://jeffe.cs.illinois.edu/teaching/algorithms/

## Al-gebra and al-gorithm and al-gorists (not Al-Gorists)

![Muhammad ibn Musa al-Khwarizmi](https://upload.wikimedia.org/wikipedia/commons/a/a6/Khwarizmi_Amirkabir_University_of_Technology.png)

https://en.wikipedia.org/wiki/Muhammad_ibn_Musa_al-Khwarizmi

## Decent Algorithm

BottlesOfBeer(n):

```
For i ← n down to 1
  Sing “i bottles of beer on the wall, i bottles of beer,”
  Sing “Take one down, pass it around, i − 1 bottles of beer on the wall.”
Sing “No bottles of beer on the wall, no bottles of beer,”
Sing “Go to the store, buy some more, n bottles of beer on the wall.”
```
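
A minimal Python sketch of the pseudocode above. It assumes the two closing lines are sung once, after the countdown loop has finished. The name `bottles_of_beer` and the choice to return a list of lines are illustrative, not part of the original.

```python
def bottles_of_beer(n):
    """Return the lyrics of BottlesOfBeer(n) as a list of lines."""
    lines = []
    # Count down from n to 1, two lines per bottle.
    for i in range(n, 0, -1):
        lines.append(f"{i} bottles of beer on the wall, {i} bottles of beer,")
        lines.append(f"Take one down, pass it around, {i - 1} bottles of beer on the wall.")
    # The closing lines are added once, after the loop.
    lines.append("No bottles of beer on the wall, no bottles of beer,")
    lines.append(f"Go to the store, buy some more, {n} bottles of beer on the wall.")
    return lines


for line in bottles_of_beer(2):
    print(line)
```

Every step is explicit and mechanical, which is what makes this a decent algorithm: for any `n` the function produces exactly `2 * n + 2` lines.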



## Not a real algorithm 

BeAMillionaireAndNeverPayTaxes():


```

Get a million dollars.
If the tax man comes to your door and says, “You have never paid taxes!”
  Say “I forgot.”
```




### How about a “Get a million dollars” algorithm?
1. Collect underpants
2. ?
3. Profit

Source: https://en.wikipedia.org/wiki/Gnomes_(South_Park)

Still not a real algorithm.


## Describing Algorithms

The skills required to effectively design and analyze algorithms are entangled with the skills required to effectively describe algorithms. A complete description of any algorithm has four components:

 *  **What:** A precise specification of the problem that the algorithm solves.
 *  **How:** A precise description of the algorithm itself.
 * **Why:** A proof that the algorithm solves the problem it is supposed to solve.
 * **How fast:** An analysis of the running time of the algorithm.

# How to find the largest number in an unsorted list (array)?

In [None]:
mylist = [1,4,67,2,7,2]
print(max(mylist))

67


In [None]:
# let's implement our naive max algorithm,
# also known as brute force
def find_my_max(seq):
    # start with no maximum at all (None), so an empty seq returns None
    my_max = None
    # loop through seq
    for n in seq:
        # the first number, or any number larger than the current max, becomes the new max
        if my_max is None or n > my_max:
            my_max = n
    # return max
    return my_max

In [None]:
print(find_my_max(mylist))

67


In [None]:
print(find_my_max([-4,23,-6,2,0,0,5]))

23
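
Two hand-picked inputs are weak evidence that the function is correct. As a rough sketch of the "Why" component (a test, not a proof), the block below compares `find_my_max` with the built-in `max` on random lists. The seed, list sizes and value range are arbitrary choices made for this illustration.

```python
import random


def find_my_max(seq):
    """Brute-force maximum: remember the largest value seen so far."""
    my_max = None
    for n in seq:
        if my_max is None or n > my_max:
            my_max = n
    return my_max


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    size = random.randint(1, 50)
    sample = [random.randint(-1000, 1000) for _ in range(size)]
    assert find_my_max(sample) == max(sample)

# One difference from the built-in: an empty list gives None here,
# while max([]) raises ValueError.
print(find_my_max([]))
```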


In [None]:
import random
big_list = [random.randint(1,100_000_000) for _ in range(1_000_000)]
len(big_list)


1000000

In [None]:
%%timeit
max(big_list) # the result is not cached: every run scans the whole list, but built-in max does the scan in C

17.2 ms ± 26.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
%%timeit
find_my_max(big_list)

97.5 ms ± 13.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
huge_list = [random.randint(1,1_000_000_000) for _ in range(10_000_000)]


In [None]:
%%timeit
max(huge_list)

174 ms ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
%%timeit
find_my_max(huge_list)

926 ms ± 31.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
%%timeit
find_my_max(huge_list)

959 ms ± 69 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
sorted_big_list = sorted(big_list)


In [None]:
%%timeit
max(sorted_big_list)

67.5 ms ± 463 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
%%timeit
min(sorted_big_list)

67.3 ms ± 1.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
sorted_big_list[0], sorted_big_list[-1]

(178, 99999962)

In [None]:
# How about finding some value in a list
over_9000 = [n in big_list for n in range(9000,9200)]
over_9000

[False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 True,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 True,
 True,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 False,
 Fa

In [None]:
True in over_9000

True

In [None]:
over_9000.index(True)

21

In [None]:
%%timeit
9021 in big_list # so this check took a while

13.1 ms ± 116 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
%%timeit
9021 in sorted_big_list

1.82 µs ± 12.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
big_list.index(9021),sorted_big_list.index(9021)

(648803, 82)
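
These indexes suggest why the two timings differ so much. `in` on a list scans from the front, and 9021 sits at position 82 in the sorted list but at position 648803 in the unsorted one. The sketch below counts the elements examined; `linear_search_steps` is a hypothetical helper written for this illustration, not what CPython runs internally.

```python
def linear_search_steps(seq, target):
    """Return how many elements are examined before target is found.

    If target is absent, every element is examined, so len(seq) is returned.
    """
    steps = 0
    for item in seq:
        steps += 1
        if item == target:
            break
    return steps


unsorted_nums = [5, 3, 9, 1, 7]
sorted_nums = sorted(unsorted_nums)

print(linear_search_steps(unsorted_nums, 1))  # 4: the value is near the end
print(linear_search_steps(sorted_nums, 1))    # 1: small values move to the front
print(linear_search_steps(sorted_nums, 9))    # 5: large values move to the back
```

Sorting helped here only because 9021 is small compared with the other values. A plain `in` on a sorted list is still a linear scan.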

In [None]:
big_set = set(big_list) # another data structure

In [None]:
%%timeit
9021 in big_set


104 ns ± 4.52 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
# Key takeaway - choosing the right data structure can be crucial
# `in` on a set is O(1) on average (a hash lookup)
# `in` on a list is O(n) - sometimes the value is found quickly, sometimes you have to go through the whole list

In [None]:
%%timeit
9000 in big_list

20.8 ms ± 782 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
%%timeit
9000 in big_set

90.3 ns ± 10.2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
%%timeit
count = 100_000
nums = [] # let's imagine we do not know about list comprehensions
for n in range(count):
    nums.append(n)
nums.reverse() # in-place reversal
# of course list(range(count)) would build the same list without the loop
# nums[:10]

11.2 ms ± 86.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
# can we do better? why reverse at all if we could build the list in reverse order right away...

In [None]:
%%timeit
count = 100_000
nums2 = []
for n in range(count):
    nums2.insert(0, n)
    

3.78 s ± 9.68 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [None]:
nums[:5],nums2[:5]

([999999, 999998, 999997, 999996, 999995], [99999, 99998, 99997, 99996, 99995])
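
The gap between the two timings comes from how a Python list stores its items. `insert(0, n)` has to shift every existing element one slot to the right, so building the list this way costs time roughly proportional to the square of `count`. `append` is cheap, and `reverse` is a single pass. The sketch below checks that several approaches give the same list. The small `count` and the two alternatives (`collections.deque.appendleft` and a `range` with a negative step) are additions for illustration.

```python
from collections import deque

count = 1_000  # deliberately small so the slow variant stays quick

# 1. append, then reverse in place
a = []
for n in range(count):
    a.append(n)
a.reverse()

# 2. insert at the front: each insert shifts all existing elements
b = []
for n in range(count):
    b.insert(0, n)

# 3. a deque supports cheap insertion on the left
d = deque()
for n in range(count):
    d.appendleft(n)
c = list(d)

# 4. no loop at all: count down with a negative step
e = list(range(count - 1, -1, -1))

assert a == b == c == e
print(a[:5])  # [999, 998, 997, 996, 995]
```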

# What is GCD?
## The greatest common divisor (GCD) is the largest number that divides two numbers evenly (no remainder)
## 12 and 8 have a GCD of 4, and 21 and 18 have a GCD of 3

# One algorithm to find it would be this
* split both numbers into prime factors
* the GCD is the product of the prime factors common to both numbers

In [None]:
# GCD of 30 and 12 would be 6 because:
# 30 = 2*3*5
# 12 = 2*2*3
# so GCD is 2*3
# the only catch is that you first have to find the prime factors of each number
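
The prime-factor approach can be sketched as follows, assuming positive integers and plain trial division for the factoring step. `prime_factors` and `gcd_by_factoring` are hypothetical names chosen for this illustration. A prime that occurs several times in both numbers is counted as often as it appears in both.

```python
from collections import Counter


def prime_factors(n):
    """Return a Counter mapping each prime factor of n to how often it occurs."""
    factors = Counter()
    divisor = 2
    # Trial division: only divisors up to sqrt(n) need to be tried.
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors[divisor] += 1
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left over is itself a prime factor.
        factors[n] += 1
    return factors


def gcd_by_factoring(x, y):
    """Multiply the prime factors that x and y have in common."""
    # Counter intersection keeps the smaller count of each shared prime.
    common = prime_factors(x) & prime_factors(y)
    result = 1
    for prime, power in common.items():
        result *= prime ** power
    return result


print(prime_factors(30))         # 30 = 2*3*5
print(prime_factors(12))         # 12 = 2*2*3
print(gcd_by_factoring(30, 12))  # 6
```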

In [None]:
# if we did not know about Euclid
# brute force algorithm
def naiveGCD(x, y):
    gcd = 1 # 1 divides everything, so it is always a fallback GCD
    for n in range(2, min(x,y)+1): # +1 so that min(x,y) itself is checked, watch out for off-by-one errors
        if x%n == 0 and y%n ==0:
            gcd = n # n divides both, so we have a new largest common divisor
    return gcd

In [None]:
naiveGCD(12,8)

4

In [None]:
naiveGCD(30,12)

6

In [None]:
%%timeit
naiveGCD(10_000_200,900_000)

97.7 ms ± 3.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
# This algorithm is more than 2000 years old
# https://en.wikipedia.org/wiki/Euclidean_algorithm

In [None]:
def gcd(x, y):
    while(y):
        x, y = y, x % y # y becomes the remainder of x / y and x becomes the old y; other languages would need a temp variable
    return x
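
To see why this loop finishes so quickly, it helps to record the pairs it visits. The sketch below uses a hypothetical helper, `gcd_steps`, that mirrors the loop above and keeps every `(x, y)` pair. The standard library's `math.gcd` serves as a cross-check.

```python
import math


def gcd_steps(x, y):
    """Return the (x, y) pairs visited by Euclid's algorithm, ending with (gcd, 0)."""
    steps = [(x, y)]
    while y:
        x, y = y, x % y
        steps.append((x, y))
    return steps


print(gcd_steps(30, 12))  # [(30, 12), (12, 6), (6, 0)]

big = gcd_steps(10_000_200, 900_000)
print(len(big) - 1)       # 6 loop iterations
print(big[-1][0])         # 600

assert big[-1][0] == math.gcd(10_000_200, 900_000)
```

Six iterations are enough for these inputs, while the brute-force loop tries every candidate from 2 up to 900,000.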

In [None]:
gcd(10_000_200,900_000)

600

In [None]:
%%timeit
gcd(10_000_200,900_000)

927 ns ± 57.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [None]:
gcd(12, 8)

4

In [None]:
gcd(21,18)

3

In [None]:
gcd(100, 93)

1

In [None]:
gcd(10, 8)

2

In [None]:
gcd(24,12)

12