# Basic Algorithms

## Basic Algorithms

### Binary Search

### Efficiency of Binary Search

### Binary Search Variation

### Binary Search: First & last indexes

### Tries

### Heaps

A heap is a specific type of tree with some additional rules of its own. In a heap, values only increase or only decrease along every path from the root to a leaf, so the root element is either the minimum or the maximum value in the tree. There are two types of heaps, min heaps and max heaps, that capture those two situations.

In a max heap, a parent must always have a greater value than its child so the root ends up being the biggest element. 

The opposite is true for min heaps: A parent has a lower value than its child, so the root is the minimum element. 

Heaps don't need to be binary trees, so parents can have any number of children. Operations like search, insert, and delete can vary a lot based on the type of heap being used. 

### Max Binary Heap & Heapify
* Two children only
* The root is the maximum element
* Must be a complete tree:
  * All levels except the last one are completely full
  * If the last level isn't totally full, values are added from left to right. The right most leaf will be empty until the whole row has been filled. 
* Peek(): Gets the maximum value. $O(1)$
* Search: $O(N)$
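  * Use max heap properties to our advantage: if the element we're searching for is bigger than the root of the tree (or of any subtree), it cannot be in that tree, so we can quit searching there. This can cut the work to roughly $N/2$ comparisons in the average case, which is still $O(N)$.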
* Insert: $O(log(N))$ (i.e. the height of the tree)
  * Stick the element into the next open spot in the tree.
  * Heapify: Reorder the tree based on the heap property.
    * Since we care that our parent element is bigger than its child, we just need to keep comparing our new element with its parent and swapping them when the child is bigger. 
* Extract: $O(log(N))$ (i.e. the height of the tree)
  * The root is removed from the tree
  * Stick the right-most leaf into the root spot, then compare it to its children and swap where necessary
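
The insert and extract steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the class name `MaxHeap`, the `items` list, and the choice to return `None` on an empty heap are all assumptions made for the example. The tree is stored in a list, with the children of index `i` at `2*i + 1` and `2*i + 2`.

```python
class MaxHeap:
    """Array-backed max binary heap (illustrative sketch)."""

    def __init__(self):
        self.items = []

    def peek(self):
        # The maximum always sits at the root, index 0.
        return self.items[0] if self.items else None

    def insert(self, value):
        # Put the value in the next open spot, then swap it upward
        # while it is bigger than its parent.
        self.items.append(value)
        child = len(self.items) - 1
        while child > 0:
            parent = (child - 1) // 2
            if self.items[child] <= self.items[parent]:
                break
            self.items[child], self.items[parent] = self.items[parent], self.items[child]
            child = parent

    def extract(self):
        # Remove the root, move the last leaf into the root spot,
        # then swap it downward with its larger child where necessary.
        if not self.items:
            return None
        top = self.items[0]
        last = self.items.pop()
        if self.items:
            self.items[0] = last
            parent = 0
            size = len(self.items)
            while True:
                left, right = 2 * parent + 1, 2 * parent + 2
                largest = parent
                if left < size and self.items[left] > self.items[largest]:
                    largest = left
                if right < size and self.items[right] > self.items[largest]:
                    largest = right
                if largest == parent:
                    break
                self.items[parent], self.items[largest] = self.items[largest], self.items[parent]
                parent = largest
        return top


heap = MaxHeap()
for value in [3, 9, 4, 7, 1]:
    heap.insert(value)
drained = [heap.extract() for _ in range(5)]
print(drained)
```

Each loop walks at most one root-to-leaf path, which is where the $O(log(N))$ bound for insert and extract comes from.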

### Heap Implementation
Though heaps are represented as trees, they're often stored as arrays. Because we know how many children each parent has, and thus how many nodes are at each level, we can use math to figure out where any node falls in the array and traverse the tree that way.

Storing the data in an array will save us space. We just need to store the value at the right index. 

If we were to create a Node data structure, we'd need to store values and a bunch of pointers for every element. 
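
As a small sketch of that index math for a binary heap stored in level order (the helper names `parent`, `left_child`, and `right_child` are chosen here for illustration):

```python
def parent(i):
    # Index of the parent of the node stored at index i (i > 0).
    return (i - 1) // 2


def left_child(i):
    return 2 * i + 1


def right_child(i):
    return 2 * i + 2


# Level-order storage of this max heap:
#         9
#       /   \
#      7     4
#     / \
#    3   1
heap = [9, 7, 4, 3, 1]

children_of_root = (heap[left_child(0)], heap[right_child(0)])
parent_of_last = heap[parent(4)]
print(children_of_root, parent_of_last)
```

No pointers are stored at all: the position of a value in the list is enough to locate its parent and children.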

### Heaps Exercise

## Self-Balancing Tree
A balanced tree has nodes condensed to as few levels as possible. A self-balancing tree is one that tries to minimize the number of levels that it uses. It uses algorithms during insertion & deletion to keep itself balanced, and the nodes themselves might have some additional properties. 

### Red-Black Tree
The Red-Black Tree is an extension of a binary search tree. 

Properties of a Red-Black Tree:
1. Nodes are assigned an additional color property, where the values must be either red or black. The colors red & black are just a convention - we really just need a way to distinguish two types of nodes. 
2. Null Leaf Nodes: Every node in the tree that doesn't otherwise have two children must have null children. All Null Leaf Nodes must be colored black. 
3. If a node is red, both of its children must be black. 
4. Optional rule: The root node must be black.
5. Every path from a node to its descendant null nodes must contain the same number of black nodes. 
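
To make the rules concrete, here is a hedged sketch of a checker for the color properties. The `Node` class, the `RED`/`BLACK` constants, and the function names are assumptions for this example; `None` stands in for a black null leaf, and the checker does not verify BST ordering.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, value, color, left=None, right=None):
        self.value = value
        self.color = color
        self.left = left    # None stands for a black null leaf
        self.right = right


def black_height(node):
    """Return the number of black nodes on every path down to a null leaf,
    or -1 if rule 3 (red node with a red child) or rule 5 is violated."""
    if node is None:
        return 1  # null leaves count as black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                return -1
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    # Also enforces the optional rule that the root is black.
    if root is not None and root.color != BLACK:
        return False
    return black_height(root) != -1


valid = Node(10, BLACK, Node(5, RED), Node(15, RED))
red_red = Node(10, BLACK, Node(5, RED, Node(2, RED)), Node(15, BLACK))
unbalanced = Node(10, BLACK, Node(5, BLACK), None)
print(is_red_black(valid), is_red_black(red_red), is_red_black(unbalanced))
```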

### Red-Black Tree Insertion
Several different states of the tree and of the node you're inserting require different courses of action.

The resulting tree needs to follow both the red-black tree and BST rules. 

Rule of insertion: Insert red nodes only

**Case 1: Insert the first node into the tree.**
Since it is the root, its color can be changed to black based on Rule 4.

**Case 2: The parent node is black.**
If the new node's parent is black, nothing more has to be done. Since a red node is being added, it doesn't upset the balance of black nodes in any path or violate any of the other rules. 

**Case 3: The parent node is red.**
There are several cases with more complicated solutions. 
* The parent and its sibling are both red: 
  * They should be changed to black, and their parent (the grandparent), becomes red. We switch the node colors in this way to maintain the number of black nodes in a given path.
  * The biggest problem here is that we could have violated another property by changing the grandparent. But we can just treat the grandparent as a newly inserted node and change it or its ancestors according to the same cases and rules.
  
**Cases 4 and 5: The node's parent is red and the parent's sibling is black.**
We need to perform a rotation: Shift a group of nodes around in a way that changes the structure of the tree, but not the order of the nodes. Keep in mind that this is a BST, and we need to keep the elements in strict order.

**Case 4: The red node and its red parent are not on the same side of their parents.**
The red node is a right child, and its parent is a left child. Perform a left rotation on the parent, so named because the nodes shift one place to the left while maintaining their order. (The mirrored arrangement, a left child under a right-child parent, is handled with a right rotation instead.) At this point, we have a setup that looks exactly like Case 5, where both the red node and its red parent are on the same side of their parents. 

**Case 5: The red node and its red parent are on the same side of their parents.**
Perform a right rotation involving the grandparent and both of its children. Swap the colors of the parent and grandparent. 

In doing the rotation, we keep any one sub-tree from getting much larger than the others. 
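
The rotations in Cases 4 and 5 can be sketched as below. This is a simplified illustration that ignores colors and parent pointers; the `Node` class and the function names are assumptions for the example. The point it demonstrates is that a rotation changes the shape of the tree but not the in-order sequence of its values.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def rotate_left(node):
    """Lift node.right above node and return the new subtree root."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def rotate_right(node):
    """Lift node.left above node and return the new subtree root."""
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


# Case 4 shape: parent 10 is a left child of 30, new node 20 is a right child of 10.
grandparent = Node(30, left=Node(10, right=Node(20)))
before = in_order(grandparent)

# Left rotation on the parent gives the Case 5 shape: 30 -> 20 -> 10, all left children.
grandparent.left = rotate_left(grandparent.left)

# Right rotation on the grandparent makes 20 the root of the subtree.
root = rotate_right(grandparent)
after = in_order(root)
print(before, after, root.value)
```

In a full red-black insertion, the Case 5 rotation would be followed by swapping the colors of the old parent and grandparent, as described above.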

![PXL_20201111_160343350.jpg](attachment:PXL_20201111_160343350.jpg)

### Time efficiency
Insert, search, delete: $O(log(N))$ in the average and worst case. 

### Build a Red-Black Tree

## Sorting Algorithms

### Bubble Sort

### Merge Sort

### Merge Sort: Counting Inversions

### Case Specific Sorting of Strings

### Quicksort

### Heapsort

### Pair Sum

### Sort 0, 1, 2

## Faster Divide & Conquer Algorithms
Divide and conquer is a strategy to solve problems more efficiently by breaking problems down into sub-problems that can be more easily solved. Then, the solutions to these sub-problems are combined to yield an answer to the original problem.

Next: Apply divide-and-conquer to efficiently find the median element out of a collection of unsorted numbers. 

Divide-and-conquer algorithms are recursive.

### Median Problem
Given:
* Unsorted list $A=[a_1, ..., a_n]$ of $n$ numbers

Goal:
* Find the median of A: $\lceil \frac{n}{2} \rceil$-th smallest element
  * for odd $n = 2l+1$: median is the $(l+1)$-st smallest
  
#### Find k-smallest element
Let's extend the problem: Instead of finding the median element, we want to find the k-smallest element, where k is an input given to us. 

Given: 
* Unsorted $A$ and integer $k$ where $1 \leq k \leq n$

Find:
* $k$-th smallest of $A$

Note: If we set $k = \lceil \frac{n}{2} \rceil$, the answer is the median.

#### Easy algorithm
Sort A & output the $k$-th element: $O(n*log(n))$ time

#### Optimal algorithm
$O(n)$ time algorithm by [B, F, P, R, T '73]

### Basic Approach
Divide & conquer: QuickSort style

![Screen%20Shot%202020-11-14%20at%2012.15.57%20PM.png](attachment:Screen%20Shot%202020-11-14%20at%2012.15.57%20PM.png)

### D&C: High Level
Aim: $O(N)$ running time
![Screen%20Shot%202020-11-14%20at%2012.24.40%20PM.png](attachment:Screen%20Shot%202020-11-14%20at%2012.24.40%20PM.png)

### D&C: Recursive Pivot
Aim: Find a good pivot in $O(N)$ time (worst-case).

The algorithm above finds the pivot in $O(N)$ expected time, not worst-case time.

If we can successfully find a good pivot in $O(N)$ time, then the running time of our algorithm will satisfy the following relation:
$$T(N) = T(\frac{3}{4}N) + O(N) = O(N)$$

Explanation:
* $T(\frac{3}{4}N)$ is the cost of the recursive call: because the pivot is good, the sub-problem will be of size at most $\frac{3}{4}$ the original size
* Plus $O(N)$ to find a good pivot, and $O(N)$ to partition $A$ into three sets. 
* The recurrence solves to $O(N)$, so the overall running time of the algorithm will be order N.

We have some slack: the recurrence still solves to $O(N)$ if $\frac{3}{4}$ is replaced by any constant less than one, for example $0.99$. So we're left with a slack of roughly $T(0.24N)$, which we can use to help us find a good pivot.

So we're going to design an algorithm with the following running time:
$$T(N) = T(\frac{3}{4}N) + T(\frac{N}{5}) + O(N) = O(N)$$

$T(\frac{N}{5}) + O(N)$ is the time it takes us to find a good pivot. The key fact is that $\frac{3}{4} + \frac{1}{5} = 0.95 < 1$.
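
One way to see why a sum below one matters is a substitution argument. This is a sketch; the constants $a$ and $c$ are introduced here for illustration. Suppose the $O(N)$ term is at most $cN$, and assume $T(m) \leq a \cdot m$ for every $m < N$. Then:

$$T(N) \leq a \cdot \frac{3}{4}N + a \cdot \frac{N}{5} + cN = (0.95a + c)N \leq aN \quad \text{whenever } a \geq 20c$$

So $T(N) = O(N)$. If the two fractions summed to one or more, no constant $a$ would satisfy the last inequality.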

How do we use the $T(\frac{N}{5})$ term? Choose a subset $S$ of $A$ where $|S|=\frac{N}{5}$, and recursively run our median algorithm on this subset $S$.

Set the pivot $p = Median(S)$. Finding the median of $S$ takes $T(\frac{N}{5})$ time, since the subset $S$ is of size $\frac{N}{5}$.

![Screen%20Shot%202020-11-14%20at%2012.39.30%20PM.png](attachment:Screen%20Shot%202020-11-14%20at%2012.39.30%20PM.png)

Question: How do we choose the subset $S$? 

We need to choose subset $S$ so it is a good representative sample of the entire array $A$.

### Median: Pseudocode
`FastSelect(A,k)`
* input: unsorted $A$ & integer $k$ where $1 \leq k \leq n$
* output: $k$-th smallest of $A$

Algorithm:
* Find a good pivot: Break $A$ into $\lceil \frac{n}{5} \rceil$ groups: $G_1, G_2, ..., G_{\frac{n}{5}}$
  * We can do it in an arbitrary way. E.g.: take the first five elements of $A$ for the first group, etc.
* For `i=1 -> n/5`:
  * sort `G_i` & let `m_i = median(G_i)`
* Let $S = \{m_1, m_2, ..., m_{\frac{n}{5}}\}$
* Find median of set $S$: Pivot `p = FastSelect(S, n/10)`
  * Explanation: $S$ has ${\frac{n}{5}}$ elements. We want to find its median, therefore we look for $k = \frac{n}{10}$. The ${\frac{n}{10}}$-th smallest element of $S$ is the median of this set $S$, which we store in `p`
* Use `p` as the pivot: Partition $A$ into $A<p$, $A=p$, $A>p$
* Use the quick select approach from before: Based on the sizes of these three sets, we either recursively search in the smaller set, recursively search in the larger set, or simply output `p`
  * If $k \leq |A < p|$ then `return(FastSelect(A<p, k))`
  * If $k > |A < p| + |A=p|$ then `return(FastSelect(A>p, k-|A<p|-|A=p|))`
  * Else `return p`
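
The pseudocode above can be sketched in Python roughly as follows. This is an illustrative translation under a few assumptions: the function name `fast_select` and the base case for lists of at most five elements are chosen for this example, `k` is 1-indexed as in the pseudocode, and Python's `sorted` is only applied to groups of at most five elements.

```python
def fast_select(values, k):
    """Return the k-th smallest element of values (k is 1-indexed)."""
    # Base case (an assumption of this sketch): tiny inputs are sorted directly.
    if len(values) <= 5:
        return sorted(values)[k - 1]

    # Break the input into groups of five and take each group's median.
    groups = [values[i:i + 5] for i in range(0, len(values), 5)]
    medians = [sorted(group)[(len(group) - 1) // 2] for group in groups]

    # The pivot is the median of the medians, found recursively.
    pivot = fast_select(medians, (len(medians) + 1) // 2)

    # Partition into the three sets A<p, A=p, A>p.
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]

    if k <= len(smaller):
        return fast_select(smaller, k)
    if k > len(smaller) + len(equal):
        return fast_select(larger, k - len(smaller) - len(equal))
    return pivot


data = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10]
median = fast_select(data, (len(data) + 1) // 2)
print(median)
```

Because the pivot always occurs in the input, both `smaller` and `larger` are strictly shorter than `values`, so the recursion terminates.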

### Median Running Time
Prove that $p$ is in fact a good pivot. 

![Screen%20Shot%202020-11-14%20at%2012.57.10%20PM.png](attachment:Screen%20Shot%202020-11-14%20at%2012.57.10%20PM.png)

## Practice Problems

### Square Root of an Integer
Find the square root of the integer without using any Python library. You have to find the floor value of the square root.

For example if the given number is 16, then the answer would be 4.

If the given number is 27, the answer would be 5 because sqrt(27) = 5.196 whose floor value is 5.

The expected time complexity is `O(log(n))`

Here is some boilerplate code and test cases to start with:
```python
def sqrt(number):
    """
    Calculate the floored square root of a number

    Args:
       number(int): Number to find the floored squared root
    Returns:
       int: Floored Square Root
    """
    pass

print ("Pass" if  (3 == sqrt(9)) else "Fail")
print ("Pass" if  (0 == sqrt(0)) else "Fail")
print ("Pass" if  (4 == sqrt(16)) else "Fail")
print ("Pass" if  (1 == sqrt(1)) else "Fail")
print ("Pass" if  (5 == sqrt(27)) else "Fail")
```
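
One possible approach, sketched here with binary search over the candidate answers, fills in the `sqrt` stub from the boilerplate. Returning `None` for negative input is an assumption of this sketch, since the problem statement does not say what to do in that case.

```python
def sqrt(number):
    """Return the floored square root of a non-negative integer."""
    if number < 0:
        return None  # assumption: negative input has no real square root

    low, high = 0, number
    while low <= high:
        mid = (low + high) // 2
        # mid is the answer when mid^2 <= number < (mid + 1)^2
        if mid * mid <= number < (mid + 1) * (mid + 1):
            return mid
        if mid * mid < number:
            low = mid + 1
        else:
            high = mid - 1


print(sqrt(27))
```

The search interval halves on every iteration, which gives the expected `O(log(n))` running time.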

### Search in a Rotated Sorted Array
You are given a sorted array which is rotated at some random pivot point.

Example: [0,1,2,4,5,6,7] might become [4,5,6,7,0,1,2]

You are given a target value to search. If found in the array return its index, otherwise return -1.

You can assume there are no duplicates in the array and your algorithm's runtime complexity must be in the order of `O(log n)`.

Example:

Input: `nums = [4,5,6,7,0,1,2], target = 0, Output: 4`

Here is some boilerplate code and test cases to start with:

```python
def rotated_array_search(input_list, number):
    """
    Find the index by searching in a rotated sorted array

    Args:
       input_list(array), number(int): Input array to search and the target
    Returns:
       int: Index or -1
    """
    pass

def linear_search(input_list, number):
    for index, element in enumerate(input_list):
        if element == number:
            return index
    return -1

def test_function(test_case):
    input_list = test_case[0]
    number = test_case[1]
    if linear_search(input_list, number) == rotated_array_search(input_list, number):
        print("Pass")
    else:
        print("Fail")

test_function([[6, 7, 8, 9, 10, 1, 2, 3, 4], 6])
test_function([[6, 7, 8, 9, 10, 1, 2, 3, 4], 1])
test_function([[6, 7, 8, 1, 2, 3, 4], 8])
test_function([[6, 7, 8, 1, 2, 3, 4], 1])
test_function([[6, 7, 8, 1, 2, 3, 4], 10])
```
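
A hedged sketch of one way to solve this fills in the `rotated_array_search` stub with a modified binary search. The key observation is that at least one half of the current range is always sorted, so we can check whether the target lies inside that half. It relies on the stated assumption that the array has no duplicates.

```python
def rotated_array_search(input_list, number):
    """Return the index of number in a rotated sorted list, or -1."""
    low, high = 0, len(input_list) - 1
    while low <= high:
        mid = (low + high) // 2
        if input_list[mid] == number:
            return mid
        if input_list[low] <= input_list[mid]:
            # The left half is sorted.
            if input_list[low] <= number < input_list[mid]:
                high = mid - 1
            else:
                low = mid + 1
        else:
            # The right half is sorted.
            if input_list[mid] < number <= input_list[high]:
                low = mid + 1
            else:
                high = mid - 1
    return -1


print(rotated_array_search([4, 5, 6, 7, 0, 1, 2], 0))
```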

### Rearrange Array Digits
Rearrange Array Elements so as to form two numbers such that their sum is maximum. Return these two numbers. You can assume that all array elements are in the range [0, 9]. The number of digits in the two numbers cannot differ by more than 1. You're not allowed to use any sorting function that Python provides, and the expected time complexity is `O(nlog(n))`.

For example: [1, 2, 3, 4, 5]

The expected answer would be [531, 42]. Another expected answer can be [542, 31]. In scenarios such as these, when there is more than one possible answer, return any one.

Here is some boilerplate code and test cases to start with:

```python
def rearrange_digits(input_list):
    """
    Rearrange Array Elements so as to form two numbers such that their sum is maximum.

    Args:
       input_list(list): Input List
    Returns:
       (int),(int): Two numbers whose sum is maximal
    """
    pass

def test_function(test_case):
    output = rearrange_digits(test_case[0])
    solution = test_case[1]
    if sum(output) == sum(solution):
        print("Pass")
    else:
        print("Fail")

test_function([[1, 2, 3, 4, 5], [542, 31]])
test_function([[4, 6, 2, 5, 9, 8], [964, 852]])
```
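
One possible solution, sketched below, sorts the digits in descending order with a hand-written merge sort and then deals them out alternately to the two numbers, so the largest digits land in the most significant positions. The helper name `merge_sort_desc` is an assumption of this example.

```python
def merge_sort_desc(items):
    """Return a new list with items sorted from largest to smallest."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_desc(items[:mid])
    right = merge_sort_desc(items[mid:])

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] >= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def rearrange_digits(input_list):
    """Form two numbers from the digits so that their sum is maximal."""
    first = second = 0
    for index, digit in enumerate(merge_sort_desc(input_list)):
        # Alternate between the two numbers so their lengths differ by at most 1.
        if index % 2 == 0:
            first = first * 10 + digit
        else:
            second = second * 10 + digit
    return first, second


print(rearrange_digits([1, 2, 3, 4, 5]))
```

The merge sort dominates the cost at `O(nlog(n))`; dealing out the digits is a single linear pass.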

### Dutch National Flag Problem
Given an input array consisting of only 0, 1, and 2, sort the array in a single traversal. You're not allowed to use any sorting function that Python provides.

Note: `O(n)` does not necessarily mean single-traversal. For example, if you traverse the array twice, that would still be an `O(n)` solution but it will not count as single traversal.

Here is some boilerplate code and test cases to start with:

```python
def sort_012(input_list):
    """
    Given an input array consisting of only 0, 1, and 2, sort the array in a single traversal.

    Args:
       input_list(list): List to be sorted
    """
    pass

def test_function(test_case):
    sorted_array = sort_012(test_case)
    print(sorted_array)
    if sorted_array == sorted(test_case):
        print("Pass")
    else:
        print("Fail")

test_function([0, 0, 2, 2, 2, 1, 1, 1, 2, 0, 2])
test_function([2, 1, 2, 0, 0, 2, 1, 0, 1, 0, 0, 2, 2, 2, 1, 2, 0, 0, 0, 2, 1, 0, 2, 0, 0, 1])
test_function([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2])
```
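
A sketch of the classic three-pointer approach to this problem fills in the `sort_012` stub. It sorts the list in place and also returns it, which is an assumption made here so that it works with the `test_function` above. The variable names `low`, `mid`, and `high` are chosen for this example.

```python
def sort_012(input_list):
    """Sort a list of 0s, 1s, and 2s in place in a single traversal."""
    low = 0                      # next position for a 0
    mid = 0                      # current element under inspection
    high = len(input_list) - 1   # next position for a 2

    while mid <= high:
        if input_list[mid] == 0:
            input_list[low], input_list[mid] = input_list[mid], input_list[low]
            low += 1
            mid += 1
        elif input_list[mid] == 2:
            input_list[mid], input_list[high] = input_list[high], input_list[mid]
            high -= 1  # do not advance mid: the swapped-in value is unchecked
        else:
            mid += 1
    return input_list


print(sort_012([0, 0, 2, 2, 2, 1, 1, 1, 2, 0, 2]))
```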

### Autocomplete with Tries

### Unsorted Integer Array
In this problem, we will look for the smallest and largest integers in a list of unsorted integers. The code should run in `O(n)` time. Do not use Python's built-in functions to find min and max.

**Bonus Challenge:** Is it possible to find the max and min in a single traversal?

```python
def get_min_max(ints):
    """
    Return a tuple(min, max) out of list of unsorted integers.

    Args:
       ints(list): list of integers containing one or more integers
    """
    pass

## Example Test Case of Ten Integers
import random

l = [i for i in range(0, 10)]  # a list containing 0 - 9
random.shuffle(l)

print ("Pass" if ((0, 9) == get_min_max(l)) else "Fail")
```

Sorting usually requires `O(n log n)` time. Can you come up with an `O(n)` algorithm (i.e., linear time)?
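
One possible single-traversal answer is sketched below, filling in the `get_min_max` stub. Returning `None` for an empty list is an assumption of this sketch; the docstring only promises lists with one or more integers.

```python
def get_min_max(ints):
    """Return a tuple (min, max) from a list of unsorted integers."""
    if not ints:
        return None  # assumption: the problem does not define this case

    smallest = largest = ints[0]
    for value in ints:
        # Track both extremes in the same pass over the list.
        if value < smallest:
            smallest = value
        elif value > largest:
            largest = value
    return smallest, largest


print(get_min_max([4, 0, 9, 3, 7]))
```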

### Request Routing in a Web Server with a Trie

#### HTTPRouter using a Trie
For this exercise we are going to implement an HTTPRouter like you would find in a typical web server using the Trie data structure we learned previously.

There are many different implementations of HTTP Routers such as regular expressions or simple string matching, but the Trie is an excellent and very efficient data structure for this purpose.

The purpose of an HTTP Router is to take a URL path like "/", "/about", or "/blog/2019-01-15/my-awesome-blog-post" and figure out what content to return. In a dynamic web server, the content will often come from a block of code called a handler.

First we need to implement a slightly different Trie than the one we used for autocomplete. Instead of simple words, the Trie will contain a part of the HTTP path at each node, building from the root node `/`.

In addition to a path though, we need to know which function will handle the HTTP request. In a real router we would probably pass an instance of a class like Python's [SimpleHTTPRequestHandler](https://docs.python.org/3/library/http.server.html#http.server.SimpleHTTPRequestHandler) which would be responsible for handling requests to that path. For the sake of simplicity we will just use a string that we can print out to ensure we got the right handler.

We could split the path into letters similar to how we did the autocomplete Trie, but this would result in a Trie with a very large number of nodes and lengthy traversals if we have a lot of pages on our site. A more sensible way to split things would be on the parts of the path that are separated by slashes ("/"). A Trie with a single path entry of: "/about/me" would look like:

`(root, None) -> ("about", None) -> ("me", "About Me handler")`

We can also simplify our RouteTrie a bit by excluding the suffixes method and the endOfWord property on RouteTrieNodes. We really just need to insert and find nodes, and if a RouteTrieNode is not a leaf node, it won't have a handler which is fine.

```python
# A RouteTrie will store our routes and their associated handlers
class RouteTrie:
    def __init__(self, ...):
        # Initialize the trie with a root node and a handler, this is the root path or home page node

    def insert(self, ...):
        # Similar to our previous example you will want to recursively add nodes
        # Make sure you assign the handler to only the leaf (deepest) node of this path

    def find(self, ...):
        # Starting at the root, navigate the Trie to find a match for this path
        # Return the handler for a match, or None for no match

# A RouteTrieNode will be similar to our autocomplete TrieNode... with one additional element, a handler.
class RouteTrieNode:
    def __init__(self, ...):
        # Initialize the node with children as before, plus a handler

    def insert(self, ...):
        # Insert the node as before
```        

Next we need to implement the actual Router. The router will initialize itself with a RouteTrie for holding routes and associated handlers. It should also support adding a handler by path and looking up a handler by path. All of these operations will be delegated to the RouteTrie.

Hint: the RouteTrie stores handlers under path parts, so remember to split your path around the '/' character

Bonus Points: Add a not found handler to your Router which is returned whenever a path is not found in the Trie.

More Bonus Points: Handle trailing slashes! Requests for '/about' and '/about/' are probably looking for the same page. Requests for '' or '/' are probably looking for the root handler. Handle these edge cases in your Router.

```python
# The Router class will wrap the Trie and handle 
class Router:
    def __init__(self, ...):
        # Create a new RouteTrie for holding our routes
        # You could also add a handler for 404 page not found responses as well!

    def add_handler(self, ...):
        # Add a handler for a path
        # You will need to split the path and pass the path parts
        # as a list to the RouteTrie

    def lookup(self, ...):
        # lookup path (by parts) and return the associated handler
        # you can return None if it's not found or
        # return the "not found" handler if you added one
        # bonus points if a path works with and without a trailing slash
        # e.g. /about and /about/ both return the /about handler


    def split_path(self, ...):
        # you need to split the path into parts for 
        # both the add_handler and lookup functions,
        # so it should be placed in a function here
```
        
#### Test Cases
```python
# Here are some test cases and expected outputs you can use to test your implementation

# create the router and add a route
router = Router("root handler", "not found handler") # remove the 'not found handler' if you did not implement this
router.add_handler("/home/about", "about handler")  # add a route

# some lookups with the expected output
print(router.lookup("/")) # should print 'root handler'
print(router.lookup("/home")) # should print 'not found handler' or None if you did not implement one
print(router.lookup("/home/about")) # should print 'about handler'
print(router.lookup("/home/about/")) # should print 'about handler' or None if you did not handle trailing slashes
print(router.lookup("/home/about/me")) # should print 'not found handler' or None if you did not implement one
```