# 1. Arrays

## [1.1. Arrays - Codebasics](https://www.youtube.com/watch?v=gDqQf4Ekr2A&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=3)

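Assume each integer in the array is stored in 4 bytes (each byte is 8 bits, and each bit is a zero or a one). This is a simplification: CPython's own `int` objects are larger and variable-sized, but fixed-size elements are what make the address arithmetic below work.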

<img src=Comp-Images/1.1.1.png width=400 /> <img src=Comp-Images/1.1.2.png width=400 />

So looking up an element by index takes one simple calculation: take the memory address of the 0th element and add (index * size_in_bytes_of(integer)). That's why indexing an array is very fast: O(1).
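
The address calculation can be checked from Python itself. This is a minimal sketch, assuming the standard-library `array` module with typecode `'i'` (a C int, 4 bytes on most platforms) as a stand-in for a raw array; the helper name `address_of` is made up for illustration.

```python
import ctypes
from array import array

# A typed array of C ints, stored contiguously in memory.
numbers = array('i', [10, 20, 30, 40, 50])

# buffer_info() returns (address of element 0, number of elements).
base_address, length = numbers.buffer_info()


def address_of(index):
    # One multiplication and one addition, regardless of how long the array is.
    return base_address + index * numbers.itemsize


# Read the integer stored at the computed address and compare it with normal indexing.
value_at_3 = ctypes.c_int.from_address(address_of(3)).value
print(numbers.itemsize, value_at_3, numbers[3])
```

The cost of `address_of` does not depend on `index` or on the length of the array, which is the O(1) claim above.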

But searching for a value is slow (O(n)). That's because, in the worst case, we have to iterate through every item in the array, performing n comparisons.
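
A minimal sketch of that search, with a counter added purely to make the number of comparisons visible (the `prices` data is made up):

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


prices = [298, 305, 320, 301, 292]
print(linear_search(prices, 301))  # (3, 4)
print(linear_search(prices, 999))  # (-1, 5): every element was checked
```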

Python's built-in `list` is a dynamic array, meaning Python handles the memory automatically if the array needs to grow. Languages like Java have both static arrays and dynamic arrays.
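
To illustrate what "handling memory" means, here is a toy dynamic array. It is a sketch under stated assumptions: a fixed-length Python list stands in for a raw memory block, and the capacity doubles when full. The doubling factor is an assumption for illustration; CPython's real over-allocation strategy is different.

```python
class DynamicArray:
    """Toy dynamic array; a fixed-length list stands in for a raw memory block."""

    def __init__(self):
        self.capacity = 2
        self.size = 0
        self.storage = [None] * self.capacity

    def append(self, value):
        # When the block is full, move everything to a bigger block first.
        if self.size == self.capacity:
            self._grow()
        self.storage[self.size] = value
        self.size += 1

    def _grow(self):
        # Allocate a block twice as large and copy the old elements across.
        self.capacity *= 2
        new_storage = [None] * self.capacity
        for i in range(self.size):
            new_storage[i] = self.storage[i]
        self.storage = new_storage

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self.storage[index]


arr = DynamicArray()
for n in range(5):
    arr.append(n)

print(arr.size, arr.capacity)  # 5 8
```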

# 2. Linked Lists

## [2.1. Linked Lists Code Implementation - Codebasics](https://www.youtube.com/watch?v=qp8u-frRAnU&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=4)

This is what a linked list with an insertion looks like in memory:

<img src=Comp-Images/2.1.1.png width=500 />

There's a difference between indexing and inserting. Indexing is finding the place of an element: if we want to go to the 10th index, it will take O(n=10) time, and if we want to go to the 150th index it will take O(n=150), because we have to go through each link in the list. Therefore the time complexity is O(n). Inserting, however, is the time taken to add the new element and set the pointers at the specified location, which is quick. But because we are often inserting at arbitrary locations in a linked list, we must first index to find that location, which is slow (O(n)), before inserting in O(1).



This is what a doubly linked list looks like. Each node stores a pointer to the element after it AND to the element before it:

<img src=Comp-Images/2.1.2.png width=600 />
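
A minimal sketch of a doubly linked list, to show what the extra `prev` pointer buys. The class names `DNode` and `DoublyLinkedList` and the `tail` attribute are assumptions for illustration and are not part of the implementation that follows.

```python
class DNode:
    def __init__(self, data, prev=None, next=None):
        self.data = data
        self.prev = prev  # pointer to the element before
        self.next = next  # pointer to the element after


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def insert_at_end(self, data):
        # The new node points back at the current tail.
        node = DNode(data, prev=self.tail)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node

    def forward(self):
        values, itr = [], self.head
        while itr:
            values.append(itr.data)
            itr = itr.next
        return values

    def backward(self):
        # Only possible because every node stores a 'prev' pointer.
        values, itr = [], self.tail
        while itr:
            values.append(itr.data)
            itr = itr.prev
        return values


dll = DoublyLinkedList()
for fruit in ['pen', 'pineapple', 'apple']:
    dll.insert_at_end(fruit)

print(dll.forward())   # ['pen', 'pineapple', 'apple']
print(dll.backward())  # ['apple', 'pineapple', 'pen']
```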

Code implementation:

In [29]:
# This is what we store at a memory address
class Node:
    def __init__(self, data=None, next=None):
        self.data = data
        self.next = next

    
class LinkedList:
    def __init__(self):
        self.head = None
    
    
    def print(self):
        
        if self.head is None:
            print("Linked List is empty")
            return
        
        itr = self.head
        ll_as_str = ''
        
        while itr:
            
            # Append 'str(itr.data) --> ' if next element is not None, otherwise, just append 'str(itr.data)'
            ll_as_str += str(itr.data) + ' --> ' if itr.next else str(itr.data)
            itr = itr.next
            
        print(ll_as_str)
    
    
    def insert_at_beginning(self, data):
        
        # If the LL has a head, then create a node and put this head after (next) the node.
        node = Node(data, self.head)
        self.head = node
        
    def insert_at_end(self, data):
        
        if self.head is None:
            self.head = Node(data, None)
            return
        
        itr = self.head
        
        # Exhaust the iterator until we know that itr.next is None.
        while itr.next:
            itr = itr.next
        
        itr.next = Node(data, None)
        
    
    def get_length(self):
        count = 0
        itr = self.head
        while itr:
            count+=1
            itr = itr.next

        return count
    
    
    def insert_at(self, index, data):
        if index<0 or index>self.get_length():
            raise Exception("Invalid Index")

        if index==0:
            self.insert_at_beginning(data)
            return

        count = 0
        itr = self.head
        while itr:
            
            # Once we get to the element before our insertion point..
            if count == index - 1: 
                
                # ..we need to create our node and make the next one point to the old itr.next.
                node = Node(data, itr.next)
                itr.next = node
                break

            itr = itr.next
            count += 1
            
    
    def remove_at(self, index):
        if index<0 or index>=self.get_length():
            raise Exception("Invalid Index")

        if index==0:
            self.head = self.head.next
            return

        count = 0
        itr = self.head
        while itr:
            
            # Once we get to the element before our deletion point..
            if count == index - 1:
                # ..we need it to point to the next next element (because the next element is being removed).
                itr.next = itr.next.next
                break

            itr = itr.next
            count+=1
        
        
    # Converts any iterable to a linked list.
    def insert_values(self, data_list_object):
        self.head = None
        for data in data_list_object:
            self.insert_at_end(data)
    
    

In [30]:
ll = LinkedList()
ll.insert_at_beginning(1)
ll.insert_at_beginning(2)
ll.insert_at_beginning(3)

ll.print()

3 --> 2 --> 1


In [31]:
ll.insert_at_end(4)
ll.insert_at_end(5)

ll.print()

3 --> 2 --> 1 --> 4 --> 5


In [32]:
ll2 = LinkedList()
ll2.insert_values(['pen', 'pineapple', 'apple', 'pen', 'grapes'])
ll2.print()

pen --> pineapple --> apple --> pen --> grapes


In [33]:
# Remove the 'pen' at index 3.
ll2.remove_at(3)
ll2.print()

pen --> pineapple --> apple --> grapes


In [34]:
# Add the 'banana' at index 2.
ll2.insert_at(2, 'banana')
ll2.print()

pen --> pineapple --> banana --> apple --> grapes


# 3. Hash Tables

## [3.1. Hash Tables Code Implementation - Codebasics](https://www.youtube.com/watch?v=ea8BRGxGmlA&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=5)

In [35]:
class HashTable:  
    def __init__(self):
        self.MAX = 100
        self.arr = [None for i in range(self.MAX)]
        
    def get_hash(self, key):
        hash = 0
        for char in key:
            hash += ord(char)
        return hash % self.MAX
    
    # This allows for the '[]' syntax. Objects made with this class will produce a return for obj[key].
    def __getitem__(self, key):
        h = self.get_hash(key)
        return self.arr[h]
    
    # this allows for the '[] = some_value' syntax. Objects made with this class will enable setting a value to obj[index].
    def __setitem__(self, key, val):
        h = self.get_hash(key)
        self.arr[h] = val    
        
    def __delitem__(self, key):
        h = self.get_hash(key)
        self.arr[h] = None        

In [36]:
t = HashTable()

# This only works because of the dunder method '__setitem__'
t["march 6"] = 310 
t["march 7"] = 420

In [37]:
# This only works because of the dunder method '__getitem__' 
t["march 6"] 

310

## [3.2. Collision Handling Code Implementation - Codebasics](https://www.youtube.com/watch?v=54iv1si4YCM&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=6)

We need to modify our HashTable class:
- We need to have an empty array for each empty spot in our larger array. Strictly, it should be a linked list, but Python has no built-in linked list type, so a plain list stands in for it.
- We need to store the key and value, as opposed to just the value, in the sub-array because, in case of collision, we'll need to manually search for the key within the sub-array.
- We need to modify `__setitem__`.

In [38]:
class HashTable2:  
    def __init__(self):
        self.MAX = 10
        self.arr = [[] for i in range(self.MAX)]
        
    def get_hash(self, key):
        hash = 0
        for char in key:
            hash += ord(char)
        return hash % self.MAX
    
    def __getitem__(self, key):
        arr_index = self.get_hash(key)
        for kv in self.arr[arr_index]:
            if kv[0] == key:
                return kv[1]
            
    def __setitem__(self, key, val):
        h = self.get_hash(key)
        found = False
        
        # We are looking in the 'linked list' (self.arr[h]) to see if it already exists
        for idx, element in enumerate(self.arr[h]):
            if len(element)==2 and element[0] == key:
                
                # If exists, we create a new tuple with the key and value.
                self.arr[h][idx] = (key,val)
                found = True
        
        # Otherwise, we append a new tuple to our 'linked list'.
        if not found:
            self.arr[h].append((key,val))
        
    def __delitem__(self, key):
        arr_index = self.get_hash(key)
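        for index, kv in enumerate(self.arr[arr_index]):
            if kv[0] == key:
                del self.arr[arr_index][index]
                # Keys are unique within a bucket, so stop here; this also avoids
                # mutating the list while we are still iterating over it.
                return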

'march 6' and 'march 17' produce the same hash: their ord totals (609 and 659) differ, but both leave a remainder of 9 when divided by MAX = 10. Therefore we expect to see a 'linked list' with both of them.
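
To see why these keys land where they do, the hash can be recomputed by hand. This sketch uses a standalone copy of `get_hash` with `MAX = 10` passed as a parameter (the parameter name `max_size` is an assumption for illustration):

```python
def get_hash(key, max_size=10):
    # Same rule as HashTable2.get_hash: sum of character codes, modulo the table size.
    return sum(ord(char) for char in key) % max_size


for key in ['march 6', 'march 8', 'march 17']:
    total = sum(ord(char) for char in key)
    print(key, total, get_hash(key))

# march 6 609 9
# march 8 611 1
# march 17 659 9
```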

In [39]:
t2 = HashTable2()

t2['march 6'] = 120
t2['march 8'] = 50 # No clash with this one.
t2['march 17'] = 459

t2.arr

[[],
 [('march 8', 50)],
 [],
 [],
 [],
 [],
 [],
 [],
 [],
 [('march 6', 120), ('march 17', 459)]]

# 4. Stacks

## [4.1. Stacks basics - Codebasics](https://www.youtube.com/watch?v=zwb3GmNAtFk&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=7)

If you're going through a number of webpages and you want to go back a few pages, how does the web browser know the order? We could use an array or a linked list, but arrays have an issue if you're browsing through many websites, and linked lists are slow because, if you want to go to the last website you visited, you have to traverse the entire linked list with time O(n).

Stacks are used for any 'ctrl + z' functionality, e.g. in Word.

<img src=Comp-Images/4.1.1.png width=600 />

The solution is a stack:

<img src=Comp-Images/4.1.2.png width=600 />

Time complexity:

- Push/Pop element -> O(1)
- Search element by value -> O(n)

Stacks are implemented in Python using:
- lists 
- collections.deque (implemented using a doubly linked list) -> `stack = deque()` followed by `stack.append(5)`, `stack.pop()` etc.
- queue.LifoQueue 
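
A minimal sketch of the browser-history example using `collections.deque` as the stack. The URLs are placeholders chosen for illustration.

```python
from collections import deque

history = deque()

# Push: each visited page goes on top of the stack.
history.append("https://example.com/")
history.append("https://example.com/world")
history.append("https://example.com/sport")

print(history[-1])    # peek at the top: https://example.com/sport

# Pop: pressing 'back' removes the most recently visited page first (LIFO).
print(history.pop())  # https://example.com/sport
print(history.pop())  # https://example.com/world
print(len(history))   # 1
```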

# 5. Queues

## [5.1. Queues Basics - Codebasics](https://www.youtube.com/watch?v=rUUrmGKYwHw&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=8)

Say you're a developer for the New York Stock Exchange (NYSE) and a bunch of companies want financial data, so you make synchronous post calls to all these companies:

<img src=Comp-Images/5.1.1.png width=700 />

This is called a **tightly coupled architecture**. Whenever another company wants the financial info, there have to be code changes to ensure that the right website gets the right information. So, 'finance.google.com' has to be added to the list of post calls. Another issue is that if Yahoo's HTTP server goes down for 5 mins, they will lose out on all that data because NYSE is making synchronous posts.

What if, instead, NYSE pushed the data into a memory buffer? This is a **loosely coupled architecture** (aka producer-consumer problem):

<img src=Comp-Images/5.1.2.png width=700 />

The data structure is **FIFO**.

Python implementations:
- lists (note that removing from the front with `pop(0)` is O(n))
- collections.deque (implemented using a doubly linked list) -> `q = deque()` followed by `q.append(5)`, `q.popleft()` etc.
- queue.Queue
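
A minimal sketch of the buffer idea with `collections.deque`. The `produce` and `consume` helpers and the price ticks are hypothetical, made up for illustration.

```python
from collections import deque

buffer = deque()


def produce(tick):
    # The producer only appends to the buffer; it knows nothing about consumers.
    buffer.append(tick)


def consume():
    # The consumer takes the oldest item first (FIFO), or None if the buffer is empty.
    return buffer.popleft() if buffer else None


produce({"company": "Wal Mart", "price": 131.10})
produce({"company": "Wal Mart", "price": 132.12})
produce({"company": "Wal Mart", "price": 135.00})

print(consume()["price"])  # 131.1
print(consume()["price"])  # 132.12
print(len(buffer))         # 1
```

If a consumer is down for a while, the ticks simply wait in the buffer until it comes back.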

# 6. Heaps

# 7. Trees

## [7.1. General Trees Code Implementation - Codebasics](https://www.youtube.com/watch?v=4r_XR9fUPhQ&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=9)

These are useful for information that has a natural hierarchical structure. 
- The root of the tree is called the **root node**. 
- Other elements are known as **nodes**. 
- The elements that have no children are called **leaf nodes**.
- Nodes have parents, children and ancestors.
- The tree structure can be referred to by levels where level 0 is the root node, level 1 are its children etc.

In [40]:
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.children = []
        self.parent = None

    def get_level(self):
        level = 0
        p = self.parent
        while p:
            level += 1
            p = p.parent

        return level

    def print_tree(self):
        spaces = ' ' * self.get_level() * 3
        prefix = spaces + "|__" if self.parent else ""
        print(prefix + self.data)
        if self.children:
            for child in self.children:
                child.print_tree()

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        
        
def build_product_tree():
    root = TreeNode("Electronics")

    laptop = TreeNode("Laptop")
    laptop.add_child(TreeNode("Mac"))
    laptop.add_child(TreeNode("Surface"))
    laptop.add_child(TreeNode("Thinkpad"))

    cellphone = TreeNode("Cell Phone")
    cellphone.add_child(TreeNode("iPhone"))
    cellphone.add_child(TreeNode("Google Pixel"))
    cellphone.add_child(TreeNode("Vivo"))

    tv = TreeNode("TV")
    tv.add_child(TreeNode("Samsung"))
    tv.add_child(TreeNode("LG"))

    root.add_child(laptop)
    root.add_child(cellphone)
    root.add_child(tv)

    root.print_tree()

In [41]:
build_product_tree()

Electronics
   |__Laptop
      |__Mac
      |__Surface
      |__Thinkpad
   |__Cell Phone
      |__iPhone
      |__Google Pixel
      |__Vivo
   |__TV
      |__Samsung
      |__LG


## [7.2. Binary Search Trees Code Implementation - Codebasics](https://www.youtube.com/watch?v=lFq5mYUWEBk&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=10)

The difference between a **binary tree** and a **binary search tree** is that **binary search trees** have some kind of order e.g. *all* nodes in the left subtree of a particular node have values less than that node's value (and all nodes in its right subtree have greater values). Another difference is that all BST nodes are unique: there are no duplicates anywhere in the entire tree.

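The search complexity of BSTs is O(log n) for a reasonably balanced tree, because with every iteration we reduce the search space by about half. In the worst case (e.g. values inserted in sorted order) the tree degenerates into a linked list and search becomes O(n).

Insertions go through the same reductions with each iteration; therefore, insert complexity is also O(log n) for a balanced tree.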
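
How much the insertion order matters can be seen by measuring tree height. This is a sketch with hypothetical helper names (`insert`, `height`, `balanced_order`), separate from the implementation further down; it inserts the same 127 values once in sorted order and once middle-element-first.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(root, data):
    """Iteratively insert data into the BST rooted at root; returns the root."""
    if root is None:
        return Node(data)
    current = root
    while True:
        if data == current.data:
            return root  # duplicates are ignored
        if data < current.data:
            if current.left is None:
                current.left = Node(data)
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = Node(data)
                return root
            current = current.right


def height(root):
    """Number of levels in the tree, counted level by level."""
    levels = 0
    frontier = [root] if root else []
    while frontier:
        levels += 1
        frontier = [child for node in frontier
                    for child in (node.left, node.right) if child]
    return levels


def balanced_order(values):
    """Reorder sorted values so the middle element is always inserted first."""
    if not values:
        return []
    mid = len(values) // 2
    return [values[mid]] + balanced_order(values[:mid]) + balanced_order(values[mid + 1:])


values = list(range(1, 128))  # 127 values

sorted_root = None
for v in values:
    sorted_root = insert(sorted_root, v)

balanced_root = None
for v in balanced_order(values):
    balanced_root = insert(balanced_root, v)

print(height(sorted_root))    # 127: one long chain, so search is O(n)
print(height(balanced_root))  # 7: roughly log2(n), so search is O(log n)
```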

**Depth first search** - There are 3 types of traversal: in-order, pre-order, post-order.

What's the difference?

<img src=Comp-Images/7.2.1.png width=700 />

- Pre-order: You visit your parent node first, then left subtree, then right subtree.
- In-order: You visit your left subtree first, then parent node, then right subtree.
- Post-order: You visit your left subtree first, then right subtree, then parent node.

To remember this easily, just consider the parent node: if it's in-order, it goes in the middle; if it's pre-order, it goes on the left; if it's post-order, it goes on the right.

**IN ORDER TRAVERSAL** - You will notice that this returns the elements in **ascending order**, with no duplicates (because a BST does not store duplicates).

So, to explain the in-order traversal of the tree rooted at **15**:

- We go to (but do not visit) the left subtree of 15 which is 12. How do we traverse the 12 subtree? By visiting its left subtree -> 7.
- After we visit 7 (left), we visit 12 (parent), then visit 14 (right). The 12 subtree has been traversed.
- We visit 15 (parent), then jump to the right subtree of 15 which is 27. How do we traverse the 27 subtree? By visiting its left subtree -> 20.
- How do we traverse 20? By visiting its left subtree (non-existent), then visiting its parent 20 and then 23 (right). 20 has been traversed. Continue the traversal of 27.
- We visited its left subtree (20), now visit 27 (parent), then visit 88 (right).
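
The three traversals can be compared side by side on this tree. This is a sketch that assumes the tree shape described in the walkthrough above (15 at the root; 12 with children 7 and 14; 27 with children 20 and 88; 23 as the right child of 20) and represents each node as a `(value, left, right)` tuple for brevity.

```python
# Each node is (value, left_subtree, right_subtree); None means no child.
tree = (15,
        (12, (7, None, None), (14, None, None)),
        (27, (20, None, (23, None, None)), (88, None, None)))


def pre_order(node):
    if node is None:
        return []
    value, left, right = node
    return [value] + pre_order(left) + pre_order(right)   # parent, left, right


def in_order(node):
    if node is None:
        return []
    value, left, right = node
    return in_order(left) + [value] + in_order(right)     # left, parent, right


def post_order(node):
    if node is None:
        return []
    value, left, right = node
    return post_order(left) + post_order(right) + [value]  # left, right, parent


print(pre_order(tree))   # [15, 12, 7, 14, 27, 20, 23, 88]
print(in_order(tree))    # [7, 12, 14, 15, 20, 23, 27, 88]
print(post_order(tree))  # [7, 14, 12, 23, 20, 88, 27, 15]
```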

**Code Implementation**

In [1]:
class BinarySearchTreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def add_child(self, data):
        if data == self.data:
            return # node already exists

        if data < self.data:
            if self.left:
                self.left.add_child(data)
            else:
                self.left = BinarySearchTreeNode(data)
        else:
            if self.right:
                self.right.add_child(data)
            else:
                self.right = BinarySearchTreeNode(data)


    def search(self, val):
        if self.data == val:
            return True

        if val < self.data:
            if self.left:
                return self.left.search(val)
            else:
                return False

        if val > self.data:
            if self.right:
                return self.right.search(val)
            else:
                return False

    def in_order_traversal(self):
        elements = []
        if self.left:
            elements += self.left.in_order_traversal()

        elements.append(self.data)

        if self.right:
            elements += self.right.in_order_traversal()

        return elements


def build_tree(elements):
    print("Building tree with these elements:",elements)
    root = BinarySearchTreeNode(elements[0])

    for i in range(1,len(elements)):
        root.add_child(elements[i])

    return root


countries = ["India","Pakistan","Germany", "USA","China","India","UK","USA"]
country_tree = build_tree(countries)

print("UK is in the list? ", country_tree.search("UK"))
print("Sweden is in the list? ", country_tree.search("Sweden"))

numbers_tree = build_tree([17, 4, 1, 20, 9, 23, 18, 34])
print("In order traversal gives this sorted list:",numbers_tree.in_order_traversal())

Building tree with these elements: ['India', 'Pakistan', 'Germany', 'USA', 'China', 'India', 'UK', 'USA']
UK is in the list?  True
Sweden is in the list?  False
Building tree with these elements: [17, 4, 1, 20, 9, 23, 18, 34]
In order traversal gives this sorted list: [1, 4, 9, 17, 18, 20, 23, 34]


The pre- and post-order traversals are just minor modifications to the above code.

## [7.3. Binary Search Trees + Deleting Nodes + Code Implementation - Codebasics](https://www.youtube.com/watch?v=JnrbMQyGLiU&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=11)

Deleting leaf nodes or nodes with only 1 child is relatively trivial; in the latter case you just replace your deleted node with the child. But deleting nodes with two children is more difficult.

<img src=Comp-Images/7.3.1.png width=600 />

The two rules of BSTs are:
- all nodes are unique.
- nodes to the left are smaller and nodes to the right are larger.

One approach is to:
- look into the **right subtree** and find the **minimum value** from the entire right subtree. 
- **replace** the node you want to delete with this minimum value node.
- Remove the duplicate. (The higher level node becomes the original and the lower level node becomes the duplicate.)
- This guarantees that all nodes in the right subtree are greater than the node that replaced the deleted node.

Another approach is: (basically the mirror image of the above approach)
- look into the **left subtree** and find the **maximum value** from the entire left subtree. 
- **replace** the node you want to delete with this maximum value node.
- Remove the duplicate. (The higher level node becomes the original and the lower level node becomes the duplicate.)
- This guarantees that all nodes in the left subtree are smaller than the node that replaced the deleted node.

**Code Implementation**

Focus on the `find_min`, `find_max` and `delete` functions.

In [6]:
class BinarySearchTreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

    def add_child(self, data):
        if data == self.data:
            return # node already exists

        if data < self.data:
            if self.left:
                self.left.add_child(data)
            else:
                self.left = BinarySearchTreeNode(data)
        else:
            if self.right:
                self.right.add_child(data)
            else:
                self.right = BinarySearchTreeNode(data)


    def search(self, val):
        if self.data == val:
            return True

        if val < self.data:
            if self.left:
                return self.left.search(val)
            else:
                return False

        if val > self.data:
            if self.right:
                return self.right.search(val)
            else:
                return False

    def in_order_traversal(self):
        elements = []
        if self.left:
            elements += self.left.in_order_traversal()

        elements.append(self.data)

        if self.right:
            elements += self.right.in_order_traversal()

        return elements
    
    #-----------------------NEW CODE-----------------------

    def find_max(self):
        if self.right is None:
            return self.data
        return self.right.find_max()

    def find_min(self):
        if self.left is None:
            return self.data
        return self.left.find_min()
    
    def delete(self, val):
        if val < self.data:
            if self.left:
                self.left = self.left.delete(val)
        elif val > self.data:
            if self.right:
                self.right = self.right.delete(val)
        else:
            
            # If we reach here, it means we've found the node that we want to delete.
            # But, we could still have subtrees below.
            # Remember that, from this point on in the code, our current position is at the node we want
            # to delete. So, all self.left etc is relative to this base node.
            
            # Our node is actually a leaf node.
            if self.left is None and self.right is None:
                return None
            
            # Our node has only a right child (which may itself be the root of a subtree).
            elif self.left is None:
                return self.right
            
            # Our node has only a left child (which may itself be the root of a subtree).
            elif self.right is None:
                return self.left
            
            # Use the first approach of deleting a node i.e. min val in right subtree.
            # BELOW and in the right subtree of the node we want to delete, we're finding the minimum value.
            min_val = self.right.find_min()
            
            # We 'delete' our node by replacing it with the minimum value from the right subtree.
            self.data = min_val
            
            # We get rid of any duplicates below our newly replaced node. This is where the None's are used;
            # we need to fill in the duplicate with None or one of the children.
            self.right = self.right.delete(min_val)

        return self


def build_tree(elements):
    print("Building tree with these elements:",elements)
    root = BinarySearchTreeNode(elements[0])

    for i in range(1,len(elements)):
        root.add_child(elements[i])

    return root

if __name__ == '__main__':
    numbers_tree = build_tree([17, 4, 1, 20, 9, 23, 18, 34])
    # delete() returns the (possibly new) root of the tree, so keep the result.
    numbers_tree = numbers_tree.delete(20)
    print("After deleting 20 ",numbers_tree.in_order_traversal()) # this should print [1, 4, 9, 17, 18, 23, 34]

Building tree with these elements: [17, 4, 1, 20, 9, 23, 18, 34]
After deleting 20  [1, 4, 9, 17, 18, 23, 34]


Run the above in PythonTutor, adding a print statement above `if val < self.data:` and above the final `return self`. Draw the tree too.

Here is a diagram explanation of the first example: (left to right)

<img src=Comp-Images/7.3.2.png width=600 /> <img src=Comp-Images/7.3.3.png width=600 /> 
<img src=Comp-Images/7.3.4.png width=600 /> <img src=Comp-Images/7.3.5.png width=600 /> 
<img src=Comp-Images/7.3.6.png width=600 /> 


# 8. Graphs

# 9. Binary Search

<img src=Comp-Images/9.1.png width=900 />

# 10. Bubble Sort

<img src=Comp-Images/10.1.png width=700 />

With bubble sort, we compute a pairwise comparison with the first two elements and then work our way up to the end. This causes the largest number to **'bubble'** up to the end of the array. We repeat this process until all elements are ordered.

This has terrible time complexity but good space complexity:

- Time complexity: O(n^2)
- Space complexity: O(1) (This is because we don't need any additional space; we just use the same array and swap numbers around.)

In [14]:
def bubble_sort(elements):
    size = len(elements)

    for i in range(size-1):
        swapped = False
        for j in range(size-1-i):
            if elements[j] > elements[j+1]:
                tmp = elements[j]
                elements[j] = elements[j+1]
                elements[j+1] = tmp
                swapped = True

        if not swapped:
            break


elements = [5,9,2,1,67,34,88,34]
# elements = [1,2,3,4,2]
# elements = ["mona", "dhaval", "aamir", "tina", "chang"]

bubble_sort(elements)
print(elements)

[1, 2, 5, 9, 34, 34, 67, 88]


# 11. Quick Sort

There are two ways to quick sort that are different to the easy recursive way in the Grokking Algorithms notebook. These are:
- Hoare Partition Scheme
- Lomuto Partition Scheme
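
Only the Hoare scheme is implemented below, so here is a minimal sketch of the Lomuto scheme for comparison. It assumes the common textbook variant that picks the last element as the pivot; the function names are made up for illustration.

```python
def lomuto_partition(array, low, high):
    pivot = array[high]  # assumption: the last element is the pivot
    i = low              # array[low:i] holds the elements <= pivot found so far
    for j in range(low, high):
        if array[j] <= pivot:
            array[i], array[j] = array[j], array[i]
            i += 1
    # Put the pivot between the two partitions; it is now in its final position.
    array[i], array[high] = array[high], array[i]
    return i


def quicksort_lomuto(array, low=0, high=None):
    if high is None:
        high = len(array) - 1
    if low < high:
        p = lomuto_partition(array, low, high)
        quicksort_lomuto(array, low, p - 1)   # left of the pivot
        quicksort_lomuto(array, p + 1, high)  # right of the pivot
    return array


print(quicksort_lomuto([11, 9, 29, 7, 2, 15, 28]))  # [2, 7, 9, 11, 15, 28, 29]
```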

## [11.1. Hoare Implementation - Codebasics](https://www.youtube.com/watch?v=5iSZ7mh_RAk&list=PLeo1K3hjS3uu_n_a__MI_KktGTLYopZ12&index=15)

Steps:

<img src=Comp-Images/11.1.1.png width=400 />

1. Choose a pivot (choosing first element...)

2. Set the start pointer to the next element and the end pointer to the last element.

3. Keep moving the start pointer to the next element until you find an element **greater than the pivot** and then STOP.

<img src=Comp-Images/11.1.2.png width=400 />

4. Now move the end pointer to the previous element until you find an element **less than the pivot** and then STOP.

<img src=Comp-Images/11.1.3.png width=400 />

5. Swap the elements at the start and end pointer.

<img src=Comp-Images/11.1.4.png width=400 />

6. Repeat from step 3: Move the start pointer to the next element until you find an element **greater than the pivot** and then STOP.

<img src=Comp-Images/11.1.5.png width=400 />

7. Move on to step 4 again.

<img src=Comp-Images/11.1.6.png width=400 />

8. But, if the end pointer precedes the start pointer, **swap the end and the pivot**.

<img src=Comp-Images/11.1.7.png width=400 />

9. The pivot is now in the correct position. Repeat for the left partition and right partition recursively.
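
The steps above can be sketched directly in code. This is a minimal sketch, assuming the first element is the pivot, the start pointer stops at an element greater than the pivot, and the end pointer stops at an element less than or equal to it. The function names are made up for illustration, and this differs from the implementation that follows, which picks the middle element as the pivot.

```python
def partition_first_pivot(array, low, high):
    pivot = array[low]   # step 1: the first element is the pivot
    start = low + 1      # step 2: start pointer on the next element
    end = high           #         end pointer on the last element

    while True:
        # Steps 3/6: advance start until it sits on an element greater than the pivot.
        while start <= end and array[start] <= pivot:
            start += 1
        # Steps 4/7: move end back until it sits on an element not greater than the pivot.
        while start <= end and array[end] > pivot:
            end -= 1
        # Step 8: the pointers have crossed, so stop scanning.
        if start > end:
            break
        # Step 5: swap the out-of-place pair.
        array[start], array[end] = array[end], array[start]

    # Step 8 (continued): swap the pivot with the element at the end pointer.
    array[low], array[end] = array[end], array[low]
    return end


def quicksort_first_pivot(array, low=0, high=None):
    if high is None:
        high = len(array) - 1
    if low < high:
        p = partition_first_pivot(array, low, high)
        # Step 9: the pivot is in its final place; recurse on both sides of it.
        quicksort_first_pivot(array, low, p - 1)
        quicksort_first_pivot(array, p + 1, high)
    return array


numbers = [11, 9, 29, 7, 2, 15, 28]
print(partition_first_pivot(numbers, 0, len(numbers) - 1))  # 3
print(numbers)                                              # [7, 9, 2, 11, 29, 15, 28]
print(quicksort_first_pivot(numbers))                       # [2, 7, 9, 11, 15, 28, 29]
```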

In [15]:
# implementation of quick sort in python using hoare partition scheme

class Sort:
    def quicksort(self, array, low, high):
        if low < high:
            # Partition around a pivot value; `pivot` is the index where the array splits.
            pivot = self.partition(array, low, high)
            
            # Sort the left side of the partition.
            self.quicksort(array, low, pivot)
            
            # Sort the right side of the partition.
            self.quicksort(array, pivot + 1, high)

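    def partition(self, array, low, high):
        # Note: the pivot here is the middle element, not the first element as in the steps above.
        pivot = array[(high + low) // 2]
        i = low
        j = high

        while True:
            while array[i] < pivot:
                i += 1
            while array[j] > pivot:
                j -= 1
            if i >= j:
                return j
            array[i], array[j] = array[j], array[i]
            # Step both pointers past the swapped pair; without this, two elements
            # equal to the pivot would be swapped back and forth forever.
            i += 1
            j -= 1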
    
    # Main function to call
    def sort(self, array):
        self.quicksort(array, 0, len(array) - 1)
        return array
    

my_arr = [11,9,29,7,2,15,28]

sort_object = Sort()

sort_object.sort(my_arr)

[2, 7, 9, 11, 15, 28, 29]