### *This is a reference for data structures in Python. Each section gives a brief explanation, followed by the Python code and its test output.*

___________________________________________________________________________________________________________________

# Data Structures

A data structure is a collection of data values, the relationships among them, and the
functions or operations that can be applied to the data.

A data structure is not only used for organizing data. It is also used for
processing, retrieving, and storing it. Basic and advanced data structures
appear in almost every program or software system, so a good working
knowledge of them is essential.

Data structures determine how data is arranged in memory, and they are
responsible for organizing, processing, accessing, and storing it efficiently.
Each type of data structure has its own characteristics, applications,
advantages, and disadvantages.

___________________________________________________________________________________________________________________

## Array 

An array is a collection of items stored at contiguous memory locations. The idea is to
store multiple items of the same type together.

<div>
<img src='https://logicmojo.com/assets/dist/new_pages/images/array-ele-java.jpg' width=500 height=500/>
</div>

In [1]:
import numpy as np

arr = np.array([1, 2, 3, 5], dtype=int)
arr2 = np.array([3.0, 3.5, 4.0], dtype=float)
arr3 = np.array(["a", "b", "c"], dtype=str)

In [2]:
arr

array([1, 2, 3, 5])

In [3]:
arr2

array([3. , 3.5, 4. ])

In [4]:
arr3

array(['a', 'b', 'c'], dtype='<U1')
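
To make the "same type, stored contiguously" idea concrete without NumPy, here is a minimal sketch using the standard-library `array` module. The variable names are illustrative and are not part of the cells above.

```python
from array import array

# 'i' is the type code for a signed C int; every element shares this type
numbers = array('i', [1, 2, 3, 5])

# Each element occupies the same number of bytes, so the buffer is one
# contiguous block of len(numbers) * itemsize bytes
total_bytes = len(numbers) * numbers.itemsize

# A value of another type is rejected rather than silently stored
try:
    numbers.append(2.5)
    rejected = False
except TypeError:
    rejected = True

print(numbers.tolist(), total_bytes == len(numbers.tobytes()), rejected)
```

A plain Python `list`, by contrast, stores references to objects of any type, which is why typed arrays are usually preferred for large numeric data.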

___________________________________________________________________________________________________________________

## Linked Lists

A linked list is a linear collection of data elements of any type, called nodes, where each node holds a value and points to the next node in the list. The principal advantage of a linked list over an array is that, once the position is known, a value can be inserted or removed by relinking a couple of pointers, without relocating the rest of the list.

![image.png](attachment:image.png)
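
The "no relocation" claim can be sketched in a few lines. This is a minimal illustration under the assumption that a reference to the preceding node is already in hand; `SketchNode`, `insert_after` and `to_list` are hypothetical names, independent of the `Node` and `Linked` classes defined in this section.

```python
class SketchNode:
    """Hypothetical minimal node used only for this illustration."""
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def insert_after(node, data):
    # Only two references change, however long the list is
    node.next = SketchNode(data, node.next)

def to_list(head):
    # Walk the chain and collect the values for easy inspection
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out

# Build 1 -> 2 -> 4, keeping a reference to the node holding 2
second = SketchNode(2, SketchNode(4))
head = SketchNode(1, second)

insert_after(second, 3)
print(to_list(head))  # [1, 2, 3, 4]
```

Finding the node in the first place still takes a walk from the head, so the constant-time benefit applies to the relinking step, not to the search.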

In [9]:
class Node:
    def __init__(self, data=0, next=None):
        # next defaults to None so that the last node cleanly terminates the list
        self.data = data
        self.next = next

class Linked:
    def __init__(self):
        self.head = None

    def traverse(self):
        if self.head is None:
            print('List is empty')

        cur = self.head
        data = ''
        while cur:
            data += str(cur.data) + ' --> '
            cur = cur.next
        data += 'None'
        return data

    def __iter__(self):
        if self.head is None:
            print('Nothing to iterate')

        cur = self.head
        while cur:
            yield cur.data
            cur = cur.next

    def length(self):
        if self.head is None:
            print('List is empty')

        count = 0
        cur = self.head
        while cur:
            count += 1
            cur = cur.next
        return count

    def insert_array(self, array: list):
        # Append every element in order (the early return used to stop after the first one)
        for data in array:
            self.insert_end(data)

    def insert_start(self, data):
        # Works for an empty list too: the new node's next is then simply None
        self.head = Node(data, self.head)

    def insert_end(self, data):
        if self.head is None:
            self.head = Node(data, None)
            return  # without this return the first item was added twice

        cur = self.head
        while cur.next:
            cur = cur.next
        cur.next = Node(data, None)

    def insert_somewhere(self, prev_node, data):
        if prev_node is None:
            print('Node doesnt exist')
            return  # nothing to link to

        new_node = Node(data, None)
        new_node.next = prev_node.next
        prev_node.next = new_node

    def insert_index(self, index, data):
        # An index is invalid if it is negative OR past the end
        if index < 0 or index > self.length():
            print('Invalid index')
            return

        if index == 0:
            self.head = Node(data, self.head)
            return

        count = 0
        cur = self.head
        while cur:
            if count == index - 1:
                cur.next = Node(data, cur.next)
                break
            cur = cur.next
            count += 1

    def insert_byData(self, data, new_data):
        if self.head is None:
            print('List is empty')
            return

        # Insert new_data right after the first node whose value equals data
        cur = self.head
        while cur:
            if cur.data == data:
                cur.next = Node(new_data, cur.next)
                break
            cur = cur.next

    def remove_index(self, index):
        # Valid indexes run from 0 to length - 1
        if index < 0 or index >= self.length():
            print('Invalid index')
            return

        if index == 0:
            self.head = self.head.next
            return

        count = 0
        cur = self.head
        while cur.next:
            if count == index - 1:
                cur.next = cur.next.next
                break
            cur = cur.next
            count += 1

    def remove_byData(self, data):
        if self.head is None:
            print('Nothing to remove')
            return

        if self.head.data == data:
            self.head = self.head.next
            return

        # Look one node ahead so that the matching node itself is unlinked
        cur = self.head
        while cur.next:
            if cur.next.data == data:
                cur.next = cur.next.next
                break
            cur = cur.next

    def delete(self):
        # Dropping the head reference is enough; the unreachable nodes are garbage collected
        self.head = None
        return "All nodes are deleted successfully"

    def search(self, node):
        if self.head is None:
            return False
        
        cur = self.head
        while cur:
            if cur.data == node:
                return True
            cur = cur.next
        return False

In [11]:
def testcase():
    
    llist = Linked()

    a = Node('India')
    b = Node('England')
    c = Node('Pakistan')
    d = Node('Australia')
    e = Node('New Zealand')
    f = Node('West Indies')
    g = Node('Sri Lanka')

    llist.head = a
    a.next = b
    b.next = c
    c.next = d
    d.next = e
    e.next = f
    f.next = g

    #test traverse()

    print('------------------------Traverse------------------------\n')
    print(llist.traverse())

    #test length()
    print('\n------------------------Length------------------------\n')
    print(llist.length())

    #test insert_start()
    print('\n------------------------Start------------------------\n')
    llist.insert_start('South Africa')
    print(llist.traverse())

    #test insert_end()
    print('\n------------------------End------------------------\n')
    llist.insert_end('Bangladesh')
    print(llist.traverse())

    #test insert_somewhere()
    print('\n------------------------Somewhere------------------------\n')
    llist.insert_somewhere(f, 'Ireland')
    print(llist.traverse())

    #test insert_array()
    #print('\n------------------------Array------------------------\n')
    #ll = Linked()
    #ll.insert_array(['a','b','c','d','e','f','g'])
    #print(ll.traverse())

    #test insert_byData()
    print('\n------------------------ByData------------------------\n')
    llist.insert_byData('India', 'Netherlands')
    print(llist.traverse())

    #test insert_index()
    print('\n------------------------Index------------------------\n')
    llist.insert_index(2, 'Zimbabwe')
    print(llist.traverse())

    #test remove_index()
    print('\n------------------------Remove Index------------------------\n')
    llist.remove_index(3)
    print(llist.traverse())

    #test remove_byData()
    print('\n------------------------Remove byData------------------------\n')
    llist.remove_byData('Australia')
    print(llist.traverse())

    #test delete()
    #print('\n------------------------Delete------------------------\n')
    #print(llist.delete())
    #print(llist.traverse())

    print('\n------------------------Search------------------------\n')

    print(llist.search('Ireland'))


testcase()

------------------------Traverse------------------------

India --> England --> Pakistan --> Australia --> New Zealand --> West Indies --> Sri Lanka --> None

------------------------Length------------------------

7

------------------------Start------------------------

South Africa --> India --> England --> Pakistan --> Australia --> New Zealand --> West Indies --> Sri Lanka --> None

------------------------End------------------------

South Africa --> India --> England --> Pakistan --> Australia --> New Zealand --> West Indies --> Sri Lanka --> Bangladesh --> None

------------------------Somewhere------------------------

South Africa --> India --> England --> Pakistan --> Australia --> New Zealand --> West Indies --> Ireland --> Sri Lanka --> Bangladesh --> None

------------------------ByData------------------------

South Africa --> India --> Netherlands --> England --> Pakistan --> Australia --> New Zealand --> West Indies --> Ireland --> Sri Lanka --> Bangladesh --> None

--

___________________________________________________________________________________________________________________

## Hash table

Hash tables, also known as hash maps, are data structures that provide fast retrieval
of values based on keys. They use a hashing function to map keys to indexes in an
array, allowing for constant-time access in the average case. Hash tables are
commonly used to implement dictionaries, caches, and database indexes. However,
hash collisions occur when two keys map to the same index, which can degrade
performance. Techniques like chaining and open addressing are employed to handle
collisions.

![image.png](attachment:image.png)
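
The `Hashmap` class in this section resolves collisions by chaining. For comparison, here is a minimal sketch of the other technique mentioned above, open addressing with linear probing. `LinearProbingMap` is a hypothetical name, and the sketch assumes a fixed capacity with no resizing and no deletion.

```python
class LinearProbingMap:
    """Hypothetical open-addressing map: fixed capacity, no resize, no delete."""
    _EMPTY = object()  # sentinel marking a slot that was never used

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [self._EMPTY] * capacity

    def _probe(self, key):
        # Start at the hashed index, then step forward one slot at a time
        start = hash(key) % self.capacity
        for step in range(self.capacity):
            yield (start + step) % self.capacity

    def set(self, key, value):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is self._EMPTY or slot[0] == key:
                self.slots[i] = (key, value)
                return
        raise OverflowError('hash map is full')

    def get(self, key, default=None):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is self._EMPTY:
                return default  # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
        return default

table = LinearProbingMap(capacity=8)
table.set(1, 'a')   # hash(1) % 8 == 1
table.set(9, 'b')   # hash(9) % 8 == 1 as well, so it moves on to slot 2
print(table.get(1), table.get(9), table.get(17))
```

Integer keys are used in the example because small integers hash to themselves in CPython, which makes the collision easy to see.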

In [12]:
class Hashmap:
    def __init__(self):
        self.size = 10
        self.data = [[] for _ in range(self.size)]

    def __str__(self):
        return str(self.data)

    def hash_function(self, key):
        hash_key = 0
        for char in key:
            hash_key += ord(char)
        return hash_key % self.size

    def set(self, key, value):
        hash_key = self.hash_function(key)
        bucket = self.data[hash_key]
        for i, item in enumerate(bucket):
            if item[0] == key:
                # Tuples are immutable, so replace the whole pair
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        hash_key = self.hash_function(key)
        bucket = self.data[hash_key]
        # Scan the whole bucket; returning on the first pair missed keys that collided
        for pair in bucket:
            if pair[0] == key:
                return pair[1]
        return None

    def remove(self, key):
        hash_key = self.hash_function(key)
        bucket = self.data[hash_key]
        # Check every pair in the bucket, not only the first one
        for item in bucket:
            if item[0] == key:
                bucket.remove(item)
                return

    def keys(self):
        keys = []
        for bucket in self.data:
            for pair in bucket:
                keys.append(pair[0])
        return keys

    def values(self):
        values = []
        for bucket in self.data:
            for pair in bucket:
                values.append(pair[1])
        return values

In [13]:
def hash_test():

    test = Hashmap()

    test.set('name', 'Jose')
    test.set('age', 25)
    test.set('location', 'London')
    test.set('division', 'AI')
    test.set('company', 'Amazon')

    print(test.get('age'))

    print(test.keys())
    print(test.values())

    print(test)

    test.remove('division')
    print(test)

hash_test()

25
['age', 'name', 'location', 'division', 'company']
[25, 'Jose', 'London', 'AI', 'Amazon']
[[], [('age', 25)], [], [], [], [], [], [('name', 'Jose'), ('location', 'London')], [], [('division', 'AI'), ('company', 'Amazon')]]
[[], [('age', 25)], [], [], [], [], [], [('name', 'Jose'), ('location', 'London')], [], [('company', 'Amazon')]]


In [19]:
def hash_t():
    
    # Create a new Hash object
    user_hash_table = Hashmap()
    
    # Add users to the hash table
    user_hash_table.set("alice1@example.com", {"name": "Alice", "age": 30, "is_member":True})
    user_hash_table.set("Bob_2@example.com", {"name": "Bob", "age": 25, "is_member":False})
    user_hash_table.set("CHARLIE-3@example.com", {"name": "Charlie", "age": 22, "is_member":True})
    user_hash_table.set("David@example.com", {"name": "David", "age": 28, "is_member":False})
    user_hash_table.set("user5@example.com", {"name": "Eva", "age": 33, "is_member":True})
    
    # Retrieve information about a user
    user_email = "David@example.com"
    user_info = user_hash_table.get(user_email)
    
    if user_info:
        print(f"User {user_email} details:")
        print(f"Name: {user_info['name']}")
        print(f"Age: {user_info['age']}")
        print(f"Is Member: {user_info['is_member']}")
    else:
        print(f"User {user_email} not found.")
   
    # Remove a user from the hash table
    user_to_remove = "Bob_2@example.com"
    user_hash_table.remove(user_to_remove)
    print(f"User {user_to_remove} removed.")
    
    # Get all user emails in the hash table
    all_user_emails = user_hash_table.keys()
    print("All user emails:", all_user_emails)
    
    # Get all user information in the hash table
    all_user_info = user_hash_table.values()
    print("All user information:", all_user_info)

hash_t()

User David@example.com details:
Name: David
Age: 28
Is Member: False
User Bob_2@example.com removed.
All user emails: ['David@example.com', 'alice1@example.com', 'CHARLIE-3@example.com', 'user5@example.com']
All user information: [{'name': 'David', 'age': 28, 'is_member': False}, {'name': 'Alice', 'age': 30, 'is_member': True}, {'name': 'Charlie', 'age': 22, 'is_member': True}, {'name': 'Eva', 'age': 33, 'is_member': True}]


___________________________________________________________________________________________________________________

## Stack

A stack is a linear data structure that stores items in a Last-In/First-Out (LIFO)
manner. In a stack, elements are added and removed at the same end, called the top.
The insert and delete operations are usually called push and pop.

![image.png](attachment:image.png)

In [20]:
class Stack:
    def __init__(self):
        self.stack = []

    def __str__(self):
        return str(self.stack[::-1])

    def push(self, item):
        self.stack.append(item)
        return f'{item} has been added to stack'

    def pop(self):
        if len(self.stack) > 0:
            return f'item removed from stack is {self.stack.pop()}'
        return None

    def peek(self):
        if len(self.stack) > 0:
            return f'Next available item from stack is {self.stack[-1]}'
        return None

    def clear(self):
        # Empty the list in place; `del self.stack` removed the attribute and broke later calls
        self.stack.clear()
        return 'Stack now empty'

    def is_empty(self):
        return len(self.stack) == 0

In [21]:
def test():
    st = Stack()

    st.push('a')
    st.push('b')
    st.push('c')
    st.push('r')
    st.push('z')

    print(st.peek())

    print(st.pop())

    print(st.is_empty())

    print(st)

test()

Next available item from stack is z
item removed from stack is z
False
['r', 'c', 'b', 'a']
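
A classic use of LIFO order is checking that brackets are balanced: the most recently opened bracket must be the first one closed. The following is a minimal sketch using a plain list as the stack; `is_balanced` is a hypothetical helper, not part of the `Stack` class above.

```python
def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in text:
        if ch in '([{':
            stack.append(ch)  # push an opener
        elif ch in pairs:
            # The most recently pushed opener must match this closer
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftovers mean unclosed openers

print(is_balanced('{[()]}'))  # True
print(is_balanced('([)]'))    # False
```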


___________________________________________________________________________________________________________________

## Queue

A queue is a linear data structure that stores items in a First-In/First-Out (FIFO)
manner. With a queue, the least recently added item is removed first. A good example
is any line of consumers waiting for a resource, where the consumer that came first
is served first.

![image.png](attachment:image.png)

In [22]:
from collections import deque

class Queue:
    def __init__(self):
        self.queue = deque()

    def __str__(self):
        return str(self.queue)

    def push(self, item):
        self.queue.append(item)
        return 'item added to queue'

    def pop(self):
        if len(self.queue) > 0:
            return self.queue.popleft()
        else:
            return 'no item in queue'

    def clear(self):
        # Empty the deque in place instead of deleting the attribute
        self.queue.clear()
        return 'queue is empty'

    def peek(self):
        if len(self.queue) > 0:
            return self.queue[0]
        return None

In [23]:
def test_case():
    
    q = Queue()

    q.push('a')
    q.push('b')
    q.push('c')
    q.push('d')
    q.push('e') 

    q.pop()

    print(q.peek())

    print(q)

test_case()

b
deque(['b', 'c', 'd', 'e'])
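
FIFO order is what makes a queue suitable for fair scheduling. As a rough sketch, the hypothetical `round_robin` function below gives each task one time slice per turn and sends unfinished tasks to the back of the line. The task names and durations are made up for illustration.

```python
from collections import deque

def round_robin(tasks, quantum=1):
    """tasks: iterable of (name, time_needed). Returns names in completion order."""
    queue = deque(tasks)
    finished = []
    while queue:
        name, remaining = queue.popleft()      # serve whoever waited longest
        remaining -= quantum
        if remaining > 0:
            queue.append((name, remaining))    # not done: rejoin at the back
        else:
            finished.append(name)
    return finished

print(round_robin([('A', 2), ('B', 1), ('C', 3)]))  # ['B', 'A', 'C']
```

`deque.popleft()` runs in constant time, whereas `list.pop(0)` has to shift every remaining element, which is why `deque` is the usual choice for queues in Python.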


___________________________________________________________________________________________________________________

## Priority Queue

Priority queues are abstract data structures where each value in the queue has
a certain priority. For example, on airlines, baggage tagged “Business” or “First class”
arrives earlier than the rest.

A priority queue is an extension of the queue with the following properties:
1. An element with high priority is dequeued before an element with low priority.
2. If two elements have the same priority, they are served according to their order in
the queue.

![image.png](attachment:image.png)

In [24]:
from heapq import heappush , heappop

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        heappush(self._queue, (-priority, self._index, item))
        self._index += 1

    def pop(self):
        if not self._queue:
            raise IndexError("pop from an empty priority queue")
        _, _, item = heappop(self._queue)
        return item

    def peek(self):
        if not self._queue:
            raise IndexError("peek from an empty priority queue")
        _, _, item = self._queue[0]
        return item

    def __len__(self):
        return len(self._queue)

    def is_empty(self):
        return len(self._queue) == 0
    
    def __str__(self):
        return str(self._queue)

In [25]:
# Create a new priority queue
emergency_queue = PriorityQueue()

# Add patients to the priority queue with their priority (10: most critical, 0: least critical)
emergency_queue.push(("John", "Heart Attack"), 10)
emergency_queue.push(("Alice", "Broken Arm"), 6)
emergency_queue.push(("Bob", "High Fever"), 2)
emergency_queue.push(("Emma", "Head Injury"), 9)

# Check the next patient to be treated
next_patient = emergency_queue.peek()
print("Next patient:", next_patient)  # Output: ('John', 'Heart Attack')

# Treat the next patient
treated_patient = emergency_queue.pop()
print("Treated patient:", treated_patient)  # Output: ('John', 'Heart Attack')

# Check the next patient after treating one
next_patient = emergency_queue.peek()
print("Next patient:", next_patient)  # Output: ('Emma', 'Head Injury')

# Add a new patient to the priority queue
emergency_queue.push(("Daniel", "Allergic Reaction"), 2)

# Treat the next patient
treated_patient = emergency_queue.pop()
print("Treated patient:", treated_patient)  # Output: ('Emma', 'Head Injury')

#Remaining patients
print('remaining patients', emergency_queue) # Output [(-6, 1, ('Alice', 'Broken Arm')), (-2, 4, ('Daniel', 'Allergic Reaction')), (-2, 2, ('Bob', 'High Fever'))]

Next patient: ('John', 'Heart Attack')
Treated patient: ('John', 'Heart Attack')
Next patient: ('Emma', 'Head Injury')
Treated patient: ('Emma', 'Head Injury')
remaining patients [(-6, 1, ('Alice', 'Broken Arm')), (-2, 4, ('Daniel', 'Allergic Reaction')), (-2, 2, ('Bob', 'High Fever'))]
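
The second property (ties are served in arrival order) does not come for free from a heap. A common way to get it is to store an increasing counter next to the priority. Here is a minimal sketch of that idea; `drain_by_priority` is a hypothetical helper and the labels are invented.

```python
import heapq
from itertools import count

def drain_by_priority(items):
    """items: iterable of (label, priority); higher priority leaves first."""
    order = count()  # increasing tie-breaker: 0, 1, 2, ...
    heap = []
    for label, priority in items:
        # heapq is a min-heap, so the priority is negated to pop the largest first
        heapq.heappush(heap, (-priority, next(order), label))
    result = []
    while heap:
        _, _, label = heapq.heappop(heap)
        result.append(label)
    return result

print(drain_by_priority([('low', 1), ('first high', 5), ('second high', 5)]))
# ['first high', 'second high', 'low']
```

The counter also prevents Python from comparing the payloads themselves when two priorities are equal, which would fail for items that do not support `<`.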


___________________________________________________________________________________________________________________

## Tree

Trees represent a hierarchical organization of elements. A tree consists of nodes connected by edges, with one node being the root and all other nodes forming subtrees. Trees are widely used in various algorithms and data storage scenarios.

![image.png](attachment:image.png)


In [28]:
class Tree:
    def __init__(self, name, role):
        self.name = name
        self.role = role
        self.children = []
        self.parent = None

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

    def get_level(self):
        level = 0
        p = self.parent
        while p:
            p = p.parent
            level += 1
        return level

    def print_info(self, display='both'):
        # display is one of 'name', 'role' or 'both'. A plain default replaces the
        # input() call, which ran once at class-definition time rather than per call.
        if display == 'both':
            value = f'{self.name} ({self.role})'
        elif display == 'name':
            value = self.name
        else:
            value = self.role

        spaces = ' ' * self.get_level() * 3
        prefix = spaces + '|---> ' if self.parent else ''
        print(prefix + value)
        for child in self.children:
            child.print_info(display)


In [30]:
def management():

    root = Tree('Nilpul', 'CEO')

    parent1 = Tree('Chinmay', 'CTO')
    child1 = Tree('Vishwa', 'Infrastructure Head')
    grandchild1 = Tree('Dhaval', 'Cloud Manager')
    grandchild2 = Tree('Abhijit', 'App Manager')


    child2 = Tree('Aamir', 'Application Head')

    parent2 = Tree('Gels','HR Head')
    childx = Tree('Peter','Recruitment Manager')
    childy = Tree('Waqqas', 'Policy Manager')

    parent1.add_child(child1)
    parent1.add_child(child2)

    child1.add_child(grandchild1)
    child1.add_child(grandchild2)

    parent2.add_child(childx)
    parent2.add_child(childy)

    root.add_child(parent1)
    root.add_child(parent2)

    root.print_info()

    #print(grandchild2.get_level())

management()

Nilpul (CEO)
   |---> Chinmay (CTO)
      |---> Vishwa (Infrastructure Head)
         |---> Dhaval (Cloud Manager)
         |---> Abhijit (App Manager)
      |---> Aamir (Application Head)
   |---> Gels (HR Head)
      |---> Peter (Recruitment Manager)
      |---> Waqqas (Policy Manager)
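
Because every child of a node is itself the root of a smaller tree, most tree questions have short recursive answers. The sketch below assumes a simplified representation, a `(name, children)` tuple, rather than the `Tree` class above; `count_nodes` and `height` are hypothetical helpers.

```python
def count_nodes(node):
    # One for this node plus everything below it
    _, children = node
    return 1 + sum(count_nodes(child) for child in children)

def height(node):
    # Number of edges on the longest path from this node down to a leaf
    _, children = node
    if not children:
        return 0
    return 1 + max(height(child) for child in children)

# The same organisation chart as in the example above
org = ('Nilpul', [
    ('Chinmay', [
        ('Vishwa', [('Dhaval', []), ('Abhijit', [])]),
        ('Aamir', []),
    ]),
    ('Gels', [('Peter', []), ('Waqqas', [])]),
])

print(count_nodes(org), height(org))  # 9 3
```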


___________________________________________________________________________________________________________________

## Binary Search Tree

A Binary Search Tree (BST) is a special type of binary tree in which every value in a node's left subtree is less than the node's value and every value in its right subtree is greater. This property is called the BST property, and it makes it possible to efficiently search, insert, and delete elements in the tree.

The property holds recursively: each left subtree is itself a BST whose nodes have smaller values than its root, and each right subtree is a BST whose nodes have larger values. As a consequence, the root is larger than the largest value in its left subtree and smaller than the smallest value in its right subtree.

![image.png](attachment:image.png)

In [32]:
from collections import deque

class Node:
    '''Node class that contains the data to form binary tree'''
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BST:
    def __init__(self):
        '''Parent or root of Binary tree'''
        self.root = None

    def _insert(self, root, key):
        '''Takes root (the root of a subtree) and key (the Node to insert). Goes left or right depending on whether key's data is < or > than root's; duplicates are ignored'''
        if key.data < root.data:
            if root.left is None:
                root.left = key
            else:
                self._insert(root.left, key)
        elif key.data > root.data:
            if root.right is None:
                root.right = key
            else:
                self._insert(root.right, key)

    def insert(self, key):
        '''Public insert. Wraps key in a Node if needed, makes it the root of an empty tree, otherwise delegates to _insert(root, key)'''
        if not isinstance(key, Node):
            key = Node(key)

        if self.root is None:
            self.root = key

        else:
            self._insert(self.root, key)

    def buildTree(self, array):
        '''Builds a tree by inserting each element of an array or list in order'''
        for i in array:
            self.insert(i)

    def MIN(self):
        '''Finds the min value in the tree, i.e. the leftmost value (None if the tree is empty)'''
        if self.root is None:
            return None
        cur = self.root
        while cur.left:
            cur = cur.left
        return cur.data

    def MAX(self):
        '''Finds the max value in the tree, i.e. the rightmost value (None if the tree is empty)'''
        if self.root is None:
            return None
        cur = self.root
        while cur.right:
            cur = cur.right
        return cur.data

    def min_node(self, root):
        '''Finds min Node from binary tree'''
        cur = root
        while cur.left:
            cur = cur.left
        return cur

    def _search(self, root, key):
        '''Recursively searches for key in the subtree rooted at root'''
        if root:
            if key == root.data:
                return 'Value found'
            elif key < root.data:
                return self._search(root.left, key)
            else:
                return self._search(root.right, key)
        return 'Value not found'

    def search(self, key):
        '''Public wrapper that starts the recursive _search at self.root'''
        return self._search(self.root, key)

    def _remove(self, root, key):
        '''Function that removes key from root'''
        if root is None:
            return root

        if key < root.data:
            root.left = self._remove(root.left, key)

        elif key > root.data:
            root.right = self._remove(root.right, key)

        else:
            if root.left is None:
                temp = root.right
                root = None
                return temp
            elif root.right is None:
                temp = root.left
                root = None
                return temp

            temp = self.min_node(root.right)
            root.data = temp.data
            root.right = self._remove(root.right, temp.data)

        return root

    def remove(self, key):
        '''helper function for _remove'''
        self.root = self._remove(self.root, key)


    def _in(self, root):
        if root:
            self._in(root.left)
            print(root.data, end=' ')
            self._in(root.right)

    def inorder(self):
        '''Uses _in function for in-order DFS, left -> root -> right'''
        return self._in(self.root)

    def _pre(self, root):
        if root:
            print(root.data, end=' ')
            self._pre(root.left)
            self._pre(root.right)

    def preorder(self):
        '''Uses _pre function for pre-order DFS, root -> left -> right'''
        return self._pre(self.root)

    def _post(self, root):
        if root:
            self._post(root.left)
            self._post(root.right)
            print(root.data, end=' ')

    def postorder(self):
        '''Uses _post function for post-order DFS, left -> right -> root'''
        return self._post(self.root)

    def _level(self, root):
        if root is None:
            return

        queue = deque()
        queue.append(root)
        while queue:
            node = queue.popleft()
            print(node.data, end=' ')
            if node.left:
                queue.append(node.left)

            if node.right:
                queue.append(node.right)

    def levelorder(self):
        '''Uses _level for BFS, visiting nodes level by level'''
        return self._level(self.root)

In [34]:
tree = BST()

tree.insert(9)
tree.insert(3)
tree.insert(12)
tree.insert(6)
tree.insert(1)
tree.insert(8)
tree.insert(13)
tree.insert(14)
tree.insert(5)
tree.insert(10)

tree.levelorder()
tree.remove(6)

print('\n')

tree.levelorder()
print('\n')

print(tree.search(12))
print(tree.search(99))
print(tree.MIN())
print(tree.MAX())

print('\n')

print('Inorder')
tree.inorder()
print('\n')

print('Pre-order')
tree.preorder()
print('\n')

print('Post-order')
tree.postorder()
print('\n')

print('\n')
print('Level-order')
tree.levelorder()

9 3 12 1 6 10 13 5 8 14 

9 3 12 1 8 10 13 5 14 

Value found
Value not found
1
14


Inorder
1 3 5 8 9 10 12 13 14 

Pre-order
9 3 1 8 5 12 10 13 14 

Post-order
1 5 8 3 10 14 13 12 9 



Level-order
9 3 12 1 8 10 13 5 14 
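
The BST property is about whole subtrees, not just a node and its direct children. One way to see this is a validity check that passes an allowed range down the tree. The sketch below assumes a simplified `(value, left, right)` tuple representation instead of the `Node` class above; `is_bst` is a hypothetical helper.

```python
def is_bst(node, low=None, high=None):
    """node is None or a (value, left, right) tuple; low/high are exclusive bounds."""
    if node is None:
        return True
    value, left, right = node
    if low is not None and value <= low:
        return False
    if high is not None and value >= high:
        return False
    # Everything on the left must stay below value, everything on the right above it
    return is_bst(left, low, value) and is_bst(right, value, high)

valid = (9, (3, (1, None, None), (6, None, None)), (12, None, None))
# 10 sits in the left subtree of 9, so it breaks the property even though 10 > 3
invalid = (9, (3, (1, None, None), (10, None, None)), (12, None, None))

print(is_bst(valid), is_bst(invalid))  # True False
```

An equivalent check is that an in-order traversal yields strictly increasing values, which matches the sorted in-order output shown above.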

___________________________________________________________________________________________________________________

## Heap

A heap is a specialized tree-based data structure that satisfies the heap property: in a max heap, for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap (with no parents) is called the root node.

![image.png](attachment:image.png)

In [38]:
class MinHeap:
    
    def __init__(self):
        self.heap = []

    def insert(self, value):
        self.heap.append(value)
        self._heapify_up(len(self.heap) - 1)

    def extract_min(self):
        if len(self.heap) == 0:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()

        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_up(self, index):
        while index > 0:
            parent_index = (index - 1) // 2
            if self.heap[parent_index] > self.heap[index]:
                self.heap[parent_index], self.heap[index] = self.heap[index], self.heap[parent_index]
                index = parent_index
            else:
                break

    def _heapify_down(self, index):
        left_child_index = 2 * index + 1
        right_child_index = 2 * index + 2
        smallest = index

        if (
            left_child_index < len(self.heap)
            and self.heap[left_child_index] < self.heap[smallest]
        ):
            smallest = left_child_index
        if (
            right_child_index < len(self.heap)
            and self.heap[right_child_index] < self.heap[smallest]
        ):
            smallest = right_child_index

        if smallest != index:
            self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
            self._heapify_down(smallest)


class MaxHeap:
    
    def __init__(self):
        self.heap = []

    def insert(self, value):
        self.heap.append(value)
        self._heapify_up(len(self.heap) - 1)

    def extract_max(self):
        if len(self.heap) == 0:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()

        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._heapify_down(0)
        return root

    def _heapify_up(self, index):
        while index > 0:
            parent_index = (index - 1) // 2
            if self.heap[parent_index] < self.heap[index]:
                self.heap[parent_index], self.heap[index] = self.heap[index], self.heap[parent_index]
                index = parent_index
            else:
                break

    def _heapify_down(self, index):
        left_child_index = 2 * index + 1
        right_child_index = 2 * index + 2
        largest = index

        if (
            left_child_index < len(self.heap)
            and self.heap[left_child_index] > self.heap[largest]
        ):
            largest = left_child_index
        if (
            right_child_index < len(self.heap)
            and self.heap[right_child_index] > self.heap[largest]
        ):
            largest = right_child_index

        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self._heapify_down(largest)

# Example usage of MinHeap and MaxHeap:
min_heap = MinHeap()
min_heap.insert(5)
min_heap.insert(3)
min_heap.insert(8)
min_heap.insert(2)
print(min_heap.extract_min())  # Output: 2

max_heap = MaxHeap()
max_heap.insert(5)
max_heap.insert(3)
max_heap.insert(8)
max_heap.insert(2)
print(max_heap.extract_max())  # Output: 8

2
8
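
When a heap is stored in a list, as in the classes above, the parent of the element at index `i` sits at index `(i - 1) // 2`. That makes the heap property easy to verify directly. The following sketch uses the standard-library `heapq` module for comparison; `is_min_heap` is a hypothetical helper.

```python
import heapq

def is_min_heap(values):
    # Every element must be >= its parent, which lives at (i - 1) // 2
    return all(values[(i - 1) // 2] <= values[i] for i in range(1, len(values)))

data = [5, 3, 8, 2]
before = is_min_heap(data)   # False: 5 sits above the smaller 3
heapq.heapify(data)          # rearranges the list in place into a min heap
after = is_min_heap(data)
smallest = data[0]           # the root of a min heap is its minimum

print(before, after, smallest)  # False True 2
```

`heapq` only provides a min heap. A common workaround for max-heap behaviour with numbers is to push negated values, as the `PriorityQueue` class earlier in this document does with its priorities.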


___________________________________________________________________________________________________________________

## Graph

A graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also referred to as nodes, and the edges are lines or arcs that connect any two nodes in the graph. More formally, a graph is composed of a set of vertices (V) and a set of edges (E), and is denoted by G(V, E).

Graph data structures are a powerful tool for representing and analyzing complex relationships between objects or entities. They are particularly useful in fields such as social network analysis, recommendation systems, and computer networks. In the field of sports data science, graph data structures can be used to analyze and understand the dynamics of team performance and player interactions on the field.

![image.png](attachment:image.png)

In [40]:
#Adjacency list
graph = dict()
graph['A'] = ['B', 'C']
graph['B'] = ['A', 'C', 'D']
graph['C'] = ['A', 'D', 'F']
graph['D'] = ['B', 'C']
graph['F'] = ['C']
print(graph)

#Adjacency Matrix - uses the graph = dict() above
matrix_elements = sorted(graph.keys())
cols = rows = len(matrix_elements)

adjacency_matrix = [[0 for _ in range(rows)] for _ in range(cols)]
edges_list = []

for key in matrix_elements:
    for neighbour in graph[key]:
        edges_list.append((key, neighbour))
print(edges_list)

for edge in edges_list:
    index_of_first_vertex = matrix_elements.index(edge[0])
    index_of_second_vertex = matrix_elements.index(edge[1])
    adjacency_matrix[index_of_first_vertex][index_of_second_vertex] = 1

for n in adjacency_matrix:
    print(n)

{'A': ['B', 'C'], 'B': ['A', 'C', 'D'], 'C': ['A', 'D', 'F'], 'D': ['B', 'C'], 'F': ['C']}
[('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('B', 'D'), ('C', 'A'), ('C', 'D'), ('C', 'F'), ('D', 'B'), ('D', 'C'), ('F', 'C')]
[0, 1, 1, 0, 0]
[1, 0, 1, 1, 0]
[1, 0, 0, 1, 1]
[0, 1, 1, 0, 0]
[0, 0, 1, 0, 0]
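
The matrix above is not symmetric: row B has a 1 in column C, but row C has a 0 in column B. That follows from the adjacency list, where B lists C but C does not list B. A small check like the following sketch can surface such one-way entries. The name `adj_list` is an assumption, and the dictionary simply repeats the data from the cell above so the block runs on its own.

```python
# Same adjacency list as above, repeated so this block is self-contained.
adj_list = {
    'A': ['B', 'C'],
    'B': ['A', 'C', 'D'],
    'C': ['A', 'D', 'F'],
    'D': ['B', 'C'],
    'F': ['C'],
}

# Collect pairs (u, v) where u lists v but v does not list u.
one_way = [
    (u, v)
    for u in sorted(adj_list)
    for v in adj_list[u]
    if u not in adj_list.get(v, [])
]
print(one_way)  # [('B', 'C')]
```

If the graph is meant to be undirected, adding 'B' to the list for 'C' would make the matrix symmetric. If it is meant to be directed, the asymmetry is expected.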


In [41]:
import heapq
from collections import deque

class Graph:
    def __init__(self):
        """Initialize an empty graph."""
        self.graph = {}

    def add_node(self, node):
        """
        Add a node to the graph.

        Args:
            node (str): The node to be added.
        """
        self.graph[node] = []

    def search_node(self, node):
        """
        Check if a node exists in the graph.

        Args:
            node (str): The node to search for.

        Returns:
            bool: True if the node is found, False otherwise.
        """
        return node in self.graph

    def remove_node(self, node):
        """
        Remove a node from the graph.

        Args:
            node (str): The node to be removed.
        """
        if node in self.graph:
            del self.graph[node]
            for n in self.graph:
                # Edges are stored as (neighbor, weight) tuples, so compare the neighbor name.
                self.graph[n] = [x for x in self.graph[n] if x[0] != node]

    def add_edge(self, node, edges, weight=1):
        """
        Add edges to a node in the graph.

        Args:
            node (str): The node to add edges to.
            edges (list): List of edges to be added to the node. Each edge can be a node or a tuple (node, weight).
            weight (int, optional): The weight of the edge. Default is 1.
        """
        if node in self.graph:
            if isinstance(edges, list):
                self.graph[node].extend([(neighbor, weight) if isinstance(neighbor, str) else neighbor for neighbor in edges])
            else:
                self.graph[node].append((edges, weight))


    def remove_edge(self, node1, node2):
        """
        Remove an edge between two nodes in the graph.

        Args:
            node1 (str): The first node.
            node2 (str): The second node.
        """
        if node1 in self.graph and node2 in self.graph:
            # Edges are stored as (neighbor, weight) tuples, so compare the neighbor name.
            self.graph[node1] = [x for x in self.graph[node1] if x[0] != node2]
            self.graph[node2] = [x for x in self.graph[node2] if x[0] != node1]

    def bfs(self, start_node):
        """
        Perform a Breadth-First Search (BFS) traversal of the graph.

        Args:
            start_node (str): The node to start the traversal from.

        Returns:
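            list: The traversal order as a list of (node, distance) tuples, where distance
                  is the accumulated edge weight along the path by which the node was
                  reached (not necessarily the shortest path).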
        """
        if start_node not in self.graph:
            return []

        visited = set()
        queue = deque([(start_node, 0)])  # Use a tuple (node, distance) in the queue
        traversal = []

        while queue:
            node, distance = queue.popleft()

            if node in visited:
                continue

            visited.add(node)
            traversal.append((node, distance))

            for neighbor, weight in self.graph[node]:
                if neighbor not in visited:
                    queue.append((neighbor, distance + weight))

        return traversal

    def dfs(self, start_node):
        """
        Perform a Depth-First Search (DFS) traversal of the graph.

        Args:
            start_node (str): The node to start the traversal from.

        Returns:
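            list: The traversal order as a list of (node, distance) tuples, where distance
                  is the accumulated edge weight along the path by which the node was
                  reached (not necessarily the shortest path).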
        """
        if start_node not in self.graph:
            return []

        visited = set()
        stack = [(start_node, 0)]  # Use a tuple (node, distance) in the stack
        traversal = []

        while stack:
            node, distance = stack.pop()

            if node in visited:
                continue

            visited.add(node)
            traversal.append((node, distance))

            for neighbor, weight in self.graph[node]:
                if neighbor not in visited:
                    stack.append((neighbor, distance + weight))

        return traversal


    def get_edges(self):
        """Get all edges in the graph as a list of tuples (node1, node2, weight)."""
        edges = []
        for node, neighbors in self.graph.items():
            for neighbor in neighbors:
                weight = 1  # Default weight is 1 for unweighted graphs
                if isinstance(neighbor, tuple):
                    neighbor, weight = neighbor
                edges.append((node, neighbor, weight))
        return edges

    
    def find_parent(self, parent, node):
        """Find the parent of a node in a disjoint set."""
        if parent[node] == node:
            return node
        return self.find_parent(parent, parent[node])

    
    def kruskals_mst(self):
        """Find the Minimum Spanning Tree (MST) using Kruskal's algorithm."""
        edges = self.get_edges()
        edges.sort(key=lambda x: x[2])  # Sort edges by weight
        parent = {node: node for node in self.graph}
        mst = []
        for edge in edges:
            node1, node2, weight = edge
            parent_node1 = self.find_parent(parent, node1)
            parent_node2 = self.find_parent(parent, node2)
            if parent_node1 != parent_node2:
                mst.append(edge)
                parent[parent_node1] = parent_node2
        return mst

    def dijkstra(self, start_node):
        """
        Find the shortest paths from a given start_node to all other nodes using Dijkstra's algorithm.

        Args:
            start_node (str): The node to start the algorithm from.

        Returns:
            dict: A dictionary containing the shortest distances from the start_node to all other nodes.
                  The format is {node: distance}.
        """
        distances = {node: float('inf') for node in self.graph}
        distances[start_node] = 0
        priority_queue = [(0, start_node)]

        while priority_queue:
            current_distance, current_node = heapq.heappop(priority_queue)

            if current_distance > distances[current_node]:
                continue

            for neighbor, weight in self.graph[current_node]:
                distance = current_distance + weight
                if distance < distances[neighbor]:
                    distances[neighbor] = distance
                    heapq.heappush(priority_queue, (distance, neighbor))

        return distances

    def delete_graph(self):
        """Delete the entire graph by clearing its content."""
        self.graph = {}

    def __str__(self):
        """Return the string representation of the graph."""
        return str(self.graph)


# Create a graph object
g = Graph()

# Add cities as nodes to the graph
g.add_node("A")
g.add_node("B")
g.add_node("C")
g.add_node("D")
g.add_node("E")

# Add connections with weights as edges to the graph
g.add_edge("A", [("B", 4), ("C", 3)])
g.add_edge("B", [("C", 1), ("D", 2), ("E", 3)])
g.add_edge("C", [("D", 4)])
g.add_edge("D", [("E", 2)])

# Perform BFS traversal starting from city A
bfs_traversal = g.bfs("A")
print("BFS Traversal:", bfs_traversal)
# Output: BFS Traversal: [('A', 0), ('B', 4), ('C', 3), ('D', 6), ('E', 7)]

# Perform DFS traversal starting from city A
dfs_traversal = g.dfs("A")
print("DFS Traversal:", dfs_traversal)
# Output: DFS Traversal: [('A', 0), ('C', 3), ('D', 7), ('E', 9), ('B', 4)]

# Find the Minimum Spanning Tree (MST) using Kruskal's algorithm
mst = g.kruskals_mst()
print("Minimum Spanning Tree (MST):", mst)
# Output: Minimum Spanning Tree (MST): [('B', 'C', 1), ('B', 'D', 2), ('D', 'E', 2), ('A', 'C', 3)]

# Find the shortest paths from city A using Dijkstra's algorithm
shortest_distances = g.dijkstra("A")
print("Shortest Distances from city A:", shortest_distances)
# Output: Shortest Distances from city A: {'A': 0, 'B': 4, 'C': 3, 'D': 6, 'E': 7}

print(g.search_node('A'))
# Output: True

BFS Traversal: [('A', 0), ('B', 4), ('C', 3), ('D', 6), ('E', 7)]
DFS Traversal: [('A', 0), ('C', 3), ('D', 7), ('E', 9), ('B', 4)]
Minimum Spanning Tree (MST): [('B', 'C', 1), ('B', 'D', 2), ('D', 'E', 2), ('A', 'C', 3)]
Shortest Distances from city A: {'A': 0, 'B': 4, 'C': 3, 'D': 6, 'E': 7}
True
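
The `dijkstra` method above returns only distances. If the route itself is needed, one common extension is to record each node's predecessor whenever its distance improves, then walk those links backwards from the target. The sketch below is a standalone illustration, not part of the `Graph` class: the function names `dijkstra_with_paths` and `build_path` and the `cities` dictionary are assumptions, with `cities` mirroring the edges added in the example above.

```python
import heapq

def dijkstra_with_paths(graph, start):
    """Return (distances, previous) for a dict mapping node -> [(neighbor, weight), ...]."""
    distances = {node: float("inf") for node in graph}
    previous = {node: None for node in graph}
    distances[start] = 0
    queue = [(0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances[node]:
            continue  # Stale entry: a shorter route to this node was already found.
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < distances[neighbor]:
                distances[neighbor] = candidate
                previous[neighbor] = node  # Remember how we got here.
                heapq.heappush(queue, (candidate, neighbor))
    return distances, previous

def build_path(previous, target):
    """Follow predecessor links back from target and return the path in forward order."""
    path = []
    while target is not None:
        path.append(target)
        target = previous[target]
    return path[::-1]

# Hypothetical input mirroring the weighted edges used in the example above.
cities = {
    "A": [("B", 4), ("C", 3)],
    "B": [("C", 1), ("D", 2), ("E", 3)],
    "C": [("D", 4)],
    "D": [("E", 2)],
    "E": [],
}
distances, previous = dijkstra_with_paths(cities, "A")
print(build_path(previous, "E"))  # ['A', 'B', 'E']
print(build_path(previous, "D"))  # ['A', 'B', 'D']
```

Note that `build_path` assumes the target is reachable from the start node. For an unreachable node it returns a list containing only that node, so checking `distances[target]` for infinity first is a reasonable safeguard.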
