# Data Structures

![d82004a61fa9c349f7637265f9d3f2a.jpg](attachment:d82004a61fa9c349f7637265f9d3f2a.jpg)

## Arrays

- Elements are stored contiguously in memory
- Optimal for accessing elements by index: O(1)
- Searching for or deleting a specific value is O(n)
- Brute-force solutions often use O(n) extra space, while cleverer in-place solutions can often use O(1) extra space

In [3]:
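# List creation
a = list(range(11))

# Cumulative sum: c_sum[i] is the sum of a[0..i]
# (accumulate runs in O(n); re-summing every prefix would be O(n^2))
from itertools import accumulate
c_sum = list(accumulate(a))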

a, c_sum

([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55])

**Moving all the negative elements to one side of the array**
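
One possible approach to this problem is a single in-place pass with a write pointer, which keeps the extra space at O(1). The function name `move_negatives_left` and the sample values are illustrative assumptions, and this sketch does not preserve the relative order of the non-negative elements.

```python
def move_negatives_left(nums):
    """Partition nums in place so every negative value comes before the rest."""
    write = 0  # next slot that should receive a negative value
    for read in range(len(nums)):
        if nums[read] < 0:
            # swap the negative value into the next free slot on the left
            nums[write], nums[read] = nums[read], nums[write]
            write += 1
    return nums


values = [3, -1, 4, -5, 0, -2]
move_negatives_left(values)
```

The pass touches each element once, so the running time is O(n).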

**Merging two sorted arrays**
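
A common way to merge two sorted arrays is the two-pointer technique: compare the current heads of both inputs and append the smaller one. The sketch below assumes both inputs are sorted in ascending order; `merge_sorted` is a hypothetical name chosen for illustration.

```python
def merge_sorted(left, right):
    """Merge two ascending lists into a new ascending list in O(len(left) + len(right))."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # at most one of these still has elements left
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


merged_example = merge_sorted([1, 4, 7], [2, 3, 9, 10])
```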

**Finding specific subsequences**

## Linked Lists

- Composed of nodes that hold data and a pointer to the next node
- The first node is called the head, and the last node is called the tail
- Optimal for insertion and deletion, which are O(1) once the position is known (for example, at the head)
- Worse for indexing and searching, which are O(n)

In [None]:
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

In [None]:
class Linkedlist:
    def __init__(self):
        self.head = None

    def insertAtBegin(self, value):
        new_node = Node(value)
        # link the new node in front of the current head (which may be None)
        new_node.next = self.head
        self.head = new_node
           
    def insertAtEnd(self, value):
        new_node = Node(value)
        if self.head is None:
            self.head = new_node
        else:
            # walk to the last node (the one whose next is None) and link after it
            curr = self.head
            while curr.next:
                curr = curr.next
            curr.next = new_node

    def insertAtIndex(self, value, idx):
        if idx == 0:  # insert at the beginning
            self.insertAtBegin(value)
            return
        # walk to the node just before the target position
        curr = self.head
        pos = 0
        while curr and pos + 1 != idx:
            pos += 1
            curr = curr.next
        if curr is None:  # ran off the end: no such position
            print("Index does not exist.")
        else:  # splice the new node in right after curr
            new_node = Node(value)
            new_node.next = curr.next
            curr.next = new_node

    def printList(self):
        curr = self.head
        while curr:
            print(curr.val)  # print the stored value, not the Node object
            curr = curr.next

    def reverse(self):
        prev = None
        curr = self.head
        while curr:
            nxt = curr.next  # store the next node for iteration (avoids shadowing the built-in next)
            curr.next = prev
            prev = curr
            curr = nxt
        self.head = prev  # the final node becomes the new head
        

**Reversing a linked list**

**Detecting a cycle in a linked list**
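
One well-known way to detect a cycle is Floyd's "tortoise and hare" technique: advance one pointer by one step and another by two steps, and report a cycle if they ever meet. The sketch below is self-contained, so it defines its own `ListNode` class (an assumed stand-in that mirrors the `Node` class above); `has_cycle` is likewise an illustrative name.

```python
class ListNode:
    """Minimal singly linked node, mirroring the Node class used in these notes."""
    def __init__(self, val):
        self.val = val
        self.next = None


def has_cycle(head):
    """Return True if following next pointers from head ever loops back."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next        # moves one step
        fast = fast.next.next   # moves two steps
        if slow is fast:        # the pointers can only meet inside a cycle
            return True
    return False                # fast reached the end, so there is no cycle


n1, n2, n3 = ListNode(1), ListNode(2), ListNode(3)
n1.next, n2.next = n2, n3
acyclic_result = has_cycle(n1)   # list is 1 -> 2 -> 3 -> None
n3.next = n1                     # close the loop: 3 -> 1
cyclic_result = has_cycle(n1)
```

This uses O(1) extra space, whereas storing every visited node in a set would use O(n).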

**Removing duplicates from a sorted linked list**

**Checking if a linked list represents a palindrome**

## Stacks & Queues

- Stack: LIFO (last in, first out)
    - push & pop
    - used in recursive operations (for example, DFS)
- Queue: FIFO (first in, first out)
    - enqueue & dequeue
    - used in iterative operations (for example, BFS)
- Both can be implemented using an array or a linked list

In [None]:
# Using List
class Stack:
    def __init__(self):
        self.items = []
    
    def push(self, value):
        self.items.append(value)

    def pop(self):
        if len(self.items)>0:
            return self.items.pop()

In [None]:
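from collections import deque

class Queue:
    def __init__(self):
        self.items = deque()  # deque supports O(1) removal from the left

    def enqueue(self, value):
        self.items.append(value)

    def dequeue(self):
        # a plain list.pop(0) would shift every remaining element, costing O(n)
        if self.items:
            return self.items.popleft()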

In [None]:
def check_balance(string):
    left_parentheses = ["{", "[", "("]
    right_parentheses = ["}", "]", ")"]
    stack = []

    for ch in string:
        if ch in left_parentheses:  # opening bracket: push onto the stack
            stack.append(ch)
        elif ch in right_parentheses:  # closing bracket: must match the most recent opening bracket
            right_idx = right_parentheses.index(ch)
            # an empty stack or a mismatched top means the string is unbalanced
            if not stack or left_parentheses[right_idx] != stack[-1]:
                return False
            stack.pop()
        # any other character is ignored

    # balanced only if every opening bracket has been closed
    return not stack
    

**Parser to evaluate regular expressions**

**Evaluating a math formula using order of operations rules**

**BFS or DFS on a graph**

## Hash Maps

- Store key-value pairs; a hash function computes the index (bucket) for each key
- Lookups, insertions, and deletions all take O(1) on average

In [None]:
# The brute-force solution takes O(n^2), while the hash-based solution takes O(n)
def check_sum(a, target):
    # numbers seen so far (a set is enough, since only membership is checked)
    seen = set()
    for i in a:
        # if the complement of i was seen earlier, a valid pair exists
        if target - i in seen:
            return True
        seen.add(i)
    return False

# Version that returns the indices of the two numbers
from typing import List

def twoSum(nums: List[int], target: int) -> List[int]:
    # maps each needed complement to the index of the number that needs it
    table = dict()

    for i in range(len(nums)):
        current_val = nums[i]
        # if an earlier number needs current_val, return both indices;
        # otherwise record the complement that current_val needs
        if current_val in table:
            return [table[current_val], i]
        table[target - current_val] = i

    return []  # no pair sums to target

**Finding the unions or intersection of two lists**
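
A minimal sketch of both operations, assuming the list elements are hashable and that duplicates should be dropped from the result. The names `union` and `intersection` are illustrative; `dict.fromkeys` is used only because it removes duplicates while keeping first-seen order.

```python
def union(a, b):
    """All distinct values that appear in either list, in first-seen order."""
    return list(dict.fromkeys(list(a) + list(b)))


def intersection(a, b):
    """Distinct values of a that also appear in b, in the order they occur in a."""
    in_b = set(b)  # hash-based membership test, O(1) on average
    return [x for x in dict.fromkeys(a) if x in in_b]


union_example = union([1, 2, 2, 3], [3, 4])
intersection_example = intersection([1, 2, 2, 3], [2, 3, 5])
```

Both run in O(n + m) on average, compared with O(n * m) for nested loops over the two lists.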

**Finding the frequency of each word in a piece of text**
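
Word counting maps naturally onto a hash map from word to count. The sketch below uses `collections.Counter` (a dict subclass from the standard library) and assumes a simple definition of a word: a lowercase run of letters and apostrophes. The helper name `word_frequency` and the sample sentence are illustrative.

```python
import re
from collections import Counter


def word_frequency(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


counts = word_frequency("The cat and the hat. The end!")
most_common_word = counts.most_common(1)
```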

**Finding 4 elements a, b, c and d in a list such that a+b = c+d**
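
One way to approach this is to store each pair sum in a hash map and look for a second pair with the same sum. The sketch below is an assumption about how the problem is meant to be read: the four elements must come from four distinct positions, and any one valid answer is acceptable. `find_equal_sum_pairs` is a hypothetical name.

```python
from itertools import combinations


def find_equal_sum_pairs(nums):
    """Return (a, b, c, d) from four distinct positions with a + b == c + d, or None."""
    first_pair = {}  # pair sum -> indices of the first pair that produced it
    for i, j in combinations(range(len(nums)), 2):
        total = nums[i] + nums[j]
        if total in first_pair:
            p, q = first_pair[total]
            if len({p, q, i, j}) == 4:  # all four positions are different
                return (nums[p], nums[q], nums[i], nums[j])
        else:
            first_pair[total] = (i, j)
    return None


found = find_equal_sum_pairs([3, 4, 7, 1, 2, 9, 8])
not_found = find_equal_sum_pairs([1, 2, 4, 8])
```

Enumerating all pairs costs O(n^2), which is far cheaper than checking every group of four elements directly.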

## Trees

### Binary Search Trees

- Root node & subtrees of child nodes
- Traversal orders:
    - in-order: left -> current -> right
    - pre-order: current -> left -> right
    - post-order: left -> right -> current (subtree is also left to right)
- Binary Search Tree: every value in the left subtree is smaller than the node, and every value in the right subtree is larger

In [None]:
class TreeNode:
    def __init__(self, value):
        self.val = value
        self.left = None
        self.right = None

In [None]:
def inorder(node):
    if node is None:
        return []
    else:
        return inorder(node.left) + [node.val] + inorder(node.right)
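
The other two traversal orders listed above can be sketched in the same recursive style as `inorder`. The block redefines a minimal `TreeNode` so that it runs on its own; `preorder` and `postorder` are illustrative names, not functions from the original notes.

```python
class TreeNode:
    """Same shape as the TreeNode class above, repeated so the sketch is self-contained."""
    def __init__(self, value):
        self.val = value
        self.left = None
        self.right = None


def preorder(node):
    # current -> left -> right
    if node is None:
        return []
    return [node.val] + preorder(node.left) + preorder(node.right)


def postorder(node):
    # left -> right -> current
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.val]


root = TreeNode(2)
root.left = TreeNode(1)
root.right = TreeNode(3)
pre_result = preorder(root)    # [2, 1, 3]
post_result = postorder(root)  # [1, 3, 2]
```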
        

In [None]:
class BST:
    def __init__(self, value):
        self.root = TreeNode(value)
    
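    def insert(self, node, value):
        # called as tree.insert(tree.root, value)
        if node is None:  # empty tree: the new value becomes the root
            self.root = TreeNode(value)
            return
        if value < node.val:  # smaller values go into the left subtree
            if node.left is None:
                node.left = TreeNode(value)
            else:
                self.insert(node.left, value)
        else:  # equal or larger values go into the right subtree
            if node.right is None:
                node.right = TreeNode(value)
            else:
                self.insert(node.right, value)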

**Binary Search**

In [None]:
def binary_search(nums, target):
    # set the left and right boundaries (nums must be sorted in ascending order)
    left, right = 0, len(nums) - 1
    # halve the interval until the target is found or the interval is empty
    while left <= right:  # <= so that a one-element interval is still checked
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:  # target is in the right interval
            left = mid + 1
        else:  # target is in the left interval
            right = mid - 1

    return -1  # target is not in nums

## Heaps

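- Max heap: each parent node >= its child nodes, so looking up the largest value is O(1)
- Insert and delete bubble an element up/down to restore the heap property: O(log n) each; heapify (building a heap from an unsorted list) is O(n)
- Search is O(n): every node may need to be checked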
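
The operations above can be tried with Python's standard `heapq` module, which implements a min heap on top of a plain list. Since `heapq` has no max-heap mode, the sketch below uses the common workaround of storing negated values; the variable names and sample numbers are illustrative.

```python
import heapq

# Min heap: push values one at a time, O(log n) each
min_heap = []
for value in [13, 5, 2, 6, 10]:
    heapq.heappush(min_heap, value)

smallest = min_heap[0]            # peek at the root in O(1)
popped = heapq.heappop(min_heap)  # remove the root in O(log n)

# Max heap via negation: the most negative stored value is the largest original
max_heap = [-v for v in [13, 5, 2, 6, 10]]
heapq.heapify(max_heap)           # build the heap in O(n)
largest = -max_heap[0]
```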

In [1]:
import heapq

k = 5
a = [13, 5, 2, 6, 10, 9, 7, 4, 3]
heapq.heapify(a)  # heapq builds a min heap in place
heapq.nlargest(k, a)  # the k largest values, in descending order

[13, 10, 9, 7, 6]

## Graphs

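- Represented in 2 ways: adjacency matrix & adjacency list
    - adjacency matrix: checking whether two vertices are neighbours takes O(1)
    - adjacency list: checking whether two vertices are neighbours takes O(n) in the worst case, since a vertex's list may need to be scanned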
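
To make the two representations concrete, here is a small sketch that builds both from the same edge list. It assumes an undirected, unweighted graph whose vertices are numbered 0 to n-1; the edge list itself is made up for illustration.

```python
n = 4
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Adjacency matrix: n x n grid, matrix[u][v] == 1 when u and v are connected
matrix = [[0] * n for _ in range(n)]
# Adjacency list: each vertex maps to a list of its neighbours
adj = {v: [] for v in range(n)}

for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1
    adj[u].append(v)
    adj[v].append(u)

has_edge_matrix = matrix[1][2] == 1  # single lookup, O(1)
has_edge_list = 2 in adj[1]          # scans the neighbours of vertex 1
```

The matrix always takes O(n^2) space, while the list takes space proportional to the number of vertices plus edges, so the list is usually preferred for sparse graphs.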

In [None]:
# Adjacency list
class Vertex:
    def __init__(self, value):
        self.val = value
        self.neighbors = {}

    def add_to_neighbors(self, neighbor, w): # w is weight
        self.neighbors[neighbor] = w

    def get_neighbors(self):
        return self.neighbors.keys()
    
class Graph:
    def __init__(self):
        self.vertices = {}

    def add_vertex(self, value):
        new_vertex = Vertex(value)
        self.vertices[value] = new_vertex
        return new_vertex
    
    def add_edge(self, u, v, weight):
        # add_vertex expects the raw value and wraps it in a Vertex itself
        if u not in self.vertices:
            self.add_vertex(u)
        if v not in self.vertices:
            self.add_vertex(v)
        self.vertices[u].add_to_neighbors(self.vertices[v], weight)

    def get_vertex(self, value):
        return self.vertices[value]

    def get_vertices(self):
        return self.vertices.keys()

    def __iter__(self):
        # iterate over the Vertex objects stored in the graph
        return iter(self.vertices.values())

### BFS & DFS

- BFS: start by adding one node to a queue; then, for each node taken from the queue, process it and add its unvisited neighbors to the queue.
- DFS: start with one node and recursively process each unvisited neighbor one by one.

In [None]:
from collections import deque

def bfs(graph, v):
    # initialise the queue and the visited set with the starting vertex value
    visited = {v}
    queue = deque([v])

    # keep adding unvisited neighbors to the queue until it is empty
    while queue:
        curr = queue.popleft()
        for neighbor in graph.get_vertex(curr).get_neighbors():
            if neighbor.val not in visited:
                visited.add(neighbor.val)
                queue.append(neighbor.val)

    return visited

In [None]:
def dfs_helper(graph, v, visited):
    visited.add(v)
    for neighbor in graph.get_vertex(v).get_neighbors():
        if neighbor.val not in visited:
            dfs_helper(graph, neighbor.val, visited)

    return visited

def dfs(graph, v):
    visited = set()
    return dfs_helper(graph, v, visited)

# Algorithms

## Matrix Multiplication

## Recursion

In [None]:
def fib(n):
    if n == 0:
        return 0
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n-1)+fib(n-2)

In [None]:
# dynamic programming version of fib
def dyn_fib(n):
    if n < 2:  # base cases; also avoids indexing dp[1] when n == 0
        return n
    dp = [0 for _ in range(n+1)]
    dp[1] = 1

    for i in range(2, n+1):
        dp[i] = dp[i-1]+dp[i-2]

    return dp[n]
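
A middle ground between the plain recursive version and the bottom-up table is top-down memoization: keep the recursive definition but cache each result so it is computed only once. The sketch below uses `functools.lru_cache` from the standard library; `memo_fib` is an illustrative name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # remember every result that has been computed
def memo_fib(n):
    if n < 2:  # fib(0) = 0, fib(1) = 1
        return n
    return memo_fib(n - 1) + memo_fib(n - 2)


fib_10 = memo_fib(10)
fib_50 = memo_fib(50)
```

With the cache, each value from 0 to n is computed once, so the running time drops from exponential to O(n).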

## Sorting

![image.png](attachment:image.png)

### Mergesort

- Divide & Conquer
    1. Repeatedly divide the input into smaller subarrays until each holds a single element.
    2. Repeatedly merge the smaller sorted arrays until the entire input is merged.
- Time: O(n log n); space: O(n) for the temporary arrays used while merging

In [None]:
def merge_helper(a, low, high, mid):
    # merge the sorted runs a[low..mid] and a[mid+1..high] back into a
    left = a[low:mid+1]
    right = a[mid+1:high+1]
    i = j = 0
    k = low

    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in their original order
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1

    # copy whatever remains in either run
    while i < len(left):
        a[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        a[k] = right[j]
        j += 1
        k += 1
    return a

def mergesort(a, low, high):
    # sorts a[low..high] in place; call as mergesort(a, 0, len(a) - 1)
    if low >= high:
        return a
    mid = (low + high) // 2
    mergesort(a, low, mid)
    mergesort(a, mid+1, high)
    merge_helper(a, low, high, mid)
    return a

### Quicksort

- Select a pivot and move all elements smaller than the pivot to its left (and the remaining elements to its right)
- Recursively apply the same step to the left and right partitions
- Worst case: O(n^2), average: O(n log n)
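
The steps above can be sketched as follows. This version assumes the Lomuto partition scheme with the last element as the pivot, which is only one of several common choices; the function names are illustrative.

```python
def partition(a, low, high):
    """Place the pivot a[high] at its final position and return that index."""
    pivot = a[high]
    i = low  # boundary: everything left of i is smaller than the pivot
    for j in range(low, high):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[high] = a[high], a[i]  # move the pivot between the two partitions
    return i


def quicksort(a, low=0, high=None):
    """Sort a[low..high] in place and return a."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quicksort(a, low, p - 1)   # elements smaller than the pivot
        quicksort(a, p + 1, high)  # elements greater than or equal to the pivot
    return a


sorted_example = quicksort([5, 2, 9, 1, 5, 6])
```

With this pivot choice, an already sorted input triggers the O(n^2) worst case, because every partition is maximally unbalanced.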

## Greedy Algorithms

## Dynamic Programming