| Data Structure       | Implementations/Operations                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Disjoint Sets        | `QuickUnion`<br><br>- Implemented by connecting trees arbitrarily<br>- The constructor runs in $\Theta(n)$; `connect` and `isConnected` run in $O(n)$ in the worst case, because a tree can grow as tall as $n$.<br>- We want to prevent the spindly tree case, which is why we need the next improvement to keep the tree balanced.<br><br>`WeightedQuickUnion`<br><br>- We connect the root of the smaller tree to the root of the larger tree, so that the tree stays balanced.<br>- In this case, small and large are determined by weight, or the number of elements in each tree.<br>- The constructor still runs in $\Theta(n)$ because we have to initialize each element<br>- `connect` and `isConnected` now run in $O(\log n)$: always linking the smaller tree under the larger one keeps the height of every tree at most $\log_2 n$, and parent-searching only walks from an item up to its root.<br><br>`WeightedQuickUnion` with Path Compression<br><br>- As we keep calling `connect` and `isConnected`, we are traversing parents anyway, so it makes sense to reconnect the items we pass directly to the root along the way<br>- No extra asymptotic cost, but it improves the amortized efficiency over time<br>- Set parent `id` to the root for all items seen |
| Regular BST          | - Runtimes<br>    - `insert` is $\Theta(\log n)$ for a bushy tree, $\Theta(n)$ for a spindly, worst-case tree<br>    - `find` is $\Theta(\log n)$ for a bushy tree, $\Theta(n)$ for a spindly, worst-case tree<br>- No more than two children per node; a new item is always inserted as a leaf at the bottom of the tree<br>- Since it’s hard to keep a plain BST balanced, we move to Bayer tree (B-tree) models that keep the tree balanced |
| Bayer Tree           | - Generally, we look at `2-3Trees`, where each node can be stuffed with up to $2$ elements and a non-leaf node holding $n$ elements has $n + 1$ children. This is an invariant, so each stuffed non-leaf node must have $3$ children, and each normal non-leaf node must have $2$ children.<br>- Insertion<br>    - Always add new items to a leaf node at the bottom; if the node gets overstuffed, split it by moving its middle (or middle-left) item up into the parent<br>    - After a split, the invariant that each non-leaf node with $n$ elements has $n + 1$ children is still maintained<br>    - If the overstuffing keeps propagating all the way to the top, the root itself is split and the tree grows one level taller<br>- Runtimes<br>    - Guaranteed $O(\log n)$ for everything because we will always have a bushy tree<br>    - When $L$, the stuffing limit (the maximum number of elements per node), gets high, the work done inside each node is no longer negligible, so the runtime is more precisely $O(L \log n)$ |
| Red-Black Bayer Tree | - Regular `BayerTrees` are difficult to implement, so we turn to the left-leaning red-black tree (LLRB), a BST in which each stuffed node of a 2-3 tree is represented by two nodes joined by a left-leaning “red link”. To convert a 2-3 tree to an LLRB, split each stuffed node so that its smaller element becomes the left child of its larger element, connected by a red link; all other links are black.<br>- Every path from the root to a leaf has the same number of black links. This is because every leaf in a 2-3 tree is the same number of links from the root. Therefore, the tree is balanced.<br>- Adding to an LLRB<br>    - First, insert the new item at the correct location with a red link to its parent, then perform rotations to fix any violations.<br>    - If there is a right-leaning red link, rotate left on its parent node so that the link leans left.<br>    - If there are two consecutive left-leaning red links, rotate right on the top node.<br>    - If there is a node with two red links to its children, flip the colors of all links that are touching that node. |
| Hash Maps            | - A `HashTable` is an array of linked lists, where each array element represents a “bucket”; an item is placed in the bucket whose index is its hash code modulo the number of buckets.<br>- Collision Resolution, Ratios<br>    - The `LinkedList` in each bucket resolves collisions: items that land in the same bucket are chained together<br>    - We want to maintain a bounded load factor $N/M$, where $N$ is the number of elements and $M$ is the number of buckets, so that the average bucket size stays limited; when $N/M$ exceeds the chosen limit, we resize the array to increase $M$.<br>    - The cost of a given get, insert, or delete is given by the number of entries in the linked list that must be examined, which is about $N/M$ on average when the hash function spreads items evenly.<br>- Expected amortized search and insert time (given the costs of resizing) is proportional to $N/M$, which is $\Theta(1)$ as long as the load factor is kept constant. |
| Heaps                | - A `Heap` is a complete binary tree, usually stored in an array, that implements the `PriorityQueue` ADT; each node is smaller (min-heap) or larger (max-heap) than all of its children. Every subtree of a heap is itself a heap: for example, the two subtrees under the root of a heap of height 5 are heaps of height at most 4.<br>- The following are ways to implement a `PriorityQueue`, ending with the `Heap` itself.<br>- Unordered Array<br>    - `add` is amortized constant (append at the end), but `getSmallest` and `removeSmallest` have a worst-case runtime of $\Theta(n)$ because we could potentially have to traverse the entire array to find the min/max value.<br>- Ordered Array<br>    - `add` is $\Theta(n)$ in the worst case because elements must be shifted to keep the array sorted<br>    - `getSmallest` is constant, just get the first element<br>    - `removeSmallest` will be $\Theta(n)$ because the remaining elements must be shifted down<br>- Bushy `BST`<br>    - All operations in $\Theta(\log n)$ because each one follows a single root-to-leaf path, as in regular BST traversal<br>- Heap insertion<br>    - Insert at the very end (the next open position in the bottom level)<br>    - Float (swim) the item up to its appropriate location by swapping it with its parent while it has higher priority<br>- Heap deletion<br>    - Swap the root with the last node, then remove the last node (the old root)<br>    - Keep swapping the new root with its higher-priority child until it has either reached a leaf position or satisfied the heap condition that a parent has higher priority than both of its children<br>- Heap runtimes<br>    - `add` is $O(\log n)$, because the item swims up at most the height of the complete tree<br>    - `removeSmallest` is $O(\log n)$, because the item sinks down at most the height of the complete tree<br>    - `getSmallest` is constant, due to the invariant that it’s always the top element |
| Tries                | - Useful generally for word operations such as prefix matching, longest prefixes, and spell-checking<br>- `add` and `contains` do not depend on the number of elements in the `Trie`. They depend only on the size of the query, so they run in $\Theta(L)$ in the worst case, where $L$ is the length of the query.<br>- Trie nodes typically do not contain letters; instead, letters are stored implicitly on edge links. There are many ways of storing these links; the fastest but most memory-hungry way is an array of size $R$ (the alphabet size) in each node. We call such tries R-way tries. |
| KD Trees             | - `QuadTrees` <br>    - A simple way to solve the 2D range searching problem, where each node of the QuadTree has 4 children: Northwest, Northeast, Southwest, and Southeast.<br>    - This is spatial partitioning, where each point splits the space around it into regions, in contrast to uniform partitioning, where we throw points into buckets of predetermined area.<br>- `KDTrees`<br>    - Normal `BST`, but each level alternates which dimension we compare on<br>    - Example (2 dimensions)<br>        - At the root, everything to the left has an X value less than the root and everything to the right has an X value greater than the root.<br>        - Then on the next level, every item to the left of some node has a Y value less than that node’s and everything to the right has a Y value greater than it. |
| Graphs               | - The `Graph` data structure consists of a set of nodes and a set of edges connecting the nodes; unlike in a tree, there can be more than one path between two nodes.<br>- `DFS` for graphs is similar to `DFS` for trees, but since there are potential cycles within our graph, we need to mark the visited nodes so we don’t recurse infinitely.<br>- Can be represented multiple ways<br>    - Edge List - List of all start-to-end pairs, requires $\Theta(E)$ memory<br>    - Adjacency Matrix - 2D array with the vertices listed along each dimension, where 1 or 0 represents the existence of an edge, requires $\Theta(V^{2})$ memory<br>    - Adjacency List (often used) - Array of lists indexed by vertex, where each entry holds the vertex numbers at the other end of that vertex’s outgoing edges, requires $\Theta(V+E)$ memory |
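
To make the Disjoint Sets row more concrete, here is a minimal Java sketch of a weighted quick union with path compression. The class name `WeightedQuickUnionPC`, the helper `find`, and the `parent`/`size` arrays are illustrative assumptions rather than names taken from these notes; only `connect` and `isConnected` come from the table. It is one possible way to write the idea, not a reference implementation.

```java
// Illustrative sketch: weighted quick union with path compression.
// Class, field, and helper names are assumptions made for this example.
class WeightedQuickUnionPC {
    private final int[] parent; // parent[i] is the parent of i; a root is its own parent
    private final int[] size;   // size[r] is the number of items in the tree rooted at r

    // Theta(n): every item starts out as the root of its own one-item tree.
    public WeightedQuickUnionPC(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    // Walk up to the root, then point every item seen directly at that root.
    private int find(int p) {
        int root = p;
        while (root != parent[root]) {
            root = parent[root];
        }
        while (p != root) {
            int next = parent[p];
            parent[p] = root; // path compression
            p = next;
        }
        return root;
    }

    public boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    // Link the root of the smaller tree under the root of the larger tree.
    public void connect(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) {
            return; // already in the same set
        }
        if (size[rootP] < size[rootQ]) {
            parent[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            parent[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
    }

    public static void main(String[] args) {
        WeightedQuickUnionPC ds = new WeightedQuickUnionPC(6);
        ds.connect(0, 1);
        ds.connect(2, 3);
        ds.connect(1, 3);
        System.out.println(ds.isConnected(0, 2)); // true
        System.out.println(ds.isConnected(0, 4)); // false
    }
}
```

Weighting by the number of items (rather than by height) matches the description in the table; the compression step adds only a second pass over the same path that `find` already walked.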