# Chapter 02: Basic Data Structures

**Producer-Consumer Model**: one computational process is producing data and another is consuming the produced data

## Stacks and Queues

### Stacks

**Stack**: data structure that operates under the last-in, first-out (LIFO) principle

Supports two methods:

- push(o): insert object o at the top of the stack
- pop(): remove from the stack and return its top object

Array-based stacks are efficient (both push(o) and pop() run in $O(1)$ time), but their capacity is fixed when the array is allocated. To overcome this, a stack may instead be implemented with a linked list
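The two implementations mentioned above might be sketched as follows. This is only an illustration: the class names `ArrayStack` and `LinkedStack`, the helper `_Node`, and the choice of exceptions are assumptions, not part of the original notes.

```python
class ArrayStack:
    """Fixed-capacity stack backed by a preallocated list (hypothetical name)."""

    def __init__(self, capacity):
        self._data = [None] * capacity  # fixed-size storage
        self._top = -1                  # index of the top element; -1 means empty

    def push(self, o):
        if self._top + 1 == len(self._data):
            raise OverflowError("stack is full")  # the fixed-size limitation
        self._top += 1
        self._data[self._top] = o

    def pop(self):
        if self._top < 0:
            raise IndexError("pop from empty stack")
        o = self._data[self._top]
        self._data[self._top] = None  # drop the stored reference
        self._top -= 1
        return o


class _Node:
    """Singly linked node (hypothetical helper)."""

    def __init__(self, element, next_node):
        self.element = element
        self.next = next_node


class LinkedStack:
    """Unbounded stack backed by a singly linked list; the head is the top."""

    def __init__(self):
        self._head = None

    def push(self, o):
        self._head = _Node(o, self._head)  # new node becomes the top

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        o = self._head.element
        self._head = self._head.next
        return o
```

Both versions do a constant amount of work per call. The linked version trades the capacity limit for one extra reference per stored element.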

**Method stacks** are used to track local memory in the runtime environments of programming languages

### Queues

**Queue**: data structure that stores items in first-in, first-out (FIFO) order. Elements enter at the rear and are removed from the front

Supports two methods:

- enqueue(o): insert object o at the rear of the queue
- dequeue(): remove and return from the queue the object at the front 

An array-based queue is typically tracked via two variables:

- **f**: index to the cell storing the first element
- **r**: index to the next available array cell
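A minimal sketch of an array-based queue using the two indices above. It assumes the array is treated as circular (indices wrap around with modular arithmetic) and that one cell is always left empty so that `f == r` unambiguously means "empty". The class name `ArrayQueue` and the `size` helper are illustrative.

```python
class ArrayQueue:
    """Fixed-capacity circular queue tracked by indices f and r (hypothetical name)."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._f = 0  # index of the cell storing the first element
        self._r = 0  # index of the next available cell

    def size(self):
        n = len(self._data)
        return (n - self._f + self._r) % n

    def enqueue(self, o):
        n = len(self._data)
        if self.size() == n - 1:  # one cell stays empty (assumption, see above)
            raise OverflowError("queue is full")
        self._data[self._r] = o
        self._r = (self._r + 1) % n  # wrap around at the end of the array

    def dequeue(self):
        if self._f == self._r:
            raise IndexError("dequeue from empty queue")
        o = self._data[self._f]
        self._data[self._f] = None  # drop the stored reference
        self._f = (self._f + 1) % len(self._data)
        return o
```

With this layout neither operation shifts any elements, so both run in constant time.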

## Lists

### Index-based Lists

**Index-based lists**: linear sequence that supports access to its elements via their indices. A list storing n elements supports the following methods:

- get(r): returns the element located at index r
- set(r,e): replace the current element at r with e
- add(r,e): insert a new element e into the list to have index r
  - worst case of $\Theta(n)$ when r = 0, as all n elements need to be shifted
  - best case of $O(1)$ when r = n (inserting at the end of the list), as no elements need to be shifted
- remove(r): remove the element at index r 
  - worst case of $\Theta(n)$ when r = 0, as all remaining elements need to be shifted
  - best case of $O(1)$ when r = n - 1 (removing the last element), as no elements need to be shifted
  
The time-consuming part of an index-based list is shifting elements so that they remain contiguous after every addition and removal
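The shifting cost can be seen in a small array-backed sketch. The class name `ArrayList`, the fixed capacity, and the exceptions raised are assumptions made for illustration.

```python
class ArrayList:
    """Index-based list over a fixed-capacity array (hypothetical name)."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._n = 0  # number of stored elements

    def get(self, r):
        if not 0 <= r < self._n:
            raise IndexError("index out of range")
        return self._data[r]

    def set(self, r, e):
        if not 0 <= r < self._n:
            raise IndexError("index out of range")
        self._data[r] = e

    def add(self, r, e):
        if self._n == len(self._data):
            raise OverflowError("list is full")
        if not 0 <= r <= self._n:
            raise IndexError("index out of range")
        # Shift elements r..n-1 one cell to the right, starting from the end.
        for i in range(self._n - 1, r - 1, -1):
            self._data[i + 1] = self._data[i]
        self._data[r] = e
        self._n += 1

    def remove(self, r):
        if not 0 <= r < self._n:
            raise IndexError("index out of range")
        e = self._data[r]
        # Shift elements r+1..n-1 one cell to the left to close the gap.
        for i in range(r, self._n - 1):
            self._data[i] = self._data[i + 1]
        self._data[self._n - 1] = None
        self._n -= 1
        return e
```

The shifting loops run n - r times in `add` and n - r - 1 times in `remove`, which is why operations at index 0 are the most expensive and operations at the end are the cheapest.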

### Linked Lists

**Linked list**: container of elements that stores each element in a node and keeps references to neighboring nodes in a relative order. Linked lists support the following methods, all of which are $O(1)$:

- first(): return the position of the first element
- last(): return the position of the last element
- before(p): return the position of the element preceding the one at position p
- after(p): return the position of the element following the one at position p
- insertBefore(p,e): insert element e before position p
- insertAfter(p,e): insert element e after position p
- remove(p): remove the element at position p 

**Sentinel** nodes (head and tail) don't store data, but they allow for easier insertion and deletion because every real node always has both a predecessor and a successor
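A sketch of a doubly linked list with head and tail sentinels, showing why no special cases are needed at the ends. Here a "position" is simply the node object. The names `Node` and `DoublyLinkedList` and the snake_case method names (`insert_after` for insertAfter, and so on) are assumptions.

```python
class Node:
    """Doubly linked node; sentinels keep element = None (hypothetical name)."""

    def __init__(self, element=None):
        self.element = element
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = Node()  # sentinel before the first element
        self.tail = Node()  # sentinel after the last element
        self.head.next = self.tail
        self.tail.prev = self.head

    def first(self):
        node = self.head.next
        return None if node is self.tail else node

    def last(self):
        node = self.tail.prev
        return None if node is self.head else node

    def insert_after(self, p, e):
        # p may be the head sentinel, but not the tail sentinel.
        node = Node(e)
        node.prev = p
        node.next = p.next
        p.next.prev = node
        p.next = node
        return node

    def insert_before(self, p, e):
        # p may be the tail sentinel, but not the head sentinel.
        return self.insert_after(p.prev, e)

    def remove(self, p):
        # p must be a real node, never a sentinel.
        p.prev.next = p.next
        p.next.prev = p.prev
        p.prev = p.next = None
        return p.element

    def __iter__(self):
        node = self.head.next
        while node is not self.tail:
            yield node.element
            node = node.next
```

Each method touches a fixed number of references, which matches the $O(1)$ bound stated above.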

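An array requires $O(N)$ space, where N is the capacity of the array, regardless of how many elements it holds. A doubly linked list requires $O(n)$ space, where n is the number of elements in the sequence. Which to use depends on the situation: an array offers $O(1)$ access by index, while a linked list grows and shrinks with its contents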

## Trees

**Tree**: set of nodes storing elements in a parent-child relationship with the following properties:

- tree T has a special node r, called the root, with no parent
- each node v of T different from r has a unique parent node u

A node is **external** (also known as a leaf) if it has no children, and **internal** if it has one or more children

Trees support the following methods:

- accessor
  - root(): return the root of the tree
  - parent(v): return the parent of node v
  - children(v): return a set containing the children of node v
- query
  - isInternal(v): test whether node v is internal
  - isExternal(v): test whether node v is external
  - isRoot(v): test whether node v is the root
- generic
  - size(): return the number of nodes in the tree
  - elements(): return a set containing all the elements stored at nodes of the tree
  - positions(): return a set containing all the nodes of the tree
  - swapElements(v,w): swap the elements stored at the nodes v and w
  - replaceElements(v,e): replace with e and return the element stored at node v

**Depth** of a node v: number of ancestors of v, excluding v itself. Calculated in $O(1 + d_v)$ time, where $d_v$ is the depth of v, which is $O(n)$ in the worst case

**Height** of a tree: maximum depth of an external node. Calculated in $O(n)$ time when called on the root
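As a rough sketch, depth can be computed by walking parent links and height by recursing over children. The `TreeNode` class with `parent` and `children` attributes is an assumed representation, not one given in the notes.

```python
class TreeNode:
    """Tree node that knows its parent and children (hypothetical name)."""

    def __init__(self, element, parent=None):
        self.element = element
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def depth(v):
    """Count the ancestors of v by following parent links up to the root."""
    d = 0
    while v.parent is not None:
        v = v.parent
        d += 1
    return d


def height(v):
    """Height of the subtree rooted at v; an external node has height 0."""
    if not v.children:
        return 0
    return 1 + max(height(c) for c in v.children)
```

The loop in `depth` runs once per ancestor, while `height` visits every node of the subtree exactly once, giving $O(n)$ when called on the root.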

**Traversal**: systematic way of accessing all nodes of a tree

### Preorder Traversal

The root of T is visited first, and then the subtrees rooted at its children are traversed recursively. If the tree is ordered, the subtrees are traversed according to the order of the children. The following algorithm conducts a preorder traversal:

![preorder](./res/02-preorder.PNG)

This algorithm runs in $O(n)$ time. The figure below shows the traversal on an example tree:

![preorder-tree](./res/02-preorder-tree.PNG)
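A minimal Python rendering of the preorder idea, assuming a simple `TreeNode` class with an ordered `children` list and a caller-supplied `visit` callback (both hypothetical names):

```python
class TreeNode:
    """Node of an ordered tree (hypothetical name)."""

    def __init__(self, element, children=()):
        self.element = element
        self.children = list(children)


def preorder(v, visit):
    """Visit v first, then recursively traverse each child's subtree in order."""
    visit(v)
    for child in v.children:
        preorder(child, visit)


# Example tree:      A
#                   / \
#                  B   C
#                 / \
#                D   E
tree = TreeNode("A", [TreeNode("B", [TreeNode("D"), TreeNode("E")]), TreeNode("C")])
order = []
preorder(tree, lambda node: order.append(node.element))
print(order)  # ['A', 'B', 'D', 'E', 'C']
```

Every node is visited exactly once, so the total work is linear in the number of nodes as long as `visit` itself takes constant time.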

### Postorder Traversal

This traversal recursively traverses the subtrees rooted at the children of the root first, and then visits the root. Its algorithm is the following:

![postorder](./res/02-postorder.PNG)

This algorithm runs in $O(n)$ time. The figure below shows the traversal on an example tree:

![postorder-tree](./res/02-postorder-tree.PNG)
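The same idea in Python, again assuming a hypothetical `TreeNode` class with an ordered `children` list and a `visit` callback. The only change from a preorder traversal is that the visit happens after the recursive calls:

```python
class TreeNode:
    """Node of an ordered tree (hypothetical name)."""

    def __init__(self, element, children=()):
        self.element = element
        self.children = list(children)


def postorder(v, visit):
    """Recursively traverse each child's subtree in order, then visit v."""
    for child in v.children:
        postorder(child, visit)
    visit(v)


# Example tree:      A
#                   / \
#                  B   C
#                 / \
#                D   E
tree = TreeNode("A", [TreeNode("B", [TreeNode("D"), TreeNode("E")]), TreeNode("C")])
order = []
postorder(tree, lambda node: order.append(node.element))
print(order)  # ['D', 'E', 'B', 'C', 'A']
```

Postorder is a natural fit when a node's result depends on its children's results, for example summing sizes bottom-up.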

### Binary Trees

**Binary tree**: tree with at most two children per node. A binary tree is **proper** if each internal node has two children

Let T be a proper binary tree with n nodes, and let h denote the height of T. Then T has the following properties:

1. the number of external nodes in T is at least $h+1$ and is at most $2^h$
2. the number of internal nodes in T is at least h and at most $2^h-1$
3. the total number of nodes in T is at least $2h+1$ and at most $2^{h+1}-1$
4. the height of T is at least $\log(n+1)-1$ and at most $(n-1)/2$, that is, $\log(n+1)-1 \leq h \leq (n-1)/2$ (logarithms are base 2)

In a proper binary tree T, the number of external nodes is 1 more than the number of internal nodes
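The bounds above can be checked on the two extreme shapes of a proper binary tree: a completely filled tree, which reaches the upper bounds, and a "skinny" tree in which every internal node has at least one leaf child, which reaches the lower bounds. The names `BNode`, `counts`, `full_tree`, and `skinny_tree` are illustrative.

```python
class BNode:
    """Node of a proper binary tree: zero or two children (hypothetical name)."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def counts(v):
    """Return (external nodes, internal nodes, height) of the subtree at v."""
    if v.left is None and v.right is None:
        return 1, 0, 0
    le, li, lh = counts(v.left)
    re, ri, rh = counts(v.right)
    return le + re, li + ri + 1, 1 + max(lh, rh)


def full_tree(h):
    """Proper binary tree of height h with every level completely filled."""
    return BNode() if h == 0 else BNode(full_tree(h - 1), full_tree(h - 1))


def skinny_tree(h):
    """Proper binary tree of height h with as few nodes as possible."""
    return BNode() if h == 0 else BNode(skinny_tree(h - 1), BNode())


h = 3
print(counts(full_tree(h)))    # (8, 7, 3): 2^h external, 2^h - 1 internal
print(counts(skinny_tree(h)))  # (4, 3, 3): h + 1 external, h internal
```

In both cases the number of external nodes exceeds the number of internal nodes by exactly one.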

An additional traversal is possible for binary trees: the **inorder** traversal. It visits a node after traversing its left subtree and before traversing its right subtree, so nodes are visited "from left to right" across the tree

![inorder](./res/02-inorder.PNG)

The figure below shows the traversal on an example tree:

![inorder-tree](./res/02-inorder-tree.PNG)
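A short Python sketch of the inorder traversal, assuming a hypothetical `BNode` class with `left` and `right` references (either may be `None`) and a `visit` callback:

```python
class BNode:
    """Binary tree node (hypothetical name)."""

    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right


def inorder(v, visit):
    """Traverse the left subtree, visit v, then traverse the right subtree."""
    if v is None:
        return
    inorder(v.left, visit)
    visit(v)
    inorder(v.right, visit)


# Expression tree for (3 + 4) * 2:
#        *
#       / \
#      +   2
#     / \
#    3   4
tree = BNode("*", BNode("+", BNode(3), BNode(4)), BNode(2))
order = []
inorder(tree, lambda node: order.append(node.element))
print(order)  # [3, '+', 4, '*', 2]
```

On an arithmetic expression tree like this one, the inorder sequence reads the expression in its usual left-to-right form (without parentheses).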