# Outline

- Who needs a priority queue? Emergency departments, data compression algorithms, etc.
- How do they work? By organizing data so that the highest (or lowest) valued element is the next one to return.

## Example

Waiting room of an ER, in order of arrival:

`[broken ankle, stroke, sprained wrist, kidney stone]`

Reorganized as a priority queue:

`[stroke, broken ankle, sprained wrist, kidney stone]`

After the first case (`stroke`) is taken, the list will be reorganized with the most severe remaining case moved to the front.

`[kidney stone, broken ankle, sprained wrist]`

Notice that the list is _not_ sorted; only the most important item, based on some metric, is placed at the front. The only requirement here is that `list[0]` is always the most important item. The remaining elements can be in any order as far as we are concerned.
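The reorganization above can be sketched in a few lines. The severity scores below are made-up numbers for illustration only (the text does not assign any), and `move_most_severe_to_front` is a hypothetical helper name.

```python
# Hypothetical severity scores; a higher score means a more urgent case.
SEVERITY = {
    "stroke": 10,
    "kidney stone": 7,
    "broken ankle": 5,
    "sprained wrist": 2,
}


def move_most_severe_to_front(cases: list[str]) -> None:
    """Move the most severe case to position 0, in place."""
    if not cases:
        return
    top = 0
    for i in range(1, len(cases)):
        if SEVERITY[cases[i]] > SEVERITY[cases[top]]:
            top = i
    # Pull the most severe case out and put it first; the rest keep their order.
    cases.insert(0, cases.pop(top))


waiting_room = ["broken ankle", "stroke", "sprained wrist", "kidney stone"]
move_most_severe_to_front(waiting_room)
print(waiting_room)  # ['stroke', 'broken ankle', 'sprained wrist', 'kidney stone']

waiting_room.pop(0)  # the stroke patient is taken in
move_most_severe_to_front(waiting_room)
print(waiting_room)  # ['kidney stone', 'broken ankle', 'sprained wrist']
```

Only position 0 is guaranteed to hold the most severe case; the order of everything after it is incidental.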


## Naive priority queue

- Linear scan to find the most important element and bring it to the front.
- Implement with integers before writing a generic class.
- Intro to generics


In [1]:
class SimplePriorityQ:

    _DEFAULT_CAPACITY: int = 10

    def __init__(self, capacity: int = _DEFAULT_CAPACITY) -> None:
        """Initialize an empty priority queue with given capacity."""
        self._capacity: int = capacity
        # Initialize a list at the specified capacity. This way we
        # can place values directly to the list instead of appending,
        # treating it like an actual array.
        self._underlying: list[int] = [None] * self._capacity
        # Tracks how many elements are in the "array". The invariant
        # here is that 0 ≤ size ≤ capacity.
        self._size: int = 0

    def insert(self, value: int) -> None:
        """Insert value into the priority queue."""
        if self._size >= self._capacity:
            raise Exception("Priority queue is full")
        self._underlying[self._size] = value
        self._size += 1
        self._move_important_to_front()

    def _move_important_to_front(self):
        # Where is the most important element?
        max_idx = self._most_important_idx()
        # Swap positions, bringing the most important element to the front
        self._swap(0, max_idx)

    def _most_important_idx(self):
        """Finds and returns the position of the most important
        element in the underlying array"""
        max_idx = 0
        for i in range(1, self._size):
            if self._underlying[i] > self._underlying[max_idx]:
                max_idx = i
        return max_idx

    def _swap(self, position, with_position):
        """Swaps positions between two elements in the list. The code uses
        the basic three-variable trick instead of the Pythonic trick
        a,b = b,a
        for better illustration and portability in other languages.
        """
        temp = self._underlying[position]
        self._underlying[position] = self._underlying[with_position]
        self._underlying[with_position] = temp

    def remove_max(self) -> int:
        """Remove and return the maximum element from the priority queue."""
        if self.is_empty():
            raise Exception("Priority queue is empty")
        # Grab the first element, so that we can return it
        most_important = self._underlying[0]
        # Use a temporary array with the same capacity as the
        # underlying array and copy every element except for the
        # first one, shifted one position toward the front. Keeping
        # the length unchanged lets later insertions use every slot
        # up to the capacity.
        temp_list = [None] * len(self._underlying)
        for i in range(1, self._size):
            temp_list[i - 1] = self._underlying[i]
        # Replace the underlying array with the temp array
        self._underlying = temp_list
        # Adjust the size
        self._size -= 1
        # Make sure that the most important element in the updated
        # underlying array is now at its front
        self._move_important_to_front()
        # Done; return the removed element to the user.
        return most_important

    def is_empty(self) -> bool:
        """Return True if the priority queue is empty, False otherwise."""
        return (self._size) == 0

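    def __bool__(self) -> bool:
        # A queue is truthy when it holds at least one element,
        # matching the convention of Python's built-in containers.
        return not self.is_empty()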

    # String representation constants

    _Q_EMPTY = "Queue is empty."
    _Q_IN_QUEUE = "in queue: ["
    _Q_CONTENTS_CLOSING = " ]"
    _Q_CONTENTS_DELIMITER = ", "
    _Q_SINGULAR = "element"
    _Q_PLURAL = f"{_Q_SINGULAR}s"

    def __str__(self) -> str:
        # Initialize return string to assume empty queue
        text = f"{self._Q_EMPTY}"
        # If queue is not empty prepare to populate return string with data
        if self._size > 0:
            # Prepare singular or plural noun, 1 element, 2 elements etc
            elements = self._Q_SINGULAR if self._size == 1 else self._Q_PLURAL
            text = f"{self._size} {elements} {self._Q_IN_QUEUE} {self._underlying[0]}"
            for i in range(1, self._size):
                text += f"{self._Q_CONTENTS_DELIMITER}{self._underlying[i]}"
            text += f"{self._Q_CONTENTS_CLOSING}"
        return text

The code above can be tested quickly with the following.

```python
q = SimplePriorityQ()
q.insert(1); q.insert(3); q.insert(0); q.insert(4)
print(q); print(q.remove_max()) # expect [ 4, 1, 0, 3 ],  4
print(q); print(q.remove_max()) # expect [ 3, 0, 1 ],     3
print(q); print(q.remove_max()) # expect [ 1, 0 ],        1
print(q); print(q.remove_max()) # expect [ 0 ],           0
print(q); print(q.remove_max()) # expect `Queue is empty.` then exception
```


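Each insertion into, and removal from, the naive priority queue takes $\mathcal O(n)$ time, because every operation scans the elements to find the most important one. This is not bad, but it can be done faster.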
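To make the linear cost concrete, the sketch below counts the comparisons made by one scan for the largest element. `count_scan_comparisons` is a hypothetical helper, not part of `SimplePriorityQ`; it only mirrors the loop in `_most_important_idx`.

```python
def count_scan_comparisons(values: list[int]) -> int:
    """Return how many comparisons a linear scan for the maximum makes."""
    comparisons = 0
    max_idx = 0
    for i in range(1, len(values)):
        comparisons += 1
        if values[i] > values[max_idx]:
            max_idx = i
    return comparisons


for n in (10, 100, 1000):
    # A scan over n elements always makes n - 1 comparisons.
    print(n, count_scan_comparisons(list(range(n))))
```

The count grows in step with the number of elements, which is what the $\mathcal O(n)$ bound expresses.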

Consider the data structure below.

![](./images/MaxHeapTree.jpg)

The data in it seem all over the place, but there are two notable properties. The largest value is at the top of the structure, and each node is greater than or equal to the two nodes immediately below it. This arrangement can be formalized as $p \geq\max\left(c_L, c_R\right)$ for the value of a node $p$ and the values of the two nodes $c_L$ and $c_R$ immediately under it, as shown below.

![](./images/parent_child.jpg)

We use the terms _parent,_ _left child,_ and _right child_ to describe this arrangement.

If either child violates the requirement $p \geq\max\left(c_L, c_R\right)$, then we must swap the offending value with the value in the parent node. For example, both children below are greater than the parent node:

![](./images/101212med.jpg)

After finding the larger of the two children and swapping its value with the value of the parent node, the requirement $p \geq\max\left(c_L, c_R\right)$ is restored.

![](./images/121110med.jpg)

Verifying and restoring the property $p \geq\max\left(c_L, c_R\right)$ is quite easy. (For simplicity, `Node` objects are defined below as a _dataclass_.)


In [None]:
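from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def validate(p: Node) -> bool:
    # A missing child cannot violate the property, so the value of a
    # child is compared only when that child exists.
    return (
        p is not None
        and (p.left is None or p.left.value <= p.value)
        and (p.right is None or p.right.value <= p.value)
    )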


def restore(p: Node) -> Node:
    # Find the largest of the children
    if p.left is not None:
        # Proceed only if there is at least one child. If there is only one child,
        # it must be the left child. Assign largest to left child initially.
        largest: Node = p.left
        if p.right is not None and p.right.value > largest.value:
            # If there is a right child, and it is larger than the left child,
            # assign largest to right child instead.
            largest = p.right
        # If the largest child is larger than p, swap their values.
        if largest.value > p.value:
            largest.value, p.value = p.value, largest.value
    return p
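A brief usage sketch of `restore` follows. It repeats the definitions so that it runs on its own; the values 10, 11, and 12 are chosen for illustration and are not necessarily the ones in the figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def restore(p: Node) -> Node:
    """Swap p's value with its largest child if that child is larger."""
    if p.left is not None:
        largest = p.left
        if p.right is not None and p.right.value > largest.value:
            largest = p.right
        if largest.value > p.value:
            largest.value, p.value = p.value, largest.value
    return p


# Illustrative values: both children are larger than the parent.
root = Node(10, left=Node(11), right=Node(12))
restore(root)
print(root.value, root.left.value, root.right.value)  # 12 11 10
```

Note that only the values are swapped; the `Node` objects and their links stay where they are.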

Conceivably, we could assemble a collection of these `Node` objects beginning with one on top, then two children, each with two children etc. The result would be the structure shown earlier and also below.

![](./images/MaxHeapTree.jpg)

If every layer above the bottom one is full, so that only nodes near the bottom may lack children, we can bound the total number of nodes $N$ in a structure with $L$ layers:

$$ 2^{L-1} \leq N \leq 2^L-1 $$
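The bounds can be checked for small trees with a short sketch. `node_count_bounds` is a hypothetical helper; it assumes that every layer above the bottom one is full and that the bottom layer holds at least one node.

```python
def node_count_bounds(layers: int) -> tuple[int, int]:
    """Return (minimum, maximum) node counts for a tree with `layers` layers."""
    if layers < 1:
        raise ValueError("a tree needs at least one layer")
    # Minimum: all upper layers full (2^(L-1) - 1 nodes) plus one bottom node.
    # Maximum: every layer full, including the bottom one.
    return 2 ** (layers - 1), 2 ** layers - 1


for layers in range(1, 5):
    print(layers, node_count_bounds(layers))
# 1 (1, 1)
# 2 (2, 3)
# 3 (4, 7)
# 4 (8, 15)
```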

This allows us to implement the tree in a much simpler manner, using an array (or list) with size $2^L-1$. All we need is to be able to tell where, in the array, the two children of a node are and, conversely, where the parent of each node is. This is easy to accomplish after we rearrange the nodes of the tree on a line, where the top node is first, the nodes of the next layer from left to right are next, and so on. The linear arrangement is shown below.

![](./images/linear_arrangement.jpg)

If we use an array to hold these nodes, the parent-children relation can be expressed as simple algebraic functions of the array indices. For example, after we observe a few relations between parent and children indices, we can reach the conclusion shown in the table below.

|   $p$    |  $c_L$   |  $c_R$   |
| :------: | :------: | :------: |
|    0     |    1     |    2     |
|    1     |    3     |    4     |
|    2     |    5     |    6     |
| $\vdots$ | $\vdots$ | $\vdots$ |
|   $i$    |  $2i+1$  |  $2i+2$  |

So we can write the functions that generate the indices of the left and right children as

$$
\begin{align*}
c_L(p) & = 2p+1 \\
c_R(p) & = 2p+2 \\ & = c_L(p) + 1
\end{align*}
$$

Or, in plain Python

```python
def left_child(parent: int) -> int:
    return 2*parent + 1

def right_child(parent: int) -> int:
    return 2*parent + 2
```

Conversely, we can find the parent of every node using the inverse relation:

$$
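p(c) = \left\lfloor \dfrac{c-1}{2}\right\rfloor
$$

and in plain Python:

```python
def parent(child: int) -> int:
    # Zero-based indices: children 2p+1 and 2p+2 both map back to p.
    # The root (index 0) has no parent.
    return (child - 1) // 2
```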
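As a quick sanity check on the index arithmetic, the sketch below assumes zero-based indices, with the parent of a position $c > 0$ computed as $\lfloor (c-1)/2 \rfloor$, and verifies that every child maps back to its parent. The name `parent_of` is used here only to avoid clashing with variables named `parent`.

```python
def left_child(parent: int) -> int:
    return 2 * parent + 1


def right_child(parent: int) -> int:
    return 2 * parent + 2


def parent_of(child: int) -> int:
    # Zero-based indexing: both children 2p+1 and 2p+2 map back to p.
    return (child - 1) // 2


for p in range(4):
    left, right = left_child(p), right_child(p)
    # Round trip: going down to a child and back up returns to p.
    assert parent_of(left) == p and parent_of(right) == p
    print(p, left, right)
# 0 1 2
# 1 3 4
# 2 5 6
# 3 7 8
```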

With these relations in place, we can rewrite the functions `validate` and `restore` from earlier as array operations. In the code below, we assume that the functions have access to a `tree_list` holding the linear arrangement of the tree.

```python
def has_node(index: int) -> bool:
    # Helper added for bounds checking: a position holds a node only if it
    # lies inside the list and its slot is not an empty (None) placeholder.
    return 0 <= index < len(tree_list) and tree_list[index] is not None


def validate(parent: int) -> bool:
    left_idx: int = left_child(parent)
    right_idx: int = right_child(parent)
    # A missing child cannot violate the property.
    return (
        has_node(parent)
        and (not has_node(left_idx) or tree_list[left_idx] <= tree_list[parent])
        and (not has_node(right_idx) or tree_list[right_idx] <= tree_list[parent])
    )


def restore(parent: int) -> None:
    # Find the largest of the children
    left_idx: int = left_child(parent)
    if has_node(parent) and has_node(left_idx):
        # Proceed only if there is at least one child. If there is only one child,
        # it must be the left child. Assign largest to left child initially.
        largest: int = left_idx
        right_idx: int = right_child(parent)
        if has_node(right_idx) and tree_list[right_idx] > tree_list[left_idx]:
            # If there is a right child, and it is larger than the left child,
            # assign largest to right child instead.
            largest = right_idx
        # If the largest child is larger than the parent, swap their values.
        if tree_list[largest] > tree_list[parent]:
            tree_list[parent], tree_list[largest] = tree_list[largest], tree_list[parent]
```
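The array version can be tried on a small list. The sketch below is one possible arrangement, not the only one: it passes the list as an explicit argument instead of relying on a shared `tree_list`, and the names `has_node` and `restore_at` are hypothetical. The values 10, 11, and 12 are illustrative.

```python
from typing import Optional


def left_child(parent: int) -> int:
    return 2 * parent + 1


def right_child(parent: int) -> int:
    return 2 * parent + 2


def has_node(tree: list[Optional[int]], index: int) -> bool:
    """True if index is inside the list and the slot is not empty."""
    return 0 <= index < len(tree) and tree[index] is not None


def restore_at(tree: list[Optional[int]], parent: int) -> None:
    """Swap the value at parent with its largest child if that child is larger."""
    left_idx = left_child(parent)
    if not (has_node(tree, parent) and has_node(tree, left_idx)):
        return  # no parent or no children: nothing to restore
    largest = left_idx
    right_idx = right_child(parent)
    if has_node(tree, right_idx) and tree[right_idx] > tree[largest]:
        largest = right_idx
    if tree[largest] > tree[parent]:
        tree[parent], tree[largest] = tree[largest], tree[parent]


tree = [10, 11, 12]
restore_at(tree, 0)
print(tree)  # [12, 11, 10]
restore_at(tree, 1)  # position 1 has no children, so nothing changes
print(tree)  # [12, 11, 10]
```

The bounds check in `has_node` matters for nodes in the bottom layer, whose child indices fall past the end of the list.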
