Chapter 19 Priority Queues<br>

19.1 The Priority Queue ADT<br>

The Priority Queue ADT is a data type that stores a collection of items
with priorities (not necessarily unique) and supports the following operations.
- insert(item, priority) - Add item with the given priority.
- findmin() - Return the item with the minimum priority. If there
are multiple items with the minimum priority, ties may be broken
arbitrarily.
- removemin() - Remove and return the item with the minimum priority. Ties are broken arbitrarily.

19.2 Using a list

In [1]:
class SimpleListPQ:
    def __init__(self):
        self._L = []

    def insert(self, item, priority):
        self._L.append((item, priority))

    def findmin(self):
        return min(self._L, key = lambda x : x[1])[0]
    
    def removemin(self):
        item, priority = min(self._L, key = lambda x : x[1])
        self._L.remove((item, priority))
        return item

In [2]:
# refactored code using Entry class

class Entry:
    def __init__(self, item, priority):
        self.item = item
        self.priority = priority
    
    def __lt__(self, other):
        return self.priority < other.priority

In [3]:
from ds2.priorityqueue import Entry

class UnsortedListPQ:
    def __init__(self):
        self._entries = []

    def insert(self, item, priority):
        self._entries.append(Entry(item, priority))

    def findmin(self):
        return min(self._entries).item

    def removemin(self):
        entry = min(self._entries)
        self._entries.remove(entry)
        return entry.item

In [4]:
# improving the previous code

from ds2.priorityqueue import Entry

class SortedListPQ:
    def __init__(self):
        self._entries = []

    def insert(self, item, priority):
        self._entries.append(Entry(item, priority))
        self._entries.sort(reverse = True)
        
    def findmin(self):
        return self._entries[-1].item
    
    def removemin(self):
        return self._entries.pop().item

In [5]:
S = SortedListPQ()
S.insert("ham", 3)
S.insert("cheese", 1)
S.insert("bread", 2)
print([S.removemin() for i in [1,2,3]])

['cheese', 'bread', 'ham']


19.3 Heaps<br>

The data structure we’ll use for an efficient priority queue is called a heap.
Heaps are used to implement priority queues so often that other sources
may not draw a distinction between the two ideas.
As we are using it, the priority queue is the ADT, the heap is the data
structure.<br>

Matters are complicated by two other vocabulary issues. First, there are
many different kinds of heaps. We’ll study just one example, the so-called
binary heap. Second, the usual word used for the priority in a heap is "key".
This can be confusing, because unlike keys in mapping data structures,
there is no requirement that priorities be unique. We will stick with the
word "priority" despite the usual conventions in order to keep these ideas
separate.<br>

We can think of a binary heap as a binary tree that is arranged so that
smaller priorities are above larger priorities. The name is apt as any good
heap of stuff should have the big things on the bottom and small things on
the top. For any tree with nodes that have priorities, we say that the tree
is heap-ordered if, for every node, the priorities of its children are at least
as large as the priority of the node itself. This naturally implies that the
minimum priority is at the root.

19.4 Storing a tree in a list<br>

For a node at index i, its left child will be at index 2 * i + 1 and
the right child will be at index 2 * i + 2. This means that the parent
of the node at index i is at index (i-1) // 2. This mapping of nodes to
indices puts the root at index 0 and the rest of the nodes appear level by
level from left-to-right, top-to-bottom. Using this mapping, we say a list
is heap-ordered if the corresponding binary tree is heap-ordered. This is
equivalent to saying that for all i > 0, the priority of the entry at index i is
greater than or equal to the priority at index (i − 1)//2.
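
The index arithmetic above can be checked on a plain list of priorities. The following is a minimal sketch; the helper names parent, children, and is_heap_ordered are chosen here for illustration and are not part of the ds2 package.

```python
def parent(i):
    """Index of the parent of the node stored at index i (for i > 0)."""
    return (i - 1) // 2

def children(i, n):
    """Indices of the children of node i in a tree with n nodes."""
    return range(2 * i + 1, min(n, 2 * i + 3))

def is_heap_ordered(priorities):
    """True if every non-root entry is at least as large as its parent."""
    return all(priorities[parent(i)] <= priorities[i]
               for i in range(1, len(priorities)))

print(list(children(1, 6)))              # [3, 4]
print(is_heap_ordered([1, 3, 2, 7, 4]))  # True
print(is_heap_ordered([1, 3, 2, 0, 4]))  # False: index 3 is smaller than its parent at index 1
```

Checking only the parent of each index is enough, because every parent-child pair in the tree appears exactly once this way.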

In [6]:
from ds2.priorityqueue import Entry

class HeapPQ:
    def __init__(self):
        self._entries = []

    def insert(self, item, priority):
        self._entries.append(Entry(item, priority))
        self._upheap(len(self._entries) - 1)

    def _parent(self, i):
        return (i - 1) // 2
    
    def _children(self, i):
        left = 2 * i + 1
        right = 2 * i + 2
        return range(left, min(len(self._entries), right + 1))
    
    def _swap(self, a, b):
        L = self._entries
        L[a], L[b] = L[b], L[a]

    def _upheap(self, i):
        L = self._entries
        parent = self._parent(i)
        if i > 0 and L[i] < L[parent]:
            self._swap(i, parent)
            self._upheap(parent)

    def findmin(self):
        return self._entries[0].item
    
    def removemin(self):
        L = self._entries
        item = L[0].item
        L[0] = L[-1]
        L.pop()
        self._downheap(0)
        return item
    
    def _downheap(self, i):
        L = self._entries
        children = self._children(i)
        if children:
            child = min(children, key = lambda x:L[x])
            if L[child] < L[i]:
                self._swap(i, child)
                self._downheap(child)

    def __len__(self):
        return len(self._entries)

19.5 Building a Heap from scratch, heapify<br>

Just using the public interface, one could easily construct a HeapPQ from a
list of item-priority pairs. For example, the following code would work just
fine.

In [7]:
pq = HeapPQ()

pairs = [(10, 10), (2, 2), (30, 30), (4,4)]
for item, priority in pairs:
    pq.insert(item, priority)

The insert method takes O(log n) time, so the total running time for
this approach is O(n log n) time.<br>

Perhaps surprisingly, we can construct the HeapPQ in linear time. We
call this heapifying a list. We will exploit the downheap method that we
have already written. The code is deceptively simple.<br>

In [8]:
def _heapify(self):
    n = len(self._entries)
    for i in reversed(range(n)):
        self._downheap(i)

Look at the difference between the _heapify code above and the _heapify_slower
code below. The former works from the end and downheaps every entry and
the latter starts at the beginning and upheaps every entry. Both are correct,
but one is faster.

In [9]:
def _heapify_slower(self):
    n = len(self._entries)
    for i in range(n):
        self._upheap(i)

They may seem to be the same, but they are not. To see why, we have to
look a little closer at the running time of upheap and downheap. Consider
the tree perspective of the list. For upheap, the running time depends on
the depth of the starting node. For downheap, the running time depends on
the height of the subtree rooted at the starting node. In a complete binary
tree, about half of the nodes are leaves. For a leaf, downheap takes constant
time, whereas upheap can take time proportional to log n. Thus, in the worst
case, _heapify_slower will take roughly $\frac{n}{2} \log_{2} n = \Omega(n \log n)$ time.<br>

On the other hand, to analyze _heapify, we have to add up the heights of
all the nodes in a complete binary tree. Formally, this is at most about $n\sum_{i=1}^{\log_{2} n} i/2^{i}$.
There is a cute trick to bound this sum. Simply observe that if from every
node in the tree, we take a path that goes left on the first step and right for
every step thereafter, no two paths will overlap. This means that the sum
of the lengths of these paths (which is also equal to the sum of the heights) is at
most the total number of edges, n − 1. Thus, _heapify runs in O(n) time.
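
The two bounds can also be checked numerically. The sketch below assumes the index mapping from Section 19.4 and uses a hypothetical helper, height_and_depth_sums, that adds up the height of every node (the work done by downheap in the worst case) and the depth of every node (the work done by upheap in the worst case).

```python
def height_and_depth_sums(n):
    """Return (sum of heights, sum of depths) for an n-node complete
    binary tree stored in a list with the root at index 0."""
    height = [0] * n
    # Children have larger indices than their parents, so scan from the back.
    for i in reversed(range(1, n)):
        p = (i - 1) // 2
        height[p] = max(height[p], height[i] + 1)
    depth = [0] * n
    # Parents have smaller indices, so scan from the front.
    for i in range(1, n):
        depth[i] = depth[(i - 1) // 2] + 1
    return sum(height), sum(depth)

for n in [15, 1023]:
    heights, depths = height_and_depth_sums(n)
    print(n, heights, depths)
# 15 11 34
# 1023 1013 8194
```

The sum of heights stays below n, while the sum of depths grows like n log n, which matches the gap between the two heapify strategies.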

19.6 Implicit and Changing Priorities

In [1]:
from ds2.priorityqueue import Entry, HeapPQ

class PriorityQueue(HeapPQ):
    def __init__(self,
                 items = (),
                 entries = (),
                 key = lambda x : x):
        self._key = key
        self._entries = [Entry(i, p) for i, p in entries]
        self._entries.extend([Entry(i, key(i)) for i in items])
        self._itemmap = {entry.item : index for index, entry in enumerate(self._entries)}
        self._heapify()

    def insert(self, item, priority=None):
        if priority is None:
            priority = self._key(item)
        index = len(self._entries)
        self._entries.append(Entry(item, priority))
        self._itemmap[item] = index
        self._upheap(index)

    def _swap(self, a, b):
        L = self._entries
        va = L[a].item
        vb = L[b].item
        self._itemmap[va] = b
        self._itemmap[vb] = a
        L[a], L[b] = L[b], L[a]
    
    def changepriority(self, item, priority = None):
        if priority is None:
            priority = self._key(item)
        i = self._itemmap[item]
        self._entries[i].priority = priority
        # Assuming the tree is heap ordered, only one will have an effect.
        self._upheap(i)
        self._downheap(i)

    def removemin(self):
        L = self._entries
        item = L[0].item
        self._swap(0, len(L) - 1)
        del self._itemmap[item]
        L.pop()
        self._downheap(0)
        return item

In [2]:
maxheap = PriorityQueue(key = lambda x: -x)
n = 10
for i in range(n):
    maxheap.insert(i) #no need to specify the priority
# These should print in decreasing order.
print([maxheap.removemin() for i in range(n)])

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]


19.7 Random Access<br>

Storing the map from items to their indices allows some other operations
on the heap. For example, we could remove an arbitrary item by using the
same approach as in removemin. Instead of working with index 0 (the top
of the heap), we instead find the index of the item to remove. The following
code gives the factored version of both removemin and remove.

In [3]:
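def _remove_at_index(self, index):
    L = self._entries
    self._swap(index, len(L) - 1)
    del self._itemmap[L[-1].item]
    L.pop()
    # The entry moved into index may need to go up or down; only one will have an effect.
    if index < len(L):
        self._upheap(index)
        self._downheap(index)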

def removemin(self):
    item = self._entries[0].item
    self._remove_at_index(0)
    return item

def remove(self, item):
    self._remove_at_index(self._itemmap[item])
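
One detail is worth checking when removing from the middle of a heap: the entry swapped into the vacated slot comes from a different part of the tree, so it may be smaller than its new parent. The following is a minimal sketch on a plain list of priorities, with no item map. The function names sift_up, sift_down, and remove_at are illustrative only and restore heap order in both directions.

```python
def sift_up(heap, i):
    # Move heap[i] toward the root while it is smaller than its parent.
    while i > 0 and heap[i] < heap[(i - 1) // 2]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def sift_down(heap, i):
    # Move heap[i] toward the leaves while one of its children is smaller.
    n = len(heap)
    while True:
        kids = range(2 * i + 1, min(n, 2 * i + 3))
        if not kids:
            return
        child = min(kids, key=lambda k: heap[k])
        if heap[child] >= heap[i]:
            return
        heap[i], heap[child] = heap[child], heap[i]
        i = child

def remove_at(heap, index):
    """Remove and return heap[index], keeping the list heap-ordered."""
    last = len(heap) - 1
    heap[index], heap[last] = heap[last], heap[index]
    removed = heap.pop()
    if index < len(heap):   # nothing to repair if the last slot was removed
        sift_up(heap, index)
        sift_down(heap, index)
    return removed

heap = [1, 10, 2, 11, 12, 3, 4]
remove_at(heap, 3)          # removes 11; the 4 swapped in must rise above 10
print(heap)                 # [1, 4, 2, 10, 12, 3]
```

In this example, repairing only downward would leave 4 below 10, which violates the heap order.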

19.8 Iterating over a Priority Queue

In [4]:
def __iter__(self):
    return self

def __next__(self):
    if len(self) > 0:
        return self.removemin()
    else:
        raise StopIteration
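
Because __iter__ returns the queue itself and __next__ calls removemin, iterating in this style empties the queue. The sketch below illustrates that behavior with a hypothetical stand-in class, DrainingPQ, built on Python's standard heapq module rather than on the chapter's PriorityQueue.

```python
import heapq

class DrainingPQ:
    """Hypothetical stand-in for a priority queue, backed by heapq."""

    def __init__(self, items=()):
        self._heap = list(items)
        heapq.heapify(self._heap)

    def __len__(self):
        return len(self._heap)

    def removemin(self):
        return heapq.heappop(self._heap)

    def __iter__(self):
        return self

    def __next__(self):
        if len(self) > 0:
            return self.removemin()
        raise StopIteration

pq = DrainingPQ([3, 1, 2])
print(list(pq))   # [1, 2, 3]
print(len(pq))    # 0, since iterating removed every item
print(list(pq))   # []
```

A caller that needs the queue afterwards would have to iterate over a copy instead.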

19.9 Heapsort

In [5]:
from ds2.priorityqueue import PriorityQueue

def heapsort(L):
    H = PriorityQueue(L)
    L[:] = [item for item in H]

L = [3, 2, 4, 1, 6, 5]
print("before heapsort:", L)
heapsort(L)
print("after heapsort: ", L)

before heapsort: [3, 2, 4, 1, 6, 5]
after heapsort:  [1, 2, 3, 4, 5, 6]
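
For comparison, the same idea can be sketched with Python's standard heapq module, which provides heapify and heappop on ordinary lists. The function name heapsort_heapq is an assumption made for this example. It is meant as a cross-check on the version above, not a replacement for it.

```python
import heapq

def heapsort_heapq(L):
    """Sort L in place by heapifying a copy and repeatedly popping the minimum."""
    heap = list(L)
    heapq.heapify(heap)   # linear time
    L[:] = [heapq.heappop(heap) for _ in range(len(heap))]

L = [3, 2, 4, 1, 6, 5]
heapsort_heapq(L)
print(L)   # [1, 2, 3, 4, 5, 6]
```

Each of the n pops takes O(log n) time, so the total running time is O(n log n).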
