<center><img src="img/dsa-logo.JPG" width="400"/></center>

***

<center>Lecture 10</center>

***

<center>Sorted Maps and Trees</center>  

***

<center>31 October 2023</center>
<center>Rahman Peimankar</center>

# Agenda

1. Sorted Maps
2. Sorted Search Tables
3. An implementation of a SortedTableMap class
4. Running Time Analysis
5. An Application of Sorted Maps
6. Trees
7. The Tree Abstract Data Type
8. A Tree Abstract Base Class in Python
9. Exercises

# Recap of Last Week

## 1. Hash Tables

In general, a hash table consists of two major components, a **bucket array** and a **hash function**.

<center>
<img src="img/Qimage-1-lecture9.JPG" width="900"/>

## 2. Hash Functions

<center>
<img src="img/Qimage-2-lecture9.JPG" width="700"/>

1. Polynomial Hash Codes

<center>
<img src="img/Qimage-3-lecture9.JPG" width="700"/>

* If a hash function simply adds up the ASCII codes of all characters in a word, then, for example, these words:

``stop``, ``tops``, ``pots``, ``spot``

would generate the same sum-value:

454 (‘s’=115, ‘t’=116, ‘o’=111, ‘p’=112)

Using the polynomial form instead would give different values for these words:


‘s’ $\times$ 8 + ‘t’ $\times$ 4 + ‘o’ $\times$ 2 + ‘p’ = 1718,

and

‘t’ $\times$ 8 + ‘o’ $\times$ 4 + ‘p’ $\times$ 2 + ‘s’ = 1711, ...
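
The arithmetic above can be checked with a short sketch. The function names `sum_hash` and `polynomial_hash` are made up for this illustration, and the multiplier `a = 2` is simply the value implied by the factors 8, 4, 2, 1 in the example.

```python
def sum_hash(word):
    """Add up the character codes; the order of the characters is ignored."""
    return sum(ord(c) for c in word)


def polynomial_hash(word, a=2):
    """Evaluate x0*a^(n-1) + x1*a^(n-2) + ... + x(n-1) with Horner's rule."""
    h = 0
    for c in word:
        h = h * a + ord(c)
    return h


words = ['stop', 'tops', 'pots', 'spot']
sum_codes = {w: sum_hash(w) for w in words}
poly_codes = {w: polynomial_hash(w) for w in words}

print(sum_codes)   # every word gets the same code
print(poly_codes)  # the order of the characters now matters
```

With the simple sum all four anagrams collide, while the polynomial form separates them.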

2. Cyclic-Shift Hash Codes

<center>
<img src="img/Qimage-5-lecture9.JPG" width="500"/>

    
is achieved by taking the leftmost five bits and placing those on the rightmost side of the representation, resulting in:
    
<center>
<img src="img/Qimage-6-lecture9.JPG" width="500"/>
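
As a rough sketch, a 5-bit cyclic shift on a 32-bit value can be written with bit operations. The names `cyclic_shift_left` and `cyclic_hash` are hypothetical, and the bit pattern below is just an illustrative 32-bit value.

```python
def cyclic_shift_left(value, shift=5, bits=32):
    """Rotate the lowest `bits` bits of value to the left by `shift` positions."""
    mask = (1 << bits) - 1           # keeps only the lowest `bits` bits
    value &= mask
    return ((value << shift) & mask) | (value >> (bits - shift))


def cyclic_hash(s):
    """Hash a string by rotating the partial sum before adding each character."""
    mask = (1 << 32) - 1
    h = 0
    for ch in s:
        h = cyclic_shift_left(h)     # 5-bit cyclic shift of the partial sum
        h = (h + ord(ch)) & mask     # stay within 32 bits
    return h


before = '00111101100101101010100010101000'
after = format(cyclic_shift_left(int(before, 2)), '032b')
print(before)
print(after)   # the leftmost five bits of `before` now appear on the right
```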

1. Compression - The Division Method

<center>
<img src="img/Qimage-8-lecture9.JPG" width="250"/>

2. The Multiply-Add-and-Divide (MAD) Method

<center>
<img src="img/Qimage-9-lecture9.JPG" width="400"/>
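
Both compression functions can be sketched in a few lines, assuming the usual definitions: the division method maps a hash code $i$ to $i \bmod N$, and the MAD method maps it to $[(ai + b) \bmod p] \bmod N$, where $p$ is a prime larger than $N$ and $a > 0$. The function names and the concrete values of `N`, `p`, `a` and `b` are chosen only for illustration.

```python
def compress_division(hash_code, N):
    """Division method: map a hash code to a bucket index in [0, N-1]."""
    return hash_code % N


def make_mad_compressor(N, p, a, b):
    """Return a MAD compression function for a bucket array of size N.

    p is assumed to be a prime larger than N, with 0 < a < p and 0 <= b < p.
    """
    def compress(hash_code):
        return ((a * hash_code + b) % p) % N
    return compress


N = 11
mad = make_mad_compressor(N, p=2003, a=3, b=7)   # illustrative parameters
for code in (1718, 1711):
    print(code, compress_division(code, N), mad(code))
```

In practice `a` and `b` are usually drawn at random once, when the table is created.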

## 3. Collision Handling

Two or more keys can map to the same index (the same bucket in A); when that happens, we have a **collision**.

#### Separate Chaining

<center>
<img src="img/Qimage-10-lecture9.JPG" width="600"/>
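
A minimal sketch of separate chaining, where each bucket holds a small list of key-value pairs. The class `ChainedTable` and its method names are hypothetical, and the sketch leaves out resizing.

```python
class ChainedTable:
    """Hypothetical minimal hash table that resolves collisions by chaining."""

    def __init__(self, capacity=11):
        self._buckets = [[] for _ in range(capacity)]   # one list per bucket

    def _index(self, k):
        return hash(k) % len(self._buckets)             # division method

    def put(self, k, v):
        bucket = self._buckets[self._index(k)]
        for i, (key, _) in enumerate(bucket):
            if key == k:
                bucket[i] = (k, v)                      # overwrite existing key
                return
        bucket.append((k, v))                           # new key joins the chain

    def get(self, k):
        for key, value in self._buckets[self._index(k)]:
            if key == k:
                return value
        raise KeyError(k)


table = ChainedTable(capacity=5)
table.put(3, 'a')
table.put(8, 'b')   # in CPython, 3 and 8 both land in bucket 3 when N = 5
print(table.get(3), table.get(8))
```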

## 4. Efficiency of Hash Tables

<center>
<img src="img/Qimage-11-lecture9.JPG" width="600"/>

<center>
    
# 1. Sorted Maps

* The traditional map ADT allows a user to look up the value associated with a given key, but the search for that key is a form known as an **exact search**.


* However, if the keys are, for example, the time stamps of events, the map ADT provides no way to list all events in order of time, or to find the event closest to a given time.



**Can you give an example of a computer system or map data structure with Time Stamps?**

We introduce an extension known as the **sorted map ADT** that includes all behaviors of the **standard map**, plus the following:

<center>
<img src="img/Qimage-1.JPG" width="900"/>

<center>
<img src="img/Qimage-2.JPG" width="900"/>
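
To make the contrast with an exact search concrete, here is a small sketch using a made-up event log. The time stamps, the event names and the helper `first_event_at_or_after` are all hypothetical. The standard `bisect` module does the searching.

```python
from bisect import bisect_left

# Hypothetical event log, kept sorted by time stamp.
timestamps = [100, 230, 415, 590]
events = ['login', 'upload', 'download', 'logout']

# An ordinary dict only supports exact search.
exact = dict(zip(timestamps, events))
print(exact.get(230))   # a key that is present
print(exact.get(300))   # no event at exactly this time, so nothing is found


def first_event_at_or_after(t):
    """Return the (time stamp, event) pair with the least time stamp >= t."""
    j = bisect_left(timestamps, t)
    if j < len(timestamps):
        return (timestamps[j], events[j])
    return None


print(first_event_at_or_after(300))   # the nearest event that follows t = 300
```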

<center>
    
# 2. Sorted Search Tables

* We begin by exploring a simple implementation of a sorted map. 

* We store the map’s items in an array-based sequence A so that they are in increasing order of their keys,

<center>
<img src="img/Qimage-3.JPG" width="700"/>
    
Realization of a map by means of a sorted search table. We show only the keys for this map, so as to highlight their ordering.

1. What would be the space complexity of **Sorted Search Table**?

2. What would be the main advantage of this representation, and our reason for insisting that A be array-based?

<center>
    
# 3. An implementation of a SortedTableMap class

1. nonpublic behaviors

In [5]:
from collections.abc import MutableMapping  # the alias in collections was removed in Python 3.10

class MapBase(MutableMapping):
    """Our own abstract base class that includes a nonpublic Item class."""

#------------------------------- nested Item class -------------------------------
    class _Item:
        """Lightweight composite to store key-value pairs as map items."""
        __slots__ = '_key' , '_value'

        def __init__(self, k, v):
            self._key = k
            self._value = v

        def __eq__(self, other):
            return self._key == other._key # compare items based on their keys

        def __ne__(self, other):
            return not (self == other) # opposite of eq

        def __lt__(self, other):
            return self._key < other._key # compare items based on their keys



In [6]:
class SortedTableMap(MapBase):
    """Map implementation using a sorted table."""

    #----------------------------- nonpublic behaviors -----------------------------
    def _find_index(self, k, low, high):
        """Return index of the leftmost item with key greater than or equal to k.
        
        Return high + 1 if no such item qualifies.
        
        That is, j will be returned such that:
            all items of slice table[low:j] have key < k
            all items of slice table[j:high+1] have key >= k
        """
        if high < low:
            return high + 1 # no element qualifies
        else:
            mid = (low + high) // 2
            if k == self._table[mid]._key:
                return mid # found exact match
            elif k < self._table[mid]._key:
                return self._find_index(k, low, mid - 1) # Note: may return mid
            else:
                return self._find_index(k, mid + 1, high) # answer is right of mid
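
To see what `_find_index` returns in the different cases, the same recursion can be tried on a plain list of keys. This is a standalone sketch: the function `find_index` and the sample keys are made up for the illustration and are not part of the class.

```python
def find_index(keys, k, low, high):
    """Return the index of the leftmost key >= k in keys[low:high+1].

    Returns high + 1 if no key in that range qualifies.
    """
    if high < low:
        return high + 1                                # empty range
    mid = (low + high) // 2
    if k == keys[mid]:
        return mid                                     # exact match
    elif k < keys[mid]:
        return find_index(keys, k, low, mid - 1)       # may end up returning mid
    else:
        return find_index(keys, k, mid + 1, high)      # answer is right of mid


keys = [2, 4, 5, 7, 8, 9, 12, 14]
for k in (7, 6, 1, 20):
    print(k, find_index(keys, k, 0, len(keys) - 1))
```

For a key that is missing, the result is the position where it would have to be inserted to keep the table sorted.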

2. public behaviors

In [7]:
    #----------------------------- public behaviors -----------------------------
    def __init__(self):
        """Create an empty map."""
        self._table = [ ]

    def __len__(self):
        """Return number of items in the map."""
        return len(self._table)

    def __getitem__(self, k):
        """Return value associated with key k (raise KeyError if not found)."""
        j = self._find_index(k, 0, len(self._table) - 1)
        if j == len(self._table) or self._table[j]._key != k:
            raise KeyError('Key Error: ' + repr(k))
        return self._table[j]._value
    

In [8]:
    def __setitem__(self, k, v):
        """Assign value v to key k, overwriting existing value if present."""
        j = self._find_index(k, 0, len(self._table) - 1)
        if j < len(self._table) and self._table[j]._key == k:
            self._table[j]._value = v # reassign value
        else:
            self._table.insert(j, self._Item(k,v)) # adds new item
    
    def __delitem__(self, k):
        """Remove item associated with key k (raise KeyError if not found)."""
        j = self._find_index(k, 0, len(self._table) - 1)
        if j == len(self._table) or self._table[j]._key != k:
            raise KeyError('Key Error: ' + repr(k))
        self._table.pop(j) # delete item

    def __iter__(self):
        """Generate keys of the map ordered from minimum to maximum."""
        for item in self._table:
            yield item._key
            

In [11]:
    def __reversed__(self):
        """Generate keys of the map ordered from maximum to minimum."""
        for item in reversed(self._table):
            yield item._key

    def find_min(self):
        """Return (key,value) pair with minimum key (or None if empty)."""
        if len(self._table) > 0:
            return (self._table[0]._key, self._table[0]._value)
        else:
            return None

    def find_max(self):
        """Return (key,value) pair with maximum key (or None if empty)."""
        if len(self._table) > 0:
            return (self._table[-1]._key, self._table[-1]._value)
        else:
            return None
    
    def find_ge(self, k):
        """Return (key,value) pair with least key greater than or equal to k."""
        j = self._find_index(k, 0, len(self._table) - 1) # j's key >= k
        if j < len(self._table):
            return (self._table[j]._key, self._table[j]._value)
        else:
            return None

In [12]:
    def find_lt(self, k):
        """Return (key,value) pair with greatest key strictly less than k."""
        j = self._find_index(k, 0, len(self._table) - 1) # j's key >= k
        if j > 0:
            return (self._table[j-1]._key, self._table[j-1]._value) # Note use of j-1
        else:
            return None

    def find_gt(self, k):
        """Return (key,value) pair with least key strictly greater than k."""
        j = self._find_index(k, 0, len(self._table) - 1) # j's key >= k
        if j < len(self._table) and self._table[j]._key == k:
            j += 1 # advanced past match
        if j < len(self._table):
            return (self._table[j]._key, self._table[j]._value)
        else:
            return None

    def find_range(self, start, stop):
        """Iterate all (key,value) pairs such that start <= key < stop.
        
        If start is None, iteration begins with minimum key of map.
        If stop is None, iteration continues through the maximum key of map.
        """
        if start is None:
            j = 0
        else:
            j = self._find_index(start, 0, len(self._table)-1) # find first result
        while j < len(self._table) and (stop is None or self._table[j]._key < stop):
            yield (self._table[j]._key, self._table[j]._value)
            j += 1
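
The inexact searches can be tried out with a compact, self-contained stand-in. Note that this sketch is not the `SortedTableMap` class above: `BisectSortedMap` is a hypothetical class that keeps two parallel lists and uses the standard `bisect` module in place of the recursive `_find_index`.

```python
from bisect import bisect_left, bisect_right


class BisectSortedMap:
    """Hypothetical compact sorted map built on the bisect module."""

    def __init__(self):
        self._keys = []      # always kept in increasing order
        self._values = []    # _values[j] belongs to _keys[j]

    def __setitem__(self, k, v):
        j = bisect_left(self._keys, k)
        if j < len(self._keys) and self._keys[j] == k:
            self._values[j] = v              # reassign value
        else:
            self._keys.insert(j, k)          # shifts later items to the right
            self._values.insert(j, v)

    def find_lt(self, k):
        """Return the pair with greatest key strictly less than k, or None."""
        j = bisect_left(self._keys, k)
        return (self._keys[j - 1], self._values[j - 1]) if j > 0 else None

    def find_gt(self, k):
        """Return the pair with least key strictly greater than k, or None."""
        j = bisect_right(self._keys, k)
        return (self._keys[j], self._values[j]) if j < len(self._keys) else None

    def find_range(self, start, stop):
        """Generate all pairs with start <= key < stop (None means unbounded)."""
        j = 0 if start is None else bisect_left(self._keys, start)
        while j < len(self._keys) and (stop is None or self._keys[j] < stop):
            yield (self._keys[j], self._values[j])
            j += 1


m = BisectSortedMap()
for k, v in [(12, 'c'), (5, 'a'), (20, 'd'), (8, 'b')]:
    m[k] = v

print(m.find_lt(8), m.find_gt(8))
print(list(m.find_range(8, 20)))   # 20 itself is excluded
```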
            

<center>
    
# 4. Running Time Analysis

<center>
<img src="img/Qimage-4.JPG" width="1000"/>

<center>
    
# 5. An Application of Sorted Maps

* Let's explore an application in which there is particular advantage to using a **sorted map** rather than a traditional **(unsorted) map**.

### Flight Databases

* There are several Web sites on the Internet that allow users to perform queries on flight databases to find flights between various cities.

* To make a query, a user specifies origin and destination cities, a departure date, and a departure time.

* To support such queries, we can model the flight database as a map, where keys are Flight objects that contain fields corresponding to these four parameters.

* And a key is a tuple:
        k = (origin,destination,date,time)

* Finding a requested flight is not simply a matter of finding an exact match for a requested query.
* Although a user typically wants to exactly match the origin and destination cities, he or she may have flexibility for the departure date, and certainly will have some flexibility for the departure time on a specific day.

* We can handle such a query by ordering our keys lexicographically.
* Then, an efficient implementation for a sorted map would be a good way to satisfy users’ queries.

1. Given a user query key k, we could call ``find_ge(k)`` to return the first flight between the desired cities, having a departure date and time matching the desired query or later.

2. Better yet, with well-constructed keys, we could use ``find_range(k1, k2)`` to find all flights within a given range of times.
 

**For example**

If ``k1 = (ORD, PVD, 05May, 09:30)`` and ``k2 = (ORD, PVD, 05May, 20:00)``, a call to ``find_range(k1, k2)`` might result in the following sequence of key-value pairs:

<center>
<img src="img/Qimage-5.JPG" width="1000"/>
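
The range query can be sketched with plain Python tuples, which already compare lexicographically. The flights below are made up for the illustration, and the standalone function `find_range` is a stand-in based on the standard `bisect` module rather than the class method shown earlier.

```python
from bisect import bisect_left

# Made-up flights: key = (origin, destination, date, time), value = a label.
flights = sorted([
    (('ORD', 'PVD', '05May', '09:53'), 'flight A'),
    (('ORD', 'PVD', '05May', '13:29'), 'flight B'),
    (('ORD', 'PVD', '05May', '17:39'), 'flight C'),
    (('ORD', 'PVD', '05May', '21:10'), 'flight D'),
    (('ORD', 'BOS', '05May', '10:00'), 'flight E'),
    (('ORD', 'PVD', '05May', '07:15'), 'flight F'),
])
keys = [key for key, _ in flights]


def find_range(k1, k2):
    """Generate all (key, value) pairs with k1 <= key < k2."""
    j = bisect_left(keys, k1)
    while j < len(keys) and keys[j] < k2:
        yield flights[j]
        j += 1


k1 = ('ORD', 'PVD', '05May', '09:30')
k2 = ('ORD', 'PVD', '05May', '20:00')
for key, value in find_range(k1, k2):
    print(key, value)
```

Zero-padded times such as `'09:30'` sort correctly as strings. A date written as `'05May'` does not sort chronologically across months, so a well-constructed key would use a form such as `'2023-05-05'` or a `datetime.date` object.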

<center>
    
# 6. Trees

* A **_tree_** is an abstract data type that stores elements hierarchically.
* With the exception of the top element, each element in a tree has a **_parent_** element and zero or more **_children_** elements.
* We typically call the top element the **_root_** of the tree.

<center>
<img src="img/Qimage-6.JPG" width="600"/>

Formally, we define a **_tree T_** as a set of **_nodes_** storing elements such that the nodes
have a **_parent-child_** relationship that satisfies the following properties:

* If $T$ is nonempty, it has a special node, called the **_root_** of $T$, that has no parent.


* Each node $v$ of $T$ different from the root has a unique **_parent_** node $w$; every node with parent $w$ is a **_child_** of $w$.

A tree can be empty, meaning that it does not have any nodes.

### Other Node Relationships

* Two nodes that are children of the same parent are **_siblings_**. 

* A node $v$ is **_external_** if $v$ has no children. 

* A node $v$ is **_internal_** if it has one or more children. 

* External nodes are also known as **_leaves_**.
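
These relationships can be illustrated on a tiny hypothetical tree stored as a dictionary that maps each node to the list of its children. The node labels and the helper names are made up for this sketch.

```python
# Hypothetical tree: A is the root, B and C are its children, D and E are B's.
children = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': [],
    'D': [],
    'E': [],
}

# Each node other than the root has exactly one parent.
parent = {child: p for p, kids in children.items() for child in kids}


def is_external(v):
    """A node is external (a leaf) if it has no children."""
    return len(children[v]) == 0


def is_internal(v):
    """A node is internal if it has one or more children."""
    return len(children[v]) > 0


def siblings(v):
    """Return the other children of v's parent (empty for the root)."""
    if v not in parent:
        return []
    return [c for c in children[parent[v]] if c != v]


root = [v for v in children if v not in parent]
leaves = [v for v in children if is_external(v)]
print(root, leaves, siblings('D'))
```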

Inheritance hierarchy for modeling various abstractions and implementations of tree data structures.



<center>
<img src="img/Qimage-7.JPG" width="900"/>

<center>
    
# 7. The Tree Abstract Data Type

* We define a tree ADT using the concept of a **_position_** as an abstraction for a node of a tree.


* An element is stored at each position, and positions satisfy **_parent-child_** relationships that define the tree structure.


* A position object for a tree supports the method:
<center>
<img src="img/Qimage-8.JPG" width="1000"/>


The tree ADT then supports the following **_accessor_** methods, allowing a user to navigate the various positions of a tree:


<center>
<img src="img/Qimage-9.JPG" width="900"/>

<center>
    
# 8. A Tree Abstract Base Class in Python

In [13]:
class Tree:
    """Abstract base class representing a tree structure."""

    #------------------------------- nested Position class -------------------------------
    class Position:
        """An abstraction representing the location of a single element."""

        def element(self):
            """Return the element stored at this Position."""
            raise NotImplementedError('must be implemented by subclass')

        def __eq__(self, other):
            """Return True if other Position represents the same location."""
            raise NotImplementedError('must be implemented by subclass')

        def __ne__(self, other):
            """Return True if other does not represent the same location."""
            return not (self == other) # opposite of eq
        

In [14]:
    # ---------- abstract methods that concrete subclass must support ----------
    def root(self):
        """Return Position representing the tree's root (or None if empty)."""
        raise NotImplementedError('must be implemented by subclass')

    def parent(self, p):
        """Return Position representing p's parent (or None if p is root)."""
        raise NotImplementedError('must be implemented by subclass')

    def num_children(self, p):
        """Return the number of children that Position p has."""
        raise NotImplementedError('must be implemented by subclass')

    def children(self, p):
        """Generate an iteration of Positions representing p's children."""
        raise NotImplementedError('must be implemented by subclass')

    def __len__(self):
        """Return the total number of elements in the tree."""
        raise NotImplementedError('must be implemented by subclass')

In [15]:
    # ---------- concrete methods implemented in this class ----------
    def is_root(self, p):
        """Return True if Position p represents the root of the tree."""
        return self.root() == p

    def is_leaf(self, p):
        """Return True if Position p does not have any children."""
        return self.num_children(p) == 0

    def is_empty(self):
        """Return True if the tree is empty."""
        return len(self) == 0

### Computing Depth and Height

* Let $p$ be the position of a node of a tree $T$.
* The **_depth_** of $p$ is the number of ancestors of $p$, excluding $p$ itself.

1. If $p$ is the root, then the depth of $p$ is 0.
2. Otherwise, the depth of $p$ is one plus the depth of the parent of $p$.

In [16]:
def depth(self, p):
    """Return the number of levels separating Position p from the root."""
    if self.is_root(p):
        return 0
    else:
        return 1 + self.depth(self.parent(p))
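
The same recursion can be run on a small hypothetical tree given as a parent map. Node labels stand in for positions here, which is a simplification of the `Position` abstraction.

```python
# Hypothetical tree as a parent map; the root has parent None.
parent = {'A': None, 'B': 'A', 'C': 'A', 'D': 'B', 'E': 'D'}


def depth(p):
    """Return the number of ancestors of p, excluding p itself."""
    if parent[p] is None:          # p is the root
        return 0
    return 1 + depth(parent[p])    # one more than the depth of p's parent


for node in parent:
    print(node, depth(node))
```

Each call walks one step towards the root, so the running time is proportional to the depth of `p`.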
    

**Quiz 1 - Trees**

What is the depth of the node **_International_**?

<center>
<img src="img/Qimage-6.JPG" width="500"/>
    
1) 1
    
2) 2
    
3) 3
    
4) 5
    
Please answer here: https://PollEv.com/multiple_choice_polls/MYoM7k2Vsi3hlvQOPxV3H/respond

The height of a position $p$ in a tree $T$ is also defined recursively:

* If $p$ is a leaf, then the height of $p$ is 0.
* Otherwise, the height of $p$ is one more than the maximum of the heights of $p$’s children.
* The height of a nonempty tree $T$ is the height of the root of $T$.

In [17]:
def _height(self, p): # time is linear in size of subtree
    """Return the height of the subtree rooted at Position p."""
    if self.is_leaf(p):
        return 0
    else:
        return 1 + max(self._height(c) for c in self.children(p))
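
As a quick check of the recursive definition, here is the same computation on a small hypothetical tree stored as a dictionary of children lists. Again, node labels stand in for positions.

```python
# Hypothetical tree: A has children B and C, B has child D, D has child E.
children = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': ['E'], 'E': []}


def height(p):
    """Return the height of the subtree rooted at p."""
    if not children[p]:                                   # p is a leaf
        return 0
    return 1 + max(height(c) for c in children[p])        # tallest child plus one


for node in children:
    print(node, height(node))
```

The height of the whole tree is `height('A')`, the height of its root.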
    

**Quiz 2 - Trees**

What is the height of the tree below?

<center>
<img src="img/Qimage-6.JPG" width="500"/>
    
1) 1
    
2) 4
    
3) 5
    
4) 6
    
Please answer here: https://PollEv.com/multiple_choice_polls/9tIoVcupcyqhKx3cKtc2R/respond

<center>
    
# 9. Exercises

**Ex.1**

Consider the following variant of the ``_find_index`` method in the context of the ``SortedTableMap`` class:

In [3]:
def _find_index(self, k, low, high):
    if high < low:
        return high + 1
    else:
        mid = (low + high) // 2
        if self._table[mid]._key < k:
            return self._find_index(k, mid + 1, high)
        else:
            return self._find_index(k, low, mid - 1)
        

Does this always produce the same result as the original version? Justify your answer.

## Thank you!