# **Lecture 3** #

### **Keys and values**

So far, we have been looking at data structures that store numbers. We saw data structures and algorithms for searching for a number, or for finding the min/max of a set of numbers. More generally, each number can be viewed as a **key** that is associated with other **values** stored along with it. Then, when we search for and find a key, we can also retrieve all the associated information.



### **Dictionaries: Already involving (key,val)**

Dictionaries handle (key,val) pairs by default:





In [0]:
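d = {}                 # named 'd' because 'dict' would shadow Python's built-in type
key = "Yiannis SSN"
val = "99-999-9999"

d[key] = val           # store val under key
d['Yiannis SSN']       # look the value up by its key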

'99-999-9999'
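
A key can also map to a whole record rather than to a single string. The following is a minimal sketch; the names `records` and `lookup`, and the sample data, are made up for illustration.

```python
# hypothetical example data: each key maps to a tuple of associated values
records = {
    "alice": ("office 101", 31),
    "bob":   ("office 207", 45),
}

def lookup(records, key):
  # dict.get returns None (instead of raising KeyError) when the key is absent
  return records.get(key)

print(lookup(records, "alice"))   # ('office 101', 31)
print(lookup(records, "carol"))   # None
```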

### **A simple class for a 'bundle' of data**

Can we use the following to solve the 'sort with memory' question from assignment 2?

In [0]:
class compositeElement:
  def __init__(self, key, val):
    self.key = key
    self.val = val


# we can now use composite elements and create lists of them
L=[]
L.append(compositeElement(key = 10, val = 0))
print(L[0].key)
print(L[0].val)
print()

L.append(compositeElement(key = 11, val = "alpha"))
print(L[1].key)
print(L[1].val)
print()


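# initialize a list of 10 elements with (key = -1, val = -1)
# note: [compositeElement(-1,-1)]*10 would store 10 references to the SAME object,
# so changing one element would change all of them; we build 10 distinct objects instead
K = [compositeElement(-1,-1) for _ in range(10)]

len(K)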

10
0

11
alpha



10
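
One way to read the 'sort with memory' question is: sort the keys, but remember where each key originally was. Under that assumption (the assignment text is not reproduced here), a composite element can carry the original position as its value. The sketch below uses a stand-in class `Pair` and Python's built-in `sorted`; both are illustrative choices, not necessarily the solution the assignment asks for.

```python
class Pair:
  # stand-in for compositeElement so that this cell runs on its own
  def __init__(self, key, val):
    self.key = key
    self.val = val

numbers = [30, 10, 20]

# bundle each number (key) with its original position (val)
L = [Pair(key=x, val=i) for i, x in enumerate(numbers)]

# sort by key only; each val travels along with its key
L_sorted = sorted(L, key=lambda e: e.key)

print([e.key for e in L_sorted])   # [10, 20, 30]
print([e.val for e in L_sorted])   # [1, 2, 0]
```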

### **Bundle of Data and Functionality on Data: Implementing a max heap** ###


In [0]:
class myMaxHeap:
  def __init__(self):
    self.H = []


  def heapInsert(self,x):
    n = len(self.H)
    self.H.append(x)   # append in last leaf (next available position in array/list)
    
    # now bubble up x
    pos = n                                  # current position of bubble-up
    while pos > 0:                           # the root (pos 0) has no parent; (0-1)//2 = -1 would index the last element
      parent_pos = (pos-1)//2 
      if self.H[parent_pos] < self.H[pos]:  
        self.H[pos] = self.H[parent_pos]     # copy parent value to current position
        self.H[parent_pos] = x               # move x to parent's position
        pos = parent_pos                     # update current position
      else:
        break                                # break the bubble-up loop

  # function for removing max element from heap
  # WARNING: This function is intentionally incomplete --
  #          You will fix this in the assignment 

  def heapMaxRemove(self):
    x = self.H.pop()                   # pop last element
    self.H[0] = x                      # put it in the place of max 
    
    # now bubble-down x
    pos = 0
    while True:
      c1_pos = 2*pos+1            # child 1 position
      c2_pos = 2*pos+2            # child 2 position
      if self.H[c1_pos] > self.H[c2_pos]:
        c_pos = c1_pos
      else:
        c_pos = c2_pos            # which child is active in possible swap
      if self.H[pos]< self.H[c_pos]:
        self.H[pos] = self.H[c_pos]         # swap 
        self.H[c_pos] = x 
        pos = c_pos               # update current position
      else:
        break                     # break 




In [0]:
myH = myMaxHeap()

myH.heapInsert(3)
myH.heapInsert(11)

print(myH.H)

myH2 = myMaxHeap()
myH2.heapInsert(-1)

print(myH2.H)

[11, 3]
[-1]
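
The index arithmetic used in the bubble-up (the parent of position `pos` sits at `(pos-1)//2`) also gives a quick way to test whether a list satisfies the max-heap property. The helper `is_max_heap` below is a hypothetical name introduced for this sketch; it can be used to check the list produced by a sequence of insertions.

```python
def is_max_heap(H):
  # every non-root element must be <= its parent at (pos-1)//2
  for pos in range(1, len(H)):
    parent_pos = (pos - 1) // 2
    if H[parent_pos] < H[pos]:
      return False
  return True

print(is_max_heap([11, 3]))       # True
print(is_max_heap([3, 11]))       # False: the child 11 is larger than its parent 3
print(is_max_heap([9, 7, 8, 1]))  # True
```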


### **An implementation of Prim's Algorithm**

We will now collect the building blocks needed for an implementation of Prim's algorithm. 

#### **Set membership**

One element that we will need is a simple data structure that maintains a set with up to $n$ elements. The set will consist
of numbers in range(n), and we assume that it can only grow,
i.e. no element is ever deleted. We will use it to store the nodes that are already part of the tree. We can make a small class using a 'bit vector' as follows:

In [0]:
class mySet:
  def __init__(self,n):
    self.S = [0]*n      # this initializes a list (bit-vector) of n elements all set to 0

  def insert(self,i):   # insert element i
    self.S[i] = 1    

  def isMember(self,i): # check if element i is in the set
    return self.S[i] == 1


#### **An edge container**

We will also need to store weighted edges in a data structure, and be able to retrieve the maximum weight edge. We discussed that the best way to do that is to use a max heap, which extracts the max in $O(\log n)$ time. Here I give a simpler but suboptimal list-based implementation, where each extraction takes $O(n)$ time.





In [0]:
# the following code assumes that each element stored in the data structure has a 'key' attribute
# the max will be found with respect to the key

class edgeBox:
  def __init__(self):
    self.L = []

  def insert(self,elem):
    self.L.append(elem)   # this simply appends the element at the end of L

  def extractMax(self):
    n = len(self.L)
    j_max = 0
    max_el = self.L[j_max]
    for j in range(1,n):                     # search linearly for the max
      if self.L[j].key > max_el.key:         # this is where the key attribute is used
        j_max = j
        max_el = self.L[j_max]
  
    self.L[j_max] = self.L[n-1]         # copy the last element into the previous max position
    self.L.pop()                        # pop the last element
    return max_el


  def extractMin(self):
    n = len(self.L)
    j_min = 0
    min_el = self.L[j_min]
    for j in range(1,n):                     # search linearly for the min
      if self.L[j].key < min_el.key:         # this is where the key attribute is used
        j_min = j
        min_el = self.L[j_min]
   
    self.L[j_min] = self.L[n-1]         # copy the last element into the previous min position
    self.L.pop()                        # pop the last element
    return min_el
    


  def size(self):                       # for tracking size
    return len(self.L)




In [0]:
# let's now use an edgeBox to see how it works

H = edgeBox()
e1 = compositeElement(key = 3, val = (1,4))    # (1,4) is a Python 'tuple', here representing edge (1,4), which is assumed to have edge weight 3
e2 = compositeElement(key = 2, val = (1,2))

# insert the two edges
H.insert(e1)
H.insert(e2)

x = H.extractMax()
print("The weight is", x.key)
print("Edge is", x.val)

print()

x = H.extractMax()
print("The weight is", x.key)
print("Edge is", x.val)




The weight is 3
Edge is (1, 4)

The weight is 2
Edge is (1, 2)
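
For comparison, here is a sketch of a heap-based container built on Python's standard `heapq` module. `heapq` implements a min heap, so the sketch stores negated weights to get max-first behavior. The class name `heapEdgeBox` and the use of plain `(weight, edge)` arguments instead of composite elements are assumptions made for this example.

```python
import heapq

class heapEdgeBox:
  def __init__(self):
    self.H = []

  def insert(self, weight, edge):
    # heapq is a min heap: pushing -weight makes the largest weight come out first
    heapq.heappush(self.H, (-weight, edge))

  def extractMax(self):
    neg_w, edge = heapq.heappop(self.H)
    return -neg_w, edge

  def size(self):
    return len(self.H)

B = heapEdgeBox()
B.insert(3, (1, 4))
B.insert(2, (1, 2))
B.insert(7, (2, 4))

print(B.extractMax())   # (7, (2, 4))
print(B.extractMax())   # (3, (1, 4))
print(B.size())         # 1
```

Both `insert` and `extractMax` take time logarithmic in the number of stored edges, compared to the linear scan of the list-based version.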


#### **Encoding the graph**

There are multiple ways to encode a graph. We will be using a variant of the adjacency list. The graph is encoded as a list of lists, where list $i$ contains the edges incident to node $i$. Each edge is a tuple $(i,j,w_{i,j})$. 

I will use as an example the graph from the notes, except that I will be using 0-indexing (the notes/whiteboard use 1-indexing). It is worth taking some time to understand how to access each edge and its weight, in order to see how the subsequent algorithm works:


In [0]:

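G = [ [(0,1,1),  (0,2,5), (0,3,11), (0,4,8)],    # edges incident to node 0
      [(1,0,1),  (1,2,2), (1,3,5),  (1,4,9)],    # edges incident to node 1
      [(2,0,5),  (2,1,2), (2,3,1),  (2,4,6)],    # edges incident to node 2
      [(3,0,11), (3,1,5), (3,2,1),  (3,4,8)],    # edges incident to node 3
      [(4,0,8),  (4,1,9), (4,2,6),  (4,3,8)] ]   # edges incident to node 4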


# print all incident edges of node #4
print(G[4])

# print the incident edge #3 of node #4 
print(G[4][3])

# print the weight of that edge
print(G[4][3][2])




[(4, 0, 8), (4, 1, 9), (4, 2, 6), (4, 3, 8)]
(4, 3, 8)
8
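
Writing the list of lists by hand is error-prone, since every undirected edge must appear twice, once for each endpoint. A sketch of a helper that builds the same encoding from a list of undirected edges follows; the function name `build_graph` and the `(i, j, w)` input format are assumptions for this example.

```python
def build_graph(n, edges):
  # G[i] will hold the tuples (i, j, w) for all edges incident to node i
  G = [[] for _ in range(n)]
  for (i, j, w) in edges:
    G[i].append((i, j, w))   # the edge as seen from i
    G[j].append((j, i, w))   # the same edge as seen from j
  return G

edges = [(0,1,1), (0,2,5), (1,2,2)]
G_small = build_graph(3, edges)

print(G_small[2])         # [(2, 0, 5), (2, 1, 2)]
print(G_small[2][1][2])   # 2
```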


We are now ready to give a function for finding the maximum weight spanning tree. The code differs slightly from what we saw in class: specifically, the initialization returns a single node from which the building of the tree starts (not an edge as in class).

In [0]:
def Prim_MST(G):
  n = len(G)      # the number of nodes

  # find the maximum weight edge by linear search
  # -- we assume that weights are positive
  max_w = 0
  for i in range(n):
    for j in range(len(G[i])):
      if G[i][j][2] > max_w: 
        v = G[i][j][0]
        max_w =  G[i][j][2]        

  # after the end of this loop: v is starting node incident to max_weight edge

  # initialize required data structures
  T =  []             # edges of the tree, stored as tuples (i, j, w)
  S = mySet(n)        # nodes that are already in the tree
  H = edgeBox()       # candidate edges
  S.insert(v)         # the starting node is in the tree from the beginning

  # initialize H with edges incident to v
  m_v = len(G[v])     # number of edges incident to v
  for k in range(m_v):
    edge = compositeElement( val = (G[v][k][0], G[v][k][1]), key = G[v][k][2] )
    H.insert(edge)  # this makes edges into composite elements inserted in H
      
 
  # stop when the tree has n-1 edges, or when no candidate edges remain (disconnected graph)
  while len(T) < n-1 and H.size() > 0:

    max_edge = H.extractMax() 
    (i, j) = max_edge.val      # unpack the edge endpoints i, j
    w = max_edge.key           # the edge weight is stored as the key
    
    # NOTE: This can be improved. See question in assignment
    
    if not S.isMember(i):      # check if i is not in tree
      v = i                    # the new node to add on T
      new_v_found = True       
    elif not S.isMember(j): 
      v = j                    # the new node to add on T
      new_v_found = True
    else:
      new_v_found = False      # both endpoints already in the tree: discard this edge
  
    
    if new_v_found:     # run only if a new node v has been found

      T.append((i, j, w))       # append new edge to T
      S.insert(v)               # insert new node to S

      m_v = len(G[v])   # number of edges incident to v
      # insert new edges to H
      for k in range(m_v):
        edge = compositeElement(val = (G[v][k][0], G[v][k][1]), key = G[v][k][2] )
        H.insert(edge)  # this makes edges into composite elements inserted in H 

  return T
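
A candidate tree can be sanity-checked independently of the algorithm that produced it. The sketch below assumes the tree is given as a list of `(i, j, w)` tuples over nodes `0..n-1` with `n >= 1`; the helper names `is_spanning_tree` and `total_weight` are introduced here for illustration. For the example graph above, the tree with edges (0,3), (0,4), (4,1), (4,2) has total weight 34.

```python
def is_spanning_tree(n, T):
  # a spanning tree on n nodes has exactly n-1 edges and connects all nodes
  if len(T) != n - 1:
    return False
  adj = [[] for _ in range(n)]
  for (i, j, w) in T:
    adj[i].append(j)
    adj[j].append(i)
  # depth-first search from node 0, counting the nodes that can be reached
  seen = [False] * n
  seen[0] = True
  stack = [0]
  count = 1
  while stack:
    u = stack.pop()
    for x in adj[u]:
      if not seen[x]:
        seen[x] = True
        count += 1
        stack.append(x)
  return count == n

def total_weight(T):
  return sum(w for (i, j, w) in T)

T_example = [(0, 3, 11), (0, 4, 8), (4, 1, 9), (4, 2, 6)]
print(is_spanning_tree(5, T_example))   # True
print(total_weight(T_example))          # 34
```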



### **Huffman Codes**

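We saw that Huffman code implementations are based on binary trees. Unlike heap trees, these trees are not necessarily complete (filled level by level, left to right), and so simple array-based addressing does not work. Instead, each node stores explicit references to its children.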

#### **A Class for Nodes**

In [0]:
class Node:
  def __init__(self,key):
    self.key = key
    self.lchild = None
    self.rchild = None 

In [0]:
# here is how to use this class to construct a small tree: a root with two children

r  = Node(0)

t =  Node(1)
r.lchild = t

t =  Node(2)
r.rchild = t

# let's access the key of the right child
r.rchild.key



2
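
With explicit child references, a tree is processed by following `lchild` and `rchild` recursively. As a sketch of how this connects to Huffman codes, the function below labels each left branch with '0' and each right branch with '1', and collects the resulting code of every leaf. The function name `leaf_codes` and the left=0/right=1 convention are assumptions for this example; the `Node` class is repeated so that the cell runs on its own.

```python
class Node:
  def __init__(self, key):
    self.key = key
    self.lchild = None
    self.rchild = None

def leaf_codes(node, prefix="", codes=None):
  # returns a dictionary {leaf key: code string}
  if codes is None:
    codes = {}
  if node.lchild is None and node.rchild is None:
    codes[node.key] = prefix            # reached a leaf: record its code
    return codes
  if node.lchild is not None:
    leaf_codes(node.lchild, prefix + "0", codes)
  if node.rchild is not None:
    leaf_codes(node.rchild, prefix + "1", codes)
  return codes

# a small tree: leaf 'a' on the left, an internal node with leaves 'b' and 'c' on the right
root = Node("root")
root.lchild = Node("a")
root.rchild = Node("inner")
root.rchild.lchild = Node("b")
root.rchild.rchild = Node("c")

print(leaf_codes(root))   # {'a': '0', 'b': '10', 'c': '11'}
```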