### Question 1
#### a) Convert the pseudocode implementation of disjoint-set forests given in the lecture slides into valid Python code. Use it to make 1,000 sets and perform a random mix of 500,000 union and find operations on them, with random parameters. Report the resulting running time. (4P)

In [1]:
# Define disjoint set data structure and its operations
class Node: # Node class
    def __init__(self):
        self.parent = self # Initially a node is its own parent
        self.rank = 0 # Rank of the node

def make_set(x): # Make set of x 
    x.parent = x # A new set's parent is itself
    x.rank = 0 # Rank of x is 0 

def find_set(x): # Find the representative (root) of x's set, with path compression
    if x is not x.parent: # x is not the root of its tree
        x.parent = find_set(x.parent)  # Point x directly at the root
    return x.parent # For a root, x.parent is x itself

def union(x, y): # Union two sets x and y
    link(find_set(x), find_set(y)) # Link the root parents of the two sets

def link(x, y): # Link two sets x and y
    if x.rank > y.rank: # If rank of x is greater than rank of y
        y.parent = x # Make x the parent of y
    else: 
        x.parent = y # Make y the parent of x
        if x.rank == y.rank: # If ranks are equal, increment the rank of y
            y.rank += 1


In [2]:
import random
import time

In [3]:
# Create 1000 nodes and make set of each node
n = 1000
nodes = [Node() for _ in range(n)] 

# Perform 500000 random union and find operations
operations = 500000
start_time = time.time()

for _ in range(operations):
    op_type = random.choice(["union", "find"])
    a = random.randint(0, n - 1)
    b = random.randint(0, n - 1)

    if op_type == "union":
        union(nodes[a], nodes[b])
    else:  # "find"
        find_set(nodes[a])

end_time = time.time()

# Report the running time
print(f"Time taken with path compression and union by rank: {end_time - start_time:.2f} seconds")

Time taken with path compression and union by rank: 1.58 seconds


#### b) Remove the path compression. Repeat the experiment and report the resulting running time. (2P)

In [None]:
def find_set_no_compression(x): # Find set of x without path compression
    if x != x.parent:
        return find_set_no_compression(x.parent)
    return x.parent

def union_no_compression(x, y): # Union two sets x and y without path compression
    link(find_set_no_compression(x), find_set_no_compression(y))

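# Start again from 1000 fresh singleton sets: the nodes from part a) are
# already merged and path-compressed, which would distort this measurement
nodes = [Node() for _ in range(n)]

# Perform 500000 random union and find operations without path compression
start_time = time.time()
for _ in range(operations):
    op_type = random.choice(["union", "find"])
    a = random.randint(0, n - 1)
    b = random.randint(0, n - 1)

    if op_type == "union":
        union_no_compression(nodes[a], nodes[b])
    else:  # "find"
        find_set_no_compression(nodes[a])
end_time = time.time()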

print(f"Time without path compression: {end_time - start_time:.2f} seconds")

Time without path compression: 1.85 seconds
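
Without path compression the trees are never flattened, so a recursive find can in principle run into Python's recursion limit (1000 by default in CPython) on a long parent chain. A minimal sketch of an iterative alternative follows; the name `find_set_iterative` and the artificial 5000-node chain are assumptions made for illustration and are not part of the lecture pseudocode.

```python
class Node:
    """Same shape as the Node class above: a parent pointer plus a rank."""
    def __init__(self):
        self.parent = self
        self.rank = 0

def find_set_iterative(x):
    """Walk parent pointers up to the root without modifying the tree."""
    while x is not x.parent:
        x = x.parent
    return x

# Build a degenerate chain of 5000 nodes: chain[i].parent is chain[i - 1]
chain = [Node() for _ in range(5000)]
for child, parent in zip(chain[1:], chain):
    child.parent = parent

root = find_set_iterative(chain[-1])
print(root is chain[0])  # True
```

The loop does the same work as `find_set_no_compression` but uses constant stack space, so the depth of the tree only affects the running time.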


#### c) Now, additionally remove the union-by-rank heuristic. Instead, Link should always make the representative of the second set a child of the first set's representative. Repeat the experiment and report the resulting running time. (2P)

In [6]:
def link_no_rank(x, y): # Link two sets x and y without rank
    y.parent = x # The second representative always becomes a child of the first

def union_no_rank(x, y): # Union two sets x and y without rank
    link_no_rank(find_set_no_compression(x), find_set_no_compression(y))

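# Start again from 1000 fresh singleton sets so that earlier experiments
# do not influence the shape of the trees
nodes = [Node() for _ in range(n)]

# Perform 500000 random union and find operations without union by rank and path compression
start_time = time.time()
for _ in range(operations):
    op_type = random.choice(["union", "find"])
    a = random.randint(0, n - 1)
    b = random.randint(0, n - 1)

    if op_type == "union":
        union_no_rank(nodes[a], nodes[b])
    else:  # "find"
        find_set_no_compression(nodes[a])
end_time = time.time()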

print(f"Time without union by rank and path compression: {end_time - start_time:.2f} seconds")

Time without union by rank and path compression: 1.86 seconds
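
The three timings are easier to compare when every variant starts from fresh singleton sets and replays the same sequence of operations. The following is a sketch of such a harness under stated assumptions, not the lecture's implementation: it stores the forest in plain `parent` and `rank` lists instead of `Node` objects, uses an iterative find, skips the link when both elements already share a root, and fixes the random seed at 0. The helper names `make_forest`, `find`, `union` and `run` are hypothetical.

```python
import random
import time

def make_forest(n):
    """Return parent and rank lists describing n singleton sets."""
    return list(range(n)), [0] * n

def find(parent, x, compress):
    """Return the root of x; optionally point every node on the path at it."""
    root = x
    while parent[root] != root:
        root = parent[root]
    if compress:
        while parent[x] != root:
            nxt = parent[x]
            parent[x] = root
            x = nxt
    return root

def union(parent, rank, x, y, compress, by_rank):
    """Merge the sets containing x and y."""
    rx, ry = find(parent, x, compress), find(parent, y, compress)
    if rx == ry:
        return  # already in the same set
    if by_rank:
        if rank[rx] > rank[ry]:
            parent[ry] = rx
        else:
            parent[rx] = ry
            if rank[rx] == rank[ry]:
                rank[ry] += 1
    else:
        parent[ry] = rx  # second representative becomes a child of the first

def run(ops, n, compress, by_rank):
    """Replay ops on a fresh forest and return (elapsed seconds, parent list)."""
    parent, rank = make_forest(n)
    start = time.perf_counter()
    for is_union, a, b in ops:
        if is_union:
            union(parent, rank, a, b, compress, by_rank)
        else:
            find(parent, a, compress)
    return time.perf_counter() - start, parent

n, num_ops = 1000, 500_000
rng = random.Random(0)  # the seed value is an arbitrary choice for repeatability
ops = [(rng.random() < 0.5, rng.randrange(n), rng.randrange(n))
       for _ in range(num_ops)]

for label, compress, by_rank in [
    ("path compression + union by rank", True, True),
    ("union by rank only", False, True),
    ("neither heuristic", False, False),
]:
    elapsed, _ = run(ops, n, compress, by_rank)
    print(f"{label}: {elapsed:.2f} seconds")
```

With only 1000 elements, all sets merge into one after a small fraction of the 500,000 operations, so most of the measured time is spent on find operations in a single tree. This may explain why the three measurements end up close together.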
