## DSU (Disjoint Set Union):
##### We are given several elements, each of which initially forms its own set. A DSU supports an operation that merges any
##### two sets, and it can tell which set a given element belongs to.
##### The classical version also provides a third operation, which creates a set from a new element.



### It consists of the following operations:

### make_set(v) - creates a new set consisting of the single element v
### union_sets(a,b) - merges the set containing element a with the set containing element b
### find_set(v) - returns the representative (also called the leader) of the set containing v

### Each set is represented by a tree whose root is the representative of the set; parent[v] stores the parent of v in that tree



## Naive implementation:


In [16]:
parent = {}                  # maps each element to its parent in the tree

def make_set(v):
    parent[v] = v

def find_set(v):             # O(n) in the worst case (a long chain), hence inefficient
    if v == parent[v]:
        return v
    else:
        return find_set(parent[v])

def union_sets(a, b):
    a = find_set(a)
    b = find_set(b)
    if not (a == b):
        parent[b] = a
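
The worst case mentioned in the comment above can be reproduced with a short sketch. It assumes `parent` is a plain dict (the notebook does not show its declaration), and `depth` is a hypothetical helper added only to count parent links.

```python
# Assumption: `parent` is a plain dict; the notebook does not show its declaration.
parent = {}

def make_set(v):
    parent[v] = v

def find_set(v):
    if v == parent[v]:
        return v
    return find_set(parent[v])

def union_sets(a, b):
    a = find_set(a)
    b = find_set(b)
    if a != b:
        parent[b] = a

def depth(v):
    # Hypothetical helper: counts the parent links between v and its root.
    steps = 0
    while v != parent[v]:
        v = parent[v]
        steps += 1
    return steps

n = 5
for v in range(n):
    make_set(v)

# Each union hangs the previous root under the new element,
# producing the chain 0 -> 1 -> 2 -> 3 -> 4.
for v in range(1, n):
    union_sets(v, v - 1)

print(find_set(0))  # 4
print(depth(0))     # 4
```

With n elements merged in this order, `find_set(0)` has to follow n - 1 links, which is where the linear bound comes from.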

## Path Compression:
##### Path compression speeds up find_set.
#### If the elements form a chain-like structure, a single find_set call takes O(n) time in the worst case.
#### To speed this up, find_set re-attaches every node it visits directly to the representative of the set.
#### With path compression alone, the time per call drops to O(log n) on average.
![Path-comp](DSU_path_compression.png)

In [17]:
def find_set(v):
    if not (parent[v] == v):
        parent[v] = find_set(parent[v])
    return parent[v]    
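
The recursive version can hit CPython's default recursion limit (1000 frames) on a very long chain. As a hedged alternative, here is a two-pass iterative sketch of the same idea; the dict-based `parent` and the six-element chain are assumptions made for the demonstration.

```python
# Assumption: `parent` is a dict; start with a chain 0 -> 1 -> 2 -> 3 -> 4 -> 5.
parent = {v: v for v in range(6)}
for v in range(5):
    parent[v] = v + 1

def find_set(v):
    # First pass: walk up to locate the root.
    root = v
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[v] != root:
        parent[v], v = root, parent[v]
    return root

print(find_set(0))  # 5
print(parent)       # {0: 5, 1: 5, 2: 5, 3: 5, 4: 5, 5: 5}
```

After one call, every node on the path is a direct child of the root, so later lookups from those nodes take a single step.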

## Union by size / rank
##### This optimization decides which tree is attached to the other one during a union.
##### Two heuristics are common: using the size of the tree as its rank, or using the depth of the tree.
##### With depth, the rank is only an upper bound on the real depth, since path compression can shorten the tree
##### without the rank being updated.

##### Both optimizations are equivalent in terms of time and space complexity.

### 1. Union by size :

In [18]:
size = {}                    # size[v] is meaningful only while v is a root

def make_set(v):
    parent[v] = v
    size[v] = 1

def union_sets(a, b):
    a = find_set(a)          # find_set with path compression
    b = find_set(b)
    if a != b:
        if size[a] < size[b]:
            a, b = b, a      # make a the root of the larger tree
        parent[b] = a
        size[a] += size[b]
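
A side benefit of union by size is that the size of any component is available at its root. The sketch below wraps the same logic in a class; the class name `DSUBySize`, the list-based storage, the boolean return value of `union_sets`, and the `component_size` method are illustrative assumptions, not part of the notebook.

```python
class DSUBySize:
    """Illustrative sketch: union by size over elements 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n  # valid only at roots

    def find_set(self, v):
        if self.parent[v] != v:
            self.parent[v] = self.find_set(self.parent[v])  # path compression
        return self.parent[v]

    def union_sets(self, a, b):
        a, b = self.find_set(a), self.find_set(b)
        if a == b:
            return False  # already in the same set
        if self.size[a] < self.size[b]:
            a, b = b, a   # attach the smaller tree under the larger one
        self.parent[b] = a
        self.size[a] += self.size[b]
        return True

    def component_size(self, v):
        return self.size[self.find_set(v)]


dsu = DSUBySize(5)
dsu.union_sets(0, 1)
dsu.union_sets(2, 3)
dsu.union_sets(1, 3)
print(dsu.component_size(3))  # 4
print(dsu.component_size(4))  # 1
```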
            

### 2. Union by rank:
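##### Here rank is an upper bound on the depth of the tree. It increases by 1 only when two trees of equal rank are merged; otherwise the tree with the smaller rank is attached under the root with the larger rank and no rank changes.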

In [19]:
rank = {}                    # rank[v] is an upper bound on the depth of the tree rooted at v

def make_set(v):
    parent[v] = v
    rank[v] = 0

def union_sets(a, b):
    a = find_set(a) #from path compression
    b = find_set(b)
    if not(a == b):
        if rank[a] < rank[b]:
            a, b = b, a #swap 
        parent[b] = a
        if rank[a] == rank[b]:
            rank[a] += 1
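
To see when the rank actually changes, here is a small self-contained trace. The dict-based `parent` and `rank` and the five-element universe are assumptions for the demonstration.

```python
parent, rank = {}, {}  # assumption: both are plain dicts

def make_set(v):
    parent[v] = v
    rank[v] = 0

def find_set(v):
    if parent[v] != v:
        parent[v] = find_set(parent[v])
    return parent[v]

def union_sets(a, b):
    a = find_set(a)
    b = find_set(b)
    if a != b:
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = a
        if rank[a] == rank[b]:
            rank[a] += 1

for v in range(5):
    make_set(v)

union_sets(0, 1)  # ranks 0 and 0 are equal, so rank[0] becomes 1
union_sets(2, 3)  # ranks 0 and 0 are equal, so rank[2] becomes 1
union_sets(0, 2)  # ranks 1 and 1 are equal, so rank[0] becomes 2
union_sets(0, 4)  # ranks 2 and 0 differ, so no rank changes

print(rank[0])      # 2
print(find_set(4))  # 0
```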
            

## Time complexity:

##### If we combine both optimizations (path compression with union by size / rank), queries take nearly constant time.
##### The final amortized time complexity is O(α(n)) per operation, where α(n) is the inverse Ackermann function,
##### which grows very slowly. In fact, it grows so slowly
##### that it does not exceed 4 for all reasonable n (approximately n < 10^600).

## Support distances up to the representative / path length to the root of the set
##### In some applications of the DSU you need to maintain the distance between a vertex and the representative of its set (i.e. the path length in the tree from the current node to the root of the tree). Path compression would normally destroy this information, so each node stores the distance to its parent, and the distances are added up whenever a path is compressed. Examples include finding the shortest path in a maze.

In [20]:
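parent, rank = {}, {}        # parent[v] is a (parent, distance) tuple in this variant

def make_set(v):
    parent[v] = (v, 0)       # (parent of v, distance from v to that parent)
    rank[v] = 0

def find_set(v):
    # Returns (representative, distance from v to the representative).
    p, dist = parent[v]
    if p != v:
        root, up = find_set(p)            # up = distance from p to the root
        parent[v] = (root, dist + up)     # attach v to the root and add up the distances
    return parent[v]

def union_sets(a, b):
    a = find_set(a)[0]
    b = find_set(b)[0]
    if a != b:
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = (a, 1)                # the old root b now hangs one edge below a
        if rank[a] == rank[b]:
            rank[a] += 1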
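
A minimal usage sketch of the distance-tracking idea follows. It stores the distance in a separate list instead of a tuple; the class name `DistanceDSU` and that storage layout are assumptions chosen to keep the example short.

```python
class DistanceDSU:
    """Illustrative sketch: tracks each node's depth in the DSU tree."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.dist = [0] * n  # dist[v]: number of edges from v to parent[v]
        self.rank = [0] * n

    def find_set(self, v):
        # Returns (root, number of edges from v to the root).
        if self.parent[v] == v:
            return v, 0
        root, up = self.find_set(self.parent[v])
        self.parent[v] = root
        self.dist[v] += up  # old distance to parent plus parent's distance to root
        return root, self.dist[v]

    def union_sets(self, a, b):
        a, _ = self.find_set(a)
        b, _ = self.find_set(b)
        if a != b:
            if self.rank[a] < self.rank[b]:
                a, b = b, a
            self.parent[b] = a
            self.dist[b] = 1  # the old root b hangs one edge below a
            if self.rank[a] == self.rank[b]:
                self.rank[a] += 1


dsu = DistanceDSU(4)
dsu.union_sets(0, 1)  # 1 hangs under 0
dsu.union_sets(2, 3)  # 3 hangs under 2
dsu.union_sets(1, 3)  # root 2 hangs under root 0
print(dsu.find_set(3))  # (0, 2)
print(dsu.find_set(1))  # (0, 1)
```

Note that the reported value is the depth the node would have in the DSU tree without compression, not a distance in any original graph.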

## Support the parity of the path length / Checking bipartiteness online
#### The question to answer: is the connected component containing a given vertex bipartite?
##### To solve this problem, we keep a DSU that stores the components, and for each vertex we store the parity of the path length up to its representative. This lets us quickly check whether adding an edge violates bipartiteness: if both ends of the edge lie in the same connected component and have the same parity of path length to the leader, then adding this edge produces a cycle of odd length, and the component loses the bipartiteness property.
![Bi-Graph](bigraph.png)
##### Let's derive a formula for the parity assigned to the leader of the set that gets attached to another set. Let x be the parity of the path length from vertex a up to its leader A, y the parity of the path length from vertex b up to its leader B, and t the desired parity that we have to assign to B after the merge. The path from B to A consists of three parts: from B to b, from b to a (a single edge, so parity 1), and from a to A. Therefore we obtain the formula
##### t = x ^ y ^ 1, where ^ represents XOR
##### Thus, regardless of how many joins we perform, the parity of the edges is carried from one leader to another.

In [8]:
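parent, rank, bipartite = {}, {}, {}

def make_set(v):
    parent[v] = (v, 0)                   # (parent of v, parity of the path from v to that parent)
    rank[v] = 0
    bipartite[v] = True

def find_set(v):
    # Returns (representative, parity of the path length from v to the representative).
    p, parity = parent[v]
    if p != v:
        root, up = find_set(p)           # up = parity of the path from p to the root
        parent[v] = (root, parity ^ up)
    return parent[v]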

def add_edge(a, b):     # add edge from a to b
    pa = find_set(a)
    a = pa[0]
    x = pa[1]
    
    pb = find_set(b)
    b = pb[0]
    y = pb[1]
    
    if a == b:
        if x == y:
            bipartite[a] = False
    else:
        if rank[a] < rank[b]:
            a, b = b, a
        parent[b] = (a, x^y^1)
        bipartite[a] &= bipartite[b]
        if rank[a] == rank[b]:
            rank[a] += 1
            
def is_bipartite(v):
    return bipartite[find_set(v)[0]]
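
The following sketch exercises the parity idea on a triangle, which is the smallest odd cycle. The class name `BipartiteDSU` and the list-based storage are assumptions made for the example; the merge rule is the `x ^ y ^ 1` formula derived above.

```python
class BipartiteDSU:
    """Illustrative sketch: online bipartiteness check over vertices 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.parity = [0] * n        # parity of the path from v to parent[v]
        self.rank = [0] * n
        self.bipartite = [True] * n  # valid only at roots

    def find_set(self, v):
        # Returns (root, parity of the path length from v to the root).
        if self.parent[v] == v:
            return v, 0
        root, up = self.find_set(self.parent[v])
        self.parent[v] = root
        self.parity[v] ^= up
        return root, self.parity[v]

    def add_edge(self, a, b):
        a, x = self.find_set(a)
        b, y = self.find_set(b)
        if a == b:
            if x == y:               # same parity on both ends: odd cycle
                self.bipartite[a] = False
            return
        if self.rank[a] < self.rank[b]:
            a, b = b, a
        self.parent[b] = a
        self.parity[b] = x ^ y ^ 1
        self.bipartite[a] = self.bipartite[a] and self.bipartite[b]
        if self.rank[a] == self.rank[b]:
            self.rank[a] += 1

    def is_bipartite(self, v):
        return self.bipartite[self.find_set(v)[0]]


dsu = BipartiteDSU(5)
dsu.add_edge(0, 1)
dsu.add_edge(1, 2)
print(dsu.is_bipartite(0))  # True: a path is bipartite
dsu.add_edge(2, 0)          # closes the triangle 0-1-2
print(dsu.is_bipartite(1))  # False
dsu.add_edge(3, 4)
print(dsu.is_bipartite(3))  # True: the other component is unaffected
```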

## Example : check if an undirected graph contains a cycle:

In [17]:
from collections import defaultdict as dd
class graph():
    class _dsu():
        def __init__(self, n):
            self.parent = list(range(n))
            self.rank = [0] * n
              
        def find_set(self, node):        #find representative/parent of a node
            if not (node == self.parent[node]):
                self.parent[node] = self.find_set(self.parent[node])
            return self.parent[node]
            
        def union(self, a, b):
            a = self.find_set(a)
            b = self.find_set(b)
            if not (a == b):
                if self.rank[a] < self.rank[b]:
                    a, b = b, a
                self.parent[b] = a
                if self.rank[a] == self.rank[b]:
                    self.rank[a] += 1
            
    def __init__(self, n):
        self.g = dd(list)
        self.size = n
        self.dsu = graph._dsu(n)
        
    def add_edge(self, u, v):
        self.g[u].append(v)   #adjacency list
    
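    def is_cyclic(self):
        # Each undirected edge must have been added exactly once via add_edge.
        self.dsu = graph._dsu(self.size)   # start from singletons so repeated calls agree
        for i in self.g:
            for j in self.g[i]:
                x = self.dsu.find_set(i)
                y = self.dsu.find_set(j)
                if x == y:                 # both ends already connected: this edge closes a cycle
                    return True
                self.dsu.union(x, y)
        return False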
    

g = graph(3)
g.add_edge(0,1)
g.add_edge(1,2)
#g.add_edge(2,0)
g.is_cyclic()

False
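
When only an edge list is available, the same check fits in one function. This is a compact sketch under a few assumptions: the name `has_cycle` is hypothetical, vertices are numbered 0..n-1, each undirected edge appears once, and path halving is used in place of the recursive compression and rank shown above.

```python
def has_cycle(n, edges):
    """Return True if the undirected graph on vertices 0..n-1 contains a cycle."""
    parent = list(range(n))

    def find_set(v):
        # Path halving: every visited node is re-pointed to its grandparent.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find_set(u), find_set(v)
        if ru == rv:
            return True  # u and v were already connected
        parent[rv] = ru
    return False


print(has_cycle(3, [(0, 1), (1, 2)]))          # False
print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # True
```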

## Example : Lexicographically minimal string
[Problem_Link](https://www.hackerearth.com/practice/data-structures/disjoint-data-strutures/basics-of-disjoint-data-structures/practice-problems/algorithm/lexicographically-minimal-string-6edc1406/description/)

In [27]:
import string
from abc import ABC, abstractmethod

# Position of each lowercase letter in the alphabet, used to compare representatives.
lkp = {ch: i for i, ch in enumerate(string.ascii_lowercase)}

class DSU(ABC):

    def __init__(self):
        # Every lowercase letter starts as its own set.
        self.parent = {ch: ch for ch in string.ascii_lowercase}

    def find_set(self, node):
        if node != self.parent[node]:
            self.parent[node] = self.find_set(self.parent[node])
        return self.parent[node]

    @abstractmethod
    def _union(self, a, b):
        pass

class minimal_string(DSU):

    def __init__(self, a, b):
        super().__init__()
        self.a = a
        self.b = b
        self._union(a, b)

    def _union(self, a, b):
        # a[i] and b[i] are equivalent; the smallest letter always becomes the representative.
        for ca, cb in zip(a, b):
            u = self.find_set(ca)
            v = self.find_set(cb)
            if u != v:
                if lkp[v] < lkp[u]:
                    u, v = v, u
                self.parent[v] = u

    def transform(self, c):
        # Replace each character with the smallest letter in its equivalence class.
        return ''.join(self.find_set(ch) for ch in c)

In [28]:
mdsu = minimal_string('xyzpqrabcdf', 'yzaqrsbcdef')
mdsu.transform('yyzzxxxqppqabcddce')

'aaaaaaappppaaaaaaa'