# General Trees & Union/Find Problem
https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/GenTreeIntro.html

## Table of Contents
- **[General Trees](#general)**<br>
- **[Union/Find Problem](#union)**<br>
- **[Parent Pointer Trees](#ppt)**<br>
- **[Parent Pointer Trees Implementation](#pptimp)**<br>

<a id="general"></a>

## General Trees
- many organizations are hierarchical in nature
    - military, most businesses, governments, etc.
- a binary tree is not adequate to represent an organization in which a node may have many subordinates
- to represent a hierarchy in which each node may have an arbitrary number of children, we use general trees
- general trees are trees whose internal nodes have no fixed number of children
- the following figure depicts a general tree
<img width="400px" src="./resources/generaltrees.png">

### General Tree Definitions and Terminology
- a tree $T$ is a finite set of one or more nodes, one of which is a special node $R$ called the root
- the remaining nodes are partitioned into disjoint **subtrees**, each rooted at a child of $R$
    - subtrees are arranged from left to right
- a node's **out degree** is the number of children for that node
- a **forest** is a collection of one or more trees
- each node (except the root) has precisely one parent
    - a tree with $n$ nodes must have $n-1$ edges because each node, except the root, has one edge connecting that node to its parent
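
The terminology above can be illustrated with a small child-list sketch. This is only meant to make the definitions concrete, not to be a full general tree implementation; the `GTNode` type and the helper names are hypothetical and do not come from the original notes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type used only for illustration.
struct GTNode {
    int value;
    std::vector<GTNode*> children;  // children ordered left to right
};

// Out degree of a node = number of its children.
std::size_t outDegree(const GTNode& node) {
    return node.children.size();
}

// Count the nodes in the tree rooted at `root`.
std::size_t countNodes(const GTNode& root) {
    std::size_t total = 1;  // the root itself
    for (const GTNode* child : root.children) total += countNodes(*child);
    return total;
}

// Count the edges: every child contributes exactly one edge to its parent.
std::size_t countEdges(const GTNode& root) {
    std::size_t total = root.children.size();
    for (const GTNode* child : root.children) total += countEdges(*child);
    return total;
}
```

For any tree built from such nodes, `countEdges(root)` should equal `countNodes(root) - 1`, matching the $n-1$ edge property stated above.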
    
## Implementation
- implementing a general tree is considerably harder than implementing a binary tree and is not covered here

<a id="union"></a>

## The Union/Find Problem
https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/UnionFind.html

### Find: determine whether two objects are in the same set
- MST: given two nodes, are they in the same tree?

### Union: efficiently merge two sets into one
- MST: merge two disjoint trees into one

- Kruskal's minimum spanning tree (MST) algorithm uses the Union/Find technique
- what data structure can efficiently implement Union/Find operations?

<a id="ppt"></a>

## Parent Pointer Trees
- a simple way to represent a general tree
    - for each node, store only a pointer to that node's parent
    - called the **parent pointer representation**
- well suited to the Union/Find problem because it supports the two basic operations:
    1. determine if two objects are in the same set (the **FIND** operation)
        - follow the series of parent pointers from each node to its respective root
        - if both nodes have the same root, they belong to the same tree
        - FIND is faster when the trees are kept as short as possible
     
    2. merge two disjoint sets together (disjoint sets have an empty intersection)
        - the disjoint sets are united (the **UNION** operation)
        - typically by making the root of one tree the parent of the root of the other
        - the goal is to keep the height of the merged tree small
- these two operations together go by the name **UNION/FIND**
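
The two operations described above can be sketched directly on a parent array. This is a minimal sketch under two assumptions that are not fixed by the text at this point: a root is marked by a parent value of `-1`, and the function names (`findRoot`, `sameSet`, `naiveUnion`) are made up for illustration. The union here does nothing to control tree height.

```cpp
#include <vector>

// Follow parent pointers until reaching a root (assumed to be marked by -1).
int findRoot(const std::vector<int>& parent, int node) {
    while (parent[node] != -1) node = parent[node];
    return node;
}

// FIND-style query: two nodes are in the same set if they share a root.
bool sameSet(const std::vector<int>& parent, int a, int b) {
    return findRoot(parent, a) == findRoot(parent, b);
}

// Naive UNION: make the root of a's tree the parent of the root of b's tree.
void naiveUnion(std::vector<int>& parent, int a, int b) {
    int rootA = findRoot(parent, a);
    int rootB = findRoot(parent, b);
    if (rootA != rootB) parent[rootB] = rootA;
}
```

Because `naiveUnion` always attaches the second root under the first, an unlucky sequence of unions can produce a long chain, which is the situation the height-reducing techniques are meant to avoid.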

<a id="pptimp"></a>

## Parent Pointer Tree Implementation
- represented using a single array
- each index represents a node and the element stored there is the index of its parent
    - a single array can hold a whole collection of trees (a forest)
- use the path compression and weighted union techniques
    - they keep the height of the merged tree as small as possible
<img src="resources/ParentPtrTree.png" width="30%">

In [1]:
// a simplified demonstration of parent pointer tree
#include <iostream>
#include <vector>

using namespace std;

In [2]:
// represent the above tree using parent pointer implementation
vector<int> parent(10, -1); //initialize parent vector of 10 nodes with -1
// can also initialize parent of a node at index i to itself
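
The alternative mentioned in the comment above, where each node starts out as its own parent, might look like the following. This is a sketch of that assumed convention only; the rest of this notebook keeps using `-1` for roots, and the names `makeSelfParented` and `findSelfParented` are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Assumed alternative convention: a root is a node that is its own parent.
std::vector<int> makeSelfParented(int n) {
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);  // parent[i] = i
    return parent;
}

// Under this convention the root test becomes parent[node] == node.
int findSelfParented(const std::vector<int>& parent, int node) {
    while (parent[node] != node) node = parent[node];
    return node;
}
```

The trade-off is mostly stylistic: `-1` makes roots easy to spot when printing the array, while the self-parent convention avoids a sentinel value.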

In [3]:
parent[0] = 5;
parent[1] = 0;
parent[2] = 0;
parent[3] = 5;
parent[4] = 3;
//parent[5] = -1;
parent[6] = 5;
parent[7] = 2;
parent[8] = 5;
// parent[9] = -1


In [4]:
// recursive function to print path in reverse order from node to its root
void printPathReverse(vector<int> &parent, int node) {
    cout << node << " ";
    if (parent[node] == -1) return;
    printPathReverse(parent, parent[node]);
}

In [5]:
// print path from H (node 7) up to its root
printPathReverse(parent, 7);

7 2 0 5 

In [6]:
// recursive function to print path from the root to the given node
void printPath(vector<int> &parent, int node) {
    if (node == -1) return;
    printPath(parent, parent[node]);
    cout << node << " ";
}

In [7]:
// print path from root to H
printPath(parent, 7);

5 0 2 7 

In [8]:
// recursively find root without compressing path
int find(vector<int> &parent, int node) {
    if (parent[node] == -1) return node;
    return find(parent, parent[node]);
}

In [9]:
// find root of H
cout << find(parent, 7);

5

In [10]:
// check parents of all the nodes in path to H;
// still the same as path has not been compressed
cout << parent[2] << endl; // C
cout << parent[0] << endl; //A

0
5



## Do nodes J and H belong to the same tree?

In [11]:
// find root of J and H
cout << find(parent, 7) << " " << find(parent, 9) << endl;

5 9


In [12]:
if (find(parent, 9) == find(parent, 7))
    cout << "Yes they belong to the same tree!";
else
    cout << "No they do not belong to the same tree!";

No they do not belong to the same tree!

## Path Compression
- path compression can be used to create extremely shallow trees
- during a FIND on node $X$, it resets the parent of every node on the path from $X$ to the root $R$ so that each points directly to $R$
- keeps the cost of subsequent FIND operations very close to constant
    - combined with weighted union, a single FIND is $O(\log n)$ in the worst case

In [13]:
// find root of node while compressing the path
// every node on the path from node to the root will have its parent changed to the root
int findCompression(vector<int> &parent, int node) {
    if (parent[node] == -1) return node;
    parent[node] = findCompression(parent, parent[node]);
    return parent[node];
}
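
The recursive version above can also be written without recursion. The following is a sketch of a two-pass iterative variant, assuming the same `-1` root marker; the name `findCompressionIterative` is hypothetical. It may be preferable when paths could be long enough to make deep recursion a concern.

```cpp
#include <vector>

// Iterative two-pass path compression; roots are marked with -1.
int findCompressionIterative(std::vector<int>& parent, int node) {
    // Pass 1: walk up to locate the root.
    int root = node;
    while (parent[root] != -1) root = parent[root];

    // Pass 2: walk up again, pointing every node on the path at the root.
    while (node != root) {
        int next = parent[node];
        parent[node] = root;
        node = next;
    }
    return root;
}
```

Calling it on node 7 of a chain 7 -> 2 -> 0 -> 5 should return 5 and leave nodes 7, 2 and 0 all pointing directly at 5.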

In [14]:
// find root of H and compress path
cout << findCompression(parent, 7);

5

In [15]:
// check parent of H
cout << parent[7] << endl;

5


In [16]:
// check parent of all the nodes in path to H
// the path should be compressed, so C and A now have the same parent as H (the root)
cout << parent[2] << endl; // C
cout << parent[0] << endl; //A

5
5



## Weighted Union
- technique for joining two sets while keeping the resulting tree short
    - limits the depth of any node to $O(\log n)$
- joins the tree with fewer nodes to the tree with more nodes
    - make the smaller tree's root point to the root of the bigger tree
- visualize weighted union here: https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/UnionFind.html
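
To see why the choice of which root goes under which matters, the following sketch merges single-node trees into one growing tree and reports the depth of the deepest node. It assumes the `-1` root marker; the function names and the deliberately unlucky merge order are hypothetical, chosen only to show a bad case for the unweighted rule.

```cpp
#include <algorithm>
#include <vector>

// Number of edges from `node` up to its root (roots are marked with -1).
int depthOf(const std::vector<int>& parent, int node) {
    int depth = 0;
    while (parent[node] != -1) {
        node = parent[node];
        ++depth;
    }
    return depth;
}

int rootOf(const std::vector<int>& parent, int node) {
    while (parent[node] != -1) node = parent[node];
    return node;
}

// Merge nodes 1..n-1, one at a time, into the tree containing node 0,
// then return the depth of the deepest node.
int maxDepthAfterMerging(int n, bool weighted) {
    std::vector<int> parent(n, -1);
    std::vector<int> weight(n, 1);
    for (int i = 1; i < n; ++i) {
        int rootA = rootOf(parent, 0);  // root of the tree built so far
        int rootB = i;                  // a fresh single-node tree
        if (weighted && weight[rootA] >= weight[rootB]) {
            parent[rootB] = rootA;      // smaller tree goes under the larger one
            weight[rootA] += weight[rootB];
        } else {
            parent[rootA] = rootB;      // unweighted: first root goes under the second
            weight[rootB] += weight[rootA];
        }
    }
    int deepest = 0;
    for (int v = 0; v < n; ++v) deepest = std::max(deepest, depthOf(parent, v));
    return deepest;
}
```

With this merge order and `n = 8`, the unweighted rule produces a chain of depth 7, while the weighted rule keeps every node at depth 1.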

### Parent Pointer Tree Implementation as an ADT

In [17]:
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <sstream>
using namespace std;

In [18]:
// general Parent-Pointer Tree implementation for UNION/FIND
class ParPtrTree {
  private:
    vector<int> parents; // parent pointer vector
    vector<int> weights; // weights for weighted union
  public:
    // constructor
    ParPtrTree(size_t size) {
        parents.resize(size); //create parents vector
        fill(parents.begin(), parents.end(), -1); // each node is its own root to start
        weights.resize(size); 
        fill(weights.begin(), weights.end(), 1);// set all base weights to 1
    }
    
    // Return the root of a given node with path compression
    // recursive algorithm that makes all ancestors of the current node
    // point to the root
    int FIND(int node) {
        if (parents[node] == -1) return node;
        parents[node] = FIND(parents[node]);
        return parents[node];
    }
    
    // Merge the trees containing node1 and node2 if they are different trees
    void UNION(int node1, int node2) {
        int root1 = FIND(node1);
        int root2 = FIND(node2);
        // MERGE two trees
        if (root1 != root2) {
            // if weight of root1 is smaller; 
            // root1 will point to root2
            if (weights[root1] < weights[root2]) {
                parents[root1] = root2;
                weights[root2] += weights[root1];
            }
            // root2 will point to root1
            else {
                parents[root2] = root1;
                weights[root1] += weights[root2];
            }
        }    
    }
    
    // returns string representation of ParentPtrTree
    string toString() {
        string nodes = "nodes:\t";
        string prts = "parents:\t";
        for (size_t i = 0; i < this->parents.size(); i++) {
            prts += to_string(this->parents[i]) + '\t';
            nodes += " \t " + to_string(i); 
        }
        return prts + "\n" + nodes;
    }
};

### Test ParPtrTree
- the following test code can be modified to test examples provided here: 
https://opendsa-server.cs.vt.edu/ODSA/Books/CS3/html/UnionFind.html

In [19]:
// 10 disjoint sets: A-J mapped as 0-9
// A: 0, B: 1, ... J: 9
ParPtrTree ptr(10);

In [20]:
cout << ptr.toString();

parents:	-1	-1	-1	-1	-1	-1	-1	-1	-1	-1	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9

In [21]:
// union nodes (H) and (J)
ptr.UNION(7, 9);
cout << ptr.toString();

parents:	-1	-1	-1	-1	-1	-1	-1	-1	-1	7	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9

In [22]:
// union nodes (G) and (I)
ptr.UNION(6, 8);
cout << ptr.toString();

parents:	-1	-1	-1	-1	-1	-1	-1	-1	6	7	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9

In [23]:
// union nodes (A) and (J)
ptr.UNION(0, 9);
cout << ptr.toString();

parents:	7	-1	-1	-1	-1	-1	-1	-1	6	7	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9

In [24]:
// union nodes (B) and (H)
ptr.UNION(1, 7);
cout << ptr.toString();

parents:	7	7	-1	-1	-1	-1	-1	-1	6	7	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9

In [9]:
// union nodes (G) and (J)
ptr.UNION(6, 9);
cout << ptr.toString();

parents:	7	7	-1	-1	-1	-1	7	-1	6	7	
nodes:	 	 0 	 1 	 2 	 3 	 4 	 5 	 6 	 7 	 8 	 9
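
Since Kruskal's MST algorithm was named earlier as a user of Union/Find, here is a hedged sketch of how it might call the two operations. The `Edge` and `DisjointSets` types and the function `kruskalTotalWeight` are hypothetical; `DisjointSets` is a compact stand-in that mirrors the path compression and weighted union of `ParPtrTree` so that the block is self-contained.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical edge type for illustration.
struct Edge {
    int u, v;
    int weight;
};

// Compact union/find with path compression and weighted union (roots are -1).
struct DisjointSets {
    std::vector<int> parents, weights;
    explicit DisjointSets(int n) : parents(n, -1), weights(n, 1) {}

    int find(int node) {
        if (parents[node] == -1) return node;
        return parents[node] = find(parents[node]);
    }

    // Returns true if two different trees were merged.
    bool unite(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        if (weights[ra] < weights[rb]) std::swap(ra, rb);
        parents[rb] = ra;  // smaller tree's root points to the larger tree's root
        weights[ra] += weights[rb];
        return true;
    }
};

// Kruskal: scan edges by increasing weight and keep an edge only when
// its endpoints are currently in different trees.
int kruskalTotalWeight(int nodeCount, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
    DisjointSets sets(nodeCount);
    int total = 0;
    for (const Edge& e : edges) {
        if (sets.unite(e.u, e.v)) total += e.weight;
    }
    return total;
}
```

Each call to `unite` performs the FIND check ("are the endpoints already in the same tree?") followed by the UNION when they are not, which is exactly the pair of MST questions posed in the Union/Find section.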