# AHU Tree Encoding

The AHU (Aho, Hopcroft, Ullman) algorithm is a **clever serialization technique for representing a tree as a unique string**.

Unlike many tree isomorphism invariants and heuristics, AHU captures a <font color="orange" size="3"><b>complete history</b></font> of a tree's <font color="slate" size="3"><b>degree spectrum</b></font> and structure, which gives a deterministic method of checking for tree isomorphisms.

**NOTES**:
- AHU is best performed on rooted trees.
    - Appropriate algorithms will be used for choosing the root node and for rooting the tree.
    - The rooted tree will be stored as class objects for better encapsulation and ease of access.

### Algorithm
1. Start by assigning all leaf nodes Knuth tuples: `'()'` (the implementation in this notebook writes them with curly braces instead: `'{}'`).
2. Every time you move up a layer, the labels of the previous subtrees get sorted lexicographically and wrapped in brackets.
	1. Labels need to be **sorted** lexicographically when combined for the algorithm to work.
3. You cannot process a node until you have processed all its children.
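
The steps above can be sketched on a tiny example. This is a minimal illustration rather than the implementation used in this notebook: it assumes a rooted tree written as nested lists (each node is the list of its children, so a leaf is `[]`), and the helper name `knuth_label` is hypothetical. It shows that two trees that differ only in the order of their children receive the same label once the child labels are sorted.

```python
def knuth_label(children):
    """Return the parenthesis label of a node given as a list of child subtrees.

    Assumption: a node is a (possibly empty) list of its children,
    so a leaf is [] and receives the label '()'.
    """
    # Children are labelled first, then their labels are sorted
    # lexicographically and wrapped in one pair of brackets.
    child_labels = sorted(knuth_label(child) for child in children)
    return "(" + "".join(child_labels) + ")"


# The same tree shape written with two different child orders.
tree_a = [[], [[]], [[], []]]
tree_b = [[[], []], [], [[]]]

label_a = knuth_label(tree_a)
label_b = knuth_label(tree_b)

print(label_a)             # ((()())(())())
print(label_a == label_b)  # True
```

Without the `sorted` call the two labels would differ, even though the trees have the same shape.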

In [1]:
# Tree represented as an undirected unweighted acyclic graph.
tree = {
    0: [(1, ), (2, ), (5, )],
    1: [(0, )],
    2: [(0, ), (3, )],
    3: [(2, )],
    4: [(5, )],
    5: [(0, ), (4, ), (6, )],
    6: [(5, )]
}

##### First: 
- Find the appropriate node to root this tree:

In [2]:
def tree_centers(tree):
    '''
    Function to find center(s) of a tree
    
    Args:
    - tree: Tree represented as an undirected graph: a dictionary of
            adjacency lists, where each key is a node and each value is
            a list of edge tuples holding the neighbor node and the
            associated weight (if any)
            e.g.: {0: [(1, 2), (2, 5)], 1: [(0, 2), (3, 4)], ...}
    
    Returns:
    - centers: List, with one or two centers of a tree.
               e.g.: [1, 3]
    '''
    # Array to store the degree of each node
    degree = [0] * len(tree)
    
    # Array to store outermost layer of leaves
    leaves = []
    
    # Calculate the degree of each node in the tree
    for i in range(len(tree)):
        degree[i] = len(tree[i])
        
        # If a node has a degree of 0 or 1, it is a leaf node
        if degree[i] == 0 or degree[i] == 1:
            leaves.append(i)
            degree[i] = 0

    # Counter to keep track of the number of processed nodes.
    processed_nodes_count = len(leaves)
    
    # Continue the process until all nodes have been processed
    while processed_nodes_count < (len(tree)):
        
        # Array to store the new layer of leaves
        new_leaves = []
        
        # Process each node in the current layer of leaves
        for node in leaves:
            for neighbor in tree[node]:
                neighbor = neighbor[0]
                
                # Decrement the degree of a neighbor node
                degree[neighbor] = degree[neighbor] - 1
                
                # If degree becomes 1, add it to the new layer
                if degree[neighbor] == 1:
                    new_leaves.append(neighbor)
                    
            # Set degree of processed node to zero.
            degree[node] = 0
            
        # Update the count of processed nodes
        processed_nodes_count += len(new_leaves)
        
        # Move inward: the new layer becomes the current layer of leaves
        leaves = new_leaves

    # The last remaining layer of leaves holds the center(s) of the tree
    return leaves

In [3]:
center = tree_centers(tree)[0]
print('Root node for a balanced rooted tree:', center)

Root node for a balanced rooted tree: 0


##### Second: 
- Root the tree on the root node found above:

In [4]:
def root_tree(graph, root_id):
    '''
    Function for rooting an acyclic undirected graph.
    
    Args:
    - graph:   A dictionary of adjacency lists, where each key is a node
               and each value is a list of edge tuples holding the
               neighbor node and the associated weight (if any)
               e.g.: {0: [(1, 2), (2, 5)], 1: [(0, 2), (3, 4)], ...}
    - root_id: Index of a node to perform rooting from.
    
    Returns:
    - root:    tree_node object, root node of the rooted tree. Its repr has
               the form (node, parent, [children])
               e.g.: (0, None, [1, 2, 5]) - Root node has no parent.
    '''
    class tree_node:
        '''
        Tree_node class object
            
        Parameters:
            
        - self.id:       Unique integer to identify a node
        - self.parent:   Pointer to parent tree_node object. Only the
                           root node has a 'None' parent tree_node reference
        - self.children: List of pointers to child tree_nodes
        '''
        def __init__(self, node_id, parent, children):
            self.id = node_id
            self.parent = parent
            # Copy the given children so that nodes never share one list
            self.children = list(children) if children else []
    
        def print_tree(self, indent='', last=True):
            '''
            Prints the tree structure starting from the current node in a hierarchical format.
    
            Args:
            - indent: String representing the indentation for the current node.
            - last:   Boolean indicating if the current node is the last child of its parent.
                      Determines the marker used for the current node.
            '''
            marker = '└─ ' if last else '├─ '
            print(f"{indent}{marker}{self.id}")
            indent += '    ' if last else '│   '
            child_count = len(self.children)
            for index, child in enumerate(self.children):
                last_child = index == child_count - 1
                child.print_tree(indent, last_child)
                
        def __repr__(self):
            '''
            Returns a string representation of the tree node object.
            The string representation includes the node id, parent id (if parent exists), and a list of children ids.
            '''
            return str((self.id,
                        self.parent.id if self.parent else None,
                        [child.id for child in self.children]))
            
    def build_tree(graph, node, parent):
        '''
        Helper function to recursively build a tree via
        the Depth First Search algorithm.
        '''
        for child_id in graph[node.id]:
            if parent is not None and child_id[0] == parent.id: continue
            child = tree_node(child_id[0], node, [])
            node.children.append(child)
            build_tree(graph, child, child.parent)
        return node
    
    root_node = tree_node(root_id, None, [])
    root = build_tree(graph, root_node, None)
    
    return root

In [5]:
root = root_tree(tree, center)

##### Third:
- Perform the AHU encoding, to serialize the rooted tree:

In [6]:
def encode(node):
    '''
    Perform AHU encoding of a tree
    
    Args:
    - node: tree_node class object, root node of a rooted tree.
    
    Returns:
    - String, AHU encoded tree.
      E.g.: {{{}{}}{{}}{}}
    '''
    if node is None:
        return ""
    labels = []
    for child in node.children:
        labels.append(encode(child))
    # Labels have to be sorted lexicographically (in place)
    labels.sort()

    result = ""
    for label in labels:
        result += label
        
    return "{" + result + "}"

In [7]:
encoded_tree = encode(root)
encoded_tree

'{{{}{}}{{}}{}}'
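
The encoding becomes an isomorphism test once the strings of two trees are compared. The sketch below is self-contained, so it re-implements center finding and encoding in compact form instead of reusing the cells above; the names `find_centers`, `canonical_form` and `are_isomorphic` are hypothetical. It assumes non-empty trees in the adjacency-list format used in this notebook, and small enough that plain recursion is acceptable. Because a tree can have two centers, the second tree is encoded from each of its centers and compared against one encoding of the first.

```python
def find_centers(tree):
    """Return the one or two centers of a tree by peeling leaves layer by layer."""
    degree = {node: len(edges) for node, edges in tree.items()}
    leaves = [node for node, d in degree.items() if d <= 1]
    processed = len(leaves)
    while processed < len(tree):
        new_leaves = []
        for node in leaves:
            for neighbor, *_ in tree[node]:
                degree[neighbor] -= 1
                if degree[neighbor] == 1:
                    new_leaves.append(neighbor)
        processed += len(new_leaves)
        leaves = new_leaves
    return leaves


def canonical_form(tree, node, parent=None):
    """AHU-style encoding of the subtree rooted at `node`."""
    labels = sorted(
        canonical_form(tree, neighbor, node)
        for neighbor, *_ in tree[node]
        if neighbor != parent
    )
    return "{" + "".join(labels) + "}"


def are_isomorphic(tree_a, tree_b):
    """Compare two unrooted trees through their center-rooted encodings."""
    if len(tree_a) != len(tree_b):
        return False
    encoding_a = canonical_form(tree_a, find_centers(tree_a)[0])
    # tree_b may have two centers, so try each of them as the root.
    return any(
        canonical_form(tree_b, center) == encoding_a
        for center in find_centers(tree_b)
    )


# The tree from this notebook, a relabelled copy, and a 7-node path.
original = {
    0: [(1, ), (2, ), (5, )], 1: [(0, )], 2: [(0, ), (3, )], 3: [(2, )],
    4: [(5, )], 5: [(0, ), (4, ), (6, )], 6: [(5, )],
}
relabelled = {
    "a": [("c", ), ("f", ), ("b", )], "b": [("a", )], "c": [("d", ), ("a", )],
    "d": [("c", )], "e": [("f", )], "f": [("g", ), ("a", ), ("e", )], "g": [("f", )],
}
path = {i: [(j, ) for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}

print(are_isomorphic(original, relabelled))  # True
print(are_isomorphic(original, path))        # False
```

Rooting at a center keeps the comparison independent of node labels: the center is determined by the tree's shape alone, so isomorphic trees are always rooted at corresponding nodes.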