# Representing trees

There are usually many ways to represent any given abstract data structure. In Python, however, these representations will usually be based on classes and objects, a topic that goes beyond this class. I will therefore just give you a few examples and hope that you can at least figure out from those how trees can be traversed.

We define a class with syntax similar to that of a function definition. Functions nested inside the class work as so-called methods and must always take an object as their first argument. Traditionally, this object is called `self`. Methods work much like functions, except that the syntax for calling them is slightly different. To call a function `f` on an object `x`, we write `f(x)`, while for a method we write `x.f()`. The reason we have both methods and functions is that methods can be "polymorphic", meaning that the actual function that is called when we write `x.f()` depends on the type of `x`. This goes beyond the scope of this class, however, so for now you should just think of methods as functions with a slightly different syntax.
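As a small illustration of the difference between the two call syntaxes, here is a minimal sketch. The classes `Dog` and `Cat` and the function `describe` are made up for this example and are not used anywhere else.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def speak(self):
        # A method: `self` is the object the method is called on.
        return self.name + " says woof"


class Cat:
    def __init__(self, name):
        self.name = name

    def speak(self):
        # Same method name as in Dog, but a different function.
        return self.name + " says meow"


def describe(animal):
    # An ordinary function: the object is passed as a normal argument.
    return "An animal called " + animal.name


for animal in [Dog("Rex"), Cat("Tom")]:
    print(describe(animal))  # function call syntax: f(x)
    print(animal.speak())    # method call syntax: x.f()
```

The call `describe(animal)` always runs the same function, while `animal.speak()` runs the `speak` defined in the class of `animal`.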

We can implement a binary tree recursively as a node with two subtrees like this:

In [16]:
class TreeNode:
    def __init__(self, value, left = None, right = None):
        self.value = value
        self.left = left
        self.right = right

In this definition, I specify that a tree has a value in a node (this can be any object, but we will use numbers in the examples below) and two subtrees, `left` and `right`, that are `None` if those subtrees are empty.

We can construct a tree using this structure as follows:

In [17]:
tree = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
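To get a feel for how the nested structure is accessed, here is a small sketch that walks down from the root by following the `left` and `right` attributes. The class is repeated so the example runs on its own, and the variable name `example` is just a placeholder.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


example = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))

print(example.value)              # value in the root: 3
print(example.left.value)         # value in the root's left child: 1
print(example.right.left.value)   # left child of the right child: 4
print(example.left.left is None)  # the node holding 1 has no left subtree: True
```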

To traverse trees, we can use recursive functions. The natural base cases are empty trees, represented as `None`. Leaves are nodes with both children equal to `None`. Depending on the application, we might allow one child, but not the other, to be `None`. In the following, I will assume that either both children are `None` or both are proper nodes, so a recursion can stop at the leaves and never has to handle an empty tree. With that assumption, we can write code for displaying a tree like this:

In [23]:
def is_leaf(tree):
    return tree.left is None and tree.right is None

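def display_tree(tree):
    # Base case: a leaf is displayed as just its value.
    if is_leaf(tree):
        return str(tree.value)
    else:
        # Recursive case: display both subtrees, followed by the node's value.
        return "({left},{right}){value}".format(value = str(tree.value),
                                                left = display_tree(tree.left),
                                                right = display_tree(tree.right))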

The function returns a string describing the tree in [Newick format](https://en.wikipedia.org/wiki/Newick_format), except that it leaves out the semicolon that terminates a full Newick string. For the example tree we created above, we get this result:

In [22]:
print(display_tree(tree))

(1,(4,7)6)3
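If we drop the assumption that a node has either two children or none, a traversal can instead use the empty tree, `None`, as its base case. The following is a minimal sketch of that idea, not a replacement for the display function above. The function name `in_order` and the variables `full` and `lopsided` are chosen for this example only.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    # Base case: an empty tree contributes no values.
    if node is None:
        return []
    # Recursive case: left subtree, then this node, then right subtree.
    return in_order(node.left) + [node.value] + in_order(node.right)


full = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
lopsided = TreeNode(3, TreeNode(1), TreeNode(6, None, TreeNode(7)))

print(in_order(full))      # [1, 3, 4, 6, 7]
print(in_order(lopsided))  # [1, 3, 6, 7]
```

Because the recursion stops at `None` rather than at leaves, it works even when a node, such as the one holding `6` in `lopsided`, has only one child.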


Explicitly representing tree nodes as objects is not the only way we can represent a tree. Instead of representing each node explicitly we can, for example, represent a tree as lists of left and right children and a list of node values.

In [24]:
class Tree:
    def __init__(self, values, left, right):
        self.values_ = values
        self.left_ = left
        self.right_ = right
        
    def value(self, node):
        return self.values_[node]
        
    def left(self, node):
        return self.left_[node]
    
    def right(self, node):
        return self.right_[node]

This representation is more primitive, but also easier to translate into lower-level programming languages since it doesn't require any high-level object-oriented constructs. We do have three methods, though, for accessing values and children instead of directly accessing the lists.

The tree we constructed using explicit nodes above would have this form in the new representation:

In [26]:
values = [3, 1, 6, 4, 7]
left = [1, None, 3, None, None]
right = [2, None, 4, None, None]
tree = Tree(values, left, right)

The actual tree structure is harder to read out of these lists than it is in the explicit construction from above, but it is simply a matter of assigning nodes to indices and letting `left` and `right` point to the correct indices.
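One possible way to assign the indices is to number the nodes in the order a recursive traversal first visits them. The sketch below does this for a tree built from explicit nodes. The function name `flatten` and the variable names are my own choices for this example, and other numberings would work equally well.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def flatten(root):
    values, left, right = [], [], []

    def visit(node):
        # An empty subtree has no index; we represent it as None.
        if node is None:
            return None
        # Give this node the next free index and reserve its slots.
        index = len(values)
        values.append(node.value)
        left.append(None)
        right.append(None)
        # Number the subtrees and record where their roots ended up.
        left[index] = visit(node.left)
        right[index] = visit(node.right)
        return index

    visit(root)
    return values, left, right


node_tree = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
flat_values, flat_left, flat_right = flatten(node_tree)
print(flat_values)  # [3, 1, 6, 4, 7]
print(flat_left)    # [1, None, 3, None, None]
print(flat_right)   # [2, None, 4, None, None]
```

With this numbering the root always ends up at index `0`, and for the example tree the result is the same three lists as the ones written out by hand.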

We can update the display code to use this representation by changing the `is_leaf` and `display_tree` functions to take an extra argument, so they operate on both the tree and a node index. Very little changes otherwise:

In [33]:
def is_leaf(tree, node):
    return tree.left(node) is None and tree.right(node) is None

def display_tree(tree, node):
    if is_leaf(tree, node):
        return str(tree.value(node))
    else:
        return "({left},{right}){value}".format(value = str(tree.value(node)),
                                                left = display_tree(tree, tree.left(node)), 
                                                right = display_tree(tree, tree.right(node)))

To call the `display_tree` function, we now need to provide a node as well. We have put the root at index `0`, so that is what we will use:

In [34]:
print(display_tree(tree, 0))

(1,(4,7)6)3
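Other recursive functions can be adapted to the list-based representation in the same way. As a hedged sketch, here is a function that computes the height of a tree, counted as the number of nodes on the longest path from a node down to a leaf. The function name `height` and the variable `flat_tree` are chosen for this example, and the `Tree` class is repeated so the code runs on its own.

```python
class Tree:
    def __init__(self, values, left, right):
        self.values_ = values
        self.left_ = left
        self.right_ = right

    def value(self, node):
        return self.values_[node]

    def left(self, node):
        return self.left_[node]

    def right(self, node):
        return self.right_[node]


def height(tree, node):
    # Base case: an empty subtree, represented by None, has height zero.
    if node is None:
        return 0
    # Recursive case: this node plus the taller of its two subtrees.
    return 1 + max(height(tree, tree.left(node)),
                   height(tree, tree.right(node)))


flat_tree = Tree([3, 1, 6, 4, 7],
                 [1, None, 3, None, None],
                 [2, None, 4, None, None])
print(height(flat_tree, 0))  # 3
```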
