# Trees

## Tree-Structured Data

Early in this course, we went over the `tree` data abstraction.

In [None]:
# Constructor
def tree(label, branches=[]):
    return [label] + list(branches)

# Selectors
def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

The idea is that we can put together a label and some branches into a tree. Then we invented an implementation for this abstraction, which is to put together the label and the branches into a list. The selectors use that implementation to obtain the label and the branches.
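As a minimal sketch of how the constructor and selectors work together, the following repeats the definitions above so that it runs on its own, then builds a small tree and takes it apart again. The helper `is_leaf` and the variable `t` are names assumed here for illustration only.

```python
# Constructor and selectors, repeated from above so this runs standalone
def tree(label, branches=[]):
    return [label] + list(branches)

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

# Hypothetical helper: a leaf is a tree with no branches
def is_leaf(t):
    return not branches(t)

# A tree labeled 1 with two branches; the second branch has a branch of its own
t = tree(1, [tree(2), tree(3, [tree(4)])])

print(label(t))                          # 1
print([label(b) for b in branches(t)])   # [2, 3]
print(is_leaf(branches(t)[0]))           # True
```

Code written this way only touches `tree`, `label`, and `branches`, so it does not depend on the fact that a tree happens to be stored as a list.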

Later in the course, `Tree` was not a data abstraction anymore. Instead, it was a class.

In [None]:
class Tree:
    def __init__(self, label, branches = []):
        self.label = label
        self.branches = list(branches)

These are two different ways of defining an abstract data type (ADT): with functions or with the object system. Both achieve the same goal, but the object system has an advantage: we no longer need to invent an implementation, because we can just use the attributes that the object system provides.

Regardless of how we represent a tree, nothing about how we process it changes. In both cases, we need a way to put a label and branches together into one value, and a way to obtain the label and the branches from that value.
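To illustrate, here is a small sketch that sums every label in a tree, written once for each representation. The function names `sum_labels` and `sum_labels_obj` are assumptions made for this example, and the labels are assumed to be numbers. The two functions have the same shape; only the way the label and branches are accessed differs.

```python
# Functional data abstraction
def tree(label, branches=[]):
    return [label] + list(branches)

def label(t):
    return t[0]

def branches(t):
    return t[1:]

# Class-based representation
class Tree:
    def __init__(self, label, branches=[]):
        self.label = label
        self.branches = list(branches)

# Hypothetical example: sum the labels using the selectors
def sum_labels(t):
    return label(t) + sum(sum_labels(b) for b in branches(t))

# The same recursion, using attributes instead of selectors
def sum_labels_obj(t):
    return t.label + sum(sum_labels_obj(b) for b in t.branches)

a = tree(1, [tree(2), tree(3, [tree(4)])])
b = Tree(1, [Tree(2), Tree(3, [Tree(4)])])

print(sum_labels(a))      # 10
print(sum_labels_obj(b))  # 10
```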

Later on in lectures, we went over other kinds of trees, such as the binary tree, a subclass of the `Tree` class that has a left branch and a right branch instead of a list of branches.

In [None]:
class BTree(Tree):
    empty = Tree(None)
    
    def __init__(self, label, left=empty, right=empty):
        Tree.__init__(self, label, [left, right])
    
    @property
    def left(self):
        return self.branches[0]
    
    @property
    def right(self):
        return self.branches[1]

We invented the `BTree` class to represent tree-structured data in which the left branch is sometimes empty while the right branch is not, or the other way around. To achieve this, we introduced the concept of an empty tree, so that the left branch could be empty, the right branch could be empty, or both could be empty (in which case we have a leaf). We use this class to represent binary search trees.
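The following sketch shows how the empty tree might be used in practice. It repeats the class definitions so that it runs on its own. The functions `is_leaf` and `contains` are hypothetical helpers written for this example, and `contains` assumes the tree is a binary search tree: smaller labels on the left, larger labels on the right.

```python
class Tree:
    def __init__(self, label, branches=[]):
        self.label = label
        self.branches = list(branches)

class BTree(Tree):
    empty = Tree(None)

    def __init__(self, label, left=empty, right=empty):
        Tree.__init__(self, label, [left, right])

    @property
    def left(self):
        return self.branches[0]

    @property
    def right(self):
        return self.branches[1]

# Hypothetical helper: a leaf has an empty left and an empty right branch
def is_leaf(t):
    return t.left is BTree.empty and t.right is BTree.empty

# Hypothetical helper: search a binary search tree for the value x
def contains(t, x):
    if t is BTree.empty:
        return False
    if x == t.label:
        return True
    if x < t.label:
        return contains(t.left, x)
    return contains(t.right, x)

# The node labeled 8 has an empty left branch and a non-empty right branch
bst = BTree(5, BTree(3), BTree(8, right=BTree(9)))

print(bst.right.left is BTree.empty)  # True
print(is_leaf(bst.left))              # True
print(contains(bst, 9))               # True
print(contains(bst, 4))               # False
```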

This is not the only kind of tree we see in this course. A tree is just a general notion in CS for values that contain other trees. Lists containing lists that in turn contain lists are tree data structures as well.

In [None]:
# A tree can contain other trees
[5, [6, 7], 8, [[9, 10]]]
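
A nested list like this can be processed recursively, just like any other tree. As a rough sketch, the hypothetical function `sum_nested` below treats every non-list element as a leaf and every list as a tree whose branches are its elements. It assumes that all leaves are numbers.

```python
# Hypothetical example: add up every number in a nested list
def sum_nested(x):
    if isinstance(x, list):
        # A list is a tree: recurse into each of its elements
        return sum(sum_nested(item) for item in x)
    # Anything else is a leaf (assumed to be a number)
    return x

nested = [5, [6, 7], 8, [[9, 10]]]
print(sum_nested(nested))  # 45
```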

We worked with tree data structures so much in this course because many interesting things in the world are represented using tree-structured data. Many of them have a hierarchical structure. An example is the following Scheme expression:

In [None]:
(+ 5 (- 6 7) 8 (* (- 9) 10)) ; It's tree structured!

By tree-structured, we mean that the expression has other expressions within it.

The following is the result of parsing a sentence with a natural language parser:

In [None]:
(S
 (NP (JJ Short) (NNS cuts))
 (VP (VBP make)
     (NP (JJ long) (NNS delays)))
 (. .))

The structure of web pages is hierarchical as well, because they are written in a markup language. Here is a small snippet of HTML:

In [None]:
<ul>
  <li>Midterm <b>1</b></li>
  <li>Midterm <b>2</b></li>
</ul>

Tree processing arises in a lot of different CS applications, and it often involves recursive calls on subtrees.
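As one illustration of recursive calls on subtrees, the sketch below evaluates the Scheme expression from earlier after it has been written by hand as a nested Python list. This is only a toy under stated assumptions, not a real Scheme interpreter: the function name `evaluate` is hypothetical, the expression is assumed to be already parsed, and only `+`, `-`, and `*` on numbers are supported.

```python
import math

# Hypothetical example: evaluate an arithmetic expression stored as a nested list
def evaluate(expr):
    if not isinstance(expr, list):
        # A number is a leaf and evaluates to itself
        return expr
    operator, *operands = expr
    # Recursive calls on the subtrees
    args = [evaluate(operand) for operand in operands]
    if operator == '+':
        return sum(args)
    if operator == '*':
        return math.prod(args)
    if operator == '-':
        if len(args) == 1:
            return -args[0]          # (- 9) negates its argument
        return args[0] - sum(args[1:])
    raise ValueError('unknown operator: ' + str(operator))

# (+ 5 (- 6 7) 8 (* (- 9) 10)) written as a nested list
expr = ['+', 5, ['-', 6, 7], 8, ['*', ['-', 9], 10]]
print(evaluate(expr))  # -78
```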