**Trees and tree algorithms**

https://runestone.academy/runestone/books/published/pythonds/Trees/toctree.html

# Objectives

- To understand what a tree data structure is and how it is used.
- To see how trees can be used to implement a map data structure.
- To implement trees using a list.
- To implement trees using classes and references.
- To implement trees as a recursive data structure.
- To implement a priority queue using a heap.

# Examples of trees

Now that we have studied linear data structures like stacks and queues and have some experience with recursion, we will look at a common data structure called the tree. Trees are used in many areas of computer science, including operating systems, graphics, database systems, and computer networking. Tree data structures have many things in common with their botanical cousins. A tree data structure has a root, branches, and leaves. The difference between a tree in nature and a tree in computer science is that a tree data structure has its root at the top and its leaves on the bottom.

Properties of trees:
1. **Trees are hierarchical.** By hierarchical, we mean that trees are structured in layers with the more general things near the top and the more specific things near the bottom.
2. **All of the children of one node are independent of the children of another node.**
3. **Each leaf node is unique.**
4. (Derived from the hierarchical nature.) **You can move an entire section of a tree (called a subtree) to a different position in the tree without affecting the lower levels of the hierarchy.**


Examples:
- Biological classification of animals
- Unix file system
- Web pages (the nesting of HTML tags)

# Vocabulary and definitions

**Node**
A node is a fundamental part of a tree. It can have a name, which we call the “key.” A node may also have additional information. We call this additional information the “payload.” While the payload information is not central to many tree algorithms, it is often critical in applications that make use of trees.

**Edge**
An edge is another fundamental part of a tree. An edge connects two nodes to show that there is a relationship between them. Every node (except the root) is connected by exactly one incoming edge from another node. Each node may have several outgoing edges.

**Root**
The root of the tree is the only node in the tree that has no incoming edges. In Figure 2, / is the root of the tree.

**Path**
A path is an ordered list of nodes that are connected by edges. For example, Mammal → Carnivora → Felidae → Felis → Domestica is a path.

**Children**
The set of nodes that have incoming edges from the same node are said to be the children of that node. In Figure 2, nodes log/, spool/, and yp/ are the children of node var/.

**Parent**
A node is the parent of all the nodes it connects to with outgoing edges. In Figure 2 the node var/ is the parent of nodes log/, spool/, and yp/.

**Sibling**
Nodes in the tree that are children of the same parent are said to be siblings. The nodes etc/ and usr/ are siblings in the filesystem tree.

**Subtree**
A subtree is a set of nodes and edges consisting of a parent and all the descendants of that parent.

**Leaf Node**
A leaf node is a node that has no children. For example, Human and Chimpanzee are leaf nodes in Figure 1.

**Level**
The level of a node 𝑛 is the number of edges on the path from the root node to 𝑛. For example, the level of the Felis node in Figure 1 is five. By definition, the level of the root node is zero.

**Height**
The height of a tree is equal to the maximum level of any node in the tree. The height of the tree in Figure 2 is two.
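The vocabulary above can be made concrete with a short sketch. It assumes a small hypothetical six-node tree stored as a `children` dictionary; the tree itself and the helper names `levels`, `height`, and `leaves` are illustrative and do not come from the figures.

```python
from collections import deque

# Hypothetical tree: each key is a parent, each value lists its children.
children = {
    "a": ["b", "c"],
    "b": ["d", "e"],
    "c": ["f"],
}

def levels(children, root):
    """Map each node to its level: the number of edges from the root."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            level[child] = level[node] + 1
            queue.append(child)
    return level

def height(children, root):
    """Height of the tree: the maximum level of any node."""
    return max(levels(children, root).values())

def leaves(children, root):
    """Nodes that have no children, in breadth-first order."""
    return [n for n in levels(children, root) if not children.get(n)]

print(levels(children, "a")["f"])  # 2
print(height(children, "a"))       # 2
print(leaves(children, "a"))       # ['d', 'e', 'f']
```

Here `d` and `e` are siblings because they share the parent `b`, and the root `a` is the only node at level zero.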

With the basic vocabulary now defined, we can move on to a formal definition of a tree. In fact, we will provide two definitions of a tree. One definition involves nodes and edges. The second definition, which will prove to be very useful, is a recursive definition.

**Definition One:** A tree consists of a set of nodes and a set of edges that connect pairs of nodes. A tree has the following properties:

- One node of the tree is designated as the root node.
- Every node 𝑛, except the root node, is connected by an edge from exactly one other node 𝑝, where 𝑝 is the parent of 𝑛.
- A unique path traverses from the root to each node.
- If each node in the tree has a maximum of two children, we say that the tree is a binary tree.

Figure 3 illustrates a tree that fits definition one. The arrowheads on the edges indicate the direction of the connection.

![7.3_fig3.png](attachment:7.3_fig3.png)

Figure 3: A Tree Consisting of a Set of Nodes and Edges
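As a rough illustration of Definition One, the sketch below checks its properties on a set of nodes and directed `(parent, child)` edges. The function names `is_tree` and `is_binary` and the sample data are assumptions made for this example, not part of the source.

```python
def is_tree(nodes, edges, root):
    """Check Definition One for directed (parent, child) edges."""
    parents = {n: [] for n in nodes}
    kids = {n: [] for n in nodes}
    for p, c in edges:
        parents[c].append(p)
        kids[p].append(c)
    # The root has no incoming edge; every other node has exactly one.
    if parents[root]:
        return False
    if any(len(parents[n]) != 1 for n in nodes if n != root):
        return False
    # Every node must be reachable from the root so that a path exists;
    # with a single parent per node, that path is also unique.
    seen, stack = {root}, [root]
    while stack:
        for c in kids[stack.pop()]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen == set(nodes)

def is_binary(edges):
    """True if no node has more than two children."""
    counts = {}
    for p, _ in edges:
        counts[p] = counts.get(p, 0) + 1
    return all(k <= 2 for k in counts.values())

nodes = list("abcdef")
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e"), ("c", "f")]
print(is_tree(nodes, edges, "a"))                 # True
print(is_binary(edges))                           # True
print(is_tree(nodes, edges + [("c", "d")], "a"))  # False: d has two parents
```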

**Definition Two:** A tree is either empty or consists of a root and zero or more subtrees, each of which is also a tree. The root of each subtree is connected to the root of the parent tree by an edge. Figure 4 illustrates this recursive definition of a tree. Using the recursive definition of a tree, we know that the tree in Figure 4 has at least four nodes, since each of the triangles representing a subtree must have a root. It may have many more nodes than that, but we do not know unless we look deeper into the tree.

![7.3_fig4.png](attachment:7.3_fig4.png)

Figure 4: A Recursive Definition of a Tree
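The recursive definition translates almost directly into code. The sketch below assumes a hypothetical representation in which `None` is the empty tree and any other tree is a `(root, subtrees)` pair; the node names are invented for illustration.

```python
def count_nodes(tree):
    """Count nodes by following the recursive definition directly."""
    if tree is None:                 # the empty tree has no nodes
        return 0
    root, subtrees = tree
    # One node for the root, plus the nodes of every subtree.
    return 1 + sum(count_nodes(s) for s in subtrees)

# A root with three subtrees; one subtree has a child of its own.
tree = ("root", [("s1", []), ("s2", [("x", [])]), ("s3", [])])
print(count_nodes(tree))  # 5
```

A root with three subtrees always has at least four nodes, which matches the reasoning given for Figure 4; this example has five because one subtree goes a level deeper.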

# List of lists representation

In a tree represented by a list of lists, we begin with Python’s list data structure and write a small set of functions that treat a list as a tree. Although writing the interface as a set of operations on a list is a bit different from the other abstract data types we have implemented, it is interesting to do so because it provides us with a simple recursive data structure that we can look at and examine directly.

In a list of lists tree, we store the value of the root node as the first element of the list. The second element of the list is itself a list that represents the left subtree. The third element is another list that represents the right subtree. To illustrate this storage technique, let’s look at an example. Figure 1 shows a simple tree and the corresponding list implementation.

![7.4_fig1.png](attachment:7.4_fig1.png)

In [1]:
myTree = ['a',                 # root
          ['b',                # left subtree
           ['d', [], []],
           ['e', [], []]],
          ['c',                # right subtree
           ['f', [], []],
           []]
         ]

Notice that we can access subtrees of the list using standard list indexing. 

In [7]:
print(myTree)

['a', ['b', ['d', [], []], ['e', [], []]], ['c', ['f', [], []], []]]


In [8]:
print('root = ', myTree[0])

root =  a


In [9]:
print('left subtree = ', myTree[1])

left subtree =  ['b', ['d', [], []], ['e', [], []]]


In [10]:
print('right subtree = ', myTree[2])

right subtree =  ['c', ['f', [], []], []]


One very nice property of this list of lists approach is that the structure of a list representing a subtree adheres to the structure defined for a tree; the structure itself is recursive! A subtree that has a root value and two empty lists is a leaf node. Another nice feature of the list of lists approach is that it generalizes to a tree that has many subtrees. In the case where the tree is more than a binary tree, another subtree is just another list.
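Both properties can be sketched in a few lines. The helpers `is_leaf` and `leaves` below are hypothetical names, not part of the interface developed in this section; they treat every element after the first as a subtree, so they work for binary trees and for trees with more subtrees alike.

```python
def is_leaf(tree):
    """A non-empty tree whose subtrees are all empty lists."""
    return bool(tree) and all(sub == [] for sub in tree[1:])

def leaves(tree):
    """Collect the root values of all leaf nodes, left to right."""
    if not tree:
        return []
    if is_leaf(tree):
        return [tree[0]]
    result = []
    for sub in tree[1:]:          # recurse into each subtree
        result.extend(leaves(sub))
    return result

myTree = ['a', ['b', ['d', [], []], ['e', [], []]], ['c', ['f', [], []], []]]
print(leaves(myTree))     # ['d', 'e', 'f']
print(myTree[1][2][0])    # e  (root of the right subtree of the left subtree)

# A root with three subtrees: another subtree is just another list.
wide = ['r', ['x', [], []], ['y', [], []], ['z', [], []]]
print(leaves(wide))       # ['x', 'y', 'z']
```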

Let’s formalize this definition of the tree data structure by providing some functions that make it easy for us to use lists as trees. Note that we are not going to define a binary tree class. The functions we will write will just help us manipulate a standard list as though we are working with a tree.

In [11]:
def BinaryTree(r):
    return [r, [], []]

In [12]:
BinaryTree(2)

[2, [], []]

The BinaryTree function simply constructs a list with a root node and two empty sublists for the children. To add a left subtree to the root of a tree, we need to insert a new list into the second position of the root list. We must be careful. If the list already has something in the second position, we need to keep track of it and push it down the tree as the left child of the list we are adding. Listing 1 shows the Python code for inserting a left child.

In [13]:
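def insertLeft(root, newBranch):
    # Remove the current left subtree (possibly the empty list).
    t = root.pop(1)
    if len(t) > 1:
        # An existing subtree becomes the left child of the new node.
        root.insert(1, [newBranch, t, []])
    else:
        root.insert(1, [newBranch, [], []])
    return root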

Notice that to insert a left child, we first obtain the (possibly empty) list that corresponds to the current left child. We then add the new left child, installing the old left child as the left child of the new one. This allows us to splice a new node into the tree at any position. The code for insertRight is similar to insertLeft and is shown in Listing 2.

In [15]:
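def insertRight(root, newBranch):
    # Remove the current right subtree (possibly the empty list).
    t = root.pop(2)
    if len(t) > 1:
        # An existing subtree becomes the right child of the new node.
        root.insert(2, [newBranch, [], t])
    else:
        root.insert(2, [newBranch, [], []])
    return root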

To round out this set of tree-making functions (see Listing 3), let’s write a couple of access functions for getting and setting the root value, as well as getting the left or right subtrees.

In [16]:
def getRootVal(root):
    return root[0]

def setRootVal(root,newVal):
    root[0] = newVal

def getLeftChild(root):
    return root[1]

def getRightChild(root):
    return root[2]

In [23]:
r = BinaryTree(3)
insertLeft(r,4)
insertLeft(r,5)

[3, [5, [4, [], []], []], []]

In [24]:
insertRight(r,6)
insertRight(r,7)
l = getLeftChild(r)
print(l)

[5, [4, [], []], []]


In [25]:
setRootVal(l,9)
print(r)

[3, [9, [4, [], []], []], [7, [], [6, [], []]]]


In [26]:
insertLeft(l,11)
print(r)

[3, [9, [11, [4, [], []], []], []], [7, [], [6, [], []]]]


In [27]:
print(getRightChild(getRightChild(r)))

[6, [], []]


## Exercise

Write a function buildTree that returns a tree using the list of lists functions that looks like this:

![7.4_x_build_tree_ex.png](attachment:7.4_x_build_tree_ex.png)

In [28]:
t = BinaryTree('a')
insertLeft(t,'b')
insertRight(t,'c')
insertLeft(getLeftChild(t),'d')
insertLeft(getRightChild(t),'e')
insertRight(getRightChild(t),'f')

['c', ['e', [], []], ['f', [], []]]

In [31]:
for i, v in enumerate(t):
    print(i, v)

0 a
1 ['b', ['d', [], []], []]
2 ['c', ['e', [], []], ['f', [], []]]
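The cell above builds the tree at the top level, while the exercise asks for a function. One possible way to package it is sketched below. It wraps the same sequence of calls in `buildTree` and repeats the helper definitions so that it runs on its own; it assumes the calls above reproduce the pictured tree.

```python
def BinaryTree(r):
    return [r, [], []]

def insertLeft(root, newBranch):
    t = root.pop(1)
    if len(t) > 1:
        root.insert(1, [newBranch, t, []])
    else:
        root.insert(1, [newBranch, [], []])
    return root

def insertRight(root, newBranch):
    t = root.pop(2)
    if len(t) > 1:
        root.insert(2, [newBranch, [], t])
    else:
        root.insert(2, [newBranch, [], []])
    return root

def getLeftChild(root):
    return root[1]

def getRightChild(root):
    return root[2]

def buildTree():
    """Build the exercise tree and return its root list."""
    t = BinaryTree('a')
    insertLeft(t, 'b')
    insertRight(t, 'c')
    insertLeft(getLeftChild(t), 'd')
    insertLeft(getRightChild(t), 'e')
    insertRight(getRightChild(t), 'f')
    return t   # return the whole tree, not the last subtree touched

print(buildTree())
# ['a', ['b', ['d', [], []], []], ['c', ['e', [], []], ['f', [], []]]]
```

Returning `t` explicitly matters here: each insert function returns the list it was called on, so the last call on its own yields only the subtree rooted at `'c'`.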


From above: **A subtree that has a root value and two empty lists is a leaf node.**

# Nodes and references

Our second method to represent a tree uses nodes and references. In this case we will define a class that has attributes for the root value, as well as the left and right subtrees. **Since this representation more closely follows the object-oriented programming paradigm, we will continue to use this representation for the remainder of the chapter.**

![7.5_fig2.png](attachment:7.5_fig2.png)

Figure 2: A Simple Tree Using a Nodes and References Approach
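A minimal sketch of such a class might look like the following. The attribute names `key`, `left`, and `right` and the method names are assumptions chosen for this example; the insert methods mirror the push-down behavior of the list-based `insertLeft` and `insertRight`.

```python
class BinaryTree:
    """A node holding a root value and references to two subtrees."""

    def __init__(self, key):
        self.key = key      # root value
        self.left = None    # left subtree, or None if empty
        self.right = None   # right subtree, or None if empty

    def insert_left(self, key):
        """Add a new left child, pushing any existing one down a level."""
        node = BinaryTree(key)
        node.left = self.left
        self.left = node

    def insert_right(self, key):
        """Add a new right child, pushing any existing one down a level."""
        node = BinaryTree(key)
        node.right = self.right
        self.right = node

r = BinaryTree(3)
r.insert_left(4)
r.insert_left(5)
print(r.left.key)        # 5
print(r.left.left.key)   # 4
print(r.right)           # None
```

Each subtree is itself a `BinaryTree` instance, so the class follows the same recursive definition as the list of lists representation.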
