# Introduction to Binary Trees

## [1/10] Introduction

In the "Overview of Recursion" notebook, we learned about recursion. In this lesson, we'll start learning how you can use recursion to define data structures. In particular, we're going to learn about **binary trees**.

So far, we've learned about **linear** data structures such as lists, stacks, and queues. We call these data structures linear because they store the data sequentially. Each element points to a single next element.

<center>
<img src="https://drive.google.com/uc?id=1279pAsKBwk9P3yZj5sqA7AqRxvH1sS5J" width="20%">
</center>

In the worst case, to **look up** a value in a linear data structure, we need to inspect all of the values that it stores.
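As a rough illustration of that worst case, here is a minimal sketch of a lookup over a Python list. The function name `linear_lookup` and the sample values are made up for this example.

```python
def linear_lookup(values, target):
    """Scan values from left to right; return (found, comparisons made)."""
    comparisons = 0
    for value in values:
        comparisons += 1
        if value == target:
            return True, comparisons
    return False, comparisons


values = [5, 2, 7, 9]
print(linear_lookup(values, 5))  # (True, 1): found on the first comparison
print(linear_lookup(values, 4))  # (False, 4): every value had to be inspected
```

When the target is absent, the number of comparisons equals the number of stored values.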

By contrast, trees generally allow elements to have more than one successor. We use the term binary when each element has at most two successors. Here's an example of a binary tree:

<center>
<img src="https://drive.google.com/uc?id=19Vy11WEMJxcdecCGib4v4VOK7UZ0yywF" width="20%">
</center>

We can see in the figure that in a tree, an element can have more than one successor. We call the elements in a tree **``nodes``**. In this example, node ``5`` has two successors, ``2`` and ``7``, while ``7`` has a single successor. In a tree, successors are called **``children``**.

We call the top node the **``root``** of the tree. Nodes that don't have children (no successors) are called **``leaves``**. The reason why we call this data structure a tree is that it somewhat resembles an upside-down tree:

<center>
<img src="https://drive.google.com/uc?id=15ALypLHKjkeC3eWsFghuu9TTSQxI47Qu" width="60%">
</center>

We call nodes that have both a parent and children **``internal``**. Nodes ``4`` and ``9`` are internal nodes.

We'll learn throughout this course why storing data in a tree data structure can yield much faster operations than using a list.

Let's practice our understanding of these concepts by identifying the root node, the internal nodes, and the leaves of a given tree.

### Instructions

Consider the following binary tree.

<center>
<img src="https://drive.google.com/uc?id=1XRtjKQsRs_cZCRokQhVee8rA74NnVVVV" width="40%">
</center>

1. Assign to a variable named **``root``** an integer with the value in the root node.
2. Assign to a variable named **``leaves``** a list with all values stored in the leaf nodes.
3. Assign to a variable named **``internal``** a list with all values stored in the internal nodes.


In [None]:
# put your answer here

## [2/10] Binary Tree Node Structure

In the previous section, we learned some of the terminology of tree data structures. In this section, **we'll start implementing a binary tree data structure**.

The first thing that we need to implement is a **class** to represent nodes. A node needs to store a value and its children.

In a general tree, we can store the children using a list of nodes. In the case of binary trees, since there are at most two children, we instead keep one reference to each of them. The convention is to call one of them the **``left child``** and the other the **``right child``**.
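For contrast, a node of a general tree could be sketched as follows. The class name `GeneralNode` and its `children` attribute are hypothetical, chosen only for this illustration.

```python
class GeneralNode:
    """A node of a general tree: any number of children, kept in a list."""

    def __init__(self, value):
        self.value = value
        self.children = []  # starts empty; grows as children are attached


root = GeneralNode("a")
for child_value in ["b", "c", "d"]:
    root.children.append(GeneralNode(child_value))

print([child.value for child in root.children])  # ['b', 'c', 'd']
```

In a binary tree node, the list is replaced by two separate references, one for the left child and one for the right child.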

Here's a diagram illustrating node structure:

<center>
<img src="https://drive.google.com/uc?id=1AHXtbyuTwat_3JVJk5u2cC4F93gf95Ga" width="40%">
</center>

Let's quickly review how to create a class in Python. Remember that to define a class named, for example, User, we use the following syntax:

```python
class User:
    # Class definition goes here
```

We can then create an instance by writing **``user = User()``**. When we do so, Python automatically calls the **``__init__()``** method. This method sets the initial state of the new instance. If the class requires some parameters, this is where they should be set.

Here's an example of a class that could represent the name and age of users:

```python
class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age
```

In this case we would create an instance by providing the ``name`` and ``age``, like so:

```python
user = User("John Smith", 42)
```

Recall that **self** is a special parameter that Python passes automatically when a method is called on an instance. It holds a reference to the object itself, which is what we use to read and alter the object's state.
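To make the role of **self** concrete, here is a small sketch that extends the ``User`` class with one extra method. The method name `have_birthday` is hypothetical and only serves this example.

```python
class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def have_birthday(self):
        # self refers to the instance the method was called on
        self.age += 1


user = User("John Smith", 42)
user.have_birthday()  # Python passes user as self
print(user.age)       # 43
```

Note that we call ``user.have_birthday()`` without arguments, yet the method still receives the instance through **self**.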

### Instructions

1. Define a class named **``Node``**.
2. Inside the **``Node``** class, define the ``__init__()`` method with two arguments:
  - **self**: the self-reference that is automatically passed.
  - **value**: the value that the node will store.
3. Implement the ``__init__()`` method so that it stores ``value`` in ``self.value`` and ``None`` in both ``self.left_child`` and ``self.right_child``.

In [None]:
# put your code here

## [3/10] Building Your First Tree

In this section, we'll use the **``Node``** implementation from the previous section to create a **binary tree**.

To help with this, we'll show you how we would build the following tree:

<center>
<img src="https://drive.google.com/uc?id=1RwzvbxMMylTv0GAO44MapclT7kEzFws_" width="40%">
</center>

To build the tree above, we need to create one node instance for each of the values. In total there are four nodes. Then we need to set the **``left_child``** and **``right_child``** references to the correct nodes.

Let's start by creating the four nodes:

```python
node_5 = Node(5)
node_2 = Node(2)
node_7 = Node(7)
node_9 = Node(9)
```

Then we need to set the children references:

```python
node_5.left_child = node_2
node_5.right_child = node_7
node_7.right_child = node_9
```
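One way to double-check the references is to print each node together with the values of its children. The sketch below repeats the ``Node`` class as described in the previous section so that it runs on its own; the helper `child_value` is a hypothetical name introduced for this example.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


def child_value(node):
    """Return the value stored in node, or None if there is no node."""
    return None if node is None else node.value


node_5 = Node(5)
node_2 = Node(2)
node_7 = Node(7)
node_9 = Node(9)

node_5.left_child = node_2
node_5.right_child = node_7
node_7.right_child = node_9

# Print: value, left child value, right child value
for node in [node_5, node_2, node_7, node_9]:
    print(node.value, child_value(node.left_child), child_value(node.right_child))
```

Nodes whose two children both print as ``None`` are the leaves of the tree.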

Now it's your turn to build a larger tree.



### Instructions

The **``Node``** class from the previous section is still available. The goal of this exercise is to use it to create the binary tree shown in the following diagram:


<center>
<img src="https://drive.google.com/uc?id=1FhxwTNc7oznD5HjlHvFURi_uLYPdYOPX" width="40%">
</center>

1. For each of the values, use the **``Node()``** constructor to create a node with the same value. Assign each node to a variable named **``node_value``**, where ``value`` is the value stored in the node (e.g. **``node_5``** for a value of **``5``**).
2. For each non-leaf node, set its children references by setting the **``Node.left_child``** and **``Node.right_child``** attributes. For example:
  - For **``node_3``** we will do **``node_3.left_child = node_1``**.
  - For **``node_9``** we will do **``node_9.left_child = node_8``** and **``node_9.right_child = node_10``**.



In [None]:
# put your code here

## [4/10] Binary Search Trees

In the previous section, we used the **``Node``** class to manually build a specific tree. This was, to say the least, a bit cumbersome. Our goal now is to start implementing a class that uses the **``Node``** class to automatically build the binary tree structure for us.

By the end of this lesson, we will be able to **``add``** values to the tree and **``look up``** values in it. When we add a new value to the tree, it will automatically create a new node and find the right place for it.

But what is the right place for a value? Consider adding value 7 to the following binary tree:

<center>
<img src="https://drive.google.com/uc?id=186rlDYgmwy7ijx6i6S87Yp_Mlp-0F-Gy" width="40%">
</center>

Any of the four positions shown in the figure with a question mark would be appropriate for adding a new node with value ``7``.

The rule that we'll follow is to have all values on the left be smaller than or equal to the value in the parent, and all values on the right be bigger than the value in the parent. The following figure illustrates this rule:

<center>
<img src="https://drive.google.com/uc?id=1-wTJgGy-wkgc9nsES1VOeNyXH3DbMTLM" width="40%">
</center>

We call a binary tree that satisfies this property a **``binary search tree (BST)``**. Note that this rule is recursive. This means that it applies to any node in the tree. For each node, all of the values on the left should be smaller than or equal to the value stored in that node. All values on the right should be bigger.
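Because the rule applies to all values on each side, and not only to direct children, checking it is itself a recursive task. The following is a minimal sketch of such a check; the function name `is_bst`, its `low` and `high` parameters, and the sample values are assumptions made for this illustration.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


def is_bst(node, low=None, high=None):
    """Check that every value under node satisfies low < value <= high.

    A bound of None means there is no limit on that side.
    """
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value > high:
        return False
    # Left side: values must be <= node.value. Right side: values must be > node.value.
    return (is_bst(node.left_child, low, node.value)
            and is_bst(node.right_child, node.value, high))


# Valid: 7 sits to the right of 5 and to the left of 9
root = Node(5)
root.left_child = Node(2)
root.right_child = Node(9)
root.right_child.left_child = Node(7)
print(is_bst(root))      # True

# Invalid: 7 is bigger than its parent 2, but it is on the left side of 5
bad_root = Node(5)
bad_root.left_child = Node(2)
bad_root.right_child = Node(9)
bad_root.left_child.right_child = Node(7)
print(is_bst(bad_root))  # False
```

The second tree looks fine if we only compare each node with its parent, which is why the check has to carry the bounds inherited from all ancestors.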

If we go back to the insertion of value ``7``, following this rule, there is a single possible place where we can add it. Since it is bigger than ``5``, it must be on the right side of the root. Applying the rule again to node ``9``, it must be on the left of this node. The following diagram shows the proper placement:


<center>
<img src="https://drive.google.com/uc?id=15taqdCF-4qwC6TQux7variQOcDA4gRSl" width="40%">
</center>

We'll come back to this in the following section and implement the insertion method that follows this rule. Before we do that, though, we need to define a class to represent the binary search tree.

### Instructions

1. Define a class named **``BST``** (binary search tree).
2. Inside the **``BST``** class, define the **``__init__()``** method with only **``self``** as argument.
3. Implement the **``__init__()``** method so that it initializes **``self.root``** to **``None``**. This attribute will represent the root node.

In [None]:
# put your code here

## [5/10] BST Inserting Values

We are now ready to implement an insertion method for our BST. Initially, the root is set to **None**, so the first time we add a value, the only thing we need to do is to create a new node with the provided value and set the root to that node.

If the root is already defined, then we need to locate the right place to create the new node. To find this location, we go down the tree and, at each node, compare the value that we want to insert with the value stored in the current node.

- If the insert value is smaller than or equal to the node value, we go left because this is where smaller values are stored.

- Otherwise, if the insert value is bigger than the node value, we go right because this is where bigger values are stored.

This process stops either when we want to go left but the current node doesn't have a left child or we want to go right and the current node doesn't have a right child.

The following animation illustrates this process:

<center>
<img src="https://drive.google.com/uc?id=1yeS9P5xorLt78td4eGlbSB_7Y90zdbc0" width="60%">
</center>
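To make the left and right decisions explicit, the sketch below records the moves made on the way down without modifying the tree. It is an iterative tracing helper, not the insertion method itself; the function name `insertion_path` and the sample tree are assumptions made for this example.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


def insertion_path(root, value):
    """Return the 'left'/'right' moves made until an empty spot is reached."""
    path = []
    current = root
    while current is not None:
        if value <= current.value:
            path.append("left")
            current = current.left_child
        else:
            path.append("right")
            current = current.right_child
    return path


# Sample tree: 5 at the root, 2 on its left, 9 on its right
root = Node(5)
root.left_child = Node(2)
root.right_child = Node(9)

print(insertion_path(root, 7))  # ['right', 'left']
print(insertion_path(root, 1))  # ['left', 'left']
print(insertion_path(root, 5))  # ['left', 'right']
```

The last move in each list points at the empty spot where the new node would be created.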

To implement this, we will use recursion. We will need to keep track of the current node as we go down the tree. For each node, we check whether we need to go left or right by comparing the value in the current node to the value that we want to insert.

If the insert value is smaller than or equal to the node value, we call the same method again but give the left child as the current node. Otherwise, we call the same method again but give the right child as the current node.

The base case occurs in one of the following situations:

1. We want to go left, but the left child is **``None``**. In this case, we create a new node with the provided value and set the left child of the current node to the newly created node.

2. We want to go right, but the right child is **``None``**. In this case, we create a new node with the provided value and set the right child of the current node to the newly created node.

Let's implement this.


### Instructions

We've provided the **``BST``** class from the previous section, and we've added an **``add()``** method. This method checks whether the root is defined. If it isn't, it creates a new node with the given value.

Otherwise, it will call the **``add_recursive()``** method to go down the tree and insert the value. Your goal is to implement the **``add_recursive()``** method.

1. Define an **``add_recursive()``** method with three arguments:
  - **``self``**: the self-reference.
  - **``current_node``**: The current node.
  - **``value``**: the value that we want to add.
2. Implement the **``add_recursive()``** method by following these steps:
  - If **``value``** is smaller than or equal to the value of **``current_node``**, then check whether the left child of **``current_node``** is **``None``**.
    - If it is, create a new node with **``value``**, and assign it to the left child of **``current_node``**.
    - Otherwise, call the **``add_recursive()``** method with the left child of **``current_node``** and **``value``** as arguments.
  - If **``value``** is larger than the value of **``current_node``**, then check whether the right child of **``current_node``** is **``None``**.
    - If it is, create a new node with **``value``**, and assign it to the right child of **``current_node``**.
    - Otherwise, call the **``add_recursive()``** method with the right child of **``current_node``** and **``value``** as arguments.

In [None]:
# solved exercise

class Node:
    """
    A class representing a node in a Binary Search Tree (BST).

    Each node contains a value and references to its left and right children.
    """

    def __init__(self, value):
        """
        Initializes a Node with a given value.

        Parameters:
        - value: The value to be stored in the node.

        Both left and right child references are initialized as None.
        """
        self.value = value
        self.left_child = None
        self.right_child = None


class BST:
    """
    A class representing a Binary Search Tree (BST).
    """

    def __init__(self):
        """
        Initializes an empty BST.
        """
        self.root = None

    def add(self, value):
        """
        Add a new value to the BST.

        Parameters:
        - value: The value to be added to the BST.

        If the BST is empty, the new value becomes the root.
        Otherwise, it will be placed in the appropriate position
        based on BST properties.
        """
        if self.root is None:
            self.root = Node(value)
        else:
            self.add_recursive(self.root, value)

    def add_recursive(self, current_node, value):
        """
        Recursively add a new value to the BST starting from a specified node.

        Parameters:
        - current_node: The starting node to determine where to insert the new value.
        - value: The value to be added to the BST.

        The function determines whether the new value should be placed to the left or
        right of the current_node based on the value. If the appropriate child node is empty,
        the value is inserted. Otherwise, the function is called recursively with the child node.
        """
        if value <= current_node.value:
            if current_node.left_child is None:
                current_node.left_child = Node(value)
            else:
                self.add_recursive(current_node.left_child, value)
        else:
            if current_node.right_child is None:
                current_node.right_child = Node(value)
            else:
                self.add_recursive(current_node.right_child, value)

## [6/10] BST Contains

We can now add values to the tree!

The other method that we want to implement checks whether the BST contains a given value. The process for checking if the tree contains a value is quite similar to the one for adding a value.

Starting from the root, we go down the tree trying to find a node that contains the value that we're looking for. At each node, we compare the lookup value with the node value to decide whether we go to the left or to the right.

The main difference is that if the current node value and the lookup value are equal, then we stop the search and return **``True``**. The following animation demonstrates this process:

<center>
<img src="https://drive.google.com/uc?id=1o7gnFdKdJem8YVnHLzmlYcrnZWE4mDVz" width="60%">
</center>

To implement this method, we can use recursion again. We will keep track of the same information as we did with the **``add_recursive()``** method: the current node and the lookup value.

At each node, we do the following:

1. If the node is **``None``**, we return **``False``** because we reached the place where the value should be, but we did not find it.
2. Otherwise, we compare the value in the current node with the lookup value.
  - If the values are the same, we found the target and return **``True``**.
  - If the lookup value is smaller than the current node value, we call the same method again but on the left child of the current node.
  - Otherwise, if the lookup value is bigger than the current node value, we call the same method but on the right child of the current node.
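The steps above can be traced with a small helper that records which nodes are inspected during a lookup. This is an iterative sketch for illustration only, not the recursive method that this section asks for; the function name `lookup_steps` and the sample tree are assumptions.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


def lookup_steps(root, value):
    """Return (found, visited), where visited lists the inspected node values in order."""
    visited = []
    current = root
    while current is not None:
        visited.append(current.value)
        if value == current.value:
            return True, visited
        if value < current.value:
            current = current.left_child
        else:
            current = current.right_child
    return False, visited


# Sample tree: 5 at the root, 2 on its left, 9 on its right, 7 on the left of 9
root = Node(5)
root.left_child = Node(2)
root.right_child = Node(9)
root.right_child.left_child = Node(7)

print(lookup_steps(root, 7))  # (True, [5, 9, 7])
print(lookup_steps(root, 6))  # (False, [5, 9, 7])
print(lookup_steps(root, 2))  # (True, [5, 2])
```

In each case only the nodes along a single path are inspected, never the whole tree.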

### Instructions

We've provided the BST class from the previous exercise.

1. Define a **``contains()``** method with three arguments:
  - **``self``**: the self-reference
  - **``current_node``**: the current node
  - **``value``**: the value that we are looking for
2. Implement the **``contains()``** method by following these steps:
  - If the **``current_node``** is **``None``**, then return **``False``**.
  - Otherwise, compare **``value``** with the **``current_node``** value.
    - If they are the same, return **``True``**.
    - If the **``value``** is smaller, return the result of calling the same method but on the left child of **``current_node``**.
    - If the **``value``** is bigger, return the result of calling the same method but on the right child of **``current_node``**.

In [None]:
# solved exercise

class Node:
    """
    A class representing a node in a Binary Search Tree (BST).

    Each node contains a value and references to its left and right children.
    """

    def __init__(self, value):
        """
        Initializes a Node with a given value.

        Parameters:
        - value: The value to be stored in the node.

        Both left and right child references are initialized as None.
        """
        self.value = value
        self.left_child = None
        self.right_child = None


class BST:
    """
    A Binary Search Tree (BST) implementation.

    Attributes:
        root (Node): The root node of the BST. Initializes as None for an empty tree.
    """

    def __init__(self):
        """Initializes an empty BST."""
        self.root = None

    def add(self, value):
        """
        Adds a new value to the BST.

        Args:
            value: The value to be added to the tree.

        If the tree is empty, the new value becomes the root. Otherwise, the value is
        added to the appropriate position maintaining the BST property.
        """
        if self.root is None:
            self.root = Node(value)
        else:
            self.add_recursive(self.root, value)

    def add_recursive(self, current_node, value):
        """
        Recursively finds the appropriate position and adds a new value to the BST.

        Args:
            current_node (Node): The starting node to determine where to insert the new value.
            value: The value to be added to the BST.

        The method determines whether the new value should be placed to the left or right of
        the current node. If the correct child node position is empty, the value is inserted.
        Otherwise, the function recurses with the child node.
        """
        if value <= current_node.value:
            if current_node.left_child is None:
                current_node.left_child = Node(value)
            else:
                self.add_recursive(current_node.left_child, value)
        else:
            if current_node.right_child is None:
                current_node.right_child = Node(value)
            else:
                self.add_recursive(current_node.right_child, value)

    def contains(self, current_node, value):
        """
        Recursively checks if the BST contains a given value.

        Args:
            current_node (Node): The starting node for the search.
            value: The value to search for.

        Returns:
            bool: True if the BST contains the value, otherwise False.

        This method begins the search from the provided node and navigates down the tree.
        """
        if current_node is None:
            return False
        if current_node.value == value:
            return True
        if value < current_node.value:
            return self.contains(current_node.left_child, value)
        return self.contains(current_node.right_child, value)


## [7/10] Polishing the Implementation

Congratulations on implementing your first tree data structure! We now have a functional BST!

In this section, we're going to discuss some improvements to the implementation that fix the following problems:

1. Users of the BST class shouldn't directly use the **``add_recursive()``** method. We use this method as a helper method for the **``add()``** method. To add new values, users should always use the **``add()``** method.

2. When using the **``contains()``** method, users need to pass in the root as the current node. This requires them to write more code and be aware of the **``root``** attribute.

For example, if **``bst``** is a **``BST``** instance, we need to write the following code to check whether it contains value 6:

```python
bst.contains(bst.root, 6)
```

Ideally, we want users to only have to write the following:

```python
bst.contains(6)
```

> In Python, we can indicate that a method should not be used externally by adding an underscore **``_``** to the beginning of the method name.

Let's use this to fix the previously mentioned problems.

We will rename the **``add_recursive()``** method to **``_add_recursive()``**. Remember to also rename the method call inside the **``add()``** method from **``self.add_recursive()``** to **``self._add_recursive()``**.

Next, we will rename the **``contains()``** method to **``_contains()``**. We will also need to change the two recursive calls from **``self.contains()``** to **``self._contains()``**.

Then, we will create a new **``contains()``** method with a single **``value``** argument. This method will do nothing more than call the hidden **``_contains()``** method, providing **``self.root``** and **``value``** as arguments.
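The same pattern, a public method that delegates to an underscore-prefixed recursive helper, can be seen in isolation in the sketch below. The class `Countdown` and its methods are hypothetical and unrelated to the BST; they only illustrate the naming convention.

```python
class Countdown:
    """Hypothetical class: a public method delegating to a helper meant for internal use."""

    def __init__(self, start):
        self.start = start

    def values(self):
        # Public entry point: callers don't supply the internal argument
        return self._values_from(self.start)

    def _values_from(self, current):
        # Internal helper: the recursion carries the extra argument
        if current < 0:
            return []
        return [current] + self._values_from(current - 1)


countdown = Countdown(3)
print(countdown.values())         # [3, 2, 1, 0]
print(countdown._values_from(1))  # [1, 0]
```

The last line shows that Python does not block calls to underscore-prefixed methods. The underscore is a convention that tells users the method is an implementation detail.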

### Instructions

We've provided the **``BST``** class from the previous section.

1. Rename all four occurrences of **``add_recursive``** to **``_add_recursive``**.
2. Rename all three occurrences of **``contains``** to **``_contains``**.
3. Define a new **``contains()``** method with two arguments:
  - **``self``**: the self-reference
  - **``value``**: the value we are looking for

4. Implement the **``contains()``** method so that it returns the result of calling the **``_contains()``** method with arguments **``self.root``** and **``value``**.

In [None]:
# solved exercise

class Node:
    """
    A class representing a node in a Binary Search Tree (BST).

    Each node contains a value and references to its left and right children.
    """

    def __init__(self, value):
        """
        Initializes a Node with a given value.

        Parameters:
        - value: The value to be stored in the node.

        Both left and right child references are initialized as None.
        """
        self.value = value
        self.left_child = None
        self.right_child = None


class BST:
    """
    Represents a Binary Search Tree (BST).

    Attributes:
        root (Node or None): The root node of the tree. Initializes to None for an empty tree.
    """

    def __init__(self):
        """Initializes an empty BST."""
        self.root = None

    def add(self, value):
        """
        Inserts a new value into the BST.

        Args:
            value: The value to be added to the tree.

        If the tree is empty, the value becomes the root. Otherwise, the method uses
        a recursive helper function to find the appropriate position to maintain the BST property.
        """
        if self.root is None:
            self.root = Node(value)
        else:
            self._add_recursive(self.root, value)

    def _add_recursive(self, current_node, value):
        """
        Recursively finds the correct position and inserts a value into the BST.

        Args:
            current_node (Node): The node to start the search for the insert position from.
            value: The value to be added to the BST.

        The method determines if the new value should be placed to the left or right of
        the current node. If the target position is empty, the value is inserted.
        Otherwise, the function calls itself recursively with the respective child node.
        """
        if value <= current_node.value:
            if current_node.left_child is None:
                current_node.left_child = Node(value)
            else:
                self._add_recursive(current_node.left_child, value)
        else:
            if current_node.right_child is None:
                current_node.right_child = Node(value)
            else:
                self._add_recursive(current_node.right_child, value)

    def _contains(self, current_node, value):
        """
        Recursively checks if the BST contains the specified value starting from a given node.

        Args:
            current_node (Node): The node to start the search from.
            value: The value to search for in the BST.

        Returns:
            bool: True if the value exists in the subtree rooted at current_node, otherwise False.
        """
        if current_node is None:
            return False
        if current_node.value == value:
            return True
        if value < current_node.value:
            return self._contains(current_node.left_child, value)
        return self._contains(current_node.right_child, value)

    def contains(self, value):
        """
        Checks if the BST contains the specified value.

        Args:
            value: The value to search for in the BST.

        Returns:
            bool: True if the BST contains the value, otherwise False.
        """
        return self._contains(self.root, value)

## [8/10] Building Your Second Tree

With the improvements from the previous section, our BST implementation is much cleaner and easier to use. Users don't need to know the internal structure of the tree anymore.

At the beginning of this lesson, we built the following tree by manually creating the nodes and assigning the children:

<center>
<img src="https://drive.google.com/uc?id=1FhxwTNc7oznD5HjlHvFURi_uLYPdYOPX" width="40%">
</center>

In this section, we're going to create the same tree again, but using the **``add()``** method this time. However, to get that exact tree, we can't insert the values in just any order!

For example, if we add the values in increasing order **``[1, 3, 5, 6, 8, 9, 10]``**, then we will instead get this:

<center>
<img src="https://drive.google.com/uc?id=1SQI8dJuTVrweJYEL71PjUMm716j7GOWU" width="20%">
</center>

If we add values in the order **``[6, 3, 9, 1, 5, 8, 10]``**, then we will get this:

<center>
<img src="https://drive.google.com/uc?id=1EMq8fZMDW8Fh1F2Zj8QA7Y8qbmipm9md" width="60%">
</center>
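We can reproduce both shapes in code by listing the values of each tree level by level. The sketch below repeats a compact version of the ``Node`` and ``BST`` classes from this lesson so that it runs on its own; the helpers `build` and `levels` are hypothetical names introduced for this example.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


class BST:
    def __init__(self):
        self.root = None

    def add(self, value):
        if self.root is None:
            self.root = Node(value)
        else:
            self._add_recursive(self.root, value)

    def _add_recursive(self, current_node, value):
        if value <= current_node.value:
            if current_node.left_child is None:
                current_node.left_child = Node(value)
            else:
                self._add_recursive(current_node.left_child, value)
        else:
            if current_node.right_child is None:
                current_node.right_child = Node(value)
            else:
                self._add_recursive(current_node.right_child, value)


def build(values):
    """Create a BST by adding the values in the given order."""
    bst = BST()
    for value in values:
        bst.add(value)
    return bst


def levels(bst):
    """Return the values of the tree grouped by level, from top to bottom."""
    result = []
    current_level = [bst.root] if bst.root is not None else []
    while current_level:
        result.append([node.value for node in current_level])
        next_level = []
        for node in current_level:
            for child in (node.left_child, node.right_child):
                if child is not None:
                    next_level.append(child)
        current_level = next_level
    return result


print(levels(build([1, 3, 5, 6, 8, 9, 10])))  # [[1], [3], [5], [6], [8], [9], [10]]
print(levels(build([6, 3, 9, 1, 5, 8, 10])))  # [[6], [3, 9], [1, 5, 8, 10]]
```

The increasing order produces one node per level, while the second order fills every level completely.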


### Instructions

The **``BST``** class we've defined in this lesson is available, but the code isn't visible because we won't modify it in this exercise.

The goal of this exercise is to use the **``add()``** method to build the following tree:

<center>
<img src="https://drive.google.com/uc?id=13fs_U1rLJRGg55J0UICXl2_0T1WYVMYh" width="60%">
</center>

The challenge is to find the correct value insertion order so that the resulting tree matches the one in the figure.

1. Use the **``BST()``** constructor to create an empty binary search tree. Assign it to a variable **``bst``**.
2. Use the **``add()``** method to add the values **``1, 3, 5, 6, 8, 9, 10``** so that the resulting tree matches the one in the figure. You will need to add the values in a different order from the one listed here to get the same tree as the diagram.

In [None]:
# put your code here

## [9/10] BST Complexity

Before we finish this lesson, let's discuss the advantages we get from using a BST instead of a list.

The time it takes to add a value to a BST or check whether it contains a value depends on how tall the tree is. The **``height``** of a tree is the length of the longest path from the root to a leaf.

Here are a few examples with the longest paths highlighted in blue on each example:

<center>
<img src="https://drive.google.com/uc?id=1LDOPfz_awChu20A6kDarR5M3Os1A8PS9" width="80%">
</center>

In both insertions and lookups, we process at most one path of the tree. So in the worst case, the time complexity is **``O(h)``**, where **``h``** is the height of the tree.

The trees in these examples were all built using the same set of values. The difference is that the values were inserted in a different order. The middle example shows the maximum possible height we can get with a tree with seven values. On the other hand, the last example shows the minimum possible height.

In the worst case, each node has a single child and the height is equal to **``n - 1``**, where n is the number of values in the tree. In this case, tree operations have **``O(n)``** complexity.
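The two extremes for seven values can be measured directly. The sketch below uses a standalone `insert` function and a `height` function, both hypothetical helpers written for this example, with the same insertion rule as the lesson's ``add()`` method.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left_child = None
        self.right_child = None


def insert(root, value):
    """Insert value under root following the BST rule; return the (possibly new) root."""
    if root is None:
        return Node(value)
    if value <= root.value:
        root.left_child = insert(root.left_child, value)
    else:
        root.right_child = insert(root.right_child, value)
    return root


def height(node):
    """Number of edges on the longest path from node down to a leaf (-1 for an empty tree)."""
    if node is None:
        return -1
    return 1 + max(height(node.left_child), height(node.right_child))


def build(values):
    """Insert the values in the given order and return the root."""
    root = None
    for value in values:
        root = insert(root, value)
    return root


print(height(build([1, 3, 5, 6, 8, 9, 10])))  # 6, which is n - 1 for n = 7
print(height(build([6, 3, 9, 1, 5, 8, 10])))  # 2
```

Both trees store the same seven values, but a lookup may have to follow up to six links in the first tree and at most two in the second.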

In the best case, each node has two children so each level has twice as many values as the previous level. The first level has one node, the second has two, the third has four, and so on. Let's calculate the height **``h``** of such a tree as a function of the number of nodes, **``n``**.

The total number of nodes, **``n``**, is equal to the sum of the number of nodes in each level. So in the case where the first level has one node and each subsequent level has double the nodes of the previous level, we have the following:


<center>
<img src="https://drive.google.com/uc?id=1FjSi_bxvMY4V1T2aqbzswfiVIHVI4HZs" width="80%">
</center>

The sum $1 + 2 + 4 + \ldots + 2^{h-1}$, on the right-hand side, is equal to $2^h - 1$. We won't do a formal proof here, but we can convince ourselves that it is true by calculating the first few terms:

```python
# Each line is a chained comparison that evaluates to True
1 + 2 == 3 == 4 - 1
1 + 2 + 4 == 7 == 8 - 1
1 + 2 + 4 + 8 == 15 == 16 - 1
1 + 2 + 4 + 8 + 16 == 31 == 32 - 1
```

This means that if each node has two children, $n = 2^h - 1$. Our goal is to express **``h``** in terms of **``n``**:

$
n = 2^h - 1 \Rightarrow n + 1 = 2^h \Rightarrow \log_2(n + 1) = h
$

We conclude that, in the best case, the height of the tree is **``O(log(n))``** (recall that the + 1 and the base of the logarithm don't matter for the complexity).
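As a quick numerical check of this derivation, the sketch below computes the number of nodes in a tree whose levels are all full and recovers the number of levels with a base-2 logarithm. The function name `full_tree_nodes` is a hypothetical helper, and ``h`` counts the terms of the sum, as in the formula above.

```python
import math


def full_tree_nodes(h):
    """Number of nodes when h levels are completely full: 1 + 2 + ... + 2**(h - 1)."""
    return sum(2 ** level for level in range(h))


for h in range(1, 6):
    n = full_tree_nodes(h)
    # Columns: h, n, 2**h - 1 (should equal n), log2(n + 1) (should equal h)
    print(h, n, 2 ** h - 1, math.log2(n + 1))
```

For example, with ``h = 3`` the loop prints ``3 7 7 3.0``, matching the seven-value tree with minimum height shown earlier.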

The conclusion is that, depending on the order in which the values are added to the tree, the complexity of the tree operations will range from linear to logarithmic. Linear complexity is not very good in this case since it is no better than a list. On the other hand, logarithmic complexity is much faster.

In the next notebook, we'll learn how we can ensure that the complexity is logarithmic for any insertion order.

## [10/10] Next Steps

In this notebook, we've defined and built a binary search tree and implemented insertion and lookups using recursion. We've learned that the complexity of these operations strongly depends on the height of the tree, which can be anywhere from logarithmic to linear.

In the next notebook, we'll extend the BST implementation to ensure that the height of the tree remains logarithmic regardless of the order in which values are inserted.