#### Decision Tree Implementation

# Decision Tree Structure Problem

## Problem Description

Create a function that converts a list of examples and attributes into a nested decision tree structure.

## Input Format

### Examples List
The input consists of a list of dictionaries, where each dictionary represents a training example with attribute-value pairs and a class label ('PlayTennis').

```python
examples = [
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Weak', 'PlayTennis': 'No'},
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Strong', 'PlayTennis': 'No'},
    {'Outlook': 'Overcast', 'Temperature': 'Hot', 'Humidity': 'High', 'Wind': 'Weak', 'PlayTennis': 'Yes'},
    {'Outlook': 'Rain', 'Temperature': 'Mild', 'Humidity': 'High', 'Wind': 'Weak', 'PlayTennis': 'Yes'}
]
```

### Attributes List
A list of attribute names to consider for splitting the decision tree:

```python
attributes = ['Outlook', 'Temperature', 'Humidity', 'Wind']
```

## Output Format

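The output should be a nested dictionary representing the decision tree. The tree below illustrates the format; it includes attribute values, such as 'Normal' for 'Humidity', that do not occur in the four examples listed above, so it cannot be reproduced from those examples alone: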

```python
{
    'Outlook': {
        'Sunny': {
            'Humidity': {
                'High': 'No',
                'Normal': 'Yes'
            }
        },
        'Overcast': 'Yes',
        'Rain': {
            'Wind': {
                'Weak': 'Yes',
                'Strong': 'No'
            }
        }
    }
}
```

## Structure Explanation

1. Root Node
   - The top level of the dictionary represents the root attribute ('Outlook')

2. Internal Nodes
   - Each non-leaf node is an attribute used for decision making
   - The keys at each level are possible values for that attribute
   - Values can be either:
     - Another nested dictionary (for further splitting)
     - A final classification ('Yes' or 'No')

3. Leaf Nodes
   - Terminal nodes containing the final classification
   - Always contain either 'Yes' or 'No' as values
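
As a minimal sketch of how this structure can be traversed, the helper below treats any non-dictionary value as a leaf and yields every root-to-leaf path of the example tree. The name `iter_paths` is hypothetical and not part of the original problem.

```python
def iter_paths(tree, path=()):
    """Yield (path, label) pairs for every leaf of a nested-dict tree.

    Hypothetical helper for illustration; a leaf is any non-dict value.
    """
    if not isinstance(tree, dict):
        # Leaf node: the accumulated path ends in a classification.
        yield path, tree
        return
    # Internal node: a single key naming the attribute that is tested.
    attribute, branches = next(iter(tree.items()))
    for value, subtree in branches.items():
        yield from iter_paths(subtree, path + ((attribute, value),))


example_tree = {
    'Outlook': {
        'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}},
        'Overcast': 'Yes',
        'Rain': {'Wind': {'Weak': 'Yes', 'Strong': 'No'}},
    }
}

paths = list(iter_paths(example_tree))
for path, label in paths:
    print(' and '.join(f'{a} = {v}' for a, v in path), '->', label)
```

The example tree has five leaves, so the loop prints five decision paths, one per line.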

## Reasoning

The decision tree structure is determined based on the following logic:

1. 'Outlook' is chosen as the root node because it has the highest information gain among the available attributes
2. For 'Outlook = Overcast', all examples lead to 'Yes', so it becomes a leaf node
3. For 'Outlook = Sunny', further splitting on 'Humidity' is needed
4. For 'Outlook = Rain', further splitting on 'Wind' is required
5. The tree structure captures all decision paths from the training examples

This structure allows for efficient classification of new examples by following the decision paths from root to leaf nodes.
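
Following a decision path can be sketched as a short loop over the nested dictionary. The function `classify` and its `default` parameter are assumptions made for illustration; the original problem only asks for the tree itself.

```python
def classify(tree, example, default=None):
    """Follow the decision path for `example` and return the leaf label.

    Hypothetical helper; `default` is returned when the example has an
    attribute value for which the tree has no branch.
    """
    node = tree
    while isinstance(node, dict):
        # Each internal node has one key: the attribute to test.
        attribute, branches = next(iter(node.items()))
        value = example.get(attribute)
        if value not in branches:
            return default
        node = branches[value]
    return node


example_tree = {
    'Outlook': {
        'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}},
        'Overcast': 'Yes',
        'Rain': {'Wind': {'Weak': 'Yes', 'Strong': 'No'}},
    }
}

new_example = {'Outlook': 'Rain', 'Temperature': 'Mild',
               'Humidity': 'Normal', 'Wind': 'Strong'}
prediction = classify(example_tree, new_example)
print(prediction)  # No
```

Each lookup visits one node per tree level, so the cost of classifying an example grows with the depth of the tree rather than with the number of training examples.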

In [None]:
import math
from collections import Counter

def calculate_entropy(labels):
    '''
    expected input is a list of labels
    '''
    label_counts = Counter(labels)
    total_count = len(labels)
    # entropy: H = -sum(p(x) * log2(p(x))) over the distinct labels x
    entropy = -sum((count / total_count) * math.log2(count / total_count) for count in label_counts.values())
    return entropy

def calculate_information_gain(examples, attr, target_attr):
    # examples: list of example dictionaries
    # attr: attribute being considered for the split
    # target_attr: class label attribute, e.g. 'PlayTennis'
    total_entropy = calculate_entropy([example[target_attr] for example in examples])
    # unique values of the attribute we are splitting on,
    # e.g. for 'Outlook': {'Sunny', 'Overcast', 'Rain'}
    values = set(example[attr] for example in examples)
    attr_entropy = 0
    for value in values:
        value_subset = [example[target_attr] for example in examples if example[attr] == value]
        # entropy of the labels in this subset; it acts like a loss for the split:
        # an entropy of 0 means the subset contains a single class
        value_entropy = calculate_entropy(value_subset)
        # accumulate the weighted sum of subset entropies,
        # i.e. the entropy that remains after splitting on attr:
        # sum over values v of (|Sv| / |S|) * H(Sv)
        attr_entropy += (len(value_subset) / len(examples)) * value_entropy
    # Information Gain = Total Entropy - Weighted Attribute Entropy
    # IG = H(S) - Σ((|Sv|/|S|) * H(Sv))
    return total_entropy - attr_entropy

def majority_class(examples, target_attr):
    return Counter([example[target_attr] for example in examples]).most_common(1)[0][0]

def learn_decision_tree(examples, attributes, target_attr):
    if not examples:
        return 'No examples'
    # check if all examples have the same class - no need to split further if True
    if all(example[target_attr] == examples[0][target_attr] for example in examples):
        return examples[0][target_attr]
    # if no attributes left to split on, return majority class - majority voting
    if not attributes:
        return majority_class(examples, target_attr)
    
    # compute the information gain of each attribute;
    # the best splitting attribute is the one with the highest information gain
    gains = {attr: calculate_information_gain(examples, attr, target_attr) for attr in attributes}

    # attribute with maximum information gain
    best_attr = max(gains, key=gains.get)

    tree = {best_attr: {}}

    # unique values of the best attribute,
    # e.g. for 'Outlook': {'Sunny', 'Overcast', 'Rain'}
    unique_values = set(example[best_attr] for example in examples)
    
    for value in unique_values:
        subset = [example for example in examples if example[best_attr] == value]
        new_attributes = attributes.copy()

        # since we have split on the best attribute, we remove it from the list of attributes
        new_attributes.remove(best_attr)
        
        # recursively build the tree
        # with new subset and attributes
        subtree = learn_decision_tree(subset, new_attributes, target_attr)
        tree[best_attr][value] = subtree
    
    return tree
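
The majority-vote fallback only matters when examples agree on every remaining attribute but disagree on the label. The snippet below is a small sketch of that step in isolation; the `conflicting` examples are made up for illustration and are not part of the dataset used here.

```python
from collections import Counter

# Hypothetical examples: identical attribute values, different labels.
conflicting = [
    {'Wind': 'Weak', 'PlayTennis': 'Yes'},
    {'Wind': 'Weak', 'PlayTennis': 'Yes'},
    {'Wind': 'Weak', 'PlayTennis': 'No'},
]

# Count the labels and take the most frequent one.
label_counts = Counter(example['PlayTennis'] for example in conflicting)
majority_label = label_counts.most_common(1)[0][0]
print(majority_label)  # Yes
```

When two labels are tied, `most_common` returns them in the order they were first encountered, so the result then depends on the order of the examples.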


In [11]:
# Example data
examples = [
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 
     'Wind': 'Weak', 'PlayTennis': 'No'},
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High', 
     'Wind': 'Strong', 'PlayTennis': 'No'},
    {'Outlook': 'Overcast', 'Temperature': 'Hot', 'Humidity': 'High', 
     'Wind': 'Weak', 'PlayTennis': 'Yes'},
    {'Outlook': 'Rain', 'Temperature': 'Mild', 'Humidity': 'High', 
     'Wind': 'Weak', 'PlayTennis': 'Yes'}
]

attributes = ['Outlook', 'Temperature', 'Humidity', 'Wind']
target_attr = 'PlayTennis'



# Build tree
tree = learn_decision_tree(examples, attributes, target_attr)
print(tree)

{'Outlook': {'Overcast': 'Yes', 'Sunny': 'No', 'Rain': 'Yes'}}
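
To see why the tree stops after a single split, it helps to compare the information gain of every attribute on these four examples. The sketch below restates the entropy and gain calculations in compact form so that it runs on its own; the names `entropy` and `information_gain` are shorthand chosen here, not the functions defined above.

```python
import math
from collections import Counter


def entropy(labels):
    # H = -sum(p * log2(p)) over the distinct labels
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(examples, attr, target_attr):
    # Entropy before the split minus the weighted entropy after it.
    base = entropy([ex[target_attr] for ex in examples])
    remainder = 0.0
    for value in {ex[attr] for ex in examples}:
        subset = [ex[target_attr] for ex in examples if ex[attr] == value]
        remainder += (len(subset) / len(examples)) * entropy(subset)
    return base - remainder


examples = [
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High',
     'Wind': 'Weak', 'PlayTennis': 'No'},
    {'Outlook': 'Sunny', 'Temperature': 'Hot', 'Humidity': 'High',
     'Wind': 'Strong', 'PlayTennis': 'No'},
    {'Outlook': 'Overcast', 'Temperature': 'Hot', 'Humidity': 'High',
     'Wind': 'Weak', 'PlayTennis': 'Yes'},
    {'Outlook': 'Rain', 'Temperature': 'Mild', 'Humidity': 'High',
     'Wind': 'Weak', 'PlayTennis': 'Yes'},
]

gains = {attr: information_gain(examples, attr, 'PlayTennis')
         for attr in ['Outlook', 'Temperature', 'Humidity', 'Wind']}
for attr, gain in gains.items():
    print(f'{attr}: {gain:.3f}')
```

'Outlook' reaches a gain of 1.0 because each of its three subsets contains a single class, so every branch becomes a leaf immediately. 'Humidity' has a gain of 0 because all four examples share the value 'High'.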


In [4]:
calculate_entropy([example[target_attr] for example in examples])

1.0
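
The value 1.0 can be checked by hand. The four examples contain two 'Yes' and two 'No' labels, so each class has probability 2/4:

$$
H(S) = -\sum_{c} p_c \log_2 p_c
     = -\left(\tfrac{2}{4}\log_2\tfrac{2}{4} + \tfrac{2}{4}\log_2\tfrac{2}{4}\right)
     = -\left(-\tfrac{1}{2} - \tfrac{1}{2}\right)
     = 1
$$

This is the maximum entropy for two classes, reached when both are equally likely.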

In [10]:
len(examples)

4