<a href="https://colab.research.google.com/github/ashokteja123/Machine-Learning/blob/main/ID3_Algorithm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Step 1: Import Necessary Libraries**

Import the libraries required for data handling and for the entropy calculation.



In [1]:
import pandas as pd
import numpy as np
from math import log2

**Step 2: Define Functions for Entropy and Information Gain**

**1. Entropy**

Entropy measures the impurity or disorder in a dataset. A pure dataset, where all instances belong to the same class, has an entropy of 0, while a dataset with an equal distribution among $c$ classes has the highest entropy, $\log_2 c$ (1 bit for two classes).



Formula:

$$H(S) = -\sum_{i=1}^{c} P(i)\,\log_2 P(i)$$

where

$H(S)$ = entropy of the dataset $S$

$c$ = number of classes in the dataset

$P(i)$ = proportion of instances in class $i$

**Calculating Proportions**

For a dataset with $n$ instances, of which $n_k$ belong to class $k$:

$$P(k) = \frac{n_k}{n}$$
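As a quick check of the two formulas above, the sketch below computes class proportions and entropy for a small list of labels. The helper names `class_proportions` and `entropy_from_labels` and the sample labels are illustrative assumptions, not part of the notebook's own functions.

```python
from collections import Counter
from math import log2

def class_proportions(labels):
    """Return {class: n_k / n} for a list of class labels."""
    n = len(labels)
    return {cls: count / n for cls, count in Counter(labels).items()}

def entropy_from_labels(labels):
    """H(S) = -sum(P(i) * log2(P(i))) over the classes present in labels."""
    return -sum(p * log2(p) for p in class_proportions(labels).values())

# Hypothetical labels: two 'No' and three 'Yes'
labels = ['No', 'No', 'Yes', 'Yes', 'Yes']
proportions = class_proportions(labels)
h = entropy_from_labels(labels)

print(proportions)   # {'No': 0.4, 'Yes': 0.6}
print(round(h, 3))   # 0.971
```

A mixed set like this one lands close to the 1-bit maximum for two classes, while a list containing a single class gives an entropy of 0.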
​

​



**2. Information Gain**

Information gain measures the reduction in entropy after splitting the dataset on an attribute. It identifies the attribute whose split leaves the purest subsets, that is, the one that removes the most uncertainty about the class label.



Formula:

$$IG(S, A) = H(S) - H(S_A), \qquad H(S_A) = \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$

Where:

$S$: Original dataset.

$A$: Attribute being evaluated for the split.

$H(S_A)$: Weighted entropy after splitting the dataset $S$ based on attribute $A$.

$S_v$: Subset of $S$ where attribute $A$ takes value $v$.
Example: If splitting the dataset by an attribute reduces the overall entropy from 0.97 to 0.58, the information gain is $0.97 - 0.58 = 0.39$.
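The same subtraction can be worked through on concrete labels. The sketch below is a dependency-free illustration, assuming a hypothetical five-row label set and a hypothetical two-way split; `entropy` and `information_gain` here are stand-in helpers that operate on plain lists rather than a DataFrame.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, subsets):
    """IG = H(S) minus the size-weighted entropy of the subsets that partition S."""
    n = len(parent_labels)
    weighted = sum(len(s) / n * entropy(s) for s in subsets)
    return entropy(parent_labels) - weighted

parent = ['Yes', 'Yes', 'Yes', 'No', 'No']

# Hypothetical split by an attribute with two values
split = [['Yes', 'Yes'], ['Yes', 'No', 'No']]
ig = information_gain(parent, split)
print(round(ig, 3))          # 0.42

# A split that separates the classes perfectly recovers all of H(S)
perfect = [['Yes', 'Yes', 'Yes'], ['No', 'No']]
print(round(information_gain(parent, perfect), 3))   # 0.971
```

The first split leaves one pure subset and one mixed subset, so only part of the original entropy is removed. The second leaves no uncertainty, so the gain equals the entropy of the parent set.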


**ID3 Algorithm Role**

1. **Calculate Entropy**: Compute the entropy for the dataset.

2. **Evaluate Attributes**: Compute the information gain for each attribute.

3. **Select Attribute**: The attribute with the highest information gain becomes the decision node, and the procedure repeats on each resulting subset.

Define the functions for entropy and information gain:

In [2]:
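def calculate_entropy(data):
    """Entropy of the class label, assumed to be the last column of data."""
    labels = data.iloc[:, -1].value_counts()   # count of rows per class
    total = len(data)
    # H(S) = -sum(p * log2(p)) over the class proportions p = count / total
    entropy = -sum((count / total) * log2(count / total) for count in labels)
    return entropy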

In [3]:
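def calculate_information_gain(data, attribute):
    """Reduction in entropy obtained by splitting data on attribute."""
    total_entropy = calculate_entropy(data)
    values = data[attribute].unique()
    weighted_entropy = 0

    for value in values:
        # Rows where the attribute takes this value
        subset = data[data[attribute] == value]
        # Weight each subset's entropy by its share of the rows
        weighted_entropy += (len(subset) / len(data)) * calculate_entropy(subset)

    return total_entropy - weighted_entropy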

**Step 3: Build the ID3 Algorithm**

In [6]:
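def id3(data, features):
    # Base case 1: every row has the same class, so return it as a leaf
    if len(data.iloc[:, -1].unique()) == 1:
        return data.iloc[0, -1]

    # Base case 2: no features left to split on, so return the majority class
    if len(features) == 0:
        return data.iloc[:, -1].mode()[0]

    # Greedy step: pick the feature with the highest information gain
    gains = {feature: calculate_information_gain(data, feature) for feature in features}
    best_feature = max(gains, key=gains.get)
    remaining_features = [feat for feat in features if feat != best_feature]

    # Recurse on the rows belonging to each observed value of the best feature
    tree = {best_feature: {}}
    for value in data[best_feature].unique():
        subset = data[data[best_feature] == value]
        tree[best_feature][value] = id3(subset, remaining_features)

    return tree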

**Step 4: Run ID3 on an Example Dataset**

In [7]:
data = pd.DataFrame({
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool'],
    'Humidity': ['High', 'High', 'High', 'Normal', 'Normal'],
    'Wind': ['Weak', 'Strong', 'Weak', 'Weak', 'Weak'],
    'PlayTennis': ['No', 'No', 'Yes', 'Yes', 'Yes']
})

features = list(data.columns[:-1])
tree = id3(data, features)
print(tree)

{'Outlook': {'Sunny': 'No', 'Overcast': 'Yes', 'Rainy': 'Yes'}}
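The nested dictionary printed above can also be used to classify new rows. The sketch below is one possible way to do that, not part of the original notebook: `predict` is a hypothetical helper, the tree literal is copied from the output above, and the `default` returned for unseen attribute values is an assumption.

```python
def predict(tree, sample, default=None):
    """Walk a nested-dict decision tree until a leaf (a non-dict value) is reached."""
    while isinstance(tree, dict):
        feature = next(iter(tree))        # each internal node holds exactly one feature key
        branches = tree[feature]
        value = sample.get(feature)
        if value not in branches:         # attribute value never seen during training
            return default
        tree = branches[value]
    return tree

# Tree copied from the printed output above
example_tree = {'Outlook': {'Sunny': 'No', 'Overcast': 'Yes', 'Rainy': 'Yes'}}

print(predict(example_tree, {'Outlook': 'Sunny', 'Wind': 'Weak'}))    # No
print(predict(example_tree, {'Outlook': 'Foggy'}, default='Unknown')) # Unknown
```

Because the tree only tests `Outlook`, the other attributes in the sample are ignored. A tree with deeper branches is handled by the same loop.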


**Step 5: Visualize the Decision Tree**

To make the decision tree easier to understand, visualize it with a library such as Graphviz.

---



In [8]:
pip install graphviz



In [9]:
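from graphviz import Digraph

def visualize_tree(tree, parent=None, graph=None):
    # Node IDs are the labels themselves, so repeated labels (e.g. 'Yes') share one node.
    # parent is kept for signature compatibility; edges are drawn from key.
    if graph is None:
        graph = Digraph()

    for key, value in tree.items():
        if isinstance(value, dict):
            graph.node(key, key)
            for sub_key in value:
                graph.edge(key, sub_key)
                visualize_tree({sub_key: value[sub_key]}, key, graph)
        else:
            # Leaf: attach the class label to the attribute-value node (key),
            # not to the attribute above it
            graph.node(value, value)
            graph.edge(key, value)
    return graph

visualize_tree(tree).view()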

'Digraph.gv.pdf'
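When Graphviz is not available, the same nested dictionary can be inspected as indented text. This is a minimal sketch under that assumption: `format_tree` is a hypothetical helper, and the tree literal is the one produced by the example dataset earlier.

```python
def format_tree(tree, indent=0):
    """Return a list of text lines describing a nested-dict decision tree."""
    pad = '  ' * indent
    if not isinstance(tree, dict):
        return [f"{pad}-> {tree}"]            # leaf: the predicted class
    lines = []
    for feature, branches in tree.items():
        for value, subtree in branches.items():
            lines.append(f"{pad}{feature} = {value}")
            lines.extend(format_tree(subtree, indent + 1))
    return lines

example_tree = {'Outlook': {'Sunny': 'No', 'Overcast': 'Yes', 'Rainy': 'Yes'}}
print('\n'.join(format_tree(example_tree)))
# Outlook = Sunny
#   -> No
# Outlook = Overcast
#   -> Yes
# Outlook = Rainy
#   -> Yes
```

Each test on an attribute value appears on its own line, with the resulting class or subtree indented beneath it.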


### Advantages of the ID3 Algorithm

1. **Simplicity and Interpretability**: Produces easy-to-understand decision trees, making them accessible to non-technical users.

2. **Efficient with Categorical Data**: Handles categorical attributes well without needing extra preprocessing.

3. **Greedy Approach**: Quickly constructs trees by selecting attributes with the highest information gain.

4. **Foundation for Advanced Algorithms**: Serves as a basis for more advanced decision tree algorithms like C4.5 and C5.0.

5. **Versatile Applications**: Applicable in various fields, including healthcare, finance, and marketing.

### Limitations of the ID3 Algorithm

1. **Overfitting**: Can create overly complex trees that fit noise in the training data, affecting generalization.

2. **Challenges with Continuous Data**: Struggles with continuous attributes, which need to be discretized, potentially losing information.

3. **Bias Towards Multi-Valued Attributes**: Information gain favors attributes with many unique values (an ID-like column splits the data into pure single-row subsets), even when those attributes are not the most relevant.

4. **No Pruning**: Lacks mechanisms to simplify trees, which can lead to overly large and complex models.

5. **Scalability Issues**: Faces difficulties with large datasets due to high computational and memory requirements.
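The bias towards multi-valued attributes can be seen on a tiny example. The sketch below compares plain information gain with gain ratio, the normalization used by C4.5 rather than ID3. The `row_id` column is a hypothetical ID-like attribute added for illustration, and the helpers work on plain lists instead of the notebook's DataFrame functions.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(values):
    """Entropy of the value distribution in a list."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def information_gain(attribute, labels):
    """Entropy of labels minus the weighted entropy after grouping by attribute."""
    groups = defaultdict(list)
    for a, y in zip(attribute, labels):
        groups[a].append(y)
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())

def gain_ratio(attribute, labels):
    """C4.5-style gain ratio: information gain divided by the attribute's own entropy."""
    split_info = entropy(attribute)
    return information_gain(attribute, labels) / split_info if split_info > 0 else 0.0

labels   = ['No', 'No', 'Yes', 'Yes', 'Yes']
humidity = ['High', 'High', 'High', 'Normal', 'Normal']
row_id   = ['r1', 'r2', 'r3', 'r4', 'r5']   # hypothetical ID-like column, unique per row

for name, column in [('Humidity', humidity), ('row_id', row_id)]:
    print(name, round(information_gain(column, labels), 3), round(gain_ratio(column, labels), 3))
```

Information gain ranks `row_id` well above `Humidity` because every single-row subset is pure, even though a row identifier cannot generalize to new data. Dividing by the split information narrows the gap and, in this small example, puts `Humidity` slightly ahead.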

