# Experiment 3
## Demonstration of the classification rule process on a dataset of your choice using the ID3 and J48 algorithms

**ID3 (Iterative Dichotomiser 3):**
1. **Initialization**: Start with the entire dataset and a root node for the decision tree.

2. **Attribute Selection**: Select the best attribute to split the data. ID3 uses the concept of information gain, which measures the reduction in uncertainty (entropy) achieved by splitting based on a particular attribute. It chooses the attribute with the highest information gain.

3. **Splitting**: Create one child node for each value of the selected attribute and partition the data accordingly. Classic ID3 assumes categorical attributes.

4. **Recursion**: Repeat the process for each child node, using only the instances that reach that node and the attributes not yet used on the path from the root.

5. **Stopping Criteria**: Stop expanding a node when all of its instances belong to the same class, or when there are no more attributes to split on, in which case the leaf takes the majority class.

6. **Tree Generation**: The result is a tree where each leaf node represents a class, and internal nodes represent attribute splits.

7. **Pruning**: ID3 has no pruning step, so the tree is grown in full and may overfit the training data.

8. **Prediction**: To classify a new instance, traverse the tree from the root to a leaf node based on attribute values, and the class of the leaf node is the predicted class.
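
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the toy weather-style dataset, the attribute names, and the tuple-based tree representation are all assumptions made for the example, and it handles categorical attributes only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the categorical attribute `attr`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def id3(rows, labels, attrs):
    """Return a leaf (a class label) or an internal node (attr, {value: subtree})."""
    if len(set(labels)) == 1:  # stopping criterion: pure node
        return labels[0]
    if not attrs:  # stopping criterion: no attributes left, use majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    children = {}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        children[value] = id3([rows[i] for i in idx], [labels[i] for i in idx], remaining)
    return (best, children)


def predict(tree, row):
    """Walk from the root to a leaf, following the branch for each attribute value."""
    while isinstance(tree, tuple):
        attr, children = tree
        tree = children[row[attr]]  # raises KeyError for a value unseen in training
    return tree


# Hypothetical toy data, invented for this illustration.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["play", "play", "play", "stay", "play", "play"]

tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
print(predict(tree, {"outlook": "rainy", "windy": "yes"}))
```

On this data `outlook` has the higher information gain, so it becomes the root, and only the `rainy` branch needs a further split on `windy`.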

**J48 (C4.5):**
J48 is Weka's Java implementation of C4.5, the successor to ID3. It works as follows:

1. **Initialization**: Start with the entire dataset and a root node for the decision tree.

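2. **Attribute Selection**: Instead of raw information gain, J48 uses the gain ratio: the information gain divided by the split information, which is the entropy of the branch sizes themselves. This penalizes attributes that split the data into many small branches, which plain information gain tends to favor.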

3. **Splitting**: As in ID3, create a child node for each value of a selected nominal attribute and divide the data. A numeric attribute instead produces two branches, one on each side of a threshold.

4. **Pruning**: J48 prunes the tree after growing it, replacing subtrees with leaves where the estimated error rate does not get worse. This helps avoid overfitting.

5. **Continuous Attributes**: J48 handles both discrete and continuous attributes. For a continuous attribute it chooses a threshold and makes a binary split (value <= threshold versus value > threshold) rather than discretizing the attribute in advance.

6. **Stopping Criteria**: Like ID3, J48 stops when certain criteria are met, such as pure leaf nodes or attribute exhaustion.

7. **Tree Generation**: The result is a decision tree, which may be pruned to improve generalization.

8. **Prediction**: To classify a new instance, traverse the tree from the root to a leaf node based on attribute values, and the class of the leaf node is the predicted class.
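
The gain ratio and the threshold search for continuous attributes can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names `gain_ratio` and `best_threshold` and the sample values are hypothetical, candidate thresholds are taken as midpoints between consecutive distinct values, and each candidate is scored directly by gain ratio. The actual C4.5 and J48 implementations differ in these details.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(partitions):
    """Gain ratio of a split, given the class labels that fall into each branch."""
    partitions = [p for p in partitions if p]  # ignore empty branches
    labels = [label for p in partitions for label in p]
    total = len(labels)
    gain = entropy(labels) - sum(len(p) / total * entropy(p) for p in partitions)
    split_info = -sum(len(p) / total * log2(len(p) / total) for p in partitions)
    return gain / split_info if split_info > 0 else 0.0


def best_threshold(values, labels):
    """Return (threshold, gain_ratio) for the best binary split of a numeric attribute."""
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for (low, _), (high, _) in zip(pairs, pairs[1:]):
        if low == high:  # no threshold fits between equal values
            continue
        threshold = (low + high) / 2
        left = [c for v, c in pairs if v <= threshold]
        right = [c for v, c in pairs if v > threshold]
        ratio = gain_ratio([left, right])
        if ratio > best[1]:
            best = (threshold, ratio)
    return best


# An ID-like attribute that isolates every instance has the same information
# gain as a clean two-way split, but a lower gain ratio.
many_branches = gain_ratio([["a"], ["a"], ["b"], ["b"]])
two_branches = gain_ratio([["a", "a"], ["b", "b"]])
print(many_branches, two_branches)

# Hypothetical numeric attribute with a clean class boundary between 3.0 and 4.0.
threshold, ratio = best_threshold([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                                  ["a", "a", "a", "b", "b", "b"])
print(threshold, ratio)
```

The first comparison shows why gain ratio is preferred over information gain when attributes differ in their number of values: both splits separate the classes perfectly, but the four-way split pays for its larger split information.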

In summary, ID3 and J48/C4.5 are decision tree algorithms that build trees to make predictions based on input data. J48 improves upon ID3 by using gain ratio for attribute selection, handling continuous attributes, and supporting pruning to reduce overfitting.
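
The two attribute-selection criteria named in the summary can be written out explicitly. The notation is an assumption introduced here: $S$ is the set of instances at a node, $p_c$ is the fraction of $S$ belonging to class $c$, and $S_v$ is the subset of $S$ for which attribute $A$ takes the value $v$.

$$
H(S) = -\sum_{c} p_c \log_2 p_c
$$

$$
\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
$$

$$
\mathrm{SplitInfo}(S, A) = -\sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|}
$$

$$
\mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}
$$

ID3 picks the attribute that maximizes $\mathrm{Gain}(S, A)$, while C4.5 maximizes $\mathrm{GainRatio}(S, A)$. Because $\mathrm{SplitInfo}$ grows with the number of branches, attributes with many values are penalized.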

In [None]:
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text, export_graphviz
from sklearn.model_selection import train_test_split
import graphviz

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

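# ID3-style tree. Note: scikit-learn implements an optimized CART algorithm, not ID3.
# criterion='entropy' selects splits by information gain, but every split is a binary
# threshold on a numeric feature and the tree is not pruned by default.
id3_classifier = DecisionTreeClassifier(criterion='entropy', random_state=42)
id3_classifier.fit(X_train, y_train)

# Stand-in for J48 (C4.5). Note: scikit-learn offers neither gain ratio nor C4.5's
# error-based pruning. criterion='gini' is the CART default, so this tree only
# approximates what Weka's J48 would produce.
j48_classifier = DecisionTreeClassifier(criterion='gini', random_state=42)
j48_classifier.fit(X_train, y_train)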

In [None]:
# Print the ID3 decision tree rules
id3_rules = export_text(id3_classifier, feature_names=iris.feature_names)
print("ID3 Decision Tree Rules:\n", id3_rules)

# Print the J48 (C4.5) decision tree rules
j48_rules = export_text(j48_classifier, feature_names=iris.feature_names)
print("\nJ48 (C4.5) Decision Tree Rules:\n", j48_rules)

ID3 Decision Tree Rules:
 |--- petal length (cm) <= 2.45
|   |--- class: 0
|--- petal length (cm) >  2.45
|   |--- petal length (cm) <= 4.75
|   |   |--- petal width (cm) <= 1.60
|   |   |   |--- class: 1
|   |   |--- petal width (cm) >  1.60
|   |   |   |--- class: 2
|   |--- petal length (cm) >  4.75
|   |   |--- petal length (cm) <= 5.15
|   |   |   |--- petal width (cm) <= 1.75
|   |   |   |   |--- sepal width (cm) <= 2.35
|   |   |   |   |   |--- class: 2
|   |   |   |   |--- sepal width (cm) >  2.35
|   |   |   |   |   |--- petal length (cm) <= 5.05
|   |   |   |   |   |   |--- class: 1
|   |   |   |   |   |--- petal length (cm) >  5.05
|   |   |   |   |   |   |--- sepal length (cm) <= 6.15
|   |   |   |   |   |   |   |--- class: 1
|   |   |   |   |   |   |--- sepal length (cm) >  6.15
|   |   |   |   |   |   |   |--- class: 2
|   |   |   |--- petal width (cm) >  1.75
|   |   |   |   |--- sepal width (cm) <= 3.10
|   |   |   |   |   |--- class: 2
|   |   |   |   |--- sepal width 

In [None]:
# Visualize the ID3 decision tree (requires the graphviz Python package and the Graphviz system binaries)
dot_data_id3 = export_graphviz(id3_classifier, out_file=None, feature_names=iris.feature_names, filled=True, rounded=True)
graph_id3 = graphviz.Source(dot_data_id3)
graph_id3.render("ID3_decision_tree")

# Visualize the J48 (C4.5) decision tree (requires graphviz)
dot_data_j48 = export_graphviz(j48_classifier, out_file=None, feature_names=iris.feature_names, filled=True, rounded=True)
graph_j48 = graphviz.Source(dot_data_j48)
graph_j48.render("J48_decision_tree")

'J48_decision_tree.pdf'