1. What is a Decision Tree, and how does it work in the context of
classification?
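  - A Decision Tree is a supervised machine learning algorithm used for classification and regression. In classification, it predicts a class label by learning a sequence of if–then rules from the training data. Each internal node tests one feature against a threshold or category, each branch corresponds to an outcome of that test, and each leaf assigns a class label (typically the majority class of the training samples that reach it). The tree is built top-down by repeatedly choosing the split that most reduces impurity in the resulting child nodes.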
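
To make the "sequence of if–then rules" concrete, here is a minimal hand-written sketch of what a small tree for the Iris data could look like once it is flattened into code. The function name `classify_iris` and the thresholds 2.45 and 1.75 are illustrative assumptions, not values taken from the model trained later in this notebook.

```python
def classify_iris(petal_length_cm, petal_width_cm):
    """Hand-written stand-in for a depth-2 decision tree (illustrative thresholds)."""
    # Root node: a single test on petal length separates one class outright.
    if petal_length_cm <= 2.45:
        return "setosa"
    # Second node: the remaining samples are split on petal width.
    if petal_width_cm <= 1.75:
        return "versicolor"
    # Leaf reached when both tests fail.
    return "virginica"


print(classify_iris(1.4, 0.2))
print(classify_iris(4.5, 1.4))
print(classify_iris(5.8, 2.2))
```

Each `if` plays the role of an internal node and each `return` plays the role of a leaf; a trained tree differs only in that the features and thresholds are chosen from the data.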

2. Explain the concepts of Gini Impurity and Entropy as impurity measures.
How do they impact the splits in a Decision Tree?
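  - In decision tree classification, impurity measures quantify how mixed the class labels are in a node. For a node whose classes occur with proportions p_1, ..., p_k, Gini impurity is 1 - sum(p_i^2) and entropy is -sum(p_i * log2(p_i)). Both are 0 for a pure node and largest when the classes are evenly mixed. At each node the algorithm evaluates candidate splits and picks the one that reduces the weighted impurity of the child nodes the most, creating purer child nodes. In practice the two measures usually select similar splits; Gini is slightly cheaper to compute because it avoids the logarithm.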
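
A short sketch of both measures, computed directly from a list of class labels. The helper names (`class_proportions`, `gini_impurity`, `entropy`) are hypothetical and chosen for this example; scikit-learn computes these internally.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class in the node."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Entropy in bits: -sum(p_i * log2(p_i))."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


pure_node = ["a", "a", "a", "a"]
mixed_node = ["a", "a", "b", "b"]

print(gini_impurity(pure_node), entropy(pure_node))    # 0.0 0.0
print(gini_impurity(mixed_node), entropy(mixed_node))  # 0.5 1.0
```

A pure node scores 0 under both measures, while an even two-class mix reaches the two-class maximum of each (0.5 for Gini, 1 bit for entropy).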


3. What is the difference between Pre-Pruning and Post-Pruning in Decision
Trees? Give one practical advantage of using each.
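  - Both pre-pruning and post-pruning control overfitting in decision trees by limiting model complexity, but they differ in when pruning is applied. Pre-pruning (early stopping) halts growth while the tree is being built, using limits such as a maximum depth or a minimum number of samples per split or leaf. Its practical advantage is cost: training is faster because the full tree is never built. Post-pruning first grows the full tree and then removes branches that contribute little, for example with cost-complexity pruning. Its practical advantage is that every split is judged in the context of the whole tree, so it is less likely to discard a split that only pays off deeper down.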
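
The two approaches can be sketched with scikit-learn as follows. This is an illustration under stated assumptions: the limits `max_depth=3` and `min_samples_leaf=5` are arbitrary, and the pruning strength is simply taken from the middle of the cost-complexity path, whereas in practice it would normally be chosen by cross-validation.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Pre-pruning: growth is stopped early by limits set before training.
pre_pruned = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=5, random_state=42
).fit(X_train, y_train)

# Post-pruning: grow the full tree first, then prune it back using
# cost-complexity pruning (the ccp_alpha parameter).
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Assumption for illustration only: take the middle alpha on the path.
alpha = max(0.0, path.ccp_alphas[len(path.ccp_alphas) // 2])
post_pruned = DecisionTreeClassifier(
    ccp_alpha=alpha, random_state=42
).fit(X_train, y_train)

for name, tree in [("full", full_tree), ("pre-pruned", pre_pruned),
                   ("post-pruned", post_pruned)]:
    print(f"{name}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}, "
          f"test accuracy={tree.score(X_test, y_test):.4f}")
```

Comparing depth and leaf count across the three trees shows how much each strategy simplifies the model; the test accuracy shows what that simplification costs or gains on this particular split.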

4. What is Information Gain in Decision Trees, and why is it important for
choosing the best split?
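  - Information Gain (IG) is a criterion used in decision tree classification to decide which feature and split should be chosen at each node. It measures how much uncertainty (impurity) is reduced after a dataset is split on a particular feature: IG = entropy(parent) - weighted average of entropy(children), where each child is weighted by its share of the parent's samples. It matters because the tree is built greedily: at every node the candidate split with the highest Information Gain is selected, so the most informative features end up near the root and the tree separates the classes in as few tests as possible.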
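
A small worked example of the calculation, assuming entropy as the impurity measure. The function names and the toy "yes"/"no" labels are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of the class labels in one node."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted_child_entropy = sum(
        len(child) / total * entropy(child) for child in children
    )
    return entropy(parent) - weighted_child_entropy


parent = ["yes"] * 4 + ["no"] * 4

# A split that separates the classes completely.
perfect_split = [["yes"] * 4, ["no"] * 4]
# A split whose children are just as mixed as the parent.
useless_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

The perfect split removes all of the parent's 1 bit of uncertainty, while the useless split removes none, so a tree learner comparing the two would pick the first.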

5. What are some common real-world applications of Decision Trees, and
what are their main advantages and limitations?
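  - Decision Trees are widely used because they are simple, interpretable, and effective for both classification and regression tasks. Common applications include credit risk scoring, medical diagnosis support, customer churn prediction, and fraud detection. Their main advantages are that the learned rules are easy to read and explain, they need little data preparation (no feature scaling), and they can capture non-linear relationships. Their main limitations are a strong tendency to overfit when grown without constraints, instability (small changes in the training data can produce a very different tree), and greedy split selection, which does not guarantee a globally optimal tree.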

6. Write a Python program to:
● Load the Iris Dataset
● Train a Decision Tree Classifier using the Gini criterion
● Print the model’s accuracy and feature importances

In [2]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# --- Fully-grown Decision Tree Classifier (criterion='gini') ---
decision_tree_full = DecisionTreeClassifier(criterion='gini', random_state=42)
decision_tree_full.fit(X_train, y_train)

# Make predictions on the test set for the fully-grown tree
y_pred_full = decision_tree_full.predict(X_test)

# Print the model's accuracy for the fully-grown tree
accuracy_full = accuracy_score(y_test, y_pred_full)
print(f"Fully-grown Decision Tree Accuracy: {accuracy_full:.4f}")

# --- Decision Tree Classifier with max_depth=3 (criterion='gini') ---
decision_tree_limited = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=42)
decision_tree_limited.fit(X_train, y_train)

# Make predictions on the test set for the limited depth tree
y_pred_limited = decision_tree_limited.predict(X_test)

# Print the model's accuracy for the limited depth tree
accuracy_limited = accuracy_score(y_test, y_pred_limited)
print(f"Decision Tree (max_depth=3) Accuracy: {accuracy_limited:.4f}")

# Print feature importances for the fully-grown tree (the Gini model the task asks for)
print("\nFeature Importances (Fully-grown tree):")
for feature, importance in zip(iris.feature_names, decision_tree_full.feature_importances_):
    print(f"  {feature}: {importance:.4f}")

Fully-grown Decision Tree Accuracy: 1.0000
Decision Tree (max_depth=3) Accuracy: 1.0000

Feature Importances (Fully-grown tree):
  sepal length (cm): 0.0000
  sepal width (cm): 0.0191
  petal length (cm): 0.8933
  petal width (cm): 0.0876
