Decision Tree
#Theoretical
1. What is a Decision Tree, and how does it work?
- A Decision Tree is a supervised machine learning algorithm used for classification and regression tasks. It is built as follows:

Start with the entire dataset.

Choose the best feature to split the data. The goal is to create the most “pure” child nodes, judged by metrics such as:

Gini Impurity

Entropy (Information Gain)

Variance Reduction (for regression)

Split the data into subsets based on the feature value.

Repeat the process recursively on each subset until a stopping condition is met (for example, the node is pure or too small to split).
2. What are impurity measures in Decision Trees?
- Impurity measures are metrics that quantify how mixed (impure) the classes are in a node. A decision tree chooses splits that reduce impurity, so that the child nodes are as pure (homogeneous) as possible.

3. What is the mathematical formula for Gini Impurity?
- For a node t with C classes, the Gini Impurity is defined as:

$$\text{Gini}(t) = 1 - \sum_{i=1}^{C} p_i^2$$
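
The Gini formula can be checked on small label lists. The sketch below assumes p_i is the fraction of samples in the node that belong to class i; the function name `gini_impurity` and the choice to treat an empty node as pure are illustrative, not taken from any library.

```python
from collections import Counter

def gini_impurity(labels):
    """Return 1 - sum(p_i^2), where p_i is the share of class i in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption made here: an empty node counts as pure
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

print(gini_impurity(["a", "a", "a", "a"]))  # pure node -> 0.0
print(gini_impurity(["a", "a", "b", "b"]))  # evenly mixed, two classes -> 0.5
```

A pure node scores 0, and the score grows as the classes become more evenly mixed.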
4. What is the mathematical formula for Entropy?
- The Entropy of a node t, which measures the level of disorder or impurity, is given by:

$$\text{Entropy}(t) = -\sum_{i=1}^{C} p_i \log_2(p_i)$$
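
As a rough illustration of the entropy formula, the sketch below computes it from a list of class labels, again assuming p_i is the fraction of samples in class i. The helper name `entropy` is hypothetical; classes that do not appear in the node are simply skipped, which matches the usual convention that 0 · log2(0) counts as 0.

```python
import math
from collections import Counter

def entropy(labels):
    """Return -sum(p_i * log2(p_i)) over the classes present in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption made here: an empty node has no disorder
    counts = Counter(labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(entropy(["a", "a", "a", "a"]))  # pure node
print(entropy(["a", "a", "b", "b"]))  # evenly mixed, two classes
```

For two classes the value runs from 0 (pure) to 1 bit (a 50/50 mix).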
5. What is Information Gain, and how is it used in Decision Trees?
- Information Gain (IG) is a metric used to measure how well a feature splits the data in a decision tree. It quantifies the reduction in entropy after a dataset is split on a particular feature.

In other words, it tells us how much “information” a feature gives us about the class, or how much uncertainty it removes. At each node, the tree evaluates the candidate splits and chooses the one with the highest information gain.
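
Information Gain can be written as the parent's entropy minus the size-weighted average entropy of the children. The sketch below follows that definition; the function names and the toy "yes"/"no" labels are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # children are pure
useless_split = [["yes", "no"], ["yes", "no"]]   # children as mixed as the parent

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

The perfect split removes all uncertainty (gain equal to the parent's entropy), while the useless split removes none.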
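6. What is the difference between Gini Impurity and Entropy?
- Both measure how mixed the classes in a node are, but they are computed differently. Gini Impurity is 1 minus the sum of the squared class proportions, while Entropy is the negative sum of each class proportion multiplied by its base-2 logarithm. For a two-class node, Gini ranges from 0 to 0.5 and Entropy from 0 to 1, and both reach their maximum when the classes are evenly balanced.

In practice the two criteria usually produce very similar trees. Gini is slightly cheaper to compute because it avoids logarithms, and it is the default criterion in scikit-learn's DecisionTreeClassifier.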
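
To compare Gini Impurity and Entropy numerically, the sketch below evaluates both for a two-class node where p is the share of one class. The helper names `binary_gini` and `binary_entropy` are hypothetical.

```python
import math

def binary_gini(p):
    """Gini impurity of a two-class node with class shares p and 1 - p."""
    return 1.0 - (p ** 2 + (1.0 - p) ** 2)

def binary_entropy(p):
    """Entropy (in bits) of a two-class node with class shares p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has zero entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

for p in (0.0, 0.1, 0.3, 0.5):
    print(f"p={p:.1f}  gini={binary_gini(p):.3f}  entropy={binary_entropy(p):.3f}")
```

Both measures are 0 for a pure node and largest at p = 0.5; they differ mainly in scale and in how steeply they rise.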
7. What is the mathematical explanation behind Decision Trees?
- A Decision Tree is a tree-structured model used to make predictions by recursively splitting the data into subsets based on feature values. At each node, the tree selects the best feature and threshold that reduces impurity the most.
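
The search for the best threshold at a node can be sketched as follows. This is a simplified illustration for a single numeric feature, assuming Gini impurity as the criterion and midpoints between distinct values as candidate thresholds; the names `gini` and `best_threshold` are hypothetical, and real libraries use faster implementations.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form x <= threshold.

    Returns (None, parent_gini) if no candidate improves on the unsplit node.
    """
    n = len(values)
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighbouring values
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]
print(best_threshold(values, labels))  # -> (6.5, 0.0)
```

A full tree repeats this search over every feature at every node and recurses on the two resulting subsets.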

8. What is Pre-Pruning in Decision Trees?
- Pre-pruning (also called early stopping) is a technique used to prevent overfitting in decision trees by stopping the tree growth early, before it perfectly classifies the training data.

Instead of growing the full tree and then trimming it (as in post-pruning), pre-pruning halts the split process during tree construction based on certain conditions.
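
A pre-pruning check is typically evaluated before each split. The sketch below is one possible form; the function name `should_stop` and the default thresholds (10 samples, depth 5, gain 0.01) are illustrative values, not fixed rules.

```python
def should_stop(n_samples, depth, best_gain,
                min_samples=10, max_depth=5, min_gain=0.01):
    """Return True if any early-stopping condition holds for the current node."""
    too_few_samples = n_samples < min_samples   # node is too small to split
    too_deep = depth > max_depth                # tree has grown deep enough
    gain_too_small = best_gain < min_gain       # best split barely helps
    return too_few_samples or too_deep or gain_too_small

print(should_stop(n_samples=8, depth=2, best_gain=0.20))   # True: too few samples
print(should_stop(n_samples=50, depth=2, best_gain=0.20))  # False: keep splitting
```

When the check returns True, the node becomes a leaf instead of being split further.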
9. What is Post-Pruning in Decision Trees?
- Post-pruning (also called backward pruning) is a technique used to reduce overfitting in decision trees by growing the full tree first and then removing branches that do not contribute significantly to predictive power. Cost-complexity pruning is one widely used post-pruning method.

Unlike pre-pruning (which stops the tree early), post-pruning allows the tree to grow fully and then simplifies it afterward.
10. What is the difference between Pre-Pruning and Post-Pruning?
- Both pre-pruning and post-pruning are techniques used to prevent overfitting in decision trees by controlling their growth. However, they differ in when and how they prune the tree.

Pre-Pruning (applied while the tree is being built), for example:

Stop if a node has < 10 samples

Stop if depth > 5

Stop if information gain < 0.01

Post-Pruning (applied after the full tree has been grown), for example:

Remove subtrees if validation accuracy doesn't decrease

Use cost-complexity pruning (minimize error + α × tree size)
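
The cost-complexity criterion (error + α × tree size) can be illustrated with a small comparison. In the sketch below, tree size is taken to be the number of leaves, and the error rates and leaf counts are made-up numbers for two hypothetical candidate trees.

```python
def cost_complexity(error, n_leaves, alpha):
    """Penalised score: error + alpha * tree size (lower is better)."""
    return error + alpha * n_leaves

# Hypothetical candidates: (error rate, number of leaves)
candidates = {
    "full tree": (0.02, 20),
    "pruned tree": (0.05, 6),
}

alpha = 0.01
scores = {name: cost_complexity(err, leaves, alpha)
          for name, (err, leaves) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("Preferred:", best)
```

With α = 0 the full tree would win on error alone; a larger α shifts the preference toward smaller trees.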
11. What is a Decision Tree Regressor?
- A Decision Tree Regressor is a decision tree used for regression tasks, that is, predicting continuous numeric values instead of categories.

The model splits the dataset into regions based on feature values.

Each split is chosen to minimize the prediction error (e.g., variance or mean squared error) in the resulting subsets. The prediction for a region is typically the mean of the training targets that fall in it.
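
For regression, the quality of a split can be measured as the drop in variance from the parent node to its children. The sketch below assumes population variance and size-weighted averaging; the function names and target values are illustrative.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variance of the two children."""
    n = len(parent)
    weighted = (len(left) * variance(left) + len(right) * variance(right)) / n
    return variance(parent) - weighted

targets = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
left, right = targets[:3], targets[3:]

print(variance_reduction(targets, left, right))  # -> 4.0
print(sum(left) / len(left), sum(right) / len(right))  # leaf predictions: 1.0 5.0
```

Here the split separates the two groups of targets completely, so all of the parent's variance is removed.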
12. What are the advantages and disadvantages of Decision Trees?
- Advantages of Decision Trees
Easy to Understand and Interpret

The tree structure is intuitive and can be visualized.

Even non-experts can follow the decision rules.

Handles Both Numerical and Categorical Data

Works well with mixed data types without extensive preprocessing.

Requires Little Data Preparation

No need for feature scaling or normalization.

Can handle missing values (some implementations)

Disadvantages of Decision Trees
Prone to Overfitting

Trees can become very deep and fit noise in training data.

Requires pruning or other regularization techniques.

Unstable to Small Changes in Data

Slight variations in data can cause big changes in the tree structure.

Biased Towards Features with More Levels

Features with many distinct values may dominate splits unfairly.

Less Accurate than Some Other Methods

Often outperformed by ensemble methods like Random Forests or Gradient Boosting.
13. How does a Decision Tree handle missing values?
- Decision trees can handle missing data during both training and prediction, though the exact approach depends on the implementation.

During Training
Ignore samples with missing values for the splitting feature: when deciding the best split on a feature, some algorithms consider only the samples where that feature is present.
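14. How does a Decision Tree handle categorical features?
- Decision Trees can naturally work with categorical features, but the way they split on these features depends on the algorithm and implementation. Some algorithms (e.g., ID3 and C4.5) create one branch per category, while CART-style trees use binary splits that send a subset of the categories to each side. Implementations that accept only numeric input need the categories encoded first, for example with one-hot or ordinal encoding.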
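
For implementations that expect numeric input (scikit-learn's tree estimators are assumed to be one such case), a categorical column is usually encoded before training. The sketch below shows a minimal one-hot encoding in plain Python; the function name `one_hot` and the colour values are made up for illustration.

```python
def one_hot(values):
    """Return (categories, rows): one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colors)

print(categories)  # -> ['blue', 'green', 'red']
for row in encoded:
    print(row)
```

Each resulting column can then be split on with a simple threshold (for example, "is_green <= 0.5").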
15. What are some real-world applications of Decision Trees?
- Healthcare

Disease Diagnosis: Classify patients as having or not having a disease based on symptoms, test results, and medical history.

Treatment Recommendations: Decide the best treatment path based on patient data.

Predicting Patient Outcomes: Regression trees predict length of hospital stay or readmission risk.

Finance

Credit Scoring: Approve or reject loan applications by analyzing applicant’s financial history and demographics.

Fraud Detection: Identify fraudulent transactions based on patterns in transaction data.

Risk Management: Estimate risk levels for investments or insurance underwriting.

#Practical
1. Write a Python program to train a Decision Tree Classifier on the Iris dataset and print the model accuracy.


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train and test sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)

# Train the model
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")


2. Write a Python program to train a Decision Tree Classifier using Gini Impurity as the criterion and print the
feature importances


In [None]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Initialize Decision Tree Classifier with Gini impurity criterion
clf = DecisionTreeClassifier(criterion='gini', random_state=42)

# Train the model
clf.fit(X, y)

# Print feature importances
feature_names = iris.feature_names
importances = clf.feature_importances_

print("Feature Importances:")
for name, importance in zip(feature_names, importances):
    print(f"{name}: {importance:.4f}")


3. Write a Python program to train a Decision Tree Classifier using Entropy as the splitting criterion and print the
model accuracy.


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split dataset into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize Decision Tree Classifier with Entropy criterion
clf = DecisionTreeClassifier(criterion='entropy', random_state=42)

# Train the model
clf.fit(X_train, y_train)

# Predict on the test data
y_pred = clf.predict(X_test)

# Calculate and print accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy using Entropy: {accuracy:.2f}")


4. Write a Python program to train a Decision Tree Regressor on a housing dataset and evaluate using Mean
Squared Error (MSE)


In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Load the California housing dataset
housing = fetch_california_housing()
X = housing.data
y = housing.target

# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

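# Initialize the Decision Tree Regressor
regressor = DecisionTreeRegressor(random_state=42)

# Train the model
regressor.fit(X_train, y_train)

# Predict on the test set
y_pred = regressor.predict(X_test)

# Evaluate using Mean Squared Error
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")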



5. Write a Python program to train a Decision Tree Classifier and visualize the tree using graphviz.


In [None]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz
import graphviz

# Load data
iris = load_iris()
X = iris.data
y = iris.target

# Train Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X, y)

# Export tree to DOT format
dot_data = export_graphviz(
    clf,
    out_file=None,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    filled=True,
    rounded=True,
    special_characters=True
)

# Create graph from DOT data
graph = graphviz.Source(dot_data)

# Render the graph to disk (the graphviz package defaults to PDF output)
graph.render("iris_decision_tree")

# Open the rendered file in the system viewer (needs a desktop environment)
graph.view()

# If running in Jupyter, display inline instead by ending the cell with:
# graph


6. Write a Python program to train a Decision Tree Classifier with a maximum depth of 3 and compare its
accuracy with a fully grown tree

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train and test sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Decision Tree with max depth = 3
clf_pruned = DecisionTreeClassifier(max_depth=3, random_state=42)
clf_pruned.fit(X_train, y_train)
y_pred_pruned = clf_pruned.predict(X_test)
accuracy_pruned = accuracy_score(y_test, y_pred_pruned)

# Fully grown Decision Tree (no max depth limit)
clf_full = DecisionTreeClassifier(random_state=42)
clf_full.fit(X_train, y_train)
y_pred_full = clf_full.predict(X_test)
accuracy_full = accuracy_score(y_test, y_pred_full)

print(f"Accuracy with max depth 3: {accuracy_pruned:.2f}")
print(f"Accuracy with fully grown tree: {accuracy_full:.2f}")


7. Write a Python program to train a Decision Tree Classifier using min_samples_split=5 and compare its
accuracy with a default tree.


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train and test sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Decision Tree with min_samples_split=5
clf_min_split = DecisionTreeClassifier(min_samples_split=5, random_state=42)
clf_min_split.fit(X_train, y_train)
y_pred_min_split = clf_min_split.predict(X_test)
accuracy_min_split = accuracy_score(y_test, y_pred_min_split)

# Default Decision Tree
clf_default = DecisionTreeClassifier(random_state=42)
clf_default.fit(X_train, y_train)
y_pred_default = clf_default.predict(X_test)
accuracy_default = accuracy_score(y_test, y_pred_default)

print(f"Accuracy with min_samples_split=5: {accuracy_min_split:.2f}")
print(f"Accuracy with default parameters: {accuracy_default:.2f}")



8. Write a Python program to apply feature scaling before training a Decision Tree Classifier and compare its
accuracy with unscaled data


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# ----------- Without Scaling -----------
clf_unscaled = DecisionTreeClassifier(random_state=42)
clf_unscaled.fit(X_train, y_train)
y_pred_unscaled = clf_unscaled.predict(X_test)
acc_unscaled = accuracy_score(y_test, y_pred_unscaled)

# ----------- With Scaling -----------
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

clf_scaled = DecisionTreeClassifier(random_state=42)
clf_scaled.fit(X_train_scaled, y_train)
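y_pred_scaled = clf_scaled.predict(X_test_scaled)
acc_scaled = accuracy_score(y_test, y_pred_scaled)

# Compare the two results
print(f"Accuracy without scaling: {acc_unscaled:.2f}")
print(f"Accuracy with scaling: {acc_scaled:.2f}")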


9. Write a Python program to train a Decision Tree Classifier using One-vs-Rest (OvR) strategy for multiclass
classification

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize One-vs-Rest with Decision Tree Classifier
ovr_clf = OneVsRestClassifier(DecisionTreeClassifier(random_state=42))

# Train the model
ovr_clf.fit(X_train, y_train)

# Predict on test data
y_pred = ovr_clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy of OvR Decision Tree Classifier: {accuracy:.2f}")


10. Write a Python program to train a Decision Tree Classifier and display the feature importance scores

In [None]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Initialize and train the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X, y)

# Get feature importances
importances = clf.feature_importances_
feature_names = iris.feature_names

# Display feature importances
print("Feature Importances:")
for name, importance in zip(feature_names, importances):
    print(f"{name}: {importance:.4f}")


11. Write a Python program to train a Decision Tree Regressor with max_depth=5 and compare its performance
with an unrestricted tree

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Load California housing dataset
housing = fetch_california_housing()
X = housing.data
y = housing.target

# Split into train and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision Tree Regressor with max_depth=5
regressor_restricted = DecisionTreeRegressor(max_depth=5, random_state=42)
regressor_restricted.fit(X_train, y_train)
y_pred_restricted = regressor_restricted.predict(X_test)
mse_restricted = mean_squared_error(y_test, y_pred_restricted)

# Unrestricted Decision Tree Regressor
regressor_unrestricted = DecisionTreeRegressor(random_state=42)
regressor_unrestricted.fit(X_train, y_train)
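y_pred_unrestricted = regressor_unrestricted.predict(X_test)
mse_unrestricted = mean_squared_error(y_test, y_pred_unrestricted)

# Compare the two results
print(f"MSE with max_depth=5: {mse_restricted:.4f}")
print(f"MSE with unrestricted tree: {mse_unrestricted:.4f}")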
