In [None]:
# What is a Decision Tree, and how does it work?

A Decision Tree is a supervised machine learning algorithm used for classification and regression tasks. It splits the data into smaller subsets based on feature values, resulting in a tree-like structure where each internal node represents a decision based on a feature, branches represent the outcome of the decision, and leaves represent the final prediction.

In [None]:
# What are impurity measures in Decision Trees?

Impurity measures help decide how to split the data at each node. A pure node contains only one class, while an impure node contains a mix of classes. The goal is to minimize impurity when splitting.

In [None]:
# What is the mathematical formula for Gini Impurity?

Gini Impurity measures how often a randomly chosen element from a node would be misclassified if it were labeled at random according to the class distribution in that node. Its formula is:

$$\text{Gini} = 1 - \sum_{i=1}^{n} p_i^2$$

where $p_i$ is the probability of class $i$.
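
The Gini formula can be checked numerically with a few lines of plain Python. This is a minimal sketch: the helper name `gini_impurity` and the toy label lists are made up for illustration, and returning 0.0 for an empty node is an assumed convention.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention for an empty node
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

pure = ["a", "a", "a", "a"]    # only one class
mixed = ["a", "a", "b", "b"]   # two classes, evenly mixed

print(gini_impurity(pure))   # 0.0
print(gini_impurity(mixed))  # 0.5
```

For two classes, 0.5 is the largest value Gini Impurity can take, which is why the evenly mixed node scores worst.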

In [None]:
# What is the mathematical formula for Entropy?

Entropy measures the disorder or randomness in the data. Its formula is:

$$\text{Entropy} = -\sum_{i=1}^{n} p_i \log_2(p_i)$$

where $p_i$ is the probability of class $i$.
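
Entropy can be computed the same way. The sketch below assumes a hypothetical helper called `entropy` and toy label lists. Classes that do not occur in the node are simply never counted, so `log2(0)` is never evaluated.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumed convention for an empty node
    counts = Counter(labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

pure = ["a", "a", "a", "a"]    # a pure node has zero entropy
mixed = ["a", "a", "b", "b"]   # an even two-class mix has one bit of entropy

print(entropy(pure))
print(entropy(mixed))  # 1.0
```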

In [None]:
# What is Information Gain, and how is it used in Decision Trees?

Information Gain measures how much a feature helps reduce uncertainty (impurity). It’s the difference between the impurity of the parent node and the weighted impurity of the child nodes. Decision Trees use Information Gain to decide which feature to split on.

$$\text{Information Gain} = \text{Entropy(Parent)} - \text{Weighted Entropy(Children)}$$
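
As a worked example of this formula, the sketch below computes Information Gain for two candidate splits of the same parent node. The function names and the toy labels are assumptions made for illustration. Each child is weighted by its share of the parent's samples.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]               # each child is pure
useless_split = [["yes", "yes", "no", "no"]] * 2        # children look like the parent

print(information_gain(parent, perfect_split))  # 1.0
print(information_gain(parent, useless_split))  # 0.0
```

A split that separates the classes completely recovers all of the parent's entropy, while a split whose children mirror the parent gains nothing.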

In [None]:
# Difference between Gini Impurity and Entropy?

Gini Impurity: Cheaper to compute because it needs no logarithm; tends to favor larger partitions.

Entropy: Based on information theory; it behaves similarly but can be more sensitive to skewed class distributions. Both measure node impurity and usually produce similar trees, so neither is more accurate in general; the better choice depends on the dataset.

In [None]:
# Mathematical explanation behind Decision Trees?

A Decision Tree is built by greedy, recursive binary splitting. At each node, the algorithm evaluates every candidate split of every feature and chooses the one that most reduces impurity, that is, the split with the highest Information Gain (Entropy) or the lowest weighted Gini Impurity of the child nodes. It then repeats the process on each child until the node is pure or a stopping condition is met.
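
The search for a single split can be sketched in plain Python. This is an illustration under stated assumptions, not how any particular library implements it: the function `best_split` is hypothetical, features are numeric, candidate thresholds are the midpoints between consecutive distinct values, and Gini Impurity is the criterion.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (weighted_gini, feature_index, threshold) of the best binary split."""
    n = len(rows)
    best = None
    for j in range(len(rows[0])):                      # every feature
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):         # every midpoint threshold
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

rows = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
labels = ["a", "a", "b", "b"]

print(best_split(rows, labels))  # (0.0, 0, 2.5)
```

Feature 0 separates the two classes perfectly at 2.5, so its weighted impurity is 0.0. Building a full tree would repeat this search on each resulting subset.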

In [None]:
# What is Pre-Pruning in Decision Trees?

Pre-Pruning stops the tree from growing once it meets a certain condition (e.g., maximum depth, minimum number of samples at a node). It prevents overfitting by limiting tree size.

In [None]:
# What is Post-Pruning in Decision Trees?

Post-Pruning allows the tree to grow fully and then removes nodes that don't improve accuracy on a validation set. This also helps prevent overfitting.

In [None]:
# Difference between Pre-Pruning and Post-Pruning?

Pre-Pruning: Stops tree growth early, based on set conditions.

Post-Pruning: Grows the tree fully first and then removes unnecessary nodes.
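
The two approaches can be compared side by side in scikit-learn. Note the assumptions in this sketch: pre-pruning is expressed through `max_depth` and `min_samples_leaf`, and post-pruning uses cost-complexity pruning (`ccp_alpha`) rather than the validation-set pruning described above, because that is the post-pruning option scikit-learn provides. The specific values (2, 5, 0.02) are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fully grown tree: no limits on growth
full = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruning: stop growing early (illustrative limits)
pre_pruned = DecisionTreeClassifier(
    max_depth=2, min_samples_leaf=5, random_state=42).fit(X_train, y_train)

# Post-pruning: grow fully, then prune back with cost-complexity pruning
post_pruned = DecisionTreeClassifier(
    ccp_alpha=0.02, random_state=42).fit(X_train, y_train)

for name, model in [("full", full), ("pre-pruned", pre_pruned), ("post-pruned", post_pruned)]:
    print(f"{name}: {model.tree_.node_count} nodes, "
          f"test accuracy {model.score(X_test, y_test):.3f}")
```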

In [None]:
# What is a Decision Tree Regressor?

A Decision Tree Regressor predicts continuous values rather than classes. It works similarly to classification trees, but instead of reducing class impurity, it minimizes the variance of the target variable in each split.
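
The regression criterion can be illustrated the same way as the classification measures. This sketch assumes hypothetical helpers `variance` and `variance_reduction` and a toy list of target values. A good split is one whose children have much lower variance than the parent.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variance of the two children."""
    n = len(parent)
    weighted = len(left) / n * variance(left) + len(right) / n * variance(right)
    return variance(parent) - weighted

targets = [1.0, 1.0, 5.0, 5.0]

print(variance_reduction(targets, [1.0, 1.0], [5.0, 5.0]))  # 4.0
print(variance_reduction(targets, [1.0, 5.0], [1.0, 5.0]))  # 0.0
```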

In [None]:
# Advantages and disadvantages of Decision Trees?

Advantages:

Easy to understand and visualize.
Handles both categorical and numerical data.
No need for feature scaling.

Disadvantages:

Prone to overfitting (complex trees).
Sensitive to small changes in data.

In [None]:
# How does a Decision Tree handle missing values?

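How a Decision Tree handles missing values depends on the implementation. Some ignore the missing entries when evaluating a split, and others use a surrogate split (a backup feature whose split closely mimics the primary one) to route samples whose value is missing. When the implementation offers neither, missing values must be imputed before training.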
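
When imputation is the chosen route, it can be placed in front of the tree in a pipeline. This is a minimal sketch on made-up data, and median imputation is an assumption here rather than a recommendation. It does not demonstrate surrogate splits.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Toy data with missing entries (np.nan); values are invented for illustration
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [np.nan, 30.0],
    [4.0, 40.0],
    [5.0, 50.0],
    [6.0, np.nan],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Fill each missing value with its column median, then fit the tree
model = make_pipeline(
    SimpleImputer(strategy="median"),
    DecisionTreeClassifier(random_state=0),
)
model.fit(X, y)

# The same imputation is applied automatically at prediction time
predictions = model.predict(np.array([[np.nan, 45.0], [1.5, np.nan]]))
print(predictions)
```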

In [None]:
# How does a Decision Tree handle categorical features?

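For categorical features, a Decision Tree can create one branch per category or group categories into subsets, choosing the grouping that gives the best Information Gain or Gini Impurity. Implementations that accept only numeric input, such as scikit-learn's, require the categories to be encoded first.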
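
With scikit-learn, one common way to do this is one-hot encoding in a pipeline. The sketch below uses an invented single-column color feature. One-hot encoding is an assumed choice, and other encodings (such as ordinal encoding) are possible.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy categorical feature and labels, invented for illustration
X = [["red"], ["red"], ["green"], ["green"], ["blue"], ["blue"]]
y = [1, 1, 0, 0, 1, 1]

# Each category becomes its own 0/1 column, which the tree can split on
model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    DecisionTreeClassifier(random_state=0),
)
model.fit(X, y)

predictions = model.predict([["green"], ["blue"]])
print(predictions)  # [0 1]
```

Here a single split on the "green" column is enough to separate the two classes.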

In [None]:
# Real-world applications of Decision Trees?

Medical diagnosis: To classify diseases.
Customer segmentation: To target marketing efforts.
Fraud detection: To detect fraudulent activities in financial systems.

In [None]:
# Practical

In [None]:
# Train Decision Tree Classifier on Iris dataset and print model accuracy?

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the Decision Tree Classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")


In [None]:
# Train Decision Tree Classifier using Gini Impurity and print feature importances?

clf_gini = DecisionTreeClassifier(criterion='gini')
clf_gini.fit(X_train, y_train)

# Print feature importances
importances = clf_gini.feature_importances_
print("Feature Importances:", importances)
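
The raw importance array is easier to read when each score is paired with its feature name. This is a small self-contained sketch that reloads Iris and fits on the full dataset for brevity; the variable names `named_clf` and `ranked` are made up here.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris_data = load_iris()
named_clf = DecisionTreeClassifier(criterion="gini", random_state=42)
named_clf.fit(iris_data.data, iris_data.target)

# Pair each importance with its feature name and sort from most to least important
ranked = sorted(
    zip(iris_data.feature_names, named_clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The importances are normalized, so they sum to 1 across all features.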




In [None]:
# Train Decision Tree Classifier using Entropy and print model accuracy?

clf_entropy = DecisionTreeClassifier(criterion='entropy')
clf_entropy.fit(X_train, y_train)

# Predict and evaluate
y_pred_entropy = clf_entropy.predict(X_test)
accuracy_entropy = accuracy_score(y_test, y_pred_entropy)
print(f"Model Accuracy with Entropy: {accuracy_entropy * 100:.2f}%")


In [None]:
# Train Decision Tree Regressor on Housing Dataset (MSE Evaluation)?

from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
from sklearn.datasets import fetch_california_housing

# Load housing data
housing = fetch_california_housing()
X_h, y_h = housing.data, housing.target

# Split the data (separate names, so the Iris split used by the classifier cells is not overwritten)
X_train_h, X_test_h, y_train_h, y_test_h = train_test_split(X_h, y_h, test_size=0.3, random_state=42)

# Train the Decision Tree Regressor
regressor = DecisionTreeRegressor()
regressor.fit(X_train_h, y_train_h)

# Predict and evaluate
y_pred_reg = regressor.predict(X_test_h)
mse = mean_squared_error(y_test_h, y_pred_reg)
print(f"Mean Squared Error: {mse:.4f}")


In [None]:
# Train Decision Tree Classifier and visualize the tree using graphviz?

from sklearn.tree import export_graphviz
import graphviz

# Train the classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Visualize the tree
dot_data = export_graphviz(clf, out_file=None, feature_names=iris.feature_names,
                           class_names=iris.target_names, filled=True, rounded=True, special_characters=True)

graph = graphviz.Source(dot_data)
graph.view()


In [None]:
# Train Decision Tree Classifier with max depth of 3 and compare with fully grown tree?

# Max depth of 3
clf_depth3 = DecisionTreeClassifier(max_depth=3)
clf_depth3.fit(X_train, y_train)

# Fully grown tree
clf_full = DecisionTreeClassifier()
clf_full.fit(X_train, y_train)

# Evaluate both
accuracy_depth3 = accuracy_score(y_test, clf_depth3.predict(X_test))
accuracy_full = accuracy_score(y_test, clf_full.predict(X_test))

print(f"Accuracy with max depth 3: {accuracy_depth3 * 100:.2f}%")
print(f"Accuracy with full tree: {accuracy_full * 100:.2f}%")


In [None]:
# Train Decision Tree Classifier using min_samples_split=5 and compare with default tree?

clf_min_samples = DecisionTreeClassifier(min_samples_split=5)
clf_min_samples.fit(X_train, y_train)

# Default tree
clf_default = DecisionTreeClassifier()
clf_default.fit(X_train, y_train)

# Evaluate both
accuracy_min_samples = accuracy_score(y_test, clf_min_samples.predict(X_test))
accuracy_default = accuracy_score(y_test, clf_default.predict(X_test))

print(f"Accuracy with min_samples_split=5: {accuracy_min_samples * 100:.2f}%")
print(f"Accuracy with default: {accuracy_default * 100:.2f}%")


In [None]:
# Apply feature scaling before training a Decision Tree Classifier and compare accuracy?

from sklearn.preprocessing import StandardScaler

# Apply scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

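# Train with scaled data
clf_scaled = DecisionTreeClassifier(random_state=42)
clf_scaled.fit(X_train_scaled, y_train)

# Train an otherwise identical tree on the unscaled data for a fair comparison
clf_unscaled = DecisionTreeClassifier(random_state=42)
clf_unscaled.fit(X_train, y_train)

# Compare accuracy; splits are threshold-based, so scaling should have little to no effect
accuracy_scaled = accuracy_score(y_test, clf_scaled.predict(X_test_scaled))
accuracy_unscaled = accuracy_score(y_test, clf_unscaled.predict(X_test))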

print(f"Accuracy with scaled data: {accuracy_scaled * 100:.2f}%")
print(f"Accuracy with unscaled data: {accuracy_unscaled * 100:.2f}%")


In [None]:
# Train Decision Tree Classifier using One-vs-Rest (OvR) strategy for multiclass classification?

from sklearn.multiclass import OneVsRestClassifier

# OvR strategy
clf_ovr = OneVsRestClassifier(DecisionTreeClassifier())
clf_ovr.fit(X_train, y_train)

# Predict and evaluate
y_pred_ovr = clf_ovr.predict(X_test)
accuracy_ovr = accuracy_score(y_test, y_pred_ovr)
print(f"Model Accuracy with OvR: {accuracy_ovr * 100:.2f}%")


In [None]:
# Train Decision Tree Classifier and display feature importance scores?

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Print feature importances
importances = clf.feature_importances_
print("Feature Importances:", importances)


In [None]:
# Train Decision Tree Regressor with max_depth=5 and compare with unrestricted tree?

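# Rebuild the California housing split under its own names, so this cell does not
# depend on which dataset X_train currently holds
X_train_h, X_test_h, y_train_h, y_test_h = train_test_split(
    housing.data, housing.target, test_size=0.3, random_state=42)

# Max depth of 5
regressor_depth5 = DecisionTreeRegressor(max_depth=5)
regressor_depth5.fit(X_train_h, y_train_h)

# Unrestricted regressor
regressor_full = DecisionTreeRegressor()
regressor_full.fit(X_train_h, y_train_h)

# Evaluate both
mse_depth5 = mean_squared_error(y_test_h, regressor_depth5.predict(X_test_h))
mse_full = mean_squared_error(y_test_h, regressor_full.predict(X_test_h))

print(f"MSE with max_depth=5: {mse_depth5:.4f}")
print(f"MSE with unrestricted tree: {mse_full:.4f}")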


In [None]:
# Train Decision Tree Classifier, apply Cost Complexity Pruning (CCP), and visualize effect?

# Train with CCP
clf_ccp = DecisionTreeClassifier(ccp_alpha=0.01)
clf_ccp.fit(X_train, y_train)

# Evaluate the pruned tree on the test set
accuracy_ccp = accuracy_score(y_test, clf_ccp.predict(X_test))
print(f"Accuracy with CCP: {accuracy_ccp * 100:.2f}%")
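
To see the effect of pruning across a range of strengths rather than at a single `ccp_alpha`, one option is to sweep the alphas returned by `cost_complexity_pruning_path`. This is a self-contained sketch that reloads Iris and prints a small table instead of drawing a plot. The clamp to zero is a defensive assumption against tiny negative values from floating-point rounding.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Effective alphas at which the fully grown tree loses another subtree
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(
    X_train, y_train)

results = []
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative rounding errors
    tree = DecisionTreeClassifier(random_state=42, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    results.append((alpha, tree.tree_.node_count, tree.score(X_test, y_test)))

print("ccp_alpha   nodes   test accuracy")
for alpha, nodes, acc in results:
    print(f"{alpha:9.4f}   {nodes:5d}   {acc:.3f}")
```

Larger values of `ccp_alpha` prune more aggressively, so the node count shrinks as alpha grows. The test accuracy column shows where pruning starts to cost accuracy.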


In [None]:
# Train Decision Tree Classifier and evaluate using Precision, Recall, and F1-Score?

from sklearn.metrics import precision_score, recall_score, f1_score

# Train classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1-Score: {f1:.2f}")


In [None]:
# Train Decision Tree Classifier and visualize confusion matrix using seaborn?

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Train classifier
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)

# Predict and plot confusion matrix
y_pred = clf.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

sns.heatmap(cm, annot=True, cmap='Blues')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()


In [None]:
# Use GridSearchCV to find optimal values for max_depth and min_samples_split?

from sklearn.model_selection import GridSearchCV

# Define parameter grid
param_grid = {'max_depth': [3, 5, 10], 'min_samples_split': [2, 5, 10]}

# Use GridSearchCV
grid_search = GridSearchCV(DecisionTreeClassifier(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Best parameters
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best cross-validated accuracy: {grid_search.best_score_:.4f}")
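
`best_score_` is the mean accuracy over the cross-validation folds of the training data, so it is worth also checking the selected model on the held-out test set. The sketch below is self-contained (it reloads Iris and repeats the search with the same grid). Fixing `random_state` is an added assumption to make the run repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

param_grid = {"max_depth": [3, 5, 10], "min_samples_split": [2, 5, 10]}
search = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)

# best_estimator_ is refit on the whole training set with the best parameters
test_accuracy = search.best_estimator_.score(X_test, y_test)
print(f"Best parameters: {search.best_params_}")
print(f"Cross-validated accuracy: {search.best_score_:.4f}")
print(f"Held-out test accuracy: {test_accuracy:.4f}")
```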
