## Supervised Classification: Decision Trees, SVM, and Naive Bayes

1: What is Information Gain, and how is it used in Decision Trees? Answer: Information Gain is a splitting criterion used when building decision trees. It measures the reduction in entropy (uncertainty) about the target class after a dataset is split on an attribute: the entropy of the parent node minus the weighted average entropy of the child nodes. At each node, the feature with the highest Information Gain is chosen for the split because it provides the most "information" about the target class.
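The calculation described above can be sketched in a few lines. This is a minimal illustration, not how scikit-learn implements it; the helper names `entropy` and `information_gain` and the toy columns `play`, `outlook`, and `windy` are made up for the example.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

def information_gain(labels, feature_values):
    """Parent entropy minus the weighted entropy of the child groups
    formed by each distinct value of the feature."""
    labels = np.asarray(labels)
    feature_values = np.asarray(feature_values)
    weighted_child_entropy = 0.0
    for value in np.unique(feature_values):
        mask = feature_values == value
        # mask.mean() is the fraction of rows that fall into this child
        weighted_child_entropy += mask.mean() * entropy(labels[mask])
    return entropy(labels) - weighted_child_entropy

# Hypothetical toy data: 'outlook' separates the classes perfectly, 'windy' does not.
play = np.array([1, 1, 0, 0])
outlook = np.array(["sunny", "sunny", "rainy", "rainy"])
windy = np.array([0, 1, 0, 1])

gain_outlook = information_gain(play, outlook)
gain_windy = information_gain(play, windy)
print(f"Gain(outlook) = {gain_outlook:.4f}")
print(f"Gain(windy)   = {gain_windy:.4f}")
```

A tree built on this data would split on `outlook` first, since it removes all of the uncertainty while `windy` removes none.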

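2: What is the difference between Gini Impurity and Entropy? Answer:

* Gini Impurity: It measures the probability that a randomly chosen element would be misclassified if it were labeled at random according to the class distribution of the node. It is slightly faster to compute because it does not use logarithms. For k classes its range is 0 to 1 - 1/k, which is 0 to 0.5 for binary classification.

* Entropy: It measures the disorder or randomness in the data. It involves logarithmic calculations, making it slightly slower. With base-2 logarithms its range is 0 to log2(k) for k classes, which is 0 to 1 for binary classification.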
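To make the two measures concrete, here is a small sketch that evaluates both on a few class distributions. The function names `gini` and `entropy_bits` are chosen for this example only. A balanced two-class node gives Gini 0.5 and entropy 1, while a balanced three-class node gives Gini 2/3 and entropy log2(3), so the upper bounds depend on the number of classes.

```python
import numpy as np

def gini(p):
    """Gini impurity of a class-probability vector: 1 - sum(p_i^2)."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))

def entropy_bits(p):
    """Shannon entropy in bits: sum(p_i * log2(1 / p_i))."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # a class with zero probability contributes nothing
    return float(np.sum(p * np.log2(1.0 / p)))

distributions = [
    ("pure node", [1.0, 0.0]),
    ("balanced, 2 classes", [0.5, 0.5]),
    ("balanced, 3 classes", [1 / 3, 1 / 3, 1 / 3]),
]

for name, dist in distributions:
    print(f"{name:22s} gini={gini(dist):.4f}  entropy={entropy_bits(dist):.4f}")
```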

3: What is Pre-Pruning in Decision Trees? Answer: Pre-pruning is a technique where the growth of a decision tree is stopped early to avoid overfitting. This is done by setting constraints such as max_depth, min_samples_split, or min_samples_leaf before training, so that the tree stops splitting once a limit is reached and does not become too complex.
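As a rough illustration of these constraints, the sketch below fits one unconstrained tree and one pre-pruned tree on the Iris dataset and compares their size. The specific limits (3, 10, 5) are arbitrary values picked for the example, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# No constraints: the tree grows until every leaf is pure.
full_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# Pre-pruned: growth stops as soon as any limit is reached.
pruned_tree = DecisionTreeClassifier(
    max_depth=3,            # at most 3 levels of splits
    min_samples_split=10,   # a node needs at least 10 samples to be split
    min_samples_leaf=5,     # every leaf must keep at least 5 samples
    random_state=42,
).fit(X, y)

print(f"Unconstrained: depth={full_tree.get_depth()}, leaves={full_tree.get_n_leaves()}")
print(f"Pre-pruned:    depth={pruned_tree.get_depth()}, leaves={pruned_tree.get_n_leaves()}")
```

In practice the limits would usually be chosen with cross-validation rather than fixed by hand.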

4: Write a Python program to train a Decision Tree Classifier using Gini Impurity and print feature importances.  Answer:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Train model with Gini
clf = DecisionTreeClassifier(criterion='gini', random_state=42)
clf.fit(X, y)

# Print feature importances
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.4f}")

sepal length (cm): 0.0133
sepal width (cm): 0.0000
petal length (cm): 0.5641
petal width (cm): 0.4226


5: What is a Support Vector Machine (SVM)? Answer: SVM is a supervised learning algorithm used for classification and regression. For classification, it works by finding the hyperplane in an N-dimensional feature space that separates the classes with the maximum possible margin, that is, the greatest distance to the nearest training points of each class (the support vectors).

6: What is the Kernel Trick in SVM? Answer: The Kernel Trick is a method that lets SVM handle data that is not linearly separable. A kernel function computes the inner product of two points as if they had been mapped from the original low-dimensional space into a higher-dimensional space, where a linear hyperplane may be able to separate them. The key benefit is that the mapping is never computed explicitly, which avoids the high computational cost of the actual transformation.
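A small numerical check can make this concrete. The sketch below uses the degree-2 polynomial kernel k(a, b) = (a · b)^2 on 2-D points, for which the explicit feature map is known; the names `phi` and `poly2_kernel` and the sample points are chosen for this example only.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (2-D -> 3-D)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(a, b):
    """Degree-2 polynomial kernel, computed entirely in the original 2-D space."""
    return float(np.dot(a, b) ** 2)

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = float(np.dot(phi(x), phi(z)))  # map first, then take the inner product
via_kernel = poly2_kernel(x, z)           # same value, no mapping needed

print(f"Inner product after explicit mapping: {explicit:.4f}")
print(f"Kernel evaluated in original space:   {via_kernel:.4f}")
```

Both values agree, which is the point of the trick: the SVM only ever needs inner products, so it can work in the higher-dimensional space without constructing it. For the RBF kernel the implied space is infinite-dimensional, so the explicit route is not even available.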

7: Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies. Answer:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the data and split it into train and test sets
data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3, random_state=42)

# Linear Kernel
linear_model = SVC(kernel='linear').fit(X_train, y_train)
lin_acc = accuracy_score(y_test, linear_model.predict(X_test))

# RBF Kernel
rbf_model = SVC(kernel='rbf').fit(X_train, y_train)
rbf_acc = accuracy_score(y_test, rbf_model.predict(X_test))

print(f"Linear Kernel Accuracy: {lin_acc:.4f}")
print(f"RBF Kernel Accuracy: {rbf_acc:.4f}")

Linear Kernel Accuracy: 0.9815
RBF Kernel Accuracy: 0.7593
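The lower RBF score is probably due in large part to feature scaling rather than to the kernel itself: the RBF kernel depends on Euclidean distances, and the Wine features have very different ranges, so the largest-valued features dominate. The sketch below repeats the RBF experiment with standardized features, using the same split as above. Wrapping the scaler and the classifier in a pipeline is one common way to do this, not the only one.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)

# The scaler is fit on the training data only, then applied to the test data.
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel='rbf')).fit(X_train, y_train)
scaled_rbf_acc = scaled_rbf.score(X_test, y_test)

print(f"RBF Kernel Accuracy (standardized features): {scaled_rbf_acc:.4f}")
```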


8: What is the Naïve Bayes classifier, and why is it called "Naïve"? Answer: Naïve Bayes is a classification algorithm based on Bayes' Theorem. It is called "Naïve" because it makes the strong, simplifying assumption that all features are conditionally independent of each other given the class. In real data, features are usually correlated, but this assumption makes the model very fast, and it often works well in practice for tasks like spam filtering.
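The independence assumption can be written out as follows, where y is the class and x_1, ..., x_n are the features (notation introduced here for illustration). Bayes' Theorem gives the posterior, and the "naïve" step replaces the joint likelihood with a product of per-feature likelihoods:

```latex
P(y \mid x_1, \dots, x_n)
  = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}
  \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Each factor P(x_i | y) can be estimated from the training data one feature at a time, which is why training and prediction are so cheap.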

9: Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes. Answer:

* Gaussian NB: Used for continuous features that are assumed to follow a normal distribution within each class (data like temperature or height).

* Multinomial NB: Used for discrete counts, like counting how many times a word appears in a document.

* Bernoulli NB: Used when features are binary (Yes/No or 0/1), like checking if a word is present or not.
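The three variants share the same scikit-learn interface and differ mainly in the kind of feature matrix they expect. The sketch below fits each one on a tiny hand-made dataset of the matching type; all values are invented for illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])

# Continuous measurements (e.g. height in cm, temperature in C) -> GaussianNB
X_continuous = np.array([[150.0, 36.5], [155.0, 36.7], [180.0, 37.9], [185.0, 38.1]])

# Word counts per document -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])

# Word present (1) or absent (0) -> BernoulliNB
X_binary = (X_counts > 0).astype(int)

models = {
    "GaussianNB": (GaussianNB(), X_continuous),
    "MultinomialNB": (MultinomialNB(), X_counts),
    "BernoulliNB": (BernoulliNB(), X_binary),
}

predictions = {}
for name, (model, X) in models.items():
    # Predict on the training rows only to show the API, not to measure accuracy.
    predictions[name] = model.fit(X, y).predict(X)
    print(f"{name}: {predictions[name]}")
```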

10: Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy. Answer:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Data loading
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3, random_state=42)

# Training Gaussian NB
gnb = GaussianNB().fit(X_train, y_train)

# Predict on the test set and print the accuracy
y_pred = gnb.predict(X_test)
print(f"Gaussian Naive Bayes Accuracy: {accuracy_score(y_test, y_pred):.4f}")

Gaussian Naive Bayes Accuracy: 0.9415
