Question 1: What is Information Gain, and how is it used in Decision Trees?

Answer 1 - Information Gain (IG) measures how much a feature reduces uncertainty (entropy) about the target variable when the dataset is split on that feature. It is used to decide which feature to split on at each node of a Decision Tree.

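Mathematically:
IG(S, A) = Entropy(S) - Σ_v (|Sv|/|S|) × Entropy(Sv)

where S is the set of samples at the node, A is the candidate feature, and Sv is the subset of S in which A takes the value v.

The feature with the highest Information Gain is chosen for the split, at the root and at every later node, since it provides the maximum reduction in uncertainty.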
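
As a rough illustration of the formula above, the following sketch computes Information Gain for two categorical features. The toy data (outlook, windy, play) and the helper names entropy and information_gain are made up for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(S, A): entropy of S minus the weighted entropy of each subset Sv."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

# Hypothetical toy data: which feature tells us more about "play"?
outlook = ["sunny", "sunny", "rain", "rain", "overcast", "overcast"]
windy = ["yes", "no", "yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes", "yes", "yes"]

ig_outlook = information_gain(outlook, play)
ig_windy = information_gain(windy, play)
print("IG(outlook):", round(ig_outlook, 3))
print("IG(windy):", round(ig_windy, 3))
```

Here outlook separates the classes perfectly, so its gain equals the full entropy of the labels, while windy leaves the class mix unchanged in both subsets and has zero gain. A tree would therefore split on outlook first.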


Question 2: What is the difference between Gini Impurity and Entropy?

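Answer 2 - Both measure how mixed the classes in a node are. Entropy measures the average information content (uncertainty) of the class distribution, while Gini Impurity measures the probability of misclassifying a randomly chosen sample if it were labeled at random according to the class distribution of the node.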

Entropy = -Σ p_i log2(p_i)
Gini = 1 - Σ p_i²

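Both measures are zero for a pure node and reach their maximum when the classes are evenly distributed, and in practice they usually lead to very similar splits. Gini avoids computing logarithms, so it is slightly faster, and it is the default criterion in the CART algorithm.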
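
To make the comparison concrete, the sketch below evaluates both formulas for a few binary class distributions. The function names and the chosen probabilities are only for illustration.

```python
import math

def entropy(probs):
    """Entropy in bits; classes with p = 0 contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p * p for p in probs)

# Binary class distributions, from pure (p = 0) to perfectly mixed (p = 0.5)
for p in (0.0, 0.1, 0.3, 0.5):
    dist = (p, 1 - p)
    print(f"p={p:.1f}  entropy={entropy(dist):.3f}  gini={gini(dist):.3f}")
```

Both values are 0 for the pure node and peak at p = 0.5, where entropy is 1 bit and Gini is 0.5. The two curves have the same overall shape, which is why the choice of criterion rarely changes the tree much.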


Question 3: What is Pre-Pruning in Decision Trees?

Answer 3 - Pre-pruning (early stopping) prevents a Decision Tree from growing too deep and overfitting. It stops the tree before it perfectly fits the training data.

Methods include setting max depth, minimum samples per split/leaf, and minimum information gain thresholds.
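
A minimal sketch of these methods in scikit-learn, using the Iris data, might look like the following. The limit values are illustrative and not tuned, and min_impurity_decrease is assumed here as the closest scikit-learn counterpart to a minimum information gain threshold.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: keeps splitting until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruned tree: the limits below are illustrative values, not tuned ones
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # maximum depth of the tree
    min_samples_split=10,        # a node needs at least 10 samples to be split
    min_samples_leaf=5,          # every leaf must keep at least 5 samples
    min_impurity_decrease=0.01,  # skip splits that barely reduce impurity
    random_state=0,
).fit(X, y)

print("Full tree depth / leaves:", full_tree.get_depth(), full_tree.get_n_leaves())
print("Pruned tree depth / leaves:", pruned_tree.get_depth(), pruned_tree.get_n_leaves())
```

Comparing depth and leaf count shows how much smaller the constrained tree is. Suitable limits would normally be chosen by cross-validation.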


In [1]:
#Question 4: Python Program – Decision Tree Classifier with Gini Impurity as the criterion, printing the feature importances

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
import pandas as pd

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Train Decision Tree with Gini Impurity
clf = DecisionTreeClassifier(criterion='gini', random_state=0)
clf.fit(X, y)

# Display feature importances
feature_importances = pd.Series(clf.feature_importances_, index=data.feature_names)
print("Feature Importances:\n", feature_importances)

Feature Importances:
 sepal length (cm)    0.000000
sepal width (cm)     0.013333
petal length (cm)    0.064056
petal width (cm)     0.922611
dtype: float64


Question 5: What is a Support Vector Machine (SVM)?

Answer 5 - Support Vector Machine (SVM) is a supervised learning algorithm that finds the optimal hyperplane separating the classes with the maximum margin. Support vectors are the training points closest to the hyperplane, and they alone determine its position.
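
The idea can be sketched on a tiny 2-D dataset. The points below are hypothetical, and a large C is used only to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable clusters
X = np.array([[1, 1], [2, 1], [1, 2], [4, 4], [5, 4], [4, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# Large C penalizes margin violations heavily (approximates a hard margin)
clf = SVC(kernel="linear", C=1e3).fit(X, y)

w = clf.coef_[0]                         # normal vector of the hyperplane
margin_width = 2 / np.linalg.norm(w)     # distance between the two margin lines

print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points listed in support_vectors_ influence the fitted hyperplane. Moving any other point, as long as it stays outside the margin, leaves the model unchanged.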


Question 6: What is the Kernel Trick in SVM?

Answer 6 - The Kernel Trick allows SVMs to classify non-linearly separable data by implicitly mapping it to a higher-dimensional space, where a linear separator may exist.

Common Kernels:
Linear: K(x,y)=x·y
Polynomial: K(x,y)=(x·y + c)^d
RBF: K(x,y)=exp(-γ||x−y||²)

The kernel function computes inner products in that high-dimensional space directly from the original inputs, so the transformation itself never has to be computed explicitly.
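
As a small check of this claim, the sketch below compares the polynomial kernel (with c = 1 and d = 2) against an explicit 6-dimensional feature map for 2-D inputs. The function names poly_kernel and phi and the sample points are chosen for this example only.

```python
import math

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x·y + c)^d, computed in the input space."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** d

def phi(x):
    """Explicit degree-2 feature map for a 2-D input with c = 1."""
    x1, x2 = x
    r2 = math.sqrt(2)
    return (x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0)

x, y = (1.0, 2.0), (3.0, 0.5)

# Kernel trick: one dot product in 2-D, no mapping needed
implicit = poly_kernel(x, y)
# Explicit route: map both points to 6-D, then take the dot product
explicit = sum(a * b for a, b in zip(phi(x), phi(y)))

print(implicit, explicit)
```

Both routes give the same value up to floating-point rounding, but the kernel never builds the 6-dimensional vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.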


In [2]:
#Question 7: Python Program – SVM with Linear and RBF Kernels
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset and hold out 20% for testing
data = load_wine()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# SVM with a linear kernel
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
acc_linear = accuracy_score(y_test, svm_linear.predict(X_test))

# SVM with an RBF kernel (features are left unscaled here)
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
acc_rbf = accuracy_score(y_test, svm_rbf.predict(X_test))

# Compare test accuracy of the two kernels
print('Linear Kernel Accuracy:', acc_linear)
print('RBF Kernel Accuracy:', acc_rbf)


Linear Kernel Accuracy: 1.0
RBF Kernel Accuracy: 0.8055555555555556


Question 8: What is the Naïve Bayes classifier, and why is it called 'Naïve'?

Answer 8 - Naïve Bayes is a probabilistic classifier based on Bayes’ Theorem:
P(C|X) = P(X|C)*P(C)/P(X)

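It assumes all features are conditionally independent given the class label, so P(X|C) is simply the product of the per-feature probabilities P(x_i|C). This assumption is rarely true in practice, which is why the classifier is called 'naïve'. Despite this, it performs well in text classification and spam detection.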
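
The calculation can be worked through by hand on a toy spam example. All priors and word likelihoods below are hypothetical numbers picked for illustration.

```python
# Hypothetical probabilities, chosen only for illustration
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.30, "meeting": 0.02},
    "ham": {"free": 0.03, "meeting": 0.20},
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior  # P(C)
        for w in words:
            score *= likelihoods[cls][w]  # multiply by P(word | C)
        scores[cls] = score
    evidence = sum(scores.values())  # P(X), the normalizing constant
    return {cls: s / evidence for cls, s in scores.items()}

print(posterior(["free"]))
print(posterior(["free", "meeting"]))
```

With only "free" observed, spam is the more probable class. Adding "meeting" multiplies in a second likelihood and tips the posterior toward ham, which shows how each feature contributes independently.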


Question 9: Explain the differences between Gaussian, Multinomial, and Bernoulli Naïve Bayes

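Answer 9 - The three variants differ in the distribution they assume for the features within each class:
Gaussian NB: For continuous features; assumes each feature is normally distributed.
Multinomial NB: For count data, like word frequencies.
Bernoulli NB: For binary features, like the presence/absence of a word.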

Examples:
- Gaussian: Iris dataset
- Multinomial: Text classification
- Bernoulli: Spam filtering
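
The following sketch pairs each variant with the kind of input it expects, using scikit-learn. The tiny arrays are hypothetical and serve only to show the data types involved.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Continuous measurements -> GaussianNB
X_cont = np.array([[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.5]])
# Word counts per document -> MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 2, 2]])
# Word present (1) or absent (0) -> BernoulliNB
X_binary = (X_counts > 0).astype(int)

preds = {}
for name, model, X in [
    ("Gaussian", GaussianNB(), X_cont),
    ("Multinomial", MultinomialNB(), X_counts),
    ("Bernoulli", BernoulliNB(), X_binary),
]:
    model.fit(X, y)
    preds[name] = model.predict(X)  # predictions on the training data
    print(name, "training predictions:", preds[name])
```

The same labels are used throughout; only the feature representation changes, and with it the appropriate model.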


In [3]:
#Question 10: Python Program – Gaussian Naïve Bayes on Breast Cancer Dataset
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset and hold out 20% for testing
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Train Gaussian Naive Bayes and predict on the test set
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print('Gaussian Naive Bayes Accuracy:', accuracy)


Gaussian Naive Bayes Accuracy: 0.9736842105263158
