Q1. What is Information Gain, and how is it used in Decision Trees?
Answer:
- Information Gain measures the reduction in entropy (uncertainty) after splitting a dataset on a feature.
- Formula:
IG(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \cdot Entropy(S_v)
where S_v is the subset of S in which feature A takes the value v.
- In Decision Trees, the feature with the highest Information Gain is chosen for splitting, which gives the largest reduction in impurity at that node.
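
As a worked illustration of the formula, here is a minimal pure-Python sketch. The toy `windy`/`play` data and the helper names `entropy` and `information_gain` are made up for this example and are not part of any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(S, A): entropy of S minus the weighted entropy of each subset S_v."""
    total = len(labels)
    subsets = {}
    # Group the labels by the value the feature takes (this builds each S_v)
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

# Hypothetical toy data: does the "windy" feature help predict "play"?
windy = ["no", "no", "yes", "yes", "no", "yes"]
play = ["yes", "yes", "no", "no", "yes", "yes"]

ig = information_gain(windy, play)
print(f"Entropy(S)   = {entropy(play):.3f}")
print(f"IG(S, windy) = {ig:.3f}")
```

The "no" subset is pure (all "yes"), so only the "yes" subset contributes to the weighted entropy, and the split removes half of the original uncertainty.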


Q2. What is the difference between Gini Impurity and Entropy?
Answer:
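- Entropy: Measures disorder using logarithms. Range: 0 (pure) to 1 for two classes, or up to log2(k) for k classes (with base-2 logarithms).
- Gini Impurity: Measures the probability of misclassifying a randomly chosen sample if it were labeled according to the class distribution. Range: 0 (pure) to 0.5 for two classes, or up to 1 - 1/k for k classes.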
- Comparison:
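  - Entropy is slightly more expensive to compute (logarithms).
  - Gini is faster and often preferred in practice; it is the default criterion in scikit-learn's DecisionTreeClassifier.
  - Both usually yield very similar trees; the choice is mostly between Gini's speed and entropy's direct link to Information Gain.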
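
To make the two measures concrete, the following sketch computes both for a pure node and for uniform class distributions. The function names `entropy` and `gini` are illustrative helpers, not library calls.

```python
import math

def entropy(probs):
    """Entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p * p for p in probs)

# A pure node has zero impurity under both measures
print("pure:", abs(entropy([1.0, 0.0])), gini([1.0, 0.0]))

# Uniform distributions give the maximum impurity for k classes
for k in (2, 3, 4):
    uniform = [1 / k] * k
    print(f"k={k}: entropy={entropy(uniform):.4f}, gini={gini(uniform):.4f}")
```

For a uniform distribution over k classes, entropy reaches log2(k) and Gini reaches 1 - 1/k, so both measures peak at the same point and differ mainly in scale.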


Q3. What is Pre-Pruning in Decision Trees?
Answer:
- Pre-pruning stops tree growth early to avoid overfitting.
- Techniques:
  - Limit the maximum depth of the tree.
  - Require a minimum number of samples to split a node or to form a leaf.
  - Stop splitting if the Information Gain (impurity reduction) falls below a threshold.
- Helps improve generalization and reduces complexity.
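
These techniques map roughly onto constructor parameters of scikit-learn's `DecisionTreeClassifier`. The sketch below compares an unconstrained tree with a pre-pruned one on Iris; the threshold values are illustrative assumptions, not tuned settings, and `min_impurity_decrease` is used here as the closest counterpart to an Information Gain threshold.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unconstrained tree for comparison: grows until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# Pre-pruned tree; every value below is an illustrative choice
pruned_tree = DecisionTreeClassifier(
    max_depth=3,                 # limit maximum depth
    min_samples_split=10,        # minimum samples needed to split a node
    min_samples_leaf=5,          # minimum samples allowed in a leaf
    min_impurity_decrease=0.01,  # skip splits that reduce impurity too little
    random_state=42,
).fit(X, y)

print("Full tree:   depth", full_tree.get_depth(),
      "leaves", full_tree.get_n_leaves())
print("Pruned tree: depth", pruned_tree.get_depth(),
      "leaves", pruned_tree.get_n_leaves())
```

In practice these values would normally be chosen by cross-validation rather than fixed by hand.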


In [2]:
#Q4. Python Program: Decision Tree Classifier using Gini Impurity
#Answer (Code):
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Train Decision Tree with Gini criterion
clf = DecisionTreeClassifier(criterion='gini', random_state=42)
clf.fit(X, y)

# Print feature importances
print("Feature Importances:", clf.feature_importances_)


Feature Importances: [0.01333333 0.         0.56405596 0.42261071]


Q5. What is a Support Vector Machine (SVM)?
Answer:
- SVM is a supervised learning algorithm that finds the optimal hyperplane separating classes with maximum margin.
- Works well in high-dimensional spaces; maximizing the margin, together with the regularization parameter C, helps limit overfitting.



Q6. What is the Kernel Trick in SVM?
Answer:
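- The Kernel Trick lets an SVM classify data that is not linearly separable by implicitly mapping it into a higher-dimensional space where a separating hyperplane may exist.
- Common kernels: Linear, Polynomial, Radial Basis Function (RBF).
- Advantage: The kernel function returns the inner product in the higher-dimensional space directly, so the transformation itself is never computed explicitly.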
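
A small numeric check can make this concrete. The sketch below uses the homogeneous degree-2 polynomial kernel K(x, z) = (x · z)^2 on 2-D points and compares it with an explicit feature map; `phi`, `dot`, and `poly_kernel` are helper names chosen for this example.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D point (maps into 3-D)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # map to 3-D first, then take the inner product
implicit = poly_kernel(x, z)     # stay in 2-D and apply the kernel

print("Explicit mapping:", explicit)
print("Kernel shortcut: ", implicit)
```

Both routes give the same value up to floating-point rounding, but the kernel version never constructs the higher-dimensional vectors. For the RBF kernel the corresponding feature space is infinite-dimensional, so the shortcut is the only practical route.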


In [3]:
#Q7. Python Program: SVM with Linear and RBF Kernels (Wine Dataset)
#Answer (Code):
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=42)

# Linear Kernel
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
acc_linear = accuracy_score(y_test, svm_linear.predict(X_test))

# RBF Kernel
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
acc_rbf = accuracy_score(y_test, svm_rbf.predict(X_test))

print("Linear Kernel Accuracy:", acc_linear)
print("RBF Kernel Accuracy:", acc_rbf)


Linear Kernel Accuracy: 0.9814814814814815
RBF Kernel Accuracy: 0.7592592592592593
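
A likely reason for the lower RBF score is that the Wine features are on very different scales, and the RBF kernel depends on distances between samples. As a hedged sketch (not part of the original answer), the same split can be rerun with standardized features using a `StandardScaler` inside a pipeline; the variable names `scaled_rbf` and `acc_scaled_rbf` are introduced here for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data and split as the cell above
wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.3, random_state=42)

# The scaler is fitted on the training data only, then applied to the test data
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
scaled_rbf.fit(X_train, y_train)

acc_scaled_rbf = scaled_rbf.score(X_test, y_test)
print("RBF Kernel Accuracy (scaled):", acc_scaled_rbf)
```

Putting the scaler in a pipeline keeps test-set statistics out of the fitting step, which avoids a subtle form of data leakage.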


Q8. What is the Naïve Bayes classifier, and why is it called "Naïve"?
Answer:
- Naïve Bayes is a probabilistic classifier based on Bayes’ Theorem.
- It assumes that features are conditionally independent given the class, which is rarely true in practice, hence “Naïve.”
- Despite this assumption, it often performs well in text classification and spam detection.
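
The independence assumption means the class score is just the prior multiplied by one likelihood per feature. The sketch below shows this for a toy spam example; all probabilities and the `posterior` helper are made-up values for illustration, not estimates from real data.

```python
# Hypothetical class priors and per-word likelihoods P(word present | class)
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.5},
}

def posterior(present_words):
    """Posterior over classes given the set of words present in a message."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Naive assumption: multiply per-word terms as if words were independent
        for word, p in likelihoods[cls].items():
            score *= p if word in present_words else (1.0 - p)
        scores[cls] = score
    total = sum(scores.values())  # normalizing constant from Bayes' Theorem
    return {cls: s / total for cls, s in scores.items()}

post = posterior({"free"})  # message contains "free" but not "meeting"
print({cls: round(p, 4) for cls, p in post.items()})
```

Real implementations typically work with log-probabilities to avoid numerical underflow when many features are multiplied together.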



Q9. Differences between Gaussian, Multinomial, and Bernoulli Naïve Bayes
Answer:
- Gaussian NB: Assumes features follow a normal distribution. Best for continuous data.
- Multinomial NB: Assumes features are counts/frequencies. Best for text classification (word counts).
- Bernoulli NB: Assumes binary features (0/1). Best for presence/absence data (e.g., spam filters).
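
The three variants share the same scikit-learn interface and differ mainly in the kind of input they expect. The sketch below fits each one on tiny hand-made arrays (hypothetical data, chosen only to match each variant's assumption).

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])

# Continuous measurements for GaussianNB
X_continuous = np.array([[1.2, 0.7], [0.9, 1.1], [3.8, 4.1], [4.2, 3.9]])
# Word counts per document for MultinomialNB
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])
# Word presence/absence (0/1) for BernoulliNB
X_binary = (X_counts > 0).astype(int)

predictions = {}
for name, model, X in [
    ("GaussianNB", GaussianNB(), X_continuous),
    ("MultinomialNB", MultinomialNB(), X_counts),
    ("BernoulliNB", BernoulliNB(), X_binary),
]:
    model.fit(X, y)
    predictions[name] = model.predict(X)
    print(name, "predictions on training data:", predictions[name])
```

Four samples are far too few to evaluate anything; the point is only which data type goes with which class.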




In [4]:
#Q10. Python Program: Gaussian Naïve Bayes on Breast Cancer Dataset
#Answer (Code):
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42)

# Train Gaussian Naive Bayes
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Evaluate accuracy
y_pred = gnb.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))





Accuracy: 0.9415204678362573
