# Supervised Classification: Decision Trees, SVM, and Naive Bayes

Question 1: What is Information Gain, and how is it used in Decision Trees?
- Information Gain measures the reduction in entropy (uncertainty) about the target variable after splitting the dataset on a given feature. For a feature A, Information Gain = Entropy(parent) − Weighted average Entropy(children), where each child is weighted by the fraction of the parent's samples it receives. At each node, a decision tree evaluates the candidate features and splits on the one with the highest Information Gain, i.e. the one that best separates the classes.
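
As an illustration of the formula above, here is a minimal sketch that computes Information Gain for one split. The toy labels and the helper names `entropy` and `information_gain` are made up for this example and do not come from any library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, children):
    """Entropy(parent) minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical toy data: 4 "yes" and 4 "no" labels, split in two by some feature A.
parent = ["yes"] * 4 + ["no"] * 4
left = ["yes"] * 3 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 3

gain = information_gain(parent, [left, right])
print(f"Entropy(parent)  = {entropy(parent):.3f}")
print(f"Information Gain = {gain:.3f}")
```

A split that produced two pure children would recover the full parent entropy as gain; the mixed split here recovers only part of it.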

Question 2: What is the difference between Gini Impurity and Entropy?
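- Gini impurity (1 − Σp_i²) is the probability of misclassifying a randomly drawn sample if it were labeled at random according to the node's class distribution; Entropy (−Σp_i log₂ p_i) measures the uncertainty of that distribution in bits. Both are impurity measures: they are zero for a pure node and largest when the classes are evenly mixed. Gini is slightly cheaper to compute because it avoids the logarithm, while Entropy has a direct information-theoretic interpretation. In practice the two usually lead to similar splits.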
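
To make the comparison concrete, the following sketch evaluates both formulas on a few two-class distributions. The function names `gini` and `entropy` and the chosen probabilities are illustrative assumptions.

```python
import math

def gini(probs):
    """Gini impurity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping zero probabilities."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Two-class nodes, from evenly mixed (p = 0.5) to pure (p = 1.0).
for p in (0.5, 0.7, 0.9, 1.0):
    dist = (p, 1.0 - p)
    print(f"p={p:.1f}  gini={gini(dist):.3f}  entropy={entropy(dist):.3f}")
```

Both measures fall as the node becomes purer; they differ in scale (the two-class maximum is 0.5 for Gini and 1 bit for Entropy) rather than in ordering.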

Question 3: What is Pre-Pruning in Decision Trees?
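- Pre-pruning (early stopping) limits tree growth during training to avoid overfitting. Common stopping criteria include a maximum depth (max_depth), a minimum number of samples required to split a node (min_samples_split), and a minimum gain per split (min_impurity_decrease in scikit-learn). It prevents overly complex trees, but limits that are too strict can cause underfitting.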
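
A minimal sketch of pre-pruning with scikit-learn follows. The Iris dataset and the specific limits (`max_depth=3`, `min_samples_split=10`, `min_impurity_decrease=0.01`) are arbitrary choices for illustration, not recommended values.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Baseline: grow the tree until the leaves are pure (no pre-pruning).
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruned: stop early using depth, sample-count, and impurity-decrease limits.
pruned_tree = DecisionTreeClassifier(
    max_depth=3,
    min_samples_split=10,
    min_impurity_decrease=0.01,
    random_state=42,
).fit(X_train, y_train)

for name, tree in [("unpruned", full_tree), ("pre-pruned", pruned_tree)]:
    print(
        f"{name}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}, "
        f"test accuracy={tree.score(X_test, y_test):.3f}"
    )
```

Comparing depth, leaf count, and test accuracy side by side shows how much model complexity the limits remove and whether accuracy suffers for it.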

Question 4: Write a Python program to train a Decision Tree Classifier using Gini
Impurity as the criterion and print the feature importances (practical).

In [1]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target
dt = DecisionTreeClassifier(criterion='gini', random_state=42)
dt.fit(X, y)
print(dt.feature_importances_)


[0.01333333 0.         0.56405596 0.42261071]


Question 5: What is a Support Vector Machine (SVM)?
- An SVM is a supervised learning model that finds the hyperplane separating the classes with the largest possible margin. The support vectors (the training points closest to the boundary) are the only points that determine where that hyperplane lies.
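
The sketch below fits a linear SVM on a tiny, made-up 2-D dataset and inspects the support vectors and margin. The data points and the large `C` (used here to approximate a hard margin) are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data: two clusters of three points.
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C penalizes margin violations heavily, approximating a hard margin.
clf = SVC(kernel="linear", C=1e3).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margin lines

print("support vectors:\n", clf.support_vectors_)
print("w =", w, "b =", b)
print("margin width =", margin_width)
```

Only the points listed in `support_vectors_` influence `w` and `b`; moving any other point (without crossing the margin) would leave the boundary unchanged.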

Question 6: What is the Kernel Trick in SVM?
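- The kernel trick lets an SVM work in a high-dimensional feature space without explicitly transforming the data: a kernel function K(x, z) returns the inner product of the mapped points directly from the original inputs. Common kernels: Linear, Polynomial, RBF (Gaussian), Sigmoid.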
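
As a small worked check of this idea, the sketch below compares a degree-2 polynomial kernel with the inner product of an explicit feature map. The map `phi` and the sample vectors are chosen for illustration; the homogeneous kernel (x · z)² is used here for simplicity.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: K(x, z) = (x . z)^2."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = float(np.dot(phi(x), phi(z)))  # map to 3-D first, then take the inner product
implicit = poly_kernel(x, z)              # same value, computed in the original 2-D space

print("explicit feature map:", explicit)
print("kernel function:     ", implicit)
```

The two values agree up to floating-point rounding, which is the point of the trick: the kernel gives the feature-space inner product without ever building `phi(x)`.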

Question 7: Write a Python program to train two SVM classifiers with Linear and RBF
kernels on the Wine dataset, then compare their accuracies.

In [4]:
from sklearn.datasets import load_wine
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.25, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

svc_linear = SVC(kernel='linear')
svc_rbf = SVC(kernel='rbf')
svc_linear.fit(X_train, y_train)
svc_rbf.fit(X_train, y_train)
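acc_linear = accuracy_score(y_test, svc_linear.predict(X_test))
acc_rbf = accuracy_score(y_test, svc_rbf.predict(X_test))

# Print both test accuracies so the two kernels can be compared.
print("Linear kernel accuracy:", acc_linear)
print("RBF kernel accuracy:", acc_rbf)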


Question 8: What is the Naïve Bayes classifier, and why is it called "Naïve"?
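- Naïve Bayes is a probabilistic classifier that applies Bayes' theorem under the assumption that all features are conditionally independent given the class. The assumption is called 'naïve' because it rarely holds for real data, yet it simplifies computation and works well in many practical cases.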
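
The independence assumption means the class score is just the prior multiplied by one likelihood per feature. The sketch below shows this by hand for a toy spam filter; all probabilities and the names `priors`, `likelihoods`, and `posterior` are made up for illustration.

```python
# Hypothetical numbers for a toy spam filter; none of these come from real data.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    # P(word appears | class)
    "spam": {"free": 0.6, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.5},
}

def posterior(words):
    """Return P(class | words), treating the words as independent given the class."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            score *= likelihoods[label][word]  # naive step: multiply per-feature likelihoods
        scores[label] = score
    total = sum(scores.values())               # normalize so the posteriors sum to 1
    return {label: score / total for label, score in scores.items()}

result = posterior(["free", "meeting"])
print(result)
```

Without the independence assumption, the model would need a joint probability for every combination of words, which grows exponentially with the number of features.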

Question 9: Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve
Bayes, and Bernoulli Naïve Bayes.
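- GaussianNB: Continuous (real-valued) features; each feature is modeled with a per-class Gaussian distribution.
- MultinomialNB: Discrete counts (e.g., word counts); features are modeled with a per-class multinomial distribution.
- BernoulliNB: Binary features (presence/absence); each feature is modeled with a per-class Bernoulli distribution.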
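
The following sketch pairs each variant with the kind of feature it expects. The synthetic data (drawn with NumPy's random generator) and all variable names are assumptions for illustration, and the scores are training accuracies, not a benchmark.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

rng = np.random.default_rng(0)
y = np.array([0] * 50 + [1] * 50)

# Continuous features: two Gaussian clusters with different means.
X_cont = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 3)),
    rng.normal(2.0, 1.0, size=(50, 3)),
])

# Count features: Poisson counts with class-dependent rates (like word counts).
X_counts = np.vstack([
    rng.poisson([5, 1, 1], size=(50, 3)),
    rng.poisson([1, 1, 5], size=(50, 3)),
])

# Binary features: presence/absence derived from the counts.
X_binary = (X_counts > 0).astype(int)

scores = {
    "GaussianNB (continuous)": GaussianNB().fit(X_cont, y).score(X_cont, y),
    "MultinomialNB (counts)": MultinomialNB().fit(X_counts, y).score(X_counts, y),
    "BernoulliNB (binary)": BernoulliNB().fit(X_binary, y).score(X_binary, y),
}
for name, acc in scores.items():
    print(f"{name}: training accuracy = {acc:.3f}")
```

Note that reducing counts to presence/absence discards information, so the Bernoulli model may do worse on the same underlying data.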


Question 10: Breast Cancer Dataset
Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer
dataset and evaluate accuracy.

In [5]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

bc = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(bc.data, bc.target, test_size=0.25, random_state=42)
model = GaussianNB()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(accuracy_score(y_test, pred))


0.958041958041958
