# SVM & Naive Bayes – Complete Theoretical & Practical Q&A

## Part 1 – Theoretical Answers (Q1–Q20)

**1. What is a Support Vector Machine (SVM)?**

SVM is a supervised learning algorithm that finds the **maximum-margin hyperplane** separating classes. It remains effective in high‑dimensional feature spaces and uses kernels to learn non‑linear boundaries.

**2. What is the difference between Hard Margin and Soft Margin SVM?**

**Hard Margin** assumes perfectly separable data and maximizes margin with zero misclassification (not robust to noise). **Soft Margin** allows some violations controlled by parameter **C**, balancing margin width and classification errors.

**3. What is the mathematical intuition behind SVM?**

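SVM maximizes the geometric margin between classes, i.e. the distance from the separating hyperplane to the nearest training points. The decision function depends only on a subset of training points (the support vectors). The training problem is a convex quadratic program, so any local optimum is also a global optimum.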
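
As a sketch of the optimization described here, the hard‑margin primal problem for labels \(y_i \in \{-1,+1\}\) is usually written as follows. The symbols \(w\) (weight vector), \(b\) (bias), and \(n\) (number of training points) are notation introduced for this illustration:

$$ \min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^2 \quad \text{subject to} \quad y_i\,(w^\top x_i + b) \ge 1, \quad i = 1,\dots,n $$

The margin width equals \(2/\lVert w\rVert\), so minimizing \(\lVert w\rVert\) is the same as maximizing the margin.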

**4. What is the role of Lagrange Multipliers in SVM?**

Lagrange multipliers transform the constrained primal optimization into the **dual problem**, enabling kernels and yielding sparse solutions where only support vectors have non‑zero multipliers.
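
For reference, the soft‑margin dual is commonly stated as below, with one multiplier \(\alpha_i\) per training point, labels \(y_i \in \{-1,+1\}\), and a kernel function \(K\). This notation is assumed here for illustration:

$$ \max_{\alpha}\ \sum_{i}\alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_j\,y_i y_j\,K(x_i,x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i}\alpha_i y_i = 0 $$

The data enter only through \(K(x_i,x_j)\), which is why a kernel can replace the plain inner product, and points with \(\alpha_i = 0\) drop out of the decision function.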

**5. What are Support Vectors in SVM?**

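Support vectors are the training samples that lie on the margin boundary or, in the soft‑margin case, inside the margin or on the wrong side of it. They alone determine the decision boundary: removing any non‑support vector and retraining leaves the boundary unchanged.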
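
The claim can be checked on a small example. The sketch below uses a tiny hand‑made dataset (the values are illustrative only) and a linear kernel, fits an `SVC`, then refits using only the points listed in `support_`. The two models should agree up to solver tolerance.

```python
import numpy as np
from sklearn.svm import SVC

# Tiny hand-made, linearly separable dataset (illustrative values only)
X = np.array([[0, 0], [0, 1], [1, 0], [-1, -1],
              [3, 3], [3, 4], [4, 3], [5, 5]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

full = SVC(kernel='linear', C=1.0).fit(X, y)
sv_idx = full.support_            # indices of the support vectors
print('Support vector indices:', sv_idx)

# Refit using only the support vectors
sv_only = SVC(kernel='linear', C=1.0).fit(X[sv_idx], y[sv_idx])

print('Full model w, b:   ', full.coef_, full.intercept_)
print('SV-only model w, b:', sv_only.coef_, sv_only.intercept_)
same_predictions = np.array_equal(full.predict(X), sv_only.predict(X))
print('Same predictions on all points:', same_predictions)
```

A linear kernel is used on purpose: with `gamma='scale'` the RBF kernel width depends on the variance of the training data, so dropping points would change the kernel itself.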

**6. What is a Support Vector Classifier (SVC)?**

An SVM model used for **classification**. In scikit‑learn, `SVC` implements kernelized SVM for binary/multiclass tasks.

**7. What is a Support Vector Regressor (SVR)?**

SVR applies SVM ideas to **regression**, using an ε‑insensitive loss to fit a function within a tube around data while maximizing flatness.
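
A minimal sketch of the ε‑insensitive loss follows. The function name `epsilon_insensitive_loss` and the sample values are made up for illustration. Errors smaller than ε cost nothing, and larger errors are penalized linearly.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear outside it."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - epsilon)

y_true = np.array([1.00, 2.00, 3.00])
y_pred = np.array([1.05, 2.30, 3.50])   # absolute errors: 0.05, 0.30, 0.50
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)   # first point lies inside the tube, so its loss is 0
```

In scikit‑learn the tube half‑width is the `epsilon` parameter of `SVR`, which defaults to 0.1.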

**8. What is the Kernel Trick in SVM?**

The kernel trick computes inner products in a high‑dimensional feature space **without explicit mapping**, enabling non‑linear decision boundaries (e.g., RBF, polynomial).
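
To make this concrete, the sketch below compares an explicit degree‑2 feature map with the corresponding kernel on 2‑D inputs. The helper names `phi` and `poly2_kernel` are hypothetical. Both routes give the same number, but the kernel never builds the higher‑dimensional vectors.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector."""
    x1, x2 = v
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z)^2."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = np.dot(phi(x), phi(z))   # inner product in the 3-D feature space
implicit = poly2_kernel(x, z)       # same value, computed in the 2-D input space
print(explicit, implicit)
```

For the RBF kernel the implicit feature space is infinite‑dimensional, so the explicit route is not available at all.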

**9. Compare Linear Kernel, Polynomial Kernel, and RBF Kernel.**

- **Linear:** fast, works when data is roughly linearly separable.  
- **Polynomial:** models interactions up to degree d; can overfit.  
- **RBF:** popular default; creates flexible non‑linear boundaries controlled by γ.
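
For reference, the three kernels are commonly written as follows. Here \(\gamma\), \(r\), and \(d\) follow scikit‑learn's parameterization, where they correspond to the `gamma`, `coef0`, and `degree` arguments of `SVC`:

$$ K_{\text{linear}}(x,z) = x^\top z, \qquad K_{\text{poly}}(x,z) = (\gamma\, x^\top z + r)^d, \qquad K_{\text{RBF}}(x,z) = \exp\!\left(-\gamma\,\lVert x - z\rVert^2\right) $$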

**10. What is the effect of the C parameter in SVM?**

**C** controls the trade‑off between margin width and training error. Small C → wider margin, more tolerance for margin violations (stronger regularization); large C → narrower margin, fits the training data more strictly and can overfit.
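
The effect can be observed directly with a linear kernel, where the margin width is \(2/\lVert w\rVert\). The sketch below uses synthetic overlapping clusters from `make_blobs`. The dataset and the C values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters so that some margin violations are unavoidable
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

C_values = [0.01, 1, 100]
margin_widths = []
for C in C_values:
    clf = SVC(kernel='linear', C=C).fit(X, y)
    width = 2.0 / np.linalg.norm(clf.coef_)   # margin width = 2 / ||w||
    margin_widths.append(width)
    print(f'C={C}: margin width={width:.3f}, '
          f'support vectors={clf.n_support_.sum()}')
```

The smallest C should report the widest margin. The number of support vectors typically shrinks as C grows, since fewer points are allowed inside the margin.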

**11. What is the role of the Gamma parameter in RBF Kernel SVM?**

**Gamma (γ)** controls the influence of a single training example in RBF. Low γ → smoother, broader influence; High γ → tighter, can overfit.
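
A quick way to see this is to evaluate the RBF kernel between two fixed points for several γ values. The sketch below uses `rbf_kernel` from `sklearn.metrics.pairwise`. The points and γ values are chosen only for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[0.0, 0.0]])
b = np.array([[2.0, 0.0]])   # squared distance to a is 4

# K(a, b) = exp(-gamma * ||a - b||^2)
for gamma in [0.1, 1.0, 10.0]:
    k = rbf_kernel(a, b, gamma=gamma)[0, 0]
    print(f'gamma={gamma}: K(a, b)={k:.3g}')

low = rbf_kernel(a, b, gamma=0.1)[0, 0]
high = rbf_kernel(a, b, gamma=10.0)[0, 0]
```

With low γ the two points still look similar to the model. With high γ the similarity is practically zero, so each training point only influences its immediate neighbourhood.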

**12. What is the Naive Bayes classifier, and why is it called 'Naive'?**

Naive Bayes applies Bayes’ theorem with the **naive independence** assumption between features given the class; the assumption makes computation simple and scalable.
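
Written out for a feature vector \((x_1,\dots,x_n)\), with notation assumed here for illustration, the independence assumption turns the joint likelihood into a product of per‑feature terms:

$$ P(y \mid x_1,\dots,x_n) \propto P(y)\prod_{i=1}^{n} P(x_i \mid y), \qquad \hat{y} = \arg\max_{y}\ P(y)\prod_{i=1}^{n} P(x_i \mid y) $$

Each factor \(P(x_i \mid y)\) can be estimated separately from the training data, which is what keeps the method cheap.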

**13. What is Bayes’ Theorem?**

$$ P(y|x)=\frac{P(x|y)P(y)}{P(x)} $$ where \(P(y|x)\) is the posterior, \(P(x|y)\) the likelihood, \(P(y)\) the prior, and \(P(x)\) the evidence (a normalizing constant).
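
A small worked example may help. The probabilities below are hypothetical numbers chosen only to show the arithmetic, with \(y\) = "spam" and \(x\) = "the email contains the word *offer*".

```python
# Hypothetical numbers chosen only for illustration
p_spam = 0.2                 # prior P(spam)
p_word_given_spam = 0.6      # likelihood P("offer" | spam)
p_word_given_ham = 0.05      # likelihood P("offer" | ham)

# Evidence P("offer") via the law of total probability
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Posterior P(spam | "offer")
posterior = p_word_given_spam * p_spam / p_word
print(round(posterior, 3))
```

Even though only 20% of emails are spam under these assumptions, seeing the word raises the spam probability to 0.12 / 0.16 = 0.75.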

**14. Explain the differences between Gaussian NB, Multinomial NB, and Bernoulli NB.**

**Gaussian NB:** continuous features assumed normal. **Multinomial NB:** counts/frequencies (e.g., bag‑of‑words). **Bernoulli NB:** binary features (0/1).

**15. When should you use Gaussian Naive Bayes over other variants?**

When features are continuous and approximately Gaussian (e.g., medical measurements, sensor data).

**16. What are the key assumptions made by Naive Bayes?**

Features are conditionally independent given the class; distributions match the chosen variant (Gaussian/Multinomial/Bernoulli).

**17. What are the advantages and disadvantages of Naive Bayes?**

**Pros:** fast, works with small data, robust to irrelevant features, probabilistic outputs. **Cons:** independence assumption often violated; can be outperformed by more flexible models.

**18. Why is Naive Bayes a good choice for text classification?**

Text features (word counts) are high‑dimensional and sparse; the independence assumption works surprisingly well; Multinomial NB is very efficient.

**19. Compare SVM and Naive Bayes for classification tasks.**

SVM usually yields higher accuracy on complex boundaries but is slower on very large datasets; Naive Bayes is faster and good as a strong baseline, especially for text.

**20. How does Laplace Smoothing help in Naive Bayes?**

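Laplace (add‑one) smoothing avoids zero probabilities for feature values never seen with a class by adding 1 (or, more generally, a small constant α) to every count before normalization. Without it, a single unseen word would zero out the whole product of likelihoods.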
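
The arithmetic can be sketched with a three‑word vocabulary. The words and counts below are hypothetical, and `alpha` plays the same role as the `alpha` parameter of scikit‑learn's `MultinomialNB`.

```python
import numpy as np

# Hypothetical word counts within one class, e.g. "spam"
vocab = ['win', 'cash', 'meeting']
counts = np.array([3, 2, 0])
alpha = 1.0   # add-one (Laplace) smoothing

unsmoothed = counts / counts.sum()
smoothed = (counts + alpha) / (counts.sum() + alpha * len(vocab))

for word, p_raw, p_smooth in zip(vocab, unsmoothed, smoothed):
    print(f'{word}: raw={p_raw:.3f}, smoothed={p_smooth:.3f}')
```

The unseen word "meeting" moves from probability 0 to 1/8, and the smoothed probabilities still sum to 1.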

## Part 2 – Practical (Q21–Q46)

In [None]:
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV, StratifiedKFold, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, precision_recall_fscore_support, roc_auc_score, precision_recall_curve, mean_squared_error, mean_absolute_error
from sklearn.svm import SVC, SVR
from sklearn.feature_selection import RFE
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
import matplotlib.pyplot as plt
import joblib
import warnings
warnings.filterwarnings('ignore')

# Helper: simple confusion matrix plot (matplotlib only)
def plot_confusion_matrix(cm, classes):
    fig, ax = plt.subplots()
    im = ax.imshow(cm, interpolation='nearest')
    ax.figure.colorbar(im, ax=ax)
    ax.set(xticks=np.arange(cm.shape[1]), yticks=np.arange(cm.shape[0]),
           xticklabels=classes, yticklabels=classes,
           ylabel='True label', xlabel='Predicted label')
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], 'd'), ha='center', va='center')
    plt.tight_layout()
    plt.show()

# Load standard datasets
iris = datasets.load_iris()
wine = datasets.load_wine()
breast = datasets.load_breast_cancer()

X_iris, y_iris = iris.data, iris.target
X_wine, y_wine = wine.data, wine.target
X_bc, y_bc = breast.data, breast.target


### 21. Train SVM on Iris & evaluate accuracy

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_iris, y_iris, test_size=0.2, random_state=42)
clf = SVC(kernel='rbf', gamma='scale')
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print('Accuracy:', accuracy_score(y_test, pred))

### 22. Train SVM with Linear vs RBF on Wine and compare

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_wine, y_wine, test_size=0.2, random_state=42)
lin = SVC(kernel='linear').fit(X_train, y_train)
rbf = SVC(kernel='rbf').fit(X_train, y_train)
print('Linear:', lin.score(X_test, y_test))
print('RBF:', rbf.score(X_test, y_test))

### 23. Train SVR on synthetic 'housing' data & MSE

In [None]:
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=1200, n_features=8, noise=15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
reg = SVR(kernel='rbf')
reg.fit(X_train, y_train)
pred = reg.predict(X_test)
print('MSE:', mean_squared_error(y_test, pred))

### 24. SVM with Polynomial Kernel & visualize 2D decision boundary

In [None]:
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=400, n_features=2, n_redundant=0, n_clusters_per_class=1, random_state=42)
clf = SVC(kernel='poly', degree=3).fit(X, y)
# Plot
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300), np.linspace(y_min, y_max, 300))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:,0], X[:,1], c=y)
plt.show()

### 25. Gaussian NB on Breast Cancer – accuracy

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
gnb = GaussianNB().fit(X_train, y_train)
print('Accuracy:', gnb.score(X_test, y_test))

### 26. Multinomial NB for text (20 Newsgroups fallback)

In [None]:
# Try to fetch 20 Newsgroups; if unavailable (offline), use a small synthetic corpus
try:
    from sklearn.datasets import fetch_20newsgroups
    train = fetch_20newsgroups(subset='train', remove=('headers','footers','quotes'))
    test = fetch_20newsgroups(subset='test', remove=('headers','footers','quotes'))
    X_train, X_test, y_train, y_test = train.data, test.data, train.target, test.target
except Exception:  # e.g. no network access: fall back to a toy corpus
    X_train = ['sports match win', 'election debate', 'new graphics card', 'religion philosophy', 'hockey team scores', 'government policy']
    y_train = [0,1,2,3,0,1]
    X_test  = ['team won match', 'policy debate', 'gpu benchmark']
    y_test  = [0,1,2]
vectorizer = CountVectorizer()
Xtr = vectorizer.fit_transform(X_train); Xte = vectorizer.transform(X_test)
mnb = MultinomialNB().fit(Xtr, y_train)
pred = mnb.predict(Xte)
print('Accuracy:', accuracy_score(y_test, pred))

### 27. SVM with different C values – visualize boundaries

In [None]:
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=300, n_features=2, n_redundant=0, random_state=0)
for C in [0.1, 1, 10]:
    clf = SVC(kernel='rbf', C=C).fit(X, y)
    x_min, x_max = X[:, 0].min()-1, X[:, 0].max()+1
    y_min, y_max = X[:, 1].min()-1, X[:, 1].max()+1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200), np.linspace(y_min, y_max, 200))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.figure()
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:,0], X[:,1], c=y)
    plt.title(f'C={C}')
    plt.show()

### 28. Bernoulli NB on binary-feature dataset

In [None]:
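# Random binary features and labels: there is no real signal, so accuracy should be near 0.5
rng = np.random.default_rng(42)  # seeded for reproducibility
X = rng.integers(0, 2, size=(300, 10))
y = rng.integers(0, 2, size=300)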
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=42)
bnb = BernoulliNB().fit(X_train,y_train)
print('Accuracy:', bnb.score(X_test,y_test))

### 29. Feature scaling before SVM & compare

In [None]:
X = X_wine; y = y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
clf_raw = SVC(kernel='rbf').fit(X_tr, y_tr)
raw_acc = clf_raw.score(X_te, y_te)
sc = StandardScaler()
X_trs = sc.fit_transform(X_tr); X_tes = sc.transform(X_te)
clf_scaled = SVC(kernel='rbf').fit(X_trs, y_tr)
scaled_acc = clf_scaled.score(X_tes, y_te)  # labels are unchanged by scaling, so use y_te
print('Raw:', raw_acc, 'Scaled:', scaled_acc)
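
One possible variation, sketched below as an assumption rather than part of the original exercise, wraps the scaler and the classifier in a pipeline so that scaling is re‑fit on the training portion of every cross‑validation fold. It uses `make_pipeline` from `sklearn.pipeline` and is self‑contained.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# The scaler is re-fit on the training part of every fold,
# so no statistics from the held-out fold leak into training
model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv)
print('Mean CV accuracy (scaled, inside pipeline):', scores.mean())
```
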

### 30. Gaussian NB with different smoothing (`var_smoothing`, not Laplace `alpha`)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
gnb = GaussianNB(var_smoothing=1e-9).fit(X_train, y_train)
pred1 = gnb.predict(X_test)
gnb2 = GaussianNB(var_smoothing=1e-6).fit(X_train, y_train)
pred2 = gnb2.predict(X_test)
print('Default acc:', accuracy_score(y_test, pred1), 'Higher smoothing acc:', accuracy_score(y_test, pred2))

### 31. GridSearchCV for SVM (C, gamma, kernel)

In [None]:
params = {'C':[0.1,1,10],'gamma':['scale',0.01,0.1],'kernel':['rbf','linear']}
grid = GridSearchCV(SVC(), params, cv=3).fit(X_wine, y_wine)
print('Best:', grid.best_params_)

### 32. Class weighting on imbalanced data (SVM)

In [None]:
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9,0.1], flip_y=0.01, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
base = SVC().fit(X_tr,y_tr)
weighted = SVC(class_weight='balanced').fit(X_tr,y_tr)
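print('Base:', base.score(X_te,y_te), 'Weighted:', weighted.score(X_te,y_te))
# Accuracy alone hides minority-class performance, so also show per-class metrics
print(classification_report(y_te, base.predict(X_te)))
print(classification_report(y_te, weighted.predict(X_te)))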

### 33. Simple Naive Bayes spam detector (toy email data)

In [None]:
corpus = ['win cash now','limited time offer','meeting agenda attached','project update','cheap meds available','earn money fast']
y = [1,1,0,0,1,1]  # 1=spam, 0=ham
X = CountVectorizer().fit_transform(corpus)
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.33,random_state=42)
clf = MultinomialNB().fit(X_tr,y_tr)
print('Accuracy:', clf.score(X_te,y_te))

### 34. SVM vs Naive Bayes on same dataset

In [None]:
X, y = X_wine, y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
svm = SVC().fit(X_tr,y_tr)
nb = GaussianNB().fit(X_tr,y_tr)
print('SVM:', svm.score(X_te,y_te), 'GNB:', nb.score(X_te,y_te))

### 35. Feature selection before Naive Bayes

In [None]:
from sklearn.feature_selection import SelectKBest, f_classif
X, y = X_wine, y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
selector = SelectKBest(f_classif, k=8).fit(X_tr,y_tr)
X_tr_s = selector.transform(X_tr); X_te_s = selector.transform(X_te)
nb1 = GaussianNB().fit(X_tr,y_tr)
nb2 = GaussianNB().fit(X_tr_s,y_tr)
print('No FS:', nb1.score(X_te,y_te), 'With FS:', nb2.score(X_te_s,y_te))

### 36. SVM OvR vs OvO on Wine

In [None]:
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
X, y = X_wine, y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
ovr = OneVsRestClassifier(SVC()).fit(X_tr,y_tr)
ovo = OneVsOneClassifier(SVC()).fit(X_tr,y_tr)
print('OvR:', ovr.score(X_te,y_te), 'OvO:', ovo.score(X_te,y_te))

### 37. SVM with Linear, Poly, RBF on Breast Cancer

In [None]:
X, y = X_bc, y_bc
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
for k in ['linear','poly','rbf']:
    clf = SVC(kernel=k).fit(X_tr,y_tr)
    print(k, 'Accuracy:', clf.score(X_te,y_te))

### 38. SVM with Stratified K-Fold CV

In [None]:
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(SVC(), X_wine, y_wine, cv=skf)
print('Mean accuracy:', scores.mean())

### 39. Naive Bayes with different priors

In [None]:
X, y = X_wine, y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
priors = [None, [0.7,0.2,0.1]]
for p in priors:
    g = GaussianNB(priors=p).fit(X_tr,y_tr)
    print('Priors',p, 'Acc:', g.score(X_te,y_te))

### 40. RFE before SVM & compare accuracy

In [None]:
X, y = X_wine, y_wine
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
svc = SVC(kernel='linear')
rfe = RFE(estimator=svc, n_features_to_select=8).fit(X_tr,y_tr)
X_tr_r, X_te_r = rfe.transform(X_tr), rfe.transform(X_te)
svc_base = SVC(kernel='linear').fit(X_tr,y_tr)
svc_rfe  = SVC(kernel='linear').fit(X_tr_r,y_tr)
print('Base:', svc_base.score(X_te,y_te), 'RFE:', svc_rfe.score(X_te_r,y_te))

### 41. Precision, Recall, F1 for SVM

In [None]:
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
clf = SVC().fit(X_tr, y_tr)
pred = clf.predict(X_te)
p, r, f1, _ = precision_recall_fscore_support(y_te, pred, average='binary')
print('Precision:', p, 'Recall:', r, 'F1:', f1)

### 42. Naive Bayes with Log Loss (Cross-Entropy)

In [None]:
from sklearn.metrics import log_loss
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
gnb = GaussianNB().fit(X_tr, y_tr)
probs = gnb.predict_proba(X_te)
print('Log Loss:', log_loss(y_te, probs))

### 43. SVM Confusion Matrix (matplotlib)

In [None]:
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
clf = SVC().fit(X_tr, y_tr)
pred = clf.predict(X_te)
cm = confusion_matrix(y_te, pred)
plot_confusion_matrix(cm, classes=breast.target_names)  # label 0 = malignant, 1 = benign

### 44. SVR with MAE metric

In [None]:
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=1000, n_features=6, noise=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X,y,test_size=0.2,random_state=42)
reg = SVR().fit(X_tr,y_tr)
pred = reg.predict(X_te)
print('MAE:', mean_absolute_error(y_te, pred))

### 45. Naive Bayes ROC-AUC

In [None]:
# Use breast cancer (binary)
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
gnb = GaussianNB().fit(X_tr, y_tr)
probs = gnb.predict_proba(X_te)[:,1]
print('ROC-AUC:', roc_auc_score(y_te, probs))

### 46. SVM Precision-Recall Curve

In [None]:
# Breast cancer target is already binary (0 = malignant, 1 = benign), so no conversion is needed
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.2, random_state=42)
svm = SVC(probability=True).fit(X_tr, y_tr)
probs = svm.predict_proba(X_te)[:,1]
prec, rec, _ = precision_recall_curve(y_te, probs)
plt.plot(rec, prec)
plt.xlabel('Recall'); plt.ylabel('Precision'); plt.title('Precision-Recall Curve')
plt.show()