# THEORY

### 1. What is a Support Vector Machine (SVM)

**Answer:**  

SVM is a supervised learning algorithm used for classification and regression tasks.  

It works by finding the best hyperplane that separates data points of different classes with the maximum margin.  


### 2. What is the difference between Hard Margin and Soft Margin SVM

**Answer:**  

- **Hard Margin SVM** assumes that the data is linearly separable and doesn't allow any misclassification.  

- **Soft Margin SVM** allows some misclassifications and introduces slack variables to handle noisy or non-separable data.  


### 3. What is the mathematical intuition behind SVM

**Answer:**  

SVM tries to maximize the margin between two classes by solving an optimization problem:  

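Minimize \( \frac{1}{2} \|w\|^2 \) subject to \( y_i (w \cdot x_i + b) \ge 1 \) for every training point \( (x_i, y_i) \), so that each point is correctly classified and lies on or outside the margin. Because the margin width is \( 2 / \|w\| \), minimizing \( \|w\| \) maximizes the margin.  

In the soft-margin case, slack variables \( \xi_i \ge 0 \) relax the constraint to \( y_i (w \cdot x_i + b) \ge 1 - \xi_i \), and the penalty term \( C \sum_i \xi_i \) is added to the objective.  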
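
The optimization above can be checked numerically on a toy problem. The sketch below is only illustrative: the six data points are made up, and a very large `C` in scikit-learn's `SVC` is used as an approximation of the hard-margin problem rather than an exact hard-margin solver.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (illustrative values, not from the text).
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin problem.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]
b = clf.intercept_[0]

# Functional margins y_i (w . x_i + b); the constraint asks each to be >= 1.
functional_margins = y * (X @ w + b)

# Geometric margin width 2 / ||w||: minimizing ||w|| maximizes this quantity.
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("smallest functional margin:", functional_margins.min())
print("margin width:", margin_width)
```

For this data the closest points are (0, 1), (1, 0) and (3, 3), so the margin width should come out close to \( 5 / \sqrt{2} \approx 3.54 \).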


### 4. What is the role of Lagrange Multipliers in SVM

**Answer:**  

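Lagrange multipliers convert the constrained primal problem into its dual formulation, which depends on the training data only through dot products and therefore admits kernels.  

Each training point receives a multiplier \( \alpha_i \). The points with \( \alpha_i > 0 \) are the support vectors, and they alone determine the decision boundary.  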
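
As a rough illustration of the link between the multipliers and the boundary, the sketch below recovers the primal weight vector from the dual solution that scikit-learn exposes. The toy data is invented for the example; `dual_coef_` and `support_vectors_` are attributes of a fitted `SVC`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (illustrative values).
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)

# dual_coef_ stores y_i * alpha_i for the support vectors only;
# every other training point has alpha_i = 0 and drops out of the sum.
signed_alphas = clf.dual_coef_[0]
support_vectors = clf.support_vectors_

# Primal weights recovered from the dual solution: w = sum_i alpha_i y_i x_i
w_from_dual = signed_alphas @ support_vectors

print("support vector indices:", clf.support_)
print("w from dual: ", w_from_dual)
print("w from coef_:", clf.coef_[0])
```

Only a subset of the training points appears in `clf.support_`; the remaining points could be removed without changing the fitted hyperplane.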


### 5. What are Support Vectors in SVM

**Answer:**  

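Support vectors are the training points that lie closest to the decision boundary: on the margin or, in the soft-margin case, inside it or on the wrong side of it.  

They alone determine the position and orientation of the separating hyperplane; removing any other training point leaves it unchanged.  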


### 6. What is a Support Vector Classifier (SVC)

**Answer:**  

SVC is the SVM applied to classification problems; in scikit-learn it is the `sklearn.svm.SVC` class.  

It uses different kernels to handle linear and non-linear classification tasks.  


### 7. What is a Support Vector Regressor (SVR)

**Answer:**  

SVR is the regression version of SVM.  

Instead of maximizing the margin for classification, SVR tries to fit a function within an ε-tube around the target values while minimizing model complexity.  


### 8. What is the Kernel Trick in SVM

**Answer:**  

The kernel trick allows SVM to operate in a high-dimensional feature space without explicitly transforming the data.  

It uses kernel functions to compute dot products in the transformed space efficiently.  
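
A small numerical check can make this concrete. The sketch below assumes a degree-2 polynomial kernel \( K(x, z) = (x \cdot z + 1)^2 \) on 2-D inputs (a choice made for this example) and compares it with the dot product of an explicit 6-D feature map; `phi` and `poly_kernel` are hypothetical helper names.

```python
import numpy as np

def phi(v):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D input."""
    x1, x2 = v
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

def poly_kernel(a, b):
    """The same inner product, computed directly in the original 2-D space."""
    return (np.dot(a, b) + 1.0) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))  # inner product in the 6-D feature space
implicit = poly_kernel(x, z)       # kernel evaluated in the 2-D input space

print(explicit, implicit)  # both equal 4.0 (up to floating-point rounding)
```

The kernel gives the same value without ever constructing the 6-D vectors, which is what makes very high-dimensional (or infinite-dimensional) feature spaces tractable.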


### 9. Compare Linear Kernel, Polynomial Kernel, and RBF Kernel

**Answer:**  

- **Linear Kernel:** Suited to data that is (nearly) linearly separable. Simple and fast.  

- **Polynomial Kernel:** Captures feature interactions up to the chosen polynomial degree.  

- **RBF Kernel:** Handles highly non-linear data. It implicitly corresponds to an infinite-dimensional feature space.  


### 10. What is the effect of the C parameter in SVM

**Answer:**  

C controls the trade-off between maximizing the margin and minimizing classification error on the training data.  

- High C → less tolerant of misclassifications: narrower margin, closer to hard-margin behavior, higher risk of overfitting.  
- Low C → more tolerant of misclassifications: wider margin, stronger regularization.  
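
The trade-off can be observed by fitting the same data with several values of `C`. The following is a minimal sketch on synthetic, overlapping clusters from `make_blobs`; the dataset, the cluster spread and the three `C` values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so some margin violations are unavoidable.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

margin_widths = {}
n_support = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Margin width is 2 / ||w|| for a linear SVM.
    margin_widths[C] = 2.0 / np.linalg.norm(clf.coef_[0])
    n_support[C] = int(clf.n_support_.sum())
    print(f"C={C:<6} margin width={margin_widths[C]:.3f} "
          f"support vectors={n_support[C]}")
```

The smallest `C` should yield the widest margin, since the penalty on slack is too weak to stop the optimizer from shrinking \( \|w\| \).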


### 11. What is the role of the Gamma parameter in RBF Kernel SVM

**Answer:**  

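Gamma defines how far the influence of a single training example reaches in the RBF kernel \( K(x, z) = \exp(-\gamma \|x - z\|^2) \).  

- High gamma → each example influences only its immediate neighborhood, so the boundary fits the training data closely (risk of overfitting).  
- Low gamma → each example influences a wide region, giving a smoother decision boundary (more generalization, but risk of underfitting if too low).  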


### 12. What is the Naïve Bayes classifier, and why is it called "Naïve"

**Answer:**  

Naïve Bayes is a probabilistic classifier based on Bayes' Theorem.  

It is called "naïve" because it assumes that all features are conditionally independent given the class label.  


### 13. What is Bayes’ Theorem

**Answer:**  

\[
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
\]  

It gives the posterior probability of event A given that event B has occurred, in terms of the likelihood \( P(B|A) \), the prior \( P(A) \), and the evidence \( P(B) \).  
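
A short worked example may help. The numbers below (a 1% prior, 95% sensitivity, 5% false positive rate) are assumed purely for illustration and do not come from any dataset.

```python
# Illustrative numbers (assumed for this example).
p_disease = 0.01             # P(A): prior probability of having the condition
p_pos_given_disease = 0.95   # P(B|A): probability of a positive test if ill
p_pos_given_healthy = 0.05   # P(B|not A): false positive rate

# P(B) via the law of total probability.
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(disease | positive test) = {p_disease_given_pos:.3f}")  # 0.161
```

Even with an accurate test, the posterior stays low because the prior is small; this is the same prior-times-likelihood reasoning that Naïve Bayes applies per class.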


### 14. Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes

**Answer:**  

- **Gaussian NB:** Assumes features follow a normal distribution. Used for continuous data.  

- **Multinomial NB:** Used for discrete counts (e.g., word counts in text).  

- **Bernoulli NB:** Models binary features (e.g., presence or absence of a word in a document).  


### 15. When should you use Gaussian Naïve Bayes over other variants

**Answer:**  

Use Gaussian NB when your input features are continuous and roughly follow a normal distribution.  


### 16. What are the key assumptions made by Naïve Bayes

**Answer:**  

- All features are conditionally independent given the class label.  

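- Each feature contributes to the class score independently of the others, and each variant additionally assumes a specific feature distribution (Gaussian, multinomial, or Bernoulli).  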


### 17. What are the advantages and disadvantages of Naïve Bayes

**Answer:**  

**Advantages:**  

- Fast and efficient  
- Works well with high-dimensional data  
- Performs well on text classification  

**Disadvantages:**  

- Assumes feature independence, which rarely holds exactly  
- Correlated features are effectively counted more than once, which skews the predicted probabilities  


### 18. Why is Naïve Bayes a good choice for text classification

**Answer:**  

Text data often has many features (words), and Naïve Bayes handles high-dimensional spaces well.  

It’s fast, works well with sparse data, and gives good performance in practice.  


### 19. Compare SVM and Naïve Bayes for classification tasks

**Answer:**  

- **SVM:** More powerful and flexible, handles complex boundaries well, slower training.  

- **Naïve Bayes:** Simpler, faster, and works surprisingly well for text.  

- Use NB when speed and simplicity matter; use SVM for more complex tasks.  


### 20. How does Laplace Smoothing help in Naïve Bayes

**Answer:**  

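Laplace smoothing (add-1 smoothing) adds 1 to every feature count, which prevents a zero probability for any word that never appeared with a given class in the training data.  

Without it, a single unseen word would drive the whole class probability to zero during prediction, no matter what the other features say.  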
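
The effect is easy to see with plain counts. The sketch below uses a hypothetical helper `word_likelihoods` and made-up word counts for a "spam" class; the smoothed estimate is \( (\text{count} + \alpha) / (\text{total} + \alpha \cdot V) \), where \( V \) is the vocabulary size.

```python
def word_likelihoods(counts, alpha):
    """Estimate P(word | class) from word counts with additive smoothing."""
    total = sum(counts.values())
    vocab_size = len(counts)
    return {
        word: (count + alpha) / (total + alpha * vocab_size)
        for word, count in counts.items()
    }

# Hypothetical word counts observed in the "spam" class.
spam_counts = {"free": 3, "win": 2, "meeting": 0}

unsmoothed = word_likelihoods(spam_counts, alpha=0.0)
smoothed = word_likelihoods(spam_counts, alpha=1.0)

print(unsmoothed["meeting"])  # 0.0: one such word zeroes the class probability
print(smoothed["meeting"])    # 0.125, i.e. (0 + 1) / (5 + 1 * 3)
```

In scikit-learn the same idea is exposed as the `alpha` parameter of `MultinomialNB` and `BernoulliNB`.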


# PRACTICAL

# 🤖 SVM and Naïve Bayes Practical Questions (with Descriptive Answers)

### 21. Write a Python program to train an SVM Classifier on the Iris dataset and evaluate accuracy

**Answer:**  

Use `sklearn.datasets.load_iris`, `SVC`, and `train_test_split`.  

Train the model, make predictions, and evaluate using `accuracy_score`.  
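
One possible version of this program is sketched below. The 80/20 stratified split and `random_state=42` are arbitrary choices, and `SVC()` is left at its defaults (RBF kernel, `C=1.0`).

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

clf = SVC()  # default RBF kernel, C=1.0
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```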


### 22. Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then compare their accuracies

**Answer:**  

Use `SVC(kernel='linear')` and `SVC(kernel='rbf')`.  

Train both models and compare `accuracy_score` on test data.  


### 23. Write a Python program to train an SVM Regressor (SVR) on a housing dataset and evaluate it using Mean Squared Error (MSE)

**Answer:**  

Use `SVR()` from `sklearn.svm`, train on housing data, and evaluate using `mean_squared_error`.  
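
A minimal sketch follows. To stay self-contained it uses synthetic data from `make_regression` as a stand-in for a real housing dataset (an assumption of this example); swapping in actual housing features and prices only changes the loading step. The linear kernel and `C=100.0` are likewise illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for a housing dataset (features -> price).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# SVR is sensitive to feature scale, so scale inside the pipeline.
model = make_pipeline(StandardScaler(), SVR(kernel="linear", C=100.0, epsilon=0.1))
model.fit(X_train, y_train)

mse = mean_squared_error(y_test, model.predict(X_test))

# Reference point: always predicting the mean of the training targets.
baseline_mse = mean_squared_error(y_test, [y_train.mean()] * len(y_test))

print(f"SVR MSE: {mse:.2f}  (mean-prediction baseline: {baseline_mse:.2f})")
```

Comparing against the mean-prediction baseline gives the raw MSE value some context, since MSE depends on the scale of the target.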


### 24. Write a Python program to train an SVM Classifier with a Polynomial Kernel and visualize the decision boundary

**Answer:**  

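Train `SVC(kernel='poly')` on two features, predict over a grid built with `numpy.meshgrid`, and draw the predicted regions with `matplotlib` (for example `contourf`), overlaying the training points.  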


### 25. Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and evaluate accuracy

**Answer:**  

Use `GaussianNB()` from `sklearn.naive_bayes` with `load_breast_cancer()` dataset.  

Evaluate using `accuracy_score`.  


### 26. Write a Python program to train a Multinomial Naïve Bayes classifier for text classification using the 20 Newsgroups dataset

**Answer:**  

Use `MultinomialNB()` and `fetch_20newsgroups_vectorized()`.  

Train and evaluate the model using accuracy or F1 score.  


### 27. Write a Python program to train an SVM Classifier with different C values and compare the decision boundaries visually

**Answer:**  

Loop through several values of `C` in `SVC()` and plot decision boundaries using `matplotlib`.  


### 28. Write a Python program to train a Bernoulli Naïve Bayes classifier for binary classification on a dataset with binary features

**Answer:**  

Use `BernoulliNB()` on binary datasets such as text data with presence/absence of words.  


### 29. Write a Python program to apply feature scaling before training an SVM model and compare results with unscaled data

**Answer:**  

Use `StandardScaler()` to scale features before training `SVC()`.  

Compare model accuracy with and without scaling.  
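
The comparison could look like the sketch below. The Wine dataset is an assumed choice (its features have very different ranges, which makes the effect visible); the split parameters are arbitrary.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# RBF SVM on raw features: large-valued features dominate the distances.
unscaled = SVC().fit(X_train, y_train)
acc_unscaled = accuracy_score(y_test, unscaled.predict(X_test))

# Same model, with standardization fitted on the training split only.
scaled = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)
acc_scaled = accuracy_score(y_test, scaled.predict(X_test))

print(f"Unscaled accuracy: {acc_unscaled:.3f}")
print(f"Scaled accuracy:   {acc_scaled:.3f}")
```

Putting the scaler inside the pipeline ensures it is fitted on the training data only, so no information from the test split leaks into preprocessing.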


### 30. Write a Python program to train a Gaussian Naïve Bayes model and compare the predictions before and after Laplace Smoothing

**Answer:**  

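`GaussianNB` has no Laplace smoothing parameter, because Laplace smoothing applies to count-based likelihoods (the `alpha` parameter of `MultinomialNB` and `BernoulliNB`). Its closest analogue is `var_smoothing`, which adds a fraction of the largest feature variance to every variance for numerical stability. Compare predictions across different `var_smoothing` values, or switch to `MultinomialNB` and compare `alpha=1.0` against a near-zero `alpha`.  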


### 31. Write a Python program to train an SVM Classifier and use GridSearchCV to tune the hyperparameters (C, gamma, kernel)

**Answer:**  

Use `GridSearchCV()` with a parameter grid of C, gamma, and kernel for tuning `SVC`.  
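
A compact sketch of such a search is shown below. The Iris dataset, the grid values, and the step names `scale` and `svc` are assumptions of this example; note that `gamma` is ignored by the linear kernel, so some grid points are redundant.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1, 1],
    "svc__kernel": ["linear", "rbf"],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
print(f"Test accuracy:    {search.score(X_test, y_test):.3f}")
```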


### 32. Write a Python program to train an SVM Classifier on an imbalanced dataset and apply class weighting and check if it improves accuracy

**Answer:**  

Set `class_weight='balanced'` in `SVC()` and evaluate before and after. On imbalanced data, accuracy alone can be misleading, so also compare recall or F1 on the minority class.  


### 33. Write a Python program to implement a Naïve Bayes classifier for spam detection using email data

**Answer:**  

Convert the message text to word counts with `CountVectorizer` (or `TfidfVectorizer`), then train `MultinomialNB()` on a labeled dataset (like SMS Spam Collection).  

Evaluate using accuracy or F1 score.  


### 34. Write a Python program to train an SVM Classifier and a Naïve Bayes Classifier on the same dataset and compare their accuracy

**Answer:**  

Train both `SVC()` and `MultinomialNB()` or `GaussianNB()` on the same train-test split.  

Compare their `accuracy_score`.  


### 35. Write a Python program to perform feature selection before training a Naïve Bayes classifier and compare results

**Answer:**  

Use `SelectKBest` with a score function such as `chi2` (which requires non-negative features) or `f_classif` to select features before `MultinomialNB()` or `GaussianNB()`.  

Compare accuracy with and without selection.  
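
A possible sketch, assuming the Breast Cancer dataset (whose features are all non-negative, as `chi2` requires), `GaussianNB`, and an arbitrary `k=10`:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)

all_features = GaussianNB()

# Selection lives inside the pipeline so it is re-fitted on each CV fold.
top_k = make_pipeline(SelectKBest(chi2, k=10), GaussianNB())

acc_all = cross_val_score(all_features, X, y, cv=5).mean()
acc_top_k = cross_val_score(top_k, X, y, cv=5).mean()

print(f"All {X.shape[1]} features: {acc_all:.3f}")
print(f"Top 10 features:  {acc_top_k:.3f}")
```

Running the selector inside cross-validation avoids choosing features with knowledge of the validation folds.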


### 36. Write a Python program to train an SVM Classifier using One-vs-Rest (OvR) and One-vs-One (OvO) strategies on the Wine dataset and compare their accuracy

**Answer:**  

Use `OneVsRestClassifier(SVC())` and `OneVsOneClassifier(SVC())` and compare accuracies.  
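
A sketch of the comparison is given below. The linear kernel, the scaling step and the split parameters are choices made for this example. Note that a plain `SVC` already handles multiclass problems with a one-vs-one scheme internally, so the wrappers mainly make the strategy explicit.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

results = {}
n_estimators = {}
for name, wrapper in (("OvR", OneVsRestClassifier), ("OvO", OneVsOneClassifier)):
    model = make_pipeline(StandardScaler(), wrapper(SVC(kernel="linear")))
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))
    # Number of binary SVMs the strategy trained under the hood.
    n_estimators[name] = len(model.steps[-1][1].estimators_)
    print(f"{name}: accuracy={results[name]:.3f}, "
          f"binary classifiers={n_estimators[name]}")
```

With three classes both strategies train three binary classifiers (OvR: one per class; OvO: one per pair), so the difference in cost only shows up with more classes.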


### 37. Write a Python program to train an SVM Classifier using Linear, Polynomial, and RBF kernels on the Breast Cancer dataset and compare their accuracy

**Answer:**  

Train three `SVC()` models with `kernel='linear'`, `kernel='poly'`, and `kernel='rbf'`.  

Compare model accuracies.  


### 38. Write a Python program to train an SVM Classifier using Stratified K-Fold Cross-Validation and compute the average accuracy

**Answer:**  

Use `StratifiedKFold` with `cross_val_score(SVC())` to compute mean accuracy.  
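
For instance, assuming the Iris dataset and five shuffled folds (both arbitrary choices for this sketch):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Each fold keeps the same class proportions as the full dataset.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(SVC(), X, y, cv=cv, scoring="accuracy")

print("Fold accuracies:", scores.round(3))
print(f"Mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```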


### 39. Write a Python program to train a Naïve Bayes classifier using different prior probabilities and compare performance

**Answer:**  

Use the `priors` parameter in `GaussianNB()` and compare accuracy or log loss.  


### 40. Write a Python program to perform Recursive Feature Elimination (RFE) before training an SVM Classifier and compare accuracy

**Answer:**  

Use `RFE(estimator=SVC(kernel='linear'))` to select features before training.  

Compare model accuracy with full vs selected features.  
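
One way to set this up is sketched below, assuming the Breast Cancer dataset and an arbitrary target of 10 features. A linear kernel is used because `RFE` ranks features by the estimator's `coef_`, which non-linear kernels do not provide.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Baseline: linear SVM on all 30 scaled features.
full_model = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# RFE repeatedly drops the feature with the smallest weight until 10 remain.
rfe_model = make_pipeline(
    StandardScaler(),
    RFE(estimator=SVC(kernel="linear"), n_features_to_select=10),
    SVC(kernel="linear"),
)

acc_full = accuracy_score(y_test, full_model.fit(X_train, y_train).predict(X_test))
acc_rfe = accuracy_score(y_test, rfe_model.fit(X_train, y_train).predict(X_test))
n_selected = int(rfe_model.named_steps["rfe"].support_.sum())

print(f"All features:       {acc_full:.3f}")
print(f"{n_selected} selected features: {acc_rfe:.3f}")
```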


### 41. Write a Python program to train an SVM Classifier and evaluate its performance using Precision, Recall, and F1-Score instead of accuracy

**Answer:**  

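Use `precision_score`, `recall_score`, and `f1_score` from `sklearn.metrics` to evaluate `SVC`. For multiclass targets, pass an `average` argument such as `'macro'` or `'weighted'`; `classification_report` prints all three per class.  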


### 42. Write a Python program to train a Naïve Bayes Classifier and evaluate its performance using Log Loss (Cross-Entropy Loss)

**Answer:**  

Use `log_loss()` from `sklearn.metrics` on the class probabilities returned by the NB model's `predict_proba()`.  


### 43. Write a Python program to train an SVM Classifier and visualize the Confusion Matrix using seaborn

**Answer:**  

Train `SVC()` on the training split, compute `confusion_matrix()` from its predictions on the test split, and plot the result with `seaborn.heatmap()`.  


### 44. Write a Python program to train an SVM Regressor (SVR) and evaluate its performance using Mean Absolute Error (MAE) instead of MSE

**Answer:**  

Train `SVR()` and evaluate using `mean_absolute_error` from `sklearn.metrics`.  


### 45. Write a Python program to train a Naïve Bayes classifier and evaluate its performance using the ROC-AUC score

**Answer:**  

Use `roc_auc_score()` with the positive-class probabilities from the NB model's `predict_proba()` (scikit-learn's Naïve Bayes classifiers do not provide `decision_function()`).  
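
A minimal sketch, assuming `GaussianNB` on the binary Breast Cancer dataset with an arbitrary 75/25 split:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

nb = GaussianNB().fit(X_train, y_train)

# Column 1 of predict_proba holds P(class == 1 | x), the positive-class score.
positive_scores = nb.predict_proba(X_test)[:, 1]

auc = roc_auc_score(y_test, positive_scores)
print(f"ROC-AUC: {auc:.3f}")
```

ROC-AUC needs continuous scores rather than hard labels; passing the output of `predict()` instead would collapse the curve to a single operating point.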


### 46. Write a Python program to train an SVM Classifier and visualize the Precision-Recall Curve

**Answer:**  

Pass the test labels and the scores from `SVC.decision_function()` to `precision_recall_curve()`, then plot recall against precision with `matplotlib.pyplot.plot()`.  
