
**Theoretical**

1.  What is a Support Vector Machine (SVM)?

Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression tasks. It works by finding the optimal hyperplane that best separates different classes in a dataset, maximizing the margin between them.



2. What is the difference between Hard Margin and Soft Margin SVM?

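Hard Margin SVM: Works only when the data is linearly separable. It requires every training point to lie on the correct side of the margin, so no misclassification is allowed and a single outlier can make the problem infeasible.  
Soft Margin SVM: Allows some points to violate the margin or be misclassified by adding a penalty (controlled by the C parameter) for each violation, making it more robust for real-world data that may contain noise.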


3. What is the mathematical intuition behind SVM?

SVM aims to maximize the **margin** (distance between the closest points of different classes, called support vectors) while minimizing misclassification. The optimization problem involves minimizing \( ||w||^2 \) subject to correct classification constraints. It is solved using **Lagrange multipliers** and **quadratic programming**.
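
As a rough numerical check of the margin idea, the sketch below fits a linear SVM to a tiny hand-made separable dataset (hypothetical data, not part of this notebook) and computes the margin width as \( 2 / ||w|| \). A very large `C` is used here only to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical): class 0 sits on x = 0, class 1 sits on x = 2
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y_toy = np.array([0, 0, 1, 1])

# A very large C approximates a hard-margin SVM
clf = SVC(kernel="linear", C=1e6)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]                          # normal vector of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)    # distance between the two margin lines
print(f"w = {w}, b = {clf.intercept_[0]:.3f}")
print(f"Margin width: {margin_width:.3f}")
```

For this layout the separating line is x = 1, so the computed width should come out close to 2 (up to solver tolerance).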


4. What is the role of Lagrange Multipliers in SVM?

Lagrange multipliers convert the constrained optimization problem of SVM into its dual form, where the constraints are absorbed into the objective. In the dual, only the training points with non-zero multipliers, the **support vectors**, influence the decision boundary, so the solution depends on them rather than on the entire dataset.
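
For reference, here is a sketch of the standard hard-margin derivation. It assumes labels \( y_i \in \{-1, +1\} \), uses the conventional factor \( \tfrac{1}{2} \) (which does not change the minimizer), and writes \( \alpha_i \) for the multiplier attached to training point \( (x_i, y_i) \).

\[
\min_{w,\,b} \; \tfrac{1}{2} ||w||^2 \quad \text{subject to} \quad y_i (w \cdot x_i + b) \ge 1 \;\; \text{for all } i
\]

\[
L(w, b, \alpha) = \tfrac{1}{2} ||w||^2 - \sum_i \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right], \qquad \alpha_i \ge 0
\]

Setting the derivatives with respect to \( w \) and \( b \) to zero gives

\[
w = \sum_i \alpha_i y_i x_i, \qquad \sum_i \alpha_i y_i = 0
\]

and substituting back yields the dual problem

\[
\max_{\alpha} \; \sum_i \alpha_i - \tfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, (x_i \cdot x_j) \quad \text{subject to} \quad \alpha_i \ge 0, \;\; \sum_i \alpha_i y_i = 0
\]

Only points with \( \alpha_i > 0 \) contribute to \( w \); these are the support vectors.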



5. What are Support Vectors in SVM?

Support vectors are the data points closest to the decision boundary (hyperplane). They are the most critical points in defining the classifier, as they determine the margin width.
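
A small sketch of this idea, using hypothetical toy data: points far from the boundary should not appear among the support vectors, and removing them should leave the fitted hyperplane essentially unchanged. `support_` and `coef_` are attributes of scikit-learn's fitted `SVC`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (hypothetical): four points near the boundary, two far away
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0],
                  [-3.0, 0.0], [5.0, 1.0]])
y_toy = np.array([0, 0, 1, 1, 0, 1])

clf = SVC(kernel="linear", C=1e6).fit(X_toy, y_toy)
sv_indices = set(clf.support_.tolist())   # indices of the support vectors
print("Support vector indices:", sorted(sv_indices))

# Refit without the two far-away points (indices 4 and 5)
clf_reduced = SVC(kernel="linear", C=1e6).fit(X_toy[:4], y_toy[:4])
print("w (all points):      ", clf.coef_[0])
print("w (near points only):", clf_reduced.coef_[0])
```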


6. What is a Support Vector Classifier (SVC)?

A Support Vector Classifier (SVC) is an extension of SVM that applies to datasets where classes are not completely separable. It introduces a **soft margin** to allow misclassification and improve generalization.



7. What is a Support Vector Regressor (SVR)?

Support Vector Regressor (SVR) applies SVM principles to regression problems. Instead of maximizing the margin between classes, it tries to fit a function within a specified error margin (\(\epsilon\)).
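
To illustrate the role of \(\epsilon\), the sketch below fits an SVR to synthetic noise-free sine data (an assumption made for illustration) with a narrow and a wide tube. Points that fall strictly inside the tube incur no loss and are not support vectors, so a wider tube typically needs fewer of them.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data (hypothetical, for illustration only)
X_demo = np.linspace(0, 5, 50).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()

n_support = {}
for eps in [0.01, 0.5]:
    reg = SVR(kernel="rbf", C=10, epsilon=eps).fit(X_demo, y_demo)
    n_support[eps] = len(reg.support_)   # number of support vectors
    print(f"epsilon = {eps}: {n_support[eps]} support vectors")
```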



8. What is the Kernel Trick in SVM?

The **Kernel Trick** allows SVM to operate in a **higher-dimensional space** without explicitly transforming the data. It uses kernel functions to compute the dot product in this higher space, enabling SVM to classify data that is not linearly separable.
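
A minimal sketch of why this works, assuming the degree-2 polynomial kernel \( K(x, z) = (x \cdot z)^2 \) on 2-D inputs: the kernel value equals the dot product of an explicit 3-D feature map, but is computed without ever building that map. The helper `phi` is introduced here purely for illustration.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (illustrative helper)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = phi(x) @ phi(z)   # dot product in the 3-D feature space
kernel = (x @ z) ** 2        # same value, computed in the original 2-D space
print(explicit, kernel)
```

Both quantities evaluate to 16 for these inputs; for kernels such as RBF the explicit map is infinite-dimensional, so only the kernel side of this equality is computable.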



9. Compare Linear Kernel, Polynomial Kernel, and RBF Kernel

- **Linear Kernel**: Best for linearly separable data. Simple and efficient.  
- **Polynomial Kernel**: Captures more complex relationships, but can be computationally expensive.  
- **RBF (Radial Basis Function) Kernel**: Maps data into infinite-dimensional space, making it highly flexible for non-linear data.
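
The three kernels can also be compared by their formulas. The sketch below evaluates each one by hand for a single pair of points and checks the result against scikit-learn's `sklearn.metrics.pairwise` helpers; the values of `gamma`, `degree`, and `coef0` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

x = np.array([[1.0, 2.0]])
z = np.array([[2.0, 0.0]])
gamma, degree, coef0 = 0.5, 3, 1.0   # illustrative values

dot = (x @ z.T)[0, 0]
k_linear = dot                                    # x . z
k_poly = (gamma * dot + coef0) ** degree          # (gamma * x . z + coef0) ** degree
k_rbf = np.exp(-gamma * np.sum((x - z) ** 2))     # exp(-gamma * ||x - z||^2)

print("linear:", k_linear, linear_kernel(x, z)[0, 0])
print("poly:  ", k_poly, polynomial_kernel(x, z, degree=degree, gamma=gamma, coef0=coef0)[0, 0])
print("rbf:   ", k_rbf, rbf_kernel(x, z, gamma=gamma)[0, 0])
```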



10. What is the effect of the C parameter in SVM?

The **C parameter** controls the trade-off between margin size and misclassification.  
- **High C** → Smaller margin, fewer misclassifications (risk of overfitting).  
- **Low C** → Larger margin, more misclassification allowed (better generalization).  
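
The trade-off can be observed numerically. The sketch below fits a linear SVM to synthetic overlapping blobs (generated with `make_blobs`, an assumption for illustration) and reports the margin width \( 2 / ||w|| \) for several values of `C`; the width is expected to shrink as `C` grows.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters (synthetic data, for illustration only)
X_demo, y_demo = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

margins = {}
for C in [0.01, 1, 100]:
    clf = SVC(kernel="linear", C=C).fit(X_demo, y_demo)
    margins[C] = 2.0 / np.linalg.norm(clf.coef_[0])
    print(f"C = {C}: margin width = {margins[C]:.3f}, "
          f"support vectors = {clf.n_support_.sum()}")
```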



11. What is the role of the Gamma parameter in RBF Kernel SVM?

Gamma (\(\gamma\)) defines how far the influence of a single training example reaches.  
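- **High Gamma** → Each training point influences only its immediate neighborhood, producing a tightly fitted boundary (may overfit).  
- **Low Gamma** → Each training point's influence reaches far, producing a smoother, more general decision boundary (better generalization, but may underfit if too low).  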
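
A quick way to see this is to compare training and test accuracy across gamma values. The sketch below uses the synthetic `make_moons` dataset (an assumption for illustration); a very high gamma is expected to fit the training set almost perfectly while generalizing worse.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-moons data (synthetic, for illustration only)
X_demo, y_demo = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=0)

train_acc, test_acc = {}, {}
for gamma in [0.01, 1, 1000]:
    clf = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr)
    train_acc[gamma] = clf.score(X_tr, y_tr)
    test_acc[gamma] = clf.score(X_te, y_te)
    print(f"gamma = {gamma}: train = {train_acc[gamma]:.2f}, test = {test_acc[gamma]:.2f}")
```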



12. What is the Naïve Bayes classifier, and why is it called "Naïve"?

Naïve Bayes is a probabilistic classifier based on **Bayes' Theorem**, assuming that features are **independent** given the class label. It is called "naïve" because this independence assumption is often unrealistic in real-world data.



13.  What is Bayes’ Theorem?

Bayes' Theorem describes the probability of an event based on prior knowledge of related conditions:  
\[
P(A|B) = \frac{P(B|A) P(A)}{P(B)}
\]  
Where:  
- \( P(A|B) \) = Posterior probability of A given B  
- \( P(B|A) \) = Likelihood  
- \( P(A) \) = Prior probability  
- \( P(B) \) = Evidence  
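
A short worked example may help. The numbers below are hypothetical: a condition with 1% prevalence, a test that detects it 95% of the time, and a 5% false-positive rate. The evidence \( P(B) \) is obtained with the law of total probability.

```python
# Hypothetical numbers for a diagnostic test
p_disease = 0.01              # P(A): prior
p_pos_given_disease = 0.95    # P(B|A): likelihood
p_pos_given_healthy = 0.05    # false-positive rate, P(B | not A)

# P(B): total probability of a positive test
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# P(A|B): posterior probability of disease given a positive test
posterior = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {posterior:.3f}")
```

With these inputs the posterior is about 0.161: even after a positive result the condition remains unlikely, because the prior is so small.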



14. Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes

- **Gaussian Naïve Bayes**: Used for **continuous** data (assumes a normal distribution).  
- **Multinomial Naïve Bayes**: Used for **text classification** (word counts).  
- **Bernoulli Naïve Bayes**: Used for **binary** features (e.g., presence/absence of words in text data).  



15.  When should you use Gaussian Naïve Bayes over other variants?

Gaussian Naïve Bayes is ideal when working with **continuous numerical features** that follow a normal (Gaussian) distribution, such as **height, weight, or temperature**.



16. What are the key assumptions made by Naïve Bayes?

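- **Feature Independence**: All features are conditionally independent given the class.  
- **Equal Contribution**: Every feature contributes equally to the final classification; no feature is weighted more heavily than another.  
- **Class Conditional Independence**: The formal statement of the first assumption: given the class, the value of one feature carries no information about any other, so \( P(x_1, \dots, x_n | y) = \prod_i P(x_i | y) \).  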



17.  What are the advantages and disadvantages of Naïve Bayes?

**Advantages:**  
- Fast and efficient, even with large datasets.  
- Works well for text classification.  
- Requires relatively little training data to estimate its parameters.  

**Disadvantages:**  
- Assumes **feature independence**, which is often unrealistic.  
- Struggles with features that are highly correlated.  



18. Why is Naïve Bayes a good choice for text classification?

Naïve Bayes performs well for text classification because:  
- It handles **high-dimensional** data efficiently.  
- The **independence assumption**, although not strictly true for words in a document, works well enough in practice to pick the most likely class.  
- It requires **low computational power**.  



19.  Compare SVM and Naïve Bayes for classification tasks

| Feature | SVM | Naïve Bayes |  
|---------|-----|-------------|  
| Works well with | Large feature space | High-dimensional text data |  
| Assumptions | No independence assumption | Assumes feature independence |  
| Speed | Slower for large datasets | Very fast and efficient |  
| Complexity and accuracy | More complex, often higher accuracy | Simple, works well with small data |  
| Overfitting | Tuned using C & gamma | Less prone to overfitting |  





20. How does Laplace Smoothing help in Naïve Bayes?

Laplace Smoothing prevents **zero probability issues** when a word is absent in training data but appears in testing. It modifies probability estimation by adding a smoothing parameter (\(\alpha\)):


\[P(word | class) = \frac{\text{word count} + \alpha}{\text{total words} + \alpha \times \text{vocabulary size}}\]  


Where **\(\alpha = 1\)** is commonly used.  
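
The formula can be checked against scikit-learn. The sketch below uses a hypothetical 3-document, 3-word count matrix, computes the smoothed estimates by hand, and compares them with what `MultinomialNB(alpha=1.0)` stores in `feature_log_prob_`.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count matrix: 3 documents, vocabulary of 3 words
X_counts = np.array([[2, 1, 0],
                     [1, 0, 0],
                     [0, 1, 3]])
y_docs = np.array([0, 0, 1])
alpha = 1.0

# Manual estimate: (word count + alpha) / (total words + alpha * vocabulary size)
manual = np.zeros((2, 3))
for c in (0, 1):
    word_counts = X_counts[y_docs == c].sum(axis=0)
    manual[c] = (word_counts + alpha) / (word_counts.sum() + alpha * X_counts.shape[1])

model = MultinomialNB(alpha=alpha).fit(X_counts, y_docs)
print("Manual:\n", manual)
print("scikit-learn:\n", np.exp(model.feature_log_prob_))
```

The third word never occurs in class 0, yet its smoothed probability is 1/7 rather than zero.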



**Practical**

1. Write a Python program to train an SVM Classifier on the Iris dataset and evaluate accuracy

In [1]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

iris = datasets.load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear')
svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"SVM Classifier Accuracy on Iris Dataset: {accuracy:.2f}")


SVM Classifier Accuracy on Iris Dataset: 1.00


2. Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then
compare their accuracies

In [2]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

wine = datasets.load_wine()
X, y = wine.data, wine.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

linear_svm = SVC(kernel='linear')
rbf_svm = SVC(kernel='rbf')

linear_svm.fit(X_train, y_train)
rbf_svm.fit(X_train, y_train)

linear_acc = accuracy_score(y_test, linear_svm.predict(X_test))
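rbf_acc = accuracy_score(y_test, rbf_svm.predict(X_test))

print(f"Linear Kernel Accuracy: {linear_acc:.2f}")
print(f"RBF Kernel Accuracy: {rbf_acc:.2f}")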


3. Write a Python program to train an SVM Regressor (SVR) on a housing dataset and evaluate it using Mean
Squared Error (MSE)

In [None]:
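from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

housing = fetch_openml(name='house_prices', version=1, as_frame=True)
# SVR needs numeric input, so keep only the numeric columns
X = housing.data.select_dtypes(include='number')
y = housing.target.astype(float)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fill missing values and scale the features before fitting the SVR
svr = make_pipeline(SimpleImputer(strategy='median'), StandardScaler(), SVR())
svr.fit(X_train, y_train)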

y_pred = svr.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
print(f"SVR Mean Squared Error: {mse:.2f}")


4. Write a Python program to train an SVM Classifier with a Polynomial Kernel and visualize the decision
boundary

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data[:, :2], iris.target  # Using only two features for visualization

model = SVC(kernel='poly', degree=3)
model.fit(X, y)

x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100), np.linspace(y_min, y_max, 100))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
plt.show()


5. Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and
evaluate accuracy

In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb_model = GaussianNB()
nb_model.fit(X_train, y_train)

y_pred = nb_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Gaussian Naïve Bayes Accuracy on Breast Cancer Dataset: {accuracy:.2f}")


6. Write a Python program to train a Multinomial Naïve Bayes classifier for text classification using the 20
Newsgroups dataset.

In [None]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

newsgroups = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
X_train, X_test, y_train, y_test = train_test_split(newsgroups.data, newsgroups.target, test_size=0.2, random_state=42)

vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

nb_model = MultinomialNB()
nb_model.fit(X_train_tfidf, y_train)

y_pred = nb_model.predict(X_test_tfidf)

accuracy = accuracy_score(y_test, y_pred)
print(f"Multinomial Naïve Bayes Accuracy on 20 Newsgroups Dataset: {accuracy:.2f}")


7. Write a Python program to train an SVM Classifier with different C values and compare the decision
boundaries visually

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data[:, :2], iris.target

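# The plotting grid depends only on the data, so build it once
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100), np.linspace(y_min, y_max, 100))

C_values = [0.1, 1, 10]
fig, axes = plt.subplots(1, len(C_values), figsize=(15, 4))
for ax, C in zip(axes, C_values):
    model = SVC(kernel='linear', C=C)
    model.fit(X, y)

    # Predict the class of every grid point and draw the decision regions
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, Z, alpha=0.3)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
    ax.set_title(f"C = {C}")

plt.show()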


8. Write a Python program to train a Bernoulli Naïve Bayes classifier for binary classification on a dataset with
binary features.

In [None]:
from sklearn.naive_bayes import BernoulliNB
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

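X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=0, n_clusters_per_class=1, random_state=42)
# BernoulliNB models binary features, so threshold each continuous feature at 0
X = (X > 0).astype(int)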
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

bnb_model = BernoulliNB()
bnb_model.fit(X_train, y_train)

y_pred = bnb_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Bernoulli Naïve Bayes Accuracy: {accuracy:.2f}")


9. Write a Python program to apply feature scaling before training an SVM model and compare results with
unscaled data

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

svm_model = SVC(kernel='linear')
svm_model.fit(X_train_scaled, y_train)

y_pred_scaled = svm_model.predict(X_test_scaled)
accuracy_scaled = accuracy_score(y_test, y_pred_scaled)

svm_model.fit(X_train, y_train)
y_pred_unscaled = svm_model.predict(X_test)
accuracy_unscaled = accuracy_score(y_test, y_pred_unscaled)

print(f"SVM Accuracy with Scaled Data: {accuracy_scaled:.2f}")
print(f"SVM Accuracy with Unscaled Data: {accuracy_unscaled:.2f}")


10. Write a Python program to train a Gaussian Naïve Bayes model and compare the predictions before and
after Laplace Smoothing

In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# GaussianNB has no Laplace (additive) smoothing. Its closest analogue is
# var_smoothing, which adds a small fraction of the largest feature variance
# to every variance for numerical stability.
nb_model_no_smoothing = GaussianNB(var_smoothing=0)
nb_model_no_smoothing.fit(X_train, y_train)

y_pred_no_smoothing = nb_model_no_smoothing.predict(X_test)
accuracy_no_smoothing = accuracy_score(y_test, y_pred_no_smoothing)

nb_model_with_smoothing = GaussianNB()  # default var_smoothing=1e-9
nb_model_with_smoothing.fit(X_train, y_train)

y_pred_with_smoothing = nb_model_with_smoothing.predict(X_test)
accuracy_with_smoothing = accuracy_score(y_test, y_pred_with_smoothing)

print(f"Accuracy without variance smoothing: {accuracy_no_smoothing:.2f}")
print(f"Accuracy with variance smoothing (default 1e-9): {accuracy_with_smoothing:.2f}")


11. Write a Python program to train an SVM Classifier and use GridSearchCV to tune the hyperparameters (C,
gamma, kernel)

In [None]:
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 'auto'], 'kernel': ['linear', 'rbf']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

print(f"Best Parameters: {grid_search.best_params_}")


12. Write a Python program to train an SVM Classifier on an imbalanced dataset, apply class weighting, and
check whether it improves accuracy

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=0, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

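# Baseline without class weighting
baseline_model = SVC(kernel='linear')
baseline_model.fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline_model.predict(X_test))

# Same model with weights inversely proportional to class frequencies
svm_model = SVC(kernel='linear', class_weight='balanced')
svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"SVM Classifier without Class Weighting Accuracy: {baseline_acc:.2f}")
print(f"SVM Classifier with Class Weighting Accuracy: {accuracy:.2f}")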


13. Write a Python program to implement a Naïve Bayes classifier for spam detection using email data

In [None]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Example email data (for demonstration purposes, replace with actual email dataset)
emails = ['Free money now!', 'Meeting at 3pm', 'Limited time offer', 'Let\'s catch up tomorrow']
labels = [1, 0, 1, 0]  # 1: Spam, 0: Not Spam

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
y = labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb_model = MultinomialNB()
nb_model.fit(X_train, y_train)

y_pred = nb_model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"Naïve Bayes Spam Detection Accuracy: {accuracy:.2f}")


14. Write a Python program to train an SVM Classifier and a Naïve Bayes Classifier on the same dataset and
compare their accuracy

In [None]:
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear')
svm_model.fit(X_train, y_train)
svm_acc = accuracy_score(y_test, svm_model.predict(X_test))

nb_model = GaussianNB()
nb_model.fit(X_train, y_train)
nb_acc = accuracy_score(y_test, nb_model.predict(X_test))

print(f"SVM Classifier Accuracy: {svm_acc:.2f}")
print(f"Naïve Bayes Classifier Accuracy: {nb_acc:.2f}")


15. Write a Python program to perform feature selection before training a Naïve Bayes classifier and compare
results

In [None]:
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

selector = SelectKBest(f_classif, k=2)
X_train_selected = selector.fit_transform(X_train, y_train)
X_test_selected = selector.transform(X_test)

nb_model = GaussianNB()
nb_model.fit(X_train_selected, y_train)

y_pred_selected = nb_model.predict(X_test_selected)
accuracy_selected = accuracy_score(y_test, y_pred_selected)

nb_model.fit(X_train, y_train)
y_pred_full = nb_model.predict(X_test)
accuracy_full = accuracy_score(y_test, y_pred_full)

print(f"Naïve Bayes Accuracy with Feature Selection: {accuracy_selected:.2f}")
print(f"Naïve Bayes Accuracy without Feature Selection: {accuracy_full:.2f}")


16. Write a Python program to train an SVM Classifier using One-vs-Rest (OvR) and One-vs-One (OvO)
strategies on the Wine dataset and compare their accuracy

In [None]:
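from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

wine = load_wine()
X, y = wine.data, wine.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# SVC always trains one-vs-one internally; decision_function_shape only changes
# the shape of the decision_function output. The multiclass wrappers below
# train genuinely different OvR and OvO ensembles.
ovr_model = OneVsRestClassifier(SVC(kernel='linear'))
ovo_model = OneVsOneClassifier(SVC(kernel='linear'))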

ovr_model.fit(X_train, y_train)
ovo_model.fit(X_train, y_train)

ovr_acc = accuracy_score(y_test, ovr_model.predict(X_test))
ovo_acc = accuracy_score(y_test, ovo_model.predict(X_test))

print(f"One-vs-Rest Accuracy: {ovr_acc:.2f}")
print(f"One-vs-One Accuracy: {ovo_acc:.2f}")


17. Write a Python program to train an SVM Classifier using Linear, Polynomial, and RBF kernels on the Breast
Cancer dataset and compare their accuracy

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

kernels = ['linear', 'poly', 'rbf']
for kernel in kernels:
    model = SVC(kernel=kernel)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    print(f"Accuracy with {kernel} kernel: {accuracy:.2f}")


18. Write a Python program to train an SVM Classifier using Stratified K-Fold Cross-Validation and compute the
average accuracy

In [None]:
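from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score

iris = load_iris()
X, y = iris.data, iris.target

svm_model = SVC(kernel='linear')
# Make the stratified splitter explicit (shuffled, reproducible)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(svm_model, X, y, cv=skf)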

print(f"Average Accuracy with Stratified K-Fold Cross-Validation: {cv_scores.mean():.2f}")


19. Write a Python program to train a Naïve Bayes classifier using different prior probabilities and compare
performance

In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

prior_probabilities = [np.array([0.3, 0.4, 0.3]), np.array([0.2, 0.5, 0.3]), np.array([0.4, 0.4, 0.2])]
for priors in prior_probabilities:
    nb_model = GaussianNB(priors=priors)
    nb_model.fit(X_train, y_train)
    y_pred = nb_model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Naïve Bayes Accuracy with priors {priors}: {accuracy:.2f}")


20. Write a Python program to perform Recursive Feature Elimination (RFE) before training an SVM Classifier and
compare accuracy

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear')
rfe = RFE(svm_model, n_features_to_select=2)
X_train_rfe = rfe.fit_transform(X_train, y_train)
X_test_rfe = rfe.transform(X_test)

svm_model.fit(X_train_rfe, y_train)
y_pred_rfe = svm_model.predict(X_test_rfe)
accuracy_rfe = accuracy_score(y_test, y_pred_rfe)

svm_model.fit(X_train, y_train)
y_pred_full = svm_model.predict(X_test)
accuracy_full = accuracy_score(y_test, y_pred_full)

print(f"SVM Accuracy with RFE: {accuracy_rfe:.2f}")
print(f"SVM Accuracy without RFE: {accuracy_full:.2f}")


21. Write a Python program to train an SVM Classifier and evaluate its performance using Precision, Recall, and
F1-Score instead of accuracy

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear')
svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)

precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1-Score: {f1:.2f}")


22. Write a Python program to train a Naïve Bayes Classifier and evaluate its performance using Log Loss
(Cross-Entropy Loss).

In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb_model = GaussianNB()
nb_model.fit(X_train, y_train)

y_pred_prob = nb_model.predict_proba(X_test)
loss = log_loss(y_test, y_pred_prob)

print(f"Log Loss (Cross-Entropy Loss): {loss:.2f}")


23. Write a Python program to train an SVM Classifier and visualize the Confusion Matrix using seaborn

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear')
svm_model.fit(X_train, y_train)

y_pred = svm_model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=cancer.target_names, yticklabels=cancer.target_names)
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()


24. Write a Python program to train an SVM Regressor (SVR) and evaluate its performance using Mean Absolute
Error (MAE) instead of MSE

In [None]:
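from sklearn.svm import SVR
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

housing = fetch_openml(name='house_prices', version=1, as_frame=True)
# SVR needs numeric input, so keep only the numeric columns
X = housing.data.select_dtypes(include='number')
y = housing.target.astype(float)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fill missing values and scale the features before fitting the SVR
svr = make_pipeline(SimpleImputer(strategy='median'), StandardScaler(), SVR())
svr.fit(X_train, y_train)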

y_pred = svr.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
print(f"SVR Mean Absolute Error: {mae:.2f}")


25. Write a Python program to train a Naïve Bayes classifier and evaluate its performance using the ROC-AUC
score

In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb_model = GaussianNB()
nb_model.fit(X_train, y_train)

y_pred_prob = nb_model.predict_proba(X_test)[:, 1]

roc_auc = roc_auc_score(y_test, y_pred_prob)
print(f"ROC-AUC Score: {roc_auc:.2f}")


26. Write a Python program to train an SVM Classifier and visualize the Precision-Recall Curve.

In [None]:
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

cancer = load_breast_cancer()
X, y = cancer.data, cancer.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

svm_model = SVC(kernel='linear', probability=True)
svm_model.fit(X_train, y_train)

y_pred_prob = svm_model.predict_proba(X_test)[:, 1]

precision, recall, _ = precision_recall_curve(y_test, y_pred_prob)

plt.plot(recall, precision, color='blue')
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall Curve')
plt.show()
