
1 What is a Support Vector Machine (SVM)?

SVM is a supervised learning algorithm used for classification and regression tasks. It aims to find the hyperplane (decision boundary) that best separates data points of different classes by maximizing the margin between them.

2 What is the difference between Hard Margin and Soft Margin SVM?

Hard Margin SVM: Used when the data is linearly separable. It creates a decision boundary where no data points are allowed inside the margin or on the wrong side of the hyperplane.

Soft Margin SVM: Used when the data is not perfectly separable. It allows some points to violate the margin or be misclassified, and balances maximizing the margin against minimizing the classification error. This trade-off is controlled by the regularization parameter C.

3 What is the mathematical intuition behind SVM?

SVM looks for a hyperplane that separates the classes, either in the original feature space or, when a kernel is used, in a higher-dimensional space where the classes become linearly separable. The goal is to find the hyperplane that maximizes the margin, defined as the distance between the hyperplane and the closest points from each class (the support vectors).
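The margin can be read off a fitted linear model. With the usual scaling, the closest points satisfy |w·x + b| = 1, so their distance to the hyperplane is 1/||w|| and the full gap between the two classes is 2/||w||. A minimal sketch, assuming scikit-learn's `SVC`; the four-point dataset is hypothetical and the very large `C` only approximates a hard margin:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two linearly separable classes along the x-axis.
X = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
y = np.array([0, 0, 1, 1])

# A very large C approximates a hard-margin SVM.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane w.x + b = 0
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)  # gap between the two margin boundaries

print("w =", w, "b =", b)
print("margin width =", margin_width)
```

The closest points of the two classes are (1, 0) and (3, 0), so the gap found by the model should be close to 2, with the boundary halfway between them.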

4 What is the role of Lagrange Multipliers in SVM?

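Lagrange multipliers are used to handle the constraints in the SVM optimization problem (maximizing the margin subject to every point being classified correctly). Each constraint gets its own multiplier, and the constrained problem is rewritten as a dual problem over those multipliers. The dual depends on the data only through dot products, which is what makes the kernel trick possible, and only the points with non-zero multipliers (the support vectors) affect the optimal hyperplane.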
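The hard-margin case can be written out as a short derivation. The notation is an assumption of this sketch rather than part of the original answer: labels y_i ∈ {−1, +1}, weight vector w, bias b, and one multiplier α_i per training point.

```latex
\begin{aligned}
&\text{Primal:} && \min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
  \quad \text{s.t.}\quad y_i\,(w^{\top}x_i + b) \ge 1,\ \ i = 1,\dots,n \\
&\text{Lagrangian:} && L(w, b, \alpha) = \tfrac{1}{2}\lVert w\rVert^{2}
  - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i\,(w^{\top}x_i + b) - 1 \bigr],
  \qquad \alpha_i \ge 0 \\
&\text{Stationarity:} && w = \sum_{i=1}^{n} \alpha_i y_i x_i,
  \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 \\
&\text{Dual:} && \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^{\top}x_j
  \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i=1}^{n} \alpha_i y_i = 0
\end{aligned}
```

In the soft-margin case the only change to the dual is the box constraint 0 ≤ α_i ≤ C.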

5 What are Support Vectors in SVM?

Support vectors are the data points that are closest to the decision boundary (hyperplane). These points are critical as they define the position and orientation of the hyperplane. They are the ones that determine the margin of the SVM model.
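A fitted model exposes its support vectors directly. A minimal sketch, assuming scikit-learn's `SVC` with its `support_`, `support_vectors_`, and `n_support_` attributes, trained on the Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear").fit(X, y)

# Only a subset of the training points ends up defining the boundary.
print("Support vectors per class:", clf.n_support_)
print("Indices of support vectors:", clf.support_)
print("Shape of support vector array:", clf.support_vectors_.shape)
```

Removing any training point that is not a support vector and refitting would leave the decision boundary unchanged.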

6 What is a Support Vector Classifier (SVC)?

SVC is a specific implementation of SVM for classification tasks. It learns the optimal hyperplane that maximizes the margin between classes, making it suitable for binary and multiclass classification.

7 What is a Support Vector Regressor (SVR)?

SVR is a variant of SVM used for regression tasks. Instead of separating data into classes, it tries to fit a function whose predictions deviate from the actual values by at most a tolerance ε (the epsilon parameter). Errors inside this tolerance are ignored, and only points outside it contribute to the loss.
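The effect of the tolerance can be seen by counting support vectors, since points that fall strictly inside the ε-tube do not become support vectors. A minimal sketch, assuming scikit-learn's `SVR`; the noisy sine-wave data and the two `epsilon` values are hypothetical:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical noisy 1-D regression data.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)

n_sv = {}
for eps in (0.01, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    n_sv[eps] = len(model.support_)  # points on or outside the epsilon-tube
    print(f"epsilon={eps}: {n_sv[eps]} support vectors out of {len(X)}")
```

A wider tube tolerates more of the noise, so fewer points are needed to describe the fitted function.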

8 What is the Kernel Trick in SVM?

The kernel trick allows SVM to operate in higher-dimensional feature spaces without explicitly computing the coordinates of the data in that space. This is done by using a kernel function (such as the polynomial or radial basis function (RBF) kernel) to calculate the dot product in the higher-dimensional space.
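The idea can be checked by hand for a small case. The sketch below uses the degree-2 polynomial kernel k(x, z) = (x·z)² on 2-D inputs, whose explicit feature map is φ(x) = (x₁², √2·x₁x₂, x₂²). The function names `phi` and `poly2_kernel` are illustrative, not part of any library:

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector."""
    x1, x2 = v
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly2_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: (a . b)^2."""
    return float(np.dot(a, b)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

explicit = float(np.dot(phi(x), phi(z)))  # dot product in the 3-D feature space
implicit = poly2_kernel(x, z)             # same value, computed in the 2-D input space

print(explicit, implicit)
```

Both routes give the same number, but the kernel never builds the 3-D vectors. For the RBF kernel the feature space is infinite-dimensional, so the implicit route is the only practical one.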

9 Compare Linear Kernel, Polynomial Kernel, and RBF Kernel:

Linear Kernel: Suitable for linearly separable data. It computes the dot product of input vectors directly.

Polynomial Kernel: Useful for non-linear data. It computes the dot product raised to a power d, allowing for curved decision boundaries.

RBF Kernel: The most commonly used kernel in SVM. It projects data into an infinite-dimensional space, allowing complex decision boundaries. It computes the similarity between data points based on their Euclidean distance.
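The three kernels can be compared on a single pair of points. A minimal sketch, assuming the helpers in `sklearn.metrics.pairwise`; the points and the `gamma`, `coef0`, and `degree` values are chosen only for illustration:

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

x = np.array([[1.0, 2.0]])
z = np.array([[2.0, 0.0]])

# Linear: x . z
k_lin = linear_kernel(x, z)[0, 0]

# Polynomial: (gamma * x . z + coef0) ** degree
k_poly = polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=1.0)[0, 0]

# RBF: exp(-gamma * ||x - z||^2)
k_rbf = rbf_kernel(x, z, gamma=0.5)[0, 0]

print(f"linear={k_lin:.4f}, polynomial={k_poly:.4f}, rbf={k_rbf:.4f}")
```

Here x·z = 2 and ||x − z||² = 5, so the linear kernel returns 2, the polynomial kernel (2 + 1)² = 9, and the RBF kernel exp(−2.5).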

10 What is the effect of the C parameter in SVM?

The C parameter controls the trade-off between maximizing the margin and minimizing classification error. A high value of C tries to classify all training points correctly, possibly leading to overfitting. A lower C allows more misclassification but can improve generalization.
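The trade-off shows up in the width of the margin. A minimal sketch, assuming scikit-learn's `SVC` and a hypothetical noisy two-class dataset from `make_classification`; the two `C` values are arbitrary:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical overlapping two-class data (flip_y adds label noise).
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1,
                           flip_y=0.1, random_state=0)

margin_width = {}
n_support = {}
for C in (0.001, 10.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin_width[C] = 2.0 / np.linalg.norm(clf.coef_[0])  # gap between margin boundaries
    n_support[C] = int(clf.n_support_.sum())
    print(f"C={C}: margin width={margin_width[C]:.3f}, support vectors={n_support[C]}")
```

A small C accepts many margin violations in exchange for a wide margin, while a large C narrows the margin to reduce training errors.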

11 What is the role of the Gamma parameter in RBF Kernel SVM?

The gamma parameter of the RBF kernel determines how far the influence of a single training example reaches. A small gamma means a far-reaching influence and a smoother decision boundary, while a large gamma means each training point influences only its immediate neighborhood, leading to a more complex decision boundary that can overfit.
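The reach of a training point follows directly from the kernel formula k(x, z) = exp(−γ·||x − z||²). A minimal sketch in NumPy; the helper `rbf`, the two points, and the gamma values are illustrative:

```python
import numpy as np

def rbf(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2)."""
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

x = np.array([0.0, 0.0])
z = np.array([2.0, 0.0])  # a point at Euclidean distance 2 from x

similarity = {gamma: rbf(x, z, gamma) for gamma in (0.01, 1.0, 10.0)}
for gamma, value in similarity.items():
    print(f"gamma={gamma}: k(x, z)={value:.6f}")
```

With a small gamma the two points still look similar at distance 2. With a large gamma the similarity is effectively zero, so each training point only affects predictions very close to it.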

12 What is the Naïve Bayes classifier, and why is it called "Naïve"?

Naïve Bayes is a classification algorithm based on Bayes' Theorem that assumes that features are conditionally independent given the class. The term "naïve" comes from this assumption, which is often unrealistic but still works well in many cases, especially for text classification.

13 What is Bayes’ Theorem?

Bayes' Theorem describes the probability of a class $C$ given observed features $X$ using the formula:

$$P(C \mid X) = \frac{P(X \mid C)\,P(C)}{P(X)}$$

where $P(C \mid X)$ is the posterior probability of the class given the features, $P(X \mid C)$ is the likelihood, $P(C)$ is the prior probability of the class, and $P(X)$ is the evidence.
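A worked example makes the four terms concrete. The numbers below are hypothetical and chosen only for illustration: 20% of emails are spam, the word "free" appears in 60% of spam and in 5% of non-spam (ham).

```python
# Hypothetical numbers chosen for illustration only.
p_spam = 0.2                # prior P(C = spam)
p_word_given_spam = 0.6     # likelihood P(X = "free" | spam)
p_word_given_ham = 0.05     # likelihood P(X = "free" | ham)

# Evidence P(X) via the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Posterior P(spam | "free") from Bayes' Theorem.
p_spam_given_word = p_word_given_spam * p_spam / p_word

print(f"P('free') = {p_word:.2f}")
print(f"P(spam | 'free') = {p_spam_given_word:.2f}")
```

The evidence is 0.12 + 0.04 = 0.16, so the posterior is 0.12 / 0.16 = 0.75: seeing the word raises the probability of spam from 20% to 75%.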

14 Explain the differences between Gaussian Naïve Bayes, Multinomial Naïve Bayes, and Bernoulli Naïve Bayes:

Gaussian Naïve Bayes: Assumes that the features follow a Gaussian (normal) distribution.

Multinomial Naïve Bayes: Assumes that the features are counts or frequencies of events (e.g., word counts in text), which makes it a common choice for text classification.

Bernoulli Naïve Bayes: Assumes that the features are binary (e.g., the presence or absence of a word in text classification).
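The three variants map onto three kinds of feature matrix. A minimal sketch, assuming scikit-learn's `GaussianNB`, `MultinomialNB`, and `BernoulliNB`; the tiny datasets are hypothetical:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])

# Real-valued measurements -> Gaussian
X_continuous = np.array([[1.2, 0.7], [0.9, 1.1], [3.8, 4.1], [4.2, 3.9]])
# Non-negative counts (e.g. word counts) -> Multinomial
X_counts = np.array([[3, 0], [2, 0], [0, 4], [0, 3]])
# Presence/absence indicators -> Bernoulli
X_binary = (X_counts > 0).astype(int)

pred_gauss = GaussianNB().fit(X_continuous, y).predict(X_continuous)
pred_multi = MultinomialNB().fit(X_counts, y).predict(X_counts)
pred_bern = BernoulliNB().fit(X_binary, y).predict(X_binary)

print(pred_gauss, pred_multi, pred_bern)
```

Each model recovers the training labels on this easy data. The point of the sketch is the pairing of feature type and variant, not the accuracy.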

15 When should you use Gaussian Naïve Bayes over other variants?

Gaussian Naïve Bayes is ideal when the features are continuous and can be modeled using a Gaussian distribution. It is suitable for cases where the feature values are real numbers, such as in medical data with continuous attributes.

16 What are the key assumptions made by Naïve Bayes?

The key assumption in Naïve Bayes is the conditional independence of features given the class. This assumption simplifies the computation of the likelihood term $P(X \mid C)$ to the product of the individual feature probabilities: $P(X \mid C) = \prod_i P(x_i \mid C)$.

17 What are the advantages and disadvantages of Naïve Bayes?

Advantages:
Simple and easy to implement. Efficient with large datasets. Works well for text classification (e.g., spam detection).

Disadvantages:
The conditional independence assumption is often unrealistic. It doesn't work well when features are highly correlated.

18 Why is Naïve Bayes a good choice for text classification?

Naïve Bayes works well for text classification because treating words as conditionally independent given the class reduces training to counting word frequencies per class. This makes it computationally efficient and effective for tasks like spam detection or sentiment analysis, even with large vocabularies.

19 Compare SVM and Naïve Bayes for classification tasks:

SVM is more powerful when the decision boundary is complex and non-linear. It works well for small to medium-sized datasets with many features, especially when there is a clear margin of separation.

Naïve Bayes, on the other hand, is more efficient and works well when the features are independent, and the data is relatively simple. It is often used in text classification due to its simplicity and speed.

20 How does Laplace Smoothing help in Naïve Bayes?

Laplace Smoothing (also called additive smoothing) is used in Naïve Bayes to handle the problem of zero probabilities when a feature value does not appear in the training set for a particular class. It adds a small constant (usually 1) to the count of each feature to ensure that all features have a non-zero probability.
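The arithmetic is short enough to do by hand. A minimal sketch with hypothetical word counts for one class, where the word "meeting" never occurred in training:

```python
# Hypothetical word counts for one class (e.g. "spam"); "meeting" never occurred.
counts = {"free": 3, "money": 2, "meeting": 0}
alpha = 1.0  # Laplace (add-one) smoothing constant

total = sum(counts.values())
vocab_size = len(counts)

# Maximum-likelihood estimate: an unseen word gets probability zero.
unsmoothed = {w: c / total for w, c in counts.items()}

# Smoothed estimate: (count + alpha) / (total + alpha * vocabulary size).
smoothed = {w: (c + alpha) / (total + alpha * vocab_size) for w, c in counts.items()}

print("unsmoothed:", unsmoothed)
print("smoothed:  ", smoothed)
```

Without smoothing, a single unseen word would zero out the whole product of likelihoods for that class. In scikit-learn this constant is the `alpha` parameter of `MultinomialNB` and `BernoulliNB`; the `var_smoothing` parameter of `GaussianNB` is a different mechanism that stabilizes variances.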


# 21  Write a Python program to train an SVM Classifier on the Iris dataset and evaluate accuracy:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM Classifier with a Linear Kernel
svm_clf = SVC(kernel='linear')
svm_clf.fit(X_train, y_train)

# Make predictions and evaluate accuracy
y_pred = svm_clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"SVM Classifier Accuracy with Linear Kernel: {accuracy:.4f}")


     
SVM Classifier Accuracy with Linear Kernel: 1.0000

# 22 Write a Python program to train two SVM classifiers with Linear and RBF kernels on the Wine dataset, then
# compare their accuracies:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Wine dataset
data = load_wine()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM Classifier with Linear Kernel
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)

# Train an SVM Classifier with RBF Kernel
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)

# Make predictions and evaluate accuracy for Linear Kernel
y_pred_linear = svm_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)

# Make predictions and evaluate accuracy for RBF Kernel
y_pred_rbf = svm_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)

# Print the accuracies
print(f"SVM Accuracy with Linear Kernel: {accuracy_linear:.4f}")
print(f"SVM Accuracy with RBF Kernel: {accuracy_rbf:.4f}")

     
SVM Accuracy with Linear Kernel: 0.9815
SVM Accuracy with RBF Kernel: 0.7593

# 24 Write a Python program to train an SVM Classifier with a Polynomial Kernel and visualize the decision
# boundary:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Generate a synthetic 2D dataset
# n_informative + n_redundant + n_repeated must not exceed n_features
X, y = make_classification(n_samples=100, n_features=2, n_classes=2, n_informative=2, n_redundant=0, n_repeated=0, random_state=42)

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize the features (important for SVMs)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train an SVM Classifier with Polynomial Kernel
svm_poly = SVC(kernel='poly', degree=3)
svm_poly.fit(X_train, y_train)

# Visualize the decision boundary
xx, yy = np.meshgrid(np.linspace(X_train[:, 0].min(), X_train[:, 0].max(), 100),
                     np.linspace(X_train[:, 1].min(), X_train[:, 1].max(), 100))
Z = svm_poly.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.75, cmap='coolwarm')
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolors='k', marker='o', cmap='coolwarm')
plt.title('SVM with Polynomial Kernel - Decision Boundary')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()
     


# 25 Write a Python program to train a Gaussian Naïve Bayes classifier on the Breast Cancer dataset and
# evaluate accuracy:
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Breast Cancer dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a Gaussian Naïve Bayes classifier
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Make predictions and evaluate accuracy
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Gaussian Naïve Bayes Accuracy: {accuracy:.4f}")

     
Gaussian Naïve Bayes Accuracy: 0.9415

# 26 Write a Python program to train a Multinomial Naïve Bayes classifier for text classification using the 20
# Newsgroups dataset
from sklearn.naive_bayes import MultinomialNB
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score

# Load 20 Newsgroups dataset
newsgroups = fetch_20newsgroups(subset='all')

# Convert the text data into TF-IDF features
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(newsgroups.data)
y = newsgroups.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a Multinomial Naïve Bayes classifier
mnb = MultinomialNB()
mnb.fit(X_train, y_train)

# Make predictions and evaluate accuracy
y_pred = mnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print(f"Multinomial Naïve Bayes Accuracy: {accuracy:.4f}")

     
Multinomial Naïve Bayes Accuracy: 0.8721

# 28 Write a Python program to train a Bernoulli Naïve Bayes classifier for binary classification on a dataset with
# binary features
from sklearn.naive_bayes import BernoulliNB
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate synthetic binary classification dataset with binary features
X, y = make_classification(n_samples=100, n_features=10, n_informative=5, n_classes=2, random_state=42,
                            n_clusters_per_class=1, flip_y=0, class_sep=2)

# Convert the dataset to binary features
X = (X > 0).astype(int)  # Convert to binary features (0 or 1)

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a Bernoulli Naïve Bayes classifier
bnb = BernoulliNB()
bnb.fit(X_train, y_train)

# Make predictions
y_pred = bnb.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Bernoulli Naïve Bayes Classifier Accuracy: {accuracy:.4f}")

     
Bernoulli Naïve Bayes Classifier Accuracy: 1.0000

# 29 Write a Python program to apply feature scaling before training an SVM model and compare results with
# unscaled data
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM Classifier without scaling the data
svm_clf_unscaled = SVC(kernel='linear')
svm_clf_unscaled.fit(X_train, y_train)
y_pred_unscaled = svm_clf_unscaled.predict(X_test)
accuracy_unscaled = accuracy_score(y_test, y_pred_unscaled)

# Apply feature scaling (Standardization)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train an SVM Classifier with scaled data
svm_clf_scaled = SVC(kernel='linear')
svm_clf_scaled.fit(X_train_scaled, y_train)
y_pred_scaled = svm_clf_scaled.predict(X_test_scaled)
accuracy_scaled = accuracy_score(y_test, y_pred_scaled)

# Print the accuracies
print(f"Accuracy without scaling: {accuracy_unscaled:.4f}")
print(f"Accuracy with scaling: {accuracy_scaled:.4f}")

     
Accuracy without scaling: 1.0000
Accuracy with scaling: 0.9778

# 30 Write a Python program to train a Gaussian Naïve Bayes model and compare the predictions before and
# after Laplace Smoothing
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Gaussian Naïve Bayes with the default var_smoothing (1e-9).
# Note: var_smoothing is not Laplace smoothing, although the exercise calls it that.
# It adds a fraction of the largest feature variance to every variance for numerical stability.
gnb_no_smoothing = GaussianNB(var_smoothing=1e-9)  # scikit-learn's default value
gnb_no_smoothing.fit(X_train, y_train)
y_pred_no_smoothing = gnb_no_smoothing.predict(X_test)

# Train Gaussian Naïve Bayes with a much larger var_smoothing (1.0 is not the default)
gnb_with_smoothing = GaussianNB(var_smoothing=1.0)  # heavy variance smoothing
gnb_with_smoothing.fit(X_train, y_train)
y_pred_with_smoothing = gnb_with_smoothing.predict(X_test)

# Evaluate accuracy
accuracy_no_smoothing = accuracy_score(y_test, y_pred_no_smoothing)
accuracy_with_smoothing = accuracy_score(y_test, y_pred_with_smoothing)

# Print the results
print(f"Accuracy without Laplace smoothing: {accuracy_no_smoothing:.4f}")
print(f"Accuracy with Laplace smoothing: {accuracy_with_smoothing:.4f}")

     
Accuracy without Laplace smoothing: 0.9778
Accuracy with Laplace smoothing: 0.9778

# 31 Write a Python program to train an SVM Classifier and use GridSearchCV to tune the hyperparameters (C,
# gamma, kernel)
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the SVM model
svm = SVC()

# Define the parameter grid for C, gamma, and kernel
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto', 0.1, 1],
    'kernel': ['linear', 'rbf']
}

# Set up GridSearchCV with 5-fold cross-validation
grid_search = GridSearchCV(svm, param_grid, cv=5, verbose=1)

# Fit the grid search model
grid_search.fit(X_train, y_train)

# Print the best parameters and best score
print("Best parameters found: ", grid_search.best_params_)
print("Best cross-validation score: ", grid_search.best_score_)

# Evaluate the model on the test set
y_pred = grid_search.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.4f}")





     
Fitting 5 folds for each of 24 candidates, totalling 120 fits
Best parameters found:  {'C': 1, 'gamma': 'scale', 'kernel': 'linear'}
Best cross-validation score:  0.9619047619047618
Test accuracy: 1.0000

# 32 Write a Python program to train an SVM Classifier on an imbalanced dataset, apply class weighting, and
# check whether it improves accuracy
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate an imbalanced dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2, n_classes=2,
                           weights=[0.9, 0.1], flip_y=0, random_state=42)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM classifier without class weighting
svm_no_weight = SVC(kernel='linear')
svm_no_weight.fit(X_train, y_train)

# Make predictions
y_pred_no_weight = svm_no_weight.predict(X_test)

# Train an SVM classifier with class weighting
svm_with_weight = SVC(kernel='linear', class_weight='balanced')
svm_with_weight.fit(X_train, y_train)

# Make predictions
y_pred_with_weight = svm_with_weight.predict(X_test)

# Evaluate accuracy
accuracy_no_weight = accuracy_score(y_test, y_pred_no_weight)
accuracy_with_weight = accuracy_score(y_test, y_pred_with_weight)

print(f"Accuracy without class weighting: {accuracy_no_weight:.4f}")
print(f"Accuracy with class weighting: {accuracy_with_weight:.4f}")

     
Accuracy without class weighting: 0.9400
Accuracy with class weighting: 0.8600

# 33 Write a Python program to implement a Naïve Bayes classifier for spam detection using email data
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample email dataset (spam and ham)
emails = [
    "Free money now!!!", "Hi, I wanted to check on the report", "Congratulations, you won a prize",
    "Hello, how are you?", "Get a free iPhone by clicking here", "Important meeting tomorrow"
]
labels = [1, 0, 1, 0, 1, 0]  # 1 for spam, 0 for ham (non-spam)

# Convert emails to word counts (Bag of Words model)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=42)

# Train a Naïve Bayes classifier
nb_clf = MultinomialNB()
nb_clf.fit(X_train, y_train)

# Make predictions
y_pred = nb_clf.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Spam detection accuracy: {accuracy:.4f}")

     
Spam detection accuracy: 1.0000

# 34 Write a Python program to train an SVM Classifier and a Naïve Bayes Classifier on the same dataset and
# compare their accuracy
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train an SVM classifier
svm_clf = SVC(kernel='linear')
svm_clf.fit(X_train, y_train)
y_pred_svm = svm_clf.predict(X_test)

# Train a Naïve Bayes classifier
nb_clf = GaussianNB()
nb_clf.fit(X_train, y_train)
y_pred_nb = nb_clf.predict(X_test)

# Evaluate accuracy
accuracy_svm = accuracy_score(y_test, y_pred_svm)
accuracy_nb = accuracy_score(y_test, y_pred_nb)

print(f"SVM Classifier accuracy: {accuracy_svm:.4f}")
print(f"Naïve Bayes Classifier accuracy: {accuracy_nb:.4f}")

     
SVM Classifier accuracy: 1.0000
Naïve Bayes Classifier accuracy: 0.9778

# 35 Write a Python program to perform feature selection before training a Naïve Bayes classifier and compare
# results
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Perform feature selection using SelectKBest (select top 2 features)
selector = SelectKBest(f_classif, k=2)
X_train_selected = selector.fit_transform(X_train, y_train)
X_test_selected = selector.transform(X_test)

# Train a Naïve Bayes classifier on selected features
nb_clf = GaussianNB()
nb_clf.fit(X_train_selected, y_train)

# Make predictions
y_pred = nb_clf.predict(X_test_selected)

# Evaluate accuracy
accuracy_selected = accuracy_score(y_test, y_pred)

# Train Naïve Bayes without feature selection
nb_clf_no_selection = GaussianNB()
nb_clf_no_selection.fit(X_train, y_train)
y_pred_no_selection = nb_clf_no_selection.predict(X_test)

# Evaluate accuracy without feature selection
accuracy_no_selection = accuracy_score(y_test, y_pred_no_selection)

print(f"Accuracy with feature selection: {accuracy_selected:.4f}")
print(f"Accuracy without feature selection: {accuracy_no_selection:.4f}")

     
Accuracy with feature selection: 1.0000
Accuracy without feature selection: 0.9778

# 36 Write a Python program to train an SVM Classifier using One-vs-Rest (OvR) and One-vs-One (OvO)
# strategies on the Wine dataset and compare their accuracy
from sklearn.svm import SVC
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier

# Load the Wine dataset
data = load_wine()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# One-vs-Rest strategy
ovr_clf = OneVsRestClassifier(SVC(kernel='linear'))
ovr_clf.fit(X_train, y_train)
y_pred_ovr = ovr_clf.predict(X_test)
accuracy_ovr = accuracy_score(y_test, y_pred_ovr)

# One-vs-One strategy
ovo_clf = OneVsOneClassifier(SVC(kernel='linear'))
ovo_clf.fit(X_train, y_train)
y_pred_ovo = ovo_clf.predict(X_test)
accuracy_ovo = accuracy_score(y_test, y_pred_ovo)

# Print accuracies
print(f"Accuracy with One-vs-Rest: {accuracy_ovr:.4f}")
print(f"Accuracy with One-vs-One: {accuracy_ovo:.4f}")


     
Accuracy with One-vs-Rest: 0.9815
Accuracy with One-vs-One: 0.9815

# 37 Write a Python program to train an SVM Classifier using Linear, Polynomial, and RBF kernels on the Breast
# Cancer dataset and compare their accuracy
from sklearn.svm import SVC
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Breast Cancer dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Linear Kernel SVM
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)
accuracy_linear = accuracy_score(y_test, y_pred_linear)

# Polynomial Kernel SVM
svm_poly = SVC(kernel='poly', degree=3)
svm_poly.fit(X_train, y_train)
y_pred_poly = svm_poly.predict(X_test)
accuracy_poly = accuracy_score(y_test, y_pred_poly)

# RBF Kernel SVM
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
accuracy_rbf = accuracy_score(y_test, y_pred_rbf)

# Print accuracies
print(f"Accuracy with Linear Kernel: {accuracy_linear:.4f}")
print(f"Accuracy with Polynomial Kernel: {accuracy_poly:.4f}")
print(f"Accuracy with RBF Kernel: {accuracy_rbf:.4f}")

     
Accuracy with Linear Kernel: 0.9649
Accuracy with Polynomial Kernel: 0.9415
Accuracy with RBF Kernel: 0.9357

# 38 Write a Python program to train an SVM Classifier using Stratified K-Fold Cross-Validation and compute the
# average accuracy
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.datasets import load_breast_cancer

# Load the Breast Cancer dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Initialize the SVM classifier
svm = SVC(kernel='linear')

# Use Stratified K-Fold Cross-Validation
skf = StratifiedKFold(n_splits=5)
cross_val_scores = cross_val_score(svm, X, y, cv=skf)

# Print the average accuracy
print(f"Average accuracy with Stratified K-Fold Cross-Validation: {cross_val_scores.mean():.4f}")

     
Average accuracy with Stratified K-Fold Cross-Validation: 0.9455

# 39 Write a Python program to train a Naïve Bayes classifier using different prior probabilities and compare
# performance
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
data = load_iris()
X = data.data
y = data.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Naïve Bayes classifier with default priors (estimated from class frequencies in the training data)
nb_clf_default = GaussianNB()
nb_clf_default.fit(X_train, y_train)
y_pred_default = nb_clf_default.predict(X_test)
accuracy_default = accuracy_score(y_test, y_pred_default)

# Train Naïve Bayes classifier with custom priors
nb_clf_custom = GaussianNB(priors=[0.4, 0.3, 0.3])
nb_clf_custom.fit(X_train, y_train)
y_pred_custom = nb_clf_custom.predict(X_test)
accuracy_custom = accuracy_score(y_test, y_pred_custom)

# Print accuracies
print(f"Accuracy with default priors: {accuracy_default:.4f}")
print(f"Accuracy with custom priors: {accuracy_custom:.4f}")

     
Accuracy with default priors: 0.9778
Accuracy with custom priors: 0.9778