1. What is a Support Vector Machine (SVM)?

SVM is a supervised learning algorithm that finds the hyperplane separating the classes with the largest possible margin.


2. Difference between Hard Margin and Soft Margin SVM?


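Hard Margin: No misclassifications allowed (strict separation); only feasible when the classes are linearly separable.

Soft Margin: Allows some points to violate the margin or be misclassified, which improves generalization on noisy or overlapping data.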
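
The two variants can be written side by side. This is the usual textbook formulation, and the notation is an assumption since the notes do not fix symbols: weight vector $w$, bias $b$, labels $y_i \in \{-1, +1\}$, slack variables $\xi_i$, penalty $C$.

$$\text{Hard margin:}\quad \min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^2 \quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1 \ \ \text{for all } i$$

$$\text{Soft margin:}\quad \min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^2 + C \sum_i \xi_i \quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0$$

Each $\xi_i$ measures how far point $i$ falls on the wrong side of its margin; setting every $\xi_i = 0$ recovers the hard-margin problem.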


3. Mathematical intuition behind SVM?

SVM aims to maximize the margin between classes by solving a convex optimization problem.
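
A short sketch of why maximizing the margin becomes a convex problem, assuming the standard notation (hyperplane $w^\top x + b = 0$, labels $y_i \in \{-1, +1\}$). The distance from a point $x$ to the hyperplane is

$$d(x) = \frac{\lvert w^\top x + b \rvert}{\lVert w \rVert}$$

Rescaling $w$ and $b$ so that the closest points of each class satisfy $\lvert w^\top x + b \rvert = 1$ gives a total margin of

$$\text{margin} = \frac{2}{\lVert w \rVert}$$

so maximizing the margin is equivalent to minimizing $\tfrac{1}{2}\lVert w \rVert^2$, a convex quadratic objective, subject to the linear constraints $y_i\,(w^\top x_i + b) \ge 1$.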


4. Role of Lagrange Multipliers in SVM?

They convert the constrained optimization problem into a solvable dual form.
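
As a sketch of that conversion for the hard-margin case (standard textbook derivation; the symbols $\alpha_i$ for the multipliers and $y_i \in \{-1, +1\}$ for the labels are assumptions of this note), the Lagrangian is

$$L(w, b, \alpha) = \tfrac{1}{2}\lVert w \rVert^2 - \sum_i \alpha_i \left[ y_i\,(w^\top x_i + b) - 1 \right], \qquad \alpha_i \ge 0$$

Setting the derivatives with respect to $w$ and $b$ to zero gives $w = \sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i = 0$. Substituting back yields the dual problem:

$$\max_{\alpha}\ \sum_i \alpha_i - \tfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_i \alpha_i y_i = 0$$

The data enter only through the inner products $x_i^\top x_j$, which is what allows a kernel to be substituted, and only points with $\alpha_i > 0$ contribute to $w$. In the soft-margin case the constraint becomes $0 \le \alpha_i \le C$.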


5. What are Support Vectors in SVM?

These are data points closest to the hyperplane; they define the margin.
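
A minimal sketch of how to inspect the support vectors of a fitted scikit-learn model. The toy data below is hypothetical, and the very large C is only there to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated clusters in 2D
X = np.array([[1, 1], [2, 1], [1, 2],
              [4, 4], [5, 4], [4, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard margin on separable data
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# Only the points nearest the boundary are kept as support vectors
print("Support vector indices:", clf.support_)
print("Support vectors:\n", clf.support_vectors_)
print("Support vectors per class:", clf.n_support_)
```

Points far from the boundary, such as (1, 1), do not appear in the list; removing them and refitting would leave the hyperplane unchanged.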


6. What is a Support Vector Classifier (SVC)?

SVC is SVM used for classification problems.



7. What is a Support Vector Regressor (SVR)?

SVR is an SVM variant used for regression tasks, fitting data within a tolerance margin.
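
The "tolerance margin" corresponds to the epsilon-insensitive loss: errors smaller than epsilon cost nothing. The helper below is a hypothetical illustration of that loss, not scikit-learn code; the value 0.1 matches the default epsilon of scikit-learn's SVR.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Return max(0, |y_true - y_pred| - epsilon) for one prediction."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Errors inside the epsilon tube cost nothing; outside it the cost grows linearly
for y_pred in (2.0, 2.05, 2.5):
    loss = epsilon_insensitive_loss(2.0, y_pred, epsilon=0.1)
    print(f"prediction={y_pred}, loss={loss:.2f}")
```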


8. What is the Kernel Trick in SVM?

A technique for handling non-linearly separable data: a kernel function computes inner products in a higher-dimensional feature space without ever mapping the data there explicitly.
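
A small sketch of the idea for one specific kernel, k(x, z) = (x . z)^2 on 2D inputs. The explicit feature map phi is written out only to check the equality; the function names are hypothetical.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit 3D feature map for the kernel k(x, z) = (x . z)^2 in 2D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Same inner product, computed without leaving the 2D input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # inner product in the 3D feature space
implicit = poly2_kernel(x, z)    # kernel evaluated on the raw inputs
print(explicit, implicit)        # the two values agree up to rounding
```

For kernels such as RBF the corresponding feature space is infinite-dimensional, so the explicit route is not available and only the kernel evaluation is practical.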


9. Compare Linear, Polynomial, and RBF Kernels.

Linear: Straight-line separation

Polynomial: Captures curved relationships

RBF: Handles complex, non-linear patterns well
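
The three kernels can be sketched as plain functions. The parameter names (degree, gamma, coef0) follow scikit-learn's SVC, but the default values chosen here are illustrative assumptions, not the library's defaults.

```python
import math

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (1.0, 2.0), (2.0, 3.0)
print("linear:    ", linear_kernel(x, z))
print("polynomial:", polynomial_kernel(x, z))
print("rbf:       ", round(rbf_kernel(x, z), 4))
```

The linear and polynomial kernels grow with the inner product, while the RBF kernel depends only on the distance between the two points and always lies in (0, 1].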


10. Effect of the C parameter in SVM?

C controls the trade-off between margin width and classification error: small C = wider margin with more tolerance for errors; large C = narrower margin with less tolerance for errors (higher risk of overfitting).



11. Role of Gamma in RBF Kernel SVM?

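Gamma defines how far the influence of a single training example reaches. High gamma = influence limited to nearby points (more flexible boundary, risk of overfitting); low gamma = far-reaching influence (smoother boundary).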
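
A quick numeric sketch of that reach, using the RBF formula exp(-gamma * distance^2). The gamma values and distances are arbitrary choices for illustration.

```python
import math

def rbf_similarity(distance, gamma):
    """RBF kernel value for two points separated by `distance`."""
    return math.exp(-gamma * distance ** 2)

# For each gamma, show how similarity decays as the points move apart
for gamma in (0.1, 1.0, 10.0):
    row = [round(rbf_similarity(d, gamma), 4) for d in (0.5, 1.0, 2.0)]
    print(f"gamma={gamma}: similarity at d=0.5, 1.0, 2.0 -> {row}")
```

With a high gamma the similarity is already close to zero at distance 1, so each training point only affects predictions in its immediate neighborhood.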




12. What is the Naïve Bayes classifier, and why is it called "Naïve"?

It applies Bayes’ Theorem under the assumption that all features are conditionally independent given the class (hence “naïve”), which rarely holds exactly in real data.
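
The independence assumption means the class score is just the prior multiplied by one likelihood per feature. A minimal sketch with hand-picked, hypothetical probabilities (they are not estimated from any dataset):

```python
# Hypothetical, hand-picked probabilities for illustration only
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.30, "meeting": 0.02},
    "ham":  {"free": 0.03, "meeting": 0.20},
}

def posterior(words):
    """Return P(class | words) under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for word in words:
            score *= likelihoods[cls][word]  # multiply per-feature likelihoods
        scores[cls] = score
    total = sum(scores.values())             # normalizing constant P(words)
    return {cls: score / total for cls, score in scores.items()}

print(posterior(["free"]))
print(posterior(["free", "meeting"]))
```

Seeing "free" alone favors spam, while adding "meeting" shifts the posterior toward ham, because each word contributes its own factor independently of the others.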


13. What is Bayes’ Theorem?

$$P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}$$
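
A worked example of the formula with made-up numbers: A is "has a condition" and B is "test is positive". All three input probabilities are hypothetical.

```python
p_a = 0.01              # P(A): prior probability of the condition
p_b_given_a = 0.95      # P(B|A): test is positive given the condition
p_b_given_not_a = 0.05  # P(B|not A): false-positive rate

# Law of total probability gives the evidence term P(B)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(f"P(A|B) = {p_a_given_b:.3f}")
```

Even with a fairly accurate test, the posterior stays low because the prior P(A) is small.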



14. Gaussian, Multinomial, and Bernoulli Naïve Bayes:

Gaussian: For continuous, normally distributed features

Multinomial: For count-based text data

Bernoulli: For binary/boolean features




15. When should you use Gaussian Naïve Bayes?

When features are continuous and follow a normal distribution.



16. Key assumptions made by Naïve Bayes:

Features are conditionally independent

All features contribute equally



17. Advantages and disadvantages of Naïve Bayes:

+ Fast, simple, good for high-dimensional data
− Assumes independence, often less accurate than more complex models



18. Why is Naïve Bayes good for text classification?

It's efficient with large vocabularies and works well with sparse data.



19. Compare SVM and Naïve Bayes for classification:

SVM: Often more accurate, handles complex decision boundaries better, but slower to train and sensitive to feature scaling

NB: Simpler, faster, good for text and noisy data




20. How does Laplace Smoothing help in Naïve Bayes?

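It avoids zero probabilities by adding 1 to all feature counts, so a feature value never seen with a class during training does not zero out the whole product of likelihoods.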
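
A minimal sketch of additive smoothing on word counts. The helper name and the counts are hypothetical; the alpha argument plays the same role as the alpha parameter of scikit-learn's MultinomialNB.

```python
def smoothed_probs(counts, alpha=1.0):
    """Estimate P(word | class) from raw counts with additive smoothing."""
    vocab_size = len(counts)
    total = sum(counts.values())
    return {word: (count + alpha) / (total + alpha * vocab_size)
            for word, count in counts.items()}

# Hypothetical word counts for one class; "lottery" was never observed
counts = {"free": 3, "money": 2, "lottery": 0}

print(smoothed_probs(counts, alpha=0.0))  # unsmoothed: "lottery" gets probability 0
print(smoothed_probs(counts, alpha=1.0))  # Laplace: every word gets a non-zero share
```

The smoothed estimates still sum to 1 because alpha is added once per vocabulary word in the denominator.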

In [None]:
#21: SVM Classifier on Iris Dataset

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVM
svm = SVC(kernel='linear')
svm.fit(X_train, y_train)

# Predict and evaluate
y_pred = svm.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

In [None]:
#22: SVM with Linear and RBF Kernels on Wine Dataset


from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
wine = datasets.load_wine()
X = wine.data
y = wine.target

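# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)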

# Train linear SVM
linear_svm = SVC(kernel='linear')
linear_svm.fit(X_train, y_train)
linear_pred = linear_svm.predict(X_test)
linear_acc = accuracy_score(y_test, linear_pred)

# Train RBF SVM
rbf_svm = SVC(kernel='rbf')
rbf_svm.fit(X_train, y_train)
rbf_pred = rbf_svm.predict(X_test)
rbf_acc = accuracy_score(y_test, rbf_pred)

print(f"Linear Kernel Accuracy: {linear_acc:.2f}")
print(f"RBF Kernel Accuracy: {rbf_acc:.2f}")

In [None]:
#23: SVM Regressor on Housing Dataset


from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler

# Load dataset
housing = fetch_california_housing()
X = housing.data
y = housing.target

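# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)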

# Train SVR
svr = SVR(kernel='rbf')
svr.fit(X_train, y_train)

# Predict and evaluate
y_pred = svr.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")

In [None]:
#24: SVM with Polynomial Kernel and Decision Boundary Visualization

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.inspection import DecisionBoundaryDisplay

# Load dataset (using only 2 features for visualization)
iris = datasets.load_iris()
X = iris.data[:, :2]  # Only first two features
y = iris.target

# Train SVM with polynomial kernel
svm = SVC(kernel='poly', degree=3)
svm.fit(X, y)

# Create decision boundary plot
DecisionBoundaryDisplay.from_estimator(
    svm,
    X,
    cmap=plt.cm.Paired,
    response_method="predict"
)

# Plot training points
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.title('SVM with Polynomial Kernel Decision Boundary')
plt.show()

In [None]:
#25: Gaussian Naive Bayes on Breast Cancer Dataset

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Gaussian Naive Bayes
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Predict and evaluate
y_pred = gnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

In [None]:
#26: Multinomial Naive Bayes for Text Classification

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

# Load dataset
categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics', 'sci.med']
train = fetch_20newsgroups(subset='train', categories=categories)
test = fetch_20newsgroups(subset='test', categories=categories)

# Create pipeline with TF-IDF and Multinomial NB
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train.data, train.target)

# Predict and evaluate
predicted = model.predict(test.data)
accuracy = accuracy_score(test.target, predicted)
print(f"Accuracy: {accuracy:.2f}")

In [None]:
#27: SVM with Different C Values and Decision Boundaries


import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.inspection import DecisionBoundaryDisplay

# Load dataset (using only 2 features for visualization)
iris = datasets.load_iris()
X = iris.data[:, :2]  # Only first two features
y = iris.target

# Different C values to try
C_values = [0.1, 1, 10, 100]

fig, axes = plt.subplots(2, 2, figsize=(15, 10))
for ax, C in zip(axes.ravel(), C_values):
    # Train SVM
    svm = SVC(kernel='linear', C=C)
    svm.fit(X, y)

    # Draw the decision regions on this subplot's axes; without ax=,
    # from_estimator would open a new figure for every model
    DecisionBoundaryDisplay.from_estimator(
        svm,
        X,
        cmap=plt.cm.Paired,
        response_method="predict",
        ax=ax
    )
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
    ax.set_title(f'SVM Decision Boundary (C={C})')

plt.tight_layout()
plt.show()

In [None]:
#28: Bernoulli Naive Bayes for Binary Classification

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import accuracy_score

# Create binary dataset with binary features
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                          n_redundant=2, n_classes=2, random_state=42)
X = (X > 0).astype(int)  # Convert to binary features

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Bernoulli Naive Bayes
bnb = BernoulliNB()
bnb.fit(X_train, y_train)

# Predict and evaluate
y_pred = bnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

In [None]:
#29: Feature Scaling for SVM

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Load dataset
wine = datasets.load_wine()
X = wine.data
y = wine.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Without scaling
svm_unscaled = SVC(kernel='rbf')
svm_unscaled.fit(X_train, y_train)
y_pred_unscaled = svm_unscaled.predict(X_test)
acc_unscaled = accuracy_score(y_test, y_pred_unscaled)

# With scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

svm_scaled = SVC(kernel='rbf')
svm_scaled.fit(X_train_scaled, y_train)
y_pred_scaled = svm_scaled.predict(X_test_scaled)
acc_scaled = accuracy_score(y_test, y_pred_scaled)

print(f"Accuracy without scaling: {acc_unscaled:.2f}")
print(f"Accuracy with scaling: {acc_scaled:.2f}")

In [None]:
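#30: Gaussian Naive Bayes with Variance Smoothing
# GaussianNB has no Laplace (additive) smoothing. Its var_smoothing parameter adds a
# fraction of the largest feature variance to every variance for numerical stability.
# Laplace smoothing is the alpha parameter of MultinomialNB and BernoulliNB.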

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Without variance smoothing (var_smoothing=0)
gnb_no_smooth = GaussianNB(var_smoothing=0)
gnb_no_smooth.fit(X_train, y_train)
y_pred_no_smooth = gnb_no_smooth.predict(X_test)
acc_no_smooth = accuracy_score(y_test, y_pred_no_smooth)

# With variance smoothing (default var_smoothing=1e-9)
gnb_smooth = GaussianNB()
gnb_smooth.fit(X_train, y_train)
y_pred_smooth = gnb_smooth.predict(X_test)
acc_smooth = accuracy_score(y_test, y_pred_smooth)

print(f"Accuracy without smoothing: {acc_no_smooth:.2f}")
print(f"Accuracy with smoothing: {acc_smooth:.2f}")

In [None]:
#31: SVM with GridSearchCV

from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define parameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'kernel': ['rbf', 'linear', 'poly']
}

# Grid search
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=2)
grid.fit(X_train, y_train)

# Best parameters
print(f"Best parameters: {grid.best_params_}")

# Predict with best model
y_pred = grid.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

In [None]:
#32: SVM on Imbalanced Dataset with Class Weighting

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Create imbalanced dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                          n_redundant=2, weights=[0.9, 0.1], random_state=42)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Without class weighting
svm_noweight = SVC(kernel='linear')
svm_noweight.fit(X_train, y_train)
y_pred_noweight = svm_noweight.predict(X_test)
acc_noweight = accuracy_score(y_test, y_pred_noweight)
print("Without class weighting:")
print(f"Accuracy: {acc_noweight:.2f}")
print(classification_report(y_test, y_pred_noweight))

# With class weighting
svm_weighted = SVC(kernel='linear', class_weight='balanced')
svm_weighted.fit(X_train, y_train)
y_pred_weighted = svm_weighted.predict(X_test)
acc_weighted = accuracy_score(y_test, y_pred_weighted)
print("\nWith class weighting:")
print(f"Accuracy: {acc_weighted:.2f}")
print(classification_report(y_test, y_pred_weighted))

In [None]:
#33: Naive Bayes for Spam Detection

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

# Example email dataset (you would replace this with your actual dataset)
data = {
    'text': [
        'win money now', 'free lottery', 'meeting tomorrow',
        'project update', 'buy cheap viagra', 'urgent help needed',
        'congratulations you won', 'account verification required',
        'team lunch next week', 'security alert'
    ],
    'label': [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]  # 1=spam, 0=ham
}
df = pd.DataFrame(data)

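# Split data first, so the vocabulary is learned from the training text only
X_train_text, X_test_text, y_train, y_test = train_test_split(
    df['text'], df['label'], test_size=0.3, random_state=42)

# Vectorize text
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)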

# Train Naive Bayes
nb = MultinomialNB()
nb.fit(X_train, y_train)

# Predict and evaluate
y_pred = nb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

In [None]:
#34: Compare SVM and Naive Bayes on Same Dataset

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (important for SVM); fit the scaler on the training set only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train SVM
svm = SVC(kernel='rbf')
svm.fit(X_train, y_train)
y_pred_svm = svm.predict(X_test)
acc_svm = accuracy_score(y_test, y_pred_svm)

# Train Naive Bayes
nb = GaussianNB()
nb.fit(X_train, y_train)
y_pred_nb = nb.predict(X_test)
acc_nb = accuracy_score(y_test, y_pred_nb)

print(f"SVM Accuracy: {acc_svm:.2f}")
print(f"Naive Bayes Accuracy: {acc_nb:.2f}")

In [None]:
#35: Feature Selection for Naive Bayes


from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Without feature selection
nb_all = GaussianNB()
nb_all.fit(X_train, y_train)
y_pred_all = nb_all.predict(X_test)
acc_all = accuracy_score(y_test, y_pred_all)

# With feature selection (select top 10 features)
selector = SelectKBest(f_classif, k=10)
X_train_selected = selector.fit_transform(X_train, y_train)
X_test_selected = selector.transform(X_test)

nb_selected = GaussianNB()
nb_selected.fit(X_train_selected, y_train)
y_pred_selected = nb_selected.predict(X_test_selected)
acc_selected = accuracy_score(y_test, y_pred_selected)

print(f"Accuracy with all features: {acc_all:.2f}")
print(f"Accuracy with selected features: {acc_selected:.2f}")

In [None]:
#36: SVM with OvR and OvO on Wine Dataset

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
wine = datasets.load_wine()
X = wine.data
y = wine.target

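# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)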

# One-vs-Rest
ovr = OneVsRestClassifier(SVC(kernel='linear'))
ovr.fit(X_train, y_train)
y_pred_ovr = ovr.predict(X_test)
acc_ovr = accuracy_score(y_test, y_pred_ovr)

# One-vs-One
ovo = OneVsOneClassifier(SVC(kernel='linear'))
ovo.fit(X_train, y_train)
y_pred_ovo = ovo.predict(X_test)
acc_ovo = accuracy_score(y_test, y_pred_ovo)

print(f"OvR Accuracy: {acc_ovr:.2f}")
print(f"OvO Accuracy: {acc_ovo:.2f}")

In [None]:
#37: SVM with Different Kernels on Breast Cancer Dataset

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

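# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)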

# Different kernels to try
kernels = ['linear', 'poly', 'rbf']

for kernel in kernels:
    svm = SVC(kernel=kernel)
    svm.fit(X_train, y_train)
    y_pred = svm.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Kernel: {kernel}, Accuracy: {accuracy:.2f}")

In [None]:
#38: SVM with Stratified K-Fold Cross-Validation


from sklearn import datasets
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Scale inside a pipeline so the scaler is re-fit on each training fold only
svm = make_pipeline(StandardScaler(), SVC(kernel='linear'))

# Stratified K-Fold CV
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(svm, X, y, cv=skf)

# Print results
print("Cross-validation scores:", scores)
print(f"Average accuracy: {scores.mean():.2f} (+/- {scores.std() * 2:.2f})")

In [None]:
#39: Naive Bayes with Different Priors

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Different priors to try
priors = [
    None,  # Let the model estimate
    [0.5, 0.5],  # Equal priors
    [0.3, 0.7],  # Skewed priors
    [0.7, 0.3]   # Opposite skew
]

for prior in priors:
    nb = GaussianNB(priors=prior)
    nb.fit(X_train, y_train)
    y_pred = nb.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Priors: {prior}, Accuracy: {accuracy:.2f}")

In [None]:
#40: Recursive Feature Elimination for SVM

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

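# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)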

# Without RFE
svm = SVC(kernel='linear')
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"Accuracy without RFE: {acc:.2f}")

# With RFE (select half the features)
rfe = RFE(estimator=SVC(kernel='linear'), n_features_to_select=X.shape[1]//2)
X_train_rfe = rfe.fit_transform(X_train, y_train)
X_test_rfe = rfe.transform(X_test)

svm_rfe = SVC(kernel='linear')
svm_rfe.fit(X_train_rfe, y_train)
y_pred_rfe = svm_rfe.predict(X_test_rfe)
acc_rfe = accuracy_score(y_test, y_pred_rfe)
print(f"Accuracy with RFE: {acc_rfe:.2f}")

In [None]:
#41: SVM Evaluation with Precision, Recall, F1-Score

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import precision_score, recall_score, f1_score

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVM
svm = SVC(kernel='linear')
svm.fit(X_train, y_train)

# Predict
y_pred = svm.predict(X_test)

# Evaluate
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1-Score: {f1:.2f}")

In [None]:
#42: Naive Bayes Evaluation with Log Loss

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import log_loss

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Naive Bayes
nb = GaussianNB()
nb.fit(X_train, y_train)

# Get predicted probabilities
y_proba = nb.predict_proba(X_test)

# Calculate log loss
loss = log_loss(y_test, y_proba)
print(f"Log Loss: {loss:.2f}")

In [None]:
#43: SVM Confusion Matrix Visualization

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVM
svm = SVC(kernel='linear')
svm.fit(X_train, y_train)

# Predict
y_pred = svm.predict(X_test)

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Plot
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=iris.target_names,
            yticklabels=iris.target_names)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix for SVM Classifier')
plt.show()

In [None]:
#44: SVR Evaluation with MAE

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import StandardScaler

# Load dataset
housing = fetch_california_housing()
X = housing.data
y = housing.target

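# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale data (fit the scaler on the training set only to avoid leaking test statistics)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)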

# Train SVR
svr = SVR(kernel='rbf')
svr.fit(X_train, y_train)

# Predict and evaluate
y_pred = svr.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error: {mae:.2f}")

In [None]:
#45: Naive Bayes ROC-AUC Evaluation

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score

# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train Naive Bayes
nb = GaussianNB()
nb.fit(X_train, y_train)

# Get predicted probabilities
y_proba = nb.predict_proba(X_test)[:, 1]  # Probability of positive class

# Calculate ROC-AUC
roc_auc = roc_auc_score(y_test, y_proba)
print(f"ROC-AUC Score: {roc_auc:.2f}")

In [None]:
#46: SVM Precision-Recall Curve Visualization

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

# Load dataset (binary for precision-recall curve)
data = datasets.load_breast_cancer()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVM
svm = SVC(kernel='linear', probability=True)
svm.fit(X_train, y_train)

# Get predicted probabilities
y_proba = svm.predict_proba(X_test)[:, 1]

# Calculate precision-recall curve
precision, recall, thresholds = precision_recall_curve(y_test, y_proba)

# Plot
plt.figure(figsize=(8, 6))
plt.plot(recall, precision, marker='.')
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall Curve for SVM Classifier')
plt.grid()
plt.show()