# **Logistic Regression and SVM**



1. Implement logistic regression for binary classification on the CIFAR-10 dataset, using only two of its 10 classes: birds and airplanes. CIFAR-10 comprises 60,000 color images (32x32 pixels) across 10 distinct classes. (https://www.kaggle.com/c/cifar-10/)
2. Implement an SVM with PCA for dimensionality reduction on the same binary classification task. (https://www.analyticsvidhya.com/blog/2021/07/svm-and-pca-tutorial-for-beginners/)

# **Library import and Dataset load**

In [4]:
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report

from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

ModuleNotFoundError: No module named 'tensorflow'

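In [None]:
# Install TensorFlow if the import cell fails with ModuleNotFoundError, then re-run the import cell
%pip install tensorflow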

In [None]:
# load dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [None]:
print("Dataset Information:")
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")
print(f"Test data shape: {x_test.shape}")
print(f"Test labels shape: {y_test.shape}")

# **Logistic Regression**

**Logistic Regression:** Implement logistic regression for binary classification.

**Classes:** Bird (class 2) and airplane (class 0).

In [None]:
# Filter for classes 'airplane' (0) and 'bird' (2)
def filter_classes(x, y, classes=(0, 2)):
    mask = np.isin(y, classes).flatten()
    x_filtered = x[mask]
    y_filtered = y[mask]
    # Convert labels to binary: 0 for plane, 1 for bird
    y_filtered = (y_filtered == 2).astype(int).flatten()
    return x_filtered, y_filtered
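To see what the filtering step does without downloading CIFAR-10, here is a minimal sketch on a toy array. The helper name `filter_two_classes`, its `negative`/`positive` parameters, and the toy data are illustrative assumptions, not part of the notebook; only the masking and relabeling logic mirrors `filter_classes`.

```python
import numpy as np

def filter_two_classes(x, y, negative=0, positive=2):
    """Keep only two classes and relabel them as 0 (negative) and 1 (positive)."""
    y = np.asarray(y).ravel()                 # CIFAR-10 labels arrive with shape (n, 1)
    mask = np.isin(y, (negative, positive))   # True where the label is one of the two classes
    return x[mask], (y[mask] == positive).astype(int)

# Toy stand-in for CIFAR-10 (assumption): six fake 32x32x3 images, labels shaped (6, 1)
rng = np.random.default_rng(0)
toy_x = rng.integers(0, 256, size=(6, 32, 32, 3), dtype=np.uint8)
toy_y = np.array([[0], [2], [5], [2], [9], [0]])

sub_x, sub_y = filter_two_classes(toy_x, toy_y)
print(sub_x.shape)  # (4, 32, 32, 3)
print(sub_y)        # [0 1 1 0]
```

The two images labeled 5 and 9 are dropped, and the remaining labels become 0 for airplane and 1 for bird.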

In [None]:
# filter train and test data
x_train, y_train = filter_classes(x_train, y_train)
x_test, y_test = filter_classes(x_test, y_test)

In [None]:
# Flatten the images (32x32x3 -> 3072)
x_train_flat = x_train.reshape(x_train.shape[0], -1)
x_test_flat = x_test.reshape(x_test.shape[0], -1)

In [None]:
# Standardize and transform the data
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train_flat)
x_test_scaled = scaler.transform(x_test_flat)

In [None]:
# Split the training data into training and validation subsets
x_train_split, x_val_split, y_train_split, y_val_split = train_test_split(x_train_scaled, y_train, test_size=0.2, random_state=42)
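One thing to note about the order of the cells above: the scaler is fitted on all training images before the validation split is made, so the validation subset contributes to the mean and variance used for scaling. The effect is probably small here, but a stricter variant splits first and fits the scaler on the training subset only. The sketch below shows that order on synthetic data; the array sizes and the `stratify=y` option are assumptions for illustration, not what the notebook does.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for flattened images (assumption): 200 samples, 12 features
rng = np.random.default_rng(42)
X = rng.normal(loc=120.0, scale=60.0, size=(200, 12))
y = rng.integers(0, 2, size=200)

# Split first, keeping the class ratio the same in both subsets
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training subset only, then reuse it for validation
scaler = StandardScaler().fit(X_tr)
X_tr_s = scaler.transform(X_tr)
X_val_s = scaler.transform(X_val)

print(X_tr_s.shape, X_val_s.shape)  # (160, 12) (40, 12)
```

With this order the training subset has exactly zero mean per feature after scaling, while the validation subset is only approximately centered, which is the expected behavior for held-out data.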

In [None]:
# Initialize and Train the Logistic Regression model
log_reg = LogisticRegression(max_iter=1000, random_state=42, verbose=1)  # Initialize the model
log_reg.fit(x_train_split, y_train_split) # Train the model on the training data

In [None]:
# Evaluate on validation set
y_val_pred = log_reg.predict(x_val_split)
print("Validation Accuracy:", accuracy_score(y_val_split, y_val_pred))
print(classification_report(y_val_split, y_val_pred))

In [None]:
# Evaluate on test set
y_test_pred = log_reg.predict(x_test_scaled)
print("\nTest Accuracy:", accuracy_score(y_test, y_test_pred))
print(classification_report(y_test, y_test_pred))
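For readers who want to see what `predict` computes, the following sketch reproduces a fitted binary `LogisticRegression` by hand: a linear score passed through the sigmoid, with a positive score meaning class 1. It uses small synthetic data (an assumption, chosen so the example runs in isolation) rather than the CIFAR-10 features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two synthetic Gaussian blobs (assumption), 4 features each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, size=(50, 4)),
               rng.normal(1.0, 1.0, size=(50, 4))])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Linear score z = w . x + b, then sigmoid to get P(class 1)
z = X @ clf.coef_.ravel() + clf.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))
p_sklearn = clf.predict_proba(X)[:, 1]

# A probability above 0.5 is equivalent to a positive score
manual_labels = (z > 0).astype(int)

print(np.allclose(p_manual, p_sklearn))
print(np.array_equal(manual_labels, clf.predict(X)))
```

In the notebook the same relation holds with 3072 scaled pixel features per image instead of 4.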

# **SVM**

**SVM using PCA for binary classification**

In [None]:
# SVM with PCA (preserving 95% of variance)
pca = PCA(n_components=0.95, random_state=42)
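When `n_components` is a float between 0 and 1, scikit-learn's `PCA` keeps the smallest number of components whose cumulative explained variance exceeds that fraction. The sketch below illustrates this on synthetic low-rank data; the data-generating setup (5 latent factors mixed into 50 dimensions) is an assumption made only for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 300 samples in 50 dimensions,
# driven mostly by 5 latent factors plus a little noise
rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 50))

# Keep enough components to explain more than 95% of the variance
pca_demo = PCA(n_components=0.95, svd_solver="full", random_state=42)
X_reduced = pca_demo.fit_transform(X)

cumulative = np.cumsum(pca_demo.explained_variance_ratio_)
print("Components kept:", pca_demo.n_components_)
print("Variance explained:", cumulative[-1])
```

On the CIFAR-10 features the number of retained components will be much larger than in this toy case, since natural images are not this close to low rank.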

In [None]:
# Create SVM pipeline with PCA
svm = SVC(kernel='rbf', random_state=42)
svm_pipeline = make_pipeline(pca, svm)

In [None]:
# Train SVM with PCA
svm_pipeline.fit(x_train_split, y_train_split)

In [None]:
# Evaluate on validation set
y_val_pred_svm = svm_pipeline.predict(x_val_split)
print("\nSVM Validation Accuracy:", accuracy_score(y_val_split, y_val_pred_svm))
print(classification_report(y_val_split, y_val_pred_svm))

In [None]:
# Evaluate both models on the test set
y_test_pred_log_reg = log_reg.predict(x_test_scaled)
y_test_pred_svm = svm_pipeline.predict(x_test_scaled)
print("\nLogistic Regression Test Accuracy:", accuracy_score(y_test, y_test_pred_log_reg))
print("SVM Test Accuracy:", accuracy_score(y_test, y_test_pred_svm))

In [None]:
# Print PCA information
print(f"Original dimension: {x_train_split.shape[1]}")
print(f"Reduced dimension after PCA: {pca.n_components_}")
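Reading `pca.n_components_` works here because `make_pipeline` stores the estimator object it is given rather than a copy, so fitting the pipeline fits that same `pca`. An alternative that does not rely on keeping a separate variable is to look the step up by name on the fitted pipeline. The sketch below shows this on synthetic data (an assumption); `make_pipeline` names each step after its lowercased class name, so the PCA step is `"pca"`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Two well-separated synthetic blobs (assumption), 20 features each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 1.0, size=(60, 20)),
               rng.normal(1.0, 1.0, size=(60, 20))])
y = np.array([0] * 60 + [1] * 60)

pipe = make_pipeline(
    PCA(n_components=0.95, random_state=42),
    SVC(kernel="rbf", random_state=42),
)
pipe.fit(X, y)

# Retrieve the fitted PCA step from the pipeline by its auto-generated name
fitted_pca = pipe.named_steps["pca"]
n_kept = fitted_pca.n_components_
train_acc = pipe.score(X, y)

print(f"Original dimension: {X.shape[1]}")
print(f"Reduced dimension after PCA: {n_kept}")
print(f"Training accuracy: {train_acc:.3f}")
```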