The CIFAR-10 dataset is a widely used image classification benchmark in machine learning. It consists of 60,000 32x32 color images in 10 classes (6,000 per class), split into 50,000 training images and 10,000 test images. SVMs can perform well on image classification, but raw image data is high-dimensional (each image here has 32 x 32 x 3 = 3,072 pixel values), so a feature extraction step may be needed before applying an SVM.

Loading and Preprocessing the CIFAR-10 Dataset

CIFAR-10 images are RGB arrays of shape 32x32x3. scikit-learn's SVC expects each sample as a fixed-length feature vector rather than a multi-dimensional image array, so each image is flattened into a 1D vector of 3,072 pixel values before training.
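
The flattening step can be checked in isolation on a tiny array. This is a minimal sketch that uses a randomly filled batch as a hypothetical stand-in for CIFAR-10 images (the `images` array is not part of the notebook); it shows that `reshape` only changes the layout and that the original images can be recovered.

```python
import numpy as np

# Hypothetical stand-in for a small batch of CIFAR-10 images:
# 4 RGB images of 32x32 pixels with uint8 intensities.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Flatten every image into one row of 32 * 32 * 3 = 3072 values.
flat = images.reshape(images.shape[0], -1)
print(flat.shape)  # (4, 3072)

# Reshaping back restores the original images exactly.
restored = flat.reshape(-1, 32, 32, 3)
print(np.array_equal(images, restored))  # True
```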

In [6]:
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import cifar10
from sklearn.preprocessing import StandardScaler
import numpy as np

# Load the CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Flatten the images (convert each 32x32x3 image into a 1D vector of 3,072 values)
X_train_flat = X_train.reshape(X_train.shape[0], -1)  # Shape: (50000, 3072)
X_test_flat = X_test.reshape(X_test.shape[0], -1)      # Shape: (10000, 3072)

# Standardize each feature to zero mean and unit variance (important for SVM).
# The scaler is fit on the training data only and then applied to the test data.
scaler = StandardScaler()
X_train_flat = scaler.fit_transform(X_train_flat)
X_test_flat = scaler.transform(X_test_flat)

# Optional: keep a 10% subset of the data to speed up training
# (test_size=0.9 sets aside 90%, leaving 5,000 training and 1,000 test images)
X_train_subset, _, y_train_subset, _ = train_test_split(X_train_flat, y_train, test_size=0.9, random_state=42)
X_test_subset, _, y_test_subset, _ = train_test_split(X_test_flat, y_test, test_size=0.9, random_state=42)

# Create one SVM per kernel so the three can be compared
svm_model_rbf = SVC(kernel='rbf', C=1, gamma='scale')
svm_model_linear = SVC(kernel='linear')
svm_model_poly = SVC(kernel='poly', degree=3)  # Polynomial kernel of degree 3

# Train each model; ravel() flattens the (n, 1) label array to shape (n,)
svm_model_rbf.fit(X_train_subset, y_train_subset.ravel())
svm_model_linear.fit(X_train_subset, y_train_subset.ravel())
svm_model_poly.fit(X_train_subset, y_train_subset.ravel())

# Predict on the test subset
y_pred_rbf = svm_model_rbf.predict(X_test_subset)
y_pred_linear = svm_model_linear.predict(X_test_subset)
y_pred_poly = svm_model_poly.predict(X_test_subset)
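
The notebook trains directly on standardized pixels. As a point of comparison for the feature extraction mentioned in the introduction, the following is a minimal sketch, not the notebook's method, that places PCA between the scaler and the SVM in a scikit-learn pipeline. It runs on synthetic data from `make_classification` so that it is quick and self-contained; the sample counts, the 200 input features, and the choice of 30 components are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for high-dimensional image vectors (assumption, not CIFAR-10).
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=20, n_classes=3, random_state=0
)
X_fit, y_fit = X[:200], y[:200]
X_eval = X[200:]

# Standardize, reduce to 30 principal components, then fit an RBF SVM.
pca_svm = make_pipeline(
    StandardScaler(),
    PCA(n_components=30, random_state=0),
    SVC(kernel="rbf", C=1, gamma="scale"),
)
pca_svm.fit(X_fit, y_fit)

# Apply only the preprocessing steps to inspect the reduced representation.
reduced = pca_svm[:-1].transform(X_eval)
preds = pca_svm.predict(X_eval)
print(reduced.shape, preds.shape)
```

Wrapping the steps in one pipeline keeps the scaler and PCA fitted on training data only, the same discipline the cell above applies to `StandardScaler`. Whether PCA helps on CIFAR-10 itself would have to be measured.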

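Generate random data for prediction, sampled uniformly between the minimum and maximum of the standardized training features. These inputs are noise rather than images, so the predicted labels only show how each model responds to out-of-distribution input.
Then compare all three models on the test subset.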
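
The sampling call in the next cell is unseeded, so the generated values differ from run to run. A minimal sketch of drawing the same kind of noise reproducibly with NumPy's `Generator` API follows; `low`, `high`, and `n_features` are hypothetical stand-ins for `X_train_flat.min()`, `X_train_flat.max()`, and `X_train_flat.shape[1]`.

```python
import numpy as np

# Hypothetical stand-ins for the range and width of the standardized features.
low, high, n_features = -2.0, 2.0, 3072

# A seeded generator makes the random inputs repeatable across runs.
rng = np.random.default_rng(seed=42)
noise = rng.uniform(low, high, size=(5, n_features))

# Re-creating the generator with the same seed reproduces the same array.
noise_again = np.random.default_rng(seed=42).uniform(low, high, size=(5, n_features))
print(noise.shape, np.array_equal(noise, noise_again))  # (5, 3072) True
```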

In [8]:
# Draw 5 random vectors within the range of the standardized training features
# (unseeded, so the values change on every run)
random_data = np.random.uniform(X_train_flat.min(), X_train_flat.max(), (5, X_train_flat.shape[1]))

print("Randomly generated data:")
print(random_data)

# Predict using the linear, RBF, and polynomial SVM models
linear_predictions = svm_model_linear.predict(random_data)
rbf_predictions = svm_model_rbf.predict(random_data)
poly_predictions = svm_model_poly.predict(random_data)

# Compare predictions from the different models
print("\nPredictions using Linear SVM:", linear_predictions)
print("Predictions using RBF SVM:", rbf_predictions)
print("Predictions using Polynomial SVM:", poly_predictions)

# Evaluate the models on the test subset, reusing the predictions computed earlier
linear_accuracy = accuracy_score(y_test_subset, y_pred_linear)
rbf_accuracy = accuracy_score(y_test_subset, y_pred_rbf)
poly_accuracy = accuracy_score(y_test_subset, y_pred_poly)

print(f"\nAccuracy on test set - Linear SVM: {linear_accuracy:.2f}")
print(f"Accuracy on test set - RBF SVM: {rbf_accuracy:.2f}")
print(f"Accuracy on test set - Polynomial SVM: {poly_accuracy:.2f}")

Randomly generated data:
[[-0.43357579  0.89447971 -0.37489942 ... -0.92984659  1.5716975
   0.83432099]
 [ 2.52457201  1.79165611  0.56537516 ... -1.52348857 -1.0327751
  -2.08653381]
 [-1.87012474  0.44157122 -0.39083918 ...  0.74197546  2.31628587
  -0.08688807]
 [ 0.19739427  1.09002944  0.48228726 ...  0.61973457 -1.06131282
   1.99285663]
 [ 0.16217142  0.10607983  0.41640426 ...  1.69995937  1.9658678
  -1.7277571 ]]

Predictions using Linear SVM: [2 3 6 7 2]
Predictions using RBF SVM: [1 1 1 1 1]
Predictions using Polynomial SVM: [4 4 4 4 4]

Accuracy on test set - Linear SVM: 0.33
Accuracy on test set - RBF SVM: 0.42
Accuracy on test set - Polynomial SVM: 0.33
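
A single accuracy figure hides which classes each model gets wrong. One possible way to extend the comparison, sketched here on a small set of hypothetical labels rather than the notebook's results, is to build a confusion matrix and read per-class recall from it. In the notebook, `y_true` would correspond to `y_test_subset.ravel()` and `y_pred` to one of the `y_pred_*` arrays.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels for illustration only (3 classes, 6 samples).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[1 1 0]
#  [0 2 0]
#  [1 0 1]]

# Per-class recall: correct predictions divided by the number of true samples per class.
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(per_class_recall)  # [0.5 1.  0.5]

# Overall accuracy is the diagonal sum divided by the total count (4 of 6 here).
accuracy = accuracy_score(y_true, y_pred)
```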
