# Exercise 8 - SVM for Digits Dataset (Applied)

**Objective: SVM Classifier on Digits Dataset**

The Digits dataset contains images of handwritten digits (0-9) represented as 8x8 pixel images. Your task is to train an SVM classifier on this dataset using scikit-learn, evaluate its performance, and optimize its hyperparameters.



In [84]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
import warnings

# Silence warnings such as the DataConversionWarning raised when the target is passed as a one-column DataFrame
warnings.filterwarnings("ignore")

**1. Load the Digits dataset from sklearn.datasets.load_digits.**

In [85]:
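digits = load_digits()  # Bunch object with .data, .target and .images
X = digits.data         # (1797, 64) flattened pixel intensities
Y = digits.target       # (1797,) digit labels 0-9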

In [86]:
X = pd.DataFrame(X)
Y = pd.DataFrame(Y)

In [87]:
X

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,54,55,56,57,58,59,60,61,62,63
0,0.0,0.0,5.0,13.0,9.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,6.0,13.0,10.0,0.0,0.0,0.0
1,0.0,0.0,0.0,12.0,13.0,5.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,11.0,16.0,10.0,0.0,0.0
2,0.0,0.0,0.0,4.0,15.0,12.0,0.0,0.0,0.0,0.0,...,5.0,0.0,0.0,0.0,0.0,3.0,11.0,16.0,9.0,0.0
3,0.0,0.0,7.0,15.0,13.0,1.0,0.0,0.0,0.0,8.0,...,9.0,0.0,0.0,0.0,7.0,13.0,13.0,9.0,0.0,0.0
4,0.0,0.0,0.0,1.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,2.0,16.0,4.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1792,0.0,0.0,4.0,10.0,13.0,6.0,0.0,0.0,0.0,1.0,...,4.0,0.0,0.0,0.0,2.0,14.0,15.0,9.0,0.0,0.0
1793,0.0,0.0,6.0,16.0,13.0,11.0,1.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,6.0,16.0,14.0,6.0,0.0,0.0
1794,0.0,0.0,1.0,11.0,15.0,1.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,2.0,9.0,13.0,6.0,0.0,0.0
1795,0.0,0.0,2.0,10.0,7.0,0.0,0.0,0.0,0.0,0.0,...,2.0,0.0,0.0,0.0,5.0,12.0,16.0,12.0,0.0,0.0
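
Each row of `X` is one 8x8 image flattened into 64 pixel intensities. The following is a minimal sketch, assuming only the standard `load_digits` fields (`data`, `images`, `target`), that checks this relationship and recovers the image behind the first row. The name `first_image` is chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# data holds flattened images (n_samples, 64); images holds the same pixels as (n_samples, 8, 8)
print(digits.data.shape, digits.images.shape)

# Reshaping one row of data back to 8x8 recovers the original image
first_image = digits.data[0].reshape(8, 8)
print(np.array_equal(first_image, digits.images[0]))
print("Label of first sample:", digits.target[0])
```

The reshaped array can be passed to `plt.imshow(first_image, cmap="gray")` to view the digit.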


In [146]:
Y.head(10)

|  | 0 |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 6 | 6 |
| 7 | 7 |
| 8 | 8 |
| 9 | 9 |


**2. Split the dataset into 80% training and 20% test data.**

In [None]:
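# Two options: an ordered split (shuffle=False) or a shuffled, stratified split (stratify=Y).
# shuffle=False sends the first 80% of rows to training and the last 20% to testing;
# random_state has no effect when shuffle=False.
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42, shuffle=False)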

**3. Train an SVM classifier using an RBF kernel with default hyperparameters.**

In [133]:
SV = SVC(kernel="rbf")
SV.fit(x_train, y_train)

| Parameter | Value |
|---|---|
| C | 1.0 |
| kernel | 'rbf' |
| degree | 3 |
| gamma | 'scale' |
| coef0 | 0.0 |
| shrinking | True |
| probability | False |
| tol | 0.001 |
| cache_size | 200 |
| class_weight | None |


**4. Compute and print the accuracy on the test data.**

In [None]:
print(f"Accuracy of default SVC : {accuracy_score(y_test, SV.predict(x_test)) * 100:.4f} %")

Accuracy of default SVC : 94.1667 %
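
`accuracy_score` returns the fraction of test samples whose predicted label equals the true label. The sketch below checks that on a few made-up labels (hypothetical values, not taken from the notebook). It also shows that the reported figure is consistent with 339 correct predictions out of the 360 test samples that a 20% split of 1797 rows produces.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels, made up for illustration only
y_true = np.array([0, 1, 2, 3, 4, 5])
y_pred = np.array([0, 1, 2, 3, 9, 5])

# Accuracy is the mean of the element-wise matches
manual = np.mean(y_true == y_pred)
print(manual, accuracy_score(y_true, y_pred))

# 339 correct out of 360 test samples formats to the reported percentage
reported = f"{339 / 360 * 100:.4f} %"
print(reported)
```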


**5. Optimize the hyperparameters C and gamma using GridSearchCV with values:**

- C: [0.1, 1, 10, 100]
- gamma: [0.001, 0.01, 0.1, 1]

In [135]:
params = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1]
}

GV = GridSearchCV(SV, params, cv=5)
GV.fit(x_train, y_train)

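GridSearchCV parameters:

| Parameter | Value |
|---|---|
| estimator | SVC() |
| param_grid | `{'C': [0.1, 1, ...], 'gamma': [0.001, 0.01, ...]}` |
| scoring | None |
| n_jobs | None |
| refit | True |
| cv | 5 |
| verbose | 0 |
| pre_dispatch | '2*n_jobs' |
| error_score | nan |
| return_train_score | False |

Best estimator (SVC) parameters:

| Parameter | Value |
|---|---|
| C | 10 |
| kernel | 'rbf' |
| degree | 3 |
| gamma | 0.001 |
| coef0 | 0.0 |
| shrinking | True |
| probability | False |
| tol | 0.001 |
| cache_size | 200 |
| class_weight | None |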


In [136]:
print(f"Accuracy of best SVC : {accuracy_score(y_test, GV.predict(x_test)) * 100:.4f} %")

Accuracy of best SVC : 96.3889 %


**6. Print the best C and gamma values.**

In [137]:
GV.best_params_

{'C': 10, 'gamma': 0.001}
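
Beyond `best_params_`, a fitted `GridSearchCV` keeps the cross-validated score of every combination in `cv_results_`. The following sketch assumes the same ordered 80/20 split and the same grid as above, and lays the mean 5-fold accuracies out as a C-by-gamma table. The names `search`, `results` and `table` are illustrative and do not appear in the notebook.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(x_train, y_train)

# One row per (C, gamma) combination, including the mean accuracy over the 5 folds
results = pd.DataFrame(search.cv_results_)
table = results.pivot(index="param_C", columns="param_gamma", values="mean_test_score")
print(table.round(4))

print("Best parameters:", search.best_params_)
print("Best mean CV accuracy:", round(search.best_score_, 4))
```

Reading the whole table shows how sharply accuracy changes with `gamma`, which a single best-parameter pair does not reveal.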

**7. Train a new SVM model with the optimal hyperparameters and report its accuracy.**

In [138]:
OptSV = SVC(kernel="rbf", C=GV.best_params_["C"], gamma=GV.best_params_["gamma"])
OptSV.fit(x_train, y_train)

| Parameter | Value |
|---|---|
| C | 10 |
| kernel | 'rbf' |
| degree | 3 |
| gamma | 0.001 |
| coef0 | 0.0 |
| shrinking | True |
| probability | False |
| tol | 0.001 |
| cache_size | 200 |
| class_weight | None |


In [139]:
print(f"Accuracy of Optimal SVC : {accuracy_score(y_test, OptSV.predict(x_test)) * 100:.4f} %")

Accuracy of Optimal SVC : 96.3889 %
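
The retrained model reports the same accuracy as `GV.predict` because `GridSearchCV` uses `refit=True` by default: after the search it refits the best estimator on the full training set and uses it for prediction. The sketch below illustrates this under two assumptions of mine: a reduced grid (chosen only to keep the example fast) and illustrative names `search` and `manual`.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

# Reduced grid (an assumption, chosen only to keep this sketch fast)
search = GridSearchCV(SVC(kernel="rbf"), {"C": [1, 10], "gamma": [0.001, 0.01]}, cv=5)
search.fit(x_train, y_train)

# Retrain manually with the selected parameters, as in step 7
manual = SVC(kernel="rbf", **search.best_params_).fit(x_train, y_train)

# search.predict delegates to best_estimator_, which was already refit on the full training set
same = np.array_equal(search.predict(x_test), manual.predict(x_test))
print("Identical predictions:", same)
```

In practice this means `GV.best_estimator_` can be used directly instead of training a new model.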


## Conclusion

- The accuracies for **default SVC parameters** and **GridSearchCV** vary whenever the train-test split is performed differently.  
- When I set `random_state=42` for reproducibility, I obtained the **same accuracies** for both default SVC and GridSearchCV.  
  - In this case, the **best parameters** found were:  
    - **C = 10**  
    - **gamma = 0.001**  

- When I tried **different train-test splits (without fixed random state)**:  
  - GridSearchCV consistently gave **better accuracy than default parameters**.  
  - However, the **optimal C values changed** depending on the split.  

- Across all the splits I tested, the **highest accuracy** achieved was **0.9917**.


## Data Splitting and Its Impact on Accuracy

There are two main ways to split the data: **with shuffle** and **without shuffle**.

- **`shuffle=False`**:
  - The data is split in its original order, so the **first 80%** of the rows go to training and the **remaining 20%** to testing.
  - In this dataset, the target numbers (0–9) are **repeated in order**, but splitting without shuffling can still lead to **unequal representation of the digits** in the training set.
  - As a result, the model may **not learn all digits equally well**, which lowers accuracy.

- **Training with stratification**:
  - Using `stratify=Y` ensures that **each digit is proportionally represented** in both the training and test sets.
  - The model therefore **sees every digit in proportion to its frequency**, which leads to more balanced training.
  - In my runs, the model reached its **best accuracy** with a stratified split.

- **Observation regarding hyperparameter tuning**:
  - With stratification, the **default SVC parameters already perform on par with the tuned model**.
  - This reduces the need for an extensive GridSearchCV, as hyperparameter tuning does not significantly improve accuracy in this case.

**Conclusion:**
- **`shuffle=False`** → poorer performance due to unequal class representation.
- **`stratify=Y`** → best accuracy achieved with default SVC parameters, making hyperparameter tuning less critical.
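
The comparison above can be reproduced with a short sketch that prints, for each splitting strategy, how many samples of each digit land in the test set and the accuracy of a default RBF SVC. The `random_state=42` used for the stratified split and the names `splits`, `test_counts` and `scores` are my assumptions; the exact numbers depend on the split and are not claimed here.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Two splitting strategies: ordered (no shuffling) and shuffled with stratification
splits = {
    "shuffle=False": dict(shuffle=False),
    "stratify=y": dict(stratify=y, random_state=42),
}

test_counts = {}
scores = {}
for name, kwargs in splits.items():
    x_tr, x_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, **kwargs)
    # Number of test samples for each digit 0-9
    test_counts[name] = np.bincount(y_te, minlength=10)
    model = SVC(kernel="rbf").fit(x_tr, y_tr)
    scores[name] = accuracy_score(y_te, model.predict(x_te))
    print(f"{name:>14}: test counts {test_counts[name]}, accuracy {scores[name]:.4f}")
```

Comparing the two count vectors shows directly how evenly each strategy distributes the digits, separately from its effect on accuracy.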
