# Support Vector Machine (SVM) Classifier

# Understanding SVM

. Suppose we have two classes, A and B.
. SVM tries to find the decision boundary, called the hyperplane (a line in two dimensions), that best separates these two classes.
. The training points closest to the decision boundary are called support vectors; the distance from the boundary to those points is the margin.
. SVM chooses the hyperplane that maximizes the margin, which tends to generalize better to unseen data.
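
To make the ideas of support vectors and margin concrete, here is a minimal sketch on a hand-made toy dataset. The data points and the names `X_toy`, `y_toy`, and `clf_toy` are illustrative assumptions, not part of this notebook's dataset. A very large `C` is used to approximate a hard-margin SVM, and the margin width is computed as 2 / ||w|| from the fitted linear model.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: class A (label 0) on the left, class B (label 1) on the right.
X_toy = np.array([[1.0, 1.0], [1.0, 3.0], [2.0, 2.0],
                  [4.0, 2.0], [5.0, 1.0], [5.0, 3.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM on linearly separable data.
clf_toy = SVC(kernel="linear", C=1e6)
clf_toy.fit(X_toy, y_toy)

# For a linear kernel the hyperplane is w . x + b = 0.
w = clf_toy.coef_[0]
b = clf_toy.intercept_[0]

# The margin (distance between the two classes' closest points,
# measured perpendicular to the hyperplane) is 2 / ||w||.
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", clf_toy.support_vectors_)
print("Hyperplane weights:", w, "bias:", b)
print("Margin width:", margin_width)
```

On this toy data the two closest points, (2, 2) and (4, 2), should be reported as the support vectors, and the margin width should come out close to 2.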


# Linear vs Non-linear SVM

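. Linear: the classes can be separated by a straight line (a hyperplane in higher dimensions).
. Non-linear: when the data is not linearly separable, SVM uses a kernel function to implicitly map the data into a higher-dimensional space where a linear separator may exist.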
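
The difference between the two can be seen on data that no straight line can split. The sketch below uses scikit-learn's `make_circles` generator (an assumption chosen for illustration; it is not this notebook's dataset) and compares a linear kernel with an RBF kernel on the same split.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data: one class forms an inner circle, the other an outer ring.
X_demo, y_demo = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_demo_train, X_demo_test, y_demo_train, y_demo_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

# Same data, two kernels.
linear_acc = SVC(kernel="linear").fit(X_demo_train, y_demo_train).score(X_demo_test, y_demo_test)
rbf_acc = SVC(kernel="rbf", gamma="scale").fit(X_demo_train, y_demo_train).score(X_demo_test, y_demo_test)

print("Linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:", rbf_acc)
```

The linear kernel is expected to perform near chance level here, while the RBF kernel should separate the two rings almost perfectly.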

# Importing libraries

In [1]:
# Step 1: Import Libraries
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load and Prepare Data 

In [2]:
# Step 2: Load Dataset
dataset = pd.read_csv(r"C:\Ds & AI ( my work)\Machine Learning\Classification Algorithms\Datasets\logit classification.csv")  # adjust path if needed
dataset.head()

Unnamed: 0,User ID,Gender,Age,EstimatedSalary,Purchased
0,15624510,Male,19,19000,0
1,15810944,Male,35,20000,0
2,15668575,Female,26,43000,0
3,15603246,Female,27,57000,0
4,15804002,Male,19,76000,0


# Feature Selection & Train-Test Split

In [3]:

# Step 3: Select Features and Target
X = dataset[["Age", "EstimatedSalary"]].values
y = dataset["Purchased"].values

In [4]:
# Split into train-test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Feature Scaling 

In [5]:

# Step 4: Scale Features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
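
Scaling matters for SVM because the model works with distances, and a salary in the tens of thousands would otherwise dominate an age in the tens. The sketch below shows what `fit_transform` and `transform` do, using randomly generated stand-in values for the two columns. The arrays and the names `X_fake`, `demo_scaler`, and so on are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the Age and EstimatedSalary columns.
ages = rng.integers(18, 61, size=200)
salaries = rng.integers(15000, 150001, size=200)
X_fake = np.column_stack([ages, salaries]).astype(float)
X_fake_train, X_fake_test = X_fake[:150], X_fake[150:]

demo_scaler = StandardScaler()
# fit_transform learns the mean and standard deviation from the training rows only.
train_scaled = demo_scaler.fit_transform(X_fake_train)
# transform reuses those training statistics, so no test information leaks into them.
test_scaled = demo_scaler.transform(X_fake_test)

print("Learned means:", demo_scaler.mean_)
print("Learned std devs:", demo_scaler.scale_)
print("Train mean after scaling:", train_scaled.mean(axis=0))
print("Train std after scaling:", train_scaled.std(axis=0))
print("Test mean after scaling:", test_scaled.mean(axis=0))
```

The scaled training columns have mean 0 and standard deviation 1 by construction. The test columns will only be approximately so, which is expected.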

# Train SVM Classifier

In [6]:
# Step 5: Train the SVM model
svm_model = SVC(kernel='rbf', C=1.0, gamma='scale')  # Try 'linear' or 'poly' as well
svm_model.fit(X_train_scaled, y_train)
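
The values `C=1.0` and `gamma='scale'` are scikit-learn's defaults. One common way to choose them more deliberately is a cross-validated grid search. The following is a sketch under stated assumptions: it uses synthetic data from `make_classification` rather than this notebook's dataset, and the grid values are arbitrary examples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-feature data standing in for Age / EstimatedSalary.
X_syn, y_syn = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# Putting the scaler inside the pipeline means it is re-fit on each CV training fold.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Example grid (arbitrary values); make_pipeline names the SVC step "svc".
param_grid = {
    "svc__C": [0.1, 1.0, 10.0],
    "svc__gamma": ["scale", 0.1, 1.0],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_syn, y_syn)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

In this notebook the same search could be run on `X_train` and `y_train` (unscaled, since the pipeline handles scaling), keeping the test set untouched for the final evaluation.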

# Model Evaluation

In [7]:
# Step 6: Make Predictions and Evaluate
y_pred_svm = svm_model.predict(X_test_scaled)

print("SVM Classifier Results")
print("Accuracy:", accuracy_score(y_test, y_pred_svm))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_svm))
print("Classification Report:\n", classification_report(y_test, y_pred_svm))

SVM Classifier Results
Accuracy: 0.93
Confusion Matrix:
 [[64  4]
 [ 3 29]]
Classification Report:
               precision    recall  f1-score   support

           0       0.96      0.94      0.95        68
           1       0.88      0.91      0.89        32

    accuracy                           0.93       100
   macro avg       0.92      0.92      0.92       100
weighted avg       0.93      0.93      0.93       100
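
The figures in the report follow directly from the confusion matrix. As a worked check, the sketch below recomputes accuracy and the class 1 metrics from the four counts shown above, using scikit-learn's layout (rows are true labels, columns are predicted labels).

```python
import numpy as np

# Confusion matrix from the output above: [[TN, FP], [FN, TP]].
cm = np.array([[64, 4],
               [3, 29]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()          # (29 + 64) / 100
precision_1 = tp / (tp + fp)             # 29 / 33
recall_1 = tp / (tp + fn)                # 29 / 32
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"Accuracy: {accuracy:.2f}")             # Accuracy: 0.93
print(f"Precision (class 1): {precision_1:.2f}")  # Precision (class 1): 0.88
print(f"Recall (class 1): {recall_1:.2f}")        # Recall (class 1): 0.91
print(f"F1 (class 1): {f1_1:.2f}")                # F1 (class 1): 0.89
```

In plain terms, the model missed 3 of the 32 actual buyers (false negatives) and wrongly flagged 4 of the 68 non-buyers (false positives).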



# Future Prediction 

. Once the model is trained, you can pass new user data to it for prediction. The new data must first be scaled with the same fitted scaler.

In [8]:
new_data = [[30, 87000]]
scaled_data = scaler.transform(new_data)
svm_model.predict(scaled_data)

array([0], dtype=int64)
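
`predict` returns only the class label. If you also want to know how far a new point lies from the decision boundary, `SVC.decision_function` returns a signed score: positive for class 1, negative for class 0 in the binary case. The sketch below illustrates this on a stand-in model trained on made-up data; the labelling rule and all variable names are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Hypothetical training data: columns are age and salary.
X_fake = np.column_stack([
    rng.integers(18, 61, size=200),
    rng.integers(15000, 150001, size=200),
]).astype(float)
# Made-up labelling rule, used only so the stand-in model has something to learn.
y_fake = ((X_fake[:, 0] > 40) | (X_fake[:, 1] > 90000)).astype(int)

demo_scaler = StandardScaler().fit(X_fake)
demo_model = SVC(kernel="rbf", C=1.0, gamma="scale").fit(demo_scaler.transform(X_fake), y_fake)

# New users are scaled with the same fitted scaler before prediction.
new_users = np.array([[30, 87000], [55, 120000]], dtype=float)
new_scaled = demo_scaler.transform(new_users)

labels = demo_model.predict(new_scaled)
scores = demo_model.decision_function(new_scaled)

for user, label, score in zip(new_users, labels, scores):
    print(f"age={user[0]:.0f}, salary={user[1]:.0f} -> class {label}, score {score:+.3f}")
```

A score near zero means the point sits close to the boundary, so the prediction is less certain than one with a large absolute score.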

# Predicting on a Batch of Future Data

In [10]:
# Step 7: Create Future Data for Prediction

import pandas as pd

future_data = pd.DataFrame({
    "User ID": [1674381, 1674382, 1674383, 1674384, 1674385, 1674386, 1674387,
                1674388, 1674389, 1674390, 1674391, 1674392, 1674393, 1674394],
    "Gender": ["Male", "Female", "Male", "Female", "Female", "Male", "Male",
               "Female", "Male", "Female", "Female", "Male", "Male", "Female"],
    "Age": [29, 14, 28, 58, 80, 90, 100, 45, 37, 48, 59, 60, 61, 62],
    "EstimatedSalary": [39000, 34500, 40000, 56490, 59000, 41000, 23000, 20000,
                        33000, 23000, 64000, 33000, 23000, 45000]
})

print(future_data)


    User ID  Gender  Age  EstimatedSalary
0   1674381    Male   29            39000
1   1674382  Female   14            34500
2   1674383    Male   28            40000
3   1674384  Female   58            56490
4   1674385  Female   80            59000
5   1674386    Male   90            41000
6   1674387    Male  100            23000
7   1674388  Female   45            20000
8   1674389    Male   37            33000
9   1674390  Female   48            23000
10  1674391  Female   59            64000
11  1674392    Male   60            33000
12  1674393    Male   61            23000
13  1674394  Female   62            45000


# Scale the future data using the same scaler used during training

In [11]:
# Use .values so the input matches the NumPy array the scaler was fitted on
future_scaled = scaler.transform(future_data[["Age", "EstimatedSalary"]].values)

# Predict using trained SVM Model

In [12]:
future_predictions = svm_model.predict(future_scaled)

# Append predictions to the DataFrame
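
One way to do this is to add the predictions as a new column. The sketch below is self-contained, so it trains a stand-in model on made-up data. The training data, the labelling rule, and the column name `Predicted_Purchased` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical training data standing in for the notebook's dataset.
rng = np.random.default_rng(0)
train_df = pd.DataFrame({
    "Age": rng.integers(18, 61, size=200),
    "EstimatedSalary": rng.integers(15000, 150001, size=200),
})
# Made-up labelling rule, used only so the stand-in model has something to learn.
train_df["Purchased"] = (
    (train_df["Age"] > 40) | (train_df["EstimatedSalary"] > 90000)
).astype(int)

features = ["Age", "EstimatedSalary"]
demo_model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
demo_model.fit(train_df[features].values, train_df["Purchased"].values)

# A few rows in the same shape as the future data above.
future_df = pd.DataFrame({
    "User ID": [1674381, 1674384, 1674388],
    "Age": [29, 58, 45],
    "EstimatedSalary": [39000, 56490, 20000],
})

preds = demo_model.predict(future_df[features].values)

# assign returns a new DataFrame with the extra column; the original is unchanged.
result_df = future_df.assign(Predicted_Purchased=preds)
print(result_df)
```

With the variables already defined in this notebook, the equivalent step would likely be `future_data["Predicted_Purchased"] = future_predictions`, after which the DataFrame can be inspected or saved with `to_csv`.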