![image.png](attachment:image.png)

# What is SVC (Support Vector Classifier)?
### SVC (Support Vector Classifier) is a type of machine learning model used for classification tasks. It is based on the Support Vector Machine (SVM) algorithm. SVC helps to classify data by finding the best boundary (decision boundary) that separates different classes.
---
# Think of SVC as a smart way to divide things into two groups based on their features.
---
# How Does SVC Work?
### It draws a straight (or curved) line between two groups of data in a way that maximizes the gap between them.
### The data points that are closest to this boundary are called support vectors.
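### Even if the data cannot be separated by a straight line, SVC can use the kernel trick to implicitly map it into a higher-dimensional space where a linear boundary works. If the classes overlap, the soft margin (controlled by the `C` parameter) allows some points to fall on the wrong side of the boundary.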
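
To make the ideas of margin and support vectors concrete, here is a minimal sketch on hypothetical toy data (the six points and the variable names `X_toy`, `y_toy` are assumptions for illustration, not part of the notebook's dataset). It fits a linear SVC, then reads the support vectors and the margin width `2 / ||w||` from the fitted model.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: two well-separated groups in 2-D.
X_toy = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard margin (almost no margin violations allowed).
clf = SVC(kernel="linear", C=1e3)
clf.fit(X_toy, y_toy)

# For a linear kernel the boundary is w . x + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two margin lines w . x + b = -1 and w . x + b = +1.
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin_width)
```

Only the points listed in `support_vectors_` determine the boundary; moving any other point slightly would leave the fitted line unchanged.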
---
# Real-Life Example of SVC:
### Imagine you are a bank manager, and you need to decide whether to approve or reject loan applications.
### You have customer data like credit score, salary, past loan history, etc.
### Using SVC, you can classify customers into "Loan Approved" and "Loan Rejected" groups based on past records.
### The algorithm finds the best decision boundary that separates good borrowers from risky borrowers.
---
# Advantages of SVC:
### ✅ Works Well on Small to Medium Datasets – It can perform well even with limited data, because the boundary depends only on the support vectors.
### ✅ Effective for High-Dimensional Data – Works well even if you have many features.
### ✅ Robust to Some Outliers – Only the support vectors define the boundary, so points that lie far from the margin do not affect it.
### ✅ Uses Different Kernels – Can work with linear and non-linear classification problems.
---
# Disadvantages of SVC:
###  ❌ Slow on Large Datasets – Training can take a long time if the dataset is very large.
###  ❌ Difficult to Tune – Choosing the right kernel and hyperparameters requires experience.
###  ❌ Not Always the Best for Noisy Data – If there are too many overlapping points, it might not work well.
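
Tuning is usually done with a cross-validated search rather than by hand. The sketch below is one possible setup, assuming synthetic data from `make_classification` and an arbitrary small grid; the grid values and the names `X_demo`, `y_demo`, and `search` are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: two informative features, two classes).
X_demo, y_demo = make_classification(
    n_samples=200, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipe = make_pipeline(StandardScaler(), SVC())

# Illustrative grid; gamma is ignored when the kernel is linear.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1, 1],
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_demo, y_demo)

print("best parameters:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
```

The search cost grows with the number of grid combinations times the number of folds, which is one reason SVC tuning becomes slow on large datasets.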
---
# Summary:
### SVC is a powerful machine-learning algorithm for classification problems. It works by finding the best boundary that separates different categories. It is useful for cases where data is well-structured and not too large, but it may struggle with very big or noisy datasets.

In [1]:
import pandas as pd
import numpy as np
import sklearn

In [2]:
df = pd.read_csv("Social_Network_Ads.csv")
df

,Age,EstimatedSalary,Purchased
0,19,19000,0
1,35,20000,0
2,26,43000,0
3,27,57000,0
4,19,76000,0
...,...,...,...
395,46,41000,1
396,51,23000,1
397,50,20000,1
398,36,33000,0


In [3]:
df.isnull().sum()

Age                0
EstimatedSalary    0
Purchased          0
dtype: int64

In [4]:
X = df.drop(columns='Purchased')
y = df['Purchased']

# preprocessing

In [5]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)

# train test split

In [6]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=0)
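
The cells above fit `StandardScaler` on all of `X` before splitting, so the scaler's mean and variance include the test rows. A common alternative is to split first and put the scaler in a pipeline, so it is fit on the training rows only. The sketch below shows that pattern on synthetic stand-in data (an assumption, since the CSV is not reproduced here); the names `X_raw`, `y_raw`, and `pipe` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the two-feature dataset (assumption).
X_raw, y_raw = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# Split first, on unscaled data.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=0
)

# The pipeline fits the scaler on X_tr only, then reuses it for X_te.
pipe = make_pipeline(StandardScaler(), SVC(kernel="linear"))
pipe.fit(X_tr, y_tr)

test_accuracy = pipe.score(X_te, y_te)
print("test accuracy:", test_accuracy)
```

On a small, well-behaved dataset the difference in scores is often minor, but the pipeline version keeps the test set untouched until evaluation.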

In [7]:
X_train

array([[ 1.94321462,  2.18056084],
       [ 2.03872775,  0.38930459],
       [-1.30423178, -0.4329114 ],
       [-1.11320552, -1.02020853],
       [ 1.94321462, -0.93211396],
       [ 0.41500455,  0.30121002],
       [ 0.22397829,  0.15438573],
       [ 2.03872775,  1.76945285],
       [ 0.79705706, -0.84401939],
       [ 0.31949142, -0.28608712],
       [ 0.41500455, -0.16862769],
       [-0.0625611 ,  2.23929055],
       [-1.39974491, -0.63846539],
       [-1.20871865, -1.07893824],
       [-1.30423178,  0.41866944],
       [-1.01769239,  0.77104772],
       [-1.39974491, -0.19799255],
       [ 0.98808332, -1.07893824],
       [ 0.98808332,  0.59485858],
       [ 0.41500455,  1.00596657],
       [ 0.60603081, -0.9027491 ],
       [-0.54012675,  1.47580428],
       [ 0.03295203, -0.57973568],
       [-0.54012675,  1.91627713],
       [ 1.37013584, -1.43131652],
       [ 1.46564897,  1.00596657],
       [ 0.12846516, -0.81465453],
       [ 0.03295203, -0.25672226],
       [-0.15807423,

In [8]:
X_test

array([[-0.73115301,  0.50676401],
       [ 0.03295203, -0.57973568],
       [-0.25358736,  0.15438573],
       [-0.73115301,  0.27184516],
       [-0.25358736, -0.57973568],
       [-1.01769239, -1.46068138],
       [-0.63563988, -1.60750566],
       [-0.15807423,  2.18056084],
       [-1.87731056, -0.05116826],
       [ 0.89257019, -0.78528968],
       [-0.73115301, -0.60910054],
       [-0.92217926, -0.4329114 ],
       [-0.0625611 , -0.4329114 ],
       [ 0.12846516,  0.21311545],
       [-1.6862843 ,  0.47739916],
       [-0.54012675,  1.38770971],
       [-0.0625611 ,  0.21311545],
       [-1.78179743,  0.4480343 ],
       [ 1.65667523,  1.76945285],
       [-0.25358736, -1.40195167],
       [-0.25358736, -0.66783025],
       [ 0.89257019,  2.18056084],
       [ 0.31949142, -0.55037082],
       [ 0.89257019,  1.03533143],
       [-1.39974491, -1.22576253],
       [ 1.08359645,  2.09246627],
       [-0.92217926,  0.50676401],
       [-0.82666613,  0.30121002],
       [-0.0625611 ,

In [9]:
y_train

336    1
64     0
55     0
106    0
300    1
      ..
323    1
192    0
117    0
47     0
172    0
Name: Purchased, Length: 320, dtype: int64

In [10]:
y_test

132    0
309    0
341    0
196    0
246    0
      ..
14     0
363    0
304    0
361    1
329    1
Name: Purchased, Length: 80, dtype: int64

# SVC

In [11]:
from sklearn.svm import SVC
classifier = SVC(kernel='linear')
classifier.fit(X_train,y_train)

# prediction

In [12]:
y_pred = classifier.predict(X_test)
y_pred

array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1], dtype=int64)

# evaluation

In [13]:
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
# scikit-learn metrics expect (y_true, y_pred): rows are true labels, columns are predictions
confusion_matrix(y_test,y_pred)

array([[57,  1],
       [ 6, 16]], dtype=int64)

In [14]:
accuracy_score(y_test,y_pred)

0.9125

In [15]:
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

           0       0.90      0.98      0.94        58
           1       0.94      0.73      0.82        22

    accuracy                           0.91        80
   macro avg       0.92      0.86      0.88        80
weighted avg       0.91      0.91      0.91        80
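
The headline metrics can be recomputed by hand from the four confusion-matrix counts for the linear kernel. The sketch below treats `y_test` as the ground truth and class 1 (purchased) as the positive class, which gives 16 true positives, 57 true negatives, 1 false positive, and 6 false negatives; the variable names are illustrative.

```python
# Counts for the linear-kernel model, with y_test as ground truth
# and class 1 (purchased) as the positive class.
tp = 16  # purchasers predicted as purchasers
tn = 57  # non-purchasers predicted as non-purchasers
fp = 1   # non-purchasers predicted as purchasers
fn = 6   # purchasers predicted as non-purchasers

total = tp + tn + fp + fn

accuracy = (tp + tn) / total       # share of all predictions that are correct
precision = tp / (tp + fp)         # of predicted purchasers, how many really purchased
recall = tp / (tp + fn)            # of real purchasers, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

The gap between precision and recall for class 1 shows that the linear model rarely raises a false alarm but misses a noticeable share of actual purchasers.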



# SVC

In [16]:
from sklearn.svm import SVC
classifier = SVC(kernel='rbf')   #radial basis function
classifier.fit(X_train,y_train)
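
For reference, the RBF kernel measures similarity between two points as `exp(-gamma * ||x - z||^2)`. The sketch below computes it by hand for two hypothetical points and checks the result against scikit-learn's `rbf_kernel`; the points and the `gamma` value are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def rbf(x, z, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return np.exp(-gamma * np.dot(diff, diff))


# Two hypothetical points; their squared distance is 2.
a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])
gamma = 0.5

k_manual = rbf(a, b, gamma)
k_sklearn = rbf_kernel(a.reshape(1, -1), b.reshape(1, -1), gamma=gamma)[0, 0]

print("manual:", k_manual)
print("sklearn:", k_sklearn)
```

A larger `gamma` makes the similarity drop off faster with distance, which produces a more tightly curved decision boundary.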

# prediction

In [17]:
y_pred = classifier.predict(X_test)
y_pred

array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1,
       0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1,
       0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1], dtype=int64)

# evaluation

In [18]:
from sklearn.metrics import confusion_matrix,accuracy_score,classification_report
# scikit-learn metrics expect (y_true, y_pred): rows are true labels, columns are predictions
confusion_matrix(y_test,y_pred)

array([[55,  3],
       [ 1, 21]], dtype=int64)

In [19]:
accuracy_score(y_test,y_pred)

0.95

In [20]:
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

           0       0.98      0.95      0.96        58
           1       0.88      0.95      0.91        22

    accuracy                           0.95        80
   macro avg       0.93      0.95      0.94        80
weighted avg       0.95      0.95      0.95        80
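
On this split the RBF kernel reaches 0.95 accuracy against 0.9125 for the linear kernel, but a single 80-row test set is a small sample. One way to compare kernels more reliably is cross-validation. The sketch below illustrates the pattern on synthetic concentric-circle data from `make_circles` (an assumption chosen because a straight line cannot separate it); it does not use the notebook's CSV, and the names `X_demo`, `y_demo`, and `mean_scores` are illustrative.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data: one class forms a ring around the other (assumption).
X_demo, y_demo = make_circles(
    n_samples=200, factor=0.3, noise=0.05, random_state=0
)

mean_scores = {}
for kernel in ("linear", "rbf"):
    # Scale inside the pipeline so each fold is scaled on its own training part.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(model, X_demo, y_demo, cv=5)
    mean_scores[kernel] = scores.mean()
    print(f"{kernel}: mean accuracy over 5 folds = {scores.mean():.3f}")
```

Running the same loop with the notebook's `X` and `y` would give a fold-averaged comparison of the two kernels on the actual data.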

