# Support Vector Machine [ Classification ]
 - Support Vector Machine [ SVM ] is a supervised machine learning algorithm used for classification and regression tasks.
   It aims to find the optimal boundary [ hyperplane ] that best separates data points of different classes.
   In a 2D space with two classes, this boundary is a line, chosen so that the margin between the classes is as large as possible.

 - Support Vector Classification [ SVC ] is a classification algorithm that finds the best decision boundary (hyperplane) 
   separating the classes by maximizing the margin between them.
   It relies on support vectors (the data points closest to the boundary) and can handle both linear and non-linear 
   classification using kernels.

 - Hyperplane => A hyperplane is the decision boundary that separates data points belonging to different classes in SVM.

 - The margin is the distance between the hyperplane and the closest data points from each class.
   SVM tries to maximize this margin, which tends to improve generalization on unseen data.
   The data points that lie on the edges of this margin are called Support Vectors.
   All else being equal, a larger margin usually means a more robust classifier.

 - How SVC Works (Simple Steps)
  1. Takes your training data.
  2. Searches for the hyperplane that best separates the classes.
  3. Uses the support vectors to maximize the margin.
  4. If the data is not linearly separable → uses a kernel to implicitly map it into a higher-dimensional space where a separating hyperplane may exist.
  5. Classifies new data based on which side of the hyperplane it falls on.
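
The steps above can be sketched on a tiny hand-made dataset. The six 2D points, the variable names, and the large `C` value (used here to approximate a hard margin) are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2D data (made up for illustration)
X_toy = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
                  [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A large C penalizes margin violations heavily, approximating a hard margin
toy_clf = SVC(kernel="linear", C=1e3)
toy_clf.fit(X_toy, y_toy)

# For a linear kernel the hyperplane is w . x + b = 0
w = toy_clf.coef_[0]
b = toy_clf.intercept_[0]

# The margin edges are w . x + b = +1 and -1, so the full width is 2 / ||w||
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", toy_clf.support_vectors_)
print("Margin width:", margin_width)

# New points are classified by the sign of the decision function,
# i.e. by which side of the hyperplane they fall on
new_points = np.array([[1.5, 1.5], [4.5, 4.5]])
decision_values = toy_clf.decision_function(new_points)
predictions = toy_clf.predict(new_points)
print("Decision values:", decision_values)
print("Predicted classes:", predictions)
```

Only the points returned in `support_vectors_` determine the boundary; removing any of the other training points would leave the fitted hyperplane unchanged.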

### Import Libraries

In [16]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC 
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_breast_cancer

### Load dataset

In [18]:
data = load_breast_cancer()
data.data

array([[1.799e+01, 1.038e+01, 1.228e+02, ..., 2.654e-01, 4.601e-01,
        1.189e-01],
       [2.057e+01, 1.777e+01, 1.329e+02, ..., 1.860e-01, 2.750e-01,
        8.902e-02],
       [1.969e+01, 2.125e+01, 1.300e+02, ..., 2.430e-01, 3.613e-01,
        8.758e-02],
       ...,
       [1.660e+01, 2.808e+01, 1.083e+02, ..., 1.418e-01, 2.218e-01,
        7.820e-02],
       [2.060e+01, 2.933e+01, 1.401e+02, ..., 2.650e-01, 4.087e-01,
        1.240e-01],
       [7.760e+00, 2.454e+01, 4.792e+01, ..., 0.000e+00, 2.871e-01,
        7.039e-02]])

### Dataset Features Name

In [20]:
data.feature_names

array(['mean radius', 'mean texture', 'mean perimeter', 'mean area',
       'mean smoothness', 'mean compactness', 'mean concavity',
       'mean concave points', 'mean symmetry', 'mean fractal dimension',
       'radius error', 'texture error', 'perimeter error', 'area error',
       'smoothness error', 'compactness error', 'concavity error',
       'concave points error', 'symmetry error',
       'fractal dimension error', 'worst radius', 'worst texture',
       'worst perimeter', 'worst area', 'worst smoothness',
       'worst compactness', 'worst concavity', 'worst concave points',
       'worst symmetry', 'worst fractal dimension'], dtype='<U23')

#### Dataset Target 
     0 = Malignant ( the presence of the serious condition )
     1 = Benign ( the absence of the serious condition )
     Note: in this encoding the clinically positive class ( malignant ) has label 0, matching the order of data.target_names.

### Dataset target name

In [23]:
data.target_names

array(['malignant', 'benign'], dtype='<U9')
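
Because the integer labels are easy to read the wrong way round, it may help to print the mapping directly. This is a small sketch; the `label_counts` dictionary is a name introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# target_names is indexed by the integer label,
# so position 0 holds the name of class 0, position 1 the name of class 1
label_counts = {}
for code, name in enumerate(data.target_names):
    label_counts[str(name)] = int(np.sum(data.target == code))
    print(f"{code} = {name} ({label_counts[str(name)]} samples)")
```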

### Final dataframe

In [25]:
data_1 = pd.DataFrame(data.data,columns=data.feature_names)
data_1['Target'] = data.target
data_1.sample(5)

| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension | Target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 336 | 12.99 | 14.23 | 84.08 | 514.3 | 0.09462 | 0.09965 | 0.03738 | 0.02098 | 0.1652 | 0.07238 | ... | 16.91 | 87.38 | 576.0 | 0.1142 | 0.1975 | 0.145 | 0.0585 | 0.2432 | 0.1009 | 1 |
| 564 | 21.56 | 22.39 | 142.0 | 1479.0 | 0.111 | 0.1159 | 0.2439 | 0.1389 | 0.1726 | 0.05623 | ... | 26.4 | 166.1 | 2027.0 | 0.141 | 0.2113 | 0.4107 | 0.2216 | 0.206 | 0.07115 | 0 |
| 301 | 12.46 | 19.89 | 80.43 | 471.3 | 0.08451 | 0.1014 | 0.0683 | 0.03099 | 0.1781 | 0.06249 | ... | 23.07 | 88.13 | 551.3 | 0.105 | 0.2158 | 0.1904 | 0.07625 | 0.2685 | 0.07764 | 1 |
| 133 | 15.71 | 13.93 | 102.0 | 761.7 | 0.09462 | 0.09462 | 0.07135 | 0.05933 | 0.1816 | 0.05723 | ... | 19.25 | 114.3 | 922.8 | 0.1223 | 0.1949 | 0.1709 | 0.1374 | 0.2723 | 0.07071 | 1 |
| 389 | 19.55 | 23.21 | 128.9 | 1174.0 | 0.101 | 0.1318 | 0.1856 | 0.1021 | 0.1989 | 0.05884 | ... | 30.44 | 142.0 | 1313.0 | 0.1251 | 0.2414 | 0.3829 | 0.1825 | 0.2576 | 0.07602 | 0 |


#### Splitting X & y Into Train And Test Data 

In [27]:
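# Features are every column except the label; the label is the 'Target' column
X = data_1.drop(columns=['Target'])
y = data_1['Target']
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train,X_test, y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)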
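
One optional variation, not used in this notebook, is to pass `stratify=y` so that both splits keep roughly the same class proportions. The sketch below uses separate variable names (with an `_s` suffix, chosen here for illustration) so it does not overwrite the split above; scores obtained from a stratified split would differ from the ones reported later.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# as_frame=True returns a DataFrame of features and a Series of labels
X_all, y_all = load_breast_cancer(return_X_y=True, as_frame=True)

# stratify=y_all keeps the malignant/benign ratio similar in both splits
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X_all, y_all, test_size=0.2, random_state=42, stratify=y_all
)

print("Train shape:", X_train_s.shape, "Test shape:", X_test_s.shape)
# Labels are 0/1, so the mean is the share of class 1 (benign)
print("Benign share (train):", round(y_train_s.mean(), 3))
print("Benign share (test): ", round(y_test_s.mean(), 3))
```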

####  Base Support Vector Machine [ SVC ] Model 

In [29]:
classifiction = SVC()
classifiction.fit(X_train,y_train)

In [30]:
print("SVC Score Without Scaling ::",classifiction.score(X_test,y_test))

SVC Score Without Scaling :: 0.9473684210526315


#### Using StandardScaler for scaling data

In [32]:
# Fit the scaler on the training data only, so no test-set statistics leak into training
scale = StandardScaler()
scale.fit(X_train)

In [33]:
# Apply the training-set mean and standard deviation to both splits
X_train_scale = scale.transform(X_train)
X_test_scale = scale.transform(X_test)
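
As an alternative to scaling by hand, the scaler and the classifier can be chained so that the fit-on-train, transform-on-test discipline is handled automatically. This is a minimal sketch using scikit-learn's `make_pipeline`; the names `scaled_svc` and `pipeline_score` are introduced here for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_arr, y_arr = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_arr, y_arr, test_size=0.2, random_state=42
)

# fit() fits the scaler on the training data and then the SVC on the scaled data;
# score() reuses the training-set scaling parameters on the test data
scaled_svc = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scaled_svc.fit(X_tr, y_tr)

pipeline_score = scaled_svc.score(X_te, y_te)
print("Pipeline score:", pipeline_score)
```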

#### SVC Model With Scaled Data And ' rbf ' Kernel 
- A kernel is a mathematical function that lets an SVM (SVC or SVR) work in a higher-dimensional space without explicitly transforming the data.
- The RBF kernel (also known as the Gaussian kernel) is the default kernel for scikit-learn's SVC and is widely used. It corresponds to a mapping into an infinite-dimensional space, and it treats points that are close to each other in the feature space as similar, so nearby points tend to receive the same class label.
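
To make the notion of similarity concrete, the RBF kernel value between two points is exp(-gamma * squared distance). The sketch below computes it by hand and checks it against scikit-learn's `rbf_kernel`; the points, the `gamma` value, and the helper `rbf_manual` are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

anchor = np.array([[0.0, 0.0]])
near = np.array([[0.1, 0.1]])
far = np.array([[3.0, 3.0]])
gamma = 0.5  # arbitrary value for this illustration


def rbf_manual(x, z, gamma):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))


k_near = rbf_manual(anchor[0], near[0], gamma)
k_far = rbf_manual(anchor[0], far[0], gamma)

# Nearby points get a value close to 1, distant points a value close to 0
print("Similarity to near point:", k_near)
print("Similarity to far point: ", k_far)

# The hand-written formula agrees with scikit-learn's implementation
sk_near = rbf_kernel(anchor, near, gamma=gamma)[0, 0]
sk_far = rbf_kernel(anchor, far, gamma=gamma)[0, 0]
print("Matches scikit-learn:", np.isclose(k_near, sk_near) and np.isclose(k_far, sk_far))
```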
     

In [35]:
classifiction_rbf = SVC(kernel='rbf')
classifiction_rbf.fit(X_train_scale,y_train)

In [36]:
print("SVC [rbf Kernel ] Score With Scaling  ::",classifiction_rbf.score(X_test_scale,y_test))

SVC [rbf Kernel ] Score With Scaling  :: 0.9824561403508771


#### SVC With ' linear ' Kernel
-   The Linear kernel is the simplest kernel. It is appropriate when the classes are already (approximately) linearly separable in the original features. It does not map the data to a higher dimension but works in the original feature space.
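
Because a linear kernel works in the original feature space, the fitted model exposes one weight per feature through `coef_`. The following sketch, which retrains a linear SVC on the same split under its own variable names (`linear_clf`, `weights`, and `top_idx` are illustrative), lists the five features with the largest absolute weights.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

bc = load_breast_cancer()
X_tr, X_te, y_tr, y_te = train_test_split(
    bc.data, bc.target, test_size=0.2, random_state=42
)

# Scale with training-set statistics, then fit a linear SVC
scaler = StandardScaler().fit(X_tr)
linear_clf = SVC(kernel="linear").fit(scaler.transform(X_tr), y_tr)

# coef_ is only available for the linear kernel; it has shape (1, n_features) here
weights = linear_clf.coef_[0]

# Indices of the five weights with the largest magnitude, largest first
top_idx = np.argsort(np.abs(weights))[::-1][:5]
for i in top_idx:
    print(f"{bc.feature_names[i]:<25} {weights[i]:+.3f}")
```

Since the features were standardized, the magnitudes are roughly comparable across features, although correlated features can share or trade off weight.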

In [38]:
classifiction_lin = SVC(kernel='linear')
classifiction_lin.fit(X_train_scale,y_train)

In [39]:
print("SVC [linear Kernel ] Score With Scaling  ::",classifiction_lin.score(X_test_scale,y_test))

SVC [linear Kernel ] Score With Scaling  :: 0.956140350877193


#### SVC With ' poly ' Kernel 
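- The Polynomial kernel represents the similarity of vectors in a feature space over polynomials of the original variables. It allows for curved decision boundaries by considering not only the given features but also their combinations. Note that the cell below sets degree=1, for which the polynomial kernel reduces to a scaled linear kernel; scikit-learn's default is degree=3.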
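
The polynomial kernel has the form (gamma * <x, z> + coef0) ** degree. The sketch below checks, on random data generated purely for illustration, that with `degree=1` and `coef0=0` (the `SVC` default for `coef0`) it is just a rescaled linear kernel, while `degree=2` adds the pairwise feature combinations.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))  # four random 3-dimensional points
gamma = 0.5                  # arbitrary value for this illustration

# degree=1, coef0=0: the kernel matrix equals gamma times the linear kernel matrix
poly_deg1 = polynomial_kernel(A, degree=1, gamma=gamma, coef0=0.0)
scaled_linear = gamma * linear_kernel(A)
print("degree=1 equals scaled linear:", np.allclose(poly_deg1, scaled_linear))

# degree=2: matches the formula (gamma * <x, z> + coef0) ** 2
poly_deg2 = polynomial_kernel(A, degree=2, gamma=gamma, coef0=1.0)
manual_deg2 = (gamma * (A @ A.T) + 1.0) ** 2
print("degree=2 matches formula:", np.allclose(poly_deg2, manual_deg2))
```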

In [41]:
classifiction_poly = SVC(kernel='poly',degree=1)
classifiction_poly.fit(X_train_scale,y_train)

In [42]:
print("SVC [poly Kernel ] Score With Scaling  ::",classifiction_poly.score(X_test_scale,y_test))

SVC [poly Kernel ] Score With Scaling  :: 0.9824561403508771
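
The three scores above come from a single 114-sample test split, so small differences between kernels may not be meaningful. One way to get a steadier comparison, offered here as a sketch and not as part of the original workflow, is 5-fold cross-validation with the scaler inside a pipeline. The `candidates` and `cv_means` names are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_arr, y_arr = load_breast_cancer(return_X_y=True)

# Same kernel settings as the cells above
candidates = {
    "rbf": SVC(kernel="rbf"),
    "linear": SVC(kernel="linear"),
    "poly (degree=1)": SVC(kernel="poly", degree=1),
}

cv_means = {}
for name, model in candidates.items():
    # The scaler is refit on each training fold, so validation folds stay unseen
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X_arr, y_arr, cv=5)
    cv_means[name] = scores.mean()
    print(f"{name:<16} mean accuracy = {scores.mean():.4f} (std = {scores.std():.4f})")
```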


#### Prediction  [SVC With 'rbf' Kernel]

In [44]:
y_test[0:10]

204    1
70     0
131    0
431    1
540    1
567    0
369    0
29     0
81     1
477    1
Name: Target, dtype: int32

In [45]:
y_pred = classifiction_rbf.predict(X_test_scale[0:10])
y_pred

array([1, 0, 0, 1, 1, 0, 0, 0, 1, 1])
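
Comparing ten labels by eye only goes so far. The sketch below, which retrains the same RBF model under its own variable names (`rbf_clf`, `y_pred_all`, and `cm` are illustrative), summarizes all test predictions in a confusion matrix and maps the first ten predicted codes back to class names.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

bc = load_breast_cancer()
X_tr, X_te, y_tr, y_te = train_test_split(
    bc.data, bc.target, test_size=0.2, random_state=42
)

scaler = StandardScaler().fit(X_tr)
rbf_clf = SVC(kernel="rbf").fit(scaler.transform(X_tr), y_tr)

y_pred_all = rbf_clf.predict(scaler.transform(X_te))

# Rows are true labels, columns are predicted labels, both ordered 0 (malignant), 1 (benign)
cm = confusion_matrix(y_te, y_pred_all)
print(cm)

# Translate integer predictions into readable class names
print(bc.target_names[y_pred_all[:10]])
```

The off-diagonal entries are the misclassifications; for a medical task, the count in the first row and second column (malignant predicted as benign) is usually the most important one to watch.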