# Support Vector Classification (SVC)

* A Support Vector Classification (SVC) model tries to find the line, function, or plane that separates two classes with the maximum margin.

* Maximizing the margin means making the distance between the two lines that define the margin as large as possible. In the ideal (hard-margin) case, no observation falls inside this margin.

* When SVC is used for classification, the classes can be separated not only by a linear function but also by a non-linear function or a plane.
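To make the margin idea concrete, here is a minimal sketch in plain Python. The weight vector `w`, the bias `b`, and the four toy points are hypothetical values chosen for illustration; they are not taken from the diabetes data. For a linear boundary `w . x + b = 0` whose margin lines are `w . x + b = -1` and `w . x + b = +1`, the margin width is `2 / ||w||`.

```python
import math

# Hypothetical linear decision boundary: w . x + b = 0
w = (1.0, 0.0)
b = -1.0

# Hypothetical toy points: (features, class label in {-1, +1})
points = [((0.0, 0.0), -1), ((0.0, 1.0), -1),
          ((2.0, 0.0), +1), ((2.0, 1.0), +1)]

def decision(x):
    """Signed score of x; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# The margin lines are decision(x) = -1 and decision(x) = +1,
# so the distance between them is 2 / ||w||.
margin_width = 2.0 / math.hypot(*w)

# A point lies strictly inside the margin when |decision(x)| < 1.
inside_margin = [x for x, _ in points if abs(decision(x)) < 1.0]

# Every point is on the correct side when label * decision(x) >= 1.
all_correct = all(label * decision(x) >= 1.0 for x, label in points)

print(margin_width, inside_margin, all_correct)
```

In this toy case all four points sit exactly on the margin lines, so they play the role of support vectors and nothing falls inside the margin.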



**Linear and Non-Linear Function**

![alt text](https://yavuz.github.io/assets/img/support_vector_regression2.jpg)

**Plane**


![alt text](https://miro.medium.com/max/1676/1*mCwnu5kXot6buL7jeIafqQ.png)

![alt text](https://i.stack.imgur.com/qonyo.png)

* We create two Support Vector Classification (SVC) models using different kernels:

  * 1-) Support Vector Classification model with a **linear kernel**

  * 2-) Support Vector Classification model with a **non-linear kernel ("rbf")**
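The two kernels differ in how they measure similarity between two observations. The sketch below writes out the standard definitions in plain Python: the linear kernel is the dot product, and the RBF kernel is `exp(-gamma * ||x - z||^2)`. The function names and sample points are illustrative assumptions; scikit-learn computes these values internally.

```python
import math

def linear_kernel(x, z):
    """Linear kernel: the plain dot product of x and z."""
    return sum(xi * zi for xi, zi in zip(x, z))

def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Hypothetical points, for illustration only
x = (1.0, 2.0)
z = (2.0, 0.0)

print(linear_kernel(x, z))          # 1*2 + 2*0
print(rbf_kernel(x, x, gamma=0.5))  # identical points give the maximum value
print(rbf_kernel(x, z, gamma=0.5))  # squared distance is 5
```

The RBF value depends only on the distance between the two points, which is what lets the resulting decision boundary bend around clusters instead of being a straight line.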

## 1-)SVC Model with linear kernel

 ### 1.1-)Model

In [1]:
import numpy as np
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

In [2]:
diabetes = pd.read_csv("diabetes.csv")
df = diabetes.copy()
df = df.dropna()
y = df["Outcome"]
X = df.drop(['Outcome'], axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=0.30, 
                                                    random_state=42)

In [3]:
svm_model = SVC(kernel = "linear")

In [4]:
svm_model.fit(X_train,y_train)
svm_model

SVC(kernel='linear')

### 1.2-)Prediction

In [5]:
y_pred = svm_model.predict(X_test)
y_pred[0:10]

array([0, 0, 0, 0, 1, 0, 0, 1, 1, 1], dtype=int64)

In [6]:
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

In [7]:
accuracy_score(y_test, y_pred) # before model tuning

0.7445887445887446

In [8]:
confusion_matrix(y_test, y_pred) # before model tuning

array([[122,  29],
       [ 30,  50]], dtype=int64)

In [9]:
print(classification_report(y_test, y_pred)) # before model tuning

              precision    recall  f1-score   support

           0       0.80      0.81      0.81       151
           1       0.63      0.62      0.63        80

    accuracy                           0.74       231
   macro avg       0.72      0.72      0.72       231
weighted avg       0.74      0.74      0.74       231
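The figures in the report can be recomputed by hand from the confusion matrix shown earlier. The sketch below does so in plain Python, assuming scikit-learn's layout in which rows are true classes and columns are predicted classes.

```python
# Confusion matrix reported above: rows = true class, columns = predicted class
cm = [[122, 29],
      [30, 50]]

tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

accuracy = (tn + tp) / total

# Class 1 (treated as the positive class)
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Class 0
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

print(f"accuracy={accuracy:.4f}")
print(f"class 0: precision={precision_0:.2f} recall={recall_0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
```

The row sums (151 and 80) are the `support` column of the report, and the recall of 50/80 for class 1 shows that 30 of the 80 positive cases in the test set are missed.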



### 1.3-)Model tuning

* In this section, we try to determine the optimum value of the **regularization parameter C** with the GridSearchCV method.

* GridSearchCV: Grid Search Cross-Validation method.

* Then we build the final model using the optimum value of **C**.

* **C** is a hyperparameter: it is not learned from the data, so we have to set it ourselves, and we want the value that performs best.

* Instead of relying on intuition to pick this value, we search for it with grid search.

* The default value of **C** is 1.0.
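Conceptually, a grid search over `C` is a loop: for each candidate value, estimate the accuracy with cross-validation, then keep the value with the best mean score. The sketch below spells this out with `cross_val_score`. It runs on synthetic data from `make_classification` and uses 5 folds to stay fast; both are assumptions for illustration, not the diabetes data or the 10-fold setup used in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: NOT the diabetes data set)
X_demo, y_demo = make_classification(n_samples=200, n_features=8,
                                     random_state=42)

candidate_Cs = [1, 2, 3, 4, 5]
mean_scores = {}
for C in candidate_Cs:
    model = SVC(kernel="linear", C=C)
    # 5-fold cross-validated accuracy for this value of C
    scores = cross_val_score(model, X_demo, y_demo, cv=5)
    mean_scores[C] = scores.mean()

# Keep the C with the highest mean cross-validated accuracy
best_C = max(mean_scores, key=mean_scores.get)
print(best_C, mean_scores[best_C])
```

`GridSearchCV` automates this loop and can run the fits in parallel.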


In [10]:
from sklearn.model_selection import GridSearchCV

In [11]:
svc_params = {"C": np.arange(1,6)}

In [12]:
svc = SVC(kernel = "linear")

In [13]:
svc_cv_model = GridSearchCV(svc,svc_params, 
                            cv = 10, 
                            n_jobs = -1, 
                            verbose = 2 )

In [14]:
svc_cv_model.fit(X_train, y_train)

Fitting 10 folds for each of 5 candidates, totalling 50 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  33 tasks      | elapsed:  2.0min
[Parallel(n_jobs=-1)]: Done  50 out of  50 | elapsed:  4.5min finished


GridSearchCV(cv=10, estimator=SVC(kernel='linear'), n_jobs=-1,
             param_grid={'C': array([1, 2, 3, 4, 5])}, verbose=2)

In [15]:
svc_cv_model.best_params_

{'C': 5}
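With its default `refit=True`, `GridSearchCV` retrains the best candidate on the full training set and exposes it as `best_estimator_`, so the fitted search object can also be used for prediction directly. A minimal sketch on synthetic stand-in data (an assumption, not the diabetes data) follows; the variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: NOT the diabetes data set)
X_demo, y_demo = make_classification(n_samples=200, n_features=8,
                                     random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          test_size=0.30, random_state=42)

search = GridSearchCV(SVC(kernel="linear"), {"C": [1, 2, 3, 4, 5]}, cv=5)
search.fit(X_tr, y_tr)

# With the default refit=True, the best candidate is refitted on all of
# X_tr / y_tr and exposed as best_estimator_.
best_model = search.best_estimator_
print(search.best_params_, search.best_score_)
print(best_model.score(X_te, y_te))  # accuracy on the held-out split
```

Rebuilding the model by hand from `best_params_`, as done in the next cells, gives an equivalent estimator and makes the chosen value explicit.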

#### 1.3.1-) Tuned Model

In [16]:
svc_tuned = SVC(kernel = "linear", C = 5)
svc_tuned

SVC(C=5, kernel='linear')

In [17]:
svc_tuned.fit(X_train, y_train)

SVC(C=5, kernel='linear')

In [18]:
y_pred1 = svc_tuned.predict(X_test)
y_pred1[0:10]

array([0, 0, 0, 0, 1, 0, 0, 1, 1, 1], dtype=int64)

In [19]:
accuracy_score(y_test, y_pred1)  # after model tuning

0.7445887445887446

In [20]:
confusion_matrix(y_test, y_pred1)  # after model tuning

array([[122,  29],
       [ 30,  50]], dtype=int64)

In [21]:
print(classification_report(y_test, y_pred1))  # after model tuning

              precision    recall  f1-score   support

           0       0.80      0.81      0.81       151
           1       0.63      0.62      0.63        80

    accuracy                           0.74       231
   macro avg       0.72      0.72      0.72       231
weighted avg       0.74      0.74      0.74       231



## 2-)SVC Model with non-linear kernel (rbf)

 ### 2.1-)Model

In [37]:
import numpy as np
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

In [38]:
diabetes = pd.read_csv("diabetes.csv")
df = diabetes.copy()
df = df.dropna()
y = df["Outcome"]
X = df.drop(['Outcome'], axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=0.30, 
                                                    random_state=42)

In [39]:
svc_model1 = SVC(kernel = "rbf")
svc_model1.fit(X_train,y_train)
svc_model1

SVC()

### 2.2-)Prediction

In [40]:
y_pred2 = svc_model1.predict(X_test)
y_pred2[0:10]

array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1], dtype=int64)

In [41]:
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

In [42]:
accuracy_score(y_test, y_pred2) # before model tuning

0.7359307359307359

In [43]:
confusion_matrix(y_test, y_pred2) # before model tuning

array([[131,  20],
       [ 41,  39]], dtype=int64)

In [44]:
print(classification_report(y_test, y_pred2)) # before model tuning

              precision    recall  f1-score   support

           0       0.76      0.87      0.81       151
           1       0.66      0.49      0.56        80

    accuracy                           0.74       231
   macro avg       0.71      0.68      0.69       231
weighted avg       0.73      0.74      0.72       231



### 2.3-)Model tuning

* In this section, we try to determine the optimum values of the **regularization parameter C** and the **kernel coefficient gamma** with the GridSearchCV method.

* GridSearchCV: Grid Search Cross-Validation method.

* Then we build the final model using the optimum values of **C** and **gamma**.

* **C** and **gamma** are hyperparameters: they are not learned from the data, so we have to set them ourselves, and we want the values that perform best.

* Instead of relying on intuition to pick these values, we search for them with grid search.
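To get a feel for what `gamma` controls, the sketch below evaluates the RBF kernel value `exp(-gamma * d^2)` for a fixed pair of points across the gamma values searched in this section. The squared distance of 1.0 and the helper name `rbf_similarity` are assumptions chosen for illustration.

```python
import math

def rbf_similarity(sq_dist, gamma):
    """RBF kernel value for two points whose squared distance is sq_dist."""
    return math.exp(-gamma * sq_dist)

# Gamma values from the grid searched in this section
gammas = [0.0001, 0.001, 0.1, 1, 5, 10, 50, 100]
sq_dist = 1.0  # hypothetical squared distance between two points

similarities = [rbf_similarity(sq_dist, g) for g in gammas]
for g, s in zip(gammas, similarities):
    print(f"gamma={g:<8} similarity={s:.6f}")
```

With a small gamma, even distant points still count as similar, which gives a smoother boundary. With a large gamma, similarity drops to almost zero unless the points nearly coincide, so each training point influences only its immediate neighbourhood.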




In [35]:
from sklearn.model_selection import GridSearchCV

In [36]:
svc_params = {"C": [0.0001, 0.001, 0.1, 1, 5, 10 ,50 ,100],
             "gamma": [0.0001, 0.001, 0.1, 1, 5, 10 ,50 ,100]}

In [46]:
svc_model2 = SVC(kernel = "rbf")

In [47]:
svc_cv_model = GridSearchCV(svc_model2, svc_params, 
                         cv = 10, 
                         n_jobs = -1,
                         verbose = 2)

In [48]:
svc_cv_model.fit(X_train, y_train)

Fitting 10 folds for each of 64 candidates, totalling 640 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=-1)]: Done  38 tasks      | elapsed:    2.2s
[Parallel(n_jobs=-1)]: Done 516 tasks      | elapsed:    7.5s
[Parallel(n_jobs=-1)]: Done 640 out of 640 | elapsed:    9.3s finished


GridSearchCV(cv=10, estimator=SVC(), n_jobs=-1,
             param_grid={'C': [0.0001, 0.001, 0.1, 1, 5, 10, 50, 100],
                         'gamma': [0.0001, 0.001, 0.1, 1, 5, 10, 50, 100]},
             verbose=2)
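The "64 candidates, totalling 640 fits" reported in the log follows directly from the size of the grid: every combination of `C` and `gamma` is one candidate, and each candidate is fitted once per fold. A small sketch with `itertools.product` reproduces the count; the variable names `candidates`, `n_candidates`, and `n_fits` are illustrative.

```python
from itertools import product

svc_params = {"C": [0.0001, 0.001, 0.1, 1, 5, 10, 50, 100],
              "gamma": [0.0001, 0.001, 0.1, 1, 5, 10, 50, 100]}
n_folds = 10  # cv = 10 in the GridSearchCV call

# Every (C, gamma) pair is one candidate model
candidates = [dict(zip(svc_params, values))
              for values in product(*svc_params.values())]

n_candidates = len(candidates)
n_fits = n_candidates * n_folds  # cross-validation fits only
print(n_candidates, n_fits)
```

Because the count is a product, adding values to either list multiplies the number of fits, so the cost of a grid search grows quickly with the number of hyperparameters.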

In [49]:
svc_cv_model.best_params_

{'C': 10, 'gamma': 0.0001}

#### 2.3.1-) Tuned Model

In [51]:
svc_tuned = SVC(kernel = "rbf",C = 10, gamma = 0.0001)

In [52]:
svc_tuned.fit(X_train, y_train)

SVC(C=10, gamma=0.0001)

In [55]:
y_pred3 = svc_tuned.predict(X_test)
y_pred3[0:10]

array([1, 0, 0, 0, 0, 1, 0, 1, 1, 1], dtype=int64)

In [56]:
accuracy_score(y_test, y_pred3)  # after model tuning

0.7359307359307359

In [57]:
confusion_matrix(y_test, y_pred3)  # after model tuning

array([[120,  31],
       [ 30,  50]], dtype=int64)

In [58]:
print(classification_report(y_test, y_pred3))  # after model tuning

              precision    recall  f1-score   support

           0       0.80      0.79      0.80       151
           1       0.62      0.62      0.62        80

    accuracy                           0.74       231
   macro avg       0.71      0.71      0.71       231
weighted avg       0.74      0.74      0.74       231
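To compare the models side by side, the sketch below recomputes accuracy and class-1 recall from the confusion matrices reported in this notebook. The matrices are copied from the outputs above; the dictionary keys are descriptive labels added for illustration.

```python
# Confusion matrices reported in this notebook (rows = true, cols = predicted)
results = {
    "linear (default and tuned, C=5)": [[122, 29], [30, 50]],
    "rbf (default)":                   [[131, 20], [41, 39]],
    "rbf (tuned, C=10, gamma=0.0001)": [[120, 31], [30, 50]],
}

summary = {}
for name, ((tn, fp), (fn, tp)) in results.items():
    accuracy = (tn + tp) / (tn + fp + fn + tp)
    recall_1 = tp / (tp + fn)  # share of true class-1 cases that were found
    summary[name] = (accuracy, recall_1)
    print(f"{name:<34} accuracy={accuracy:.4f} recall(class 1)={recall_1:.4f}")
```

Tuning the RBF model leaves test accuracy unchanged (170 of 231 correct in both cases) but shifts the errors: it finds 50 of the 80 positive cases instead of 39, at the cost of 11 more false positives.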

