## Support Vector Machine

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

### Data :

In [2]:
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

In [3]:
cancer.keys()

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

In [12]:
# print(cancer['DESCR'])

#### Set up a Data Frame :

In [14]:
df_feat = pd.DataFrame(cancer['data'], columns = cancer['feature_names'])

In [17]:
# df_feat.info()

df_feat.head(2)

| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst radius | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.8 | 1001.0 | 0.1184 | 0.2776 | 0.3001 | 0.1471 | 0.2419 | 0.07871 | ... | 25.38 | 17.33 | 184.6 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.1189 |
| 1 | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | ... | 24.99 | 23.41 | 158.8 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.186 | 0.275 | 0.08902 |


In [24]:
df_target = pd.DataFrame(cancer['target'],columns=['cancer'])

In [28]:
df_target.head(2)

| | cancer |
|---|---|
| 0 | 0 |
| 1 | 0 |


### Train_Test_Split :

In [30]:
X = df_feat

Y = cancer['target']

In [31]:
from sklearn.model_selection import train_test_split

X_train , X_test , Y_train , Y_test = train_test_split(X,Y, test_size=0.3, random_state = 101)

### Support Vector Classifier :

In [32]:
from sklearn.svm import SVC

model = SVC()

In [33]:
model.fit(X_train,Y_train)



SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='rbf', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)

In [34]:
predictions = model.predict(X_test)

#### Evaluation :

In [35]:
from sklearn.metrics import classification_report , confusion_matrix

print(classification_report(Y_test,predictions))

print('\n')

print(confusion_matrix(Y_test,predictions))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00        66
           1       0.61      1.00      0.76       105

    accuracy                           0.61       171
   macro avg       0.31      0.50      0.38       171
weighted avg       0.38      0.61      0.47       171



[[  0  66]
 [  0 105]]




###### The problem now is that the model predicts class 1 for every sample. We have to adjust the parameters, and it also helps to normalise the data before passing it into the SVM
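
The notebook does not say which normalisation to use, so the sketch below assumes `StandardScaler` (zero mean, unit variance per feature) wrapped in a pipeline with the default `SVC`. The names `scaled_model` and `scaled_accuracy` are illustrative only, and the resulting accuracy will depend on the library version.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data and split settings as the notebook.
cancer = load_breast_cancer()
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer['data'], cancer['target'], test_size=0.3, random_state=101)

# Assumption: StandardScaler as the normaliser. Inside the pipeline the
# scaler is fitted on the training data only and then applied to the test data.
scaled_model = make_pipeline(StandardScaler(), SVC())
scaled_model.fit(X_train, Y_train)

scaled_accuracy = scaled_model.score(X_test, Y_test)
print(scaled_accuracy)
```

Keeping the scaler inside the pipeline means the same object could later be handed to a grid search without leaking test-set statistics into training.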

_____________________________________________________________________________________________________________________________

###### Now we can search for the best parameters using a `Grid Search`

`Grid Search lets us find good parameter values, such as which C or gamma to use`

* Building a grid of parameter values and trying out every combination is called a grid search

### Grid Search :

* GridSearchCV takes a dictionary describing the parameters that should be tried when training a model

* The grid of parameters is defined as a dictionary where the keys are the parameter names and the values are lists of settings to be tested
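
To make "every combination" concrete, the sketch below expands the grid used in this notebook with `ParameterGrid` from scikit-learn. The variable name `combinations` is just for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Keys are parameter names, values are the settings to try.
param_grid = {
    'C': [0.1, 1, 10, 100, 1000],
    'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
    'kernel': ['rbf'],
}

# ParameterGrid expands the dictionary into one dict per combination.
combinations = list(ParameterGrid(param_grid))
print(len(combinations))   # 5 * 5 * 1 = 25 candidates
print(combinations[:3])
```

Each of these candidates is then fitted once per cross-validation fold, which is why the number of fits grows quickly as the grid gets larger.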

In [36]:
from sklearn.model_selection import GridSearchCV

###### C-value : 

             * controls the cost of misclassification on the training data
             * a large C-value gives low bias and high variance
             * low bias because a large C-value penalizes misclassified training points heavily
             * a smaller C-value penalizes misclassification less : `higher bias and lower variance`
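
One rough way to see this trade-off is to fit the same kernel with several C-values and compare training accuracy against held-out accuracy. This is only a sketch: `gamma=0.0001` is borrowed from the grid in this notebook, the three C-values are arbitrary picks, and the numbers will vary with the data and library version.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer['data'], cancer['target'], test_size=0.3, random_state=101)

# gamma is held fixed so that only C changes between fits.
accuracy_by_C = {}
for C in (0.1, 10, 1000):
    model = SVC(C=C, gamma=0.0001, kernel='rbf').fit(X_train, Y_train)
    accuracy_by_C[C] = (model.score(X_train, Y_train),   # training accuracy
                        model.score(X_test, Y_test))     # held-out accuracy

for C, (train_acc, test_acc) in accuracy_by_C.items():
    print(f"C={C}: train={train_acc:.3f}, test={test_acc:.3f}")
```

A training accuracy that keeps rising while the held-out accuracy stalls or drops would be a sign of the high-variance end of the trade-off.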


###### Gamma value :

             * It is the free parameter of the Gaussian radial basis function, which is why we use kernel = 'rbf'
             * Small gamma = Gaussian with a large variance, so each support vector has a wide-spread influence
             * Large gamma = Gaussian with a small variance : `low bias and high variance`
             * i.e., each support vector does not have a wide-spread influence
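
The reach of a support vector can be checked directly from the RBF kernel formula, `exp(-gamma * ||x - z||^2)`. The helper `rbf_kernel_value` and the sample points below are made up for illustration.

```python
import numpy as np

def rbf_kernel_value(x, z, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x - z||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

support_vector = [0.0, 0.0]
nearby_point = [1.0, 1.0]      # squared distance 2
distant_point = [5.0, 5.0]     # squared distance 50

for gamma in (0.0001, 0.01, 1):
    near = rbf_kernel_value(support_vector, nearby_point, gamma)
    far = rbf_kernel_value(support_vector, distant_point, gamma)
    print(f"gamma={gamma}: near={near:.4f}, far={far:.4f}")
```

With a small gamma the kernel value stays close to 1 even for the distant point, while with a large gamma it falls to almost 0, so only very close points are influenced.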

In [40]:
param_grid = { 'C' : [0.1 , 1 , 10 , 100 , 1000] , 'gamma' : [1 , 0.1 , 0.01 , 0.001 , 0.0001] , 'kernel' : ['rbf']}

`One of the great things about GridSearchCV is that it is a meta-estimator. It takes an estimator like SVC and creates a new estimator that behaves exactly the same, i.e. it can be fitted and used to predict like the original`

`refit=True (the default) and verbose set to some number other than 0; verbose controls the text output describing the process`

**With verbose=0 we cannot tell whether the model is making progress, which matters because a grid search can take a long time, especially when there are a lot of parameter combinations to check**

In [41]:
grid = GridSearchCV(SVC(), param_grid , verbose = 3)

In [42]:
grid.fit(X_train,Y_train)

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.0s remaining:    0.0s


Fitting 3 folds for each of 25 candidates, totalling 75 fits
[CV] C=0.1, gamma=1, kernel=rbf ......................................
[CV] .......... C=0.1, gamma=1, kernel=rbf, score=0.632, total=   0.0s
[CV] C=0.1, gamma=1, kernel=rbf ......................................
[CV] .......... C=0.1, gamma=1, kernel=rbf, score=0.632, total=   0.0s
[CV] C=0.1, gamma=1, kernel=rbf ......................................
[CV] .......... C=0.1, gamma=1, kernel=rbf, score=0.636, total=   0.0s
[CV] C=0.1, gamma=0.1, kernel=rbf ....................................
[CV] ........ C=0.1, gamma=0.1, kernel=rbf, score=0.632, total=   0.0s
[CV] C=0.1, gamma=0.1, kernel=rbf ....................................
[CV] ........ C=0.1, gamma=0.1, kernel=rbf, score=0.632, total=   0.0s
[CV] C=0.1, gamma=0.1, kernel=rbf ....................................
[CV] ........ C=0.1, gamma=0.1, kernel=rbf, score=0.636, total=   0.0s
[CV] C=0.1, gamma=0.01, kernel=rbf ...................................
[CV] ....... C=0

[Parallel(n_jobs=1)]: Done  75 out of  75 | elapsed:    0.7s finished


GridSearchCV(cv='warn', error_score='raise-deprecating',
             estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
                           decision_function_shape='ovr', degree=3,
                           gamma='auto_deprecated', kernel='rbf', max_iter=-1,
                           probability=False, random_state=None, shrinking=True,
                           tol=0.001, verbose=False),
             iid='warn', n_jobs=None,
             param_grid={'C': [0.1, 1, 10, 100, 1000],
                         'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
                         'kernel': ['rbf']},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
             scoring=None, verbose=3)

###### What fit does here is a bit more involved than usual. First it runs the same loop with cross-validation to find the best parameter combination. Once it has the best combination, it runs fit again on all the data passed to fit, without cross-validation, to build a single new model using the best parameter setting
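
The two steps can be sketched by hand. This is an illustration of the idea, not a reproduction of GridSearchCV's internals: the reduced grid `small_grid`, the use of `cross_val_score`, and `cv=3` (chosen to match the 3 folds in the log above) are all assumptions made to keep the example short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import ParameterGrid, cross_val_score, train_test_split
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer['data'], cancer['target'], test_size=0.3, random_state=101)

# A deliberately small grid so the sketch runs quickly.
small_grid = {'C': [1, 10], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']}

best_score, best_params = -1.0, None
for params in ParameterGrid(small_grid):
    # Step 1: cross-validate this combination on the training data only.
    scores = cross_val_score(SVC(**params), X_train, Y_train, cv=3)
    if scores.mean() > best_score:
        best_score, best_params = scores.mean(), params

# Step 2: refit one final model on all the training data with the winner.
final_model = SVC(**best_params).fit(X_train, Y_train)
print(best_params, round(best_score, 3))
```

The refit in step 2 is what lets the fitted grid object be used for predictions directly.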

#### Grabbing the best parameter setting :

In [45]:
grid.best_params_

{'C': 10, 'gamma': 0.0001, 'kernel': 'rbf'}

In [46]:
grid.best_estimator_       # the refitted best estimator (the best score is in grid.best_score_)

SVC(C=10, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma=0.0001, kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
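
The score itself is stored separately from the estimator. As a small sketch (again on a reduced grid, with `small_grid` and `results` as illustrative names), `best_score_` gives the mean cross-validated score of the winning combination and `cv_results_` holds the scores for every candidate.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer['data'], cancer['target'], test_size=0.3, random_state=101)

# Reduced grid so the example runs quickly.
small_grid = {'C': [1, 10], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']}
grid = GridSearchCV(SVC(), small_grid, cv=3)
grid.fit(X_train, Y_train)

print(grid.best_score_)   # mean cross-validated score of the best combination

# One row per candidate, ordered from best to worst rank.
results = pd.DataFrame(grid.cv_results_)
print(results[['param_C', 'param_gamma', 'mean_test_score', 'rank_test_score']]
      .sort_values('rank_test_score'))
```

Looking at the full table can show whether the best combination sits on the edge of the grid, which may suggest widening the search range.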

###### re-run predictions on this grid object :

In [47]:
grid_pred = grid.predict(X_test)

In [49]:
print(confusion_matrix(Y_test,grid_pred))

print('\n')

print(classification_report(Y_test,grid_pred))

[[ 60   6]
 [  3 102]]


              precision    recall  f1-score   support

           0       0.95      0.91      0.93        66
           1       0.94      0.97      0.96       105

    accuracy                           0.95       171
   macro avg       0.95      0.94      0.94       171
weighted avg       0.95      0.95      0.95       171
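
The figures in this report can be recomputed from the confusion matrix by hand, which is a useful sanity check. The sketch below hard-codes the matrix printed above; rows are the true classes and columns the predicted classes, following scikit-learn's convention.

```python
import numpy as np

# Confusion matrix printed above: rows are true classes, columns are predictions.
cm = np.array([[60, 6],
               [3, 102]])

correct = np.diag(cm)                    # correctly classified samples per class
precision = correct / cm.sum(axis=0)     # column sums = number predicted per class
recall = correct / cm.sum(axis=1)        # row sums = support per class
accuracy = correct.sum() / cm.sum()

print(np.round(precision, 2))            # [0.95 0.94]
print(np.round(recall, 2))               # [0.91 0.97]
print(round(float(accuracy), 2))         # 0.95
```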

