# <center> Using a Support Vector Classifier (SVC) <br> to Train a Classification Model <br> with the Purchasing Dataset</center>
<center> by: Nicole Woodland, P. Eng. for RoboGarden Inc. </center>

---

Support Vector Classifiers (SVCs) are one of the methods under the umbrella of Support Vector Machines (SVMs) available in the scikit-learn library. They use a kernel to implicitly map the data into a higher-dimensional space, where patterns that separate the classes can be found and then used to predict the labels of new data.
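To make the role of the kernel concrete, here is a minimal sketch on a synthetic dataset (two concentric rings from `make_circles`, not the purchasing data). The dataset and every parameter value are assumptions chosen for illustration. A straight line cannot separate the rings, so the linear kernel is expected to struggle where the RBF kernel does not.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration): an inner ring and an outer ring
X_demo, y_demo = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

# Same classifier, two different kernels
linear_acc = SVC(kernel="linear").fit(X_tr, y_tr).score(X_te, y_te)
rbf_acc = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"rbf kernel accuracy:    {rbf_acc:.2f}")
```

The only difference between the two models is the `kernel` argument, which is why the choice of kernel is usually the first thing to decide.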

In an SVC, the parameters that matter depend on the chosen kernel.

<br> Available Kernels:
- linear
- poly
- rbf (default)
- sigmoid
- precomputed

Relevant Parameters for Chosen Kernel:
<br>**All**:
- **C:** float, default=1.0. The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive.
- **tol:** float, default=1e-3. Tolerance for the stopping criterion: a measure of how stable is good enough.
- **max_iter:** int, default=-1. Hard limit on iterations within the solver, or -1 for no limit.

Linear:
- none

Poly:
- **degree:** int, default=3. The degree of the polynomial kernel function. Must be non-negative.
- **gamma:** the kernel coefficient. Choose one of:
  - 'scale' (default): uses 1 / (n_features * X.var())
  - 'auto': uses 1 / n_features
  - a float: must be non-negative


rbf (default):
- **gamma:** the kernel coefficient. Choose one of:
  - 'scale' (default): uses 1 / (n_features * X.var())
  - 'auto': uses 1 / n_features
  - a float: must be non-negative
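The two string options for `gamma` can be reproduced by hand. The sketch below uses a synthetic dataset from `make_classification` (an assumption for illustration, not the purchasing data) to compute both values and to check that passing the 'scale' value as a float gives the same model as passing the string.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration): 200 samples, 4 features
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=0)

n_features = X_demo.shape[1]
gamma_scale = 1.0 / (n_features * X_demo.var())  # what gamma='scale' resolves to
gamma_auto = 1.0 / n_features                    # what gamma='auto' resolves to
print(f"scale -> {gamma_scale:.4f}, auto -> {gamma_auto:.4f}")

# Passing the float by hand should give the same model as the string option
by_name = SVC(kernel="rbf", gamma="scale").fit(X_demo, y_demo)
by_value = SVC(kernel="rbf", gamma=gamma_scale).fit(X_demo, y_demo)
same = np.allclose(by_name.decision_function(X_demo),
                   by_value.decision_function(X_demo))
print("same decision values:", same)
```

Note that 'scale' depends on the variance of the data, so it changes when the features are standardized, while 'auto' depends only on the number of features.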

## Read data from csv file and perform preprocessing

In [1]:
import pandas as pd
df = pd.read_csv("Social_Network_Ads.csv")

pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", 10)

In [2]:
# Create X and y data
x_columns = 2
X = df.iloc[:, 0:x_columns].values
y = df.iloc[:, x_columns].values

In [3]:
df

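| | Age | EstimatedSalary | Purchased |
|---|---|---|---|
| 0 | 19 | 19000 | 0 |
| 1 | 35 | 20000 | 0 |
| 2 | 26 | 43000 | 0 |
| 3 | 27 | 57000 | 0 |
| 4 | 19 | 76000 | 0 |
| ... | ... | ... | ... |
| 395 | 46 | 41000 | 1 |
| 396 | 51 | 23000 | 1 |
| 397 | 50 | 20000 | 1 |
| 398 | 36 | 33000 | 0 |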
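The position-based slicing used to build `X` and `y` can also be written with column names, which keeps working if the column order in the file changes. This is a small sketch, assuming the columns are named `Age`, `EstimatedSalary` and `Purchased` as displayed; `df_demo` is a hypothetical stand-in holding only the first five rows shown in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for the real file: the first five rows displayed in the notebook
df_demo = pd.DataFrame({
    "Age": [19, 35, 26, 27, 19],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 76000],
    "Purchased": [0, 0, 0, 0, 0],
})

# Select features and label by name
feature_cols = ["Age", "EstimatedSalary"]
X_named = df_demo[feature_cols].values
y_named = df_demo["Purchased"].values

# Position-based equivalent, as used in the notebook
X_pos = df_demo.iloc[:, 0:2].values
y_pos = df_demo.iloc[:, 2].values

print(X_named.shape, y_named.shape)
```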


In [4]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)

## Create training and testing sets

In [5]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
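In the cells above, the scaler is fitted on all rows before the split, so the test rows influence the scaling statistics. One common alternative is to split first and wrap the scaler and the classifier in a pipeline, so the scaler only ever sees the training rows. The sketch below shows that pattern on synthetic data (an assumption for illustration); it is not the approach used in this notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, unscaled data (an assumption for illustration)
X_raw, y_raw = make_classification(n_samples=400, n_features=2, n_informative=2,
                                   n_redundant=0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_raw, y_raw, test_size=0.25, random_state=1)

# fit() fits the scaler on the training rows only, then trains the SVC on the scaled rows
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
pipe.fit(X_tr, y_tr)

# score() reuses the training-set scaling on the test rows before predicting
test_acc = pipe.score(X_te, y_te)
print(f"test accuracy: {test_acc:.2f}")

# The stored mean comes from the training rows, not from the full dataset
scaler_mean = pipe.named_steps["standardscaler"].mean_
print("scaler mean:", np.round(scaler_mean, 3))
```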

## Support Vector Machines (SVM) Classifier

In [6]:
# Import the SVC class from sklearn.svm (the name is capitalized because it is a class)
from sklearn.svm import SVC

In [7]:
# Create an SVC instance named 'svm_model' with the RBF kernel.
# max_iter is not required, but because the default is unlimited iterations,
# a cap stops a model that fails to converge from running indefinitely.
svm_model = SVC(max_iter=1000,
                kernel="rbf",
                C=1.0,
                gamma="scale")

In [8]:
# Train (fit) the model on the training set
svm_model.fit(X_train, y_train)

SVC(max_iter=1000)
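When the `max_iter` cap is reached before the solver converges, scikit-learn emits a `ConvergenceWarning` rather than raising an error, so the resulting model may be unreliable without the run failing. The sketch below shows one way to detect that. The synthetic data and the helper `hits_iteration_cap` are assumptions introduced for illustration.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.svm import SVC

# Synthetic, noisy data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=300, n_features=2, n_informative=2,
                                     n_redundant=0, flip_y=0.2, random_state=0)


def hits_iteration_cap(max_iter):
    """Fit an RBF SVC and report whether scikit-learn warned about stopping early."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        SVC(kernel="rbf", max_iter=max_iter).fit(X_demo, y_demo)
    return any(issubclass(w.category, ConvergenceWarning) for w in caught)


capped = hits_iteration_cap(1)     # deliberately far too small
uncapped = hits_iteration_cap(-1)  # no limit
print("max_iter=1 stopped early: ", capped)
print("max_iter=-1 stopped early:", uncapped)
```

If the warning appears with a realistic cap, scaling the features or raising `max_iter` are the usual first things to try.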

In [9]:
# Guess what, time to do the predictions!
predictions = svm_model.predict(X_test)
# score() returns the mean accuracy on the test set
svm_model.score(X_test, y_test)

0.88

In [10]:
# SVC
from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(y_test,predictions))
print(classification_report(y_test,predictions))

[[49  9]
 [ 3 39]]
              precision    recall  f1-score   support

           0       0.94      0.84      0.89        58
           1       0.81      0.93      0.87        42

    accuracy                           0.88       100
   macro avg       0.88      0.89      0.88       100
weighted avg       0.89      0.88      0.88       100
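The numbers in the classification report can be derived directly from the confusion matrix. The sketch below redoes the arithmetic for the matrix printed above, using the scikit-learn convention that rows are true labels and columns are predicted labels.

```python
import numpy as np

# Confusion matrix printed above: rows = true labels, columns = predicted labels
cm = np.array([[49, 9],
               [3, 39]])

accuracy = np.trace(cm) / cm.sum()
precision = np.diag(cm) / cm.sum(axis=0)  # correct / everything predicted as that class
recall = np.diag(cm) / cm.sum(axis=1)     # correct / everything truly in that class
f1 = 2 * precision * recall / (precision + recall)

print("accuracy: ", round(float(accuracy), 2))
print("precision:", np.round(precision, 2))
print("recall:   ", np.round(recall, 2))
print("f1-score: ", np.round(f1, 2))
```

For class 1, for example, precision is 39 / (39 + 9) and recall is 39 / (39 + 3), which round to the 0.81 and 0.93 shown in the report.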



In [11]:
# Linear model: same settings, but with a linear kernel (gamma is not used)
svm_model = SVC(max_iter=1000,
                kernel="linear",
                C=1.0)

svm_model.fit(X_train, y_train)
svm_model.score(X_test, y_test)

0.82
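One practical difference with the linear kernel is that the fitted model exposes its weights through `coef_` and `intercept_`, so the decision boundary can be read off directly. The sketch below illustrates this on synthetic data (an assumption for illustration, not the purchasing data).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-feature data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=200, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)

linear_svc = SVC(kernel="linear", C=1.0).fit(X_demo, y_demo)
w = linear_svc.coef_[0]       # one weight per feature
b = linear_svc.intercept_[0]  # bias term

# For a linear kernel the decision value is simply w . x + b
manual = X_demo @ w + b
matches = np.allclose(manual, linear_svc.decision_function(X_demo))
print("weights:", np.round(w, 3), "intercept:", round(float(b), 3))
print("manual decision values match:", matches)

# Non-linear kernels have no such weight vector
rbf_svc = SVC(kernel="rbf").fit(X_demo, y_demo)
rbf_has_coef = hasattr(rbf_svc, "coef_")
print("rbf model has coef_:", rbf_has_coef)
```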

### Optimize an SVC Model using GridSearchCV

In [12]:
# Candidate values to try; SVC's default kernel ('rbf') is used for every combination
param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001]}

In [13]:
# Import GridSearchCV from model_selection (mind the capitalization: GridSearchCV)
from sklearn.model_selection import GridSearchCV

In [14]:
# refit=True (the default) retrains the best model on the whole training set
# and stores it in 'grid' for further use.
# 'grid' is an instance of GridSearchCV.
grid = GridSearchCV(SVC(), param_grid, verbose=2, refit=True)

In [15]:
# Fit and cross-validate a model for every combination of parameters in the grid
grid.fit(X_train,y_train)

Fitting 5 folds for each of 25 candidates, totalling 125 fits
[CV] END .....................................C=0.1, gamma=1; total time=   0.0s
[CV] END .....................................C=0.1, gamma=1; total time=   0.0s
[CV] END .....................................C=0.1, gamma=1; total time=   0.0s
[CV] END .....................................C=0.1, gamma=1; total time=   0.0s
[CV] END .....................................C=0.1, gamma=1; total time=   0.0s
[CV] END ...................................C=0.1, gamma=0.1; total time=   0.0s
[CV] END ...................................C=0.1, gamma=0.1; total time=   0.0s
[CV] END ...................................C=0.1, gamma=0.1; total time=   0.0s
[CV] END ...................................C=0.1, gamma=0.1; total time=   0.0s
[CV] END ...................................C=0.1, gamma=0.1; total time=   0.0s
[CV] END ..................................C=0.1, gamma=0.01; total time=   0.0s
[... output truncated ...]

GridSearchCV(estimator=SVC(),
             param_grid={'C': [0.1, 1, 10, 100, 1000],
                         'gamma': [1, 0.1, 0.01, 0.001, 0.0001]},
             verbose=2)
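Beyond the single best combination, the fitted search keeps the score of every candidate in `cv_results_`, which is often easier to read as a table than the verbose log. The sketch below shows this on synthetic data with a smaller grid; both are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data and a reduced grid (assumptions for illustration)
X_demo, y_demo = make_classification(n_samples=200, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)
small_grid = {"C": [0.1, 1, 10], "gamma": [1, 0.1]}

search = GridSearchCV(SVC(), small_grid, cv=5).fit(X_demo, y_demo)

# One row per parameter combination, best-ranked first
columns = ["param_C", "param_gamma", "mean_test_score", "std_test_score", "rank_test_score"]
results = pd.DataFrame(search.cv_results_)[columns].sort_values("rank_test_score")
print(results)
```

Looking at the whole table shows whether the best combination is clearly ahead or whether several candidates score about the same.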

In [16]:
print('The best parameters are %s with a score of %0.2f' 
      % (grid.best_params_, grid.best_score_))

The best parameters are {'C': 1000, 'gamma': 0.01} with a score of 0.92


In [17]:
grid.best_params_

{'C': 1000, 'gamma': 0.01}

In [18]:
# 'grid' now holds the refitted best model, so we can predict with it directly:
grid_predictions = grid.predict(X_test)

In [19]:
print(confusion_matrix(y_test,grid_predictions))
print(classification_report(y_test,grid_predictions))

[[49  9]
 [ 3 39]]
              precision    recall  f1-score   support

           0       0.94      0.84      0.89        58
           1       0.81      0.93      0.87        42

    accuracy                           0.88       100
   macro avg       0.88      0.89      0.88       100
weighted avg       0.89      0.88      0.88       100
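The 0.92 reported by `best_score_` and the 0.88 accuracy in the report above measure different things. `best_score_` is the mean cross-validated accuracy over folds of the training set, while the report is computed on the held-out test set. The sketch below shows both quantities side by side on synthetic data (an assumption for illustration), with a reduced grid.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=400, n_features=2, n_informative=2,
                                     n_redundant=0, flip_y=0.1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=1)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [1, 0.1, 0.01]}, cv=5)
search.fit(X_tr, y_tr)

cv_score = search.best_score_          # mean accuracy over the validation folds of X_tr
test_score = search.score(X_te, y_te)  # accuracy of the refitted best model on X_te

print(f"cross-validated score: {cv_score:.2f}")
print(f"held-out test score:   {test_score:.2f}")
```

The test score is the figure to compare against other models evaluated on the same test set.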



## Naive Bayes Classifier

From scikit-learn: these classifiers are based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.
That is, within a class, the value of one feature is assumed to be unrelated to the value of any other feature.

In spite of their apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many real-world situations, famously document classification and spam filtering. They require a small amount of training data to estimate the necessary parameters.

Naive Bayes does well at guessing the label (it is a good classifier), but it is known to be a poor probability estimator, so don't rely on the specific probabilities it outputs.

In [20]:
from sklearn.naive_bayes import GaussianNB
model_NB = GaussianNB().fit(X_train, y_train)

y_pred_NB = model_NB.predict(X_test)
print(confusion_matrix(y_test,y_pred_NB))
print(classification_report(y_test,y_pred_NB))

[[51  7]
 [ 7 35]]
              precision    recall  f1-score   support

           0       0.88      0.88      0.88        58
           1       0.83      0.83      0.83        42

    accuracy                           0.86       100
   macro avg       0.86      0.86      0.86       100
weighted avg       0.86      0.86      0.86       100
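The distinction between the predicted label and the predicted probability can be seen by calling `predict_proba` next to `predict`. The sketch below uses synthetic data (an assumption for illustration). The label is the class with the highest probability, while the probability values themselves are the part to treat with caution.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=200, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=0)

nb = GaussianNB().fit(X_demo, y_demo)

sample = X_demo[:5]
proba = nb.predict_proba(sample)  # one column per class, each row sums to 1
labels = nb.predict(sample)       # the class with the highest probability

# Recover the labels from the probabilities to show the relationship
labels_from_proba = nb.classes_[proba.argmax(axis=1)]

print(np.round(proba, 3))
print("predicted labels:", labels)
```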

