# MODEL_SVM

**Support Vector Machines (SVMs)** are supervised learning models, with associated learning algorithms, that analyze data for classification and regression.

In addition to performing linear classification, SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.
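As a small illustration of the kernel trick, the sketch below checks numerically that a degree-2 polynomial kernel on 2-D inputs gives the same value as an ordinary dot product in an explicit 3-D feature space. The helper names `phi` and `poly2_kernel` and the two sample vectors are illustrative assumptions, not part of scikit-learn or of this dataset.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the homogeneous degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def poly2_kernel(x, z):
    """Kernel value computed in the original 2-D space: (x . z)^2."""
    return float(np.dot(x, z)) ** 2

# Two made-up input vectors.
x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

explicit = float(np.dot(phi(x), phi(z)))  # dot product after mapping to 3-D
implicit = poly2_kernel(x, z)             # same quantity without building phi
print(explicit, implicit)
```

Both numbers agree up to floating-point rounding, which is why an SVM can work in the high-dimensional space while only ever evaluating the kernel on the original inputs.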

**Index**

* 1. Import Libraries & Data

**Import Libraries**

In [1]:
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.decomposition import PCA
from imblearn.under_sampling import NearMiss, RandomUnderSampler
from imblearn.over_sampling import SMOTE
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn import metrics

**Import Data**

In [2]:
churn_norm = pd.read_csv("Churn_Norm.csv")

In [3]:
churn_norm.head()

Unnamed: 0,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Gender_Int,Geography_Germany,Geography_Spain,Balance_Int
0,0.538,0.324324,0.2,0.0,0.0,1.0,1.0,0.506735,1.0,1.0,0.0,0.0,0.0
1,0.516,0.310811,0.1,0.334031,0.0,0.0,1.0,0.562709,0.0,1.0,0.0,1.0,1.0
2,0.304,0.324324,0.8,0.636357,0.666667,1.0,0.0,0.569654,1.0,1.0,0.0,0.0,1.0
3,0.698,0.283784,0.1,0.0,0.333333,0.0,0.0,0.46912,0.0,1.0,0.0,0.0,0.0
4,1.0,0.337838,0.2,0.500246,0.0,1.0,1.0,0.3954,0.0,1.0,0.0,1.0,1.0


In [4]:
# place target column at the end of the dataset
churn_norm= churn_norm[["CreditScore","Age","Tenure","Balance","NumOfProducts","HasCrCard","IsActiveMember","EstimatedSalary","Gender_Int","Geography_Germany","Geography_Spain","Balance_Int","Exited"]]

**Define X,y**

In [5]:
# Define X (selects every row and every column except the last column)
X = churn_norm.iloc[:,:-1]

# Define target/labels  
y = churn_norm['Exited']

**Split the data into training and testing sets**

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.1, random_state=1, stratify=y)
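The effect of `stratify=y` can be seen on synthetic labels. The sketch below uses made-up data (800 negatives, 200 positives, one dummy feature), not the churn dataset, and the `_demo` names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 800 negatives, 200 positives.
y_demo = np.array([0] * 800 + [1] * 200)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # dummy single feature

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.1, random_state=1, stratify=y_demo
)

# With stratification, both splits keep the overall share of positives.
print("train positive rate:", y_tr.mean())
print("test positive rate:", y_te.mean())
```

Without stratification, a 10% test split of imbalanced data can end up with noticeably more or fewer positives than the full dataset, which makes per-class metrics harder to compare across runs.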

## SVC_without_LDA

**Linear SVC Model**

In [7]:
svc_linear = SVC()

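We use **GridSearchCV** to select the hyperparameter values that maximize the model's cross-validated accuracy. Grid search does this by fitting every combination of the listed parameter values and keeping the best one. Note that `gamma` is ignored by the linear kernel, so the three `gamma` values below yield identical scores for each `C`.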
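To make the exhaustive search concrete, this sketch enumerates a parameter grid with the standard library only and counts how many model fits cross-validation would need. The grid values mirror the ones used in this notebook; `n_folds = 5` is an assumption chosen to match the "Fitting 5 folds" line in the search log.

```python
from itertools import product

param_grid = {"C": [1, 10], "gamma": [1, 0.1, 0.01], "kernel": ["linear"]}
n_folds = 5  # assumption: 5-fold cross-validation

# Every combination of parameter values is one candidate model.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

for candidate in candidates:
    print(candidate)

# Each candidate is fitted once per fold.
total_fits = len(candidates) * n_folds
print(len(candidates), "candidates x", n_folds, "folds =", total_fits, "fits")
```

The count grows multiplicatively with every added parameter value, which is why the grids in this notebook are kept small.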

In [8]:
param_grid = {'C': [1, 10],  
              'gamma': [1, 0.1, 0.01], 
              'kernel': ['linear']}

In [9]:
svc_linear_sel=GridSearchCV(svc_linear, param_grid, refit=True, verbose=3)

In [10]:
svc_linear_sel.fit(X_train, y_train)

Fitting 5 folds for each of 6 candidates, totalling 30 fits
[CV] C=1, gamma=1, kernel=linear .....................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ......... C=1, gamma=1, kernel=linear, score=0.797, total=   1.4s
[CV] C=1, gamma=1, kernel=linear .....................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    1.3s remaining:    0.0s


[CV] ......... C=1, gamma=1, kernel=linear, score=0.797, total=   1.9s
[CV] C=1, gamma=1, kernel=linear .....................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    3.2s remaining:    0.0s


[CV] ......... C=1, gamma=1, kernel=linear, score=0.796, total=   1.7s
[CV] C=1, gamma=1, kernel=linear .....................................
[CV] ......... C=1, gamma=1, kernel=linear, score=0.796, total=   1.5s
[CV] C=1, gamma=1, kernel=linear .....................................
[CV] ......... C=1, gamma=1, kernel=linear, score=0.796, total=   1.5s
[CV] C=1, gamma=0.1, kernel=linear ...................................
[CV] ....... C=1, gamma=0.1, kernel=linear, score=0.797, total=   1.9s
[CV] C=1, gamma=0.1, kernel=linear ...................................
[CV] ....... C=1, gamma=0.1, kernel=linear, score=0.797, total=   1.8s
[CV] C=1, gamma=0.1, kernel=linear ...................................
[CV] ....... C=1, gamma=0.1, kernel=linear, score=0.796, total=   1.6s
[CV] C=1, gamma=0.1, kernel=linear ...................................
[CV] ....... C=1, gamma=0.1, kernel=linear, score=0.796, total=   1.6s
[CV] C=1, gamma=0.1, kernel=linear ...................................
[CV] .

[Parallel(n_jobs=1)]: Done  30 out of  30 | elapsed:   53.0s finished


GridSearchCV(estimator=SVC(),
             param_grid={'C': [1, 10], 'gamma': [1, 0.1, 0.01],
                         'kernel': ['linear']},
             verbose=3)

In [11]:
svc_linear_sel.best_estimator_

SVC(C=1, gamma=1, kernel='linear')

In [12]:
svc_linear_sel.best_estimator_.score(X_test, y_test)

0.796

In [13]:
svc_linear_sel.best_params_

{'C': 1, 'gamma': 1, 'kernel': 'linear'}

**Define model with its best params**

In [14]:
svc_linear = SVC(random_state=42,C=1, gamma=1, kernel="linear")
svc_linear.fit(X_train, y_train)

SVC(C=1, gamma=1, kernel='linear', random_state=42)

**Evaluation**

In [15]:
svc_linear_y_pred = svc_linear.predict(X_test)
# Accuracy on the training set, as a percentage (test accuracy is printed in the next cell)
svc_linear_score = round(svc_linear.score(X_train, y_train) * 100, 2)
print(svc_linear_score, "%")

79.63 %


In [16]:
print("Accuracy:", metrics.accuracy_score(y_test, svc_linear_y_pred))

Accuracy: 0.796


In [17]:
print(confusion_matrix(y_test, svc_linear_y_pred))

[[796   0]
 [204   0]]


In [18]:
print(classification_report(y_test, svc_linear_y_pred))

              precision    recall  f1-score   support

         0.0       0.80      1.00      0.89       796
         1.0       0.00      0.00      0.00       204

    accuracy                           0.80      1000
   macro avg       0.40      0.50      0.44      1000
weighted avg       0.63      0.80      0.71      1000





**RBF SVC**

In [24]:
svc_rbf = SVC()

In [31]:
param_grid = {'C': [1, 10],  
              'gamma': [1, 0.1, 0.01], 
              'kernel': ['rbf']}

In [32]:
svc_rbf_sel=GridSearchCV(svc_rbf, param_grid, refit=True, verbose=3)

In [33]:
svc_rbf_sel.fit(X_train, y_train)

Fitting 5 folds for each of 6 candidates, totalling 30 fits
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.843, total=   3.0s
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    2.9s remaining:    0.0s


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.838, total=   4.1s
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    7.0s remaining:    0.0s


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.833, total=   3.7s
[CV] C=1, gamma=1, kernel=rbf ........................................
[CV] ............ C=1, gamma=1, kernel=rbf, score=0.842, total=   2.6s
[CV] C=1, gamma=1, kernel=rbf ........................................
[CV] ............ C=1, gamma=1, kernel=rbf, score=0.841, total=   2.4s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.798, total=   2.2s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.796, total=   2.3s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.796, total=   2.1s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.797, total=   2.2s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .

[Parallel(n_jobs=1)]: Done  30 out of  30 | elapsed:  1.4min finished


GridSearchCV(estimator=SVC(),
             param_grid={'C': [1, 10], 'gamma': [1, 0.1, 0.01],
                         'kernel': ['rbf']},
             verbose=3)

In [34]:
svc_rbf_sel.best_estimator_

SVC(C=10, gamma=1)

In [35]:
svc_rbf_sel.best_estimator_.score(X_test, y_test)

0.856

In [36]:
svc_rbf_sel.best_params_

{'C': 10, 'gamma': 1, 'kernel': 'rbf'}

In [56]:
svc_rbf = SVC(random_state=42,C=10, gamma=1, kernel="rbf")
svc_rbf.fit(X_train, y_train)

SVC(C=10, gamma=1, random_state=42)

In [57]:
svc_rbf_y_pred = svc_rbf.predict(X_test)
# Accuracy on the training set, as a percentage (test accuracy is printed in the next cell)
svc_rbf_score = round(svc_rbf.score(X_train, y_train) * 100, 2)
print(svc_rbf_score, "%")

88.04 %


In [58]:
print("Accuracy:", metrics.accuracy_score(y_test, svc_rbf_y_pred))

Accuracy: 0.856


In [59]:
print(confusion_matrix(y_test, svc_rbf_y_pred))

[[767  29]
 [115  89]]


In [60]:
print(classification_report(y_test, svc_rbf_y_pred))

              precision    recall  f1-score   support

         0.0       0.87      0.96      0.91       796
         1.0       0.75      0.44      0.55       204

    accuracy                           0.86      1000
   macro avg       0.81      0.70      0.73      1000
weighted avg       0.85      0.86      0.84      1000



**Poly SVC**

In [42]:
svc_poly = SVC()

In [43]:
param_grid = {'C': [1, 10],  
              'gamma': [1, 0.1, 0.01], 
              'kernel': ['poly']}

In [46]:
svc_poly_sel=GridSearchCV(svc_poly, param_grid, refit=True, verbose=3)

In [47]:
svc_poly_sel.fit(X_train, y_train)

Fitting 5 folds for each of 6 candidates, totalling 30 fits
[CV] C=1, gamma=1, kernel=poly .......................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ........... C=1, gamma=1, kernel=poly, score=0.861, total=   7.0s
[CV] C=1, gamma=1, kernel=poly .......................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    6.9s remaining:    0.0s


[CV] ........... C=1, gamma=1, kernel=poly, score=0.859, total=   6.2s
[CV] C=1, gamma=1, kernel=poly .......................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:   13.1s remaining:    0.0s


[CV] ........... C=1, gamma=1, kernel=poly, score=0.844, total=   6.8s
[CV] C=1, gamma=1, kernel=poly .......................................
[CV] ........... C=1, gamma=1, kernel=poly, score=0.854, total=   5.8s
[CV] C=1, gamma=1, kernel=poly .......................................
[CV] ........... C=1, gamma=1, kernel=poly, score=0.854, total=   6.0s
[CV] C=1, gamma=0.1, kernel=poly .....................................
[CV] ......... C=1, gamma=0.1, kernel=poly, score=0.797, total=   1.6s
[CV] C=1, gamma=0.1, kernel=poly .....................................
[CV] ......... C=1, gamma=0.1, kernel=poly, score=0.797, total=   1.6s
[CV] C=1, gamma=0.1, kernel=poly .....................................
[CV] ......... C=1, gamma=0.1, kernel=poly, score=0.796, total=   1.7s
[CV] C=1, gamma=0.1, kernel=poly .....................................
[CV] ......... C=1, gamma=0.1, kernel=poly, score=0.796, total=   1.6s
[CV] C=1, gamma=0.1, kernel=poly .....................................
[CV] .

[Parallel(n_jobs=1)]: Done  30 out of  30 | elapsed:  5.2min finished


GridSearchCV(estimator=SVC(),
             param_grid={'C': [1, 10], 'gamma': [1, 0.1, 0.01],
                         'kernel': ['poly']},
             verbose=3)

In [48]:
svc_poly_sel.best_estimator_

SVC(C=10, gamma=1, kernel='poly')

In [49]:
svc_poly_sel.best_estimator_.score(X_test, y_test)

0.86

In [50]:
svc_poly_sel.best_params_

{'C': 10, 'gamma': 1, 'kernel': 'poly'}

In [51]:
svc_poly = SVC(random_state=42,C=10, gamma=1, kernel="poly")
svc_poly.fit(X_train, y_train)

SVC(C=10, gamma=1, kernel='poly', random_state=42)

In [52]:
svc_poly_y_pred = svc_poly.predict(X_test)
# Accuracy on the training set, as a percentage (test accuracy is printed in the next cell)
svc_poly_score = round(svc_poly.score(X_train, y_train) * 100, 2)
print(svc_poly_score, "%")


86.87 %


In [53]:
print("Accuracy:", metrics.accuracy_score(y_test, svc_poly_y_pred))

Accuracy: 0.86


In [54]:
print(confusion_matrix(y_test, svc_poly_y_pred))

[[774  22]
 [118  86]]


In [55]:
print(classification_report(y_test, svc_poly_y_pred))

              precision    recall  f1-score   support

         0.0       0.87      0.97      0.92       796
         1.0       0.80      0.42      0.55       204

    accuracy                           0.86      1000
   macro avg       0.83      0.70      0.73      1000
weighted avg       0.85      0.86      0.84      1000



**To sum up:**

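The best model is RBF because it gives the highest recall for customers who churn (class 1.0): 0.44, compared with 0.42 for the polynomial kernel and 0.00 for the linear kernel, which predicts that no customer churns.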
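The recall comparison can be checked directly from the three confusion matrices printed above. The sketch below recomputes accuracy, recall, and precision for the churn class; the helper name `churn_metrics` is hypothetical, and reporting precision as 0.0 when no positives are predicted is a convention chosen here for the illustration.

```python
# Confusion matrices copied from the outputs above: rows are true classes
# (0 = stayed, 1 = exited), columns are predicted classes.
confusion = {
    "linear": [[796, 0], [204, 0]],
    "rbf": [[767, 29], [115, 89]],
    "poly": [[774, 22], [118, 86]],
}

def churn_metrics(cm):
    """Return (accuracy, recall, precision) for class 1 from a 2x2 matrix."""
    (tn, fp), (fn, tp) = cm
    accuracy = (tn + tp) / (tn + fp + fn + tp)
    recall = tp / (tp + fn)
    # Precision is undefined when nothing is predicted as class 1; report 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision

results = {name: churn_metrics(cm) for name, cm in confusion.items()}
for name, (acc, rec, prec) in results.items():
    print(f"{name:>6}: accuracy={acc:.3f} recall={rec:.2f} precision={prec:.2f}")
```

The linear model reaches 0.796 accuracy only by predicting the majority class for every customer, which is why accuracy alone is a misleading criterion on this imbalanced dataset.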

**Next, we check whether undersampling or oversampling improves the RBF results**

**RandomUnderSampler**

In [61]:
random_sample = RandomUnderSampler()

X_rus, y_rus = random_sample.fit_resample(X_train, y_train)

In [63]:
X_train_rus, X_test_rus, y_train_rus, y_test_rus = train_test_split(X_rus, y_rus, random_state=42, test_size=0.1)

In [65]:
y_pred_rus = svc_rbf.fit(X_train_rus, y_train_rus).predict(X_test_rus)

In [66]:
print(confusion_matrix(y_test_rus,y_pred_rus))
print(classification_report(y_test_rus, y_pred_rus))

[[152  54]
 [ 41 120]]
              precision    recall  f1-score   support

         0.0       0.79      0.74      0.76       206
         1.0       0.69      0.75      0.72       161

    accuracy                           0.74       367
   macro avg       0.74      0.74      0.74       367
weighted avg       0.74      0.74      0.74       367



**NearMiss**

In [67]:
nearmiss = NearMiss()

X_nearmiss, y_nearmiss = nearmiss.fit_resample(X_train, y_train)

In [69]:
X_train_nr, X_test_nr, y_train_nr, y_test_nr = train_test_split(X_nearmiss, y_nearmiss,random_state=42, test_size=0.2)

In [70]:
y_pred_nearmiss = svc_rbf.fit(X_train_nr, y_train_nr).predict(X_test_nr)

In [71]:
print(confusion_matrix(y_test_nr,y_pred_nearmiss))
print(classification_report(y_test_nr, y_pred_nearmiss))

[[303  86]
 [129 216]]
              precision    recall  f1-score   support

         0.0       0.70      0.78      0.74       389
         1.0       0.72      0.63      0.67       345

    accuracy                           0.71       734
   macro avg       0.71      0.70      0.70       734
weighted avg       0.71      0.71      0.71       734



**Oversampling - SMOTE**

In [72]:
oversample = SMOTE()
# Use separate names so the original X and y are not overwritten
X_smote, y_smote = oversample.fit_resample(X_train, y_train)

In [73]:
X_train_ov, X_test_ov, y_train_ov, y_test_ov = train_test_split(X_smote, y_smote, random_state=42, test_size=0.1)

In [75]:
y_pred_ov = svc_rbf.fit(X_train_ov, y_train_ov).predict(X_test_ov)

In [76]:
print(confusion_matrix(y_test_ov,y_pred_ov))
print(classification_report(y_test_ov, y_pred_ov))

[[567 139]
 [ 98 630]]
              precision    recall  f1-score   support

         0.0       0.85      0.80      0.83       706
         1.0       0.82      0.87      0.84       728

    accuracy                           0.83      1434
   macro avg       0.84      0.83      0.83      1434
weighted avg       0.84      0.83      0.83      1434



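**With SMOTE, recall and F1-score exceed 0.80 for both classes.** Note that this hold-out set was split from the SMOTE-resampled data, so it is balanced and contains synthetic samples, unlike the original test set used for the earlier RBF results.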
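For intuition about what SMOTE adds to the data, the following simplified sketch generates one synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. It is a toy re-implementation of the core idea, not imblearn's `SMOTE` class; the function name `smote_like_sample` and the four toy points are hypothetical.

```python
import numpy as np

def smote_like_sample(X_minority, index, k, rng):
    """Create one synthetic point between X_minority[index] and a randomly
    chosen one of its k nearest minority-class neighbours."""
    x = X_minority[index]
    distances = np.linalg.norm(X_minority - x, axis=1)
    # The point itself has distance 0 and sorts first, so skip position 0.
    neighbour_ids = np.argsort(distances)[1:k + 1]
    neighbour = X_minority[rng.choice(neighbour_ids)]
    lam = rng.random()  # interpolation factor in [0, 1)
    return x + lam * (neighbour - x)

rng = np.random.default_rng(0)
# Made-up minority-class points in 2-D.
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
synthetic = smote_like_sample(X_minority, index=0, k=2, rng=rng)
print(synthetic)
```

Because synthetic points lie on segments between real minority samples, a hold-out set drawn after resampling can contain points that are very close to training points, which tends to make the scores look better than they would on untouched data.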

## SVC_with_LDA

In [77]:
churn_norm_lda = pd.read_csv("Churn_norm_LDA.csv")

In [78]:
churn_norm_lda.head()

Unnamed: 0,CreditScore,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited,Gender_Int,Geography_Germany,Geography_Spain,Balance_Int,PC1
0,0.538,0.324324,0.2,0.0,0.0,1.0,1.0,0.506735,1.0,1.0,0.0,0.0,0.0,-0.244017
1,0.516,0.310811,0.1,0.334031,0.0,0.0,1.0,0.562709,0.0,1.0,0.0,1.0,1.0,-0.036741
2,0.304,0.324324,0.8,0.636357,0.666667,1.0,0.0,0.569654,1.0,1.0,0.0,0.0,1.0,0.868267
3,0.698,0.283784,0.1,0.0,0.333333,0.0,0.0,0.46912,0.0,1.0,0.0,0.0,0.0,0.388012
4,1.0,0.337838,0.2,0.500246,0.0,1.0,1.0,0.3954,0.0,1.0,0.0,1.0,1.0,-0.021615


In [79]:
# place target column at the end of the dataset
churn_norm_lda= churn_norm_lda[["CreditScore","Age","Tenure","Balance","NumOfProducts","HasCrCard",
                                "IsActiveMember","EstimatedSalary","Gender_Int","Geography_Germany",
                                "Geography_Spain","Balance_Int","PC1","Exited"]]

**Define X,y**

In [80]:
X = churn_norm_lda.iloc[:,:-1]
y = churn_norm_lda['Exited']

**Split the data into training and testing sets**

In [81]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=1, stratify=y)

In [82]:
svc_rbf = SVC()

In [83]:
param_grid = {'C': [1, 10],  
              'gamma': [1, 0.1, 0.01], 
              'kernel': ['rbf']}

In [84]:
svc_rbf_sel=GridSearchCV(svc_rbf, param_grid, refit=True, verbose=3)

In [85]:
svc_rbf_sel.fit(X_train, y_train)

Fitting 5 folds for each of 6 candidates, totalling 30 fits
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.841, total=   2.9s
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    2.8s remaining:    0.0s


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.846, total=   2.9s
[CV] C=1, gamma=1, kernel=rbf ........................................


[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    5.7s remaining:    0.0s


[CV] ............ C=1, gamma=1, kernel=rbf, score=0.842, total=   3.4s
[CV] C=1, gamma=1, kernel=rbf ........................................
[CV] ............ C=1, gamma=1, kernel=rbf, score=0.844, total=   3.5s
[CV] C=1, gamma=1, kernel=rbf ........................................
[CV] ............ C=1, gamma=1, kernel=rbf, score=0.842, total=   2.7s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.831, total=   2.4s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.831, total=   2.6s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.832, total=   2.4s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .......... C=1, gamma=0.1, kernel=rbf, score=0.829, total=   2.4s
[CV] C=1, gamma=0.1, kernel=rbf ......................................
[CV] .

[Parallel(n_jobs=1)]: Done  30 out of  30 | elapsed:  1.5min finished


GridSearchCV(estimator=SVC(),
             param_grid={'C': [1, 10], 'gamma': [1, 0.1, 0.01],
                         'kernel': ['rbf']},
             verbose=3)

In [87]:
svc_rbf_sel.best_estimator_

SVC(C=10, gamma=1)

In [89]:
svc_rbf_sel.best_estimator_.score(X_test, y_test)

0.846

In [90]:
svc_rbf_sel.best_params_

{'C': 10, 'gamma': 1, 'kernel': 'rbf'}

In [91]:
svc_rbf = SVC(random_state=42,C=10, gamma=1, kernel="rbf")
svc_rbf.fit(X_train, y_train)

SVC(C=10, gamma=1, random_state=42)

In [92]:
svc_rbf_y_pred = svc_rbf.predict(X_test)
# Accuracy on the training set, as a percentage (test accuracy is printed in the next cell)
svc_rbf_score = round(svc_rbf.score(X_train, y_train) * 100, 2)
print(svc_rbf_score, "%")

90.02 %


In [93]:
print("Accuracy:", metrics.accuracy_score(y_test, svc_rbf_y_pred))

Accuracy: 0.846


In [94]:
print(confusion_matrix(y_test, svc_rbf_y_pred))

[[757  39]
 [115  89]]


In [95]:
print(classification_report(y_test, svc_rbf_y_pred))

              precision    recall  f1-score   support

         0.0       0.87      0.95      0.91       796
         1.0       0.70      0.44      0.54       204

    accuracy                           0.85      1000
   macro avg       0.78      0.69      0.72      1000
weighted avg       0.83      0.85      0.83      1000



In [96]:
oversample = SMOTE()
# Use separate names so the original X and y are not overwritten
X_smote, y_smote = oversample.fit_resample(X_train, y_train)

In [97]:
X_train_ov, X_test_ov, y_train_ov, y_test_ov = train_test_split(X_smote, y_smote, random_state=42, test_size=0.1)

In [98]:
y_pred_ov = svc_rbf.fit(X_train_ov, y_train_ov).predict(X_test_ov)

In [99]:
print(confusion_matrix(y_test_ov,y_pred_ov))
print(classification_report(y_test_ov, y_pred_ov))

[[576 130]
 [ 79 649]]
              precision    recall  f1-score   support

         0.0       0.88      0.82      0.85       706
         1.0       0.83      0.89      0.86       728

    accuracy                           0.85      1434
   macro avg       0.86      0.85      0.85      1434
weighted avg       0.86      0.85      0.85      1434



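**With the LDA feature, the SMOTE results improve slightly:** accuracy rises from 0.83 to 0.85 and class-1 recall from 0.87 to 0.89. On the original test set, however, accuracy drops from 0.856 to 0.846 and class-1 recall stays at 0.44.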

___________________________