## Scikit-learn Support Vector Machines

This notebook demonstrates how to use the `SVM` wrapper defined in `sk_svm.py`.

In [7]:
import os
import sys
import pandas as pd
# the models dir contains sk_svm.py
sys.path.append(os.path.join(os.path.abspath(''), "../models"))
from sk_svm import SVM, load_datasets

### Model Configurations and Datasets

We will declare the configurations for the model.

In [12]:
data_path = "../data"

output_path = "../submissions"

# set parameters
select_features = ['CryoSleep', 'Age', 'RoomService', 'Cabin_num', 'FoodCourt', 'ShoppingMall', 'Spa', 'HomePlanet', 'Side', 'Deck', 'Transported', 'VRDeck', 'Destination']

label = 'Transported'

# hyperparameters
kernel = 'rbf'
C = 1.0
gamma = 0.1
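
The three hyperparameters correspond to arguments of scikit-learn's `SVC`: `kernel` selects the kernel function, `C` is the regularization parameter (regularization strength is inversely proportional to `C`), and `gamma` is the RBF kernel coefficient. As a minimal sketch, assuming the `SVM` wrapper passes these values straight to `sklearn.svm.SVC` (the wrapper's source is not shown here), the equivalent direct call on synthetic stand-in data might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: this is NOT the notebook's dataset)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Same hyperparameter values as the configuration cell
model = SVC(kernel="rbf", C=1.0, gamma=0.1)
model.fit(X, y)

# Confirm the estimator holds the values we configured
params = model.get_params()
print(params["kernel"], params["C"], params["gamma"])
```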

Next, load the training, validation, and test datasets.

In [5]:
# load datasets
train_df, valid_df, test_df = load_datasets(data_path=data_path)

### Running scikit-learn Support Vector Machines experiments

We first instantiate the scikit-learn SVM model using the configurations and datasets defined above.

The experiment uses the best hyperparameters found in an earlier search. A full GridSearchCV search takes a long time to run, so it is not run in this notebook.
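
For reference, a hyperparameter search of this kind can be sketched on a small synthetic dataset. The grid values, fold count, and data below are illustrative assumptions, not the search that produced the hyperparameters used in this notebook:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small synthetic dataset so the search finishes quickly (assumption)
X, y = make_classification(n_samples=150, n_features=5, random_state=0)

# Hypothetical grid: 1 kernel x 3 C values x 3 gamma values = 9 candidates
param_grid = {
    "kernel": ["rbf"],
    "C": [0.1, 1.0, 10.0],
    "gamma": [0.01, 0.1, 1.0],
}

# 3-fold cross-validation, scored by accuracy
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The cost grows with the number of candidates times the number of folds, which is why a wide grid on the full dataset can take a long time.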

In [13]:
# experiment SVM models
clf = SVM(train_df=train_df, valid_df=valid_df, test_df=test_df, label=label)
clf.feature_selection(selected_features=select_features)
clf.prepare_data(encoder="OneHotEncoder")
clf.create_svm_model(kernel=kernel, C=C, gamma=gamma)
clf.run_experiments()
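
The wrapper's internals are not shown here, so the following is only a rough sketch of what one-hot encoding followed by an RBF SVM could look like in plain scikit-learn. The toy rows are made up, only a few of the column names from `select_features` are reused, and the `StandardScaler` step is an assumption (it is not known whether `prepare_data` scales numeric columns):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Toy frame with made-up values (assumption: not real rows from the dataset)
toy = pd.DataFrame({
    "HomePlanet": ["Earth", "Europa", "Mars", "Earth", "Europa", "Mars"],
    "Age": [24.0, 58.0, 33.0, 16.0, 44.0, 29.0],
    "Spa": [0.0, 549.0, 10.0, 0.0, 300.0, 5.0],
    "Transported": [True, False, False, True, False, True],
})
X = toy.drop(columns="Transported")
y = toy["Transported"]

# One-hot encode the categorical column; scale the numeric ones (assumption)
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["HomePlanet"]),
    ("num", StandardScaler(), ["Age", "Spa"]),
])

pipeline = make_pipeline(preprocess, SVC(kernel="rbf", C=1.0, gamma=0.1))
pipeline.fit(X, y)

# 3 one-hot columns (Earth, Europa, Mars) + 2 numeric columns
encoded = pipeline.named_steps["columntransformer"].transform(X)
print(encoded.shape)
```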

Evaluate the model on the validation set and report accuracy, F1, recall, and precision.

In [21]:
# evaluate model using validation set and get model metrics
# (labels below follow the unpacking order, assuming evaluate() returns recall before precision)
accuracy_score, f1_score, recall_score, precision_score = clf.evaluate()
print(f"""
      Accuracy score is {accuracy_score};
      F1 score is {f1_score};
      Recall score is {recall_score};
      Precision score is {precision_score}""")


      Accuracy score is 0.7508630609896433;
      F1 score is 0.7512923607122344;
      Recall score is 0.7155361050328227;
      Precision score is 0.7908101571946796
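
To make the four metrics concrete, here is a small worked example using `sklearn.metrics` directly on hand-made labels. It is independent of `clf.evaluate()`, whose implementation is not shown; the label vectors are invented for illustration:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hand-made labels: 3 true positives, 1 false negative, 2 false positives, 2 true negatives
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 5 / 8
prec = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3 / 5
rec = recall_score(y_true, y_pred)      # TP / (TP + FN) = 3 / 4
f1 = f1_score(y_true, y_pred)           # harmonic mean of precision and recall

print(acc, prec, rec, round(f1, 4))
```

The results are stored under short names (`acc`, `prec`, `rec`, `f1`) so they do not shadow the imported metric functions.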


Generate predictions for the test data

In [15]:
# predict test data
predictions, output = clf.predict()

### Export output as csv for submission

In [None]:
# export output to path
# os.makedirs(output_path, exist_ok=True) 
# output.to_csv(os.path.join(os.path.abspath(''),output_path,"sk_svm.csv"),index=False)
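
The export step can be tried without touching the real submissions folder. This sketch writes a hypothetical prediction frame to a temporary directory and reads it back; the column names and values are assumptions, since the structure of `output` is not shown in this notebook:

```python
import os
import tempfile

import pandas as pd

# Hypothetical prediction frame (column names and values are assumptions)
output = pd.DataFrame({
    "PassengerId": ["id_1", "id_2"],
    "Transported": [True, False],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Mirror the notebook's layout: <base>/submissions/sk_svm.csv
    output_path = os.path.join(tmp_dir, "submissions")
    os.makedirs(output_path, exist_ok=True)
    csv_path = os.path.join(output_path, "sk_svm.csv")

    # index=False keeps the DataFrame index out of the file
    output.to_csv(csv_path, index=False)
    round_trip = pd.read_csv(csv_path)

print(round_trip.columns.tolist())
```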