## Final Project - Support Vector Machine

### Model Development

Karyl Abigail Grasparil


In [1]:
import pandas as pd
import seaborn as sb
import random
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, fbeta_score, roc_auc_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt

## **Support Vector Machine**
Develop SVM using the sci-kit learn module within a single notebook.

### Question 1:

Split the data into training, validation, and testing sets. Try different proportions and justify the final choices.

In [2]:
health = pd.read_csv('fetal_health.csv')
health

Unnamed: 0,baseline value,accelerations,fetal_movement,uterine_contractions,light_decelerations,severe_decelerations,prolonged_decelerations,percentage_of_time_with_abnormal_short_term_variability,mean_value_of_short_term_variability,percentage_of_time_with_abnormal_long_term_variability,...,histogram_min,histogram_max,histogram_number_of_peaks,histogram_number_of_zeroes,histogram_mode,histogram_mean,histogram_median,histogram_variance,histogram_tendency,fetal_health
0,120,0.000,0.000,0.000,0.000,0.0,0.0,73,0.5,43,...,62,126,2,0,120,137,121,73,1,2
1,132,0.006,0.000,0.006,0.003,0.0,0.0,17,2.1,0,...,68,198,6,1,141,136,140,12,0,1
2,133,0.003,0.000,0.008,0.003,0.0,0.0,16,2.1,0,...,68,198,5,1,141,135,138,13,0,1
3,134,0.003,0.000,0.008,0.003,0.0,0.0,16,2.4,0,...,53,170,11,0,137,134,137,13,1,1
4,132,0.007,0.000,0.008,0.000,0.0,0.0,16,2.4,0,...,53,170,9,0,137,136,138,11,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2121,140,0.000,0.000,0.007,0.000,0.0,0.0,79,0.2,25,...,137,177,4,0,153,150,152,2,0,2
2122,140,0.001,0.000,0.007,0.000,0.0,0.0,78,0.4,22,...,103,169,6,0,152,148,151,3,1,2
2123,140,0.001,0.000,0.007,0.000,0.0,0.0,79,0.4,20,...,103,170,5,0,153,148,152,4,1,2
2124,140,0.001,0.000,0.006,0.000,0.0,0.0,78,0.4,27,...,103,169,6,0,152,147,151,4,1,2


In [3]:
#Split the data into training, validation, and testing sets
features = health.drop(labels = "fetal_health", axis = 1)
labels = health["fetal_health"]

#Initial split for training
features_train, features_temp, labels_train, labels_temp = train_test_split(features, labels, stratify = labels, test_size = 0.4, train_size = 0.6)

#Split for validation and testing
features_validation, features_test, labels_validation, labels_test = train_test_split(features_temp, labels_temp, stratify = labels_temp, test_size = 0.5)

print("Number of features_train data records:", len(features_train))
print("Number of features_validation data records:", len(features_validation))
print("Number of features_test data records:", len(features_test))
print("Number of labels_train data records:", len(labels_train))
print("Number of labels_validation data records:", len(labels_validation))
print("Number of labels_test data records:", len(labels_test))

Number of features_train data records: 1275
Number of features_validation data records: 425
Number of features_test data records: 426
Number of labels_train data records: 1275
Number of labels_validation data records: 425
Number of labels_test data records: 426


Since there is a class imbalance in the dataset, I opted to do stratified sampling rather than random sampling. I opted for stratified sampling to ensure that there is a proportional amount of classes included in the training, validation, and test data. As for the proportion, I tried doing 80/10/10 and 60/20/20 split but I ultimately decided to settle on a 70/15/15 split. This is a common ratio used in splitting data and with the numbers indicated in the cell above, I believe it is an appropriate proportion to stick with.

### Question 2:

Experiment with different values of the C parameter; try the linear, rbf (with different choices of gamma) and polynomial kernels (with different degrees); try both options for decision_function_shape. Keep your best two models.

In [4]:
# Define search space
param_dist = {
    'C': [0.5, 1, 1.5, 2, 2.5, 3, 3.5],
    'kernel': ['linear', 'poly', 'rbf'],
    'decision_function_shape': ['ovo', 'ovr'],
    'degree': [3, 4, 5, 6]  # only applicable for poly kernel
}

# List to store models
health_model = []

# Create SVM models using specified hyperparameters
for i in range(10):  # Generate 10 models
    hyperparams = {param: random.choice(values) for param, values in param_dist.items()}
    if hyperparams['kernel'] != 'poly':  # If the kernel is not 'poly', remove 'degree' parameter
        hyperparams.pop('degree', None)
    model = SVC(**hyperparams)
    health_model.append({'model': model, 'model_number': i+1})

# Prints out the values of the different models based on the for loop above
for model_info in health_model:
    model = model_info['model']
    model_number = model_info['model_number']
    print(f"\nModel Number: {model_number} \nParameters: {model.get_params()}")


Model Number: 1 
Parameters: {'C': 2, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovo', 'degree': 4, 'gamma': 'scale', 'kernel': 'poly', 'max_iter': -1, 'probability': False, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}

Model Number: 2 
Parameters: {'C': 3, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovo', 'degree': 3, 'gamma': 'scale', 'kernel': 'rbf', 'max_iter': -1, 'probability': False, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}

Model Number: 3 
Parameters: {'C': 3, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovr', 'degree': 3, 'gamma': 'scale', 'kernel': 'linear', 'max_iter': -1, 'probability': False, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}

Model Number: 4 
Parameters: {'C': 0.5, 'break_ties': False, 'cache_siz

In [20]:
# Initialize lists to store evaluation metrics for the top models
top_models = []
top_metrics = []

# Define weights for each metric
weights = {
    'accuracy': 1,
    'precision': 1,
    'recall': 1,
    'fbeta_score': 2, # Giving Fbeta-score more weight
    'auc_roc': 1
}

# Evaluate each model in health_model list
for model_info in health_model:
    model = model_info['model']
    model_number = model_info['model_number']
    
    # Get model parameters
    params = model.get_params()
    
    # Update parameters with probability=True
    params.update({'probability': True})
    
    # Initialize SVC model with updated parameters
    model = SVC(**params)
    
    # Train the model
    model.fit(features_train, labels_train)
    
    # Evaluate on training set
    train_labels_pred = model.predict(features_train)
    train_accuracy = accuracy_score(labels_train, train_labels_pred)
    train_precision = precision_score(labels_train, train_labels_pred, average='weighted', zero_division=0)
    train_recall = recall_score(labels_train, train_labels_pred, average='weighted')
    train_fbeta_score = fbeta_score(labels_train, train_labels_pred, average='weighted', zero_division=0, beta=2)
    train_auc_roc = roc_auc_score(labels_train, model.predict_proba(features_train), multi_class='ovr')
    
    # Predict labels on validation set
    labels_pred = model.predict(features_validation)
    
    # Calculate evaluation metrics on validation set
    accuracy = accuracy_score(labels_validation, labels_pred)
    precision = precision_score(labels_validation, labels_pred, average='weighted', zero_division=0)
    recall = recall_score(labels_validation, labels_pred, average='weighted')
    fbeta_score_value = fbeta_score(labels_validation, labels_pred, average='weighted', zero_division=0, beta=2)
    auc_roc = roc_auc_score(labels_validation, model.predict_proba(features_validation), multi_class='ovr')

    # Summing evaluation metrics to determine best models
    evaluation_metric = (
        weights['accuracy'] * accuracy +
        weights['precision'] * precision +
        weights['recall'] * recall +
        weights['fbeta_score'] * fbeta_score_value +
        weights['auc_roc'] * auc_roc
    )
    
    # Update the top 2 models if necessary
    if len(top_models) < 2:
        top_models.append((model, model_number))
        top_metrics.append(evaluation_metric)
    else:
        min_index = top_metrics.index(min(top_metrics))
        if evaluation_metric > top_metrics[min_index]:
            top_models[min_index] = (model, model_number)
            top_metrics[min_index] = evaluation_metric

    # Print evaluation metrics for the current model
    print(f"\nModel Number {model_number}:")
    print("Parameters:", model.get_params())
    print("Evaluation Metrics on Training Set:")
    print(f"  Accuracy: {train_accuracy}")
    print(f"  Precision: {train_precision}")
    print(f"  Recall: {train_recall}")
    print(f"  F-beta Score: {train_fbeta_score}")
    print(f"  AUC-ROC: {train_auc_roc}")
    print("\nEvaluation Metrics on Validation Set:")
    print(f"  Accuracy: {accuracy}")
    print(f"  Precision: {precision}")
    print(f"  Recall: {recall}")
    print(f"  F-beta Score: {fbeta_score_value}")
    print(f"  AUC-ROC: {auc_roc}")        
    
# Sort models by evaluation metric
sorted_models = sorted(zip(top_models, top_metrics), key=lambda x: x[1], reverse=True)

# Print the models sorted by evaluation metric
for i, ((model, model_number), evaluation_metric) in enumerate(sorted_models, start=1):
    print(f"\nModel {i} (Model Number {model_number}):")
    print("Parameters:", model.get_params())
    print("Evaluation Metrics on Training Set:")
    print(f"  Accuracy: {train_accuracy}")
    print(f"  Precision: {train_precision}")
    print(f"  Recall: {train_recall}")
    print(f"  F-beta Score: {train_fbeta_score}")
    print(f"  AUC-ROC: {train_auc_roc}")
    print("\nEvaluation Metrics on Validation Set:")
    print(f"  Accuracy: {accuracy}")
    print(f"  Precision: {precision}")
    print(f"  Recall: {recall}")
    print(f"  F-beta Score: {fbeta_score_value}")
    print(f"  AUC-ROC: {auc_roc}")   
    
# Store the top models in separate variables
top_model_1, top_model_2 = [model_info[0] for model_info in sorted_models[:2]]

# Access individual models and their model numbers
model_1, model_number_1 = top_model_1[0], top_model_1[1]
model_2, model_number_2 = top_model_2[0], top_model_2[1]



Model Number 1:
Parameters: {'C': 2, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovo', 'degree': 4, 'gamma': 'scale', 'kernel': 'poly', 'max_iter': -1, 'probability': True, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}
Evaluation Metrics on Training Set:
  Accuracy: 0.8925490196078432
  Precision: 0.8879800526271115
  Recall: 0.8925490196078432
  F-beta Score: 0.8907386928422389
  AUC-ROC: 0.9539625295920336

Evaluation Metrics on Validation Set:
  Accuracy: 0.8494117647058823
  Precision: 0.8511903815580287
  Recall: 0.8494117647058823
  F-beta Score: 0.8492682144674406
  AUC-ROC: 0.9245254362507308

Model Number 2:
Parameters: {'C': 3, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovo', 'degree': 3, 'gamma': 'scale', 'kernel': 'rbf', 'max_iter': -1, 'probability': True, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False

### Question 3:

Use the classification_report and confusion_matrix functions in sklearn to display metrics for your best models, using training and validation (but not testing) data.

In [21]:
# Loop through sorted_models and generate reports for each model
for i, ((model, model_number), evaluation_metric) in enumerate(sorted_models, start=1):
    print(f"\nModel {i} (Model Number {model_number}):")
    print("Parameters:", model.get_params())
    print("Evaluation Metrics on Training Set:")
    print(f"  Accuracy: {train_accuracy}")
    print(f"  Precision: {train_precision}")
    print(f"  Recall: {train_recall}")
    print(f"  F-beta Score: {train_fbeta_score}")
    print(f"  AUC-ROC: {train_auc_roc}")
    print("\nEvaluation Metrics on Validation Set:")
    print(f"  Accuracy: {accuracy}")
    print(f"  Precision: {precision}")
    print(f"  Recall: {recall}")
    print(f"  F-beta Score: {fbeta_score_value}")
    print(f"  AUC-ROC: {auc_roc}")
    
    train_labels_pred = model.predict(features_train)
    labels_pred = model.predict(features_validation)
    
    # Classification report
    print("\nClassification Report:")
    print(f"Training Classification Report:\n", classification_report(labels_train, train_labels_pred))
    print(f"\nValidation Classification Report:\n", classification_report(labels_validation, labels_pred))
    
    # Confusion matrix
    print("Confusion Matrix:")
    print(f"Training Confusion Matrix:\n", confusion_matrix(labels_train, train_labels_pred))
    print(f"\nValidation Confusion Matrix:\n", confusion_matrix(labels_validation, labels_pred))


Model 1 (Model Number 6):
Parameters: {'C': 3.5, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovr', 'degree': 6, 'gamma': 'scale', 'kernel': 'poly', 'max_iter': -1, 'probability': True, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}
Evaluation Metrics on Training Set:
  Accuracy: 0.8854901960784314
  Precision: 0.8763238346182958
  Recall: 0.8854901960784314
  F-beta Score: 0.8802693218014694
  AUC-ROC: 0.9509593806187544

Evaluation Metrics on Validation Set:
  Accuracy: 0.8494117647058823
  Precision: 0.8388932680764614
  Recall: 0.8494117647058823
  F-beta Score: 0.8458515674380112
  AUC-ROC: 0.9254076913232371

Classification Report:
Training Classification Report:
               precision    recall  f1-score   support

           1       0.93      0.98      0.95       992
           2       0.81      0.59      0.68       177
           3       0.89      0.83      0.86       106

    accuracy     

### Question 4:

Select and justify your final choice of hyperparameters based on the training and validation metrics. Provide a written analysis in markdown.

Based on the training and validation metrics presented in the classification report, **Model 1 (Model number 6)** stands out as the preferred choice. This model is characterized by the following parameters: {'C': 3.5, 'break_ties': False, 'cache_size': 200, 'class_weight': None, 'coef0': 0.0, 'decision_function_shape': 'ovr', 'degree': 6, 'gamma': 'scale', 'kernel': 'poly', 'max_iter': -1, 'probability': True, 'random_state': None, 'shrinking': True, 'tol': 0.001, 'verbose': False}. It possesses a polynomial kernel with a degree of 6, enabling it to capture intricate data relationships more effectively compared to the other model's linear kernel. Although both models exhibit identical evaluation metrics, Model 1's higher degree of freedom suggests superior performance on the training data. Therefore, Model 1 is deemed the final choice due to its potential for better fitting the data and capturing complex patterns.

## **Comparison**

Choose between the decision tree and SVM model. Write this section at the bottom of the chosen notebook.

### Question 1:

Provide a written comparison of the training and validation metrics for the best SVM and decision tree models using markdown. Select the best model and justify your choice.

In comparing the performance of the best models for Decision Tree and SVM, the SVM model emerges as the clear winner across all evaluation metrics.

Decision Tree (Model 1):

Evaluation Metrics on Training Set:
- Accuracy: 0.8721568627450981
- Precision: 0.8816984936456731
- Recall: 0.8721568627450981
- F-beta Score: 0.8715080595389335
- AUC-ROC: 0.8634483751819216

Evaluation Metrics on Validation Set:
- Accuracy: 0.8352941176470589
- Precision: 0.8514797427518015
- Recall: 0.8352941176470589
- F-beta Score: 0.8368139930644299
- AUC-ROC: 0.8443883953403549


SVM (Model 1):

Evaluation Metrics on Training Set:
- Accuracy: 0.8854901960784314
- Precision: 0.8763238346182958
- Recall: 0.8854901960784314
- F-beta Score: 0.8802693218014694
- AUC-ROC: 0.9509593806187544

Evaluation Metrics on Validation Set:
-  Accuracy: 0.8494117647058823
- Precision: 0.8388932680764614
- Recall: 0.8494117647058823
- F-beta Score: 0.8458515674380112
- AUC-ROC: 0.9254076913232371

The SVM model consistently outperforms the Decision Tree model, achieving higher values across all metrics. Additionally, the confusion matrix analysis underscores the SVM model's superior accuracy in predicting class labels, especially for class 3 ("Pathological"), a critical aspect for precise diagnostics.

In summary, the SVM model demonstrates superior performance and reliability compared to the Decision Tree model, making it the better and preferred choice.

### Question 2:

Use your selected model to make predictions on the test set. Use the classification_report and confusion_matrix functions in sklearn to display metrics for the test data.

In [18]:
((model, model_number), evaluation_metric) = sorted_models[0]

# Print the evaluation report for the first model
print(f"\nModel 1 (Model Number {model_number}):")
print("Evaluation Metrics:")
print(f"  Accuracy: {accuracy}")
print(f"  Precision: {precision}")
print(f"  Recall: {recall}")
print(f"  F-beta Score: {fbeta_score_value}")
print(f"  AUC-ROC: {auc_roc}")

labels_pred_test = model.predict(features_test)

# Classification and Confusion Matrix
print("\nClassification Report:")
print(f"Test Classification Report:\n", classification_report(labels_test, labels_pred_test))
print("\nConfusion Matrix:")
print(f"Test Confusion Matrix:\n", confusion_matrix(labels_test, labels_pred_test))


Model 1 (Model Number 6):
Evaluation Metrics:
  Accuracy: 0.8494117647058823
  Precision: 0.8388932680764614
  Recall: 0.8494117647058823
  F-beta Score: 0.8458515674380112
  AUC-ROC: 0.9255452311419495

Classification Report:
Test Classification Report:
               precision    recall  f1-score   support

           1       0.91      0.96      0.93       332
           2       0.67      0.51      0.58        59
           3       0.81      0.71      0.76        35

    accuracy                           0.88       426
   macro avg       0.79      0.73      0.76       426
weighted avg       0.87      0.88      0.87       426


Confusion Matrix:
Test Confusion Matrix:
 [[318  12   2]
 [ 25  30   4]
 [  7   3  25]]


### Question 3:

Use markdown to review the best model; restate important metrics and describe how well it will work for your use case.

Based on the evaluation metrics from the test data, the model demonstrates good performance in classifying the data into three categories: 1: Healthy, 2: Suspect, and 3: Pathological. In the context of this dataset, Recall is a critical metric, as it is crucial in medical scenarios to minimize false negatives. In this case, it is particularly important that “Pathological” cases are not misclassified as “Healthy”. The model successfully identifies 84.94% of actual positive instances.

However, Precision is also an important metric that should not be overlooked. With this in mind, the F-beta score, which considers both Precision and Recall, is a valuable metric. In this case, its value of 84.58% indicates a good balance between Recall and Precision.

The AUC-ROC, which measures how well the model distinguishes between classes, has an excellent value of 0.9255. This suggests that the model is highly capable of differentiating between the classes.

Additionally, the confusion matrix further illustrates the model’s performance, showing minimal misclassifications, particularly for classes 1 (Healthy) and 3 (Pathological), which is crucial for accurate monitoring.

Despite these positive results, the model could still be improved. Notably, there were still 7 patients who were actually “Pathological” but were misclassified as “Healthy”. This area should be a focus for future model refinement.