# Identifying Fraudulent Credit Card Transactions Using a One-Class Support Vector Machine (SVM) Model

## Resources 

- [Geeks for Geeks - Understanding One-Class Support Vector Machines](https://www.geeksforgeeks.org/understanding-one-class-support-vector-machines/)
- [Scikit-learn - OneClassSVM Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html)

## Load Dataset

In [1]:
import pandas as pd 

df = pd.read_csv("creditcard.csv")

target_feature = "Class"
input_features = list(df.columns)
input_features.remove(target_feature)

print(f"Target Variable: {target_feature}")
print(f"Input Variables: {input_features}")

Target Variable: Class
Input Variables: ['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10', 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20', 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount']


## Train / Test Split 

One-class SVMs are trained only on the "normal" class. This requires a little extra work to build a training set composed only of legitimate transactions.

In [2]:
import numpy as np
from sklearn.model_selection import train_test_split

# Separating features and the target
X = df.drop(columns = ['Class'])
y = df['Class']

# Splitting data into training and testing sets, with only normal transactions (Class = 0) for training
X_train = X[y == 0]
X_test, y_test = X[y == 1].copy(), y[y == 1].copy()

# Include some normal transactions in the test set as well
X_train, X_test_normal, y_train, y_test_normal = train_test_split(X_train, 
                                                                  y[y == 0], 
                                                                  test_size= 0.2, 
                                                                  random_state = 42)


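# Combine the fraud rows with the held-out normal rows.
# pd.concat keeps the column names that the scaler is later fitted with.
X_test = pd.concat([X_test, X_test_normal])
y_test = pd.concat([y_test, y_test_normal])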


## Data Scaling

In [3]:
from sklearn.preprocessing import StandardScaler

# Scaling the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)



## Fit Model and Predict Class

### One-Class SVM Hyperparameters 

| Hyperparameter | Default | Other Options | Description |
|----------------|---------|---------------|-------------|
| kernel         | `rbf`   | `linear`, `poly`, `rbf`, `sigmoid`, `precomputed` | Determines how the input data is implicitly mapped into a higher-dimensional space. |
| degree         | `3`     | non-negative integer | Degree of the polynomial function. Only used by the polynomial kernel. |
| gamma          | `scale` | `auto`, float | Kernel coefficient for the `rbf`, `poly`, and `sigmoid` kernels. Influences the shape of the decision boundary. A smaller gamma value provides a broader, smoother decision boundary, which makes the model less sensitive to individual data points. A larger value creates a more complex decision boundary that follows individual data points more closely. |
| nu             | `0.5`   | float in (0, 1] | Provides an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. It allows users to control the balance between precision and recall in the model. A larger nu value makes the algorithm more lenient, permitting a higher fraction of margin errors and support vectors, which can be useful in scenarios with a considerable number of anomalies. |

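As a rough illustration of how `nu` behaves, the sketch below fits a one-class SVM to synthetic Gaussian data (a stand-in, not the credit card dataset) for a few `nu` values and reports the share of training points flagged as outliers and the share kept as support vectors. The data, the variable names, and the chosen `nu` values are assumptions made for this example.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in for scaled "normal" transactions (not the credit card data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))

results = {}
for nu in (0.05, 0.2, 0.5):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X_demo)
    # Share of training points that fall outside the learned boundary
    outlier_fraction = float(np.mean(model.predict(X_demo) == -1))
    # Share of training points retained as support vectors
    sv_fraction = len(model.support_) / len(X_demo)
    results[nu] = (outlier_fraction, sv_fraction)
    print(f"nu={nu:.2f}  flagged: {outlier_fraction:.3f}  support vectors: {sv_fraction:.3f}")
```

On data like this, the flagged share should stay close to `nu` while the support vector share stays at or above it, which is why `nu` is often set near the expected anomaly rate.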

### Choosing a kernel function 

| Kernel    | Description |
|-----------|-------------|
| Linear Kernel (`linear`)  | Uses the plain dot product between feature vectors, with no non-linear transformation. Suitable when the relationship between features is approximately linear. The decision boundary is a hyperplane in the original feature space. |
| Polynomial Kernel (`poly`)    | Introduces non-linearity by considering both the dot product and higher-order interactions between features. Characterized by a user-defined degree parameter (`degree`). A higher degree allows the model to capture more complex relationships but may increase the risk of overfitting. |
| Radial Basis Function (`rbf`)   | Suitable for complex, non-linear relationships. Transforms data into a space where intricate decision boundaries can be drawn. Useful when the exact form of the relationships is unknown or intricate. |
| Sigmoid Kernel (`sigmoid`) | Suitable for scenarios where the data distribution is not well defined or exhibits sigmoidal patterns. The shape and position of the decision boundary are determined by `gamma` and `coef0`. |

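To get a feel for how the kernel choice changes what gets flagged, here is a minimal sketch on made-up 2-D data: a Gaussian blob plays the role of normal points, and points on a circle around it play the role of anomalies. The data, the radius, and `nu = 0.1` are all assumptions for illustration, so the numbers say nothing about the credit card dataset.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
# Hypothetical "normal" points: a Gaussian blob centred on the origin
X_normal = rng.normal(size=(400, 2))
# Hypothetical anomalies: points on a circle of radius 6 around the blob
angles = rng.uniform(0, 2 * np.pi, size=100)
X_anomalous = 6.0 * np.column_stack([np.cos(angles), np.sin(angles)])

detection_rate = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    model = OneClassSVM(kernel=kernel, nu=0.1, gamma="scale").fit(X_normal)
    # Fraction of the synthetic anomalies that the model labels as outliers (-1)
    detection_rate[kernel] = float(np.mean(model.predict(X_anomalous) == -1))
    print(f"{kernel:>8}: {detection_rate[kernel]:.2f} of the synthetic anomalies flagged")
```

Because the anomalies surround the normal data on every side, a boundary that can close around the blob (as `rbf` can) has an advantage here. On real data the ranking of kernels may differ, so it is worth comparing them empirically.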


### TLDR 

Let's run with a linear kernel and default values, and see how things go. 

In [None]:
from sklearn.svm import OneClassSVM

# Instantiate model (gamma is ignored by the linear kernel; it is set here for completeness)
oc_svm = OneClassSVM(kernel = 'linear', 
                     nu = 0.5, 
                     gamma = "scale")

# Train model
oc_svm.fit(X_train_scaled)

# Predict outliers in the test set
predictions = oc_svm.predict(X_test_scaled)

# predict() returns 1 for inliers (normal) and -1 for outliers (fraud).
# Convert to the dataset's labels: 1 for fraud, 0 for normal.
predictions = (predictions == -1).astype(int)

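The label conversion can be checked on a small synthetic example. The sketch below (made-up data and variable names) shows the raw `+1`/`-1` output of `predict`, the converted fraud flag, and the continuous score from `decision_function`, which is negative for points outside the learned boundary.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X_fit = rng.normal(size=(300, 2))            # synthetic "normal" training data
X_new = rng.normal(scale=3.0, size=(50, 2))  # synthetic points to score

model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_fit)

raw = model.predict(X_new)               # +1 = inlier, -1 = outlier
fraud_flag = (raw == -1).astype(int)     # 1 = flagged as fraud, 0 = normal
scores = model.decision_function(X_new)  # negative values fall outside the boundary

print("raw labels:", np.unique(raw))
print("flagged:", int(fraud_flag.sum()), "of", len(fraud_flag))
print("most anomalous score:", float(scores.min()))
```

The continuous score is useful when a single hard threshold is too blunt, for example when ranking transactions for manual review.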
## Evaluate Model Performance

In [4]:
from sklearn.metrics import classification_report

print("Classification Report:\n", classification_report(y_test, predictions))

Classification Report:
               precision    recall  f1-score   support

           0       1.00      0.90      0.95     56863
           1       0.07      0.91      0.13       492

    accuracy                           0.90     57355
   macro avg       0.54      0.90      0.54     57355
weighted avg       0.99      0.90      0.94     57355



Initial result (`rbf` kernel, `nu = 0.1`, `gamma = 0.1`):
```
Classification Report:
               precision    recall  f1-score   support

           0       1.00      0.90      0.95     56863
           1       0.07      0.91      0.13       492

    accuracy                           0.90     57355
   macro avg       0.54      0.90      0.54     57355
weighted avg       0.99      0.90      0.94     57355
```
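The low fraud precision in the report follows from the class imbalance. As a back-of-the-envelope check, the sketch below rebuilds the approximate counts from the rounded recall and support figures in the report; because the inputs are rounded, the counts are estimates rather than the exact confusion matrix.

```python
# Figures read from the classification report (rounded to two decimals there)
n_normal, n_fraud = 56863, 492
recall_normal, recall_fraud = 0.90, 0.91

true_positives = round(recall_fraud * n_fraud)           # fraud correctly flagged
false_positives = round((1 - recall_normal) * n_normal)  # normal flagged as fraud

precision_fraud = true_positives / (true_positives + false_positives)
print(f"TP ~ {true_positives}, FP ~ {false_positives}, precision ~ {precision_fraud:.2f}")
```

Roughly 448 frauds are caught, but about 5,686 legitimate transactions are flagged alongside them, so only about 7% of the flagged transactions are actually fraudulent.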

In [None]:
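from sklearn.model_selection import ParameterGrid, train_test_split
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.svm import OneClassSVM

# Hyperparameter grid (gamma is unused by linear; degree only applies to poly)
nu_values = [0.1, 0.2, 0.5]
param_grid = [
    {'kernel': ['linear'], 'nu': nu_values},
    {'kernel': ['rbf'], 'nu': nu_values, 'gamma': ['scale', 'auto']},
    {'kernel': ['poly'], 'nu': nu_values, 'gamma': ['scale', 'auto'], 'degree': [2, 3, 4]},
]

# GridSearchCV with scoring='roc_auc' needs labels from both classes in every fold,
# but the training set contains only normal transactions. Instead, split the labeled
# test data into a validation half (model selection) and a final evaluation half.
X_val, X_eval, y_val, y_eval = train_test_split(X_test_scaled, y_test,
                                                test_size = 0.5,
                                                stratify = y_test,
                                                random_state = 42)

# Manual grid search: fit on normal data only, score on the validation half
best_score, best_params, best_oc_svm = float("-inf"), None, None
for params in ParameterGrid(param_grid):
    model = OneClassSVM(**params).fit(X_train_scaled)
    # decision_function is lower for outliers, so negate it to get a fraud score
    score = roc_auc_score(y_val, -model.decision_function(X_val))
    if score > best_score:
        best_score, best_params, best_oc_svm = score, params, model

print(f"Best Parameters: {best_params}")

# Predictions on the evaluation half (1 = fraud, 0 = normal)
predictions = (best_oc_svm.predict(X_eval) == -1).astype(int)

# Evaluation
print("Classification Report:\n", classification_report(y_eval, predictions))
print("ROC-AUC Score:", roc_auc_score(y_eval, -best_oc_svm.decision_function(X_eval)))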