# 🚨 Online Payment Fraud Detection
### Machine Learning Model Training Notebook

This notebook includes:
- Data preprocessing
- Exploratory Data Analysis
- Model training & comparison
- Model evaluation
- Saving model for Flask deployment

Author: Fayaz


In [None]:
# Import required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

import pickle

plt.style.use("ggplot")

In [None]:
# Load dataset
df = pd.read_csv("../dataset/PS_20174392719_1491204439457_log.csv")
df.head()

In [None]:
# Basic information
df.info()
df.describe()

In [None]:
# Data preprocessing

# Remove unnecessary columns
df.drop(['nameOrig', 'nameDest'], axis=1, inplace=True)

# Encode categorical variable
le = LabelEncoder()
df['type'] = le.fit_transform(df['type'])

df.head()
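
`LabelEncoder` assigns integer codes by sorting the class labels, so any code that scores new transactions has to apply the same mapping to `type`. The sketch below shows one way to inspect that mapping. The transaction type names are assumed sample values for illustration, not values read from the dataset.

```python
from sklearn.preprocessing import LabelEncoder

# Assumed sample values for the `type` column (illustrative only)
sample_types = ["PAYMENT", "TRANSFER", "CASH_OUT", "DEBIT", "CASH_IN"]

demo_encoder = LabelEncoder()
demo_encoder.fit(sample_types)

# classes_ is sorted; each label's code is its position in classes_
type_mapping = {
    str(label): int(code)
    for label, code in zip(
        demo_encoder.classes_, demo_encoder.transform(demo_encoder.classes_)
    )
}
print(type_mapping)
```

Saving this mapping (or the fitted encoder itself) next to the model is one way to keep training and serving consistent.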

In [None]:
# Correlation heatmap
plt.figure(figsize=(10,8))
sns.heatmap(df.corr(), cmap="coolwarm")
plt.title("Correlation Heatmap")
plt.show()

In [None]:
# Split dataset
X = df.drop('isFraud', axis=1)
y = df['isFraud']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print("Training size:", X_train.shape)
print("Testing size:", X_test.shape)
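
Fraud labels are usually rare, so a purely random split can leave the train and test sets with slightly different fraud rates. Passing `stratify=y` to `train_test_split` keeps the class ratio the same in both splits. This is a minimal sketch on synthetic data with an assumed 2% positive rate, not the notebook's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 1000 rows, 2% positives (assumed ratio, illustrative only)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(1000, 3))
y_demo = np.array([1] * 20 + [0] * 980)

# stratify=y_demo preserves the positive rate in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Train positive rate:", y_tr.mean())
print("Test positive rate:", y_te.mean())
```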

In [None]:
# Train multiple models

models = {
    "SVC": SVC(),
    "RandomForest": RandomForestClassifier(),
    "DecisionTree": DecisionTreeClassifier()
}

for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print("Model:", name)
    print("Accuracy:", accuracy_score(y_test, preds))
    print("-" * 40)
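
Accuracy alone can look high when one class dominates, which is typical for fraud data. The sketch below computes precision and recall for the positive class by hand, using hypothetical labels, to show how a model that never flags fraud can still score well on accuracy. The helper name `binary_metrics` is made up for this example.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted or labeled positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels: 98 legitimate and 2 fraudulent transactions
y_true_demo = [0] * 98 + [1] * 2
# A model that never predicts fraud
y_pred_demo = [0] * 100

acc, prec, rec = binary_metrics(y_true_demo, y_pred_demo)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

In the notebook itself, `classification_report(y_test, preds)` reports the same per-class figures and could be printed for each model in the loop.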

In [None]:
# Final Model - SVC

# Reuse the SVC already fitted in the comparison loop instead of retraining it
final_model = models["SVC"]

predictions = final_model.predict(X_test)

print("Final Model Accuracy:", accuracy_score(y_test, predictions))
print(classification_report(y_test, predictions))

In [None]:
# Save model for Flask deployment

# Use a context manager so the file handle is closed after writing
with open("../model.pkl", "wb") as f:
    pickle.dump(final_model, f)

print("Model saved successfully!")

## ✅ Conclusion

The SVC model was selected based on the accuracy comparison above.
The trained model is saved as `model.pkl` and loaded by the Flask application for real-time fraud detection.
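
On the serving side, loading the pickle and scoring one transaction could look like the sketch below. It trains a tiny stand-in model on made-up numbers so that it runs on its own. The feature values and the temporary file path are hypothetical. A real app would load `model.pkl` and pass features in the same column order used for training.

```python
import os
import pickle
import tempfile

from sklearn.tree import DecisionTreeClassifier

# Stand-in model trained on made-up data (two features, illustrative only)
X_toy = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
y_toy = [0, 1, 0, 1]
toy_model = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)

# Save the model to a temporary location
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(toy_model, f)

# Load it back, as serving code would do once at startup
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# predict() expects 2D input: one row per transaction
prediction = loaded_model.predict([[1.0, 0.0]])[0]
print("Predicted label:", prediction)
```

Only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code.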
