Load the dataset and check its shape (rows, columns), look at the first 5 rows to see what the data looks like, then count the fraud and non-fraud transactions.

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split  # for splitting the data into train and test sets

# Load the dataset
df = pd.read_csv("payment.csv")

# How big is it?
print("Shape:", df.shape)

# What are the columns?
print("\nColumns:", df.columns.tolist())

# First 5 rows
print("\nFirst 5 rows:")
print(df.head())

# Count fraud vs not fraud
print(df['isFraud'].value_counts())

print("------------------")

# Show as percentage
print(df['isFraud'].value_counts(normalize=True) * 100)

Shape: (6362620, 11)

Columns: ['step', 'type', 'amount', 'nameOrig', 'oldbalanceOrg', 'newbalanceOrig', 'nameDest', 'oldbalanceDest', 'newbalanceDest', 'isFraud', 'isFlaggedFraud']

First 5 rows:
   step      type    amount     nameOrig  oldbalanceOrg  newbalanceOrig  \
0     1   PAYMENT   9839.64  C1231006815       170136.0       160296.36   
1     1   PAYMENT   1864.28  C1666544295        21249.0        19384.72   
2     1  TRANSFER    181.00  C1305486145          181.0            0.00   
3     1  CASH_OUT    181.00   C840083671          181.0            0.00   
4     1   PAYMENT  11668.14  C2048537720        41554.0        29885.86   

      nameDest  oldbalanceDest  newbalanceDest  isFraud  isFlaggedFraud  
0  M1979787155             0.0             0.0        0               0  
1  M2044282225             0.0             0.0        0               0  
2   C553264065             0.0             0.0        1               0  
3    C38997010         21182.0             0.0        1 

FIND TRANSACTION TYPES

In [3]:
# What transaction types exist?
print("Transaction Types:")
print(df['type'].value_counts())

print("\n------------------")

# How much fraud is in each type?
print("\nFraud Count by Transaction Type:")
print(df.groupby('type')['isFraud'].sum())

Transaction Types:
type
CASH_OUT    2237500
PAYMENT     2151495
CASH_IN     1399284
TRANSFER     532909
DEBIT         41432
Name: count, dtype: int64

------------------

Fraud Count by Transaction Type:
type
CASH_IN        0
CASH_OUT    4116
DEBIT          0
PAYMENT        0
TRANSFER    4097
Name: isFraud, dtype: int64
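
Raw fraud counts hide how different the transaction volumes are. As a minimal sketch, the fraud rate per type can be worked out from the two tables printed above. The dictionary names `totals` and `frauds` are only illustrative, and the numbers are copied from the output rather than recomputed from payment.csv.

```python
# Counts copied from the two outputs above (not recomputed from the CSV)
totals = {"CASH_OUT": 2237500, "PAYMENT": 2151495, "CASH_IN": 1399284,
          "TRANSFER": 532909, "DEBIT": 41432}
frauds = {"CASH_IN": 0, "CASH_OUT": 4116, "DEBIT": 0, "PAYMENT": 0, "TRANSFER": 4097}

# Fraud rate per type, as a percentage of that type's transactions
fraud_rate_pct = {t: 100 * frauds[t] / totals[t] for t in totals}

# Print the types from highest to lowest fraud rate
for t, rate in sorted(fraud_rate_pct.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{t:<9} {rate:.3f}%")
```

The two fraud counts are almost equal, but TRANSFER has far fewer transactions, so its fraud rate is roughly four times that of CASH_OUT. On the full DataFrame the same figures should come from `df.groupby('type')['isFraud'].mean() * 100`.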


Step 4:
Now let's look at the amount column and compare:
How much money is involved in normal transactions?
How much money is involved in fraud transactions?

In [4]:
# Separate fraud and non-fraud
fraud = df[df['isFraud'] == 1]
legit = df[df['isFraud'] == 0]

# Compare amounts
print("💰 LEGITIMATE transactions - Amount Stats:")
print(legit['amount'].describe().round(2))

print("\n🚨 FRAUD transactions - Amount Stats:")
print(fraud['amount'].describe().round(2))

💰 LEGITIMATE transactions - Amount Stats:
count     6354407.00
mean       178197.04
std        596236.98
min             0.01
25%         13368.40
50%         74684.72
75%        208364.76
max      92445516.64
Name: amount, dtype: float64

🚨 FRAUD transactions - Amount Stats:
count        8213.00
mean      1467967.30
std       2404252.95
min             0.00
25%        127091.33
50%        441423.44
75%       1517771.48
max      10000000.00
Name: amount, dtype: float64
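
To put the two summaries side by side, the sketch below divides the fraud statistics by the legitimate ones. The values are copied from the printed output, and `legit_stats` and `fraud_stats` are illustrative names, not variables from the notebook.

```python
# Selected values copied from the two describe() outputs above
legit_stats = {"mean": 178197.04, "median": 74684.72, "max": 92445516.64}
fraud_stats = {"mean": 1467967.30, "median": 441423.44, "max": 10000000.00}

# How many times larger is a typical fraud than a typical legit transaction?
mean_ratio = fraud_stats["mean"] / legit_stats["mean"]
median_ratio = fraud_stats["median"] / legit_stats["median"]

print(f"Fraud mean is   {mean_ratio:.1f}x the legit mean")
print(f"Fraud median is {median_ratio:.1f}x the legit median")
print("Fraud max below legit max:", fraud_stats["max"] < legit_stats["max"])
```

A typical fraud moves several times more money than a typical legitimate transaction, although the single largest transaction in the data is a legitimate one. On the DataFrame itself, `df.groupby('isFraud')['amount'].describe()` should give both summaries in one table.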


Now let's look at the balance columns. Specifically this question:

After a fraud transaction, does the sender's balance go to zero?

In [5]:
fraud = df[df['isFraud'] == 1]

# After fraud, does the sender's balance become zero?
print("Fraud transactions where sender's NEW balance = 0:")
zero_balance = (fraud['newbalanceOrig'] == 0).sum()
print(f"{zero_balance} out of {len(fraud)} fraud transactions")

print("\n------------------")

# Show a few fraud rows to see the pattern
print("\nSample fraud transactions (balance columns):")
print(fraud[['amount','oldbalanceOrg','newbalanceOrig']].head(10))

Fraud transactions where sender's NEW balance = 0:
8053 out of 8213 fraud transactions

------------------

Sample fraud transactions (balance columns):
          amount  oldbalanceOrg  newbalanceOrig
2         181.00         181.00             0.0
3         181.00         181.00             0.0
251      2806.00        2806.00             0.0
252      2806.00        2806.00             0.0
680     20128.00       20128.00             0.0
681     20128.00       20128.00             0.0
724    416001.33           0.00             0.0
969   1277212.77     1277212.77             0.0
970   1277212.77     1277212.77             0.0
1115    35063.63       35063.63             0.0


What Did We Just Find?
8,053 out of 8,213 fraud transactions leave the sender's balance at zero! That's 98% of all fraud cases!
This is the fraudster's fingerprint. They don't steal a little; they wipe the account clean.

💡 This Gives Us a New Feature Idea
Later, when we prepare data for the model, we can create a new column:
balanceError = oldbalanceOrg - amount - newbalanceOrig
For most fraud this should be close to 0 (the amount taken equals the whole starting balance, so the math checks out).
For legitimate transactions it may be off sometimes.
This is called Feature Engineering: creating new, informative columns from existing ones. We'll do this later!

📍 Where Are We So Far?
We've explored the data and found 3 golden insights:
Insights:
1) Only 0.13% of transactions are fraud (class imbalance)
2) Fraud happens only in CASH_OUT and TRANSFER
3) 98% of fraud drains the sender's account to zero
These insights will directly help our model later!
We are now done with data exploration! 🎉
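
The percentages in the insights can be re-derived from counts that were already printed. This is just a quick arithmetic check; the variable names are illustrative and the numbers are copied from the earlier outputs.

```python
total_rows = 6362620      # from df.shape
fraud_rows = 8213         # count in the fraud describe() output
drained_frauds = 8053     # fraud rows where newbalanceOrig == 0

fraud_share_pct = 100 * fraud_rows / total_rows
drained_share_pct = 100 * drained_frauds / fraud_rows
legit_per_fraud = (total_rows - fraud_rows) / fraud_rows

print(f"Fraud share of all rows : {fraud_share_pct:.2f}%")
print(f"Frauds that drain sender: {drained_share_pct:.1f}%")
print(f"Legit rows per fraud row: {legit_per_fraud:.0f}")
```

The last line is another way to read the class imbalance: for every fraud row there are several hundred legitimate ones.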

# Phase 2: Data Preparation
Step 1: First small task: drop the columns we don't need.

In [6]:
print("Columns BEFORE:", df.columns.tolist())
print("Shape BEFORE:", df.shape)
# Drop columns we don't need
# nameOrig, nameDest → just account IDs, no useful pattern
# isFlaggedFraud → bank's old system, we don't need it
df = df.drop(columns=['nameOrig', 'nameDest', 'isFlaggedFraud'])

print("\nColumns AFTER:", df.columns.tolist())
print("Shape AFTER:", df.shape)

Columns BEFORE: ['step', 'type', 'amount', 'nameOrig', 'oldbalanceOrg', 'newbalanceOrig', 'nameDest', 'oldbalanceDest', 'newbalanceDest', 'isFraud', 'isFlaggedFraud']
Shape BEFORE: (6362620, 11)

Columns AFTER: ['step', 'type', 'amount', 'oldbalanceOrg', 'newbalanceOrig', 'oldbalanceDest', 'newbalanceDest', 'isFraud']
Shape AFTER: (6362620, 8)


Step 2: Remember the type column? It has text values like "CASH_OUT", "TRANSFER" etc.
Machine Learning models only understand numbers, not text!
So we need to convert text → numbers. This is called Encoding.

In [7]:
# Before encoding
print("BEFORE encoding:")
print(df['type'].value_counts())

# Convert text to numbers
df['type'] = df['type'].map({
    'CASH_IN'  : 0,
    'CASH_OUT' : 1,
    'DEBIT'    : 2,
    'PAYMENT'  : 3,
    'TRANSFER' : 4
})

# After encoding
print("\nAFTER encoding:")
print(df['type'].value_counts())

print("\nFirst 5 rows:")
print(df.head())

BEFORE encoding:
type
CASH_OUT    2237500
PAYMENT     2151495
CASH_IN     1399284
TRANSFER     532909
DEBIT         41432
Name: count, dtype: int64

AFTER encoding:
type
1    2237500
3    2151495
0    1399284
4     532909
2      41432
Name: count, dtype: int64

First 5 rows:
   step  type    amount  oldbalanceOrg  newbalanceOrig  oldbalanceDest  \
0     1     3   9839.64       170136.0       160296.36             0.0   
1     1     3   1864.28        21249.0        19384.72             0.0   
2     1     4    181.00          181.0            0.00             0.0   
3     1     1    181.00          181.0            0.00         21182.0   
4     1     3  11668.14        41554.0        29885.86             0.0   

   newbalanceDest  isFraud  
0             0.0        0  
1             0.0        0  
2             0.0        1  
3             0.0        1  
4             0.0        0  


We converted text → numbers so the model can understand:
Text        Number
CASH_IN       0
CASH_OUT      1
DEBIT         2
PAYMENT       3
TRANSFER      4
The counts stayed exactly the same; we just replaced the labels with numbers. Nothing else changed!
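
One thing to watch with `.map()`: any value that is missing from the dictionary silently becomes NaN instead of raising an error. The sketch below uses a tiny made-up Series (not the real data) to show a simple safety check, and what happens if the same mapping is applied a second time. `type_codes` is an illustrative name.

```python
import pandas as pd

type_codes = {"CASH_IN": 0, "CASH_OUT": 1, "DEBIT": 2, "PAYMENT": 3, "TRANSFER": 4}

# Tiny made-up sample, only for illustration
types = pd.Series(["PAYMENT", "TRANSFER", "CASH_OUT"])

encoded = types.map(type_codes)
# Fail loudly if any label was missing from the dictionary
assert encoded.notna().all(), "unmapped transaction type found"
print(encoded.tolist())

# Mapping again looks up the numbers 3, 4, 1 as keys; none exist, so every value becomes NaN
encoded_twice = encoded.map(type_codes)
print(encoded_twice.isna().all())
```

For that reason the encoding should run exactly once on a freshly loaded DataFrame.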

Step 3:
Remember the golden insight we found earlier?

98% of fraud drains the account to zero (amount = oldbalanceOrg)

Let's now create a new smart column that captures this pattern:
balanceErrorOrig = oldbalanceOrg - amount - newbalanceOrig

For fraud → this should be close to 0 (account perfectly drained)
For legit → this may vary

We'll create one for the sender and one for the recipient.
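
Before running it on millions of rows, the formula can be checked by hand on a few transactions. The sketch below uses three rows copied from the outputs printed earlier; `balance_error_orig` is a hypothetical helper written only for this illustration.

```python
# (amount, oldbalanceOrg, newbalanceOrig) copied from rows printed earlier
rows = {
    "legit row 0":   (9839.64, 170136.0, 160296.36),
    "fraud row 2":   (181.00, 181.0, 0.0),
    "fraud row 724": (416001.33, 0.0, 0.0),
}

def balance_error_orig(amount, old_balance, new_balance):
    """Money the sender's balances fail to account for (0 means the books balance)."""
    return old_balance - amount - new_balance

for label, (amount, old, new) in rows.items():
    error = balance_error_orig(amount, old, new)
    print(f"{label:<14} balanceErrorOrig = {error:.2f}")
```

Row 2 balances exactly, as expected for a drained account. Row 0 shows that a legitimate payment can balance too, and row 724 is a fraud that does not, so the new column is best treated as a useful signal for the model rather than a rule on its own.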

In [8]:
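# 'type' was already encoded in the previous cell (In [7]).
# Mapping it a second time would turn every value into NaN,
# because the numbers 0-4 are not keys of the dictionary,
# so the mapping is not repeated here.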

# Create new smart features
df['balanceErrorOrig'] = df['oldbalanceOrg'] - df['amount'] - df['newbalanceOrig']
df['balanceErrorDest'] = df['oldbalanceDest'] + df['amount'] - df['newbalanceDest']

# Check how these look for fraud vs legit
fraud = df[df['isFraud'] == 1]
legit = df[df['isFraud'] == 0]

print("balanceErrorOrig:")
print(f"  Fraud mean : {fraud['balanceErrorOrig'].mean():.2f}")
print(f"  Legit mean : {legit['balanceErrorOrig'].mean():.2f}")

print("\nbalanceErrorDest:")
print(f"  Fraud mean : {fraud['balanceErrorDest'].mean():.2f}")
print(f"  Legit mean : {legit['balanceErrorDest'].mean():.2f}")

print("\nNew shape:", df.shape)
print("\nColumns now:", df.columns.tolist())

balanceErrorOrig:
  Fraud mean : -10692.33
  Legit mean : -201338.56

balanceErrorDest:
  Fraud mean : 732509.30
  Legit mean : 54692.23

New shape: (6362620, 10)

Columns now: ['step', 'type', 'amount', 'oldbalanceOrg', 'newbalanceOrig', 'oldbalanceDest', 'newbalanceDest', 'isFraud', 'balanceErrorOrig', 'balanceErrorDest']


Step 4: Split the Data
Before training any model we need to split the data into two parts:

Training set (80%) → Model learns from this
Test set (20%) → We test how good the model is on data it has never seen.

In the next cell we split the data; the cell after it trains a Logistic Regression model.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split    # was missing before

df = pd.read_csv("payment.csv")
df = df.drop(columns=['nameOrig', 'nameDest', 'isFlaggedFraud'])
df['type'] = df['type'].map({
    'CASH_IN'  : 0,
    'CASH_OUT' : 1,
    'DEBIT'    : 2,
    'PAYMENT'  : 3,
    'TRANSFER' : 4
})
df['balanceErrorOrig'] = df['oldbalanceOrg'] - df['amount'] - df['newbalanceOrig']
df['balanceErrorDest'] = df['oldbalanceDest'] + df['amount'] - df['newbalanceDest']

# Separate features (X) and target (y)
X = df.drop(columns=['isFraud'])
y = df['isFraud']

print("X shape (features):", X.shape)
print("y shape (target)  :", y.shape)

# Split into train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y
)

print("\nTraining set size :", X_train.shape[0])
print("Test set size     :", X_test.shape[0])

print("\nFraud in training set:", y_train.sum())
print("Fraud in test set    :", y_test.sum())

X shape (features): (6362620, 9)
y shape (target)  : (6362620,)

Training set size : 5090096
Test set size     : 1272524

Fraud in training set: 6570
Fraud in test set    : 1643
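
Passing `stratify=y` asks `train_test_split` to keep the fraud share the same in both parts. A quick way to confirm this from the printed numbers (variable names are illustrative, values copied from the output above):

```python
train_rows, test_rows = 5090096, 1272524
train_fraud, test_fraud = 6570, 1643

# Fraction of all rows that ended up in the test set
test_fraction = test_rows / (train_rows + test_rows)

# Fraud share in each part, in percent
train_fraud_pct = 100 * train_fraud / train_rows
test_fraud_pct = 100 * test_fraud / test_rows

print(f"Test fraction      : {test_fraction:.3f}")
print(f"Fraud in train (%) : {train_fraud_pct:.4f}")
print(f"Fraud in test  (%) : {test_fraud_pct:.4f}")
```

Both parts should show a fraud share of about 0.13%, matching the full dataset. Without stratification, a rare class like this could by chance end up over- or under-represented in the test set.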


In [3]:
from sklearn.linear_model import LogisticRegression
# Train the model
print("Training model... (this may take a minute)")
model = LogisticRegression(max_iter=1000)
# Logistic Regression learns by repeatedly adjusting its weights to get better, like a student correcting mistakes after each practice test.
# - Each correction = 1 iteration
# - max_iter=1000 → allow up to 1000 corrections before stopping

model.fit(X_train, y_train)
# - X_train → show the model the transaction details
# - y_train → show the model the correct answers (fraud or not)
# - .fit() → fit the model to the data = learn the patterns


print("✅ Model trained successfully!")

# Quick check: score on test set
score = model.score(X_test, y_test)
print(f"\nAccuracy: {score * 100:.2f}%")
print("\nNOTE: Accuracy alone is misleading for fraud detection!")
print("We will evaluate properly in the next step.")

Training model... (this may take a minute)
✅ Model trained successfully!

Accuracy: 99.82%

NOTE: Accuracy alone is misleading for fraud detection!
We will evaluate properly in the next step.


Step 5: Proper Evaluation
We'll use better metrics: Precision, Recall and F1 Score. These actually tell us how many frauds we caught!
Accuracy = 99.82% looks great, but a model that labels every transaction as legitimate would score 99.87% (1,270,881 of 1,272,524 test rows) while catching ZERO fraud! So we need better metrics.
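
To see where a figure like 99.87% comes from, consider a "model" that answers "legit" for every single row. The short calculation below uses the test-set sizes from the split above; the variable names are illustrative.

```python
test_rows = 1272524      # size of the test set
test_fraud = 1643        # fraud rows in the test set

# Answering "legit" every time gets all legit rows right and all fraud rows wrong
always_legit_accuracy = (test_rows - test_fraud) / test_rows
always_legit_recall = 0 / test_fraud   # it never flags a single fraud

print(f"Accuracy    : {100 * always_legit_accuracy:.2f}%")
print(f"Fraud recall: {100 * always_legit_recall:.0f}%")
```

Any model has to be judged against this do-nothing baseline, which is why accuracy on its own says very little here.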

In [4]:
from sklearn.metrics import classification_report, confusion_matrix
# Proper evaluation
y_pred = model.predict(X_test)

print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))

print("\nDetailed Report:")
print(classification_report(y_test, y_pred, target_names=['Legit','Fraud']))

Confusion Matrix:
[[1269575    1306]
 [    925     718]]

Detailed Report:
              precision    recall  f1-score   support

       Legit       1.00      1.00      1.00   1270881
       Fraud       0.35      0.44      0.39      1643

    accuracy                           1.00   1272524
   macro avg       0.68      0.72      0.70   1272524
weighted avg       1.00      1.00      1.00   1272524
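
The Fraud row of the report can be reproduced by hand from the confusion matrix, which helps to see what each metric means. In scikit-learn's layout the rows are the actual classes and the columns are the predicted ones; the variable names below are the usual abbreviations, not names from the notebook.

```python
# Confusion matrix from the output above: rows = actual, columns = predicted
tn, fp = 1269575, 1306   # actual legit: correctly passed / wrongly flagged
fn, tp = 925, 718        # actual fraud: missed / caught

precision = tp / (tp + fp)   # of all fraud alerts, how many were real fraud?
recall = tp / (tp + fn)      # of all real frauds, how many did we catch?
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"Precision: {precision:.2f}")
print(f"Recall   : {recall:.2f}")
print(f"F1       : {f1:.2f}")
print(f"Accuracy : {100 * accuracy:.2f}%")
```

So the model missed 925 of the 1,643 frauds, and most of its fraud alerts were false alarms, even though the accuracy still rounds to 1.00.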



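From this we see that Logistic Regression struggles with the imbalance: fraud is only about 0.13% of the rows, so the model sees very few fraud examples and does not learn fraud patterns well enough. On the test set it caught only 718 of 1,643 frauds (44% recall), and only 35% of its fraud alerts were real fraud. We need a more powerful model. Next we'll try Random Forest, which often copes better with data like this and can find more complex fraud patterns.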
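
One common way to make Logistic Regression pay more attention to the rare class is scikit-learn's `class_weight="balanced"` option, usually combined with feature scaling. This is not what this notebook does; the sketch below only illustrates the idea on synthetic data. The toy data and all variable names are made up, and the models are scored on their own training data purely to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, heavily imbalanced toy data: 5000 "legit" points and 50 "fraud" points
rng = np.random.default_rng(0)
X_toy = np.vstack([rng.normal(0, 1, (5000, 2)), rng.normal(2, 1, (50, 2))])
y_toy = np.array([0] * 5000 + [1] * 50)

# Same model twice: once with default weights, once re-weighted by class frequency
plain = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
weighted = make_pipeline(
    StandardScaler(), LogisticRegression(max_iter=1000, class_weight="balanced")
)
plain.fit(X_toy, y_toy)
weighted.fit(X_toy, y_toy)

recall_plain = recall_score(y_toy, plain.predict(X_toy))
recall_weighted = recall_score(y_toy, weighted.predict(X_toy))
print(f"Recall, default weights : {recall_plain:.2f}")
print(f"Recall, balanced weights: {recall_weighted:.2f}")
```

Re-weighting usually raises recall at the cost of more false alarms, so precision would need to be checked as well before preferring it.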

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

df = pd.read_csv("payment.csv")
df = df.drop(columns=['nameOrig', 'nameDest', 'isFlaggedFraud'])
df['type'] = df['type'].map({
    'CASH_IN'  : 0,
    'CASH_OUT' : 1,
    'DEBIT'    : 2,
    'PAYMENT'  : 3,
    'TRANSFER' : 4
})
df['balanceErrorOrig'] = df['oldbalanceOrg'] - df['amount'] - df['newbalanceOrig']
df['balanceErrorDest'] = df['oldbalanceDest'] + df['amount'] - df['newbalanceDest']

X = df.drop(columns=['isFraud'])
y = df['isFraud']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Random Forest
print("Training Random Forest... (may take 2-3 minutes)")
model = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model.fit(X_train, y_train)
print("✅ Done!")

y_pred = model.predict(X_test)

print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))

print("\nDetailed Report:")
print(classification_report(y_test, y_pred, target_names=['Legit','Fraud']))

Training Random Forest... (may take 2-3 minutes)
✅ Done!

Confusion Matrix:
[[1270881       0]
 [      4    1639]]

Detailed Report:
              precision    recall  f1-score   support

       Legit       1.00      1.00      1.00   1270881
       Fraud       1.00      1.00      1.00      1643

    accuracy                           1.00   1272524
   macro avg       1.00      1.00      1.00   1272524
weighted avg       1.00      1.00      1.00   1272524



These results look incredible. The model missed only 4 frauds out of 1,643. That's 99.76% recall on fraud!
In real-world fraud detection, results this good are often a warning sign of overfitting.
How Do We Check If It's Overfitting?

In [2]:
# Check training data performance
y_train_pred = model.predict(X_train)

# Check test data performance
y_test_pred = model.predict(X_test)

print("📊 TRAINING DATA Performance:")
print(classification_report(y_train, y_train_pred, target_names=['Legit','Fraud']))

print("📊 TEST DATA Performance:")
print(classification_report(y_test, y_test_pred, target_names=['Legit','Fraud']))

print("💡 If training is much better than test → Overfitting!")
print("💡 If both are similar → Model learned genuinely!")

📊 TRAINING DATA Performance:
              precision    recall  f1-score   support

       Legit       1.00      1.00      1.00   5083526
       Fraud       1.00      1.00      1.00      6570

    accuracy                           1.00   5090096
   macro avg       1.00      1.00      1.00   5090096
weighted avg       1.00      1.00      1.00   5090096

📊 TEST DATA Performance:
              precision    recall  f1-score   support

       Legit       1.00      1.00      1.00   1270881
       Fraud       1.00      1.00      1.00      1643

    accuracy                           1.00   1272524
   macro avg       1.00      1.00      1.00   1272524
weighted avg       1.00      1.00      1.00   1272524

💡 If training is much better than test → Overfitting!
💡 If both are similar → Model learned genuinely!


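Training and test performance are practically identical (every score rounds to 1.00 on both), and this is actually good news! An overfit model does much better on the data it was trained on than on data it has never seen. Here there is no such gap, so the model has not simply memorized the training set.
This is NOT Overfitting.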
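
Because the reports round to two decimals, small differences are invisible in them. As a sketch, the unrounded fraud metrics for the test set can be computed from the Random Forest confusion matrix printed earlier (variable names are the usual abbreviations, not names from the notebook):

```python
# Test-set confusion matrix of the Random Forest: rows = actual, columns = predicted
tn, fp = 1270881, 0
fn, tp = 4, 1639

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)

print(f"Fraud precision: {precision:.4f}")
print(f"Fraud recall   : {recall:.4f}")
print(f"Fraud F1       : {f1:.4f}")
```

The 4 missed frauds show up only from the third decimal place on. Passing `digits=4` to `classification_report` prints the same extra precision for both the training and the test report, which makes any remaining gap between them easier to spot.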

# Phase 3: Save the Model
We need to save the trained model to a file. Otherwise every time we restart Python we'd have to retrain from scratch, which takes minutes!

In [None]:
import os
import joblib

# Save the model
os.makedirs("artifacts", exist_ok=True)
joblib.dump(model, "artifacts/fraud_model.pkl")

print("✅ Model saved to artifacts/fraud_model.pkl")

# Verify it works by loading it back
loaded_model = joblib.load("artifacts/fraud_model.pkl")
print("✅ Model loaded back successfully!")

# Quick test with one prediction
sample = X_test.iloc[0:1]
prediction = loaded_model.predict(sample)
probability = loaded_model.predict_proba(sample)[0][1]

print("\nSample transaction prediction:")
print(f"   Fraud?       : {'Yes 🚨' if prediction[0] == 1 else 'No ✅'}")
print(f"   Probability  : {probability*100:.2f}%")

In [1]:
import os
print("Current working directory:")
print(os.getcwd())

Current working directory:
C:\Users\SVREC-AI\Downloads\Online Payment Fraud Detection


In [2]:
import os

# List all files in current folder
print("Files in your folder:")
for file in os.listdir():
    print(" ", file)

Files in your folder:
  .ipynb_checkpoints
  artifacts
  payment.csv
  Untitled.ipynb


In [3]:
import joblib
import os

# Check if artifacts folder exists
print("artifacts folder exists:", os.path.exists("artifacts"))

# Try saving a simple test file
test_data = {"test": 123}
joblib.dump(test_data, "artifacts/test.pkl")
print("✅ Saving works!")

# Load it back
loaded = joblib.load("artifacts/test.pkl")
print("✅ Loading works!")
print("Contents:", loaded)

artifacts folder exists: True
✅ Saving works!
✅ Loading works!
Contents: {'test': 123}


In [4]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import joblib

# Load with correct filename
df = pd.read_csv("payment.csv")
df = df.drop(columns=['nameOrig', 'nameDest', 'isFlaggedFraud'])
df['type'] = df['type'].map({
    'CASH_IN'  : 0,
    'CASH_OUT' : 1,
    'DEBIT'    : 2,
    'PAYMENT'  : 3,
    'TRANSFER' : 4
})
df['balanceErrorOrig'] = df['oldbalanceOrg'] - df['amount'] - df['newbalanceOrig']
df['balanceErrorDest'] = df['oldbalanceDest'] + df['amount'] - df['newbalanceDest']

X = df.drop(columns=['isFraud'])
y = df['isFraud']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train
print("Training model... please wait")
model = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
model.fit(X_train, y_train)
print("✅ Model trained!")

# Save model
joblib.dump(model, "artifacts/fraud_model.pkl")
print("✅ Model saved to artifacts/fraud_model.pkl")

# Save feature names (we need this later for the API)
feature_names = list(X.columns)
joblib.dump(feature_names, "artifacts/feature_names.pkl")
print("✅ Feature names saved!")
print("\nFeatures:", feature_names)

Training model... please wait
✅ Model trained!
✅ Model saved to artifacts/fraud_model.pkl
✅ Feature names saved!

Features: ['step', 'type', 'amount', 'oldbalanceOrg', 'newbalanceOrig', 'oldbalanceDest', 'newbalanceDest', 'balanceErrorOrig', 'balanceErrorDest']
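
The saved feature order matters because the model expects its inputs in exactly the order it was trained on. As a rough sketch of how a later API might use `feature_names`, the hypothetical helper below turns one raw transaction into a correctly ordered list of values, repeating the same encoding and feature engineering as the training code. `build_feature_row` and `TYPE_CODES` are made-up names and not part of this notebook.

```python
TYPE_CODES = {"CASH_IN": 0, "CASH_OUT": 1, "DEBIT": 2, "PAYMENT": 3, "TRANSFER": 4}

# Same order as the feature list printed above
feature_names = ['step', 'type', 'amount', 'oldbalanceOrg', 'newbalanceOrig',
                 'oldbalanceDest', 'newbalanceDest', 'balanceErrorOrig', 'balanceErrorDest']

def build_feature_row(tx: dict, feature_names: list) -> list:
    """Hypothetical helper: encode and engineer features for one raw transaction."""
    row = dict(tx)
    row["type"] = TYPE_CODES[tx["type"]]  # raises KeyError on an unknown type
    row["balanceErrorOrig"] = tx["oldbalanceOrg"] - tx["amount"] - tx["newbalanceOrig"]
    row["balanceErrorDest"] = tx["oldbalanceDest"] + tx["amount"] - tx["newbalanceDest"]
    # Return the values in the exact order the model was trained with
    return [row[name] for name in feature_names]

# Row 3 from the first preview of the data (a CASH_OUT fraud)
tx = {"step": 1, "type": "CASH_OUT", "amount": 181.0,
      "oldbalanceOrg": 181.0, "newbalanceOrig": 0.0,
      "oldbalanceDest": 21182.0, "newbalanceDest": 0.0}

values = build_feature_row(tx, feature_names)
print(dict(zip(feature_names, values)))
```

Wrapping the result as `pd.DataFrame([values], columns=feature_names)` before calling `predict` on the loaded model keeps the column names the model was fitted with.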
