## Kaggle Credit Card Fraud Detection Dataset
Source: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

### Importing necessary libraries

* `pandas`: loads and manipulates the CSV dataset
* `train_test_split`: splits data into training and testing sets
* `StandardScaler`: standardizes continuous features (e.g., `Time`, `Amount`)
* `SMOTE`: addresses class imbalance by oversampling the minority class
* `RandomForestClassifier`: bagged ensemble of decision trees; fast and robust
* `GradientBoostingClassifier`: boosted ensemble of decision trees; often more accurate but slower to train
* `precision_score`: evaluates exactness (how many predicted frauds were right)
* `recall_score`: evaluates completeness (how many real frauds were caught)
* `f1_score`: balances precision and recall
* `warnings.filterwarnings("ignore")`: cleans up the output by suppressing all warnings

In [1]:
# pandas: for loading and manipulating the dataset
import pandas as pd

# train_test_split: to split the dataset into training and testing subsets
from sklearn.model_selection import train_test_split

# StandardScaler: to normalize the 'Time' and 'Amount' features
from sklearn.preprocessing import StandardScaler

# SMOTE: Synthetic Minority Oversampling Technique to handle class imbalance
from imblearn.over_sampling import SMOTE

# Machine Learning Models:
# RandomForestClassifier: an ensemble method using decision trees
# GradientBoostingClassifier: boosting method that builds decision trees sequentially
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Evaluation metrics:
# precision_score: measures how many predicted frauds were actually frauds
# recall_score: measures how many actual frauds were detected
# f1_score: harmonic mean of precision and recall, useful for imbalanced classes
from sklearn.metrics import precision_score, recall_score, f1_score

# Warnings: Suppresses warning messages for cleaner output
import warnings
warnings.filterwarnings("ignore")

### 📥 Load Dataset

The dataset is loaded from a local file using `pandas.read_csv()`. Make sure the dataset (`creditcard.csv`) exists at the path passed to `read_csv()` in the cell below, and adjust that path to match your own machine.



In [4]:
# Ensure the file 'creditcard.csv' exists at the specified path before running
data = pd.read_csv(r"C:\Users\Ak\Desktop\Developers Hub Internship\Fraud Detection\creditcard.csv")

###  Create a Balanced Subset (In-Memory)

Since fraud cases are rare, the dataset is heavily imbalanced. To build a balanced dataset quickly, the majority class is downsampled:

1. **Separate Classes**:
   - Extract all fraudulent transactions (`Class == 1`)
   - Extract all legitimate transactions (`Class == 0`)

2. **Downsample Legitimate Transactions**:
   - Randomly sample a number of legitimate transactions equal to the number of fraud cases

3. **Combine & Shuffle**:
   - Concatenate the fraud cases and the sampled legitimate cases
   - Shuffle the combined dataset to ensure randomness

This technique helps in:
- Faster training and testing
- Avoiding model bias toward the majority class


In [5]:
# Separate fraudulent and legitimate transactions
fraud = data[data['Class'] == 1]   # All fraud cases
legit = data[data['Class'] == 0]   # All non-fraud cases

# 🧪 Downsample legitimate transactions to match the number of fraud cases
legit_sample = legit.sample(n=len(fraud), random_state=42)  # Random sample for balance

# 🧬 Combine the fraud and downsampled legitimate transactions into one DataFrame
data_subset = pd.concat([fraud, legit_sample])

# 🔀 Shuffle the resulting dataset to randomize row order
data_subset = data_subset.sample(frac=1, random_state=42).reset_index(drop=True)
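
As a quick sanity check, the same separate, downsample, combine, and shuffle steps can be tried on a small synthetic frame. The sketch below uses a hypothetical `toy` DataFrame (not the real dataset) and only confirms that both classes end up with the same number of rows.

```python
import pandas as pd

# Hypothetical toy data: 95 legitimate rows (Class 0) and 5 fraud rows (Class 1)
toy = pd.DataFrame({
    "Amount": range(100),
    "Class": [0] * 95 + [1] * 5,
})

fraud = toy[toy["Class"] == 1]
legit = toy[toy["Class"] == 0]

# Downsample the majority class to the size of the minority class
legit_sample = legit.sample(n=len(fraud), random_state=42)

# Combine, shuffle, and reset the index
balanced = (
    pd.concat([fraud, legit_sample])
    .sample(frac=1, random_state=42)
    .reset_index(drop=True)
)

# Count rows per class; both classes should have 5 rows
counts = balanced["Class"].value_counts().to_dict()
print(counts)
```

Running `value_counts()` on the real `data_subset` in the same way is a cheap check that the downsampling behaved as intended.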


###  Feature and Label Separation

To train a machine learning model, the dataset is divided into:

- **Features (`X`)**: All columns except `'Class'`, representing the transaction attributes (e.g., `Time`, `V1` to `V28`, `Amount`)
- **Labels (`y`)**: The `'Class'` column, where:
  - `0` = Legitimate transaction
  - `1` = Fraudulent transaction

This separation is essential for supervised learning, where the model learns patterns from features to predict labels.


In [6]:
#  Separate input features (X) and target labels (y)
X = data_subset.drop('Class', axis=1)   # Features: all columns except 'Class'
y = data_subset['Class']                # Labels: the 'Class' column indicating fraud (1) or legitimate (0)

###  Train-Test Split

The dataset is divided into training and testing subsets using `train_test_split()`:

- **Training Set (80%)**: Used to train the machine learning model.
- **Testing Set (20%)**: Used to evaluate model performance on unseen data.
- **Stratified Sampling**: The `stratify=y` parameter ensures that the proportion of fraud and legitimate transactions is consistent across both sets.
- **Random State**: The seed (`random_state=42`) ensures the split is reproducible each time the script runs.


In [None]:
#  Split the balanced dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X,                  # Input features
    y,                  # Target labels
    test_size=0.2,      # 20% of the data for testing, 80% for training
    stratify=y,         # Ensure class proportions are preserved in both sets
    random_state=42     # Fixed seed for reproducibility
)
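
To see what `stratify` does in isolation, here is a minimal sketch on hypothetical toy labels (`X_toy` and `y_toy` are illustrative names, not part of the notebook). With a balanced input, both subsets should keep the same 50/50 class ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical balanced toy labels: 50 legitimate (0) and 50 fraud (1)
X_toy = np.arange(100).reshape(-1, 1)
y_toy = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, stratify=y_toy, random_state=42
)

# Count samples per class in each subset
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print("train:", train_counts, "test:", test_counts)
```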

###  Standardization of Features

The `'Time'` and `'Amount'` features are on very different scales from each other and from `V1` to `V28`, which can affect models that are sensitive to feature scale. To normalize them:

- Two separate `StandardScaler` objects are created:
  - `scaler_time`: Standardizes the `'Time'` feature
  - `scaler_amount`: Standardizes the `'Amount'` feature

Standardization transforms data to have a mean of 0 and a standard deviation of 1, which helps many machine learning algorithms perform better.
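
The transformation itself is the z-score, `(x - mean) / std`. The sketch below reproduces it in plain Python on hypothetical amounts. It uses the population standard deviation, which is also what `StandardScaler` uses. The helper name `standardize` is illustrative only.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Return z-scores: (x - mean) / std, using the population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    return [(x - mean) / std for x in values]

# Hypothetical transaction amounts (mean 5, population std 2)
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(amounts)

# After standardization the mean is 0 and the standard deviation is 1
print(round(fmean(scaled), 6), round(pstdev(scaled), 6))
```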


In [9]:
#  Initialize standard scalers for continuous numeric features
scaler_time = StandardScaler()     # Scaler for the 'Time' feature
scaler_amount = StandardScaler()   # Scaler for the 'Amount' feature

###  Apply Feature Scaling to 'Time' and 'Amount'

After initializing `StandardScaler` objects, we apply them to standardize the `'Time'` and `'Amount'` features:

1. **Training Set**:
   - `.fit_transform()` is used to calculate scaling parameters (mean, std) and apply them to the training data.
2. **Test Set**:
   - `.transform()` is used to apply the same scaling (from training data) to the test set. This ensures consistency and prevents data leakage.

This step ensures both sets have features on the same scale, improving model learning and accuracy.
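
The fit-on-train, transform-on-test pattern can be illustrated on a few hypothetical values. In this sketch the scaler learns its mean and standard deviation from `train_amount` only, and `test_amount` is scaled with those same parameters.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical 'Amount' values; the test values are never seen during fitting
train_amount = np.array([[10.0], [20.0], [30.0]])
test_amount = np.array([[20.0], [50.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_amount)  # learns mean and std from the training data
test_scaled = scaler.transform(test_amount)        # reuses the training mean and std

print(scaler.mean_, scaler.scale_)
print(test_scaled.ravel())
```

A test value equal to the training mean maps to 0, while a value outside the training range maps to a large z-score instead of being re-centered. That is the expected behavior when no test-set statistics are used.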


In [None]:
#  Work on explicit copies so the column assignments below modify independent DataFrames
X_train = X_train.copy()
X_test = X_test.copy()

#  Apply standardization to the 'Time' and 'Amount' features in the training set
X_train['Time'] = scaler_time.fit_transform(X_train[['Time']])        # Fit & transform 'Time'
X_train['Amount'] = scaler_amount.fit_transform(X_train[['Amount']])  # Fit & transform 'Amount'

#  Apply the same transformation to the test set using the already-fitted scalers
X_test['Time'] = scaler_time.transform(X_test[['Time']])            # Transform 'Time' using training scaler
X_test['Amount'] = scaler_amount.transform(X_test[['Amount']])      # Transform 'Amount' using training scaler

###  Handle Class Imbalance with SMOTE

**SMOTE** is normally used to keep a model from being biased toward the majority class (legitimate transactions):

- **SMOTE (Synthetic Minority Oversampling Technique)** creates synthetic examples of the minority class by interpolating between a minority sample and its nearest minority-class neighbors in feature space.
- It is applied **only to the training data** so that no synthetic samples leak into the test set.
- The result is a balanced training set: `X_train_res` and `y_train_res` contain equal numbers of fraud and non-fraud samples.

Because the subset was already balanced by downsampling and the split is stratified, the two training classes differ by at most one sample here, so SMOTE changes very little in this notebook. It matters most when training on the original, heavily imbalanced data.
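
The core idea behind SMOTE's synthetic samples is linear interpolation between two minority-class points. The sketch below is a simplified illustration of that single step, not the `imblearn` implementation (which also performs a nearest-neighbor search). The function name `smote_like_sample` and the sample points are hypothetical.

```python
import random

def smote_like_sample(point, neighbor, rng):
    """Return a synthetic point on the segment between `point` and `neighbor`."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [p + lam * (n - p) for p, n in zip(point, neighbor)]

rng = random.Random(42)

# Two hypothetical minority-class samples in a 2-D feature space
fraud_a = [1.0, 2.0]
fraud_b = [3.0, 6.0]

# The synthetic sample lies somewhere on the line between the two points
synthetic = smote_like_sample(fraud_a, fraud_b, rng)
print(synthetic)
```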


In [14]:
# Apply SMOTE (Synthetic Minority Oversampling Technique) to balance the training data
smote = SMOTE(random_state=42)                  # Initialize SMOTE with a fixed random seed
X_train_res, y_train_res = smote.fit_resample(  # Generate synthetic examples for the minority class
    X_train, y_train                            # Only applied to training data to avoid data leakage
)

###  Initialize Machine Learning Models

Two powerful ensemble classifiers are used for fraud detection:

- **RandomForestClassifier**
  - An ensemble of decision trees trained with bagging.
  - Fast to train and relatively resistant to overfitting.
  - Good baseline model for classification tasks.

- **GradientBoostingClassifier**
  - Builds trees sequentially, with each tree correcting the errors of the previous ones.
  - Often achieves higher accuracy but is slower to train.
  - More sensitive to hyperparameter tuning.

Both models are initialized with a fixed `random_state=42` for reproducibility.


In [15]:
#  Initialize two machine learning models for training

#  Random Forest Classifier: Ensemble of decision trees for fast, robust classification
rf_model = RandomForestClassifier(random_state=42)

#  Gradient Boosting Classifier: Boosting technique for better performance on complex patterns
gb_model = GradientBoostingClassifier(random_state=42)

###  Train the Models

After resampling the training data to balance the classes using **SMOTE**, we train both machine learning models:

- **Random Forest** is trained on the resampled training data (`X_train_res` and `y_train_res`). This ensemble model aggregates predictions from multiple decision trees, making it more robust.
  
- **Gradient Boosting** is also trained on the same resampled data. This model sequentially builds trees that learn from the mistakes of the previous ones.

Both models are trained using the **balanced** data, which helps improve the detection of fraud.

In [None]:
# Random Forest Model
rf_model.fit(X_train_res, y_train_res)  # Train the Random Forest model on the resampled training data

In [18]:
# Gradient Boosting Model
gb_model.fit(X_train_res, y_train_res)  # Train the Gradient Boosting model on the resampled training data

### Evaluate Model Performance on the Test Set

After training the models on the resampled training data, we evaluate their performance on the test set using key metrics:

1. **Predictions**: 
   - Both the Random Forest and Gradient Boosting models generate predictions on the test data (`X_test`).
   
2. **Performance Metrics**:
   - **Precision**: Measures how many of the predicted fraud cases are actually fraud.
     - Formula: `Precision = TP / (TP + FP)`
   - **Recall**: Measures how many of the actual fraud cases were detected.
     - Formula: `Recall = TP / (TP + FN)`
   - **F1 Score**: A balanced metric that considers both Precision and Recall.
     - Formula: `F1 = 2 * (Precision * Recall) / (Precision + Recall)`

Both models' precision, recall, and F1 scores are displayed to compare their effectiveness in fraud detection.
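
The three formulas can be checked by hand on a small example. This is a minimal sketch with hypothetical labels. The helper `precision_recall_f1` is illustrative, and returning `0.0` when a denominator is zero is an assumption of this sketch.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels: 4 real frauds, 3 of them caught, plus 1 false alarm
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
# prints: 0.75 0.75 0.75
```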


In [19]:
#  Evaluate models on the test set

# Make predictions using the Random Forest model
y_pred_rf = rf_model.predict(X_test)

# Make predictions using the Gradient Boosting model
y_pred_gb = gb_model.predict(X_test)

# 📝 Display model performance metrics on the test set
print("\n Model Performance on Test Set:")

# Random Forest Performance Metrics
print("Random Forest:")
print(f"  Precision: {precision_score(y_test, y_pred_rf):.4f}")  # Precision: true positives / (true positives + false positives)
print(f"  Recall:    {recall_score(y_test, y_pred_rf):.4f}")     # Recall: true positives / (true positives + false negatives)
print(f"  F1 Score:  {f1_score(y_test, y_pred_rf):.4f}")         # F1 Score: harmonic mean of Precision and Recall

# Gradient Boosting Performance Metrics
print("\nGradient Boosting:")
print(f"  Precision: {precision_score(y_test, y_pred_gb):.4f}")  # Precision: true positives / (true positives + false positives)
print(f"  Recall:    {recall_score(y_test, y_pred_gb):.4f}")     # Recall: true positives / (true positives + false negatives)
print(f"  F1 Score:  {f1_score(y_test, y_pred_gb):.4f}")         # F1 Score: harmonic mean of Precision and Recall


 Model Performance on Test Set:
Random Forest:
  Precision: 0.9783
  Recall:    0.9184
  F1 Score:  0.9474

Gradient Boosting:
  Precision: 0.9674
  Recall:    0.9082
  F1 Score:  0.9368


### Model Evaluation: Accuracy, Classification Report, and Confusion Matrix

Once the models have made predictions on the test data, we evaluate their performance using various metrics:

1. **Accuracy**:
   - **Accuracy** is the percentage of correct predictions made by the model.
   - It is calculated using the formula: `Accuracy = (TP + TN) / (TP + TN + FP + FN)`
   - Accuracy for both models (Random Forest and Gradient Boosting) is printed.

2. **Classification Report**:
   - This report includes the following metrics for each class:
     - **Precision**: The proportion of positive predictions that were correct.
     - **Recall**: The proportion of actual positives that were correctly predicted.
     - **F1-Score**: The harmonic mean of Precision and Recall, balancing both metrics.
     - **Support**: The number of true instances for each class.
   - A detailed classification report is printed for both models.

3. **Confusion Matrix**:
   - The **Confusion Matrix** is used to evaluate classification performance by comparing the predicted labels (`y_pred_rf` and `y_pred_gb`) with the actual labels (`y_test`).
   - It contains four key values:
     - **True Positive (TP)**: Fraudulent transactions correctly classified as fraud.
     - **True Negative (TN)**: Legitimate transactions correctly classified as legitimate.
     - **False Positive (FP)**: Legitimate transactions incorrectly classified as fraud.
     - **False Negative (FN)**: Fraudulent transactions incorrectly classified as legitimate.
   - The confusion matrices for both models are printed, showing the detailed breakdown of predictions.

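These metrics give a comprehensive view of how well each model is performing. Note that the test set here is balanced (99 legitimate and 98 fraudulent transactions), so accuracy is informative; on the original, imbalanced data, precision, recall, and F1 score would be the more meaningful metrics.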


In [21]:
#  Import additional evaluation metrics
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

#  Accuracy Calculation
# Calculate accuracy for both models
acc_rf = accuracy_score(y_test, y_pred_rf)  # Accuracy for Random Forest
acc_gb = accuracy_score(y_test, y_pred_gb)  # Accuracy for Gradient Boosting

# Print the accuracy of both models
print(f"\nRandom Forest Accuracy: {acc_rf:.4f}")
print(f"Gradient Boosting Accuracy: {acc_gb:.4f}")

#  Detailed classification report for both models
print("\n Classification Report (Random Forest):")
print(classification_report(y_test, y_pred_rf))  # Precision, Recall, F1 Score, and Support for RF

print("\n Classification Report (Gradient Boosting):")
print(classification_report(y_test, y_pred_gb))  # Precision, Recall, F1 Score, and Support for GB

#  Confusion Matrix for both models
cm_rf = confusion_matrix(y_test, y_pred_rf)  # Confusion matrix for Random Forest
cm_gb = confusion_matrix(y_test, y_pred_gb)  # Confusion matrix for Gradient Boosting

# Print the confusion matrix for both models
print("\n Confusion Matrix (Random Forest):")
print(cm_rf)

print("\n Confusion Matrix (Gradient Boosting):")
print(cm_gb)


Random Forest Accuracy: 0.9492
Gradient Boosting Accuracy: 0.9391

 Classification Report (Random Forest):
              precision    recall  f1-score   support

           0       0.92      0.98      0.95        99
           1       0.98      0.92      0.95        98

    accuracy                           0.95       197
   macro avg       0.95      0.95      0.95       197
weighted avg       0.95      0.95      0.95       197


 Classification Report (Gradient Boosting):
              precision    recall  f1-score   support

           0       0.91      0.97      0.94        99
           1       0.97      0.91      0.94        98

    accuracy                           0.94       197
   macro avg       0.94      0.94      0.94       197
weighted avg       0.94      0.94      0.94       197


 Confusion Matrix (Random Forest):
[[97  2]
 [ 8 90]]

 Confusion Matrix (Gradient Boosting):
[[96  3]
 [ 9 89]]
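
The printed matrices follow scikit-learn's layout for binary labels: rows are actual classes, columns are predicted classes, so the cells read `[[TN, FP], [FN, TP]]`. As a cross-check, the sketch below recomputes the Random Forest metrics from the matrix printed above.

```python
# Random Forest confusion matrix as printed above: rows are actual, columns are predicted
cm_rf = [[97, 2],
         [8, 90]]

# Unpack the four cells in scikit-learn's binary layout
(tn, fp), (fn, tp) = cm_rf

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"{accuracy:.4f} {precision:.4f} {recall:.4f} {f1:.4f}")
# prints: 0.9492 0.9783 0.9184 0.9474
```

These values match the accuracy, precision, recall, and F1 score reported for the Random Forest model.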


###  Command-line Interface for Manual Transaction Testing

This section of the code allows users to input a new transaction and predict whether it is fraudulent or not, using the trained fraud detection model.

1. **Input**:
   - The user is prompted to enter 30 feature values separated by spaces. These features correspond to the transaction data, including:
     - `Time`, `V1` to `V28`, and `Amount`.

2. **Input Validation**:
   - The input string is converted into a list of floating-point numbers.
   - If the user does not provide exactly 30 values (as expected for this dataset), the system reports how many values it received and asks the user to retry.

3. **Feature Scaling**:
   - Since the `Time` and `Amount` features were standardized during training, the same scaling transformations are applied to the user input for consistency.

4. **Prediction**:
   - The best model (previously selected based on F1-score) is used to predict if the transaction is fraudulent or legitimate.
   - If the model predicts `1`, it indicates a **fraudulent** transaction.
   - If the model predicts `0`, it indicates a **legitimate** transaction.

5. **Error Handling**:
   - If the user inputs invalid data or the process encounters an error (e.g., non-numeric values), the system will catch the exception and display an error message.

This CLI allows you to test individual transactions and receive immediate predictions based on the trained models.
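
The input and validation steps can be isolated into a small helper, which makes them easier to test without a trained model. This is a sketch only: `parse_transaction` is a hypothetical name, and raising `ValueError` for a wrong value count is a design choice of this example.

```python
def parse_transaction(input_str, n_features=30):
    """Parse a space-separated string into a list of `n_features` floats.

    Raises ValueError on non-numeric tokens or a wrong number of values.
    """
    values = [float(token) for token in input_str.split()]
    if len(values) != n_features:
        raise ValueError(f"expected {n_features} values, got {len(values)}")
    return values

# Hypothetical input in the expected order: Time, V1 to V28, Amount
sample = "0 " + " ".join(["0.1"] * 28) + " 149.62"
values = parse_transaction(sample)
print(len(values), values[0], values[-1])

# Non-numeric tokens are rejected with a ValueError
try:
    parse_transaction("1.0 2.0 abc")
except ValueError as err:
    print("Invalid input:", err)
```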


In [None]:
# Command-line interface for manual transaction testing

# Select the model with the higher F1 score on the test set
best_model = max(
    (rf_model, gb_model),
    key=lambda m: f1_score(y_test, m.predict(X_test)),
)

n_features = X.shape[1]  # 30 for this dataset: Time, V1 to V28, Amount

# Prompt the user to input a new transaction
print("\n📥 Enter a new transaction to predict fraud.")
print(f"Provide {n_features} feature values (Time, V1 to V28, Amount) separated by spaces:")

try:
    # Take user input for the transaction data
    input_str = input("Transaction input: ")
    values = [float(v) for v in input_str.split()]  # Convert the input into a list of floats

    # Check that the user has provided the expected number of feature values
    if len(values) != n_features:
        print(f"❌ Expected {n_features} values, but got {len(values)}. Please retry.")
    else:
        # Build a one-row DataFrame so column names match those seen during fitting
        row = pd.DataFrame([values], columns=X.columns)

        # Apply scaling to 'Time' and 'Amount' using the previously fitted scalers
        row['Time'] = scaler_time.transform(row[['Time']])
        row['Amount'] = scaler_amount.transform(row[['Amount']])

        # Use the best model to predict whether the transaction is fraud or not
        prediction = best_model.predict(row)[0]

        # Output the result of the prediction
        print("\n🔍 Prediction:")
        if prediction == 1:
            print("⚠️  FRAUDULENT transaction detected! (Class = 1)")
        else:
            print("✅ Legitimate transaction. (Class = 0)")
except ValueError as e:
    print("❌ Invalid input:", e)  # Raised by float() for non-numeric values



📥 Enter a new transaction to predict fraud.
Provide 30 feature values (Time, V1 to V28, Amount) separated by spaces:
❌ Expected 30 values, but got 0. Please retry.
