#Fraudulent Transaction Detection: Model Evaluation and Prediction

In this notebook, we load the previously trained fraud detection model and use it to predict fraudulent transactions in a new dataset. We then evaluate the model's performance by comparing its predictions to the ground truth labels in the dataset.

In [None]:
import pandas as pd
from tensorflow.keras.models import load_model
import numpy as np

##Loading the Model and Data

In [None]:
# Load the saved model
model_filename = "/content/drive/MyDrive/Colab Notebooks/Analyzing Fraudulent Transaction/trained_model.h5"
loaded_model = load_model(model_filename)

We begin by loading the pre-trained model, saved in HDF5 format, using Keras's load_model function. Next, we read the dataset ('Fraud.csv'), which contains the transaction features. We then preprocess the data by one-hot encoding the 'type' column, as was done in the training notebook.

In [None]:
data = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Analyzing Fraudulent Transaction/Fraud.csv')

##Making Predictions

After preprocessing the data, we use the loaded model to make predictions on the new dataset. The model returns a probability of fraud for each transaction. We convert these probabilities to boolean flags by multiplying by 100, truncating to an integer, and casting to bool, so any transaction with a predicted probability of roughly 0.01 or higher is flagged. The resulting flagged labels indicate whether a transaction is predicted as fraudulent (True) or not (False).

In [None]:
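from sklearn.preprocessing import OneHotEncoder

# One-hot encode the 'type' column. The encoder is refit on this dataset, so the
# column order matches training only if the same categories are present.
encoder = OneHotEncoder()
type_encoded = encoder.fit_transform(data[['type']]).toarray()

# Drop the label columns ('isFraud', 'isFlaggedFraud'), the account identifiers,
# and the raw 'type' column, then append the one-hot encoded features
dropped_columns = ['type', 'isFraud', 'nameOrig', 'nameDest', 'isFlaggedFraud']
numeric_features = data.drop(dropped_columns, axis=1).values
new_data = np.concatenate((numeric_features, type_encoded), axis=1)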
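
Refitting the encoder on a new dataset works only as long as that dataset contains every transaction type seen during training, because the number and order of the one-hot columns are derived from the data. The sketch below shows one way to pin the column order using pandas. The category list `TYPE_CATEGORIES` and the helper `encode_type` are assumptions made for illustration; they are not taken from the training notebook and should be replaced with the categories and order actually used there.

```python
import pandas as pd

# Assumed category list (hypothetical); replace with the exact list and
# order used when the model was trained.
TYPE_CATEGORIES = ["CASH_IN", "CASH_OUT", "DEBIT", "PAYMENT", "TRANSFER"]


def encode_type(type_column, categories=TYPE_CATEGORIES):
    """One-hot encode with a fixed column order.

    Categories missing from the input still get an all-zero column,
    so the output width never depends on the data.
    """
    fixed = pd.Series(pd.Categorical(type_column, categories=categories))
    return pd.get_dummies(fixed, dtype=float).to_numpy()


# Small made-up sample that contains only three of the five assumed types
sample = pd.Series(["PAYMENT", "TRANSFER", "CASH_OUT"])
encoded = encode_type(sample)
```

Even though the sample lacks two of the assumed types, `encoded` still has one column per entry in `TYPE_CATEGORIES`, so its width would stay compatible with a model trained on the full set.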

In [None]:
# Make predictions on the preprocessed data
predictions = loaded_model.predict(new_data)



In [None]:
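# Scale probabilities to percentages and truncate to integers
# (astype(int) truncates toward zero; it does not round)
rounded_probabilities = (predictions * 100).astype(int)
# Any nonzero percentage becomes True, so probabilities of about 0.01 or more are flagged
flagged_labels = rounded_probabilities.astype(bool)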
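
To make the effect of this conversion concrete, the following sketch applies it to a handful of made-up probabilities (not model output) and compares it with explicit thresholds. The variable names are illustrative assumptions. It suggests that truncating and casting to bool behaves like a cut-off near 0.01, which is far more permissive than the conventional 0.5.

```python
import numpy as np

# Made-up probabilities for illustration only, shaped like model.predict output
probs = np.array([[0.004], [0.0099], [0.02], [0.3], [0.7]])

# The notebook's conversion: truncate p * 100 to an integer, then cast to bool
truncated = (probs * 100).astype(int).astype(bool).flatten()

# The same decision written as an explicit threshold at 0.01
explicit_low = (probs >= 0.01).flatten()

# A conventional 0.5 cut-off, shown for comparison
explicit_half = (probs >= 0.5).flatten()

print(truncated)
print(explicit_low)
print(explicit_half)
```

Writing the threshold explicitly makes the decision rule visible and easy to tune; which cut-off is appropriate depends on how the cost of missed fraud compares with the cost of false alarms.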

In [None]:
flagged_labels.shape

(6362620, 1)

In [None]:
# Convert flagged_labels to a 1-dimensional array
flagged_labels = flagged_labels.flatten()

##Combining Results with Original Data

We then create a new DataFrame, 'flagged_data', to store the flagged labels. We concatenate this DataFrame column-wise with the original 'data' DataFrame (not the preprocessed 'new_data' array). The combined DataFrame, 'combined_data', now includes the original columns along with the flagged labels.

In [None]:
# Create a new DataFrame with the 'Flagged' column
flagged_data = pd.DataFrame({'Flagged': flagged_labels})

# Combine the original DataFrame 'data' with the flagged DataFrame using pd.concat()
combined_data = pd.concat([data, flagged_data], axis=1)

In [None]:
combined_data

Unnamed: 0,step,type,amount,nameOrig,oldbalanceOrg,newbalanceOrig,nameDest,oldbalanceDest,newbalanceDest,isFraud,isFlaggedFraud,Flagged
0,1,PAYMENT,9839.64,C1231006815,170136.00,160296.36,M1979787155,0.00,0.00,0,0,False
1,1,PAYMENT,1864.28,C1666544295,21249.00,19384.72,M2044282225,0.00,0.00,0,0,False
2,1,TRANSFER,181.00,C1305486145,181.00,0.00,C553264065,0.00,0.00,1,0,True
3,1,CASH_OUT,181.00,C840083671,181.00,0.00,C38997010,21182.00,0.00,1,0,False
4,1,PAYMENT,11668.14,C2048537720,41554.00,29885.86,M1230701703,0.00,0.00,0,0,False
...,...,...,...,...,...,...,...,...,...,...,...,...
6362615,743,CASH_OUT,339682.13,C786484425,339682.13,0.00,C776919290,0.00,339682.13,1,0,False
6362616,743,TRANSFER,6311409.28,C1529008245,6311409.28,0.00,C1881841831,0.00,0.00,1,0,True
6362617,743,CASH_OUT,6311409.28,C1162922333,6311409.28,0.00,C1365125890,68488.84,6379898.11,1,0,False
6362618,743,TRANSFER,850002.52,C1685995037,850002.52,0.00,C2080388513,0.00,0.00,1,0,True


##Model Evaluation

In [None]:
# Compare the 'isFraud' and 'Flagged' columns and create a new column 'Test' that is True when they agree
combined_data['Test'] = ((combined_data['isFraud'] == 1) & (combined_data['Flagged'] == True)) | ((combined_data['isFraud'] == 0) & (combined_data['Flagged'] == False))

In [None]:
combined_data['Test'].mean()*100

94.44873338341753

To evaluate the model's accuracy, we compare the 'isFraud' column (ground truth) with the 'Flagged' column (predictions). We create a new column called 'Test', which is True when the model's prediction matches the actual fraud status and False otherwise. The mean of this column, multiplied by 100, gives the percentage of correctly predicted transactions, which here is about 94.45%.
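
Accuracy alone can be hard to interpret when one class is rare, as fraud typically is, so it may help to break the comparison into true and false positives and negatives. The sketch below is a minimal NumPy version of that breakdown; the helper `confusion_counts` and the toy labels are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for two boolean-like label arrays."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # fraud, flagged
    fp = int(np.sum(~y_true & y_pred))   # not fraud, flagged
    fn = int(np.sum(y_true & ~y_pred))   # fraud, missed
    tn = int(np.sum(~y_true & ~y_pred))  # not fraud, not flagged
    return tp, fp, fn, tn


# Toy labels, made up for illustration: 2 fraud cases among 10 transactions
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of flags that are fraud
recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of fraud that is flagged

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In the notebook, the same helper could presumably be called as `confusion_counts(combined_data['isFraud'], combined_data['Flagged'])` to see how the roughly 94% accuracy splits between fraud cases that were caught and legitimate transactions that were flagged.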