#Fraudulent Transaction Detection using Deep Learning and TensorFlow

This notebook implements a deep learning model for detecting fraudulent transactions in a financial dataset. We use TensorFlow and Keras to build a neural network that learns to flag potentially fraudulent transactions from a set of transaction features. The dataset contains transaction-related attributes and a binary target variable, `isFraud`, indicating whether each transaction is fraudulent.

##Dependencies

Ensure you have the following libraries installed before running the code:



*   NumPy
*   Pandas
*   TensorFlow
*   Keras (installed with TensorFlow; imported below as `tensorflow.keras`)
*   Scikit-learn

In [6]:
!pip install numpy pandas tensorflow scikit-learn






In [7]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras

##Loading and Preprocessing the Data

In [8]:
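# The ".../view?usp=sharing" link returns Drive's HTML viewer page, not the CSV.
# Use the direct-download form; if Drive still serves an HTML confirmation page (common for large files), download Fraud.csv manually and pass its local path instead.
fraud_data = "https://drive.google.com/uc?export=download&id=1FZ1wTv_oAO7I-f0B4KGEhugCq_iQeQak"
data = pd.read_csv(fraud_data)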

We start by loading the dataset from a CSV file with Pandas. The dataset contains features such as the transaction amount and type. Because neural networks require numerical inputs, we convert the categorical 'type' column into one-hot encoded features, and we drop the account identifiers ('nameOrig', 'nameDest') and the 'isFlaggedFraud' column. We then split the data into training and test sets so the model can be evaluated on transactions it has not seen, and standardize the features using statistics computed on the training set only.
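As a rough illustration of what one-hot encoding produces, the sketch below encodes a few category strings with NumPy alone. The category names are hypothetical placeholders, not values confirmed from the dataset, and the notebook itself uses scikit-learn's `OneHotEncoder` rather than this hand-rolled version.

```python
import numpy as np

# Hypothetical toy values standing in for the 'type' column
types = np.array(["PAYMENT", "TRANSFER", "CASH_OUT", "TRANSFER"])

# Map each category to an integer index (np.unique returns categories sorted)
categories, codes = np.unique(types, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for category i
one_hot = np.eye(len(categories))[codes]

print(categories)
print(one_hot)
```

Each row contains exactly one 1, in the column of its category, so no artificial ordering is implied between transaction types.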

In [9]:
data.head()



In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6362620 entries, 0 to 6362619
Data columns (total 11 columns):
 #   Column          Dtype  
---  ------          -----  
 0   step            int64  
 1   type            object 
 2   amount          float64
 3   nameOrig        object 
 4   oldbalanceOrg   float64
 5   newbalanceOrig  float64
 6   nameDest        object 
 7   oldbalanceDest  float64
 8   newbalanceDest  float64
 9   isFraud         int64  
 10  isFlaggedFraud  int64  
dtypes: float64(5), int64(3), object(3)
memory usage: 534.0+ MB


In [6]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder

# Convert the 'type' column to one-hot encoded features
encoder = OneHotEncoder()
type_encoded = encoder.fit_transform(data[['type']]).toarray()

# Drop the label, identifier, and flag columns, then append the one-hot encoded columns
numeric_features = data.drop(['type', 'isFraud', 'nameOrig', 'nameDest', 'isFlaggedFraud'], axis=1).values
features = np.concatenate((numeric_features, type_encoded), axis=1)

# Separate features (input) and labels (output)
labels = data["isFraud"].values

# Split the data into training and test sets
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.2, random_state=42
)

# Normalize/Standardize the features (optional but recommended)
scaler = StandardScaler()
train_features = scaler.fit_transform(train_features)
test_features = scaler.transform(test_features)

##Building the Model

In [7]:
# Build the model
input_dim = features.shape[1]
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(input_dim,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(1, activation='sigmoid')  # Output layer with 1 unit and a sigmoid activation for binary classification
])





The neural network model is constructed using Keras' Sequential API. The architecture consists of two hidden dense layers, each followed by a dropout layer to reduce overfitting, and a dense output layer. The first hidden layer has 128 neurons and the second has 64, both with ReLU activation and a dropout rate of 0.2. The output layer is a single neuron with a sigmoid activation, which yields a value between 0 and 1 that can be read as the predicted probability of fraud.
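To make the layer sizes concrete, the sketch below counts the trainable parameters of this architecture by hand. The helper names are illustrative, and the input width of 11 is an assumed placeholder: the real `input_dim` depends on how many distinct values the 'type' column has.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def count_params(input_dim: int) -> int:
    # Dropout layers have no trainable parameters, so only Dense layers count
    sizes = [input_dim, 128, 64, 1]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


total = count_params(11)  # 11 is an assumed input width, not taken from the data
print(total)
```

The result can be compared against the parameter total reported by `model.summary()` for the actual input width.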

##Compiling and Training the Model

We compile the model using the Adam optimizer and binary cross-entropy loss, which is well suited to binary classification problems. We use accuracy as a metric to monitor the model's performance during training. The model is then trained on the training data for 10 epochs with a batch size of 32.
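Binary cross-entropy penalizes confident wrong predictions much more heavily than confident correct ones. The sketch below computes the loss by hand for two toy predictions; the function name, the toy numbers, and the clipping constant `eps` are assumptions made for illustration, not values taken from the notebook.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Same labels, once with confident correct and once with confident wrong predictions
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The second value is far larger than the first, which is the gradient signal that pushes the sigmoid output toward the correct label during training.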

In [8]:
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_features, train_labels, epochs=10, batch_size=32)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x21791da9370>

##Model Evaluation

In [9]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_features, test_labels)
print("Test accuracy:", test_acc)

Test accuracy: 0.9995222091674805


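After training, we evaluate the model on the held-out test set; `model.evaluate` returns the test loss and accuracy. The accuracy of roughly 99.95% should be read with caution: fraud datasets are typically heavily imbalanced, so a model that labels almost every transaction as legitimate can also reach very high accuracy. Precision and recall on the fraud class give a more informative picture.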
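Accuracy alone can hide poor fraud detection when fraudulent rows are rare. The sketch below uses hypothetical toy labels (10 fraud cases among 1000 transactions, not figures from the dataset) and a "model" that always predicts legitimate, and compares accuracy with precision and recall. The helper functions are written out by hand for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 10 fraudulent rows among 1000
y_true = [1] * 10 + [0] * 990
always_legit = [0] * 1000  # a model that never flags fraud

accuracy = sum(t == p for t, p in zip(y_true, always_legit)) / len(y_true)
precision, recall = precision_recall(y_true, always_legit)
print(accuracy, precision, recall)
```

In practice the same quantities are available from scikit-learn's `precision_score`, `recall_score`, and `classification_report`, applied to thresholded outputs of `model.predict(test_features)`.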

##Saving the Model

In [10]:
# Specify the filename or directory where you want to save the model
model_filename = "trained_model.h5"

# Save the model
model.save(model_filename)



Finally, the trained model is saved in HDF5 (Hierarchical Data Format version 5) under the file name "trained_model.h5". The saved model can later be loaded to make predictions on new data without retraining. Recent Keras versions warn that HDF5 is a legacy format and recommend the native `.keras` format instead.
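Reloading the saved file could look like the sketch below. To stay self-contained it saves a tiny untrained stand-in model to a temporary directory first; the 4-feature input, the zero-valued rows, and the 0.5 decision threshold are all assumptions for illustration. Note that the file stores only the network, so new data must be encoded and scaled with the same fitted `encoder` and `scaler` used during training.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Tiny untrained stand-in model; the 4-feature input is a hypothetical placeholder
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Save to a temporary directory so the example leaves no files behind in the project
path = os.path.join(tempfile.mkdtemp(), "trained_model.h5")
model.save(path)

# Restore the model from disk without retraining
restored = keras.models.load_model(path)

# New rows must already be encoded and scaled like the training data
new_rows = np.zeros((3, 4), dtype="float32")
probabilities = restored.predict(new_rows, verbose=0)

# Turn probabilities into 0/1 decisions; 0.5 is an assumed threshold
is_fraud = (probabilities >= 0.5).astype(int)
print(probabilities.shape, is_fraud.ravel())
```

For fraud detection, the threshold is usually worth tuning against precision and recall rather than leaving it at 0.5.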