# **Credit Card Fraud Detection**
> [Simple Credit card Fraud Detection 95% Accuracy by KRUTARTH DARJI](https://www.kaggle.com/code/krutarthhd/simple-credit-card-fraud-detection-95-accuracy)

## **Data field**
- `V1`, `V2`, ..., `V28`: the principal components obtained from a PCA transformation.
- `Time`: the seconds elapsed between each transaction and the first transaction in the dataset.
- `Amount`: the transaction amount; this feature can be used, for example, for example-dependent cost-sensitive learning.
- `Class`: the response variable; it takes the value 1 in case of fraud and 0 otherwise.

  - The dataset contains transactions that occurred over two days, with 492 frauds out of 284,807 transactions.
  - The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
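
As a quick check of the imbalance figure quoted above, the fraud share can be recomputed from the two counts. This is only an arithmetic sketch; the variable names are illustrative.

```python
# Counts quoted in the dataset description above.
n_fraud = 492
n_total = 284_807

# Share of the positive (fraud) class, as a percentage.
fraud_pct = 100 * n_fraud / n_total
print(f"Fraud share: {fraud_pct:.4f}%")
```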



In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import BatchNormalization,Dropout,Dense,Flatten,Conv1D
from tensorflow.keras.optimizers import Adam

%matplotlib inline

## 1. Gathering the data and assessing the data

In [5]:
df = pd.read_csv('../input/creditcardfraud/creditcard.csv')

print(df.shape)
df.head()

In [6]:
df.info()

In [7]:
df.Class.unique()

## 2. Uneven class distribution

In [8]:
df.Class.value_counts()

In [9]:
nf = df[df.Class==0]
f = df[df.Class==1]

In [10]:
print(nf.shape)
print(f.shape)

## 3. Extracting random entries of class-0
- The number of sampled class-0 entries is 1.5 times the number of class-1 entries (1.5 × 492 = 738).

In [11]:
nf = nf.sample(738)
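
The hard-coded 738 above is 1.5 times the 492 fraud rows. A minimal sketch of deriving that size from the data instead, on a small synthetic frame; `toy`, `ratio` and the fixed `random_state` are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: 1,000 rows, 20 of them fraud.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Amount": rng.uniform(1, 500, size=1000),
    "Class": [1] * 20 + [0] * 980,
})

fraud = toy[toy.Class == 1]
non_fraud = toy[toy.Class == 0]

# Keep 1.5 non-fraud rows per fraud row, as in the step above.
ratio = 1.5
n_keep = int(ratio * len(fraud))

# random_state makes the draw repeatable (an addition, not in the original).
non_fraud_sample = non_fraud.sample(n=n_keep, random_state=42)

balanced = pd.concat([fraud, non_fraud_sample], ignore_index=True)
print(balanced.Class.value_counts())
```

Computing the size from `len(fraud)` keeps the ratio correct if the number of fraud rows ever changes.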

## 4. Creating new dataframe

In [12]:
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
data = pd.concat([f, nf], ignore_index=True)

In [13]:
print(data.shape)

In [14]:
X = data.drop(['Class'],axis=1)
y = data['Class']

## 5. Train-Test Split

In [15]:
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify = y)

In [18]:
print("Shape of train X:", x_train.shape)
print("Shape of train y:", y_train.shape)
print("Shape of test X:", x_test.shape)
print("Shape of test y:", y_test.shape)
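
To illustrate what `stratify=y` does in the split above, the sketch below splits a synthetic label vector with the same 492/738 class counts and compares the fraud share in each part. The feature matrix is random filler, and `random_state` is an assumption added for repeatability.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Labels with the same class counts as the undersampled data: 492 fraud, 738 non-fraud.
y = np.array([1] * 492 + [0] * 738)
# Random filler features; only the labels matter for this check.
X = np.random.default_rng(0).normal(size=(len(y), 30))

x_tr, x_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# With stratification, both parts keep roughly the overall fraud share (0.4).
print("overall:", y.mean())
print("train:  ", y_tr.mean())
print("test:   ", y_te.mean())
```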

## 6. Applying StandardScaler to bring all features to a similar range

In [19]:
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
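
The scaler above is fitted on the training split only and then reused on the test split, so the test data never influences the statistics. A minimal sketch of what that means numerically, on made-up arrays:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up single-feature data, for illustration only.
train = np.array([[1.0], [2.0], [3.0], [4.0]])
test = np.array([[10.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)   # learns mean and std from train
test_scaled = scaler.transform(test)         # reuses the training statistics

# transform() is equivalent to (x - train_mean) / train_std.
manual = (test - train.mean(axis=0)) / train.std(axis=0)
print(np.allclose(test_scaled, manual))
```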

In [20]:
y_train = y_train.to_numpy()
y_test = y_test.to_numpy()

## 7. Reshaping the input to 3D.

In [21]:
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1],1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1],1)
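
`Conv1D` expects input shaped `(samples, steps, channels)`, which is why a trailing axis of length 1 is added above. A small NumPy sketch with a placeholder array (the 984 x 30 shape assumes the 80% training split of 1,230 rows with 30 feature columns):

```python
import numpy as np

# Placeholder for the scaled training matrix: 984 rows, 30 features.
x = np.zeros((984, 30))

# Add a channel axis so each feature becomes one "step" with one channel.
x_3d = x.reshape(x.shape[0], x.shape[1], 1)
print(x_3d.shape)

# np.expand_dims produces the same shape.
same_shape = x_3d.shape == np.expand_dims(x, axis=-1).shape
print(same_shape)
```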

## 8. CNN model

In [23]:
model = Sequential()
model.add(Conv1D(32,2,activation='relu', input_shape = x_train[0].shape))
model.add(BatchNormalization())
model.add(Dropout(0.2))

model.add(Conv1D(64,2,activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))

model.add(Flatten())
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(1,activation='sigmoid'))

In [24]:
model.summary()
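
The shapes and parameter counts that `model.summary()` reports can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming 30 input features, the Keras defaults for `Conv1D` (`padding='valid'`, stride 1), and four parameters per channel in `BatchNormalization`; it does not build the model, and the helper `conv1d_shape_and_params` is hypothetical.

```python
# Assumed input: 30 features, 1 channel.
steps, channels = 30, 1

def conv1d_shape_and_params(steps, in_ch, filters, kernel):
    """Output steps and parameter count for a 'valid', stride-1 Conv1D."""
    out_steps = steps - kernel + 1
    params = kernel * in_ch * filters + filters  # weights + biases
    return out_steps, params

s1, p_conv1 = conv1d_shape_and_params(steps, channels, filters=32, kernel=2)
p_bn1 = 4 * 32   # gamma, beta, moving mean, moving variance
s2, p_conv2 = conv1d_shape_and_params(s1, 32, filters=64, kernel=2)
p_bn2 = 4 * 64

flat = s2 * 64                 # Flatten output size
p_dense1 = flat * 64 + 64      # Dense(64)
p_dense2 = 64 * 1 + 1          # Dense(1)

total = p_conv1 + p_bn1 + p_conv2 + p_bn2 + p_dense1 + p_dense2
print("steps after conv layers:", s1, s2)
print("flattened size:", flat)
print("total parameters:", total)
```

Most of the parameters sit in the first `Dense` layer, because it connects every flattened convolution output to 64 units.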

## 9. Compiling and Fitting

In [25]:
model.compile(optimizer=Adam(learning_rate=0.0001), loss='binary_crossentropy', metrics=['accuracy'])

In [26]:
history = model.fit(x_train, y_train, epochs=20, validation_data=(x_test,y_test))

In [27]:
def plotLearningCurve(history,epochs):
    """Plot training and validation accuracy and loss for each epoch."""
    epochRange = range(1,epochs+1)

    # Accuracy per epoch
    plt.plot(epochRange,history.history['accuracy'])
    plt.plot(epochRange,history.history['val_accuracy'])
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(['Train','Validation'],loc='upper left')
    plt.show()

    # Loss per epoch
    plt.plot(epochRange,history.history['loss'])
    plt.plot(epochRange,history.history['val_loss'])
    plt.title('Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend(['Train','Validation'],loc='upper left')
    plt.show()

In [28]:
plotLearningCurve(history,20)
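
The curves above track accuracy only. Since the task is fraud detection, it may also be useful to look at the confusion matrix, precision and recall. The sketch below is not part of the original notebook: it uses made-up labels and probabilities in place of `y_test` and the model's sigmoid outputs, and the 0.5 threshold is an assumption.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels and sigmoid outputs standing in for y_test and model.predict(x_test).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.4, 0.6, 0.2, 0.9, 0.7, 0.3, 0.8])

# Threshold the probabilities at 0.5 to get class predictions.
y_pred = (y_prob >= 0.5).astype(int)

cm = confusion_matrix(y_true, y_pred)   # rows: true class, columns: predicted class
print(cm)
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
```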