## Sources

Code
 - [Credit Card Fraudulent Detection with DNN (Deep Neural Network)](https://www.kaggle.com/dakshmiglani/credit-card-fraudulent-detection-with-dnn-keras)
 
Data
 - 284,807 credit card transactions, September 2013.
 
> The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
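
The imbalance figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two counts from the description (492 frauds, 284,807 transactions); the variable names are illustrative and not part of the original notebook.

```python
# Counts taken from the dataset description.
n_total = 284_807
n_fraud = 492

# Share of fraudulent transactions, as a percentage.
fraud_pct = 100 * n_fraud / n_total
print(f"Fraud share: {fraud_pct:.4f}%")
```

The result is a little over 0.17%, in line with the figure in the description.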

In [27]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import keras

In [28]:
df = pd.read_csv('../../../../Documents/data/creditcard.csv')
#df = pd.read_excel('sahtecilik.xlsx')

In [30]:
df.head(3)

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.16648,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,-0.225775,-0.638672,0.101288,-0.339846,0.16717,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0


In [40]:
df.shape

(284807, 31)

In [33]:
df['Class'].unique() # 0 = no fraud, 1 = fraudulent

array([0, 1])

In [34]:
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

In [36]:
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.1, random_state=1)

In [38]:
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
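
The cell above fits the scaler on the training split and only applies it to the test split. As a minimal sketch of what that means, the example below standardizes a single made-up feature with the standard library: the mean and population standard deviation (the statistics `StandardScaler` uses) come from the training values alone. The toy values and variable names are assumptions for illustration.

```python
from statistics import fmean, pstdev

# Toy one-feature example; the values are made up for illustration.
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# "fit": learn the mean and population standard deviation from the training data only.
mu = fmean(train)
sigma = pstdev(train)

# "transform": apply the same training statistics to both splits.
train_scaled = [(x - mu) / sigma for x in train]
test_scaled = [(x - mu) / sigma for x in test]

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics keeps information about the test split out of preprocessing, which is why the notebook calls `fit_transform` on `X_train` but only `transform` on `X_test`.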

In [41]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

In [42]:
clf = Sequential([
    Dense(units=16, kernel_initializer='uniform', input_dim=30, activation='relu'),
    Dense(units=18, kernel_initializer='uniform', activation='relu'),
    Dropout(0.25),
    Dense(20, kernel_initializer='uniform', activation='relu'),
    Dense(24, kernel_initializer='uniform', activation='relu'),
    Dense(1, kernel_initializer='uniform', activation='sigmoid')
])

In [44]:
clf.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_6 (Dense)              (None, 16)                496       
_________________________________________________________________
dense_7 (Dense)              (None, 18)                306       
_________________________________________________________________
dropout_2 (Dropout)          (None, 18)                0         
_________________________________________________________________
dense_8 (Dense)              (None, 20)                380       
_________________________________________________________________
dense_9 (Dense)              (None, 24)                504       
_________________________________________________________________
dense_10 (Dense)             (None, 1)                 25        
Total params: 1,711
Trainable params: 1,711
Non-trainable params: 0
_________________________________________________________________
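
The parameter counts in the summary can be reproduced by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes the layer widths from the model definition (30 inputs, then 16, 18, 20, 24 and 1 units); the variable names are illustrative.

```python
# Layer widths from the model definition: 30 inputs, then 16, 18, 20, 24, 1 units.
# Dropout has no trainable parameters, so it is omitted.
layer_sizes = [30, 16, 18, 20, 24, 1]

# A Dense layer has (inputs * units) weights plus one bias per unit.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)
print(total_params)
```

The per-layer values and the total agree with the `Param #` column and the `Total params` line of the summary.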


In [45]:
clf.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

In [46]:
clf.fit(X_train, Y_train, batch_size=15, epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x1a2b0b4518>

In [48]:
score = clf.evaluate(X_test, Y_test, batch_size=128)
print('\nAnd the Score is ', score[1] * 100, '%')


And the Score is  99.91924440855307 %
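
Because only 492 of the 284,807 transactions are fraudulent, accuracy alone says little about fraud detection. As a rough sanity check, the sketch below computes the accuracy and recall of a hypothetical classifier that labels every transaction as non-fraud. It uses the full-dataset counts rather than the actual test split, whose class counts are not shown in the notebook.

```python
# Full-dataset counts from the dataset description.
n_total = 284_807
n_fraud = 492
n_normal = n_total - n_fraud

# Hypothetical baseline: always predict class 0 (no fraud).
true_negatives = n_normal
false_negatives = n_fraud
true_positives = 0

baseline_accuracy = 100 * true_negatives / n_total
baseline_recall = true_positives / (true_positives + false_negatives)

print(f"Baseline accuracy: {baseline_accuracy:.2f}%")
print(f"Baseline recall on fraud: {baseline_recall:.2f}")
```

This baseline reaches about 99.83% accuracy while catching no fraud at all, so the reported score of roughly 99.92% is only a small improvement. Precision, recall, or a confusion matrix for the fraud class would likely be more informative.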


## Reducing the Data

The dataset is too large, so I will reduce it and put it on GitHub.

In [50]:
df.head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.16648,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,-0.225775,-0.638672,0.101288,-0.339846,0.16717,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,...,-0.1083,0.005274,-0.190321,-1.175575,0.647376,-0.221929,0.062723,0.061458,123.5,0
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,...,-0.009431,0.798278,-0.137458,0.141267,-0.20601,0.502292,0.219422,0.215153,69.99,0


In [15]:
df.shape

(284807, 31)

In [51]:
df.Class.value_counts()

0    284315
1       492
Name: Class, dtype: int64

In [52]:
# Keep 5% of the non-fraud rows; sample without replacement to avoid duplicate rows.
normal_df_sample = df[df.Class == 0].sample(frac=0.05, replace=False)

In [53]:
normal_df_sample.shape

(14216, 31)

In [54]:
reduced_df = pd.concat([normal_df_sample, df[df.Class == 1]])

In [56]:
reduced_df.shape

(14708, 31)
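
The shapes above follow from the class counts. The sketch below redoes the arithmetic, assuming `sample(frac=0.05)` rounds the product to the nearest whole row (which matches the 14,216 rows shown); the variable names are illustrative.

```python
# Class counts from df.Class.value_counts().
n_normal = 284_315
n_fraud = 492
frac = 0.05

# 5% of the non-fraud rows, rounded to a whole number of rows.
n_sampled = round(n_normal * frac)

# All fraud rows are kept, so they are added back in full.
n_reduced = n_sampled + n_fraud
fraud_pct_reduced = 100 * n_fraud / n_reduced

print(n_sampled, n_reduced)
print(f"Fraud share in reduced data: {fraud_pct_reduced:.2f}%")
```

Since every fraud row is kept while the non-fraud rows are cut to 5%, the fraud share rises from about 0.17% to a little over 3%, so the reduced file no longer reflects the original class balance.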

In [57]:
reduced_df = reduced_df.reset_index(drop=True)

In [58]:
reduced_df.head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,147082.0,-4.309324,-2.677373,-2.110244,-2.876195,1.46008,-0.980984,-0.013074,-1.51833,-0.047999,...,0.469869,-0.312377,2.29453,-0.203526,0.376437,0.485161,1.400938,-0.289805,118.79,0
1,124570.0,1.970854,-1.903593,-1.008971,-1.792023,-0.739257,1.24678,-1.458121,0.354236,-1.289945,...,0.005548,0.419311,0.16935,-1.6425,-0.473432,-0.074601,0.055105,-0.055079,109.2,0
2,140410.0,-1.322064,-0.1144,0.616879,-2.550177,0.568724,-0.542466,0.493478,-0.86824,-1.828766,...,0.207311,-1.543509,-0.146085,0.688382,0.383625,1.032029,0.032607,-0.026724,93.0,0
3,157808.0,2.202722,-0.78384,-1.438641,-1.051017,-0.469859,-0.949678,-0.452135,-0.233771,-0.677965,...,0.457186,1.277398,-0.058854,-0.290948,0.195627,0.070873,-0.04741,-0.080113,15.0,0
4,135511.0,-0.396875,0.302161,-1.021965,-0.476986,2.081404,-0.811101,1.043505,-0.121031,-0.441177,...,0.315049,0.941829,-0.117308,0.236149,-0.2558,-0.172708,0.416393,0.294075,23.85,0


In [59]:
reduced_df.to_excel('sahtecilik.xlsx')