## Sources

Code
 - [Credit Card Fraudulent Detection with DNN (Deep Neural Network)](https://www.kaggle.com/dakshmiglani/credit-card-fraudulent-detection-with-dnn-keras)
 
Data
 - 284,807 credit card transactions, September 2013.
 
> The datasets contains transactions made by credit cards in September 2013 by european cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced, the positive class (frauds) account for 0.172% of all transactions. 
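
To put the imbalance in perspective, the figures in the description above can be checked with a few lines of arithmetic. This is a small sketch that is not part of the original notebook, and the variable names are illustrative. It also computes the accuracy a trivial classifier would reach by always predicting "no fraud".

```python
# Figures taken from the dataset description quoted above.
n_total = 284_807
n_fraud = 492

fraud_rate = n_fraud / n_total        # share of the positive (fraud) class
baseline_accuracy = 1 - fraud_rate    # accuracy of always predicting class 0

print(f"fraud rate:        {fraud_rate:.3%}")
print(f"baseline accuracy: {baseline_accuracy:.3%}")
```

Any accuracy figure reported for this dataset is best read against that baseline, since a model that never flags fraud already scores above 99.8%.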

In [1]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import keras

Using TensorFlow backend.


In [2]:
df = pd.read_csv('../../../../Documents/data/creditcard.csv')
# Alternative: load the reduced sample that is written out in the last cell of this notebook
#df = pd.read_excel('sahtecilik.xlsx')

In [3]:
df.head(3)

|   | Time | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | ... | V21 | V22 | V23 | V24 | V25 | V26 | V27 | V28 | Amount | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | -1.359807 | -0.072781 | 2.536347 | 1.378155 | -0.338321 | 0.462388 | 0.239599 | 0.098698 | 0.363787 | ... | -0.018307 | 0.277838 | -0.110474 | 0.066928 | 0.128539 | -0.189115 | 0.133558 | -0.021053 | 149.62 | 0 |
| 1 | 0.0 | 1.191857 | 0.266151 | 0.16648 | 0.448154 | 0.060018 | -0.082361 | -0.078803 | 0.085102 | -0.255425 | ... | -0.225775 | -0.638672 | 0.101288 | -0.339846 | 0.16717 | 0.125895 | -0.008983 | 0.014724 | 2.69 | 0 |
| 2 | 1.0 | -1.358354 | -1.340163 | 1.773209 | 0.37978 | -0.503198 | 1.800499 | 0.791461 | 0.247676 | -1.514654 | ... | 0.247998 | 0.771679 | 0.909412 | -0.689281 | -0.327642 | -0.139097 | -0.055353 | -0.059752 | 378.66 | 0 |


In [4]:
df['Class'].unique() # 0 = no fraud, 1 = fraudulent

array([0, 1])

In [5]:
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

In [6]:
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.1, random_state=1)
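
With so few positive rows, a purely random 10% split can leave the test set with noticeably more or fewer frauds than expected. scikit-learn's `train_test_split` accepts a `stratify` argument for this purpose. The sketch below is not from the original notebook: it illustrates the idea behind stratification using only the standard library, on synthetic labels, and `stratified_split` is a hypothetical helper name.

```python
import random


def stratified_split(labels, test_size=0.1, seed=1):
    """Return (train_idx, test_idx) so each class keeps its share in both parts."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)  # per-class test quota
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels (hypothetical data): 20 positives among 10,000 rows.
labels = [1] * 20 + [0] * 9_980
train_idx, test_idx = stratified_split(labels)

n_test_pos = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), n_test_pos)
```

Because the quota is computed per class, the test part always receives 10% of the positives, regardless of the seed.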

In [7]:
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

In [8]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

In [9]:
clf = Sequential([
    Dense(units=16, kernel_initializer='uniform', input_dim=30, activation='relu'),
    Dense(units=18, kernel_initializer='uniform', activation='relu'),
    Dropout(0.25),
    Dense(20, kernel_initializer='uniform', activation='relu'),
    Dense(24, kernel_initializer='uniform', activation='relu'),
    Dense(1, kernel_initializer='uniform', activation='sigmoid')
])

In [10]:
clf.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 16)                496       
_________________________________________________________________
dense_2 (Dense)              (None, 18)                306       
_________________________________________________________________
dropout_1 (Dropout)          (None, 18)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 20)                380       
_________________________________________________________________
dense_4 (Dense)              (None, 24)                504       
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 25        
Total params: 1,711
Trainable params: 1,711
Non-trainable params: 0
_________________________________________________________________
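
The parameter counts in the summary can be reproduced by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. The following sketch is an added check, not part of the original notebook, and it recomputes the column from the layer widths used in the model definition.

```python
# Layer widths from the model definition: 30 inputs, then 16, 18, 20, 24 and 1 units.
# Dropout has no trainable weights, so it is left out.
widths = [30, 16, 18, 20, 24, 1]

# Dense layer parameters = inputs * units (weights) + units (biases).
params = [n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:])]

print(params)       # [496, 306, 380, 504, 25]
print(sum(params))  # 1711
```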


In [11]:
clf.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Instructions for updating:
keep_dims is deprecated, use keepdims instead


In [12]:
clf.fit(X_train, Y_train, batch_size=15, epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x1a295cd048>

In [13]:
# evaluate() returns [loss, accuracy] because the model was compiled with metrics=['accuracy']
score = clf.evaluate(X_test, Y_test, batch_size=128)
print('\nAnd the Score is ', score[1] * 100, '%')


And the Score is  99.92626663389628 %
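
Accuracy alone says little on data this unbalanced. As an illustration that is not part of the original notebook, the sketch below scores a classifier that always predicts "no fraud" on a hypothetical test set of 28,481 rows containing 49 frauds. Those counts are an assumption chosen to resemble a 10% split, not the notebook's actual test set, and the helper names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical test set: 49 frauds among 28,481 rows (assumed, see the text above).
y_true = [1] * 49 + [0] * 28_432
y_always_zero = [0] * len(y_true)

accuracy, precision, recall = summarize(y_true, y_always_zero)
print(f"accuracy={accuracy:.4%} precision={precision:.2f} recall={recall:.2f}")
```

The do-nothing classifier scores above 99.8% accuracy while catching no fraud at all, so recall and precision on the fraud class are the more informative numbers. For the trained model, predictions could be obtained by thresholding `clf.predict(X_test)` at 0.5, and `sklearn.metrics.confusion_matrix` or `classification_report` compute the same quantities.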


## Reducing the Data

Because the data is very large, I will reduce it and put the smaller version on GitHub.

In [14]:
df.head()

|   | Time | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | ... | V21 | V22 | V23 | V24 | V25 | V26 | V27 | V28 | Amount | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | -1.359807 | -0.072781 | 2.536347 | 1.378155 | -0.338321 | 0.462388 | 0.239599 | 0.098698 | 0.363787 | ... | -0.018307 | 0.277838 | -0.110474 | 0.066928 | 0.128539 | -0.189115 | 0.133558 | -0.021053 | 149.62 | 0 |
| 1 | 0.0 | 1.191857 | 0.266151 | 0.16648 | 0.448154 | 0.060018 | -0.082361 | -0.078803 | 0.085102 | -0.255425 | ... | -0.225775 | -0.638672 | 0.101288 | -0.339846 | 0.16717 | 0.125895 | -0.008983 | 0.014724 | 2.69 | 0 |
| 2 | 1.0 | -1.358354 | -1.340163 | 1.773209 | 0.37978 | -0.503198 | 1.800499 | 0.791461 | 0.247676 | -1.514654 | ... | 0.247998 | 0.771679 | 0.909412 | -0.689281 | -0.327642 | -0.139097 | -0.055353 | -0.059752 | 378.66 | 0 |
| 3 | 1.0 | -0.966272 | -0.185226 | 1.792993 | -0.863291 | -0.010309 | 1.247203 | 0.237609 | 0.377436 | -1.387024 | ... | -0.1083 | 0.005274 | -0.190321 | -1.175575 | 0.647376 | -0.221929 | 0.062723 | 0.061458 | 123.5 | 0 |
| 4 | 2.0 | -1.158233 | 0.877737 | 1.548718 | 0.403034 | -0.407193 | 0.095921 | 0.592941 | -0.270533 | 0.817739 | ... | -0.009431 | 0.798278 | -0.137458 | 0.141267 | -0.20601 | 0.502292 | 0.219422 | 0.215153 | 69.99 | 0 |


In [15]:
df.shape

(284807, 31)

In [16]:
df.Class.value_counts()

0    284315
1       492
Name: Class, dtype: int64

In [17]:
# Keep a random 5% of the non-fraud rows; replace=False avoids drawing the same row twice
normal_df_sample = df[df.Class == 0].sample(frac=0.05, replace=False)

In [18]:
normal_df_sample.shape

(14216, 31)

In [19]:
reduced_df = pd.concat([normal_df_sample, df[df.Class == 1]])

In [20]:
reduced_df.shape

(14708, 31)
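
The shapes above follow directly from the class counts, and they also show how much the class balance shifts in the reduced file. The sketch below is an added check, not part of the original notebook. It assumes that `sample(frac=0.05)` rounds the requested row count to the nearest integer, which is consistent with the shape printed above.

```python
n_normal, n_fraud = 284_315, 492           # from df.Class.value_counts()

n_normal_sample = round(n_normal * 0.05)   # rows kept by sample(frac=0.05)
n_reduced = n_normal_sample + n_fraud      # all fraud rows are kept

print(n_normal_sample)                     # 14216
print(n_reduced)                           # 14708
print(f"{n_fraud / n_reduced:.2%}")        # fraud share in the reduced file
```

The fraud share rises from roughly 0.17% in the full data to roughly 3.35% in the reduced file, so metrics measured on the reduced file are not directly comparable with those on the full dataset.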

In [21]:
reduced_df = reduced_df.reset_index(drop=True)

In [22]:
reduced_df.head()

|   | Time | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | ... | V21 | V22 | V23 | V24 | V25 | V26 | V27 | V28 | Amount | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 65617.0 | -1.22356 | -0.616085 | 2.391712 | -1.618031 | -0.303234 | -0.701199 | 0.103205 | -0.08982 | -1.035075 | ... | 0.405956 | 1.031523 | -0.019859 | 0.418614 | 0.46185 | -0.194119 | 0.088336 | -0.07965 | 103.0 | 0 |
| 1 | 40846.0 | 0.847852 | -0.354261 | 0.33922 | 1.447672 | -0.436064 | 0.038825 | 0.064058 | 0.152858 | 0.189459 | ... | 0.030245 | -0.077474 | -0.152901 | 0.205123 | 0.514422 | -0.352412 | 0.003026 | 0.028441 | 142.22 | 0 |
| 2 | 129826.0 | -0.744838 | 0.873659 | 3.135126 | 4.489845 | -0.574433 | 1.08687 | -0.597055 | 0.569046 | -0.991034 | ... | 0.211562 | 0.601728 | -0.156633 | -0.011702 | -0.055699 | 0.56476 | 0.37023 | 0.176133 | 38.82 | 0 |
| 3 | 148614.0 | 2.178983 | -1.782342 | -0.710779 | -1.588412 | -1.555033 | -0.1879 | -1.52377 | 0.105126 | -0.606861 | ... | -0.419091 | -0.997873 | 0.450296 | 0.27946 | -0.664758 | -0.493022 | 0.010028 | -0.026954 | 69.0 | 0 |
| 4 | 155160.0 | 2.109017 | 0.028295 | -2.41026 | 0.232724 | 0.92461 | -0.995066 | 0.826068 | -0.533862 | 0.367386 | ... | 0.090419 | 0.450799 | -0.242338 | -0.938192 | 0.702172 | -0.01133 | -0.057583 | -0.077789 | 38.95 | 0 |


In [23]:
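# index=False keeps the row index out of the file, so reading it back does not add an extra 'Unnamed: 0' column
reduced_df.to_excel('sahtecilik.xlsx', index=False)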