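# **Dacon Credit Card Fraud Detection AI Competition**

### Goal: Develop an AI solution that detects fraudulent transactions in de-identified credit card transaction data

- Using the credit card dataset provided by Dacon, we performed EDA, built credit card fraud detection models, and evaluated their performance.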






In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt   
import seaborn as sns

In [None]:
import random
import pandas as pd
import numpy as np
import os
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from tqdm.auto import tqdm
from sklearn.metrics import f1_score

In [None]:
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

In [None]:
from google.colab import drive 
drive.mount('/content/drive/')

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).


## **1. Dataset**

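Data download

#### 1-1. Training dataset (113,842 rows)
* Credit card data without normal/fraud labels (unlabeled; for unsupervised learning)
* 'ID' : credit card transaction ID
* 'V1', 'V2', 'V3', ... ,'V30' : de-identified credit card transaction features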

In [None]:
train_df = pd.read_csv('/content/drive/MyDrive/train.csv') 
#train_df.head()
train_df.shape

(113842, 31)

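#### 1-2. Validation dataset (28,462 rows)
* Credit card data that **includes** normal/fraud labels (may not be used for training; used for building the model)
* 'ID' : credit card transaction ID
* 'V1', 'V2', 'V3', ... ,'V30' : de-identified credit card transaction features
* 'Class' : whether the transaction is normal or fraudulent (normal : 0, fraud : 1)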

In [None]:
val_df = pd.read_csv('/content/drive/MyDrive/val.csv') 
#val_df.head()
val_df.shape

(28462, 32)

* The data is highly imbalanced: fraudulent transactions make up a very small share compared with normal ones.
* Rows with Class 1 account for only about 0.11% of the validation set (30 of 28,462).

In [None]:
val_df['Class'].value_counts()

0    28432
1       30
Name: Class, dtype: int64
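The share quoted above can be reproduced directly from the printed counts. The sketch below rebuilds a label column with the same counts (28,432 normal, 30 fraud) instead of loading `val.csv`, so the `labels` series is a stand-in for `val_df['Class']`, not the real column.

```python
import pandas as pd

# Rebuild the label column from the counts printed above (28,432 normal, 30 fraud).
labels = pd.Series([0] * 28432 + [1] * 30, name="Class")

# normalize=True returns each class's share of the rows instead of raw counts.
class_ratio = labels.value_counts(normalize=True)
fraud_pct = class_ratio.loc[1] * 100

print(f"fraud share: {fraud_pct:.3f}%")
```

With the real data, the same call would be `val_df['Class'].value_counts(normalize=True)`.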

#### 1-3. Test dataset (142,503 rows)
* Credit card data without normal/fraud labels (unlabeled)
* 'ID' : credit card transaction ID
* 'V1', 'V2', 'V3', ... ,'V30' : de-identified credit card transaction features

In [None]:
test_df = pd.read_csv('/content/drive/MyDrive/test.csv') 
#test_df.head()
test_df.shape

(142503, 31)

## **2. EDA**



#### 2-1. Null check

We confirmed that there are no null values in the train, validation, or test datasets.
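The notebook does not show the cell behind this check. A minimal sketch of how such a check might look is below; `toy_df` is a hypothetical stand-in for `train_df`, `val_df`, or `test_df`, with one missing value planted on purpose so the output is not trivially zero.

```python
import numpy as np
import pandas as pd

# Toy stand-in for train_df / val_df / test_df; column names mirror the dataset.
toy_df = pd.DataFrame({
    "ID": [1, 2, 3],
    "V1": [0.1, np.nan, 0.3],  # one planted missing value
    "V2": [1.0, 2.0, 3.0],
})

# Per-column count of missing values, then a single yes/no answer for the frame.
nulls_per_column = toy_df.isnull().sum()
has_nulls = bool(toy_df.isnull().values.any())

print(nulls_per_column)
print("any nulls:", has_nulls)
```

Running the same two lines on each of the three real frames should report zero for every column if the statement above holds.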

#### 2-2. Class distribution

We already checked the class distribution with value_counts() above; here we visualized it as a bar chart to confirm it once more.


#### 2-3. Distribution of V1 ~ V30 (de-identified transaction features)


###### 2-3-1. Histograms
Each feature showed a similar distribution across the train, validation, and test datasets.
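A visual comparison like this can be backed by a number. The sketch below is not part of the original analysis: it bins two synthetic samples (stand-ins for one feature such as 'V1' in two splits) with shared bin edges and reports the total variation distance between the two histograms.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for one feature in two splits; not the real data.
train_col = rng.normal(0.0, 1.0, size=5000)
test_col = rng.normal(0.0, 1.0, size=5000)

# Shared bin edges so the two histograms are directly comparable.
edges = np.histogram_bin_edges(np.concatenate([train_col, test_col]), bins=20)
train_hist, _ = np.histogram(train_col, bins=edges)
test_hist, _ = np.histogram(test_col, bins=edges)

# Convert counts to proportions and take half the L1 distance (total variation):
# 0 means identical histograms, 1 means no overlap at all.
train_p = train_hist / train_hist.sum()
test_p = test_hist / test_hist.sum()
tv_distance = 0.5 * np.abs(train_p - test_p).sum()

print(f"total variation distance: {tv_distance:.3f}")
```

A value close to 0 for every feature would support the claim that the splits are similarly distributed.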


###### 2-3-2. Scaling
According to describe(), the value ranges differ widely across fields, so we normalize them.

In [None]:
# Train data
train_df.drop(columns=['ID']).describe()

# V_MIN and V_MAX hold the per-column minimum and maximum of the training dataset
V_MIN = train_df[train_df.columns.drop(['ID'])].min()
V_MAX = train_df[train_df.columns.drop(['ID'])].max()

# normalize: min-max scaling that maps each column into the 0~1 range
# using the minimum and maximum stored in V_MIN and V_MAX
# A field whose value never changes has min == max; such fields are set to all zeros
def normalize(df):
    ndf = df.copy()
    for c in df.columns:
        if V_MIN[c] == V_MAX[c]:
            ndf[c] = df[c] - V_MIN[c]
        else:
            ndf[c] = (df[c] - V_MIN[c]) / (V_MAX[c] - V_MIN[c])
    return ndf

# Build the normalized TRAIN_DF
TRAIN_DF = normalize(train_df[train_df.columns.drop(['ID'])])
TRAIN_DF

# boundary_check: reports whether any value is above 1 / below 0 / NaN
def boundary_check(df):
    x = np.array(df, dtype=np.float32)
    return np.any(x > 1.0), np.any(x < 0), np.any(np.isnan(x))

# No value above 1, below 0, or NaN, so the normalization worked as intended
boundary_check(TRAIN_DF)

(False, False, False)

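The validation and test datasets are normalized in the same way. Note that V_MIN and V_MAX are recomputed from each dataset here, so each split is scaled by its own minimum and maximum.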

In [None]:
# Validation data
val_df.drop(columns=['ID','Class']).describe()

V_MIN = val_df[val_df.columns.drop(['ID','Class'])].min()
V_MAX = val_df[val_df.columns.drop(['ID','Class'])].max()

VAL_DF = normalize(val_df[val_df.columns.drop(['ID','Class'])])
VAL_DF

boundary_check(VAL_DF)

(False, False, False)

In [None]:
# Test data
test_df.drop(columns=['ID']).describe()

V_MIN = test_df[test_df.columns.drop(['ID'])].min()
V_MAX = test_df[test_df.columns.drop(['ID'])].max()

TEST_DF = normalize(test_df[test_df.columns.drop(['ID'])])
TEST_DF

boundary_check(TEST_DF)

(False, False, False)
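Because `normalize` reads the global `V_MIN` and `V_MAX`, its result depends on which cell last set them. A common alternative, which is not what this notebook does, is to compute the statistics once on the training data and pass them in explicitly. The sketch below illustrates that idea on toy frames; `fit_min_max` and `apply_min_max` are hypothetical helper names introduced only for this example.

```python
import pandas as pd

def fit_min_max(df):
    """Return the per-column minimum and maximum of df."""
    return df.min(), df.max()

def apply_min_max(df, v_min, v_max, clip=True):
    """Scale df with the given statistics; constant columns become all zeros."""
    span = (v_max - v_min).replace(0, 1.0)  # avoid division by zero
    scaled = (df - v_min) / span
    # Values outside the fitted range land outside [0, 1]; optionally clip them.
    return scaled.clip(0.0, 1.0) if clip else scaled

# Toy frames standing in for the train and test features.
toy_train = pd.DataFrame({"V1": [0.0, 5.0, 10.0], "V2": [3.0, 3.0, 3.0]})
toy_test = pd.DataFrame({"V1": [-2.0, 5.0, 12.0], "V2": [3.0, 3.0, 3.0]})

v_min, v_max = fit_min_max(toy_train)
scaled_train = apply_min_max(toy_train, v_min, v_max)
scaled_test_raw = apply_min_max(toy_test, v_min, v_max, clip=False)
scaled_test = apply_min_max(toy_test, v_min, v_max)

print(scaled_test_raw)
print(scaled_test)
```

With train-fitted statistics, a check like `boundary_check` can legitimately report out-of-range values on the other splits, which is why the sketch offers clipping as an option.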

## **3. Modeling**


### 3-1. 3-layer Neural Network 

In [None]:
# Train a neural network with Keras
# 1. Prepare the data - min-max scaling
TRAIN_DF = normalize(train_df[train_df.columns.drop(['ID'])])
VAL_DF = normalize(val_df[val_df.columns.drop(['ID','Class'])])
TEST_DF = normalize(test_df[test_df.columns.drop(['ID'])])

In [None]:
# 2. Build the neural network
# A 30-node first layer, a hidden layer, and an output layer with 2 output nodes
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.optimizers import Adam

n_inputs = VAL_DF.shape[1]

sample_model = Sequential([
    Dense(n_inputs, input_shape=(n_inputs, ), activation='relu'),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')
])

In [None]:
# 3. Configure the training process
sample_model.compile(Adam(learning_rate=0.001), # optimizer
                     loss='sparse_categorical_crossentropy', # loss function
                     metrics=['accuracy'])



In [None]:
# Inspect the configured model
sample_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 30)                930       
                                                                 
 dense_1 (Dense)             (None, 32)                992       
                                                                 
 dense_2 (Dense)             (None, 2)                 66        
                                                                 
Total params: 1,988
Trainable params: 1,988
Non-trainable params: 0
_________________________________________________________________


In [None]:
# 4. Train the model (on the labeled validation set, the only data with a Class column)
sample_model.fit(VAL_DF, val_df['Class'], validation_split=0.2,
    batch_size=32, # batch size 32
    epochs=15, # 15 epochs
    shuffle=True, verbose=2)

Epoch 1/15
712/712 - 5s - loss: 0.0312 - accuracy: 0.9962 - val_loss: 0.0091 - val_accuracy: 0.9988 - 5s/epoch - 7ms/step
Epoch 2/15
712/712 - 3s - loss: 0.0070 - accuracy: 0.9990 - val_loss: 0.0088 - val_accuracy: 0.9988 - 3s/epoch - 4ms/step
Epoch 3/15
712/712 - 3s - loss: 0.0061 - accuracy: 0.9990 - val_loss: 0.0085 - val_accuracy: 0.9988 - 3s/epoch - 4ms/step
Epoch 4/15
712/712 - 3s - loss: 0.0045 - accuracy: 0.9990 - val_loss: 0.0074 - val_accuracy: 0.9988 - 3s/epoch - 4ms/step
Epoch 5/15
712/712 - 2s - loss: 0.0038 - accuracy: 0.9994 - val_loss: 0.0075 - val_accuracy: 0.9988 - 2s/epoch - 3ms/step
Epoch 6/15
712/712 - 1s - loss: 0.0036 - accuracy: 0.9994 - val_loss: 0.0057 - val_accuracy: 0.9988 - 1s/epoch - 2ms/step
Epoch 7/15
712/712 - 1s - loss: 0.0034 - accuracy: 0.9994 - val_loss: 0.0063 - val_accuracy: 0.9988 - 1s/epoch - 2ms/step
Epoch 8/15
712/712 - 1s - loss: 0.0031 - accuracy: 0.9994 - val_loss: 0.0065 - val_accuracy: 0.9988 - 1s/epoch - 2ms/step
Epoch 9/15
712/712 - 1s 

<keras.callbacks.History at 0x7f1b8719fa90>

In [None]:
# Store the training results in a History object
# (calling fit again continues training from the current weights)
hist = sample_model.fit(
    VAL_DF, val_df['Class'], 
    validation_split=0.2, 
    batch_size=32, epochs=15,
    shuffle=True, verbose=2)

Epoch 1/15
712/712 - 2s - loss: 0.0024 - accuracy: 0.9995 - val_loss: 0.0052 - val_accuracy: 0.9991 - 2s/epoch - 2ms/step
Epoch 2/15
712/712 - 1s - loss: 0.0024 - accuracy: 0.9995 - val_loss: 0.0052 - val_accuracy: 0.9991 - 1s/epoch - 2ms/step
Epoch 3/15
712/712 - 1s - loss: 0.0023 - accuracy: 0.9996 - val_loss: 0.0030 - val_accuracy: 0.9993 - 1s/epoch - 2ms/step
Epoch 4/15
712/712 - 1s - loss: 0.0023 - accuracy: 0.9996 - val_loss: 0.0038 - val_accuracy: 0.9993 - 1s/epoch - 2ms/step
Epoch 5/15
712/712 - 1s - loss: 0.0023 - accuracy: 0.9996 - val_loss: 0.0036 - val_accuracy: 0.9993 - 1s/epoch - 2ms/step
Epoch 6/15
712/712 - 1s - loss: 0.0022 - accuracy: 0.9996 - val_loss: 0.0041 - val_accuracy: 0.9993 - 1s/epoch - 2ms/step
Epoch 7/15
712/712 - 1s - loss: 0.0021 - accuracy: 0.9996 - val_loss: 0.0055 - val_accuracy: 0.9991 - 1s/epoch - 2ms/step
Epoch 8/15
712/712 - 1s - loss: 0.0018 - accuracy: 0.9995 - val_loss: 0.0035 - val_accuracy: 0.9993 - 1s/epoch - 2ms/step
Epoch 9/15
712/712 - 1s 

In [None]:
hist.history['accuracy']

[0.9995169043540955,
 0.9995169043540955,
 0.9996047019958496,
 0.9995608329772949,
 0.9996486306190491,
 0.9995608329772949,
 0.9995608329772949,
 0.9995169043540955,
 0.9996047019958496,
 0.9996047019958496,
 0.9996047019958496,
 0.9995608329772949,
 0.9996047019958496,
 0.9996047019958496,
 0.9996047019958496]

In [None]:
hist.history['loss']

[0.0023871033918112516,
 0.002408544532954693,
 0.0023098718374967575,
 0.002304565627127886,
 0.0022855920251458883,
 0.002170458436012268,
 0.002096167765557766,
 0.0018114078557118773,
 0.0019778546411544085,
 0.001898507121950388,
 0.0020442407112568617,
 0.0020139876287430525,
 0.0018856441602110863,
 0.0019235610961914062,
 0.0019225989235565066]

In [None]:
# 5. Evaluate the model
loss, acc = sample_model.evaluate(VAL_DF, val_df['Class']) # loss, accuracy



In [None]:
# 6. Predict on the test data
sample_predictions = sample_model.predict(TEST_DF, batch_size=200, verbose=0)
sample_fraud_predictions = sample_predictions.argmax(axis=-1)

In [None]:
pd.Series(sample_fraud_predictions).value_counts()

0    142261
1       242
dtype: int64

In [None]:
# 7. Submission
submit_NeuralNet = pd.read_csv('/content/drive/MyDrive/sample_submission.csv')
submit_NeuralNet.head()

Unnamed: 0,ID,Class
0,AAAA0x1,0
1,AAAA0x2,0
2,AAAA0x5,0
3,AAAA0x7,0
4,AAAA0xc,0


In [None]:
submit_NeuralNet['Class'] = pd.Series(sample_fraud_predictions)
submit_NeuralNet.to_csv('/content/drive/MyDrive/submit_NeuralNet.csv', index=False)

Result: 0.8865000568
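The accuracy values in the training log are hard to interpret on data this imbalanced, because predicting "normal" for every row already scores about 0.999. `f1_score` is imported at the top of the notebook but never called. Assuming an F1-style metric is the one of interest (an assumption, since the metric behind the result above is not stated here), the sketch below computes macro F1 by hand with NumPy and compares it with accuracy for an all-normal predictor. The labels are rebuilt from the validation counts, not loaded from the file.

```python
import numpy as np

def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

# Labels with the same class counts as the validation set (28,432 normal, 30 fraud).
y_true = np.array([0] * 28432 + [1] * 30)
# A trivial "model" that predicts normal for every transaction.
y_all_normal = np.zeros_like(y_true)

accuracy = float(np.mean(y_all_normal == y_true))
f1 = macro_f1(y_true, y_all_normal)
print(f"accuracy = {accuracy:.4f}, macro F1 = {f1:.4f}")
```

On the real predictions, `f1_score(val_df['Class'], predicted_labels, average='macro')` from scikit-learn would give the same quantity; `predicted_labels` is a placeholder name for the argmax output.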

In [None]:
# Train a neural network with Keras
# 1. Prepare the data - min-max scaling
TRAIN_DF = normalize(train_df[train_df.columns.drop(['ID'])])
VAL_DF = normalize(val_df[val_df.columns.drop(['ID','Class'])])
TEST_DF = normalize(test_df[test_df.columns.drop(['ID'])])

In [None]:
# 2. Build the neural network
# 30 input nodes, 2 hidden layers, and an output layer with 2 output nodes
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Flatten
from tensorflow.keras.optimizers import Adam

n_inputs = VAL_DF.shape[1]

sample_model = Sequential([
    Dense(n_inputs, input_shape=(n_inputs, ), activation='relu'),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax'),
    Flatten()
])

In [None]:
# 3. Configure the training process
sample_model.compile(Adam(learning_rate=0.001), # optimizer
                     loss='sparse_categorical_crossentropy', # loss function
                     metrics=['accuracy'])



In [None]:
# Inspect the configured model
sample_model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_9 (Dense)             (None, 30)                930       
                                                                 
 dense_10 (Dense)            (None, 32)                992       
                                                                 
 dense_11 (Dense)            (None, 2)                 66        
                                                                 
 flatten (Flatten)           (None, 2)                 0         
                                                                 
Total params: 1,988
Trainable params: 1,988
Non-trainable params: 0
_________________________________________________________________


In [None]:
# 4. Train the model
sample_model.fit(
    VAL_DF,
    val_df['Class'],
    validation_split=0.2,
    batch_size=32, # batch size 32
    epochs=15, # 15 epochs
    shuffle=True,
    verbose=2)

Epoch 1/15
712/712 - 2s - loss: 0.6934 - accuracy: 0.5139 - val_loss: 0.6932 - val_accuracy: 0.1542 - 2s/epoch - 3ms/step
Epoch 2/15
712/712 - 1s - loss: 0.6931 - accuracy: 0.5188 - val_loss: 0.6932 - val_accuracy: 0.1909 - 1s/epoch - 2ms/step
Epoch 3/15
712/712 - 2s - loss: 0.6931 - accuracy: 0.5104 - val_loss: 0.6931 - val_accuracy: 0.5552 - 2s/epoch - 2ms/step
Epoch 4/15
712/712 - 1s - loss: 0.6931 - accuracy: 0.5096 - val_loss: 0.6931 - val_accuracy: 0.4279 - 1s/epoch - 2ms/step
Epoch 5/15
712/712 - 1s - loss: 0.6931 - accuracy: 0.5132 - val_loss: 0.6931 - val_accuracy: 0.3485 - 1s/epoch - 2ms/step
Epoch 6/15
712/712 - 1s - loss: 0.6931 - accuracy: 0.5133 - val_loss: 0.6931 - val_accuracy: 0.3812 - 1s/epoch - 2ms/step
Epoch 7/15
712/712 - 2s - loss: 0.6931 - accuracy: 0.5139 - val_loss: 0.6931 - val_accuracy: 0.6237 - 2s/epoch - 2ms/step
Epoch 8/15
712/712 - 2s - loss: 0.6931 - accuracy: 0.5164 - val_loss: 0.6931 - val_accuracy: 0.0033 - 2s/epoch - 2ms/step
Epoch 9/15
712/712 - 1s 

<keras.callbacks.History at 0x7f1b84459590>

In [None]:
# Store the training results in a History object
hist = sample_model.fit(
    VAL_DF, val_df['Class'], 
    validation_split=0.2, 
    batch_size=32, epochs=15,
    shuffle=True, verbose=2)

In [None]:
hist.history['accuracy']

In [None]:
hist.history['loss']

In [None]:
# 5. Evaluate the model
loss, acc = sample_model.evaluate(VAL_DF, val_df['Class']) # loss, accuracy



In [None]:
# 6. Predict on the training data
sample_predictions = sample_model.predict(TRAIN_DF, batch_size=200, verbose=0)
sample_fraud_predictions = sample_predictions.argmax(axis=-1)

In [None]:
pd.Series(sample_fraud_predictions).value_counts()

1    142489
0        14
dtype: int64
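The loss of this second model stays at 0.6931 throughout its training log above. That number matches ln 2, the cross-entropy of a two-class model that assigns probability 0.5 to each class regardless of the input, which suggests this run did not learn anything useful. The arithmetic can be checked in one line:

```python
import math

# Sparse categorical cross-entropy for one sample is -log(p_true_class).
# With a softmax output of [0.5, 0.5], every sample costs -log(0.5) = ln 2.
uniform_loss = -math.log(0.5)

print(round(uniform_loss, 4))
```

Predictions made from such near-uniform probabilities are decided by tiny differences between the two outputs, so the class counts printed above should not be read as a real fraud estimate.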

In [None]:
# Attach the model's predictions to the unlabeled training data as pseudo-labels
TRAIN_DF['Class'] = 0
TRAIN_DF['Class'] = pd.Series(sample_fraud_predictions)
TRAIN_DF

In [None]:
# 1. Prepare the training data (now carrying the pseudo-labels in 'Class')
TRAIN_DF

# 2. Build the neural network
n_inputs = TRAIN_DF.drop(columns='Class').shape[1]
sample_model = Sequential([
    Dense(n_inputs, input_shape=(n_inputs, ), activation='relu'),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')
])

# 3. Configure the training process and inspect the model
sample_model.compile(Adam(learning_rate=0.001),
                     loss='sparse_categorical_crossentropy',
                     metrics=['accuracy'])
sample_model.summary()

# 4. Train the model
sample_model.fit(
    TRAIN_DF.drop(columns='Class'), TRAIN_DF['Class'],
    validation_split=0.2,
    batch_size=32, epochs=15,
    shuffle=True, verbose=2)

In [None]:
# 5. Evaluate the model
loss, acc = sample_model.evaluate(VAL_DF, val_df['Class'])

In [None]:
# 6. Predict on the test data
sample_predictions = sample_model.predict(TEST_DF, batch_size=200, verbose=0)
sample_fraud_predictions = sample_predictions.argmax(axis=-1)

In [None]:
pd.Series(sample_fraud_predictions).value_counts()

In [None]:
# 7. Submission
submit_NeuralNet = pd.read_csv('/content/drive/MyDrive/sample_submission.csv')
submit_NeuralNet.head()

Unnamed: 0,ID,Class
0,AAAA0x1,0
1,AAAA0x2,0
2,AAAA0x5,0
3,AAAA0x7,0
4,AAAA0xc,0


In [None]:
submit_NeuralNet['Class'] = pd.Series(sample_fraud_predictions)
submit_NeuralNet.to_csv('/content/drive/MyDrive/submit_NeuralNetwork.csv', index=False)

Result: 0.8505783497

### 3-2. GAN

In [None]:
# 1. Prepare the validation data
VAL_DF = normalize(val_df[val_df.columns.drop(['ID','Class'])])

# 2. Build the generative adversarial network (GAN)
from tensorflow.keras.layers import LeakyReLU
n_inputs = VAL_DF.shape[1]
# Generator: maps a noise vector to a synthetic 30-feature row in the 0~1 range
g = Sequential()
g.add(Dense(n_inputs, input_shape=(n_inputs, ), activation=LeakyReLU(0.2)))
g.add(Dense(150, activation=LeakyReLU(0.2)))
g.add(Dense(30, activation='sigmoid'))

g.compile(Adam(learning_rate=0.001),
          loss='binary_crossentropy',
          metrics=['accuracy'])

# Discriminator: outputs the probability that a row is real rather than generated
d = Sequential()
d.add(Dense(n_inputs, input_shape=(30, ), activation=LeakyReLU(0.2)))
d.add(Dense(units=150, activation=LeakyReLU(0.2)))
d.add(Dense(units=1, activation='sigmoid'))

d.compile(loss='binary_crossentropy',
          optimizer='adam',
          metrics=['accuracy'])

# Freeze the discriminator so the combined model below only updates the generator
d.trainable = False



In [None]:
# 3. Configure the combined GAN model
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input

gan_input = Input(shape=(n_inputs,))
x = g(inputs=gan_input)
gan_output = d(x)

model = Model(gan_input, gan_output)

model.compile(loss='binary_crossentropy', 
              optimizer='adam', 
              metrics=['accuracy'])

In [None]:
# 4. Train the GAN
EPOCHS = 5
BATCH_SIZE = 30
num_of_batches = int(VAL_DF.shape[0] / BATCH_SIZE)

for e in range(EPOCHS):
    for n in range(num_of_batches):
        # Generate a batch of fake rows from Gaussian noise
        noise = np.random.normal(0, 1, size=(BATCH_SIZE, n_inputs))
        fake_data = g.predict(noise)
        
        # Sample a batch of real rows (with replacement) from the validation data
        real_data = VAL_DF.iloc[np.random.randint(0, VAL_DF.shape[0], size=BATCH_SIZE)]
        
        # Real rows get the smoothed label 0.9, generated rows get 0
        X = np.concatenate([real_data, fake_data])
        y = np.zeros(2*BATCH_SIZE)
        y[:BATCH_SIZE] = 0.9
        
        d.trainable = True
        d_loss = d.train_on_batch(X, y) # train the discriminator
        
        d.trainable = False
        
        # Train the generator through the combined model, labeling its output as real
        noise2 = np.random.normal(0, 1, size=(BATCH_SIZE, n_inputs))
        y2 = np.ones(BATCH_SIZE)
        g_loss = model.train_on_batch(noise2, y2)
            
        print('EPOCH =', e+1, ', BATCH =', n+1, ', G_LOSS=', g_loss, ', D_LOSS =', d_loss)

EPOCH = 1 , BATCH = 1 , G_LOSS= [0.834487795829773, 0.0] , D_LOSS = [0.6839821338653564, 0.5]
EPOCH = 1 , BATCH = 2 , G_LOSS= [0.8039377927780151, 0.0] , D_LOSS = [0.6771727204322815, 0.5]
EPOCH = 1 , BATCH = 3 , G_LOSS= [0.7778986692428589, 0.0] , D_LOSS = [0.6749443411827087, 0.5]
EPOCH = 1 , BATCH = 4 , G_LOSS= [0.758228063583374, 0.0] , D_LOSS = [0.6712954640388489, 0.5]
EPOCH = 1 , BATCH = 5 , G_LOSS= [0.7523677349090576, 0.0] , D_LOSS = [0.6682472229003906, 0.5]
EPOCH = 1 , BATCH = 6 , G_LOSS= [0.7476255297660828, 0.0] , D_LOSS = [0.6674131751060486, 0.5]
EPOCH = 1 , BATCH = 7 , G_LOSS= [0.7489855289459229, 0.0] , D_LOSS = [0.6646971106529236, 0.44999998807907104]
EPOCH = 1 , BATCH = 8 , G_LOSS= [0.7308798432350159, 0.03333333507180214] , D_LOSS = [0.6624858379364014, 0.5]
EPOCH = 1 , BATCH = 9 , G_LOSS= [0.7434528470039368, 0.0] , D_LOSS = [0.6602922081947327, 0.46666666865348816]
EPOCH = 1 , BATCH = 10 , G_LOSS= [0.7537756562232971, 0.0] , D_LOSS = [0.6558986306190491, 0.5]
EPO

In [None]:
# 5. Evaluate the model
loss, acc = model.evaluate(VAL_DF, val_df['Class']) # loss, accuracy



In [None]:
# 6. Predict on the training data
TRAIN_DF = normalize(train_df[train_df.columns.drop(['ID'])])
predictions = model.predict(TRAIN_DF, batch_size=200, verbose=0)
fraud_predictions = predictions.argmax(axis=-1)

In [None]:
pd.Series(fraud_predictions).value_counts()

0    113842
dtype: int64
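The all-zero result above is what `argmax(axis=-1)` returns for any model whose output has a single unit: with only one column, the index of the largest value is always 0. The sketch below shows this on made-up sigmoid outputs and contrasts it with thresholding. The `probs` values and the 0.5 cut-off are assumptions for illustration, not values from the notebook.

```python
import numpy as np

# Hypothetical sigmoid outputs of shape (n_samples, 1), like a one-unit output layer.
probs = np.array([[0.10], [0.95], [0.40], [0.70]])

# argmax over the last axis can only return index 0 when that axis has length 1.
argmax_labels = probs.argmax(axis=-1)

# Thresholding the probability yields labels that depend on the model output.
threshold = 0.5  # assumed cut-off; not taken from the notebook
threshold_labels = (probs[:, 0] >= threshold).astype(int)

print(argmax_labels)
print(threshold_labels)
```

Note that the combined GAN model's output is the discriminator's "real vs. generated" probability, so even thresholded labels would need care before being read as fraud predictions.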

In [None]:
# Attach the GAN model's predictions to the training data
TRAIN_DF['Class'] = 0
TRAIN_DF['Class'] = pd.Series(fraud_predictions)
TRAIN_DF

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,...,V22,V23,V24,V25,V26,V27,V28,V29,V30,Class
0,0.922825,0.726028,0.868141,0.268766,0.772154,0.249182,0.270177,0.791604,0.408281,0.513018,...,0.546030,0.639586,0.289354,0.513503,0.402727,0.415489,0.491056,0.014739,0.000006,0
1,0.930777,0.739551,0.868484,0.213661,0.775515,0.243372,0.266803,0.793002,0.412695,0.507585,...,0.510277,0.620867,0.223826,0.587637,0.389197,0.417669,0.494929,0.004807,0.000006,0
2,0.941737,0.752967,0.857187,0.244472,0.778456,0.229963,0.268257,0.791740,0.440997,0.501038,...,0.483915,0.623658,0.332185,0.520715,0.442749,0.421196,0.495556,0.000143,0.000012,0
3,0.937309,0.758323,0.856031,0.230111,0.782056,0.234771,0.272183,0.747904,0.481946,0.534571,...,0.462660,0.625086,0.294686,0.506841,0.417014,0.394234,0.458290,0.001588,0.000041,0
4,0.932238,0.745071,0.835452,0.239894,0.793789,0.269357,0.267610,0.798106,0.447105,0.500230,...,0.497525,0.620631,0.518546,0.566791,0.362697,0.416728,0.497515,0.003628,0.000041,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
113837,0.696496,0.861012,0.690525,0.140643,0.744310,0.215632,0.243226,0.848184,0.629886,0.687797,...,0.437023,0.639268,0.213356,0.674264,0.477844,0.455101,0.532353,0.000384,0.999919,0
113838,0.988608,0.740039,0.820086,0.318724,0.775345,0.223831,0.266514,0.786251,0.483712,0.511224,...,0.539637,0.623390,0.375065,0.562472,0.350073,0.416848,0.491653,0.002335,0.999931,0
113839,0.945470,0.750060,0.844342,0.231388,0.777253,0.216166,0.271017,0.786713,0.464338,0.502831,...,0.486035,0.628866,0.432286,0.495894,0.443930,0.418938,0.495561,0.000214,0.999942,0
113840,0.952817,0.752622,0.827952,0.218901,0.783293,0.227797,0.270307,0.790175,0.453606,0.495120,...,0.472314,0.624964,0.396086,0.505274,0.445716,0.420534,0.495163,0.000105,0.999959,0
