# Keras

<img src="https://s3.amazonaws.com/keras.io/img/keras-logo-2018-large-1200.png">

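- A high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano
- Enables fast and easy prototyping through __user friendliness__, modularity, and extensibility
- Supports convolutional networks, recurrent networks, and combinations of the two
- Runs seamlessly on CPU and GPU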


### Keras in TensorFlow 2.0

- Keras creator François Chollet took part in the development of TF 2.0, and TF 2.0 adopted Keras as its official and sole high-level API

- Keras 2.4 supports only TensorFlow 2 as its backend

- Neural layers, cost functions, optimizers, initialization schemes, activation functions, and regularization schemes are all standalone modules that can be combined to build new models


### Why are we using Keras? 

- Keras enables deep learning engineers to build and experiment with different models very quickly.
- Just as TensorFlow is a higher-level framework than Python, Keras is an even higher-level framework and provides additional abstractions.
- However, Keras is more restrictive than the lower-level frameworks. 
- So there are some very complex models that you can implement in TensorFlow but not in Keras. 
- Keras will work fine for many common models.



### A Basic Example Using Keras API

In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Step 1: Prepare and preprocess the data
# (random features and random 0/1 labels, used only to exercise the API)
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Step 2: Build the model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))

# Step 3: Compile the model
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Step 4: Train the model
history = model.fit(data, labels, epochs=100, batch_size=32)

# Step 5: Evaluate the model (predictions on the training data)
prediction = model.predict(data)
for i in range(10):
    print('labels = {}\tprediction = {}'.format(labels[i], prediction[i]))


In [None]:
# Visualize the training history

import matplotlib.pyplot as plt

plt.plot(history.history['loss'], 'ro')
plt.plot(history.history['accuracy'],'b--')
plt.title('Training history')
plt.ylabel('loss/accuracy')
plt.xlabel('epochs')
plt.grid()
plt.legend(['loss','accuracy'])
plt.show()
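To make the abstraction concrete, the following NumPy sketch spells out what the 100 -> 32 (ReLU) -> 1 (sigmoid) network above computes in a forward pass, together with the binary cross-entropy loss. It is only an illustration under stated assumptions: the weights are random stand-ins, the helper names (`relu`, `sigmoid`, `forward`, `binary_crossentropy`) and the clipping constant `eps` are hypothetical, and this is not how Keras implements these operations internally.

```python
import numpy as np

def relu(z):
    # Element-wise max(z, 0)
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Forward pass of a 100 -> 32 (ReLU) -> 1 (sigmoid) network."""
    hidden = relu(x @ W1 + b1)
    return sigmoid(hidden @ W2 + b2)

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; eps (an assumed constant) keeps log() away from 0."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Random stand-ins for a small batch and for the layer weights (assumptions)
rng = np.random.default_rng(0)
x = rng.random((4, 100))
y = rng.integers(0, 2, size=(4, 1))
W1, b1 = rng.normal(scale=0.1, size=(100, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.1, size=(32, 1)), np.zeros(1)

probs = forward(x, W1, b1, W2, b2)
loss = binary_crossentropy(y, probs)
print(probs.shape, loss)
```

Training adjusts `W1`, `b1`, `W2`, and `b2` to lower this loss; that is the part `model.fit` automates.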


### A Basic Example using Keras API - another way

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Step 1: Prepare and preprocess the data
# (random features and random 0/1 labels, used only to exercise the API)
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Step 2: Build the model by passing the list of layers to Sequential
model = Sequential([
    Dense(32, activation='relu', input_dim=100),
    Dense(1, activation='sigmoid')
])

# Step 3: Compile the model
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Step 4: Train the model
history = model.fit(data, labels, epochs=100, batch_size=32)

# Step 5: Test and evaluate the model (predictions on the training data)
prediction = model.predict(data)
for i in range(10):
    print('labels = {}\tprediction = {}'.format(labels[i], prediction[i]))

### Another Real Example using MNIST dataset




### MNIST Dataset

MNIST is a simple computer vision dataset. It consists of images of handwritten digits like these:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/27/MnistExamples.png/480px-MnistExamples.png">

- 60,000 images of handwritten digits for training data (`X_train`, `Y_train`),
- 10,000 images of handwritten digits for test data (`X_test`, `Y_test`),
- Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers:

<img src="https://www.tensorflow.org/images/MNIST-Matrix.png">
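As a small illustration of the "big array of numbers" view, the sketch below flattens a batch of 28x28 images into rows of 784 values and rescales the pixels to the range 0 to 1. The images here are random stand-ins generated with NumPy, not real MNIST data, and the variable names are chosen only for this example.

```python
import numpy as np

# Stand-in for a batch of MNIST images: 3 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into one row of 784 values (row-major order)
flat = images.reshape(-1, 28 * 28)

# Scale pixel intensities from [0, 255] to [0.0, 1.0]
scaled = flat / 255.0

print(images.shape, '->', flat.shape)
print(scaled.dtype, scaled.min(), scaled.max())
```

Because the reshape is row-major, pixel `(row, col)` of an image ends up at index `row * 28 + col` of its flattened vector.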

An extended dataset similar to MNIST, called [EMNIST](https://www.nist.gov/itl/products-and-services/emnist-dataset), was published in 2017. It contains 240,000 training images and 40,000 testing images of handwritten digits and characters.



### One-hot encoding

<img src="https://hackernoon.com/photos/4HK5qyMbWfetPhAavzyTZrEb90N2-3o23tie">
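One-hot encoding can be sketched in a few lines of NumPy: each integer label selects one row of an identity matrix. The helper `one_hot` below is a hypothetical name used only for illustration; it shows the idea and is not the implementation behind `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return one row per label with a single 1 at the label's index."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]

labels = np.array([5, 0, 4])
encoded = one_hot(labels, num_classes=10)
print(encoded)

# argmax along the last axis recovers the original integer labels
decoded = np.argmax(encoded, axis=-1)
print(decoded)
```

The `argmax` step is the inverse operation, which is why it appears whenever one-hot labels or softmax outputs have to be turned back into digits.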

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

# Browsing Original Dataset

(X_train, Y_train),(X_test, Y_test) = mnist.load_data()

# Inspect the shapes of the original data
print('X_train shape: {}\tY_train shape: {}\n'.format(X_train.shape, Y_train.shape))
print('X_test shape: {}\tY_test shape: {}\n'.format(X_test.shape, Y_test.shape))

np.set_printoptions(linewidth=600, precision=2)  # linewidth: max characters per line, precision: digits after the decimal point

print('\n',Y_train[0: 9])
print('\n',X_train[0])
print('\n',X_train[5])     


In [None]:
fig = plt.figure(figsize=(6,6)) # Width, height in inches.
fig.subplots_adjust(hspace=0.7)

for index in range(25):
    plt.subplot(5, 5, index + 1)  # 5 rows, 5 columns
    plt.imshow(X_train[index], cmap='gray')
    plt.axis('off')
    plt.title(str(Y_train[index]))
    
plt.show()

plt.imshow(X_train[9], cmap='gray') 
plt.colorbar()   
plt.show()

In [None]:
# Check the label distribution of the training data

label_distribution = np.zeros(10)
for i in range(len(Y_train)):
    label = int(Y_train[i])
    label_distribution[label] = label_distribution[label] + 1

print(label_distribution)

# Visualize the distribution as a histogram

plt.title('train label distribution')
plt.grid()
plt.xlabel('label')
# One bin centered on each digit 0-9
plt.hist(Y_train, bins=np.arange(11) - 0.5, rwidth=0.8)
plt.show()
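The counting loop above can also be written with `np.bincount`, which tallies how often each non-negative integer occurs. The sketch below uses a short hand-made label array as a stand-in for `Y_train`.

```python
import numpy as np

# Stand-in for Y_train: ten integer digit labels
labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4])

# minlength=10 guarantees one slot per digit, even for digits that never occur
counts = np.bincount(labels, minlength=10)
print(counts)  # [1 3 1 1 2 1 0 0 0 1]
```

Unlike the loop, this returns integer counts; the totals are otherwise the same.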

In [None]:
# Basic plotting function
import matplotlib.pyplot as plt

def plot(data1, data2, data1_label='', data2_label='', title='', xlabel='', ylabel=''):
    """Plot two series (e.g. training and validation curves) on one figure."""
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.plot(data1, label=data1_label)
    plt.plot(data2, label=data2_label)
    plt.legend(loc='best')
    plt.show()

In [None]:
import time
start = time.time()  # record the start time to measure execution time

# Step 1: Prepare the data, preprocess it, and set the hyperparameters

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
image_size = X_train.shape[1]
num_classes = 10
input_size = image_size * image_size

# Data preprocessing:

# 1) Flatten each 28x28 image into a 1-D vector of 784 values for the Dense layer
X_train = np.reshape(X_train, [-1, input_size])
X_test = np.reshape(X_test, [-1, input_size])
print('X_train shape: ', X_train.shape)

# 2) Data normalization: scale pixels from [0, 255] to [0, 1], which speeds up
#    training and reduces the chance of getting stuck in a poor local optimum
X_train = X_train / 255.0
X_test = X_test / 255.0

# 3) One-hot encoding
# Y_train holds the labels for the X_train examples. Rather than storing each
# label as an integer, store it as a 1x10 binary array whose single 1 marks
# the digit. This is known as one-hot encoding.
Y_train = to_categorical(Y_train, num_classes=num_classes)
Y_test = to_categorical(Y_test, num_classes=num_classes)

# Hyperparameters
# Use named variables rather than hard-coding the same numbers
# everywhere they are needed.
epochs = 20
batch_size = 128
hidden_units = 32

# Step 2: Build the model
model = Sequential([
    Dense(hidden_units, activation='relu', input_dim=input_size),
    Dense(num_classes, activation='softmax')
])

# Step 3: Compile the model
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Step 4: Train the model
# verbose = 0 (silent), 1 (progress bar), 2 (one line per epoch); default = 1
hist = model.fit(X_train, Y_train, epochs=epochs,
                 batch_size=batch_size, validation_split=0.3, verbose=1)

# Visualize the loss
plot(hist.history['loss'], hist.history['val_loss'], 'train loss', 'validation loss', 'Loss', 'epochs', 'loss')

# Visualize the accuracy
plot(hist.history['accuracy'], hist.history['val_accuracy'], 'train accuracy', 'validation accuracy', 'Accuracy', 'epochs', 'accuracy')

# Step 5: Evaluate the model on the test data
model.evaluate(X_test, Y_test, batch_size=batch_size)

print('\nExecution time in seconds:', time.time() - start)  # print the execution time
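The softmax output layer and the `categorical_crossentropy` loss used above can be sketched in NumPy as follows. This is a minimal illustration, not Keras's internal code: the function names are reused only for readability, and the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -log(probability assigned to the true class), for one-hot y_true."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs)
print(loss)
```

The second row has equal scores, so softmax assigns each class probability 1/3 and that sample contributes `log(3)` to the loss; a confident, correct prediction contributes a value close to 0.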

### Confusion Matrix

- A tool for evaluating the performance of a classification model
- A matrix that shows how well the predicted values match the actual observed values
<table>
<tr> <th></th>               <th>Predicted (yes)</th>     <th>Predicted (no)</th> </tr>
<tr> <td>Actual (yes)</td>       <td>TP</td>           <td>FN</td></tr>
<tr> <td>Actual (no)</td>   <td>FP</td>           <td>TN</td></tr>
</table>

- TP (True Positive): a patient predicted to have the disease (yes) who actually has it
- TN (True Negative): a patient predicted not to have the disease (no) who actually does not have it
- FP (False Positive): a patient predicted to have the disease (yes) who actually does not have it
- FN (False Negative): a patient predicted not to have the disease (no) who actually has it
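The four cells can be counted directly from paired arrays of actual and predicted values. The sketch below uses ten made-up patients (1 = has the disease, 0 = does not); the data and variable names are purely illustrative.

```python
import numpy as np

# Made-up example: 1 = has the disease, 0 = does not
actual    = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])
predicted = np.array([1, 0, 1, 0, 1, 0, 0, 1, 0, 1])

tp = int(np.sum((actual == 1) & (predicted == 1)))  # predicted yes, actually yes
fn = int(np.sum((actual == 1) & (predicted == 0)))  # predicted no,  actually yes
fp = int(np.sum((actual == 0) & (predicted == 1)))  # predicted yes, actually no
tn = int(np.sum((actual == 0) & (predicted == 0)))  # predicted no,  actually no

# Same layout as the table: rows are actual (yes, no), columns are predicted (yes, no)
matrix = np.array([[tp, fn],
                   [fp, tn]])
print(matrix)

# Accuracy is the share of correct predictions (the diagonal) among all cases
accuracy = (tp + tn) / matrix.sum()
print(accuracy)
```

For this data the counts are TP = 3, FN = 1, FP = 2, TN = 4, so 7 of the 10 predictions are correct.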

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns

predicted_value = model.predict(X_test)

# Distribution of the true labels and the predicted labels on the test data

label_distribution = np.zeros(10)
pred_distribution = np.zeros(10)
for i in range(len(Y_test)):
    label = int(np.argmax(Y_test[i]))
    label_distribution[label] = label_distribution[label] + 1
    pred_label = int(np.argmax(predicted_value[i]))
    pred_distribution[pred_label] = pred_distribution[pred_label] + 1

print('True label distribution:', label_distribution)
print('Predicted label distribution:', pred_distribution)

# Check the confusion matrix (rows: true labels, columns: predicted labels)

cm = confusion_matrix(np.argmax(Y_test, axis=-1),
                      np.argmax(predicted_value, axis=-1))

plt.figure(figsize=(6, 6))
sns.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()

# Check the accuracy for each of the 10 digits.
# Correct predictions for digit i sit on the diagonal, cm[i, i].

print(cm, '\n')

for i in range(10):
    print(('label = %d\t(%d/%d)\taccuracy = %.3f') %
          (i, cm[i, i], np.sum(cm[i]),
           cm[i, i] / np.sum(cm[i])))
print('\n')

# Per-digit counts of true labels and predictions on the test data (recomputed)

label_distribution = np.zeros(10)
prediction_distribution = np.zeros(10)

for i in range(len(Y_test)):
    label = int(np.argmax(Y_test[i]))
    label_distribution[label] = label_distribution[label] + 1
    prediction = int(np.argmax(predicted_value[i]))
    prediction_distribution[prediction] = prediction_distribution[prediction] + 1

print(label_distribution, '\n')
print(prediction_distribution)
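For more than two classes, the confusion matrix has one row per true class and one column per predicted class, and the diagonal entry `cm[i, i]` counts the correct predictions for class `i`. The NumPy sketch below builds such a matrix for a made-up 3-class example and derives per-class accuracy from the diagonal; `confusion_counts` is a hypothetical helper written for this illustration, not the scikit-learn function.

```python
import numpy as np

def confusion_counts(true_labels, pred_labels, num_classes):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when the same (i, j) pair repeats
    np.add.at(cm, (true_labels, pred_labels), 1)
    return cm

# Made-up labels for a 3-class problem
true_labels = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])
pred_labels = np.array([0, 1, 1, 1, 2, 2, 2, 0, 2])

cm = confusion_counts(true_labels, pred_labels, num_classes=3)
print(cm)

# Correct predictions per class divided by the number of samples of that class
per_class_accuracy = np.diag(cm) / cm.sum(axis=1)
print(per_class_accuracy)
```

Using the diagonal rather than the largest value in each row matters when a class is mostly misclassified: the row maximum would then count wrong predictions as correct.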

In [None]:
import time
start = time.time()  # record the start time to measure execution time

# Step 1: Prepare the data, preprocess it, and set the hyperparameters

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()
image_size = X_train.shape[1]
num_classes = 10
input_size = image_size * image_size

# Data preprocessing:

# 1) Flatten each 28x28 image into a 1-D vector of 784 values for the Dense layer
X_train = np.reshape(X_train, [-1, input_size])
X_test = np.reshape(X_test, [-1, input_size])
print('X_train shape: ', X_train.shape)

# 2) Data normalization: scale pixels from [0, 255] to [0, 1], which speeds up
#    training and reduces the chance of getting stuck in a poor local optimum
X_train = X_train / 255.0
X_test = X_test / 255.0

# 3) One-hot encoding
# Y_train holds the labels for the X_train examples. Rather than storing each
# label as an integer, store it as a 1x10 binary array whose single 1 marks
# the digit. This is known as one-hot encoding.
Y_train = to_categorical(Y_train, num_classes=num_classes)
Y_test = to_categorical(Y_test, num_classes=num_classes)

# Hyperparameters
# Use named variables rather than hard-coding the same numbers
# everywhere they are needed.
epochs = 20
batch_size = 128
hidden_units = 32  # not used by the single-layer model below

# Step 2: Build the model (a single softmax layer, with no hidden layer)
model = Sequential([
    Dense(num_classes, activation='softmax', input_dim=input_size)
])

# Step 3: Compile the model
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

# Step 4: Train the model
# verbose = 0 (silent), 1 (progress bar), 2 (one line per epoch); default = 1
hist = model.fit(X_train, Y_train, epochs=epochs,
                 batch_size=batch_size, validation_split=0.3, verbose=1)

# Visualize the loss
plot(hist.history['loss'], hist.history['val_loss'], 'train loss', 'validation loss', 'Loss', 'epochs', 'loss')

# Visualize the accuracy
plot(hist.history['accuracy'], hist.history['val_accuracy'], 'train accuracy', 'validation accuracy', 'Accuracy', 'epochs', 'accuracy')

# Step 5: Evaluate the model on the test data
model.evaluate(X_test, Y_test, batch_size=batch_size)

print('\nExecution time in seconds:', time.time() - start)  # print the execution time
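The parameter counts reported by `model.summary()` can be checked by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit. The short sketch below applies that formula to the single-layer softmax model and to the earlier model with a 32-unit hidden layer; `dense_params` is a hypothetical helper introduced only for this arithmetic.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

input_size = 28 * 28  # 784 values per flattened image

# Single softmax layer: 784 -> 10
softmax_only = dense_params(input_size, 10)

# Hidden layer plus softmax layer: 784 -> 32 -> 10
with_hidden = dense_params(input_size, 32) + dense_params(32, 10)

print(softmax_only)  # 7850
print(with_hidden)   # 25450
```

The hidden layer roughly triples the number of trainable parameters, which is worth keeping in mind when comparing the accuracy and run time of the two models.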

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns

predicted_value = model.predict(X_test)

# Distribution of the true labels and the predicted labels on the test data

label_distribution = np.zeros(10)
pred_distribution = np.zeros(10)
for i in range(len(Y_test)):
    label = int(np.argmax(Y_test[i]))
    label_distribution[label] = label_distribution[label] + 1
    pred_label = int(np.argmax(predicted_value[i]))
    pred_distribution[pred_label] = pred_distribution[pred_label] + 1

print('True label distribution:', label_distribution)
print('Predicted label distribution:', pred_distribution)

# Check the confusion matrix (rows: true labels, columns: predicted labels)

cm = confusion_matrix(np.argmax(Y_test, axis=-1),
                      np.argmax(predicted_value, axis=-1))

plt.figure(figsize=(6, 6))
sns.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.show()

# Check the accuracy for each of the 10 digits.
# Correct predictions for digit i sit on the diagonal, cm[i, i].

print(cm, '\n')

for i in range(10):
    print(('label = %d\t(%d/%d)\taccuracy = %.3f') %
          (i, cm[i, i], np.sum(cm[i]),
           cm[i, i] / np.sum(cm[i])))
print('\n')

# Per-digit counts of true labels and predictions on the test data (recomputed)

label_distribution = np.zeros(10)
prediction_distribution = np.zeros(10)

for i in range(len(Y_test)):
    label = int(np.argmax(Y_test[i]))
    label_distribution[label] = label_distribution[label] + 1
    prediction = int(np.argmax(predicted_value[i]))
    prediction_distribution[prediction] = prediction_distribution[prediction] + 1

print(label_distribution, '\n')
print(prediction_distribution)

In [None]:
# Collect the test images whose true label is 5 but which the model misclassified
images_label5 = []
for i in range(len(Y_test)):
    if (np.argmax(Y_test[i]) == 5) and (np.argmax(predicted_value[i]) != 5):
        images_label5.append(X_test[i])

number_images = len(images_label5)

# Display each misclassified image, reshaped back to 28x28
for i in range(number_images):
    plt.imshow(np.reshape(images_label5[i], [28, 28]), cmap='gray')
    plt.colorbar()
    plt.axis('off')
    plt.show()