# Letter recognition (small size)

> Indeed, I once even proposed that the toughest challenge facing AI workers is to answer the question: “What are the letters ‘A’ and ‘I’?” - [Douglas R. Hofstadter](https://web.stanford.edu/group/SHR/4-2/text/hofstadter.html) (1995)


## notMNIST


Data source: [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) (you need to download the `notMNIST_small.mat` file):

![](http://yaroslavvb.com/upload/notMNIST/nmn.png)

> some publicly available fonts and extracted glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J taken from different fonts.

> Approaching 0.5% error rate on notMNIST_small would be very impressive. If you run your algorithm on this dataset, please let me know your results.


## So, why not MNIST?

Many introductions to image classification with deep learning start with MNIST, a standard dataset of handwritten digits. This is unfortunate. Not only does it fail to produce a “Wow!” effect or show where deep learning shines, but it can also be solved with shallow machine learning techniques: plain k-Nearest Neighbors reaches more than 97% accuracy (or even 99.5% with some data preprocessing!). Moreover, MNIST is not a typical image dataset, and mastering it is unlikely to teach you transferable skills that would be useful for other classification problems.

> Many good ideas will not work well on MNIST (e.g. batch norm). Inversely many bad ideas may work on MNIST and no[t] transfer to real [computer vision]. - [François Chollet’s tweet](https://twitter.com/fchollet/status/852594987527045120)

In [None]:
!wget http://yaroslavvb.com/upload/notMNIST/notMNIST_small.mat

In [None]:
import matplotlib.pyplot as plt
from scipy import io
import numpy as np

## Data Loading

In [None]:
data = io.loadmat('notMNIST_small.mat')

data

In [None]:
x = data['images']
y = data['labels']

In [None]:
x.shape, y.shape

In [None]:
resolution = 28
classes = 10

x = np.transpose(x, (2, 0, 1))
print(x.shape)
x = x.reshape( (-1, resolution, resolution, 1) )

In [None]:
# sample, x, y, channel
x.shape, y.shape
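The axis shuffle above is easy to get wrong, so here is a minimal sketch on a small synthetic array. The names `raw`, `images`, and `scrambled` are hypothetical stand-ins, not part of the notebook. It illustrates why the sample axis has to be moved with `np.transpose` before the channel axis is added: a bare `reshape` would keep the element count but mix pixels from different images.

```python
import numpy as np

# Hypothetical stand-in for data['images']: 5 images stored as (height, width, sample)
resolution, n_samples = 28, 5
raw = np.arange(resolution * resolution * n_samples).reshape(resolution, resolution, n_samples)

# Move the sample axis to the front: (28, 28, N) -> (N, 28, 28)
images = np.transpose(raw, (2, 0, 1))

# Append a channel axis of size 1: (N, 28, 28) -> (N, 28, 28, 1)
images = images.reshape((-1, resolution, resolution, 1))

# For comparison: reshaping without the transpose reorders pixels across images
scrambled = raw.reshape((n_samples, resolution, resolution))

print(images.shape)
print(np.array_equal(images[3, :, :, 0], raw[:, :, 3]))   # image 3 is preserved
print(np.array_equal(scrambled[3], raw[:, :, 3]))         # image 3 is not preserved
```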

* Exploring the data

In [None]:
rand_i = np.random.randint(0, x.shape[0])

plt.title( f'idx: {rand_i} , y: {"ABCDEFGHIJ"[ int(y[rand_i]) ]}' )
plt.imshow( x[rand_i, :, :, 0], cmap='Greys' )
plt.show()

In [None]:
rows = 5
fig, axes = plt.subplots(rows, classes, figsize=(classes,rows))

for letter_id in range(classes) :
    letters = x[y==letter_id]      # all images whose label equals letter_id (0 to 9)
    letters_len = len(letters)

    for row_i in range(rows) :
        axe = axes[row_i, letter_id]
        axe.imshow( letters[np.random.randint(letters_len)], cmap='Greys', interpolation='none')
        axe.axis('off')

## Data Preprocessing

* Data split

    - training set : test set = 8 : 2
    - fixed random seed for reproducibility: 2023

In [None]:
x.shape, y.shape

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.2, random_state=2023)

In [None]:
train_x.shape, train_y.shape, test_x.shape, test_y.shape
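The split above is purely random, so the class proportions in the two sets can drift slightly. As an optional variation (an assumption, not something the notebook does), `train_test_split` accepts a `stratify` argument that keeps the class ratios equal in both sets. The sketch below uses hypothetical toy arrays `toy_x` and `toy_y` rather than the notMNIST data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 10 classes with 10 samples each
toy_x = np.arange(100).reshape(100, 1)
toy_y = np.repeat(np.arange(10), 10)

# Same 8:2 ratio and seed as above, plus stratification on the labels
tr_x, te_x, tr_y, te_y = train_test_split(
    toy_x, toy_y, test_size=0.2, random_state=2023, stratify=toy_y)

# Count how many samples of each class landed in each set
train_counts = np.bincount(tr_y, minlength=10)
test_counts = np.bincount(te_y, minlength=10)
print(train_counts)
print(test_counts)
```

With stratification every class contributes the same share to the test set, which makes per-class metrics easier to compare.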

* Scaling

    - min-max scaling

In [None]:
max_n, min_n = train_x.max(), train_x.min()

In [None]:
train_x = (train_x - min_n) / (max_n - min_n)
test_x = (test_x - min_n) / (max_n - min_n)

In [None]:
train_x.max(), train_x.min()
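Note that the minimum and maximum come from the training set only and are then reused for the test set. A minimal sketch of that pattern follows; `fit_min_max` and `apply_min_max` are hypothetical helper names and the toy values are made up for illustration.

```python
import numpy as np

def fit_min_max(train):
    """Learn the scaling range from the training data only."""
    return float(train.min()), float(train.max())

def apply_min_max(data, lo, hi):
    """Rescale data with a previously learned range."""
    return (data - lo) / (hi - lo)

toy_train = np.array([0.0, 51.0, 255.0])
toy_test = np.array([0.0, 127.5, 300.0])   # 300 lies outside the training range

lo, hi = fit_min_max(toy_train)
scaled_train = apply_min_max(toy_train, lo, hi)
scaled_test = apply_min_max(toy_test, lo, hi)

print(scaled_train)
print(scaled_test)
```

Because the range is fixed by the training data, scaled test values can fall slightly outside [0, 1]. That is expected and avoids leaking test-set statistics into preprocessing.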

* One-hot encoding

In [None]:
from tensorflow.keras.utils import to_categorical

In [None]:
class_len = len(np.unique(train_y))
class_len

In [None]:
train_y = to_categorical(train_y, class_len)
test_y = to_categorical(test_y, class_len)
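To make the encoding concrete, here is a rough NumPy equivalent of what one-hot encoding produces. `one_hot` is a hypothetical helper written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels into rows with a single 1 at the label's position."""
    labels = np.asarray(labels, dtype=int).ravel()
    return np.eye(num_classes, dtype="float32")[labels]

toy_labels = np.array([0, 2, 9])
encoded = one_hot(toy_labels, 10)

print(encoded.shape)
print(encoded[1])
```

Each row sums to 1, and `np.argmax` along axis 1 recovers the original integer label.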

* Re-check data shapes

In [None]:
train_x.shape, train_y.shape

## Modeling
- Requirements
    1. Use a Flatten layer
    2. Use BatchNormalization after each Dense layer that has an activation function
    3. Use Dropout with a rate of about 0.2
    4. Use early stopping

In [None]:
import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.backend import clear_session
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense, Flatten, BatchNormalization, Dropout
from tensorflow.keras.layers import Activation

* Sequential API

In [None]:
# The original BatchNormalization authors proposed applying it before the activation.

# 1. Clear the session
clear_session()

# 2. Declare the model
model1 = Sequential()

# 3. Assemble the layer blocks
model1.add( Input(shape=(28,28,1)) )
model1.add( Flatten() )
model1.add( Dense(256) )
model1.add( BatchNormalization() )
model1.add( Activation('relu') )
model1.add( Dropout(0.2) )
model1.add( Dense(128) )
model1.add( BatchNormalization() )
model1.add( Activation('relu') )
model1.add( Dropout(0.2) )
model1.add( Dense(64) )
model1.add( BatchNormalization() )
model1.add( Activation('relu') )
model1.add( Dropout(0.2) )
model1.add( Dense(10) )
model1.add( Activation('softmax') )

# 4. Compile
model1.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

model1.summary()

In [None]:
from tensorflow.keras.utils import plot_model

In [None]:
plot_model(model1, show_shapes=True, show_layer_activations=True)

* Functional API

In [None]:
# However, applying BatchNormalization after the activation is often reported to perform better.

# 1. Clear the session
clear_session()

# 2. Chain the layers
il = Input(shape=(28,28,1))
hl = Flatten()(il)
hl = Dense(256, activation='relu')(hl)
hl = BatchNormalization()(hl)
hl = Dropout(0.2)(hl)
hl = Dense(128, activation='relu')(hl)
hl = BatchNormalization()(hl)
hl = Dropout(0.2)(hl)
hl = Dense(64, activation='relu')(hl)
hl = BatchNormalization()(hl)
hl = Dropout(0.2)(hl)
ol = Dense(10, activation='softmax')(hl)

# 3. Specify the model's input and output
model2 = Model(il, ol)

# 4. Compile
model2.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])

model2.summary()
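`model1` and `model2` differ only in whether normalization comes before or after ReLU. The sketch below shows what that ordering does to the values passed to the next layer. It is a simplified training-mode forward pass in NumPy and makes several assumptions: the learned scale and shift are fixed at 1 and 0, moving averages are ignored, and `batch_norm`, `relu`, and the random inputs are hypothetical.

```python
import numpy as np

def batch_norm(z, eps=1e-3):
    """Normalize each feature over the batch axis (scale=1, shift=0 assumed)."""
    mean = z.mean(axis=0)
    var = z.var(axis=0)
    return (z - mean) / np.sqrt(var + eps)

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
pre_activations = rng.normal(loc=2.0, scale=3.0, size=(256, 4))

bn_then_relu = relu(batch_norm(pre_activations))   # ordering used in model1
relu_then_bn = batch_norm(relu(pre_activations))   # ordering used in model2

print(bn_then_relu.mean(axis=0))   # non-negative outputs, positive mean
print(relu_then_bn.mean(axis=0))   # zero-centered outputs
```

With BatchNormalization last, the next Dense layer receives zero-centered inputs. With ReLU last, it receives non-negative inputs. Which works better in practice is an empirical question for a given task.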

* Early stopping

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

In [None]:
es = EarlyStopping(monitor='val_loss',          # metric to monitor for early stopping
                   min_delta=0,                 # threshold: only a change larger than this counts as an improvement
                   patience=3,                  # number of epochs to wait without improvement before stopping
                   verbose=1,
                   restore_best_weights=True)   # restore the weights from the best epoch
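The interaction of `patience` and `min_delta` can be sketched in plain Python. This is a simplified illustration of the stopping rule, not the Keras source. `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Improvement: remember this epoch and reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            # No improvement: stop once the counter reaches patience
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

toy_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(toy_losses, patience=3)
print(stop_epoch, best_epoch)
```

Here training stops at epoch index 5 and never reaches the better loss at index 6. With `restore_best_weights=True`, the weights from epoch index 2 would be the ones kept.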

* .fit( )

In [None]:
model1.fit(train_x, train_y, epochs=20, verbose=1,
           validation_split=0.2,  # hold out the last 20% of the training data as a validation set
           callbacks=[es]         # apply early stopping
           )
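Keras takes the `validation_split` samples from the end of the arrays, before any shuffling, and uses the same held-out samples in every epoch. A minimal NumPy sketch of that behavior, with a hypothetical helper `tail_validation_split` and toy arrays:

```python
import numpy as np

def tail_validation_split(x, y, validation_split=0.2):
    """Split off the last fraction of the samples as a validation set."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

toy_x = np.arange(10)
toy_y = np.arange(10) % 2

(fit_x, fit_y), (val_x, val_y) = tail_validation_split(toy_x, toy_y, 0.2)
print(fit_x)
print(val_x)
```

Since `train_test_split` already shuffled the data here, the tail is effectively a random subset. With data sorted by class, the same setting would produce a badly skewed validation set.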

* .evaluate( )

In [None]:
model1.evaluate(test_x, test_y, verbose=1)

* .predict( )

In [None]:
y_pred = model1.predict(test_x)

In [None]:
# Convert the one-hot vectors back into integer class indices
# Needed for the evaluation metrics and for inspecting individual samples

y_pred_arg = np.argmax(y_pred, axis=1)
test_y_arg = np.argmax(test_y, axis=1)

* Evaluation metrics

In [None]:
from sklearn.metrics import accuracy_score, classification_report

In [None]:
accuracy_score(test_y_arg, y_pred_arg)

In [None]:
print( classification_report(test_y_arg, y_pred_arg) )
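The accuracy and per-class report both derive from counts of (true class, predicted class) pairs. As an illustrative sketch on made-up data, those counts can be collected into a confusion matrix with NumPy alone. `confusion_matrix_np` and the toy probabilities are hypothetical.

```python
import numpy as np

def confusion_matrix_np(true_idx, pred_idx, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (true_idx, pred_idx), 1)
    return matrix

toy_probs = np.array([[0.8, 0.1, 0.1],
                      [0.2, 0.7, 0.1],
                      [0.3, 0.3, 0.4],
                      [0.6, 0.3, 0.1]])
toy_true = np.array([0, 1, 2, 1])
toy_pred = np.argmax(toy_probs, axis=1)   # most probable class per sample

cm = confusion_matrix_np(toy_true, toy_pred, 3)
accuracy = np.trace(cm) / cm.sum()        # correct predictions sit on the diagonal
print(cm)
print(accuracy)
```

Off-diagonal entries show which letters get mistaken for which, something a single accuracy number hides.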

* Inspecting actual samples

In [None]:
letters_str = "ABCDEFGHIJ"

rand_idx = np.random.randint(0, len(y_pred_arg))
test_idx = test_y_arg[rand_idx]
pred_idx = y_pred_arg[rand_idx]
class_prob = np.floor( y_pred[rand_idx]*100 )

print(f'idx = {rand_idx}')
print(f'The image at this index is {letters_str[test_idx]}')
print(f'Model prediction : {letters_str[pred_idx]}')
print('Per-class probabilities (%) : ')
print('-------------------')
for idx, val in enumerate(letters_str) :
    print(val, class_prob[idx])
print('=================================================')

if test_y_arg[rand_idx] == y_pred_arg[rand_idx] :
    print('Correct')
else :
    print('Wrong')

plt.imshow(test_x[rand_idx, :, :, 0], cmap='Greys')
plt.show()

* Inspecting only the misclassified images

In [None]:
# Indices of the test samples the model got wrong
false_idx = np.where(test_y_arg != y_pred_arg)[0]
false_len = len(false_idx)
false_len

In [None]:
letters_str = "ABCDEFGHIJ"

rand_idx = false_idx[np.random.randint(0, false_len)]
test_idx = test_y_arg[rand_idx]
pred_idx = y_pred_arg[rand_idx]
class_prob = np.floor( y_pred[rand_idx]*100 )

print(f'idx = {rand_idx}')
print(f'The image at this index is {letters_str[test_idx]}')
print(f'Model prediction : {letters_str[pred_idx]}')
print('Per-class probabilities (%) : ')
print('-------------------')
for idx, val in enumerate(letters_str) :
    print(val, class_prob[idx])
print('=================================================')

if test_y_arg[rand_idx] == y_pred_arg[rand_idx] :
    print('Correct')
else :
    print('Wrong')

plt.imshow(test_x[rand_idx, :, :, 0], cmap='Greys')
plt.show()
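Sampling one mistake at a random index works for browsing. A complementary approach, sketched here on made-up data, is to rank the mistakes by how confident the model was in its wrong answer. The arrays `toy_probs` and `toy_true` are hypothetical stand-ins for `y_pred` and `test_y_arg`.

```python
import numpy as np

toy_probs = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.60, 0.30],
    [0.55, 0.40, 0.05],
    [0.05, 0.95, 0.00],
])
toy_true = np.array([0, 2, 1, 0])
toy_pred = toy_probs.argmax(axis=1)

# Indices of the misclassified samples
wrong = np.where(toy_true != toy_pred)[0]

# Probability the model assigned to its (wrong) prediction
confidence = toy_probs[wrong, toy_pred[wrong]]

# Most confident mistakes first
ranked = wrong[np.argsort(-confidence)]
print(ranked)
print(confidence[np.argsort(-confidence)])
```

The samples at the top of this ranking are often the most informative ones to plot: unusual fonts, ambiguous glyphs, or possibly mislabeled images.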