## Session2 - playground2

### Whole training from the beginning / 初めからのトレーニング

Let's first run the whole network training process from the start.

まずは、最初からネットワークトレーニングを行います。

<div style="background-color: #FFDDBB; padding: 10px;">
<b>REMINDER</b>: If you get strange errors when executing code with neural networks, make sure that you have stopped or restarted the kernels in all other notebooks!
    
<b>注意</b>：ニューラルネットワークのコードを実行しているときに見知らぬエラーが発生した場合は、他のすべてのノートブックでカーネルを停止または再起動したかを確認してください！
</div>


#### Imports

In [None]:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# Import Tensorflow 
import tensorflow as tf

# Import Keras stuff
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout, LSTM
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.utils import to_categorical

#### Load MNIST and create the training and testing sets

In [None]:
from mnist_loader import MNISTVectorLoader
mnist_vector_loader = MNISTVectorLoader(43)
X, y = mnist_vector_loader.samples(70000)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test  = train_test_split(X, y, test_size=0.5)

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

#### Define the network
<a id='define_the_network'></a>

In [None]:
input_shape = X_train[0].shape
vector_input = Input(shape = input_shape, name='input')

fc1 = Dense(128, activation='relu', name='fc1')(vector_input)

fc2 = Dense(128, activation='relu', name='fc2')(fc1)

output = Dense(10, activation='softmax', name='output')(fc2)

network = Model(vector_input, output, name='classification')

network.summary()

# `learning_rate` replaces the deprecated `lr` argument of the optimizer
network.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate=0.01), metrics=['acc'])

#### Train the network

In [None]:
y_train_one_hot = to_categorical(y_train)
y_test_one_hot = to_categorical(y_test)

In [None]:
H = network.fit(X_train, y_train_one_hot,
                batch_size=100, epochs=20,
                validation_data=(X_test, y_test_one_hot),
                verbose=1)

## TASK 1
Plot the evolution of the losses and accuracies.

損失と精度の進化をプロットしてください。
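
One possible way to approach this task is sketched below. It assumes that the object returned by `fit` exposes a `history` dictionary with the keys `loss`, `val_loss`, `acc` and `val_acc` (the last two follow from `metrics=['acc']`). The helper name `plot_history` and the numbers in `demo_history` are placeholders for illustration only, not real training results.

```python
import matplotlib.pyplot as plt


def plot_history(history):
    """Plot loss and accuracy curves from a Keras-style history dictionary."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    # Left panel: training and validation loss
    ax_loss.plot(epochs, history["loss"], label="training")
    ax_loss.plot(epochs, history["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    # Right panel: training and validation accuracy
    ax_acc.plot(epochs, history["acc"], label="training")
    ax_acc.plot(epochs, history["val_acc"], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()

    fig.tight_layout()
    return fig


# Placeholder values for illustration, NOT real training results
demo_history = {
    "loss": [0.9, 0.5, 0.3],
    "val_loss": [0.8, 0.6, 0.5],
    "acc": [0.70, 0.85, 0.92],
    "val_acc": [0.72, 0.82, 0.84],
}
fig = plot_history(demo_history)
```

In the notebook you would call `plot_history(H.history)` after training instead of using the placeholder dictionary.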

In [None]:
## YOUR CODE HERE

## TASK 2
Re-run the training but this time:
- Stop the training when `val_acc` (the accuracy on the validation set) stops improving
- Save the model with the best accuracy as `my_best_network.hdf5`

トレーニングを再実行してください。今回は
 - テストセットの精度（`val_acc`）が向上しなくなったら、トレーニングを中断する
 - モデルを `my_best_network.hdf5`として最高の精度で保存します

Note that you have to recreate the network to re-initialize the weights if you want to re-train from scratch.
<br>
Otherwise the training just updates the current weights.

<br>So, you have to re-run the cells from [Define the network](#define_the_network) to create a new model with fresh weights.

注：最初からトレーニングをやり直す場合は、重みを再初期化する必要があります（そうしなければ、トレーニングは現在の重みを更新するだけです）。そのためネットワークを再作成しなければなりません。
<br>
 [Define the network](#define_the_network)からのセルを再実行して、新しい重みを持つ新しいモデルを作成してください。

Plot the evolution of the losses and accuracies again.

もう一度損失と精度の進化をプロットしてください。
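
The `EarlyStopping` and `ModelCheckpoint` callbacks imported at the top of the notebook are the natural tools for this task: both accept a `monitor` argument (here `'val_acc'`), `EarlyStopping` has a `patience` argument, and `ModelCheckpoint` has a `save_best_only` argument. To make the underlying rule concrete, here is a simplified pure-Python sketch of patience-based stopping with best-epoch tracking. It is not the Keras implementation; the function name `early_stopping_epoch` and the accuracy values are made up for illustration.

```python
def early_stopping_epoch(val_acc_per_epoch, patience=2):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the monitored value has failed to improve on its
    best value for `patience` consecutive epochs. `best_epoch` is the epoch
    whose weights a "save best only" checkpoint would have kept.
    """
    best_value = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, value in enumerate(val_acc_per_epoch):
        if value > best_value:
            # New best: remember it and reset the patience counter
            best_value = value
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    # The stopping rule never triggered: training ran for all epochs
    return len(val_acc_per_epoch) - 1, best_epoch


# Placeholder validation accuracies, NOT real training results
demo_val_acc = [0.90, 0.94, 0.95, 0.949, 0.948, 0.96]
stop_epoch, best_epoch = early_stopping_epoch(demo_val_acc, patience=2)
print(stop_epoch, best_epoch)
```

With these placeholder values the last entry is never reached, because the rule stops training two epochs after the best one. A larger `patience` trades training time for a chance to catch such late improvements.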

In [None]:
## YOUR CODE HERE

## TASK 3 - Recognize your own handwriting / 自分の手書き数字の認識

In the tests above, the testing data also came from MNIST.
<br>
Let us now try to recognize some handwritten digits that are not from MNIST.

今までは、トレーニングにもテストにもMNISTのデータを使っていました。
<br>
今度は自分が書いた手書きの数字を認識してみましょう。

First load the network you want to test.

まず試してみたいネットワークをロードします。

In [None]:
network = load_model("my_best_network.hdf5")

Then, use the next cell to draw a digit and test the classifier:
- Draw in the middle of the dark area
- The prediction is updated at each stroke
- Use the clear button to erase

Try the classification multiple times to empirically check the accuracy of the classifier.

次に、次のセルを使って数字を描いて分類器をテストしてください。
 - 暗い領域の真ん中に描いて
 - 予測は各ストロークで更新されます
 - 消去するにはクリアボタンを使用してください

分類器の精度を経験的に確認するために、分類を複数回試してください。

In [None]:
import ipywidgets as widgets
import jupyter_drawing_pad as jd
    
jdp = jd.CustomBox()
draw_pad = jdp.drawing_pad
clear_btn = jdp.children[1].children[1]

out = widgets.Output(layout=widgets.Layout(width='400px'))

@out.capture() 
def w_CB(change):
    from scipy.signal import convolve2d
    from cv2 import resize, INTER_CUBIC, cvtColor, COLOR_RGB2GRAY

    data = change['new']
    if len(data[0]) > 2:
        # Get strokes information
        x = np.array(data[0])
        y = np.array(data[1])
        t = np.array(data[2])

        # assuming there is at least 200ms between each stroke;
        # np.diff(t)[i] > 200 means sample i ends a stroke, so the next stroke starts at i + 1
        line_breaks = np.where(np.diff(t) > 200)[0] + 1
        # adding end of array
        line_breaks = np.append(line_breaks, t.shape[0])
        
        # Plot to canvas
        from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
        fig = plt.figure()
        canvas = FigureCanvas(fig)
        ax = fig.gca()

        # plot all strokes (each slice now includes the last sample of its stroke)
        plt.plot(x[:line_breaks[0]], y[:line_breaks[0]], color='black', linewidth=4)
        for i in range(1, len(line_breaks)):
            plt.plot(x[line_breaks[i-1] : line_breaks[i]], y[line_breaks[i-1] : line_breaks[i]], color='black', linewidth=4)
        
        plt.xlim(0,460)
        plt.ylim(0,250)
        plt.axis("off")
        
        canvas.draw()       # draw the canvas, cache the renderer

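        # Convert the rendered canvas to an RGB array of shape (height, width, 3).
        # The shape is read from the RGBA buffer instead of being hard-coded,
        # because it depends on the figure size and DPI settings.
        image = np.asarray(canvas.buffer_rgba())[:, :, :3].copy()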
        
        # Cut the part containing the writing
        ind = np.where(image<255)      
        
        D0 = ind[0].max() - ind[0].min() 
        D1 = ind[1].max() - ind[1].min() 
        
        C0 = int(0.5 * (ind[0].max() + ind[0].min()))
        C1 = int(0.5 * (ind[1].max() + ind[1].min()))

        if D0 > D1:
            D = D0
        else:
            D = D1
        
        L = int(D / 2.0) + 20
        image = image[C0 - L : C0 + L ,  C1 - L : C1 + L, :] 
    
        # Convert to gray
        image = 255 - cvtColor(image, COLOR_RGB2GRAY)
        
        # Low pass filter and resize
        k = 12
        I = convolve2d(image, np.ones((k,k))/k**2.0, mode="same")      
                
        # Resize with opencv 
        I = resize(I, dsize=(28, 28), interpolation=INTER_CUBIC)
        
        # Normalize, boost the contrast, and clip to [0, 1]
        I = I / I.max()
        I = I * 3.0
        I = I.clip(0, 1)
        
        # Get a feature vector
        Xv =  I.reshape((1, 28*28)).astype(np.float64) 
        
        # Standardization
        Xv = scaler.transform(Xv)
        
        # Apply the classifier
        y_pred_one_hot = network.predict(Xv)
        y_prediction = np.argmax(y_pred_one_hot)
        v = np.max(y_pred_one_hot)    
        
        title = "Prediction: {} ({:.02f})".format(y_prediction, v)    
        
        # draw the converted image
        plt.clf()
        plt.imshow(I, aspect='equal', cmap = mpl.cm.binary, interpolation='none')
        plt.title(title)
        plt.axis("off")
              
        plt.show()

        # To erase after tracing
        #change['owner'].data = [[], [], []]
        
        # Schedule for clearing
        out.clear_output(wait=True)
    else:
        pass
        
draw_pad.observe(w_CB, names='data')

hb = widgets.HBox([draw_pad, clear_btn, out])
display(hb)
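
The stroke-splitting step inside the callback can be tried in isolation. The sketch below assumes, as the callback does, that the pad reports parallel lists of x coordinates, y coordinates and timestamps in milliseconds, and that a pause of more than 200 ms marks the start of a new stroke. It uses `np.split` rather than explicit slicing. The helper name `split_strokes` and the sample values are invented for illustration.

```python
import numpy as np


def split_strokes(x, y, t, gap_ms=200):
    """Split parallel x/y/t sequences into strokes separated by long pauses."""
    x, y, t = np.asarray(x), np.asarray(y), np.asarray(t)
    # np.diff(t)[i] is the pause between samples i and i + 1,
    # so a long pause at index i means a new stroke starts at i + 1
    stroke_starts = np.where(np.diff(t) > gap_ms)[0] + 1
    return list(zip(np.split(x, stroke_starts), np.split(y, stroke_starts)))


# Made-up samples: three points, a 370 ms pause, then two more points
demo_x = [10, 12, 14, 50, 52]
demo_y = [20, 22, 24, 60, 62]
demo_t = [0, 15, 30, 400, 415]

strokes = split_strokes(demo_x, demo_y, demo_t)
for stroke_x, stroke_y in strokes:
    print(stroke_x.tolist(), stroke_y.tolist())
```

Each element of `strokes` is an `(x, y)` pair of arrays that can be passed directly to `plt.plot`.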

## OPTIONAL 1
Until now, we have split the dataset into training and testing sets.
<br>
However, in many cases, the dataset is split between a __development set__ and a testing set.

For example, in a "competition", the competing teams only have access to the development set to build their classifier and the testing set is kept secret by the organizers.
<br>
After the development period is over, the organizers use the testing set to compute the accuracy of the classifiers built by the participants to decide the winner.

これまでは、データセットをテストセットとトレーニングセットに分割しました。
<br>
ただし、多くの場合、データセットは__開発用のセット__とテストセットに分割されています。

> Even if you have access to the full dataset, it is standard practice to create a test set at the beginning, put it aside, and not touch it during development. You only use it at the very end, after you have chosen and tuned your classifier, to get the final evaluation of its performance.

Since we do not have access to the testing set, the development set is usually split in two sets:
- a training set to train the classifier
- a __validation set__ to check whether the training is going well (validate the training).

In particular, the validation set is used to stop the training before overfitting occurs (early stopping).

The `fit` function provides a way to automatically create a validation set:
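- the parameter `validation_split` sets the fraction of the training data to be held out for validation; Keras takes these samples from the end of the arrays, before any shuffling, so the validation set stays the same for all epochs
- if the parameter `shuffle` is `True`, the remaining training samples are reshuffled before each epoch

The next command applies `fit` with the last 50% of the training data held out for validation and the training samples reshuffled at each epoch: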

```python
H = network.fit(X_train, y_train_one_hot,
                batch_size=100, epochs=10,
                validation_split=0.5, shuffle=True,
                callbacks=[early_stopping_cb, model_checkpoint_cb],
                verbose=1)
```
                    
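This call to `fit` also passes the callbacks `early_stopping_cb` and `model_checkpoint_cb`, i.e. the `EarlyStopping` and `ModelCheckpoint` instances that you need to create beforehand (as in TASK 2).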
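
Regarding `validation_split`: the Keras documentation states that the validation samples are taken from the end of the arrays passed to `fit`, before any shuffling. The NumPy sketch below mimics that behaviour on toy data so that the effect of different fractions is visible. The helper name `split_like_validation_split` is hypothetical, and the rounding used to compute the split point is an assumption rather than the exact Keras rule.

```python
import numpy as np


def split_like_validation_split(X, y, validation_split):
    """Hold out the LAST fraction of the samples for validation (no shuffling)."""
    n_val = int(round(len(X) * validation_split))  # rounding is an assumption
    n_train = len(X) - n_val
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


# Toy data: 100 samples with 2 features each, labels 0..99
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.arange(100)

sizes = {}
for fraction in (0.05, 0.5, 0.95):
    (X_tr, y_tr), (X_val, y_val) = split_like_validation_split(X_demo, y_demo, fraction)
    sizes[fraction] = (len(X_tr), len(X_val))
    print(fraction, len(X_tr), len(X_val))
```

Because the held-out part is always the tail of the arrays, data that is sorted (for example by class) should be shuffled once before calling `fit` with `validation_split`.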

Train 3 networks using:
- 5% of the training set for validation; save the best network as `best_network_05.hdf5`
- 50% of the training set for validation; save the best network as `best_network_50.hdf5` 
- 95% of the training set for validation; save the best network as `best_network_95.hdf5` 

Check the confusion matrix on the testing set for each of these three networks.
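
A confusion matrix counts, for each true class (row), how often each class was predicted (column). The NumPy sketch below shows the computation on made-up labels; `sklearn.metrics.confusion_matrix` produces the same table. In the notebook, the predicted labels would presumably come from `np.argmax(network.predict(X_test), axis=1)`, which is an assumption about how you choose to evaluate the loaded networks. The helper name `confusion_matrix_np` is hypothetical.

```python
import numpy as np


def confusion_matrix_np(y_true, y_pred, n_classes):
    """Return an (n_classes x n_classes) matrix: rows = true, columns = predicted."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair occurs several times
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


# Made-up labels for a 3-class problem, NOT real predictions
demo_true = [0, 0, 1, 1, 2, 2, 2]
demo_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix_np(demo_true, demo_pred, n_classes=3)
print(cm)

# The diagonal holds the correctly classified samples
accuracy = np.trace(cm) / cm.sum()
print(accuracy)
```

Off-diagonal entries show which digits are confused with which, something a single accuracy number cannot tell you.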