In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### 1. Import the deep learning packages

In [2]:
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.datasets import imdb

### 2. Load the data

In [3]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)

In [4]:
len(x_train)

25000

In [5]:
len(x_test)

25000

In [6]:
len(x_train[0])

218

In [7]:
len(x_train[134])

55

In [8]:
y_train[0]

1

In [9]:
y_train[134]

0

### 3. Data preprocessing

In [10]:
x_train = sequence.pad_sequences(x_train, maxlen=100)
x_test = sequence.pad_sequences(x_test, maxlen=100)
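
The reviews have different lengths (218 and 55 tokens in the examples above), so `pad_sequences` brings every review to exactly 100 tokens. With its default settings it pads with zeros at the front and, for longer reviews, keeps only the last `maxlen` tokens. The snippet below is a small NumPy sketch of that behaviour, not the Keras implementation; `pad_pre` and the `toy` sequences are hypothetical names and values used only for illustration.

```python
import numpy as np

def pad_pre(seqs, maxlen, value=0):
    """NumPy sketch of pre-padding / pre-truncating to a fixed length."""
    out = np.full((len(seqs), maxlen), value, dtype="int32")
    for i, seq in enumerate(seqs):
        tail = list(seq)[-maxlen:]      # keep only the last `maxlen` tokens
        if tail:
            out[i, -len(tail):] = tail  # right-align, so padding sits in front
    return out

toy = [[1, 2, 3], [4, 5, 6, 7, 8, 9]]   # made-up token ids, not IMDB data
padded = pad_pre(toy, maxlen=4)
print(padded)
```

The short sequence gains one leading zero and the long one loses its first two tokens, so both rows end up with length 4.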

### 4. Build the function learning machine (model 1)

In [32]:
model1 = Sequential()

In [33]:
model1.add(Embedding(10000,256))

In [34]:
model1.add(LSTM(256,dropout=0.2,recurrent_dropout=0.2))

In [35]:
model1.add(Dense(1,activation='sigmoid'))

In [36]:
model1.compile(loss='binary_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])

In [37]:
model1.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, None, 256)         2560000   
_________________________________________________________________
lstm_2 (LSTM)                (None, 256)               525312    
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 257       
Total params: 3,085,569
Trainable params: 3,085,569
Non-trainable params: 0
_________________________________________________________________
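
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four weight blocks (three gates plus the cell candidate), each with input weights, recurrent weights and a bias; the variable names are only illustrative.

```python
vocab_size = 10000   # num_words passed to imdb.load_data
embed_dim = 256      # Embedding output dimension
units = 256          # LSTM units

# One 256-dimensional vector per vocabulary entry
embedding_params = vocab_size * embed_dim
# Four blocks, each with input weights, recurrent weights and a bias
lstm_params = 4 * ((embed_dim + units) * units + units)
# One output neuron: 256 weights plus one bias
dense_params = units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```

These values match the `Param #` column above: 2,560,000, 525,312 and 257, for a total of 3,085,569.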


### 5. Train model 1

In [38]:
model1.fit(x_train, y_train, batch_size=50, epochs=10,
           validation_data=(x_test, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1bf25f25080>

In [39]:
model1_json = model1.to_json()
# Use a context manager so the file is closed after writing
with open('imdb_model_architecture.json', 'w') as f:
    f.write(model1_json)
model1.save_weights('imdb_model_weights.h5')

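Parameter settings for model1:

Neurons (LSTM units): 256

dropout=0.2

recurrent_dropout=0.2

activation='sigmoid'

loss='binary_crossentropy'

optimizer='adam'

Training used batch_size=50, epochs=10

By the tenth epoch, accuracy on the training data reached 0.9875.
Validation accuracy, however, peaked at the fourth epoch with a value of 0.8412, the highest of the ten epochs.

This gap between training and validation accuracy suggests overfitting, so I want to adjust the dropout and recurrent_dropout rates to see whether validation accuracy can be raised.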

### 6. Build and train model 2

In [11]:
model2 = Sequential()

In [12]:
model2.add(Embedding(10000,256))

In [13]:
model2.add(LSTM(256,dropout=0.3,recurrent_dropout=0.3))

In [14]:
model2.add(Dense(1,activation='sigmoid'))

In [15]:
model2.compile(loss='binary_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])

In [16]:
model2.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 256)         2560000   
_________________________________________________________________
lstm (LSTM)                  (None, 256)               525312    
_________________________________________________________________
dense (Dense)                (None, 1)                 257       
Total params: 3,085,569
Trainable params: 3,085,569
Non-trainable params: 0
_________________________________________________________________


In [17]:
model2.fit(x_train, y_train, batch_size=50, epochs=10,
           validation_data=(x_test, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1f50047d3c8>

In [18]:
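model2_json = model2.to_json()
# Note: these are the same filenames used when saving model1,
# so any files saved earlier under these names are overwritten
with open('imdb_model_architecture.json', 'w') as f:
    f.write(model2_json)
model2.save_weights('imdb_model_weights.h5')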

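Comparing the two models above, raising dropout and recurrent_dropout to 0.3 did not help improve accuracy; moreover, the model performed better with dropout and recurrent_dropout set to 0.2. I will therefore fix both at 0.2 and try to raise test accuracy by tuning other parameters, starting with batch_size.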
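
One thing that changes with batch_size is how many weight updates happen in each epoch, since there is one update per batch. The sketch below computes this for the batch sizes compared in this notebook (50, 25 and 10); `steps_per_epoch` is a hypothetical helper, not a Keras function.

```python
import math

n_train = 25000  # len(x_train)

def steps_per_epoch(n_samples, batch_size):
    """Number of batches (and therefore weight updates) in one epoch."""
    return math.ceil(n_samples / batch_size)

steps = {bs: steps_per_epoch(n_train, bs) for bs in (50, 25, 10)}
print(steps)  # {50: 500, 25: 1000, 10: 2500}
```

A smaller batch size therefore means more, noisier updates per epoch and a longer wall-clock time for each epoch.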

### 7. Build and train model 3

In [11]:
model3 = Sequential()

In [12]:
model3.add(Embedding(10000,256))

In [13]:
model3.add(LSTM(256,dropout=0.2,recurrent_dropout=0.2))

In [14]:
model3.add(Dense(1,activation='sigmoid'))

In [15]:
model3.compile(loss='binary_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])

In [16]:
model3.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 256)         2560000   
_________________________________________________________________
lstm (LSTM)                  (None, 256)               525312    
_________________________________________________________________
dense (Dense)                (None, 1)                 257       
Total params: 3,085,569
Trainable params: 3,085,569
Non-trainable params: 0
_________________________________________________________________


In [17]:
model3.fit(x_train, y_train, batch_size=25, epochs=10,
           validation_data=(x_test, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x265c7c8a940>

### 8. Build and train model 4

In [11]:
model4 = Sequential()

In [12]:
model4.add(Embedding(10000,256))

In [13]:
model4.add(LSTM(256,dropout=0.2,recurrent_dropout=0.2))

In [14]:
model4.add(Dense(1,activation='sigmoid'))

In [15]:
model4.compile(loss='binary_crossentropy',
               optimizer='adam',
               metrics=['accuracy'])

In [16]:
model4.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 256)         2560000   
_________________________________________________________________
lstm (LSTM)                  (None, 256)               525312    
_________________________________________________________________
dense (Dense)                (None, 1)                 257       
Total params: 3,085,569
Trainable params: 3,085,569
Non-trainable params: 0
_________________________________________________________________


In [17]:
model4.fit(x_train, y_train, batch_size=10, epochs=10,
           validation_data=(x_test, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x1410054d2e8>

### 9. Model comparison and conclusions

* Settings shared by all models:

    * Neurons (LSTM units): 256

    * activation='sigmoid'

    * loss='binary_crossentropy'

    * optimizer='adam'

    * Training epochs=10

* Other settings for model1:

    * dropout=0.2

    * recurrent_dropout=0.2

    * Training batch_size=50

* Other settings for model2:

    * dropout=0.3

    * recurrent_dropout=0.3

    * Training batch_size=50

* Other settings for model3:

    * dropout=0.2

    * recurrent_dropout=0.2

    * Training batch_size=25

* Other settings for model4:

    * dropout=0.2

    * recurrent_dropout=0.2

    * Training batch_size=10

* Conclusions:

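model1 and model2 differ only in dropout and recurrent_dropout, which are 0.2 and 0.3 respectively.

In terms of test accuracy, model1 peaks at 0.8412 and model2 at 0.8421; for both, test accuracy declines after the fifth or sixth epoch.

I therefore think that training for too many epochs hurts on this dataset, and that the difference between dropout and recurrent_dropout values of 0.2 and 0.3 is small.

So I next tried adjusting batch_size to see whether test accuracy could be improved.

model1, model3 and model4 share all other settings; their batch_size values are 50, 25 and 10 respectively.

In terms of test accuracy, model1 peaks at 0.8412, model3 at 0.8423 and model4 at 0.8524, reached at epochs 4, 2 and 3 respectively.

For almost all of these models, validation accuracy declines after the fifth epoch; only model3 drops, rises again, and then drops once more.

These three models show that test accuracy improves slightly as batch_size decreases.

However, I think the number of epochs could be reduced to six or seven: although training accuracy later reaches 98% to 99%, test accuracy keeps falling.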
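
Rather than fixing the number of epochs by hand, training can stop automatically once validation accuracy stops improving. The sketch below shows the bookkeeping behind this idea in plain Python. `stop_epoch`, `patience` and the `curve` values are hypothetical; the accuracies are made up for illustration and are not taken from the runs above.

```python
def stop_epoch(val_acc, patience=2):
    """Return (best_epoch, stopped_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to beat
    the best validation accuracy seen so far.
    """
    best, best_epoch, wait = float("-inf"), 0, 0
    for epoch, acc in enumerate(val_acc, start=1):
        if acc > best:
            best, best_epoch, wait = acc, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_acc)

# Made-up validation accuracies for illustration only
curve = [0.80, 0.83, 0.84, 0.85, 0.84, 0.83, 0.83, 0.82, 0.82, 0.81]
result = stop_epoch(curve, patience=2)
print(result)  # (4, 6)
```

Keras offers the same idea as the `tensorflow.keras.callbacks.EarlyStopping` callback, which can be passed to `fit` through its `callbacks` argument. The name of the monitored metric (for example `val_accuracy`) depends on the Keras version, so it is worth checking against the keys of the training history.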

