**余盈蓓; Linguistics MA, 2nd year; 110555009**\
**111-2 Computational Linguistics**
<p align="center" style="font-size:18pt">
<b>7th Assignment<br>
Deep Learning</b>
</p>

<p style = "font-size:14pt">
<b>Step 1. Loading the data</b>
</p>

In [10]:
from tensorflow.keras.preprocessing import text_dataset_from_directory
from tensorflow.strings import regex_replace

# Preprocess the dataset: replace every <br /> tag with a space.

def prepareData(directory):
  data = text_dataset_from_directory(directory)  # load the labeled text dataset
  return data.map(
    lambda text, label: (regex_replace(text, '<br />', ' '), label),
  )


# load in dataset
train_data = prepareData("movie-reviews-dataset/train")
test_data = prepareData("movie-reviews-dataset/test")

Found 25000 files belonging to 2 classes.
Found 25000 files belonging to 2 classes.
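
The loading step has two effects that can be mirrored in plain Python. The sketch below is not the Keras implementation. It assumes the class folders are named `neg` and `pos` (the folder names are not shown above), and `infer_labels` and `clean_review` are hypothetical helpers. `text_dataset_from_directory` assigns integer labels to class subdirectories in alphanumeric order, which would give negative = 0 and positive = 1 under that assumption.

```python
import re

def infer_labels(class_dirs):
    """Map class folder names to integer labels in sorted order."""
    return {name: index for index, name in enumerate(sorted(class_dirs))}

def clean_review(text):
    """Replace every literal '<br />' tag with a single space."""
    return re.sub(r"<br />", " ", text)

labels = infer_labels(["pos", "neg"])  # folder names are an assumption
sample = clean_review("Great cast.<br /><br />Weak plot.")
print(labels)
print(sample)
```

Each tag becomes its own space, so two consecutive tags leave two spaces; the whitespace tokenizer used in the next step treats them as one separator.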


In [12]:
for text_batch, label_batch in train_data.take(2):
  print(text_batch.numpy()[0])
  print(label_batch.numpy()[0]) # 0 = negative, 1 = positive

b"This is the movie that I use to judge all other bad movies, and so far there hasn't been anything close.  The only good thing I can say is that after watching this I know that I have seen the worst movie I will ever see."
0
b'The real star of this ridiculous story is glorious technicolor. A visual treat to the eye, the film fails to stimulate the mind and heart. I was intrigued, at first, by the idea of Dietrich and Boyer leaving religion in order to "find" their capacity for love. What follows is a huge disappointment. Boyer is the only real actor in the production and one feels his torment. Dietrich\'s amazing wardrobe outshines her performance -- at times her face is frightening to look at -- a unfeeling mask. As a monk, Boyer held the formula for the monastery\'s liquer (which reminds me of the true story of Chartreuse) -- when he leaves his "marriage to god" the reaction by his fellow monks holds the shock and fear that perpetuate organized religion. The viewer feels Boyer was w

<p style = "font-size:14pt">
<b>Step 2. Vectorizing the text</b>
</p>

In [15]:

from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

max_tokens = 1000
max_len = 50
vectorize_layer = TextVectorization(
  max_tokens=max_tokens,
  output_mode="int",
  output_sequence_length=max_len,
)

train_texts = train_data.map(lambda text, label: text) 
vectorize_layer.adapt(train_texts)
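
What the layer does in `"int"` mode can be approximated in a few lines of plain Python. This is a simplified sketch, not the Keras code: `standardize`, `build_vocab` and `vectorize` are hypothetical helpers, punctuation stripping uses `string.punctuation` as an approximation of the default rule, and ties between equally frequent words may be ordered differently in Keras. Index 0 is reserved for padding and index 1 for out-of-vocabulary words, so `max_tokens = 1000` leaves room for 998 actual words, and `max_len = 50` keeps only the first 50 tokens of each review.

```python
import string
from collections import Counter

def standardize(text):
    """Lowercase and strip ASCII punctuation (approximates the Keras default)."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def build_vocab(texts, max_tokens):
    """Index 0 is padding, index 1 is out-of-vocabulary; the rest go by frequency."""
    counts = Counter(tok for t in texts for tok in standardize(t).split())
    most_common = [tok for tok, _ in counts.most_common(max_tokens - 2)]
    return {tok: i + 2 for i, tok in enumerate(most_common)}

def vectorize(text, vocab, max_len):
    """Map tokens to ids, truncate to max_len, then right-pad with 0."""
    ids = [vocab.get(tok, 1) for tok in standardize(text).split()][:max_len]
    return ids + [0] * (max_len - len(ids))

corpus = ["A bad, bad movie.", "A good movie!"]  # toy corpus, not the review data
vocab = build_vocab(corpus, max_tokens=5)        # room for only 3 real words
vector = vectorize("A good movie, really", vocab, max_len=6)
print(vocab)
print(vector)
```

In the toy run, `good` and `really` fall outside the three-word vocabulary and both map to id 1, which is what happens to every word outside the top 998 in the actual setup.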

<p style = "font-size:14pt">
<b>Step 3. Model training</b>
</p>
<p style = "font-size:12pt">
✨ <b>GRU</b><br>
&emsp;&ensp;→ Initialize the model
</p>

In [29]:
from tensorflow.keras.models import Sequential
from tensorflow.keras import Input

model = Sequential(name="GRU-model")
model.add(Input(shape=(1,), dtype="string"))
model.add(vectorize_layer)

<p style = "font-size:12pt">
&emsp;&ensp;→ Add the embedding layer
</p>

In [30]:
from tensorflow.keras.layers import Embedding

model.add(Embedding(max_tokens + 1, 128))

<p style = "font-size:12pt">
&emsp;&ensp;→ Add the GRU layer
</p>

In [31]:
from tensorflow.keras.layers import GRU

model.add(GRU(64))

<p style = "font-size:12pt">
&emsp;&ensp;→ Add one hidden layer and the final output layer
</p>

In [32]:
from tensorflow.keras.layers import Dense

model.add(Dense(64, activation="relu"))
model.add(Dense(1, activation="sigmoid"))

<p style = "font-size:12pt">
&emsp;&ensp;→ Compile the model
</p>

In [33]:
model.compile(
  optimizer='adam',
  loss='binary_crossentropy',
  metrics=['accuracy'],
)

In [34]:
model.summary()  # summary() prints the table itself and returns None

Model: "GRU-model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 text_vectorization_1 (TextV  (None, 50)               0         
 ectorization)                                                   
                                                                 
 embedding_2 (Embedding)     (None, 50, 128)           128128    
                                                                 
 gru_2 (GRU)                 (None, 64)                37248     
                                                                 
 dense_4 (Dense)             (None, 64)                4160      
                                                                 
 dense_5 (Dense)             (None, 1)                 65        
                                                                 
Total params: 169,601
Trainable params: 169,601
Non-trainable params: 0
___________________________________________________
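
The parameter counts in this summary can be reproduced by hand. The sketch below uses hypothetical helper functions and assumes the TensorFlow 2 default GRU variant (`reset_after=True`), which carries two bias vectors per gate; with that assumption the numbers match the table above.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim

def gru_params(input_dim, units):
    # Three weight blocks (update gate, reset gate, candidate state); each has
    # an input kernel, a recurrent kernel and two bias vectors.
    return 3 * (units * (input_dim + units) + 2 * units)

def dense_params(input_dim, units):
    return input_dim * units + units

counts = [
    embedding_params(1000 + 1, 128),  # Embedding(max_tokens + 1, 128)
    gru_params(128, 64),              # GRU(64)
    dense_params(64, 64),             # Dense(64)
    dense_params(64, 1),              # Dense(1)
]
total = sum(counts)
print(counts, total)
```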

<p style = "font-size:12pt">
&emsp;&ensp;→ Train the model on the training set
</p>

In [35]:
model.fit(train_data, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1fad5a0c820>
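
The call above runs a fixed 50 epochs with no validation data, so nothing signals when the model starts to memorize the training set. A common remedy is patience-based early stopping on a held-out validation loss, which Keras provides as the `EarlyStopping` callback. The sketch below shows only the stopping rule in plain Python. The `val_losses` values are invented for illustration and are not results from this notebook, and `early_stop_epoch` is a hypothetical helper.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch

val_losses = [0.62, 0.55, 0.53, 0.56, 0.61, 0.70, 0.85]  # made-up numbers
stop_epoch, best_epoch = early_stop_epoch(val_losses, patience=3)
print(stop_epoch, best_epoch)
```

With a rule like this, training would end a few epochs after the validation loss bottoms out, and the weights from the best epoch could be kept instead of those from the last one.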

<p style = "font-size:12pt">
✨ <b>bi-LSTM</b><br>
&emsp;&ensp;→ Define the model
</p>

In [90]:
from tensorflow.keras.layers import LSTM, Bidirectional

bilstm_model = Sequential([
    Input(shape=(1,), dtype="string"),
    vectorize_layer,
    Embedding(max_tokens + 1, 128),
    Bidirectional(LSTM(64)),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])
bilstm_model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
bilstm_model.summary()  # summary() prints the table itself and returns None

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 text_vectorization_1 (TextV  (None, 50)               0         
 ectorization)                                                   
                                                                 
 embedding_4 (Embedding)     (None, 50, 128)           128128    
                                                                 
 bidirectional_1 (Bidirectio  (None, 128)              98816     
 nal)                                                            
                                                                 
 dense_6 (Dense)             (None, 64)                8256      
                                                                 
 dense_7 (Dense)             (None, 1)                 65        
                                                                 
Total params: 235,265
Trainable params: 235,265
Non-tr
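
The `(None, 128)` output of the bidirectional layer follows from how `Bidirectional` works with its default `merge_mode="concat"`: one copy of the recurrent layer reads the sequence left to right, a second copy reads it right to left, and their final states are concatenated (2 × 64 = 128). The sketch below illustrates this with a toy tanh recurrence rather than an LSTM. It uses scalar inputs and one weight per hidden unit to stay short, and all weights and helper names are made up.

```python
import math

def rnn_final_state(sequence, w_x, w_h, bias):
    """Run a toy scalar-input RNN and return its final hidden state.

    One weight per hidden unit keeps the sketch short; a real layer
    uses full weight matrices.
    """
    hidden = [0.0] * len(w_x)
    for x in sequence:
        hidden = [
            math.tanh(wx * x + wh * h + b)
            for wx, wh, h, b in zip(w_x, w_h, hidden, bias)
        ]
    return hidden

def bidirectional_final_state(sequence, forward, backward):
    """Concatenate the forward pass with a pass over the reversed sequence."""
    return (rnn_final_state(sequence, *forward)
            + rnn_final_state(sequence[::-1], *backward))

# Two independent sets of illustrative weights, 3 hidden units each.
forward = ([0.5, -0.2, 0.1], [0.3, 0.3, 0.3], [0.0, 0.0, 0.0])
backward = ([0.4, 0.1, -0.3], [0.2, 0.2, 0.2], [0.0, 0.0, 0.0])
state = bidirectional_final_state([1.0, 2.0, 3.0], forward, backward)
print(len(state))
```

The two directions have separate weights, which is also why the bidirectional layer has twice the parameters of a single recurrent layer of the same size.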

<p style = "font-size:12pt">
&emsp;&ensp;→ Train the model on the training set
</p>

In [91]:
bilstm_model.fit(train_data, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1faeac7a640>

<p style = "font-size:12pt">
✨ <b>bi-RNN</b><br>
&emsp;&ensp;→ Define the model
</p>

In [92]:
from tensorflow.keras.layers import SimpleRNN

birnn_model = Sequential([
    Input(shape=(1,), dtype="string"),
    vectorize_layer,
    Embedding(max_tokens + 1, 128),
    Bidirectional(SimpleRNN(64)),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid')
])
birnn_model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
birnn_model.summary()  # summary() prints the table itself and returns None

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 text_vectorization_1 (TextV  (None, 50)               0         
 ectorization)                                                   
                                                                 
 embedding_5 (Embedding)     (None, 50, 128)           128128    
                                                                 
 bidirectional_2 (Bidirectio  (None, 128)              24704     
 nal)                                                            
                                                                 
 dense_8 (Dense)             (None, 64)                8256      
                                                                 
 dense_9 (Dense)             (None, 1)                 65        
                                                                 
Total params: 161,153
Trainable params: 161,153
Non-tr
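
The totals of the two bidirectional models can be checked with the standard parameter formulas. The helpers below are hypothetical; the layer sizes are taken from the model definitions in this notebook. An LSTM has four weight blocks per direction and a SimpleRNN has one, which accounts for the difference between 98,816 and 24,704 parameters in the bidirectional layers.

```python
def lstm_params(input_dim, units):
    # Four weight blocks: input, forget and output gates plus the cell candidate.
    return 4 * (units * (input_dim + units) + units)

def simple_rnn_params(input_dim, units):
    # A single weight block: input kernel, recurrent kernel and bias.
    return units * (input_dim + units) + units

def dense_params(input_dim, units):
    return input_dim * units + units

embedding = (1000 + 1) * 128                            # Embedding(max_tokens + 1, 128)
head = dense_params(2 * 64, 64) + dense_params(64, 1)   # Dense(64) + Dense(1)

bilstm_layer = 2 * lstm_params(128, 64)        # forward + backward LSTM
birnn_layer = 2 * simple_rnn_params(128, 64)   # forward + backward SimpleRNN
bilstm_total = embedding + bilstm_layer + head
birnn_total = embedding + birnn_layer + head
print(bilstm_layer, bilstm_total)
print(birnn_layer, birnn_total)
```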

<p style = "font-size:12pt">
&emsp;&ensp;→ Train the model on the training set
</p>

In [93]:
birnn_model.fit(train_data, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x1faeae73a30>

<p style = "font-size:14pt">
<b>Step 4. Saving the model weights</b>
</p>

In [95]:
model.save_weights('GRU')
bilstm_model.save_weights('bi-LSTM')
birnn_model.save_weights('bi-RNN')

<p style = "font-size:14pt">
<b>Step 5. Model evaluation</b>
</p>
<p style = "font-size:12pt">
✨ <b>GRU</b><br>
</p>

In [107]:
GRU_results = model.evaluate(test_data)
print("test loss, test acc:", GRU_results)

test loss, test acc: [2.827223539352417, 0.6917999982833862]


<p style = "font-size:12pt">
✨ <b>bi-LSTM</b><br>
</p>

In [108]:
LSTM_results = bilstm_model.evaluate(test_data)
print("test loss, test acc:", LSTM_results)

test loss, test acc: [2.9129574298858643, 0.6949599981307983]


<p style = "font-size:12pt">
✨ <b>bi-RNN</b><br>
</p>

In [109]:
RNN_results = birnn_model.evaluate(test_data)
print("test loss, test acc:", RNN_results)

test loss, test acc: [2.0231945514678955, 0.6616799831390381]


<p align="center" style="font-size:16pt">
    🧐<b>Discussion</b>🧐<br>
</p>
<p style = "font-size:12pt">
✨ Score summary<br>
</p>
<table>
  <tr>
    <th rowspan="2">&emsp;&emsp;</th>
    <th colspan="2" style="background-color: #F1E1FF">training</th>
    <th colspan="2" style="background-color: #F1E1FF">testing</th>
  </tr>
  <tr style="background-color: #FFECF5">
    <th>loss</th>
    <th>accuracy</th>
    <th>loss</th>
    <th>accuracy</th>
  </tr>
  <tr>
    <th style="background-color: #D2E9FF">GRU</th>
    <td>0.0297</td>
    <td>0.9907</td>
    <td>2.8272</td>
    <td>0.6918</td>
  </tr>
  <tr>
    <th style="background-color: #D2E9FF">bi-RNN</th>
    <td>0.0539</td>
    <td>0.9812</td>
    <td>2.0232</td>
    <td>0.6617</td>
  </tr>
  <tr>
    <th style="background-color: #D2E9FF">bi-LSTM</th>
    <td>0.0158</td>
    <td>0.9945</td>
    <td>2.9130</td>
    <td>0.6950</td>
  </tr>
</table>

<p style = "font-size:12pt">
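   &emsp;&emsp;First, the training process already shows that, with identical hyperparameter settings, bi-LSTM performs best. On the test set bi-LSTM also has the highest accuracy; the one difference is that its loss is the highest as well. That bi-LSTM comes out on top was to be expected: it mitigates the RNN's tendency to dilute the context from the early part of a long sentence, and compared with the GRU it reads the sequence in one more direction and so takes more context into account. What is more surprising is that the gaps between the models are not large. It is unclear whether this is because these models differ from one another only in small fixes to specific weaknesses, or because of the data.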
</p>
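
How a model can have both the highest accuracy and the highest loss follows from the definition of binary cross-entropy, $-\left[y \log p + (1-y)\log(1-p)\right]$: accuracy only checks which side of 0.5 a prediction falls on, while the loss grows without bound as a wrong prediction becomes more confident. The sketch below illustrates this with made-up probabilities and hypothetical helper names. It is one possible reading of the table, not something verified against this notebook's predictions.

```python
import math

def binary_cross_entropy(y_true, y_prob):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    eps = 1e-7
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = sum((p > threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

y_true = [1, 1, 0, 0]
cautious = [0.6, 0.6, 0.4, 0.6]            # one mild mistake (last item)
overconfident = [0.99, 0.99, 0.01, 0.999]  # same mistake, made confidently

for name, probs in [("cautious", cautious), ("overconfident", overconfident)]:
    print(name, accuracy(y_true, probs), binary_cross_entropy(y_true, probs))
```

Both toy predictors get the same three out of four items right, yet the overconfident one has a much larger loss because of its single confident error. A model trained to about 0.99 training accuracy tends to output probabilities near 0 or 1, so its test errors are penalized heavily in the same way.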