# Sentiment Analysis with an RNN

In [1]:
%env KERAS_BACKEND=tensorflow
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import imdb     # IMDB movie review dataset
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM

env: KERAS_BACKEND=tensorflow


Using TensorFlow backend.


In [2]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=20000)
x_train = sequence.pad_sequences(x_train, maxlen=150)
x_test = sequence.pad_sequences(x_test, maxlen=150)
x_train.shape

(25000, 150)
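
The shape `(25000, 150)` comes from padding or truncating every review to 150 tokens. As a rough illustration of what `pad_sequences` does with its default settings (pad on the left with zeros, truncate from the left), here is a minimal NumPy sketch. The helper name `pad_pre` is hypothetical and is not part of Keras; it only mimics the default behavior on a toy input.

```python
import numpy as np

def pad_pre(seqs, maxlen, value=0):
    """Hypothetical helper: mimics the default 'pre' padding and truncation."""
    out = np.full((len(seqs), maxlen), value, dtype="int32")
    for i, s in enumerate(seqs):
        if len(s) == 0:
            continue                   # an empty sequence stays all padding
        trunc = s[-maxlen:]            # keep only the last maxlen tokens
        out[i, -len(trunc):] = trunc   # right-align, padding goes on the left
    return out

demo = pad_pre([[1, 2, 3], [4, 5, 6, 7, 8, 9]], maxlen=5)
print(demo)
# [[0 0 1 2 3]
#  [5 6 7 8 9]]
```

Short reviews therefore start with zeros, and long reviews keep only their final 150 tokens.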

## Building the neural network

In [33]:
N = 4 # each word is embedded into N dimensions
K = 64 # the LSTM has K units

In [34]:
model = Sequential()
model.add(Embedding(20000, N))
model.add(LSTM(K))
model.add(Dense(1, activation='sigmoid'))

## Compiling the neural network

In [35]:
model.compile(loss='binary_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])
model.summary()                                   # LSTM parameter count: (4*(N+K)+4)*K

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_6 (Embedding)      (None, None, 4)           80000     
_________________________________________________________________
lstm_6 (LSTM)                (None, 64)                17664     
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 65        
Total params: 97,729
Trainable params: 97,729
Non-trainable params: 0
_________________________________________________________________
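
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with an input weight matrix, a recurrent weight matrix, and a bias vector; the variable names are chosen here for illustration only.

```python
vocab_size, N, K = 20000, 4, 64

# Embedding: one N-dimensional vector per vocabulary entry.
embedding_params = vocab_size * N

# LSTM: 4 gates, each with (N + K) * K weights and K biases.
# This equals (4*(N+K)+4)*K, the formula in the comment above.
lstm_params = 4 * (K * (N + K) + K)

# Dense: K weights plus 1 bias for the single sigmoid output.
dense_params = K * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 80000 17664 65 97729
```

These values match the `Param #` column and the total reported by `model.summary()`.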


## Training the neural network

In [36]:
model.fit(x_train, y_train,
         batch_size=32,
         epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0xb2cec5cf8>

## Evaluating the results

In [39]:
score = model.evaluate(x_test, y_test)
print(f'Test loss = {score[0]}')
print(f'Test accuracy = {score[1]}')

Test loss = 0.4517346919059753
Test accuracy = 0.8488
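
To make the two numbers returned by `model.evaluate` concrete, here is a minimal NumPy sketch of binary cross-entropy and of accuracy with a 0.5 threshold on the sigmoid output. The function names and the four toy predictions are made up for illustration; they are not taken from this model's results.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of the true labels."""
    y_prob = np.clip(y_prob, eps, 1 - eps)   # avoid log(0)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    y_pred = (y_prob > threshold).astype(int)
    return float(np.mean(y_pred == y_true))

# Toy values, invented for this example only.
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.6])

toy_loss = binary_crossentropy(y_true, y_prob)
toy_acc = binary_accuracy(y_true, y_prob)
print(toy_loss, toy_acc)
```

The loss penalizes confident mistakes more heavily than the accuracy does, which is why the two metrics can move in different directions between runs.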


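After many runs, I found that lowering the embedding dimension `N` gives a lower final training accuracy but a higher test accuracy.
Raising the dimension markedly increases the final training accuracy but lowers the test accuracy, a pattern consistent with overfitting.
Even after tuning the parameters many times, the test accuracy only reaches about 85%.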

## Saving the model

In [38]:
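# Save the architecture as JSON; the with-block closes the file handle.
model_json = model.to_json()
with open('imdb_model_arch.json', 'w') as f:
    f.write(model_json)
# Save the trained weights separately.
model.save_weights('imdb_model_weights.h5')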