In [1]:
%env KERAS_BACKEND=tensorflow
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import imdb

env: KERAS_BACKEND=tensorflow


Using TensorFlow backend.


In [2]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words = 10000)

In [3]:
from keras.preprocessing import sequence

In [4]:
x_train1 = sequence.pad_sequences(x_train,maxlen=150)
x_test1 = sequence.pad_sequences(x_test,maxlen=150) 

### Reproducing the model from class

In [5]:
from keras.models import Sequential
from keras.layers import Dense , Embedding
from keras.layers import LSTM

In [7]:
N = 3  # embedding dimension for each word
K = 4  # number of units in the LSTM layer
model = Sequential()
model.add(Embedding(10000,N))
model.add(LSTM(K))
model.add(Dense(1,activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train1,y_train,batch_size =32,epochs = 5,validation_data = (x_test1, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1bf1126d4a8>

### Building my own model

In [9]:
N = 5  # embedding dimension for each word
K = 8  # number of units in the LSTM layer
model1 = Sequential()
model1.add(Embedding(10000,N))
model1.add(LSTM(K))
model1.add(Dense(1,activation='sigmoid'))
model1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In [10]:
model1.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_4 (Embedding)      (None, None, 5)           50000     
_________________________________________________________________
lstm_4 (LSTM)                (None, 8)                 448       
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 9         
Total params: 50,457
Trainable params: 50,457
Non-trainable params: 0
_________________________________________________________________


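**Checking that I really understand the LSTM:** <br> With 5-dimensional word embeddings and 8 LSTM cells, the LSTM layer has 448 parameters in total. This is because <br>
((**5** (embedding dimensions) + **8** (cell outputs fed back)) (inputs/neuron) \* **4** (neurons/cell) + 4 (biases)) \* **8** (LSTM cells), $(13 \times 4 + 4) \times 8 = 448$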

![](https://i.imgur.com/qc3548R.png)
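The count above can be checked with a few lines of plain Python. This is a minimal sketch assuming a standard LSTM layer with bias terms; the helper name `lstm_param_count` is made up for illustration and is not part of Keras.

```python
def lstm_param_count(input_dim, units):
    """Count trainable parameters of a standard LSTM layer with bias."""
    gates = 4  # input gate, forget gate, cell candidate, output gate
    inputs_per_gate = input_dim + units  # embedding inputs plus recurrent feedback
    biases_per_unit = gates  # one bias per gate
    return (inputs_per_gate * gates + biases_per_unit) * units


# 5-dimensional embeddings feeding 8 LSTM units, as in model1
print(lstm_param_count(5, 8))  # 448, matching model1.summary()
```

The same helper can be applied to the other settings in this notebook, for example `lstm_param_count(3, 4)` for the class model.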


In [11]:
model1.fit(x_train1,y_train,batch_size =32,epochs = 5,validation_data = (x_test1, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1bf172d64e0>

### The result above is higher than the class model, but the gain is under 1% and may just be noise, so I move on to the next improvement
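One rough way to judge whether a gap of under 1% could be noise is the binomial standard error of an accuracy measured on 25000 validation samples. The sketch below uses a hypothetical accuracy of 0.85 purely for illustration; it is not a result from this notebook. It also only covers sampling noise in the validation set; run-to-run variation from random initialization and shuffling would need repeated training runs to estimate.

```python
import math


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


# Hypothetical accuracy, used only for illustration (not taken from the runs above).
assumed_accuracy = 0.85
n_validation = 25000  # size of the validation set used in this notebook

se = accuracy_standard_error(assumed_accuracy, n_validation)
print(f"standard error of the accuracy: {se:.4f}")
```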

In [24]:
# 1. Reading more of each review may help the model understand what it is about
x_train2 = sequence.pad_sequences(x_train,maxlen=200)
x_test2 = sequence.pad_sequences(x_test,maxlen=200)
N = 50  # embedding dimension for each word
K = 50  # number of units in the LSTM layer

# 2. Adjust the RNN hyperparameters
model2 = Sequential()
model2.add(Embedding(10000,N))
model2.add(LSTM(K))
model2.add(Dense(1,activation='sigmoid'))
model2.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model2.fit(x_train2,y_train,batch_size =30,epochs = 5,validation_data = (x_test2, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1bf5353a550>

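### This time accuracy did improve by about 2%, but I think there is still room to improve. The LSTM hyperparameters have mostly been tuned already, so I want to try adding a CNN. The idea is to embed into a fairly large dimension (1000) first, use a CNN to extract features, as if pulling out the key features of the text, and only then feed the result into the LSTM.
Since changing the RNN hyperparameters earlier did not seem to help much, I go back to fewer LSTM cells here so that training does not take too long.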
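To see what the CNN stage does to the sequence before it reaches the LSTM, the sketch below works out the sequence length after each layer. It assumes a 1D convolution with `'same'` padding and stride 1, followed by max pooling with pool size 4, `'valid'` padding, and a stride equal to the pool size. The helper names are made up for illustration.

```python
def conv1d_same_length(length, stride=1):
    """Output length of a 1D convolution with 'same' padding."""
    return -(-length // stride)  # ceil(length / stride)


def maxpool1d_valid_length(length, pool_size, stride=None):
    """Output length of 1D max pooling with 'valid' padding."""
    stride = pool_size if stride is None else stride
    return (length - pool_size) // stride + 1


for maxlen in (150, 200):
    after_conv = conv1d_same_length(maxlen)
    after_pool = maxpool1d_valid_length(after_conv, pool_size=4)
    print(f"maxlen={maxlen}: conv -> {after_conv} steps, pool -> {after_pool} steps")
```

Under these assumptions the LSTM reads roughly a quarter as many time steps as the padded review has tokens, which also shortens training.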

In [31]:
from keras.layers import Conv1D, MaxPool1D  # 1D layers for text; 2D was only needed last time because images are two-dimensional
# Back to 150 tokens per review here (model2 used 200)
x_train3 = sequence.pad_sequences(x_train,maxlen=150)
x_test3 = sequence.pad_sequences(x_test,maxlen=150)

N = 1000  # embedding dimension for each word
K = 10  # number of units in the LSTM layer

model3 = Sequential()
model3.add(Embedding(10000,N))
model3.add(Conv1D(16,7,padding='same',activation='relu'))  # 16 filters, each of length 7 (the kernel is one-dimensional)
model3.add(MaxPool1D(pool_size=4))
model3.add(LSTM(K))
model3.add(Dense(1,activation='sigmoid'))
model3.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model3.fit(x_train3,y_train,batch_size =30,epochs = 5,validation_data = (x_test3, y_test))

Train on 25000 samples, validate on 25000 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1bf44257518>

### This looks like overfitting: training accuracy has already reached 0.99 while validation accuracy has not improved, so I add dropout
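As a reminder of what a dropout layer does during training, here is a minimal pure-Python sketch of inverted dropout: each value is zeroed with probability `rate`, and the surviving values are scaled by `1 / (1 - rate)` so the expected sum stays the same. The function and its arguments are illustrative only, not the Keras implementation.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)  # at inference time the values pass through unchanged
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, rate=0.5, rng=random.Random(0)))  # some zeros, the rest doubled
print(dropout(activations, rate=0.5, training=False))  # unchanged
```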

In [42]:
from keras.layers import Conv1D, MaxPool1D
from keras.layers import Dropout
import keras.callbacks
# 1. Reading more of each review may help the model understand what it is about
x_train4 = sequence.pad_sequences(x_train,maxlen=200)
x_test4 = sequence.pad_sequences(x_test,maxlen=200)

N = 1000  # embedding dimension for each word
K = 10  # number of units in the LSTM layer

model4 = Sequential()
model4.add(Embedding(10000,N))
model4.add(Dropout(0.5))
model4.add(Conv1D(16,7,padding='same',activation='relu'))
model4.add(MaxPool1D(pool_size=4))
model4.add(LSTM(K))
model4.add(Dense(1,activation='sigmoid'))
model4.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
EarlyStop = keras.callbacks.EarlyStopping(monitor='val_loss',patience=0,verbose=0,mode='auto')
model4.fit(x_train4,y_train,batch_size =30,epochs = 5,validation_data = (x_test4, y_test),callbacks=[EarlyStop])
# After repeated tests I found that validation accuracy sometimes drops as the epochs go on, so I use early stopping to decide when to end training

Train on 25000 samples, validate on 25000 samples
Epoch 1/5
Epoch 2/5


<keras.callbacks.History at 0x222043c3dd8>

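**Overall, performance on the validation data did improve, to roughly 88.2%. Training accuracy could actually climb further to about 99% (the previous model already reached that), but repeated experiments showed that validation accuracy drops as the number of epochs grows, so I decided to use early stopping.**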
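The early stopping idea can be sketched in a few lines. This is a simplified illustration, not Keras's actual implementation: here `patience` is the number of non-improving epochs that are tolerated, and the exact semantics may differ between Keras versions. The validation losses are hypothetical values chosen for the example.

```python
def epochs_run_with_early_stopping(val_losses, patience=0):
    """Return how many epochs run before training stops."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait > patience:
                return epoch  # stop after this epoch
    return len(val_losses)


# Hypothetical validation losses, one per epoch (not from the runs above).
hypothetical_val_losses = [0.35, 0.30, 0.33, 0.31, 0.40]
print(epochs_run_with_early_stopping(hypothetical_val_losses, patience=0))  # 3
print(epochs_run_with_early_stopping(hypothetical_val_losses, patience=1))  # 4
```

With `patience=0` training ends at the first epoch whose validation loss fails to improve, which trades a possibly better later epoch for protection against the drop in validation accuracy described above.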