# LSTM layer parameters

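The first argument of LSTM, `units` (the number of memory units), is the dimensionality of the hidden state, which is also the dimensionality of the output vector.   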

## stateful

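By default, the LSTM's memory (its internal state) is reset after every mini-batch.  
With `stateful=True` the state is no longer reset automatically; the developer resets it manually, for example by calling `reset_states()` on the layer.    
This is needed in the following situations:  
1. batch_size = 1  
2. A long sequence has been split into several subsequences  
3. The sequence is very long, so efficiency has to be considered  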
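
The following is a minimal sketch of the behaviour described above, assuming TensorFlow 2.x in eager mode. The layer size and the input values are arbitrary. Calling a stateful layer twice on the same batch gives different results because the second call starts from the state left by the first; after a manual reset the first result is reproduced.

```python
import numpy as np
import tensorflow as tf

# A batch with 1 sample, 5 time steps and 1 feature.
x = np.arange(5, dtype="float32").reshape(1, 5, 1)

layer = tf.keras.layers.LSTM(4, stateful=True)

first = np.array(layer(x))    # starts from an all-zero state
second = np.array(layer(x))   # starts from the state left by the first call

layer.reset_states()          # manual reset back to zeros
after_reset = np.array(layer(x))

print(np.allclose(first, second))       # compares a fresh start with a carried-over state
print(np.allclose(first, after_reset))  # compares a fresh start with a post-reset start
```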

## go_backwards

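Sets the direction in which the LSTM processes the sequence. The default is False (forward); setting it to True makes the layer read the input sequence in reverse.  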
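
As a rough check of what "reverse" means here, the sketch below gives a forward layer and a `go_backwards=True` layer the same weights, then compares the backward layer's output with the forward layer's output on a manually reversed input. The sizes and values are illustrative assumptions, not taken from the original notebook.

```python
import numpy as np
import tensorflow as tf

# 1 sample, 6 time steps, 1 feature.
x = np.arange(1, 7, dtype="float32").reshape(1, 6, 1)

forward = tf.keras.layers.LSTM(3)
backward = tf.keras.layers.LSTM(3, go_backwards=True)

# Call each layer once so that its weights exist, then share the weights.
forward(x)
backward(x)
backward.set_weights(forward.get_weights())

# Reading x backwards should match reading the reversed x forwards.
out_backward = np.array(backward(x))
out_reversed = np.array(forward(x[:, ::-1, :].copy()))
print(np.allclose(out_backward, out_reversed, atol=1e-5))
```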

# LSTM as the input layer

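An LSTM layer expects three-dimensional input whose axes are the number of samples, the number of time steps, and the number of features.   
Accordingly, the `input_shape` argument defines the shape of input the layer accepts. It covers only the number of time steps and the number of features; the sample axis is left out.  
The default `activation` is `tanh`, not linear. The first argument of LSTM is the dimensionality of the output vector.   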
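
To make these shapes concrete, here is a small sketch, assuming TensorFlow 2.x in eager mode. It passes a `(1, 10, 1)` array through an LSTM with 5 units, once with the default settings and once with `return_sequences=True`.

```python
import numpy as np
import tensorflow as tf

# One sample, 10 time steps, 1 feature per step.
data = np.arange(1, 11, dtype="float32").reshape(1, 10, 1)

# By default only the output of the last time step is returned.
last_output = tf.keras.layers.LSTM(5)(data)

# With return_sequences=True the layer returns one output per time step.
all_outputs = tf.keras.layers.LSTM(5, return_sequences=True)(data)

print(last_output.shape)  # (1, 5)
print(all_outputs.shape)  # (1, 10, 5)
```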

In [7]:
import numpy as np
import tensorflow as tf

data = np.array([1,2,3,4,5,6,7,8,9,10])
data = data.reshape(1, 10, 1)
tf.keras.layers.LSTM(5, input_shape=(10, 1))

<tensorflow.python.keras.layers.recurrent_v2.LSTM at 0x63f97b210>

In [8]:
import numpy as np

data = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
data = data.reshape(1, 5, 2)
tf.keras.layers.LSTM(5, input_shape=(5, 2))

<tensorflow.python.keras.layers.recurrent_v2.LSTM at 0x64074c710>

# LSTM as a hidden layer

In [9]:
import tensorflow as tf

vocab_size = 10000
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Two consecutive LSTM layers

In [10]:
import tensorflow as tf

vocab_size = 10000
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Bidirectional LSTM

In [11]:
import tensorflow as tf

vocab_size = 10000
model = tf.keras.Sequential([
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
])
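
The model in the cell above is never given any input, so the sketch below calls a `Bidirectional` wrapper directly on random data to show the resulting shape. The input sizes are arbitrary assumptions. With the default `merge_mode="concat"`, the forward and backward outputs are concatenated, so the last axis has twice the number of units.

```python
import numpy as np
import tensorflow as tf

# 2 samples, 7 time steps, 3 features (arbitrary illustrative sizes).
x = np.random.rand(2, 7, 3).astype("float32")

bi = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True)
)
out = bi(x)

# The forward and backward outputs (64 values each) are joined on the last axis.
print(out.shape)  # (2, 7, 128)
```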

# Embedding layer

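An Embedding layer is usually placed before the LSTM.  
An Embedding maps each integer code to a vector of a specified dimensionality, so that words with similar meanings end up with similar vectors.  
Embedding is usually the input layer and does not need an `input_shape`. Its input is two-dimensional: the first dimension is the number of samples and the second is the sequence of integer codes for each sample.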
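
Conceptually, the lookup can be sketched in plain NumPy as row indexing into a weight matrix. This is only an illustration of the idea, not the Keras implementation; the random matrix stands in for weights that would normally be learned.

```python
import numpy as np

vocab_size, embedding_dim = 100, 30
rng = np.random.default_rng(0)

# Stand-in for the weight matrix of an Embedding layer:
# one row of embedding_dim values per integer code.
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

# Two samples, each a sequence of 10 integer codes: shape (2, 10).
token_ids = rng.integers(low=0, high=vocab_size, size=(2, 10))

# The lookup replaces every code with its row, which adds a third axis.
embedded = embedding_matrix[token_ids]
print(embedded.shape)  # (2, 10, 30)
```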

In [3]:
import tensorflow as tf

model = tf.keras.Sequential()
vocab_size = 100 # The vocabulary has 99 words; there is one additional row to handle "unknown" words.
embedding_dim = 30 # Each output vector has 30 dimensions
max_length = 10 # Length of each sample
model.add(tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length))
model.add(tf.keras.layers.LSTM(units=128, return_sequences=True,dropout=0.5))
model.add(tf.keras.layers.LSTM(units=128, return_sequences=False,dropout=0.5))
model.add(tf.keras.layers.Dense(units=5, activation='softmax'))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 10, 30)            3000      
_________________________________________________________________
lstm (LSTM)                  (None, 10, 128)           81408     
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 5)                 645       
Total params: 216,637
Trainable params: 216,637
Non-trainable params: 0
_________________________________________________________________
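
The parameter counts in this summary can be reproduced by hand. The sketch below uses the usual count for an LSTM layer: four sets of weights (three gates plus the candidate cell state), each with an input kernel, a recurrent kernel, and a bias. The helper name `lstm_param_count` is made up for this illustration.

```python
def lstm_param_count(input_dim, units):
    # Four weight sets, each with an input kernel, a recurrent kernel and a bias.
    return 4 * ((input_dim + units) * units + units)

embedding_params = 100 * 30                 # vocab_size * embedding_dim
lstm_params = lstm_param_count(30, 128)     # input is the 30-dim embedding
lstm_1_params = lstm_param_count(128, 128)  # input is the previous LSTM's output
dense_params = 128 * 5 + 5                  # weights + biases

print(embedding_params, lstm_params, lstm_1_params, dense_params)
# 3000 81408 131584 645
print(embedding_params + lstm_params + lstm_1_params + dense_params)
# 216637
```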


## Initializing the Embedding layer with pretrained weights

In [5]:
import numpy as np

def read_glove_vecs(glove_file):
    with open(glove_file, 'r',encoding='utf-8') as f:
        words = set()
        word_to_vec_map = {}
        for line in f:
            line = line.strip().split()
            curr_word = line[0]
            words.add(curr_word)
            word_to_vec_map[curr_word] = np.array(line[1:], dtype=np.float64)

        i = 1
        words_to_index = {}
        index_to_words = {}
        for w in sorted(words):
            words_to_index[w] = i
            index_to_words[i] = w
            i = i + 1
    return words_to_index, index_to_words, word_to_vec_map

def pretrained_embedding_layer(word_to_vec_map, word_to_index):
    """
    Creates a Keras Embedding() layer and loads in pre-trained GloVe 50-dimensional vectors.

    Arguments:
    word_to_vec_map -- dictionary mapping words to their GloVe vector representation.
    word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

    Returns:
    embedding_layer -- pretrained layer Keras instance
    """

    vocab_len = len(word_to_index) + 1  # adding 1 to fit Keras embedding (requirement)
    emb_dim = word_to_vec_map["cucumber"].shape[0]  # define dimensionality of your GloVe word vectors (= 50)

    ### START CODE HERE ###
    # Step 1
    # Initialize the embedding matrix as a numpy array of zeros.
    # See instructions above to choose the correct shape.
    emb_matrix = np.zeros((vocab_len, emb_dim))

    # Step 2
    # Set each row "idx" of the embedding matrix to be
    # the word vector representation of the idx'th word of the vocabulary
    for word, idx in word_to_index.items():
        emb_matrix[idx, :] = word_to_vec_map[word]

    # Step 3
    # Define Keras embedding layer with the correct input and output sizes
    # Make it non-trainable.
    embedding_layer = tf.keras.layers.Embedding(vocab_len, emb_dim, trainable=False)
    ### END CODE HERE ###

    # Step 4 (already done for you; please do not modify)
    # Build the embedding layer, it is required before setting the weights of the embedding layer.
    embedding_layer.build((None,))  # Do not modify the "None".  This line of code is complete as-is.

    # Set the weights of the embedding layer to the embedding matrix. Your layer is now pretrained.
    embedding_layer.set_weights([emb_matrix])

    return embedding_layer

model = tf.keras.Sequential()
vocab_size = 100 # The vocabulary has 99 words; there is one additional row to handle "unknown" words.
embedding_dim = 30 # Each output vector has 30 dimensions
max_length = 10 # Length of each sample
word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('pre_trained/glove.6B.50d.txt')
embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
model.add(embedding_layer)
model.add(tf.keras.layers.LSTM(units=128, return_sequences=True,dropout=0.5))
model.add(tf.keras.layers.LSTM(units=128, return_sequences=False,dropout=0.5))
model.add(tf.keras.layers.Dense(units=5, activation='softmax'))
model.summary()

FileNotFoundError: [Errno 2] No such file or directory: 'pre_trained/glove.6B.50d.txt'
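
The cell above stops because the GloVe file is not present. As a stand-in, the sketch below repeats the same steps with a tiny made-up vocabulary, so the weight-loading logic can be tried without the file. The three words and their 4-dimensional vectors are invented for illustration and are not real GloVe values.

```python
import numpy as np
import tensorflow as tf

# Made-up 4-dimensional vectors standing in for real GloVe entries.
toy_word_to_vec_map = {
    "apple": np.array([0.1, 0.2, 0.3, 0.4]),
    "banana": np.array([0.5, 0.6, 0.7, 0.8]),
    "cucumber": np.array([0.9, 1.0, 1.1, 1.2]),
}
# Indices start at 1, as in read_glove_vecs; row 0 stays all zeros.
toy_word_to_index = {
    w: i for i, w in enumerate(sorted(toy_word_to_vec_map), start=1)
}

vocab_len = len(toy_word_to_index) + 1
emb_dim = 4
emb_matrix = np.zeros((vocab_len, emb_dim))
for word, idx in toy_word_to_index.items():
    emb_matrix[idx, :] = toy_word_to_vec_map[word]

# Same build-then-set_weights sequence as pretrained_embedding_layer.
embedding_layer = tf.keras.layers.Embedding(vocab_len, emb_dim, trainable=False)
embedding_layer.build((None,))
embedding_layer.set_weights([emb_matrix])

# Look up "banana" and "cucumber" for a single sample.
ids = np.array([[toy_word_to_index["banana"], toy_word_to_index["cucumber"]]])
vectors = np.array(embedding_layer(ids))
print(vectors.shape)  # (1, 2, 4)
```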