# Sentiment Analysis with an RNN

In this notebook, you'll implement a recurrent neural network that performs sentiment analysis. An RNN tends to be more accurate here than a feedforward network because it can use information about the *sequence* of words. We'll use a dataset of movie reviews, each accompanied by a label.

The architecture for this network is shown below.

<img src="assets/network_diagram.png" width=400px>

Here, we'll pass words into an embedding layer. We need an embedding layer because we have tens of thousands of words, so we need a more efficient representation for our input data than one-hot encoded vectors. You should have seen this before in the word2vec lesson. You could train an embedding with word2vec and use it here, but it's good enough to just add an embedding layer and let the network learn the embedding table on its own.

From the embedding layer, the new representations are passed to LSTM cells. These add recurrent connections to the network so we can include information about the sequence of words in the data. Finally, the LSTM outputs go to a sigmoid output layer. We're using a sigmoid because we're trying to predict whether the text has positive or negative sentiment, so the output layer is just a single unit with a sigmoid activation function.

We don't care about the sigmoid outputs except for the very last one, so we can ignore the rest. We'll calculate the cost from the output of the last step and the training label.

In [1]:
import numpy as np
import tensorflow as tf

In [2]:
with open('../sentiment-network/reviews.txt', 'r') as f:
    reviews = f.read()
with open('../sentiment-network/labels.txt', 'r') as f:
    labels = f.read()

In [3]:
reviews[:2000]

'bromwell high is a cartoon comedy . it ran at the same time as some other programs about school life  such as  teachers  . my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers  . the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students . when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled . . . . . . . . . at . . . . . . . . . . high . a classic line inspector i  m here to sack one of your teachers . student welcome to bromwell high . i expect that many adults of my age think that bromwell high is far fetched . what a pity that it isn  t   \nstory of a man who has unnatural feelings for a pig . starts out with a opening scene that is a terrific example of absurd comedy . a formal orchestra audience is tu

## Data preprocessing

The first step when building a neural network model is getting your data into the proper form to feed into the network. Since we're using embedding layers, we'll need to encode each word with an integer. We'll also want to clean the text up a bit.

You can see an example of the reviews data above. We'll want to get rid of those periods. Also, you might notice that the reviews are delimited with newlines `\n`. To deal with those, I'm going to split the text into individual reviews using `\n` as the delimiter. Then I can combine all the reviews back together into one big string.

First, let's remove all punctuation. Then we'll join the reviews into one string without the newlines and split it into individual words.

In [4]:
from string import punctuation
all_text = ''.join([c for c in reviews if c not in punctuation])
reviews = all_text.split('\n')

all_text = ' '.join(reviews)
words = all_text.split()
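
To see what the cell above does step by step, here is a minimal sketch on a made-up two-review string. The names `raw`, `toy_reviews`, and `toy_words` are placeholders of mine, not part of the notebook's data.

```python
from string import punctuation

# Toy stand-in for the contents of reviews.txt (two reviews, one per line).
raw = "great movie . loved it !\nterrible plot , bad acting ."

# Drop every punctuation character. The newline is not punctuation, so it survives.
no_punct = ''.join(c for c in raw if c not in punctuation)

# Split on newlines to recover the individual reviews.
toy_reviews = no_punct.split('\n')

# Join the reviews with spaces and split on whitespace to get one flat word list.
toy_words = ' '.join(toy_reviews).split()

print(toy_reviews)
print(toy_words)
```

Because `split()` with no argument collapses runs of whitespace, the double spaces left behind by the removed punctuation don't produce empty words.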

In [5]:
all_text[:2000]

'bromwell high is a cartoon comedy  it ran at the same time as some other programs about school life  such as  teachers   my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers   the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students  when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled          at           high  a classic line inspector i  m here to sack one of your teachers  student welcome to bromwell high  i expect that many adults of my age think that bromwell high is far fetched  what a pity that it isn  t    story of a man who has unnatural feelings for a pig  starts out with a opening scene that is a terrific example of absurd comedy  a formal orchestra audience is turned into an insane  violent m

In [6]:
words[:100]

['bromwell',
 'high',
 'is',
 'a',
 'cartoon',
 'comedy',
 'it',
 'ran',
 'at',
 'the',
 'same',
 'time',
 'as',
 'some',
 'other',
 'programs',
 'about',
 'school',
 'life',
 'such',
 'as',
 'teachers',
 'my',
 'years',
 'in',
 'the',
 'teaching',
 'profession',
 'lead',
 'me',
 'to',
 'believe',
 'that',
 'bromwell',
 'high',
 's',
 'satire',
 'is',
 'much',
 'closer',
 'to',
 'reality',
 'than',
 'is',
 'teachers',
 'the',
 'scramble',
 'to',
 'survive',
 'financially',
 'the',
 'insightful',
 'students',
 'who',
 'can',
 'see',
 'right',
 'through',
 'their',
 'pathetic',
 'teachers',
 'pomp',
 'the',
 'pettiness',
 'of',
 'the',
 'whole',
 'situation',
 'all',
 'remind',
 'me',
 'of',
 'the',
 'schools',
 'i',
 'knew',
 'and',
 'their',
 'students',
 'when',
 'i',
 'saw',
 'the',
 'episode',
 'in',
 'which',
 'a',
 'student',
 'repeatedly',
 'tried',
 'to',
 'burn',
 'down',
 'the',
 'school',
 'i',
 'immediately',
 'recalled',
 'at',
 'high']

### Encoding the words

The embedding lookup requires that we pass integers into our network. The easiest way to do this is to create a dictionary that maps the words in the vocabulary to integers. Then we can convert each of our reviews into a list of integers so it can be passed into the network.

> **Exercise:** Now you're going to encode the words with integers. Build a dictionary that maps words to integers. Later we're going to pad our input vectors with zeros, so make sure the integers **start at 1, not 0**.
> Also, convert the reviews to integers and store them in a new list called `reviews_ints`.

In [1]:
# Create your dictionary that maps vocab words to integers here
vocab_to_int = 

# Convert the reviews to integers, same shape as reviews list, but with integers
reviews_ints = 

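
If you get stuck, here is one possible approach sketched on toy data. It assumes you want the most frequent words to get the smallest integers; that ordering is my choice, and any mapping that starts at 1 would satisfy the exercise. The `toy_` names are placeholders.

```python
from collections import Counter

# Toy stand-ins for `reviews` and `words`; the last review is empty on purpose.
toy_reviews = ["the movie was great", "the plot was bad", ""]
toy_words = ' '.join(toy_reviews).split()

# Sort the vocabulary by frequency, most common first (ties keep first-seen order).
counts = Counter(toy_words)
vocab = sorted(counts, key=counts.get, reverse=True)

# enumerate(..., 1) starts numbering at 1, which leaves 0 free for padding.
toy_vocab_to_int = {word: i for i, word in enumerate(vocab, 1)}

# Same shape as the reviews list, but each word is replaced by its integer.
toy_reviews_ints = [[toy_vocab_to_int[w] for w in review.split()]
                    for review in toy_reviews]

print(toy_vocab_to_int)
print(toy_reviews_ints)
```

Note that the empty review turns into an empty list, which is the kind of zero-length entry you'll have to filter out later.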

### Encoding the labels

Our labels are "positive" or "negative". To use these labels in our network, we need to convert them to 1 and 0.

> **Exercise:** Convert labels from `positive` and `negative` to 1 and 0, respectively.

In [8]:
# Convert labels to 1s and 0s for 'positive' and 'negative'
labels = 
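
As a rough sketch of the conversion, assuming the labels file holds one label per line (the toy string below is made up, and `toy_labels_ints` is a placeholder name):

```python
# Toy stand-in for the contents of labels.txt: one label per line.
toy_labels = "positive\nnegative\npositive"

# Map 'positive' to 1 and everything else to 0.
toy_labels_ints = [1 if label == 'positive' else 0
                   for label in toy_labels.split('\n')]

print(toy_labels_ints)
```

In the notebook you'd probably want the result as a NumPy array so it can be sliced alongside the features.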

If you built `labels` correctly, you should see the next output.

In [9]:
from collections import Counter
review_lens = Counter([len(x) for x in reviews_ints])
print("Zero-length reviews: {}".format(review_lens[0]))
print("Maximum review length: {}".format(max(review_lens)))

Zero-length reviews: 1
Maximum review length: 2514


Okay, a couple of issues here. We seem to have one review with zero length. Also, the maximum review length is way too many steps for our RNN. Let's truncate to 200 steps. For reviews shorter than 200 words, we'll pad with 0s. For reviews longer than 200 words, we'll keep only the first 200 words.

> **Exercise:** First, remove the review with zero length from the `reviews_ints` list.

In [10]:
# Filter out that review with 0 length
reviews_ints = 

> **Exercise:** Now, create an array `features` that contains the data we'll pass to the network. The data should come from `reviews_ints`, since we want to feed integers to the network. Each row should be 200 elements long. For reviews shorter than 200 words, left pad with 0s. That is, if the review is `['best', 'movie', 'ever']`, or `[117, 18, 128]` as integers, the row will look like `[0, 0, 0, ..., 0, 117, 18, 128]`. For reviews longer than 200 words, use only the first 200 words as the feature vector.

This isn't trivial, and there are a bunch of ways to do it. But if you're going to be building your own deep learning networks, you'll have to get used to preparing your data.

In [11]:
seq_len = 200
features = 
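
Here is one way the padding rule could look, sketched with plain lists and a short toy sequence length so the result is easy to read. The helper `pad_or_truncate` and the `toy_` names are hypothetical, not something the notebook defines.

```python
def pad_or_truncate(review, seq_len):
    """Keep at most the first seq_len integers, then left-pad with zeros."""
    kept = review[:seq_len]
    return [0] * (seq_len - len(kept)) + kept

toy_seq_len = 5
toy_reviews_ints = [
    [117, 18, 128],              # shorter than toy_seq_len: gets left-padded
    [4, 8, 15, 16, 23, 42, 7],   # longer than toy_seq_len: gets truncated
]

toy_features = [pad_or_truncate(r, toy_seq_len) for r in toy_reviews_ints]

for row in toy_features:
    print(row)
```

For the real data you'd use `seq_len = 200` and convert the result to a NumPy array, since later cells index `features` with two-dimensional slices.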

If you built `features` correctly, it should look like the cell output below.

In [13]:
features[:10,:100]

array([[    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0, 21282,   308,     6,
            3,  1050,   207,     8,  2143,    32,     1,   171,    57,
           15,    49,    81,  5832,    44,   382,   110,   140,    15,
         5236,    60,   154,     9,     1,  5014,  5899,   475,    71,
            5,   260,    12, 21282,   308,    13,  1981,     6,    74,
         2396],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     

## Training, Validation, Test

With our data in nice shape, we'll split it into training, validation, and test sets.

> **Exercise:** Create the training, validation, and test sets here. You'll need to create sets for the features and the labels, `train_x` and `train_y` for example. Define a split fraction, `split_frac`, as the fraction of data to keep in the training set. Usually this is set to 0.8 or 0.9. The rest of the data will be split in half to create the validation and test data.

In [None]:
split_frac = 0.8

train_x, val_x = 
train_y, val_y = 

val_x, test_x = 
val_y, test_y = 

print("\t\t\tFeature Shapes:")
print("Train set: \t\t{}".format(train_x.shape), 
      "\nValidation set: \t{}".format(val_x.shape),
      "\nTest set: \t\t{}".format(test_x.shape))

With train, validation, and test fractions of 0.8, 0.1, 0.1, the final shapes should look like:

```
                    Feature Shapes:
Train set: 		 (20000, 200) 
Validation set: 	(2500, 200) 
Test set: 		  (2500, 200)
```
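
The slicing behind those shapes can be sketched on toy lists. This is only one way to do it, it does no shuffling, and the `toy_` names are placeholders; the same index arithmetic should carry over to NumPy arrays.

```python
# Toy stand-ins: 20 "feature rows" and 20 labels.
toy_features = list(range(20))
toy_labels = [i % 2 for i in range(20)]
toy_split_frac = 0.8

# The first 80% goes to training, the remainder is held out.
split_idx = int(len(toy_features) * toy_split_frac)
toy_train_x, rest_x = toy_features[:split_idx], toy_features[split_idx:]
toy_train_y, rest_y = toy_labels[:split_idx], toy_labels[split_idx:]

# The held-out part is cut in half: validation first, then test.
half_idx = len(rest_x) // 2
toy_val_x, toy_test_x = rest_x[:half_idx], rest_x[half_idx:]
toy_val_y, toy_test_y = rest_y[:half_idx], rest_y[half_idx:]

print(len(toy_train_x), len(toy_val_x), len(toy_test_x))
```

With 25,000 reviews and `split_frac = 0.8`, the same arithmetic gives 20,000 training rows and 2,500 rows each for validation and test.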

## Build the graph

Here, we'll build the graph. First up, defining the hyperparameters.

* `lstm_size`: Number of units in the hidden layers in the LSTM cells. Larger usually performs better, at the cost of more computation. Common values are 128, 256, 512, etc.
* `lstm_layers`: Number of LSTM layers in the network. I'd start with 1, then add more if I'm underfitting.
* `batch_size`: The number of reviews to feed the network in one training pass. Typically this should be set as high as you can go without running out of memory.
* `learning_rate`: The learning rate used by the optimizer.

In [31]:
lstm_size = 256
lstm_layers = 1
batch_size = 500
learning_rate = 0.001

For the network itself, we'll be passing in our 200-element-long review vectors. Each batch will contain `batch_size` vectors. We'll also be using dropout on the LSTM layer, so we'll make a placeholder for the keep probability.

> **Exercise:** Create the `inputs_`, `labels_`, and dropout `keep_prob` placeholders using `tf.placeholder`. `labels_` needs to be two-dimensional to work with some functions later. Since `keep_prob` is a scalar (a 0-dimensional tensor), you shouldn't provide a size to `tf.placeholder`.

In [32]:
n_words = len(vocab_to_int)

# Create the graph object
graph = tf.Graph()
# Add nodes to the graph
with graph.as_default():
    inputs_ = 
    labels_ = 
    keep_prob = 

### Embedding

Now we'll add an embedding layer. We need to do this because there are about 74,000 words in our vocabulary, so it is massively inefficient to one-hot encode our words here. You should remember dealing with this problem from the word2vec lesson. Instead of one-hot encoding, we can have an embedding layer and use that layer as a lookup table. You could train an embedding layer using word2vec, then load it here. But it's fine to just make a new layer and let the network learn the weights.

> **Exercise:** Create the embedding lookup matrix as a `tf.Variable`. Use that embedding matrix to get the embedded vectors to pass to the LSTM cell with [`tf.nn.embedding_lookup`](https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup). This function takes the embedding matrix and an input tensor, such as the review vectors. Then, it'll return another tensor with the embedded vectors. So, if the embedding layer has 200 units, each word is mapped to a 200-element vector, and an input of shape [batch_size, seq_len] comes back as a tensor of shape [batch_size, seq_len, 200].



In [33]:
# Size of the embedding vectors (number of units in the embedding layer)
embed_size = 300 

with graph.as_default():
    embedding = 
    embed = 
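
To make the lookup-table idea concrete, here is a small sketch in plain Python rather than TensorFlow. It assumes the table needs one extra row for the padding index 0, since the word integers run from 1 up to the vocabulary size; all `toy_` names and sizes are made up.

```python
import random

random.seed(0)
toy_n_words = 6      # vocabulary size, not counting the padding index 0
toy_embed_size = 3   # length of each embedding vector

# One row per integer 0..toy_n_words; row 0 is what zero padding looks up.
toy_embedding = [[random.uniform(-1, 1) for _ in range(toy_embed_size)]
                 for _ in range(toy_n_words + 1)]

# A batch of 2 reviews, each 4 steps long and already left-padded with zeros.
toy_inputs = [[0, 0, 1, 3], [0, 2, 5, 6]]

# The lookup is plain row indexing: every integer is swapped for its row.
toy_embed = [[toy_embedding[idx] for idx in review] for review in toy_inputs]

# Resulting "shape": [batch_size, seq_len, embed_size]
print(len(toy_embed), len(toy_embed[0]), len(toy_embed[0][0]))
```

The point to take away is the extra trailing dimension: each integer in the input becomes a whole vector, which is what the LSTM cells receive at each step.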

### LSTM cell

<img src="assets/network_diagram.png" width=400px>

Next, we'll create our LSTM cells to use in the recurrent network ([TensorFlow documentation](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn)). Here we are just defining what the cells look like. This isn't actually building the graph, just defining the type of cells we want in our graph.

To create a basic LSTM cell for the graph, you'll want to use `tf.contrib.rnn.BasicLSTMCell`. Looking at the function documentation:

```
tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)
```

you can see it takes a parameter called `num_units`, the number of units in the cell, which is called `lstm_size` in this code. So then, you can write something like

```
lstm = tf.contrib.rnn.BasicLSTMCell(num_units)
```

to create an LSTM cell with `num_units` units. Next, you can add dropout to the cell with `tf.contrib.rnn.DropoutWrapper`. This just wraps the cell in another cell, but with dropout added to the inputs and/or outputs. It's a really convenient way to regularize your network with almost no effort! So you'd do something like

```
drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
```

Most of the time, your network will have better performance with more layers. That's sort of the magic of deep learning: adding more layers allows the network to learn really complex relationships. Again, there is a simple way to create multiple layers of LSTM cells with `tf.contrib.rnn.MultiRNNCell`:

```
cell = tf.contrib.rnn.MultiRNNCell([drop] * lstm_layers)
```

Here, `[drop] * lstm_layers` creates a list that repeats the cell `drop` `lstm_layers` times. The `MultiRNNCell` wrapper builds this into multiple layers of RNN cells, one for each entry in the list.
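
One detail of `[drop] * lstm_layers` is worth seeing in isolation: in Python, multiplying a list repeats references to the same object rather than creating copies. The sketch below uses a hypothetical `ToyCell` class, not a TensorFlow one, to show the difference from building each entry separately.

```python
class ToyCell:
    """Hypothetical stand-in for an RNN cell object; not a TensorFlow class."""
    def __init__(self, num_units):
        self.num_units = num_units

toy_layers = 3

# List multiplication: three references to one and the same object.
shared = [ToyCell(256)] * toy_layers

# List comprehension: three separately constructed objects.
distinct = [ToyCell(256) for _ in range(toy_layers)]

n_shared = len(set(id(c) for c in shared))
n_distinct = len(set(id(c) for c in distinct))
print(n_shared, n_distinct)
```

With `lstm_layers = 1` this makes no difference. As far as I know, some later TensorFlow 1.x releases refuse to reuse a single cell object across several layers, so if you go deeper and hit an error about reusing a cell, constructing one cell per layer in a list comprehension is the usual workaround. Treat that as an assumption to check against your installed version.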

So the final cell you're using in the network is actually multiple (or just one) LSTM cells with dropout. But it all works the same from an architectural viewpoint; it's just a more complicated graph inside the cell.

> **Exercise:** Below, use `tf.contrib.rnn.BasicLSTMCell` to create an LSTM cell. Then, add dropout to it with `tf.contrib.rnn.DropoutWrapper`. Finally, create multiple LSTM layers with `tf.contrib.rnn.MultiRNNCell`.

Here is [a tutorial on building RNNs](https://www.tensorflow.org/tutorials/recurrent) that will help you out.


In [34]:
with graph.as_default():
    # Your basic LSTM cell
    lstm = 
    
    # Add dropout to the cell
    drop = 
    
    # Stack up multiple LSTM layers, for deep learning
    cell = 
    
    # Getting an initial state of all zeros
    initial_state = cell.zero_state(batch_size, tf.float32)

### RNN forward pass

<img src="assets/network_diagram.png" width=400px>

Now we need to actually run the data through the RNN nodes. You can use [`tf.nn.dynamic_rnn`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn) to do this. You'd pass in the RNN cell you created (our multi-layered LSTM `cell`, for instance) and the inputs to the network.

```
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)
```

Above I created an initial state, `initial_state`, to pass to the RNN. This is the cell state that is passed between the hidden layers in successive time steps. `tf.nn.dynamic_rnn` takes care of most of the work for us. We pass in our cell and the input to the cell, then it does the unrolling and everything else for us. It returns the outputs for each time step and the `final_state` of the hidden layer.
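
To picture what "unrolling" means, here is a toy sketch with a scalar state and a made-up `tanh` recurrence. It is not an LSTM and not how `tf.nn.dynamic_rnn` is implemented; the function names and weights are hypothetical. It only mirrors the idea of one output per time step plus a final state.

```python
import math

def toy_rnn_step(x, h, w_x=0.5, w_h=0.8):
    """One step of a toy scalar recurrence: h_new = tanh(w_x * x + w_h * h)."""
    return math.tanh(w_x * x + w_h * h)

def toy_unroll(inputs, initial_state=0.0):
    """Apply the step to each input in order, collecting every output."""
    state = initial_state
    outputs = []
    for x in inputs:
        state = toy_rnn_step(x, state)   # the state carries over to the next step
        outputs.append(state)
    return outputs, state

toy_outputs, toy_final_state = toy_unroll([1.0, -2.0, 0.5])
print(toy_outputs)
print(toy_final_state)
```

In this toy version the final state equals the last output. An LSTM keeps a separate cell state as well, so its final state holds more than the last output does.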

> **Exercise:** Use `tf.nn.dynamic_rnn` to add the forward pass through the RNN. Remember that we're actually passing in vectors from the embedding layer, `embed`.



In [35]:
with graph.as_default():
    outputs, final_state = 

### Output

We only care about the final output, since we'll be using that as our sentiment prediction. So we need to grab the last output with `outputs[:, -1]`, then calculate the cost from that and `labels_`.
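
The three steps here (take the last time step, squash it through a sigmoid unit, compare with the label using mean squared error) can be traced by hand on toy numbers. The weights and outputs below are invented for illustration; in the real graph the fully connected layer learns its own weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy RNN outputs with shape [batch=2, steps=3, units=2].
toy_outputs = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, -0.5]],
    [[0.0, 0.1], [0.2, 0.2], [-1.0, 1.0]],
]
toy_labels = [1, 0]

# Plain-list equivalent of outputs[:, -1]: keep the last step of each review.
last_steps = [review[-1] for review in toy_outputs]

# Hypothetical fixed weights for the single sigmoid output unit.
toy_weights, toy_bias = [2.0, -2.0], 0.0
toy_predictions = [
    sigmoid(sum(w * h for w, h in zip(toy_weights, step)) + toy_bias)
    for step in last_steps
]

# Mean squared error between the labels and the predictions.
toy_cost = sum((y - p) ** 2
               for y, p in zip(toy_labels, toy_predictions)) / len(toy_labels)

print(toy_predictions)
print(toy_cost)
```

Everything before the last time step is discarded, which is why only the final output contributes to the cost.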

In [36]:
with graph.as_default():
    predictions = tf.contrib.layers.fully_connected(outputs[:, -1], 1, activation_fn=tf.sigmoid)
    cost = tf.losses.mean_squared_error(labels_, predictions)
    
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

### Validation accuracy

Here we can add a few nodes to calculate the accuracy, which we'll use in the validation pass.

In [37]:
with graph.as_default():
    correct_pred = tf.equal(tf.cast(tf.round(predictions), tf.int32), labels_)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
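
The accuracy nodes boil down to rounding each sigmoid output to 0 or 1, comparing with the label, and averaging. A plain-Python sketch of the same arithmetic on made-up predictions:

```python
# Toy sigmoid outputs and their true labels.
toy_predictions = [0.91, 0.12, 0.65, 0.30]
toy_labels = [1, 0, 0, 0]

# Round to the nearest class, then check each prediction against its label.
correct = [int(round(p)) == y for p, y in zip(toy_predictions, toy_labels)]

# Accuracy is the fraction of correct predictions.
toy_accuracy = sum(correct) / len(correct)
print(toy_accuracy)
```

Here the third review is predicted positive but labeled negative, so three of the four predictions are correct.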

### Batching

This is a simple function for returning batches from our data. First it drops the trailing data that doesn't fill a batch, so that we only have full batches. Then it iterates through the `x` and `y` arrays and returns slices of those arrays with size `[batch_size]`.

In [38]:
def get_batches(x, y, batch_size=100):
    
    n_batches = len(x)//batch_size
    x, y = x[:n_batches*batch_size], y[:n_batches*batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii+batch_size], y[ii:ii+batch_size]
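
To check how the leftover data is handled, here is the same function run on toy lists of 10 items with a batch size of 4. The function body is repeated so the snippet runs on its own; the `toy_` names are placeholders.

```python
def get_batches(x, y, batch_size=100):
    # Keep only as many items as fill complete batches.
    n_batches = len(x) // batch_size
    x, y = x[:n_batches * batch_size], y[:n_batches * batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii + batch_size], y[ii:ii + batch_size]

toy_x = list(range(10))
toy_y = [v % 2 for v in toy_x]

toy_batches = list(get_batches(toy_x, toy_y, batch_size=4))
for batch_x, batch_y in toy_batches:
    print(batch_x, batch_y)
```

Only two full batches come out, and the last two items are dropped. That matters here because the initial state is created for exactly `batch_size` reviews.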

## Training

Below is the typical training code. If you want to write it yourself, feel free to delete all this code and implement your own. Before you run this, make sure the `checkpoints` directory exists.

In [None]:
epochs = 10

with graph.as_default():
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    iteration = 1
    for e in range(epochs):
        state = sess.run(initial_state)
        
        for ii, (x, y) in enumerate(get_batches(train_x, train_y, batch_size), 1):
            feed = {inputs_: x,
                    labels_: y[:, None],
                    keep_prob: 0.5,
                    initial_state: state}
            loss, state, _ = sess.run([cost, final_state, optimizer], feed_dict=feed)
            
            if iteration%5==0:
                print("Epoch: {}/{}".format(e + 1, epochs),
                      "Iteration: {}".format(iteration),
                      "Train loss: {:.3f}".format(loss))

            if iteration%25==0:
                val_acc = []
                val_state = sess.run(cell.zero_state(batch_size, tf.float32))
                for x, y in get_batches(val_x, val_y, batch_size):
                    feed = {inputs_: x,
                            labels_: y[:, None],
                            keep_prob: 1,
                            initial_state: val_state}
                    batch_acc, val_state = sess.run([accuracy, final_state], feed_dict=feed)
                    val_acc.append(batch_acc)
                print("Val acc: {:.3f}".format(np.mean(val_acc)))
            iteration +=1
    saver.save(sess, "checkpoints/sentiment.ckpt")

## Testing

In [None]:
test_acc = []
with tf.Session(graph=graph) as sess:
    saver.restore(sess, tf.train.latest_checkpoint('checkpoints'))
    test_state = sess.run(cell.zero_state(batch_size, tf.float32))
    for ii, (x, y) in enumerate(get_batches(test_x, test_y, batch_size), 1):
        feed = {inputs_: x,
                labels_: y[:, None],
                keep_prob: 1,
                initial_state: test_state}
        batch_acc, test_state = sess.run([accuracy, final_state], feed_dict=feed)
        test_acc.append(batch_acc)
    print("Test accuracy: {:.3f}".format(np.mean(test_acc)))