### References
- This notebook is based on the Sentiment RNN lesson from the Udacity Deep Learning Nanodegree: https://github.com/udacity/deep-learning/tree/master/sentiment-rnn

# Predict if Content contains information about a person/character

This notebook uses an RNN to build a binary classifier (0 = not a person, 1 = person). The network outputs a scalar between 0 and 1 rather than a hard class label.

The architecture for this network is shown below.

<img src="assets/network_diagram.png" width=400px>

Here, we'll pass in words to an embedding layer. We need an embedding layer because we have tens of thousands of words, so we'll need a more efficient representation for our input data than one-hot encoded vectors. You can see an example of this with word2vec in the DLND embedding lesson: https://github.com/udacity/deep-learning/tree/master/embeddings

From the embedding layer, the new representations will be passed to LSTM cells. These add recurrent connections to the network so we can include information about the sequence of words in the data. Finally, the output of the LSTM cells goes to a sigmoid output layer. We're using a sigmoid because we want the classifier to output a probability between 0.0 and 1.0, so the output layer is a single unit with a sigmoid activation function.

We only care about the sigmoid output at the very last time step and can ignore the rest. We'll calculate the cost from the output of the last step and the training label.

In [1]:
import numpy as np
import tensorflow as tf
import pandas as pd
import pickle
import timeit
import re

The combined data contains two sets of data that have been shuffled together. Both sets are Title/Content pairs from Wikipedia pages. One set contains people/characters, and the other contains all other Wikipedia content that is not about a person.

In [2]:
# Combined and shuffled dataframe of person pages and regular (non-person) Wikipedia articles
df = pd.read_pickle('DataSets/combined.pkl')

In [3]:
df

| Unnamed: 0 | Content | Is_Name | Title |
|---|---|---|---|
| 18020 | emerson is a jamaican politician from the peop... | 1 | floyd morris |
| 32943 | is an estonian professional football manager a... | 1 | sergei ratnikov |
| 0 | the gcsb mori te <UNK> <UNK> formerly te <UNK>... | 0 | government communications security bureau |
| 0 | commonly <UNK> or <UNK> is a founded in the ci... | 0 | alcoa tenn federal credit union |
| 12684 | is a <UNK> footballer who played for <UNK> uni... | 1 | petio semaia |
| 0 | is a former roman city of the roman province o... | 0 | thimida regia |
| 0 | is an administrative nickname for a background... | 0 | yankee white |
| 0 | hungarian <UNK> is a village and municipality ... | 0 | vyn ipov |
| 47262 | is a british author and television presenter h... | 1 | richard rudgley |
| 0 | llp was a corporate law firm headquartered in ... | 0 | dewey ballantine |


## Data preprocessing

The first step when building a neural network model is getting your data into the proper form to feed into the network. Since we're using embedding layers, we'll need to encode each word with an integer. We'll also want to clean it up a bit.

You can see an example of the content data above.

### Encoding the words

The embedding lookup requires that we pass in integers to our network. The easiest way to do this is to create dictionaries that map the words in the vocabulary to integers. Then we can convert each page's content into a list of integers so it can be passed into the network.

In [3]:
# Get a list of all words
content = list(df['Content'])
all_text = ' '.join(content)
words = all_text.split()

In [4]:
# Create word mappings dictionaries, both to and from integers
vocab = set([w for w in words if w != '<UNK>'])
vocab_to_int = {word: i for i, word in enumerate(vocab, 1)} # Start at 1
vocab_to_int["<UNK>"] = 0 # Identifier for unknown words (words that don't occur in the vocab)

int_to_vocab = {i: w for w, i in vocab_to_int.items()}

In [5]:
# Convert wiki page Content to integers
content_ints = [] # each row contains page content as integer list
for row in content:
    content_ints.append([vocab_to_int.get(word, 0) for word in row.split()])
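
As a small illustration of the mapping built above, the sketch below encodes two made-up sentences and decodes them again. The sentences and all `toy_` names are hypothetical. `sorted` is used here only so that the integer ids are reproducible; the notebook itself enumerates an unordered `set`, so its ids can differ between runs.

```python
# Toy corpus (hypothetical), already lowercased; <UNK> marks an unknown word
toy_content = ["she was born in <UNK>", "the club was founded in london"]

toy_words = " ".join(toy_content).split()

# Reserve 0 for <UNK>; real words start at 1
toy_vocab = sorted(set(w for w in toy_words if w != "<UNK>"))
toy_vocab_to_int = {word: i for i, word in enumerate(toy_vocab, 1)}
toy_vocab_to_int["<UNK>"] = 0
toy_int_to_vocab = {i: w for w, i in toy_vocab_to_int.items()}

# Encode each page; words missing from the vocabulary also fall back to 0
toy_ints = [[toy_vocab_to_int.get(w, 0) for w in row.split()]
            for row in toy_content]

# Decoding recovers the original tokens
decoded = [" ".join(toy_int_to_vocab[i] for i in row) for row in toy_ints]

print(toy_ints)
print(decoded)
```

Because 0 is reserved for `<UNK>`, any word that is missing from the dictionary at prediction time is treated the same way as a word that was already unknown in the dataset.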

In [6]:
# Create list of labels to match content and classify if that page contains a 1=person or 0=not
labels = np.array(list(df['Is_Name']))

In [10]:
# Save Dictionaries to Pickle files
with open("DataSets/vocab_to_int.p", "wb") as f:
    pickle.dump(vocab_to_int, f)
with open("DataSets/int_to_vocab.p", "wb") as f:
    pickle.dump(int_to_vocab, f)

In [11]:
from collections import Counter

content_lens = Counter([len(x) for x in content_ints])
print("Minimum review length: {}".format(min(content_lens)))
print("Maximum review length: {}".format(max(content_lens)))

Minimum review length: 40
Maximum review length: 1000


The maximum page length (1000 words) is far too many steps for our RNN, so we'll use sequences of 200 steps. Pages shorter than 200 words are padded with 0s, and pages longer than 200 words are truncated to their first 200 words.

In [12]:
# Set the seq_len, but reduce its size if the maximum length is less
max_seq_len = max([len(r) for r in content_ints])

seq_len = 200
if max_seq_len < seq_len:
    seq_len = max_seq_len

Create an array `features` that contains the data we'll pass to the network. The data should come from `content_ints`, since we want to feed integers to the network. Each row will be 200 elements long. For pages shorter than 200 words, left pad with 0s. That is, if the page is `['she', 'was', 'born']`, or `[117, 18, 128]` as integers, the row will look like `[0, 0, 0, ..., 0, 117, 18, 128]`. For pages longer than 200 words, use only the first 200 words as the feature vector.

In [13]:
features = np.zeros((len(content_ints), seq_len), dtype=int)
for i, row in enumerate(content_ints):
    features[i, -len(row):] = np.array(row)[:seq_len]
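
To see the padding and truncation on a small case, here is a minimal sketch with a sequence length of 5 instead of 200. The helper name `pad_features` and the toy rows are hypothetical; the empty-row guard is an addition for safety and is not needed for this dataset, whose shortest page has 40 words.

```python
import numpy as np

def pad_features(int_rows, seq_len):
    """Left-pad short rows with 0s; keep only the first seq_len items of long rows."""
    features = np.zeros((len(int_rows), seq_len), dtype=int)
    for i, row in enumerate(int_rows):
        if len(row) == 0:
            continue  # an empty row stays all zeros
        kept = row[:seq_len]
        features[i, -len(kept):] = kept
    return features

toy_rows = [[117, 18, 128],            # shorter than seq_len: padded on the left
            [1, 2, 3, 4, 5, 6, 7]]     # longer than seq_len: truncated
toy_features = pad_features(toy_rows, seq_len=5)
print(toy_features)
```

Left padding keeps the real words at the end of the sequence, next to the final time step whose output is used for the prediction.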

In [14]:
features[:10,:100]

array([[51282, 51171, 13884, 27435, 52051, 13875, 52372,  9192, 45957,
        39444, 10035, 51171, 52372, 23690, 52372, 17823, 37039,  4207,
        43122, 34276, 51985, 29811, 34207, 24046, 35589, 28369,  8188,
         5681, 15742, 16323, 14036, 32988, 41847, 29999, 30691, 52372,
         7912, 27321, 34066, 37039, 52372, 17823, 46556, 10035, 25862,
         9918, 36404, 30436, 29591, 25862,  6314, 20097, 34541, 36404,
        37163, 45857, 23159,  8936, 26670, 23013, 18607,  3236,  4207,
        34207, 39175, 33719, 43122, 25862, 13884, 19904,  5681, 34207,
        46467,     0, 17580, 25862, 13884,     0,  6652, 14060, 10409,
         5681,   390, 41289, 51985,  1917, 13806, 34207, 24046, 36404,
        28369,  8188,  6362,  1917,     0, 51972,  1917, 41773, 52372,
            0],
       [    0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     0,     0,
            0,     0,     0,     0,     0,     0,     0,     

## Training, Validation, Test



With our data in nice shape, we'll split it into training, validation, and test sets.

In [15]:
split_frac = 0.8
split_idx = int(len(features)*split_frac)
train_x, val_x = features[:split_idx], features[split_idx:]
train_y, val_y = labels[:split_idx], labels[split_idx:]

test_idx = int(len(val_x)*0.5)
val_x, test_x = val_x[:test_idx], val_x[test_idx:]
val_y, test_y = val_y[:test_idx], val_y[test_idx:]

print("\t\t\tFeature Shapes:")
print("Train set: \t\t{}".format(train_x.shape), 
      "\nValidation set: \t{}".format(val_x.shape),
      "\nTest set: \t\t{}".format(test_x.shape))

			Feature Shapes:
Train set: 		(150725, 200) 
Validation set: 	(18841, 200) 
Test set: 		(18841, 200)
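
The printed shapes can be checked with a little arithmetic. The sketch below reproduces the 80/10/10 slicing on row counts only. The helper name `split_sizes` is hypothetical, and the total of 188407 rows is inferred from the printed shapes rather than stated in the notebook.

```python
def split_sizes(n_rows, split_frac=0.8):
    """Return (train, validation, test) row counts for the slicing scheme above."""
    n_train = int(n_rows * split_frac)
    n_rest = n_rows - n_train          # rows left for validation + test
    n_val = int(n_rest * 0.5)
    n_test = n_rest - n_val
    return n_train, n_val, n_test

# 150725 + 18841 + 18841 rows, as implied by the printed shapes
sizes = split_sizes(188407)
print(sizes)
```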


## Build the graph

Here, we'll build the graph. First up, defining the hyperparameters.

* `lstm_size`: Number of units in the hidden layers in the LSTM cells. Larger values often improve accuracy, at the cost of more memory and compute. Common values are 128, 256, 512, etc.
* `lstm_layers`: Number of LSTM layers in the network.
* `batch_size`: The number of pages to feed the network in one training pass. Typically this should be set as high as you can go without running out of memory.
* `learning_rate`: Step size used by the optimizer when updating the weights.

In [16]:
lstm_size = 256
lstm_layers = 6
batch_size = 128
learning_rate = 0.0001
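
As a rough guide to how `batch_size` translates into training iterations, the arithmetic below uses the training-set size printed earlier (150725 rows) and the 20 epochs set in the training cell. It assumes, as the batching function later in the notebook does, that only full batches are used.

```python
train_rows = 150725   # training-set size printed earlier in the notebook
batch_size = 128
epochs = 20

batches_per_epoch = train_rows // batch_size                  # partial final batch is dropped
dropped_rows = train_rows - batches_per_epoch * batch_size    # rows skipped each epoch
total_iterations = batches_per_epoch * epochs

print(batches_per_epoch, dropped_rows, total_iterations)
```

With these values an epoch is 1177 iterations, which is useful when reading the iteration counter in the training log.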

For the network itself, we'll be passing in our 200-element page vectors. Each batch will contain `batch_size` vectors. We'll also be using dropout on the LSTM layers, so we'll make a placeholder for the keep probability.

Create the `inputs_`, `labels_`, and dropout `keep_prob` placeholders using `tf.placeholder`. `labels_` needs to be two-dimensional to work with some functions later. Since `keep_prob` is a scalar (a 0-dimensional tensor), you shouldn't provide a shape to `tf.placeholder`.

In [17]:
n_words = len(vocab_to_int)

# Create the graph object
graph = tf.Graph()
# Add nodes to the graph
with graph.as_default():
    inputs_ = tf.placeholder(tf.int32, [None, None], name='inputs')
    labels_ = tf.placeholder(tf.int32, [None, None], name='labels')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')

### Embedding

Now we'll add an embedding layer. We need to do this because there are ~50000 words in our vocabulary, and it is massively inefficient to one-hot encode that many words. Instead of one-hot encoding, we can have an embedding layer and use that layer as a lookup table. You could train an embedding layer using word2vec, then load it here. But it's fine to just make a new layer and let the network learn the weights.

Create the embedding lookup matrix as a `tf.Variable`. Use that embedding matrix to get the embedded vectors to pass to the LSTM cell with [`tf.nn.embedding_lookup`](https://www.tensorflow.org/api_docs/python/tf/nn/embedding_lookup). This function takes the embedding matrix and an input tensor, such as the page vectors, and returns another tensor with the embedded vectors. So, if the embedding layer has 300 units and the input has shape `[batch_size, seq_len]`, the function will return a tensor of shape `[batch_size, seq_len, 300]`.

In [18]:
# Size of the embedding vectors (number of units in the embedding layer)
embed_size = 300 

with graph.as_default():
    embedding = tf.Variable(tf.random_uniform((n_words, embed_size), -1, 1))
    embed = tf.nn.embedding_lookup(embedding, inputs_)
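
Conceptually, the lookup is just row indexing into the embedding matrix. The NumPy sketch below shows that on tiny, hypothetical sizes, and checks that it gives the same result as multiplying a one-hot encoding by the matrix. It illustrates the idea only and is not how TensorFlow implements the op.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words, embed_size = 6, 4          # tiny, hypothetical sizes
embedding = rng.uniform(-1, 1, size=(n_words, embed_size))

# A batch of 2 sequences with 3 word ids each (0 is the padding / <UNK> id)
inputs = np.array([[0, 2, 5],
                   [1, 1, 3]])

# Lookup is row indexing: one embedding row per word id
embed = embedding[inputs]           # shape (batch, seq_len, embed_size)

# Same result via an explicit one-hot matrix product, which the lookup avoids
one_hot = np.eye(n_words)[inputs]   # shape (batch, seq_len, n_words)
embed_via_matmul = one_hot @ embedding

print(embed.shape)
print(np.allclose(embed, embed_via_matmul))
```

With a vocabulary of roughly 50000 words, the one-hot tensor would have 50000 entries per word, almost all of them zero, which is why the lookup table is the more efficient representation.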

### LSTM cell

<img src="assets/network_diagram.png" width=400px>

Next, we'll create our LSTM cells to use in the recurrent network ([TensorFlow documentation](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn)). Here we are just defining what the cells look like. This isn't actually building the graph, just defining the type of cells we want in our graph.

To create a basic LSTM cell for the graph, you'll want to use `tf.contrib.rnn.BasicLSTMCell`. Looking at the function documentation:

```
tf.contrib.rnn.BasicLSTMCell(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)
```

you can see it takes a parameter called `num_units`, the number of units in the cell, called `lstm_size` in this code. So then, you can write something like 

```
lstm = tf.contrib.rnn.BasicLSTMCell(num_units)
```

to create an LSTM cell with `num_units` units. Next, you can add dropout to the cell with `tf.contrib.rnn.DropoutWrapper`. This just wraps the cell in another cell, but with dropout added to the inputs and/or outputs. It's a convenient way to regularize the network with very little code. So you'd do something like

```
drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
```
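
To make the effect of `output_keep_prob` concrete, here is a minimal NumPy sketch of "inverted" dropout, the scheme used by `tf.nn.dropout`: each element is kept with probability `keep_prob`, and kept elements are scaled by `1 / keep_prob` so the expected value is unchanged. The helper name `inverted_dropout` and the array sizes are hypothetical.

```python
import numpy as np

def inverted_dropout(x, keep_prob, rng):
    """Zero each element with probability 1 - keep_prob; rescale the rest by 1 / keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(42)
activations = np.ones((1000, 256))

# Training-style call: about half of the outputs are zeroed, the rest are doubled
dropped = inverted_dropout(activations, keep_prob=0.5, rng=rng)
kept_fraction = np.count_nonzero(dropped) / dropped.size

# With keep_prob = 1 (as used for validation) nothing is dropped
unchanged = inverted_dropout(activations, keep_prob=1.0, rng=rng)

print(round(kept_fraction, 2), round(float(dropped.mean()), 2))
```

This is why `keep_prob` is a placeholder: it is fed as 0.5 during training and as 1 during validation.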

Most of the time, your network will have better performance with more layers. That's sort of the magic of deep learning: adding more layers allows the network to learn really complex relationships. Again, there is a simple way to create multiple layers of LSTM cells with `tf.contrib.rnn.MultiRNNCell`:

```
cell = tf.contrib.rnn.MultiRNNCell([drop] * lstm_layers)
```

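Here, `[drop] * lstm_layers` creates a list of cells (`drop`) that is `lstm_layers` long. The `MultiRNNCell` wrapper builds this into multiple layers of RNN cells, one for each cell in the list. Note that this list repeats the same cell object. That works in TensorFlow 1.0, but in later TensorFlow 1.x releases the same pattern either raises an error or shares one set of weights across layers, so there each layer should get its own cell object, for example from a list comprehension that constructs a new wrapped cell on every iteration.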

So the final cell you're using in the network is actually multiple (or just one) LSTM cells with dropout. But it all works the same from an architectural viewpoint; there is just a more complicated graph inside the cell.

Here is [a tutorial on building RNNs](https://www.tensorflow.org/tutorials/recurrent).
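
For intuition about what a single LSTM cell computes and what `num_units` controls, here is a NumPy sketch of one time step using the standard LSTM update equations. The function names, the gate ordering, and the kernel layout are assumptions made for illustration and need not match TensorFlow's internals. The parameter count at the end uses this notebook's sizes (300-dimensional embeddings, 256 units).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, kernel, bias, forget_bias=1.0):
    """One time step of a standard LSTM cell for a batch of inputs.

    x: (batch, input_dim); h and c: (batch, num_units)
    kernel: (input_dim + num_units, 4 * num_units); bias: (4 * num_units,)
    """
    gates = np.concatenate([x, h], axis=1) @ kernel + bias
    i, j, f, o = np.split(gates, 4, axis=1)   # input gate, candidate, forget gate, output gate
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    return new_h, new_c

def lstm_param_count(input_dim, num_units):
    """Weights plus biases for one cell with the kernel layout above."""
    return (input_dim + num_units) * 4 * num_units + 4 * num_units

rng = np.random.default_rng(0)
batch, input_dim, num_units = 2, 3, 5          # tiny, hypothetical sizes
kernel = rng.normal(scale=0.1, size=(input_dim + num_units, 4 * num_units))
bias = np.zeros(4 * num_units)

h = np.zeros((batch, num_units))               # all-zero initial state
c = np.zeros((batch, num_units))
x = rng.normal(size=(batch, input_dim))
h, c = lstm_step(x, h, c, kernel, bias)

# The first layer sees 300-d embeddings; later layers see 256-d outputs of the layer below
first_layer_params = lstm_param_count(300, 256)
later_layer_params = lstm_param_count(256, 256)
print(h.shape, first_layer_params, later_layer_params)
```

The first layer has a differently shaped kernel from the layers above it, which is one reason stacked layers need their own weights.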


In [19]:
with graph.as_default():
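    def build_cell():
        # Your basic LSTM cell
        lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
        # Add dropout to the cell
        return tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)

    # Stack up multiple LSTM layers, for deep learning.
    # Each layer gets its own cell object so that layers never share weights,
    # which also keeps the code working on TensorFlow releases after 1.0.
    cell = tf.contrib.rnn.MultiRNNCell([build_cell() for _ in range(lstm_layers)])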
    
    # Getting an initial state of all zeros
    initial_state = cell.zero_state(batch_size, tf.float32)

### RNN forward pass

<img src="assets/network_diagram.png" width=400px>

Now we need to actually run the data through the RNN nodes. You can use [`tf.nn.dynamic_rnn`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn) to do this. You'd pass in the RNN cell you created (our multiple layered LSTM `cell` for instance), and the inputs to the network.

```
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)
```

Above I created an initial state, `initial_state`, to pass to the RNN. This is the cell state that is passed between the hidden layers in successive time steps. `tf.nn.dynamic_rnn` takes care of most of the work for us. We pass in our cell and the input to the cell, then it does the unrolling and everything else for us. It returns outputs for each time step and the final_state of the hidden layer.

Add the forward pass through the RNN. Remember that we're actually passing in vectors from the embedding layer, `embed`.

In [20]:
with graph.as_default():
    outputs, final_state = tf.nn.dynamic_rnn(cell, embed, initial_state=initial_state)
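
The "unrolling" that `tf.nn.dynamic_rnn` performs can be pictured as a loop over the time axis that feeds each step's state into the next step. The NumPy sketch below does this with a plain tanh RNN cell as a stand-in for the LSTM stack. The names `simple_rnn_cell` and `unroll` and all sizes are hypothetical.

```python
import numpy as np

def simple_rnn_cell(x, state, w_x, w_h, b):
    """A plain tanh RNN cell, used here only as a stand-in for the LSTM stack."""
    new_state = np.tanh(x @ w_x + state @ w_h + b)
    return new_state, new_state               # (output, new state)

def unroll(cell_fn, inputs, initial_state):
    """Apply cell_fn along the time axis of inputs with shape (batch, time, features)."""
    state = initial_state
    outputs = []
    for t in range(inputs.shape[1]):
        output, state = cell_fn(inputs[:, t, :], state)
        outputs.append(output)
    return np.stack(outputs, axis=1), state   # (batch, time, units), final state

rng = np.random.default_rng(1)
batch, steps, features, units = 2, 4, 3, 5    # tiny, hypothetical sizes
w_x = rng.normal(scale=0.5, size=(features, units))
w_h = rng.normal(scale=0.5, size=(units, units))
b = np.zeros(units)

embed = rng.normal(size=(batch, steps, features))
initial_state = np.zeros((batch, units))

outputs, final_state = unroll(
    lambda x, s: simple_rnn_cell(x, s, w_x, w_h, b), embed, initial_state)
print(outputs.shape, final_state.shape)
```

For this simple cell the last output equals the final state. For an LSTM the final state also carries the cell state, so it is a pair rather than a single array.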

### Output

We only care about the final output, which we'll use as our prediction. So we need to grab the last output with `outputs[:, -1]`, then calculate the cost from that and `labels_`.

In [21]:
with graph.as_default():
    predictions = tf.contrib.layers.fully_connected(outputs[:, -1], 1, activation_fn=tf.sigmoid)
    cost = tf.losses.mean_squared_error(labels_, predictions)
    
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
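
The same three steps (take the last time step, apply a single sigmoid unit, compute the mean squared error) can be sketched in NumPy. The random arrays stand in for real RNN outputs and trained weights, so the numbers are meaningless; only the shapes and operations are the point.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
batch, steps, units = 4, 6, 5                  # tiny, hypothetical sizes
outputs = rng.normal(size=(batch, steps, units))
labels = np.array([[1], [0], [1], [0]])        # shape (batch, 1), like labels_

# Keep only the last time step of every sequence
last_output = outputs[:, -1]                   # shape (batch, units)

# Single-unit dense layer followed by a sigmoid
weights = rng.normal(scale=0.1, size=(units, 1))
bias = np.zeros(1)
predictions = sigmoid(last_output @ weights + bias)   # shape (batch, 1), values in (0, 1)

# Mean squared error between predictions and labels
cost = np.mean((labels - predictions) ** 2)
print(predictions.shape, float(cost))
```

The labels need shape `(batch, 1)` to line up with the predictions, which is why `labels_` was declared as a two-dimensional placeholder.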

### Validation accuracy

Here we can add a few nodes to calculate the accuracy which we'll use in the validation pass.

In [22]:
with graph.as_default():
    correct_pred = tf.equal(tf.cast(tf.round(predictions), tf.int32), labels_)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
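
Here is the same accuracy calculation on a handful of made-up predictions, written in NumPy. Rounding the sigmoid output acts as a threshold at 0.5.

```python
import numpy as np

predictions = np.array([[0.92], [0.08], [0.61], [0.35], [0.49]])   # made-up sigmoid outputs
labels = np.array([[1], [0], [0], [1], [0]])

# Threshold at 0.5 by rounding, then compare with the integer labels
predicted_class = np.round(predictions).astype(np.int32)
correct_pred = predicted_class == labels
accuracy = correct_pred.astype(np.float32).mean()

print(predicted_class.ravel(), accuracy)
```

Three of the five toy predictions match their labels, so the accuracy is 0.6.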

### Batching

This is a simple function for returning batches from our data. First it removes data such that we only have full batches. Then it iterates through the `x` and `y` arrays and returns slices out of those arrays with size `[batch_size]`.

In [23]:
def get_batches(x, y, batch_size=100):
    
    n_batches = len(x)//batch_size
    x, y = x[:n_batches*batch_size], y[:n_batches*batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii+batch_size], y[ii:ii+batch_size]
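
A quick usage example on toy arrays shows the partial batch being dropped. The function is repeated so that the snippet runs on its own; the `toy_` arrays are made up.

```python
import numpy as np

def get_batches(x, y, batch_size=100):
    """Yield consecutive full batches of x and y; leftover rows are dropped."""
    n_batches = len(x) // batch_size
    x, y = x[:n_batches * batch_size], y[:n_batches * batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii + batch_size], y[ii:ii + batch_size]

toy_x = np.arange(20).reshape(10, 2)   # 10 rows of made-up features
toy_y = np.arange(10)                  # 10 made-up labels

batches = list(get_batches(toy_x, toy_y, batch_size=4))
batch_shapes = [(bx.shape, by.shape) for bx, by in batches]
print(len(batches), batch_shapes)
```

With 10 rows and a batch size of 4, only two batches are produced and the last two rows are never seen. Keeping every batch the same size matters here because the initial state was created for exactly `batch_size` rows.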

## Training

In [24]:
epochs = 20

with graph.as_default():
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    iteration = 1
    for e in range(epochs):
        state = sess.run(initial_state)
        
        for ii, (x, y) in enumerate(get_batches(train_x, train_y, batch_size), 1):
            feed = {inputs_: x,
                    labels_: y[:, None],
                    keep_prob: 0.5,
                    initial_state: state}
            loss, state, _ = sess.run([cost, final_state, optimizer], feed_dict=feed)
            
            if iteration%5==0:
                print("Epoch: {}/{}".format(e, epochs),
                      "Iteration: {}".format(iteration),
                      "Train loss: {:.3f}".format(loss))

            if iteration%25==0:
                val_acc = []
                # Reuse the existing zero-state op instead of adding new nodes to the graph on every validation pass
                val_state = sess.run(initial_state)
                for x, y in get_batches(val_x, val_y, batch_size):
                    feed = {inputs_: x,
                            labels_: y[:, None],
                            keep_prob: 1,
                            initial_state: val_state}
                    batch_acc, val_state = sess.run([accuracy, final_state], feed_dict=feed)
                    val_acc.append(batch_acc)
                print("Val acc: {:.3f}".format(np.mean(val_acc)))
            iteration +=1
    saver.save(sess, "checkpoints/is_person.ckpt")

Epoch: 0/20 Iteration: 5 Train loss: 0.248
Epoch: 0/20 Iteration: 10 Train loss: 0.239
Epoch: 0/20 Iteration: 15 Train loss: 0.205
Epoch: 0/20 Iteration: 20 Train loss: 0.154
Epoch: 0/20 Iteration: 25 Train loss: 0.139
Val acc: 0.827
Epoch: 0/20 Iteration: 30 Train loss: 0.120
Epoch: 0/20 Iteration: 35 Train loss: 0.141
Epoch: 0/20 Iteration: 40 Train loss: 0.107
Epoch: 0/20 Iteration: 45 Train loss: 0.094
Epoch: 0/20 Iteration: 50 Train loss: 0.062
Val acc: 0.891
Epoch: 0/20 Iteration: 55 Train loss: 0.094
Epoch: 0/20 Iteration: 60 Train loss: 0.048
Epoch: 0/20 Iteration: 65 Train loss: 0.097
Epoch: 0/20 Iteration: 70 Train loss: 0.076
Epoch: 0/20 Iteration: 75 Train loss: 0.039
Val acc: 0.910
Epoch: 0/20 Iteration: 80 Train loss: 0.059
Epoch: 0/20 Iteration: 85 Train loss: 0.063
Epoch: 0/20 Iteration: 90 Train loss: 0.054
Epoch: 0/20 Iteration: 95 Train loss: 0.066
Epoch: 0/20 Iteration: 100 Train loss: 0.063
Val acc: 0.917
Epoch: 0/20 Iteration: 105 Train loss: 0.034
Epoch: 0/20 Ite

Epoch: 0/20 Iteration: 865 Train loss: 0.058
Epoch: 0/20 Iteration: 870 Train loss: 0.071
Epoch: 0/20 Iteration: 875 Train loss: 0.031
Val acc: 0.951
Epoch: 0/20 Iteration: 880 Train loss: 0.027
Epoch: 0/20 Iteration: 885 Train loss: 0.034
Epoch: 0/20 Iteration: 890 Train loss: 0.031
Epoch: 0/20 Iteration: 895 Train loss: 0.042
Epoch: 0/20 Iteration: 900 Train loss: 0.046
Val acc: 0.950
Epoch: 0/20 Iteration: 905 Train loss: 0.062
Epoch: 0/20 Iteration: 910 Train loss: 0.054
Epoch: 0/20 Iteration: 915 Train loss: 0.035
Epoch: 0/20 Iteration: 920 Train loss: 0.051
Epoch: 0/20 Iteration: 925 Train loss: 0.048
Val acc: 0.945
Epoch: 0/20 Iteration: 930 Train loss: 0.061
Epoch: 0/20 Iteration: 935 Train loss: 0.039
Epoch: 0/20 Iteration: 940 Train loss: 0.062
Epoch: 0/20 Iteration: 945 Train loss: 0.042
Epoch: 0/20 Iteration: 950 Train loss: 0.040
Val acc: 0.948
Epoch: 0/20 Iteration: 955 Train loss: 0.042
Epoch: 0/20 Iteration: 960 Train loss: 0.046
Epoch: 0/20 Iteration: 965 Train loss: 0

Val acc: 0.950
Epoch: 1/20 Iteration: 1705 Train loss: 0.076
Epoch: 1/20 Iteration: 1710 Train loss: 0.045
Epoch: 1/20 Iteration: 1715 Train loss: 0.037
Epoch: 1/20 Iteration: 1720 Train loss: 0.061
Epoch: 1/20 Iteration: 1725 Train loss: 0.038
Val acc: 0.950
Epoch: 1/20 Iteration: 1730 Train loss: 0.025
Epoch: 1/20 Iteration: 1735 Train loss: 0.043
Epoch: 1/20 Iteration: 1740 Train loss: 0.024
Epoch: 1/20 Iteration: 1745 Train loss: 0.068
Epoch: 1/20 Iteration: 1750 Train loss: 0.040
Val acc: 0.953
Epoch: 1/20 Iteration: 1755 Train loss: 0.050
Epoch: 1/20 Iteration: 1760 Train loss: 0.016
Epoch: 1/20 Iteration: 1765 Train loss: 0.047
Epoch: 1/20 Iteration: 1770 Train loss: 0.031
Epoch: 1/20 Iteration: 1775 Train loss: 0.099
Val acc: 0.946
Epoch: 1/20 Iteration: 1780 Train loss: 0.032
Epoch: 1/20 Iteration: 1785 Train loss: 0.021
Epoch: 1/20 Iteration: 1790 Train loss: 0.016
Epoch: 1/20 Iteration: 1795 Train loss: 0.037
Epoch: 1/20 Iteration: 1800 Train loss: 0.036
Val acc: 0.952
Epoch

Epoch: 2/20 Iteration: 2545 Train loss: 0.031
Epoch: 2/20 Iteration: 2550 Train loss: 0.027
Val acc: 0.954
Epoch: 2/20 Iteration: 2555 Train loss: 0.028
Epoch: 2/20 Iteration: 2560 Train loss: 0.041
Epoch: 2/20 Iteration: 2565 Train loss: 0.015
Epoch: 2/20 Iteration: 2570 Train loss: 0.041
Epoch: 2/20 Iteration: 2575 Train loss: 0.037
Val acc: 0.953
Epoch: 2/20 Iteration: 2580 Train loss: 0.035
Epoch: 2/20 Iteration: 2585 Train loss: 0.048
Epoch: 2/20 Iteration: 2590 Train loss: 0.031
Epoch: 2/20 Iteration: 2595 Train loss: 0.037
Epoch: 2/20 Iteration: 2600 Train loss: 0.030
Val acc: 0.954
Epoch: 2/20 Iteration: 2605 Train loss: 0.024
Epoch: 2/20 Iteration: 2610 Train loss: 0.025
Epoch: 2/20 Iteration: 2615 Train loss: 0.028
Epoch: 2/20 Iteration: 2620 Train loss: 0.027
Epoch: 2/20 Iteration: 2625 Train loss: 0.050
Val acc: 0.949
Epoch: 2/20 Iteration: 2630 Train loss: 0.028
Epoch: 2/20 Iteration: 2635 Train loss: 0.027
Epoch: 2/20 Iteration: 2640 Train loss: 0.024
Epoch: 2/20 Iteratio

Epoch: 2/20 Iteration: 3385 Train loss: 0.021
Epoch: 2/20 Iteration: 3390 Train loss: 0.013
Epoch: 2/20 Iteration: 3395 Train loss: 0.019
Epoch: 2/20 Iteration: 3400 Train loss: 0.055
Val acc: 0.953
Epoch: 2/20 Iteration: 3405 Train loss: 0.046
Epoch: 2/20 Iteration: 3410 Train loss: 0.043
Epoch: 2/20 Iteration: 3415 Train loss: 0.050
Epoch: 2/20 Iteration: 3420 Train loss: 0.024
Epoch: 2/20 Iteration: 3425 Train loss: 0.029
Val acc: 0.951
Epoch: 2/20 Iteration: 3430 Train loss: 0.079
Epoch: 2/20 Iteration: 3435 Train loss: 0.023
Epoch: 2/20 Iteration: 3440 Train loss: 0.054
Epoch: 2/20 Iteration: 3445 Train loss: 0.032
Epoch: 2/20 Iteration: 3450 Train loss: 0.052
Val acc: 0.954
Epoch: 2/20 Iteration: 3455 Train loss: 0.027
Epoch: 2/20 Iteration: 3460 Train loss: 0.029
Epoch: 2/20 Iteration: 3465 Train loss: 0.044
Epoch: 2/20 Iteration: 3470 Train loss: 0.018
Epoch: 2/20 Iteration: 3475 Train loss: 0.063
Val acc: 0.952
Epoch: 2/20 Iteration: 3480 Train loss: 0.036
Epoch: 2/20 Iteratio

Epoch: 3/20 Iteration: 4225 Train loss: 0.049
Val acc: 0.954
Epoch: 3/20 Iteration: 4230 Train loss: 0.031
Epoch: 3/20 Iteration: 4235 Train loss: 0.036
Epoch: 3/20 Iteration: 4240 Train loss: 0.031
Epoch: 3/20 Iteration: 4245 Train loss: 0.020
Epoch: 3/20 Iteration: 4250 Train loss: 0.026
Val acc: 0.956
Epoch: 3/20 Iteration: 4255 Train loss: 0.038
Epoch: 3/20 Iteration: 4260 Train loss: 0.031
Epoch: 3/20 Iteration: 4265 Train loss: 0.016
Epoch: 3/20 Iteration: 4270 Train loss: 0.010
Epoch: 3/20 Iteration: 4275 Train loss: 0.039
Val acc: 0.954
Epoch: 3/20 Iteration: 4280 Train loss: 0.045
Epoch: 3/20 Iteration: 4285 Train loss: 0.050
Epoch: 3/20 Iteration: 4290 Train loss: 0.049
Epoch: 3/20 Iteration: 4295 Train loss: 0.019
Epoch: 3/20 Iteration: 4300 Train loss: 0.019
Val acc: 0.956
Epoch: 3/20 Iteration: 4305 Train loss: 0.015
Epoch: 3/20 Iteration: 4310 Train loss: 0.020
Epoch: 3/20 Iteration: 4315 Train loss: 0.061
Epoch: 3/20 Iteration: 4320 Train loss: 0.021
Epoch: 3/20 Iteratio

Epoch: 4/20 Iteration: 5065 Train loss: 0.040
Epoch: 4/20 Iteration: 5070 Train loss: 0.022
Epoch: 4/20 Iteration: 5075 Train loss: 0.042
Val acc: 0.956
Epoch: 4/20 Iteration: 5080 Train loss: 0.033
Epoch: 4/20 Iteration: 5085 Train loss: 0.027
Epoch: 4/20 Iteration: 5090 Train loss: 0.064
Epoch: 4/20 Iteration: 5095 Train loss: 0.028
Epoch: 4/20 Iteration: 5100 Train loss: 0.021
Val acc: 0.957
Epoch: 4/20 Iteration: 5105 Train loss: 0.021
Epoch: 4/20 Iteration: 5110 Train loss: 0.012
Epoch: 4/20 Iteration: 5115 Train loss: 0.014
Epoch: 4/20 Iteration: 5120 Train loss: 0.033
Epoch: 4/20 Iteration: 5125 Train loss: 0.030
Val acc: 0.958
Epoch: 4/20 Iteration: 5130 Train loss: 0.027
Epoch: 4/20 Iteration: 5135 Train loss: 0.044
Epoch: 4/20 Iteration: 5140 Train loss: 0.020
Epoch: 4/20 Iteration: 5145 Train loss: 0.045
Epoch: 4/20 Iteration: 5150 Train loss: 0.005
Val acc: 0.957
Epoch: 4/20 Iteration: 5155 Train loss: 0.020
Epoch: 4/20 Iteration: 5160 Train loss: 0.025
Epoch: 4/20 Iteratio

Val acc: 0.958
Epoch: 5/20 Iteration: 5905 Train loss: 0.019
Epoch: 5/20 Iteration: 5910 Train loss: 0.032
Epoch: 5/20 Iteration: 5915 Train loss: 0.045
Epoch: 5/20 Iteration: 5920 Train loss: 0.032
Epoch: 5/20 Iteration: 5925 Train loss: 0.003
Val acc: 0.959
Epoch: 5/20 Iteration: 5930 Train loss: 0.037
Epoch: 5/20 Iteration: 5935 Train loss: 0.018
Epoch: 5/20 Iteration: 5940 Train loss: 0.023
Epoch: 5/20 Iteration: 5945 Train loss: 0.023
Epoch: 5/20 Iteration: 5950 Train loss: 0.026
Val acc: 0.958
Epoch: 5/20 Iteration: 5955 Train loss: 0.021
Epoch: 5/20 Iteration: 5960 Train loss: 0.011
Epoch: 5/20 Iteration: 5965 Train loss: 0.019
Epoch: 5/20 Iteration: 5970 Train loss: 0.022
Epoch: 5/20 Iteration: 5975 Train loss: 0.016
Val acc: 0.959
Epoch: 5/20 Iteration: 5980 Train loss: 0.024
Epoch: 5/20 Iteration: 5985 Train loss: 0.014
Epoch: 5/20 Iteration: 5990 Train loss: 0.011
Epoch: 5/20 Iteration: 5995 Train loss: 0.030
Epoch: 5/20 Iteration: 6000 Train loss: 0.022
Val acc: 0.958
Epoch

Epoch: 5/20 Iteration: 6745 Train loss: 0.019
Epoch: 5/20 Iteration: 6750 Train loss: 0.042
Val acc: 0.958
Epoch: 5/20 Iteration: 6755 Train loss: 0.047
Epoch: 5/20 Iteration: 6760 Train loss: 0.044
Epoch: 5/20 Iteration: 6765 Train loss: 0.015
Epoch: 5/20 Iteration: 6770 Train loss: 0.033
Epoch: 5/20 Iteration: 6775 Train loss: 0.028
Val acc: 0.951
Epoch: 5/20 Iteration: 6780 Train loss: 0.035
Epoch: 5/20 Iteration: 6785 Train loss: 0.037
Epoch: 5/20 Iteration: 6790 Train loss: 0.058
Epoch: 5/20 Iteration: 6795 Train loss: 0.051
Epoch: 5/20 Iteration: 6800 Train loss: 0.032
Val acc: 0.956
Epoch: 5/20 Iteration: 6805 Train loss: 0.038
Epoch: 5/20 Iteration: 6810 Train loss: 0.031
Epoch: 5/20 Iteration: 6815 Train loss: 0.049
Epoch: 5/20 Iteration: 6820 Train loss: 0.034
Epoch: 5/20 Iteration: 6825 Train loss: 0.035
Val acc: 0.958
Epoch: 5/20 Iteration: 6830 Train loss: 0.011
Epoch: 5/20 Iteration: 6835 Train loss: 0.032
Epoch: 5/20 Iteration: 6840 Train loss: 0.036
Epoch: 5/20 Iteratio

Epoch: 6/20 Iteration: 7585 Train loss: 0.017
Epoch: 6/20 Iteration: 7590 Train loss: 0.053
Epoch: 6/20 Iteration: 7595 Train loss: 0.059
Epoch: 6/20 Iteration: 7600 Train loss: 0.040
Val acc: 0.952
Epoch: 6/20 Iteration: 7605 Train loss: 0.053
Epoch: 6/20 Iteration: 7610 Train loss: 0.023
Epoch: 6/20 Iteration: 7615 Train loss: 0.026
Epoch: 6/20 Iteration: 7620 Train loss: 0.036
Epoch: 6/20 Iteration: 7625 Train loss: 0.012
Val acc: 0.959
Epoch: 6/20 Iteration: 7630 Train loss: 0.049
Epoch: 6/20 Iteration: 7635 Train loss: 0.033
Epoch: 6/20 Iteration: 7640 Train loss: 0.057
Epoch: 6/20 Iteration: 7645 Train loss: 0.040
Epoch: 6/20 Iteration: 7650 Train loss: 0.057
Val acc: 0.953
Epoch: 6/20 Iteration: 7655 Train loss: 0.034
Epoch: 6/20 Iteration: 7660 Train loss: 0.052
Epoch: 6/20 Iteration: 7665 Train loss: 0.034
Epoch: 6/20 Iteration: 7670 Train loss: 0.027
Epoch: 6/20 Iteration: 7675 Train loss: 0.012
Val acc: 0.957
Epoch: 6/20 Iteration: 7680 Train loss: 0.021
Epoch: 6/20 Iteratio

Epoch: 7/20 Iteration: 8425 Train loss: 0.036
Val acc: 0.961
Epoch: 7/20 Iteration: 8430 Train loss: 0.029
Epoch: 7/20 Iteration: 8435 Train loss: 0.020
Epoch: 7/20 Iteration: 8440 Train loss: 0.024
Epoch: 7/20 Iteration: 8445 Train loss: 0.034
Epoch: 7/20 Iteration: 8450 Train loss: 0.010
Val acc: 0.961
Epoch: 7/20 Iteration: 8455 Train loss: 0.048
Epoch: 7/20 Iteration: 8460 Train loss: 0.043
Epoch: 7/20 Iteration: 8465 Train loss: 0.028
Epoch: 7/20 Iteration: 8470 Train loss: 0.042
Epoch: 7/20 Iteration: 8475 Train loss: 0.039
Val acc: 0.956
Epoch: 7/20 Iteration: 8480 Train loss: 0.040
Epoch: 7/20 Iteration: 8485 Train loss: 0.017
Epoch: 7/20 Iteration: 8490 Train loss: 0.023
Epoch: 7/20 Iteration: 8495 Train loss: 0.021
Epoch: 7/20 Iteration: 8500 Train loss: 0.017
Val acc: 0.959
Epoch: 7/20 Iteration: 8505 Train loss: 0.023
Epoch: 7/20 Iteration: 8510 Train loss: 0.046
Epoch: 7/20 Iteration: 8515 Train loss: 0.015
Epoch: 7/20 Iteration: 8520 Train loss: 0.033
Epoch: 7/20 Iteratio

Epoch: 7/20 Iteration: 9265 Train loss: 0.029
Epoch: 7/20 Iteration: 9270 Train loss: 0.013
Epoch: 7/20 Iteration: 9275 Train loss: 0.007
Val acc: 0.961
Epoch: 7/20 Iteration: 9280 Train loss: 0.018
Epoch: 7/20 Iteration: 9285 Train loss: 0.053
Epoch: 7/20 Iteration: 9290 Train loss: 0.030
Epoch: 7/20 Iteration: 9295 Train loss: 0.031
Epoch: 7/20 Iteration: 9300 Train loss: 0.044
Val acc: 0.961
Epoch: 7/20 Iteration: 9305 Train loss: 0.012
Epoch: 7/20 Iteration: 9310 Train loss: 0.016
Epoch: 7/20 Iteration: 9315 Train loss: 0.040
Epoch: 7/20 Iteration: 9320 Train loss: 0.032
Epoch: 7/20 Iteration: 9325 Train loss: 0.033
Val acc: 0.961
Epoch: 7/20 Iteration: 9330 Train loss: 0.020
Epoch: 7/20 Iteration: 9335 Train loss: 0.041
Epoch: 7/20 Iteration: 9340 Train loss: 0.025
Epoch: 7/20 Iteration: 9345 Train loss: 0.021
Epoch: 7/20 Iteration: 9350 Train loss: 0.048
Val acc: 0.955
Epoch: 7/20 Iteration: 9355 Train loss: 0.015
Epoch: 7/20 Iteration: 9360 Train loss: 0.045
Epoch: 7/20 Iteratio

Epoch: 8/20 Iteration: 10100 Train loss: 0.033
Val acc: 0.961
Epoch: 8/20 Iteration: 10105 Train loss: 0.017
Epoch: 8/20 Iteration: 10110 Train loss: 0.044
Epoch: 8/20 Iteration: 10115 Train loss: 0.024
Epoch: 8/20 Iteration: 10120 Train loss: 0.030
Epoch: 8/20 Iteration: 10125 Train loss: 0.033
Val acc: 0.957
Epoch: 8/20 Iteration: 10130 Train loss: 0.020
Epoch: 8/20 Iteration: 10135 Train loss: 0.019
Epoch: 8/20 Iteration: 10140 Train loss: 0.032
Epoch: 8/20 Iteration: 10145 Train loss: 0.021
Epoch: 8/20 Iteration: 10150 Train loss: 0.016
Val acc: 0.960
Epoch: 8/20 Iteration: 10155 Train loss: 0.006
Epoch: 8/20 Iteration: 10160 Train loss: 0.009
Epoch: 8/20 Iteration: 10165 Train loss: 0.037
Epoch: 8/20 Iteration: 10170 Train loss: 0.041
Epoch: 8/20 Iteration: 10175 Train loss: 0.040
Val acc: 0.959
Epoch: 8/20 Iteration: 10180 Train loss: 0.012
Epoch: 8/20 Iteration: 10185 Train loss: 0.022
Epoch: 8/20 Iteration: 10190 Train loss: 0.005
Epoch: 8/20 Iteration: 10195 Train loss: 0.015


Epoch: 9/20 Iteration: 10920 Train loss: 0.043
Epoch: 9/20 Iteration: 10925 Train loss: 0.039
Val acc: 0.950
Epoch: 9/20 Iteration: 10930 Train loss: 0.019
Epoch: 9/20 Iteration: 10935 Train loss: 0.032
Epoch: 9/20 Iteration: 10940 Train loss: 0.010
Epoch: 9/20 Iteration: 10945 Train loss: 0.058
Epoch: 9/20 Iteration: 10950 Train loss: 0.043
Val acc: 0.954
Epoch: 9/20 Iteration: 10955 Train loss: 0.031
Epoch: 9/20 Iteration: 10960 Train loss: 0.035
Epoch: 9/20 Iteration: 10965 Train loss: 0.024
Epoch: 9/20 Iteration: 10970 Train loss: 0.035
Epoch: 9/20 Iteration: 10975 Train loss: 0.063
Val acc: 0.958
Epoch: 9/20 Iteration: 10980 Train loss: 0.020
Epoch: 9/20 Iteration: 10985 Train loss: 0.042
Epoch: 9/20 Iteration: 10990 Train loss: 0.044
Epoch: 9/20 Iteration: 10995 Train loss: 0.025
Epoch: 9/20 Iteration: 11000 Train loss: 0.032
Val acc: 0.952
Epoch: 9/20 Iteration: 11005 Train loss: 0.045
Epoch: 9/20 Iteration: 11010 Train loss: 0.020
Epoch: 9/20 Iteration: 11015 Train loss: 0.036


Epoch: 9/20 Iteration: 11740 Train loss: 0.031
Epoch: 9/20 Iteration: 11745 Train loss: 0.031
Epoch: 9/20 Iteration: 11750 Train loss: 0.044
Val acc: 0.958
Epoch: 9/20 Iteration: 11755 Train loss: 0.016
Epoch: 9/20 Iteration: 11760 Train loss: 0.002
Epoch: 9/20 Iteration: 11765 Train loss: 0.011
Epoch: 9/20 Iteration: 11770 Train loss: 0.024
Epoch: 10/20 Iteration: 11775 Train loss: 0.026
Val acc: 0.957
Epoch: 10/20 Iteration: 11780 Train loss: 0.040
Epoch: 10/20 Iteration: 11785 Train loss: 0.037
Epoch: 10/20 Iteration: 11790 Train loss: 0.013
Epoch: 10/20 Iteration: 11795 Train loss: 0.018
Epoch: 10/20 Iteration: 11800 Train loss: 0.037
Val acc: 0.958
Epoch: 10/20 Iteration: 11805 Train loss: 0.037
Epoch: 10/20 Iteration: 11810 Train loss: 0.005
Epoch: 10/20 Iteration: 11815 Train loss: 0.028
Epoch: 10/20 Iteration: 11820 Train loss: 0.020
Epoch: 10/20 Iteration: 11825 Train loss: 0.022
Val acc: 0.960
Epoch: 10/20 Iteration: 11830 Train loss: 0.024
Epoch: 10/20 Iteration: 11835 Train

Epoch: 10/20 Iteration: 12545 Train loss: 0.018
Epoch: 10/20 Iteration: 12550 Train loss: 0.022
Val acc: 0.955
Epoch: 10/20 Iteration: 12555 Train loss: 0.053
Epoch: 10/20 Iteration: 12560 Train loss: 0.024
Epoch: 10/20 Iteration: 12565 Train loss: 0.043
Epoch: 10/20 Iteration: 12570 Train loss: 0.031
Epoch: 10/20 Iteration: 12575 Train loss: 0.020
Val acc: 0.957
Epoch: 10/20 Iteration: 12580 Train loss: 0.054
Epoch: 10/20 Iteration: 12585 Train loss: 0.037
Epoch: 10/20 Iteration: 12590 Train loss: 0.034
Epoch: 10/20 Iteration: 12595 Train loss: 0.006
Epoch: 10/20 Iteration: 12600 Train loss: 0.023
Val acc: 0.959
Epoch: 10/20 Iteration: 12605 Train loss: 0.013
Epoch: 10/20 Iteration: 12610 Train loss: 0.017
Epoch: 10/20 Iteration: 12615 Train loss: 0.028
Epoch: 10/20 Iteration: 12620 Train loss: 0.022
Epoch: 10/20 Iteration: 12625 Train loss: 0.016
Val acc: 0.958
Epoch: 10/20 Iteration: 12630 Train loss: 0.017
Epoch: 10/20 Iteration: 12635 Train loss: 0.041
Epoch: 10/20 Iteration: 1264

Epoch: 11/20 Iteration: 13350 Train loss: 0.033
Val acc: 0.961
Epoch: 11/20 Iteration: 13355 Train loss: 0.015
Epoch: 11/20 Iteration: 13360 Train loss: 0.040
Epoch: 11/20 Iteration: 13365 Train loss: 0.025
Epoch: 11/20 Iteration: 13370 Train loss: 0.006
Epoch: 11/20 Iteration: 13375 Train loss: 0.049
Val acc: 0.961
Epoch: 11/20 Iteration: 13380 Train loss: 0.034
Epoch: 11/20 Iteration: 13385 Train loss: 0.021
Epoch: 11/20 Iteration: 13390 Train loss: 0.017
Epoch: 11/20 Iteration: 13395 Train loss: 0.026
Epoch: 11/20 Iteration: 13400 Train loss: 0.029
Val acc: 0.961
Epoch: 11/20 Iteration: 13405 Train loss: 0.013
Epoch: 11/20 Iteration: 13410 Train loss: 0.061
Epoch: 11/20 Iteration: 13415 Train loss: 0.032
Epoch: 11/20 Iteration: 13420 Train loss: 0.036
Epoch: 11/20 Iteration: 13425 Train loss: 0.026
Val acc: 0.959
Epoch: 11/20 Iteration: 13430 Train loss: 0.025
Epoch: 11/20 Iteration: 13435 Train loss: 0.018
Epoch: 11/20 Iteration: 13440 Train loss: 0.055
Epoch: 11/20 Iteration: 1344

Val acc: 0.961
Epoch: 12/20 Iteration: 14155 Train loss: 0.037
Epoch: 12/20 Iteration: 14160 Train loss: 0.016
Epoch: 12/20 Iteration: 14165 Train loss: 0.016
Epoch: 12/20 Iteration: 14170 Train loss: 0.023
Epoch: 12/20 Iteration: 14175 Train loss: 0.026
Val acc: 0.962
Epoch: 12/20 Iteration: 14180 Train loss: 0.032
Epoch: 12/20 Iteration: 14185 Train loss: 0.021
Epoch: 12/20 Iteration: 14190 Train loss: 0.015
Epoch: 12/20 Iteration: 14195 Train loss: 0.029
Epoch: 12/20 Iteration: 14200 Train loss: 0.008
Val acc: 0.962
Epoch: 12/20 Iteration: 14205 Train loss: 0.022
Epoch: 12/20 Iteration: 14210 Train loss: 0.012
Epoch: 12/20 Iteration: 14215 Train loss: 0.021
Epoch: 12/20 Iteration: 14220 Train loss: 0.026
Epoch: 12/20 Iteration: 14225 Train loss: 0.019
Val acc: 0.961
Epoch: 12/20 Iteration: 14230 Train loss: 0.024
Epoch: 12/20 Iteration: 14235 Train loss: 0.032
Epoch: 12/20 Iteration: 14240 Train loss: 0.022
Epoch: 12/20 Iteration: 14245 Train loss: 0.016
Epoch: 12/20 Iteration: 1425

Epoch: 12/20 Iteration: 14960 Train loss: 0.018
Epoch: 12/20 Iteration: 14965 Train loss: 0.016
Epoch: 12/20 Iteration: 14970 Train loss: 0.039
Epoch: 12/20 Iteration: 14975 Train loss: 0.022
Val acc: 0.960
Epoch: 12/20 Iteration: 14980 Train loss: 0.031
Epoch: 12/20 Iteration: 14985 Train loss: 0.018
Epoch: 12/20 Iteration: 14990 Train loss: 0.010
Epoch: 12/20 Iteration: 14995 Train loss: 0.020
Epoch: 12/20 Iteration: 15000 Train loss: 0.019
Val acc: 0.959
Epoch: 12/20 Iteration: 15005 Train loss: 0.014
Epoch: 12/20 Iteration: 15010 Train loss: 0.036
Epoch: 12/20 Iteration: 15015 Train loss: 0.040
Epoch: 12/20 Iteration: 15020 Train loss: 0.008
Epoch: 12/20 Iteration: 15025 Train loss: 0.014
Val acc: 0.961
Epoch: 12/20 Iteration: 15030 Train loss: 0.027
Epoch: 12/20 Iteration: 15035 Train loss: 0.022
Epoch: 12/20 Iteration: 15040 Train loss: 0.028
Epoch: 12/20 Iteration: 15045 Train loss: 0.017
Epoch: 12/20 Iteration: 15050 Train loss: 0.011
Val acc: 0.958
Epoch: 12/20 Iteration: 1505

Epoch: 13/20 Iteration: 15765 Train loss: 0.034
Epoch: 13/20 Iteration: 15770 Train loss: 0.009
Epoch: 13/20 Iteration: 15775 Train loss: 0.009
Val acc: 0.959
Epoch: 13/20 Iteration: 15780 Train loss: 0.030
Epoch: 13/20 Iteration: 15785 Train loss: 0.042
Epoch: 13/20 Iteration: 15790 Train loss: 0.018
Epoch: 13/20 Iteration: 15795 Train loss: 0.033
Epoch: 13/20 Iteration: 15800 Train loss: 0.017
Val acc: 0.958
Epoch: 13/20 Iteration: 15805 Train loss: 0.025
Epoch: 13/20 Iteration: 15810 Train loss: 0.007
Epoch: 13/20 Iteration: 15815 Train loss: 0.012
Epoch: 13/20 Iteration: 15820 Train loss: 0.020
Epoch: 13/20 Iteration: 15825 Train loss: 0.014
Val acc: 0.961
Epoch: 13/20 Iteration: 15830 Train loss: 0.011
Epoch: 13/20 Iteration: 15835 Train loss: 0.019
Epoch: 13/20 Iteration: 15840 Train loss: 0.004
Epoch: 13/20 Iteration: 15845 Train loss: 0.013
Epoch: 13/20 Iteration: 15850 Train loss: 0.021
Val acc: 0.962
Epoch: 13/20 Iteration: 15855 Train loss: 0.027
Epoch: 13/20 Iteration: 1586

Epoch: 14/20 Iteration: 16570 Train loss: 0.019
Epoch: 14/20 Iteration: 16575 Train loss: 0.020
Val acc: 0.960
Epoch: 14/20 Iteration: 16580 Train loss: 0.022
Epoch: 14/20 Iteration: 16585 Train loss: 0.016
Epoch: 14/20 Iteration: 16590 Train loss: 0.020
Epoch: 14/20 Iteration: 16595 Train loss: 0.022
Epoch: 14/20 Iteration: 16600 Train loss: 0.028
Val acc: 0.961
Epoch: 14/20 Iteration: 16605 Train loss: 0.021
Epoch: 14/20 Iteration: 16610 Train loss: 0.031
Epoch: 14/20 Iteration: 16615 Train loss: 0.031
Epoch: 14/20 Iteration: 16620 Train loss: 0.021
Epoch: 14/20 Iteration: 16625 Train loss: 0.014
Val acc: 0.960
Epoch: 14/20 Iteration: 16630 Train loss: 0.018
Epoch: 14/20 Iteration: 16635 Train loss: 0.011
Epoch: 14/20 Iteration: 16640 Train loss: 0.022
Epoch: 14/20 Iteration: 16645 Train loss: 0.007
Epoch: 14/20 Iteration: 16650 Train loss: 0.012
Val acc: 0.957
Epoch: 14/20 Iteration: 16655 Train loss: 0.002
Epoch: 14/20 Iteration: 16660 Train loss: 0.020
Epoch: 14/20 Iteration: 1666

Epoch: 14/20 Iteration: 17375 Train loss: 0.006
Val acc: 0.962
Epoch: 14/20 Iteration: 17380 Train loss: 0.016
Epoch: 14/20 Iteration: 17385 Train loss: 0.039
Epoch: 14/20 Iteration: 17390 Train loss: 0.025
Epoch: 14/20 Iteration: 17395 Train loss: 0.009
Epoch: 14/20 Iteration: 17400 Train loss: 0.010
Val acc: 0.962
Epoch: 14/20 Iteration: 17405 Train loss: 0.015
Epoch: 14/20 Iteration: 17410 Train loss: 0.002
Epoch: 14/20 Iteration: 17415 Train loss: 0.016
Epoch: 14/20 Iteration: 17420 Train loss: 0.010
Epoch: 14/20 Iteration: 17425 Train loss: 0.050
Val acc: 0.961
Epoch: 14/20 Iteration: 17430 Train loss: 0.013
Epoch: 14/20 Iteration: 17435 Train loss: 0.034
Epoch: 14/20 Iteration: 17440 Train loss: 0.036
Epoch: 14/20 Iteration: 17445 Train loss: 0.019
Epoch: 14/20 Iteration: 17450 Train loss: 0.008
Val acc: 0.962
Epoch: 14/20 Iteration: 17455 Train loss: 0.010
Epoch: 14/20 Iteration: 17460 Train loss: 0.036
Epoch: 14/20 Iteration: 17465 Train loss: 0.012
Epoch: 14/20 Iteration: 1747

Val acc: 0.961
Epoch: 15/20 Iteration: 18180 Train loss: 0.020
Epoch: 15/20 Iteration: 18185 Train loss: 0.014
Epoch: 15/20 Iteration: 18190 Train loss: 0.013
Epoch: 15/20 Iteration: 18195 Train loss: 0.014
Epoch: 15/20 Iteration: 18200 Train loss: 0.016
Val acc: 0.961
Epoch: 15/20 Iteration: 18205 Train loss: 0.031
Epoch: 15/20 Iteration: 18210 Train loss: 0.016
Epoch: 15/20 Iteration: 18215 Train loss: 0.003
Epoch: 15/20 Iteration: 18220 Train loss: 0.011
Epoch: 15/20 Iteration: 18225 Train loss: 0.044
Val acc: 0.961
Epoch: 15/20 Iteration: 18230 Train loss: 0.016
Epoch: 15/20 Iteration: 18235 Train loss: 0.008
Epoch: 15/20 Iteration: 18240 Train loss: 0.032
Epoch: 15/20 Iteration: 18245 Train loss: 0.016
Epoch: 15/20 Iteration: 18250 Train loss: 0.020
Val acc: 0.960
Epoch: 15/20 Iteration: 18255 Train loss: 0.014
Epoch: 15/20 Iteration: 18260 Train loss: 0.009
Epoch: 15/20 Iteration: 18265 Train loss: 0.021
Epoch: 15/20 Iteration: 18270 Train loss: 0.019
Epoch: 15/20 Iteration: 1827

Epoch: 16/20 Iteration: 18985 Train loss: 0.016
Epoch: 16/20 Iteration: 18990 Train loss: 0.017
Epoch: 16/20 Iteration: 18995 Train loss: 0.020
Epoch: 16/20 Iteration: 19000 Train loss: 0.025
Val acc: 0.957
Epoch: 16/20 Iteration: 19005 Train loss: 0.019
Epoch: 16/20 Iteration: 19010 Train loss: 0.003
Epoch: 16/20 Iteration: 19015 Train loss: 0.027
Epoch: 16/20 Iteration: 19020 Train loss: 0.023
Epoch: 16/20 Iteration: 19025 Train loss: 0.008
Val acc: 0.961
Epoch: 16/20 Iteration: 19030 Train loss: 0.021
Epoch: 16/20 Iteration: 19035 Train loss: 0.024
Epoch: 16/20 Iteration: 19040 Train loss: 0.015
Epoch: 16/20 Iteration: 19045 Train loss: 0.019
Epoch: 16/20 Iteration: 19050 Train loss: 0.012
Val acc: 0.962
Epoch: 16/20 Iteration: 19055 Train loss: 0.025
Epoch: 16/20 Iteration: 19060 Train loss: 0.024
Epoch: 16/20 Iteration: 19065 Train loss: 0.031
Epoch: 16/20 Iteration: 19070 Train loss: 0.023
Epoch: 16/20 Iteration: 19075 Train loss: 0.004
Val acc: 0.961
Epoch: 16/20 Iteration: 1908

Epoch: 16/20 Iteration: 19790 Train loss: 0.011
Epoch: 16/20 Iteration: 19795 Train loss: 0.005
Epoch: 16/20 Iteration: 19800 Train loss: 0.009
Val acc: 0.962
Epoch: 16/20 Iteration: 19805 Train loss: 0.025
Epoch: 16/20 Iteration: 19810 Train loss: 0.019
Epoch: 16/20 Iteration: 19815 Train loss: 0.022
Epoch: 16/20 Iteration: 19820 Train loss: 0.017
Epoch: 16/20 Iteration: 19825 Train loss: 0.015
Val acc: 0.961
Epoch: 16/20 Iteration: 19830 Train loss: 0.027
Epoch: 16/20 Iteration: 19835 Train loss: 0.014
Epoch: 16/20 Iteration: 19840 Train loss: 0.009
Epoch: 16/20 Iteration: 19845 Train loss: 0.021
Epoch: 16/20 Iteration: 19850 Train loss: 0.016
Val acc: 0.962
Epoch: 16/20 Iteration: 19855 Train loss: 0.020
Epoch: 16/20 Iteration: 19860 Train loss: 0.005
Epoch: 16/20 Iteration: 19865 Train loss: 0.011
Epoch: 16/20 Iteration: 19870 Train loss: 0.026
Epoch: 16/20 Iteration: 19875 Train loss: 0.028
Val acc: 0.961
Epoch: 16/20 Iteration: 19880 Train loss: 0.012
Epoch: 16/20 Iteration: 1988

Epoch: 17/20 Iteration: 20595 Train loss: 0.021
Epoch: 17/20 Iteration: 20600 Train loss: 0.049
Val acc: 0.962
Epoch: 17/20 Iteration: 20605 Train loss: 0.016
Epoch: 17/20 Iteration: 20610 Train loss: 0.025
Epoch: 17/20 Iteration: 20615 Train loss: 0.004
Epoch: 17/20 Iteration: 20620 Train loss: 0.016
Epoch: 17/20 Iteration: 20625 Train loss: 0.015
Val acc: 0.958
Epoch: 17/20 Iteration: 20630 Train loss: 0.013
Epoch: 17/20 Iteration: 20635 Train loss: 0.022
Epoch: 17/20 Iteration: 20640 Train loss: 0.007
Epoch: 17/20 Iteration: 20645 Train loss: 0.016
Epoch: 17/20 Iteration: 20650 Train loss: 0.009
Val acc: 0.960
Epoch: 17/20 Iteration: 20655 Train loss: 0.027
Epoch: 17/20 Iteration: 20660 Train loss: 0.016
Epoch: 17/20 Iteration: 20665 Train loss: 0.009
Epoch: 17/20 Iteration: 20670 Train loss: 0.024
Epoch: 17/20 Iteration: 20675 Train loss: 0.019
Val acc: 0.961
Epoch: 17/20 Iteration: 20680 Train loss: 0.037
Epoch: 17/20 Iteration: 20685 Train loss: 0.009
Epoch: 17/20 Iteration: 2069

Epoch: 18/20 Iteration: 21400 Train loss: 0.007
Val acc: 0.962
Epoch: 18/20 Iteration: 21405 Train loss: 0.020
Epoch: 18/20 Iteration: 21410 Train loss: 0.030
Epoch: 18/20 Iteration: 21415 Train loss: 0.010
Epoch: 18/20 Iteration: 21420 Train loss: 0.018
Epoch: 18/20 Iteration: 21425 Train loss: 0.022
Val acc: 0.963
Epoch: 18/20 Iteration: 21430 Train loss: 0.026
Epoch: 18/20 Iteration: 21435 Train loss: 0.022
Epoch: 18/20 Iteration: 21440 Train loss: 0.019
Epoch: 18/20 Iteration: 21445 Train loss: 0.008
Epoch: 18/20 Iteration: 21450 Train loss: 0.018
Val acc: 0.961
Epoch: 18/20 Iteration: 21455 Train loss: 0.016
Epoch: 18/20 Iteration: 21460 Train loss: 0.026
Epoch: 18/20 Iteration: 21465 Train loss: 0.008
Epoch: 18/20 Iteration: 21470 Train loss: 0.019
Epoch: 18/20 Iteration: 21475 Train loss: 0.025
Val acc: 0.961
Epoch: 18/20 Iteration: 21480 Train loss: 0.026
Epoch: 18/20 Iteration: 21485 Train loss: 0.015
Epoch: 18/20 Iteration: 21490 Train loss: 0.011
Epoch: 18/20 Iteration: 2149

Val acc: 0.963
Epoch: 18/20 Iteration: 22205 Train loss: 0.016
Epoch: 18/20 Iteration: 22210 Train loss: 0.052
Epoch: 18/20 Iteration: 22215 Train loss: 0.015
Epoch: 18/20 Iteration: 22220 Train loss: 0.011
Epoch: 18/20 Iteration: 22225 Train loss: 0.004
Val acc: 0.962
Epoch: 18/20 Iteration: 22230 Train loss: 0.016
Epoch: 18/20 Iteration: 22235 Train loss: 0.011
Epoch: 18/20 Iteration: 22240 Train loss: 0.010
Epoch: 18/20 Iteration: 22245 Train loss: 0.009
Epoch: 18/20 Iteration: 22250 Train loss: 0.015
Val acc: 0.959
Epoch: 18/20 Iteration: 22255 Train loss: 0.008
Epoch: 18/20 Iteration: 22260 Train loss: 0.010
Epoch: 18/20 Iteration: 22265 Train loss: 0.010
Epoch: 18/20 Iteration: 22270 Train loss: 0.016
Epoch: 18/20 Iteration: 22275 Train loss: 0.022
Val acc: 0.960
Epoch: 18/20 Iteration: 22280 Train loss: 0.012
Epoch: 18/20 Iteration: 22285 Train loss: 0.006
Epoch: 18/20 Iteration: 22290 Train loss: 0.017
Epoch: 18/20 Iteration: 22295 Train loss: 0.033
Epoch: 18/20 Iteration: 2230

Epoch: 19/20 Iteration: 23010 Train loss: 0.024
Epoch: 19/20 Iteration: 23015 Train loss: 0.024
Epoch: 19/20 Iteration: 23020 Train loss: 0.002
Epoch: 19/20 Iteration: 23025 Train loss: 0.010
Val acc: 0.960
Epoch: 19/20 Iteration: 23030 Train loss: 0.008
Epoch: 19/20 Iteration: 23035 Train loss: 0.022
Epoch: 19/20 Iteration: 23040 Train loss: 0.041
Epoch: 19/20 Iteration: 23045 Train loss: 0.008
Epoch: 19/20 Iteration: 23050 Train loss: 0.003
Val acc: 0.959
Epoch: 19/20 Iteration: 23055 Train loss: 0.011
Epoch: 19/20 Iteration: 23060 Train loss: 0.013
Epoch: 19/20 Iteration: 23065 Train loss: 0.009
Epoch: 19/20 Iteration: 23070 Train loss: 0.000
Epoch: 19/20 Iteration: 23075 Train loss: 0.014
Val acc: 0.961
Epoch: 19/20 Iteration: 23080 Train loss: 0.047
Epoch: 19/20 Iteration: 23085 Train loss: 0.015
Epoch: 19/20 Iteration: 23090 Train loss: 0.032
Epoch: 19/20 Iteration: 23095 Train loss: 0.007
Epoch: 19/20 Iteration: 23100 Train loss: 0.017
Val acc: 0.962
Epoch: 19/20 Iteration: 2310

## Testing

In [32]:
test_acc = []
with tf.Session(graph=graph) as sess:
    saver.restore(sess, 'checkpoints/is_person.ckpt')

    test_state = sess.run(cell.zero_state(batch_size, tf.float32))
    for ii, (x, y) in enumerate(get_batches(test_x, test_y, batch_size), 1):
        feed = {inputs_: x,
                labels_: y[:, None],
                keep_prob: 1,
                initial_state: test_state}
        batch_acc, test_state = sess.run([accuracy, final_state], feed_dict=feed)
        test_acc.append(batch_acc)
    print("Test accuracy: {:.3f}".format(np.mean(test_acc)))

Test accuracy: 0.964


# Predictions

In [2]:
import timeit
import re
from collections import Counter

df = pd.read_pickle('DataSets/combined.pkl')

### Predict a Single Batch

In [103]:
data = test_x[:batch_size]

with tf.Session(graph=graph) as sess:
    saver.restore(sess, 'checkpoints/is_person.ckpt')
    
    test_state = sess.run(cell.zero_state(batch_size, tf.float32))
    
    feed = {inputs_: data,
        keep_prob: 1,
        initial_state: test_state}
    
    output_predictions = sess.run([predictions], feed_dict=feed)
    print(output_predictions)

[array([[  9.99967337e-01],
       [  4.68772720e-04],
       [  9.98217523e-01],
       [  9.99976516e-01],
       [  4.54508199e-06],
       [  1.76544519e-04],
       [  9.99984145e-01],
       [  1.28880114e-04],
       [  9.99939442e-01],
       [  5.29563753e-04],
       [  4.27714578e-04],
       [  3.44581326e-06],
       [  2.41771522e-05],
       [  8.00505877e-05],
       [  9.99655724e-01],
       [  1.07329246e-03],
       [  6.78206270e-04],
       [  9.95866060e-01],
       [  8.35756958e-01],
       [  9.99882936e-01],
       [  2.66421470e-04],
       [  9.99988198e-01],
       [  2.75759317e-04],
       [  2.13995241e-04],
       [  5.51104662e-04],
       [  9.99991655e-01],
       [  2.00413022e-04],
       [  5.30082070e-05],
       [  9.99978423e-01],
       [  6.49401336e-04],
       [  9.99985933e-01],
       [  5.24374889e-04],
       [  6.42896863e-04],
       [  9.99968290e-01],
       [  4.41575452e-04],
       [  1.58764233e-04],
       [  1.40085653e-03],


# Predict Multiple Rows

### Prepare dataset

In [19]:
# Data to predict on and filter
df = pd.read_pickle('DataSets/data.pkl')

### Save a copy of the original content before modifying it below

In [21]:
original = df
original.to_pickle('DataSets/final_dataset.pkl')

In [22]:
# To Lower Case
df = df.apply(lambda x: x.astype(str).str.lower())

In [23]:
# Remove special characters from content
def remove_special_char(content):
    # Raw string so that \s reaches the regex engine as a whitespace class
    return re.sub(r'[^A-Za-z\s]+', '', content)

In [24]:
start_time = timeit.default_timer()

df['Content'] = df['Content'].apply(remove_special_char)

print(timeit.default_timer() - start_time)

12.507773416871588


### Remove words that aren't in vocabulary

In [25]:
def remove_unknowns(content):
    # Keep only the words that map to a known vocabulary id (drop anything that would become <UNK>)
    unk = vocab_to_int['<UNK>']
    return ' '.join(w for w in content.split() if vocab_to_int.get(w, unk) != unk)

In [26]:
df['Content'] = df['Content'].apply(remove_unknowns)

In [27]:
def cont_to_int(content):
    return [vocab_to_int.get(word, vocab_to_int['<UNK>']) for word in content.split()]

In [28]:
df['Content_Ints'] = df['Content'].apply(cont_to_int)
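
The cleaning and encoding steps above can be tried end to end on one string. The sketch below is only an illustration: `toy_vocab_to_int`, `clean`, and `encode` are hypothetical names, and the five-entry vocabulary stands in for the real `vocab_to_int` built earlier in the notebook.

```python
import re

# Toy vocabulary (an assumption for this example, not the notebook's real one)
toy_vocab_to_int = {'<UNK>': 0, 'ada': 1, 'was': 2, 'a': 3, 'mathematician': 4}

def clean(content):
    """Lower-case the text and strip everything except letters and whitespace."""
    return re.sub(r'[^a-z\s]+', '', content.lower())

def encode(content, vocab):
    """Drop out-of-vocabulary words, then map the remaining words to ids."""
    unk = vocab['<UNK>']
    known = [w for w in content.split() if vocab.get(w, unk) != unk]
    return [vocab[w] for w in known]

text = "Ada Lovelace (1815-1852) was a mathematician!"
cleaned = clean(text)
encoded = encode(cleaned, toy_vocab_to_int)
print(cleaned.split())  # ['ada', 'lovelace', 'was', 'a', 'mathematician']
print(encoded)          # [1, 2, 3, 4]
```

Because unknown words are removed before encoding, `lovelace` disappears from the sequence instead of being replaced by the `<UNK>` id.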

In [29]:
df.to_pickle('DataSets/feature_dataset.pkl')

# Start Here With Existing Data

In [21]:
df = pd.read_pickle('DataSets/feature_dataset.pkl')
original = pd.read_pickle('DataSets/final_dataset.pkl')

In [30]:
seq_len = 200
content_ints = list(df['Content_Ints'])

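# Left-pad short rows with zeros and truncate long rows to seq_len
features = np.zeros((len(content_ints), seq_len), dtype=int)
for i, row in enumerate(content_ints):
    row = row[:seq_len]
    if row:  # rows left empty after removing unknown words stay all zeros
        features[i, -len(row):] = np.array(row)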
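
To make the padding rule concrete, here is a small self-contained sketch with a toy `seq_len` of 5 (the notebook uses 200). The helper name `pad_features` is hypothetical. It left-pads short rows with zeros, truncates long rows to their first `seq_len` ids, and guards against empty rows, which would otherwise fail to broadcast when assigned to a full-width slice.

```python
import numpy as np

def pad_features(rows, seq_len):
    """Left-pad each row of word ids with zeros, or truncate it to seq_len."""
    features = np.zeros((len(rows), seq_len), dtype=int)
    for i, row in enumerate(rows):
        kept = row[:seq_len]
        if kept:  # an empty row stays all zeros
            features[i, -len(kept):] = kept
    return features

toy_rows = [[7, 8], [1, 2, 3, 4, 5, 6], []]
toy_features = pad_features(toy_rows, seq_len=5)
print(toy_features)
# [[0 0 0 7 8]
#  [1 2 3 4 5]
#  [0 0 0 0 0]]
```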

### Predict

In [32]:
def get_predict_batches(x, batch_size=100):    
    n_batches = len(x)//batch_size
    x = x[:n_batches*batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii+batch_size]
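
Note that `get_predict_batches` only yields full batches, so up to `batch_size - 1` trailing rows are never scored. The sketch below repeats the generator so the block is self-contained and counts the dropped rows on toy data. It also shows one possible alternative, a hypothetical `get_padded_batches` that zero-pads the last batch and reports how many rows in it are real. That variant is an assumption for illustration and is not used elsewhere in this notebook.

```python
import numpy as np

def get_predict_batches(x, batch_size=100):
    """Same logic as the notebook cell: yield full batches only."""
    n_batches = len(x) // batch_size
    x = x[:n_batches * batch_size]
    for ii in range(0, len(x), batch_size):
        yield x[ii:ii + batch_size]

def get_padded_batches(x, batch_size=100):
    """Hypothetical variant: zero-pad the last batch so no row is skipped."""
    for ii in range(0, len(x), batch_size):
        batch = x[ii:ii + batch_size]
        n_real = len(batch)
        if n_real < batch_size:
            pad = np.zeros((batch_size - n_real,) + batch.shape[1:], dtype=batch.dtype)
            batch = np.concatenate([batch, pad])
        yield batch, n_real

toy = np.arange(10).reshape(10, 1)

full_batches = list(get_predict_batches(toy, batch_size=4))
n_scored = sum(len(b) for b in full_batches)
print(len(full_batches), n_scored, len(toy) - n_scored)  # 2 8 2

padded = list(get_padded_batches(toy, batch_size=4))
print([n_real for _, n_real in padded])  # [4, 4, 2]
```

With the padded variant, the caller would keep only the first `n_real` predictions of each batch and discard the outputs for the padding rows.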

In [33]:
results = []

with graph.as_default():
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    saver.restore(sess, 'checkpoints/is_person.ckpt')
    
    test_state = sess.run(cell.zero_state(batch_size, tf.float32))
    
    for ii, x in enumerate(get_predict_batches(features, batch_size), 1):
        feed = {inputs_: x,
            keep_prob: 1,
            initial_state: test_state}
    
        output_predictions = sess.run([predictions], feed_dict=feed)
        results += [r[0] for r in output_predictions[0]]

In [None]:
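threshold = 0.9

# Label a row as a person only when its predicted probability exceeds the threshold
final_predictions = [str(r > threshold) for r in results]

# Rows beyond the last full batch were never scored, so mark them explicitly
n_missing = len(df) - len(results)
final_predictions += ["Not Evaluated"] * n_missing
results += [404.0] * n_missing  # 404.0 is a sentinel value, not a probability

original['Probabilities'] = results
original['Is_Name'] = final_predictions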
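
The effect of the 0.9 threshold is easiest to see on a few made-up probabilities. The following is a minimal sketch: `label_predictions` is a hypothetical helper and `toy_probs` contains invented values, not model output.

```python
import numpy as np

def label_predictions(probs, n_rows, threshold=0.9):
    """Turn probabilities into 'True'/'False' labels and pad unscored rows."""
    probs = np.asarray(probs, dtype=float)
    labels = [str(bool(p > threshold)) for p in probs]
    labels += ["Not Evaluated"] * (n_rows - len(probs))
    return labels

toy_probs = [0.9999, 0.0005, 0.84, 0.95]
toy_labels = label_predictions(toy_probs, n_rows=6)
print(toy_labels)
# ['True', 'False', 'False', 'True', 'Not Evaluated', 'Not Evaluated']
```

A probability of 0.84 is labeled `False` here even though it is above 0.5. A threshold this high favors precision, so fewer non-person rows are marked as people, at the cost of missing some borderline person rows.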