# Project 6: Analyzing Stock Sentiment from Twits
## Instructions
Each problem consists of a function to implement and instructions on how to implement the function.  The parts of the function that need to be implemented are marked with a `# TODO` comment.

## Packages
When you implement the functions, you'll only need to you use the packages you've used in the classroom, like [Pandas](https://pandas.pydata.org/) and [Numpy](http://www.numpy.org/). These packages will be imported for you. We recommend you don't add any import statements, otherwise the grader might not be able to run your code.

### Load Packages

In [1]:
import json
import nltk
import os
import random
import re
import torch

from torch import nn, optim
import torch.nn.functional as F

## Introduction
When deciding the value of a company, it's important to follow the news. For example, a product recall or natural disaster in a company's product chain. You want to be able to turn this information into a signal. Currently, the best tool for the job is a Neural Network. 

For this project, you'll use posts from the social media site [StockTwits](https://en.wikipedia.org/wiki/StockTwits). The community on StockTwits is full of investors, traders, and entrepreneurs. Each message posted is called a Twit. This is similar to Twitter's version of a post, called a Tweet. You'll build a model around these twits that generate a sentiment score.

We've collected a bunch of twits, then hand labeled the sentiment of each. To capture the degree of sentiment, we'll use a five-point scale: very negative, negative, neutral, positive, very positive. Each twit is labeled -2 to 2 in steps of 1, from very negative to very positive respectively. You'll build a sentiment analysis model that will learn to assign sentiment to twits on its own, using this labeled data.

The first thing we should to do, is load the data.

## Import Twits 
### Load Twits Data 
This JSON file contains a list of objects for each twit in the `'data'` field:

```
{'data':
  {'message_body': 'Neutral twit body text here',
   'sentiment': 0},
  {'message_body': 'Happy twit body text here',
   'sentiment': 1},
   ...
}
```

The fields represent the following:

* `'message_body'`: The text of the twit.
* `'sentiment'`: Sentiment score for the twit, ranges from -2 to 2 in steps of 1, with 0 being neutral.


To see what the data look like by printing the first 10 twits from the list. 

In [2]:
# Delete this
from shutil import copyfile
copyfile(src = "./../../data/project_6_stocktwits/twits.json", dst = "./twits.json")

'./twits.json'

In [3]:
with open('twits.json', 'r') as f:
    twits = json.load(f)

print(twits['data'][:10])

[{'message_body': '$FITB great buy at 26.00...ill wait', 'sentiment': 2, 'timestamp': '2018-07-01T00:00:09Z'}, {'message_body': '@StockTwits $MSFT', 'sentiment': 1, 'timestamp': '2018-07-01T00:00:42Z'}, {'message_body': '#STAAnalystAlert for $TDG : Jefferies Maintains with a rating of Hold setting target price at USD 350.00. Our own verdict is Buy  http://www.stocktargetadvisor.com/toprating', 'sentiment': 2, 'timestamp': '2018-07-01T00:01:24Z'}, {'message_body': '$AMD I heard there’s a guy who knows someone who thinks somebody knows something - on StockTwits.', 'sentiment': 1, 'timestamp': '2018-07-01T00:01:47Z'}, {'message_body': '$AMD reveal yourself!', 'sentiment': 0, 'timestamp': '2018-07-01T00:02:13Z'}, {'message_body': '$AAPL Why the drop? I warren Buffet taking out his position?', 'sentiment': 1, 'timestamp': '2018-07-01T00:03:10Z'}, {'message_body': '$BA bears have 1 reason on 06-29 to pay more attention https://dividendbot.com?s=BA', 'sentiment': -2, 'timestamp': '2018-07-01T

### Length of Data
Now let's look at the number of twits in dataset. Print the number of twits below.

In [4]:
print('Length of the Data: ', len(twits['data']))

Length of the Data:  1548010


### Split Message Body and Sentiment Score

In [5]:
messages = [twit['message_body'] for twit in twits['data']]
# Since the sentiment scores are discrete, we'll scale the sentiments to 0 to 4 for use in our network
sentiments = [twit['sentiment'] + 2 for twit in twits['data']]

## Preprocessing the Data
With our data in hand we need to preprocess our text. These twits are collected by filtering on ticker symbols where these are denoted with a leader $ symbol in the twit itself. For example,

`{'message_body': 'RT @google Our annual look at the year in Google blogging (and beyond) http://t.co/sptHOAh8 $GOOG',
 'sentiment': 0}`

The ticker symbols don't provide information on the sentiment, and they are in every twit, so we should remove them. This twit also has the `@google` username, again not providing sentiment information, so we should also remove it. We also see a URL `http://t.co/sptHOAh8`. Let's remove these too.

The easiest way to remove specific words or phrases is with regex using the `re` module. You can sub out specific patterns with a space:

```python
re.sub(pattern, ' ', text)
```
This will substitute a space with anywhere the pattern matches in the text. Later when we tokenize the text, we'll split appropriately on those spaces.

### Pre-Processing

In [6]:
nltk.download('wordnet')


def preprocess(message):
    """
    This function takes a string as input, then performs these operations: 
        - lowercase
        - remove URLs
        - remove ticker symbols 
        - removes punctuation
        - tokenize by splitting the string on whitespace 
        - removes any single character tokens
    
    Parameters
    ----------
        message : The text message to be preprocessed.
        
    Returns
    -------
        tokens: The preprocessed text into tokens.
    """ 
    
    # Lowercase the twit message
    text = message.lower()
    
    # Replace URLs with a space in the message
    text = re.sub(r'https?://[^\s]+', ' ', text)
    
    # Replace ticker symbols with a space. The ticker symbols are any stock symbol that starts with $.
    text = re.sub(r'\$[a-zA-Z0-9]*', ' ', text)
    
    # Replace StockTwits usernames with a space. The usernames are any word that starts with @.
    text = re.sub(r'@[a-zA-Z0-9]*', ' ', text)

    # Replace everything not a letter with a space
    text = re.sub(r'[^a-z]', ' ', text)
    
    # Tokenize by splitting the string on whitespace into a list of words
    tokens = text.split()

    # Lemmatize words using the WordNetLemmatizer. You can ignore any word that is not longer than one character.
    wnl = nltk.stem.WordNetLemmatizer()
    tokens = [wnl.lemmatize(w) for w in tokens if len(w)  > 1]
    
    return tokens

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


### Preprocess All the Twits 
Now we can preprocess each of the twits in our dataset. Apply the function `preprocess` to all the twit messages.

In [7]:
tokenized = [preprocess(i_message) for i_message in messages]

### Bag of Words
Now with all of our messages tokenized, we want to create a vocabulary and count up how often each word appears in our entire corpus. Use the [`Counter`](https://docs.python.org/3.1/library/collections.html#collections.Counter) function to count up all the tokens.

In [8]:
from collections import Counter

"""
Create a vocabulary by using Bag of words
"""
tokens = [word for twit in tokenized for word in twit]
bow = Counter(tokens)

### Frequency of Words Appearing in Message
With our vocabulary, now we'll remove some of the most common words such as 'the', 'and', 'it', etc. These words don't contribute to identifying sentiment and are really common, resulting in a lot of noise in our input. If we can filter these out, then our network should have an easier time learning.

We also want to remove really rare words that show up in a only a few twits. Here you'll want to divide the count of each word by the number of messages. Then remove words that only appear in some small fraction of the messages.

In [9]:
"""
Set the following variables:
    freqs
    low_cutoff
    high_cutoff
    K_most_common
"""

# TODO Implement 

# Dictionart that contains the Frequency of words appearing in messages.
# The key is the token and the value is the frequency of that word in the corpus.
freqs = {i_word: i_count / len(tokens) for i_word, i_count in bow.items()}

# Float that is the frequency cutoff. Drop words with a frequency that is lower or equal to this number.
low_cutoff = 5e-6

# Integer that is the cut off for most common words. Drop words that are the `high_cutoff` most common words.
high_cutoff = 30

# The k most common words in the corpus. Use `high_cutoff` as the k.
K_most_common = [word[0] for word in bow.most_common(high_cutoff)]


filtered_words = [word for word in freqs if (freqs[word] > low_cutoff and word not in K_most_common)]
print(K_most_common)
len(filtered_words) 

['the', 'to', 'is', 'for', 'on', 'of', 'and', 'in', 'this', 'it', 'at', 'will', 'up', 'are', 'you', 'that', 'be', 'short', 'what', 'buy', 'today', 'stock', 'here', 'just', 'down', 'with', 'not', 'call', 'day', 'we']


6607

### Updating Vocabulary by Removing Filtered Words
Let's creat three variables that will help with our vocabulary.

In [21]:
from tqdm import tqdm

"""
Set the following variables:
    vocab
    id2vocab
    filtered
"""
vocab = {}    # A dictionary for the `filtered_words`. The key is the word and value is an id that represents the word. 
id2vocab = {} # Reverse of the `vocab` dictionary. The key is word id and value is the word. 
for word_id, word in enumerate(filtered_words, 1):
    vocab[word] = word_id
    id2vocab[word_id] = word

# tokenized with the words not in `filtered_words` removed.
filtered = []
for twit in tqdm(tokenized):
    filtered.append([word for word in twit if word in filtered_words])

100%|██████████| 1548010/1548010 [19:34<00:00, 1317.93it/s]


### Balancing the classes
Let's do a few last pre-processing steps. If we look at how our twits are labeled, we'll find that 50% of them are neutral. This means that our network will be 50% accurate just by guessing 0 every single time. To help our network learn appropriately, we'll want to balance our classes.
That is, make sure each of our different sentiment scores show up roughly as frequently in the data.

What we can do here is go through each of our examples and randomly drop twits with neutral sentiment. What should be the probability we drop these twits if we want to get around 20% neutral twits starting at 50% neutral? We should also take this opportunity to remove messages with length 0.

In [22]:
balanced = {'messages': [], 'sentiments':[]}

n_neutral = sum(1 for each in sentiments if each == 2)
N_examples = len(sentiments)
keep_prob = (N_examples - n_neutral)/4/n_neutral

for idx, sentiment in enumerate(sentiments):
    message = filtered[idx]
    if len(message) == 0:
        # skip this message because it has length zero
        continue
    elif sentiment != 2 or random.random() < keep_prob:
        balanced['messages'].append(message)
        balanced['sentiments'].append(sentiment) 

If you did it correctly, you should see the following result 

In [23]:
n_neutral = sum(1 for each in balanced['sentiments'] if each == 2)
N_examples = len(balanced['sentiments'])
n_neutral/N_examples

0.194235907378922

Finally let's convert our tokens into integer ids which we can pass to the network.

In [24]:
token_ids = [[vocab[word] for word in message] for message in balanced['messages']]
sentiments = balanced['sentiments']

## Neural Network
Now we have our vocabulary which means we can transform our tokens into ids, which are then passed to our network. So, let's define the network now!

Here is a nice diagram showing the network we'd like to build: 

#### Embed -> RNN -> Dense -> Softmax
### Implement the text classifier
Before we build text classifier, if you remember from the other network that you built in  "Sentiment Analysis with an RNN"  exercise  - which there, the network called " SentimentRNN", here we named it "TextClassifer" - consists of three main parts: 1) init function `__init__` 2) forward pass `forward`  3) hidden state `init_hidden`. 

This network is pretty similar to the network you built expect in the  `forward` pass, we use softmax instead of sigmoid. The reason we are not using sigmoid is that the output of NN is not a binary. In our network, sentiment scores have 5 possible outcomes. We are looking for an outcome with the highest probability thus softmax is a better choice.

In [25]:
class TextClassifier(nn.Module):
    def __init__(self, vocab_size, embed_size, lstm_size, output_size, lstm_layers=1, dropout=0.1):
        """
        Initialize the model by setting up the layers.
        
        Parameters
        ----------
            vocab_size : The vocabulary size.
            embed_size : The embedding layer size.
            lstm_size : The LSTM layer size.
            output_size : The output size.
            lstm_layers : The number of LSTM layers.
            dropout : The dropout probability.
        """
        
        super().__init__()
        self.vocab_size = vocab_size
        self.embed_size = embed_size
        self.lstm_size = lstm_size
        self.output_size = output_size
        self.lstm_layers = lstm_layers
        self.dropout = dropout

        # Setup embedding layer
        self.embedding = nn.Embedding(num_embeddings=self.vocab_size, embedding_dim=self.embed_size)
        
        ### Setup additional layers
        
        # LSTM layer
        self.lstm = nn.LSTM(input_size = self.embed_size, hidden_size = self.lstm_size,
                            num_layers = self.lstm_layers, batch_first = False, dropout = self.dropout)
        
        # FCL
        self.fcl = nn.Linear(in_features = self.lstm_size, out_features = self.output_size)
        
        # LOG softmax
        self.log_softmax = nn.LogSoftmax(dim=1)


    def init_hidden(self, batch_size):
        """ 
        Initializes hidden state
        
        Parameters
        ----------
            batch_size : The size of batches.
        
        Returns
        -------
            hidden_state
            
        """
        # Create two new tensors with sizes n_layers x batch_size x hidden_dim,
        # initialized to zero, for hidden state and cell state of LSTM
        weight = next(self.parameters()).data
        hidden_state = (weight.new(self.lstm_layers, batch_size, self.lstm_size).zero_(),
                        weight.new(self.lstm_layers, batch_size, self.lstm_size).zero_())
        
        return hidden_state


    def forward(self, nn_input, hidden_state):
        """
        Perform a forward pass of our model on nn_input.
        
        Parameters
        ----------
            nn_input : The batch of input to the NN.
            hidden_state : The LSTM hidden state.

        Returns
        -------
            logps: log softmax output
            hidden_state: The new hidden state.

        """
        
        # Embedding
        embed = self.embedding(nn_input)
        
        # LSTM
        lstm_out, hidden_state = self.lstm(embed, hidden_state)
        lstm_out = lstm_out[-1]
        
        # FCL + Log Softmax
        logps = self.log_softmax(self.fcl(lstm_out))
        
        return logps, hidden_state

### View Model

In [26]:
model = TextClassifier(len(vocab), 10, 6, 5, dropout=0.1, lstm_layers=2)
model.embedding.weight.data.uniform_(-1, 1)
input = torch.randint(0, 1000, (5, 4), dtype=torch.int64)
hidden = model.init_hidden(4)

logps, _ = model.forward(input, hidden)
print(logps)

tensor([[-1.7592, -1.4968, -1.6923, -1.8165, -1.3575],
        [-1.7273, -1.5131, -1.6940, -1.8050, -1.3714],
        [-1.7046, -1.5288, -1.6859, -1.7997, -1.3834],
        [-1.7591, -1.4960, -1.6950, -1.8142, -1.3579]])


## Training
### DataLoaders and Batching
Now we should build a generator that we can use to loop through our data. It'll be more efficient if we can pass our sequences in as batches. Our input tensors should look like `(sequence_length, batch_size)`. So if our sequences are 40 tokens long and we pass in 25 sequences, then we'd have an input size of `(40, 25)`.

If we set our sequence length to 40, what do we do with messages that are more or less than 40 tokens? For messages with fewer than 40 tokens, we will pad the empty spots with zeros. We should be sure to **left** pad so that the RNN starts from nothing before going through the data. If the message has 20 tokens, then the first 20 spots of our 40 long sequence will be 0. If a message has more than 40 tokens, we'll just keep the first 40 tokens.

In [27]:
def dataloader(messages, labels, sequence_length=30, batch_size=32, shuffle=False):
    """ 
    Build a dataloader.
    """
    if shuffle:
        indices = list(range(len(messages)))
        random.shuffle(indices)
        messages = [messages[idx] for idx in indices]
        labels = [labels[idx] for idx in indices]

    total_sequences = len(messages)

    for ii in range(0, total_sequences, batch_size):
        batch_messages = messages[ii: ii+batch_size]
        
        # First initialize a tensor of all zeros
        batch = torch.zeros((sequence_length, len(batch_messages)), dtype=torch.int64)
        for batch_num, tokens in enumerate(batch_messages):
            token_tensor = torch.tensor(tokens)
            # Left pad!
            start_idx = max(sequence_length - len(token_tensor), 0)
            batch[start_idx:, batch_num] = token_tensor[:sequence_length]
        
        label_tensor = torch.tensor(labels[ii: ii+len(batch_messages)])
        
        yield batch, label_tensor

### Training and  Validation
With our data in nice shape, we'll split it into training and validation sets.

In [28]:
"""
Split data into training and validation datasets. Use an appropriate split size.
The features are the `token_ids` and the labels are the `sentiments`.
"""   
# Split size
split_size = 0.8

# Split index
split_index = int(len(token_ids) * split_size)

# Training features and labels
train_features = token_ids[:split_index]
train_labels = sentiments[:split_index]

# Validation features and labels
valid_features = token_ids[split_index:]
valid_labels = sentiments[split_index:]

In [29]:
text_batch, labels = next(iter(dataloader(train_features, train_labels, sequence_length=20, batch_size=64)))
model = TextClassifier(len(vocab)+1, 200, 128, 5, dropout=0.)
hidden = model.init_hidden(64)
logps, hidden = model.forward(text_batch, hidden)

### Training
It's time to train the neural network!

In [30]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = TextClassifier(len(vocab)+1, 1024, 512, 5, lstm_layers=2, dropout=0.2)
model.embedding.weight.data.uniform_(-1, 1)
model.to(device)

TextClassifier(
  (embedding): Embedding(6608, 1024)
  (lstm): LSTM(1024, 512, num_layers=2, dropout=0.2)
  (fcl): Linear(in_features=512, out_features=5, bias=True)
  (log_softmax): LogSoftmax()
)

In [35]:
import numpy as np

"""
Train your model with dropout. Make sure to clip your gradients.
Print the training loss, validation loss, and validation accuracy for every 100 steps.
"""

# Hyperparameters
epochs = 10
batch_size = 128
learning_rate = 1e-3
print_every = 100

# Loss fnuction
criterion = nn.NLLLoss()

# Adam optimizer
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

# Train mode on
model.train()

# Loop over epchs
for epoch in range(epochs):
    
    # Report
    print('Starting epoch {}'.format(epoch + 1))
    
    # Initialize step
    steps = 0
    
    # Dataloader
    tqdm_dataloader = tqdm(dataloader(train_features, train_labels, batch_size = batch_size, sequence_length = 20, 
                                      shuffle = True))
    
    # Loop over the baches
    for text_batch, labels in tqdm_dataloader:
        
        # Increment the steps
        steps += 1
        
        # Hidden
        hidden = model.init_hidden(labels.shape[0])
        
        # Set Device
        text_batch, labels = text_batch.to(device), labels.to(device)
        for each in hidden:
            each.to(device)
        
        ### TRAINING THE MODEL
        
        # Reset the gradients to zero
        model.zero_grad()
        
        # Feedforward
        output, hidden = model(text_batch, hidden)
        
        # Calculate loss
        loss = criterion(output.squeeze(), labels)
        
        # Backpropagation
        loss.backward()
        
        # Clip gradient (in order to prevent the exploding gradient problem in our LSTM)
        nn.utils.clip_grad_norm_(model.parameters(), 5)
        
        # Optimize
        optimizer.step()
        
        # Report
        tqdm_dataloader.set_description(f'Epoch:{epoch + 1}...   Batch:{steps}...   Training Loss:{loss:.3f}')
        
        ### EVALUATING THE MODEL
        if steps % print_every == 0:
            
            # Evaluation mode on
            model.eval()
            
            # Initialize an empty list for validation losses
            validation_losses = []            
            
            # Get the inputs and labels + Convert them to given device
            for inputs, labels in dataloader(valid_features, valid_labels, batch_size=batch_size):
                inputs, labels = inputs.to(device), labels.to(device)
                
            # Get the validation hidden + Convert them to given device
            valid_hidden  = model.init_hidden(labels.shape[0])
            for i in valid_hidden: i.to(device)
                
            # Feedforward
            output, valid_hidden = model(inputs, valid_hidden)
            
            # Calculate the loss
            validation_loss = criterion(output.squeeze(), labels)
            
            # Append the loss to lis
            validation_losses.append(validation_loss.item())
            
            # Training mode on
            model.train()
            
            # Report
            tqdm.write(f'Epoch:{epoch + 1}/{epochs}...   Batch:{steps}...   Validation Loss:{np.mean(validation_losses):.3f}')


0it [00:00, ?it/s]

Starting epoch 1


[A
Epoch:1...   Batch:1...   Training Loss:1.219: 1it [00:02,  2.11s/it][A
Epoch:1...   Batch:3...   Training Loss:1.171: 3it [00:02,  1.50s/it][A
Epoch:1...   Batch:5...   Training Loss:1.242: 5it [00:02,  1.08s/it][A
Epoch:1...   Batch:7...   Training Loss:1.180: 7it [00:02,  1.29it/s][A
Epoch:1...   Batch:9...   Training Loss:1.284: 9it [00:02,  1.78it/s][A
Epoch:1...   Batch:11...   Training Loss:1.258: 11it [00:02,  2.42it/s][A
Epoch:1...   Batch:13...   Training Loss:1.275: 13it [00:02,  3.24it/s][A
Epoch:1...   Batch:15...   Training Loss:1.152: 15it [00:03,  4.25it/s][A
Epoch:1...   Batch:17...   Training Loss:1.165: 17it [00:03,  5.43it/s][A
Epoch:1...   Batch:19...   Training Loss:1.260: 19it [00:03,  6.74it/s][A
Epoch:1...   Batch:21...   Training Loss:1.169: 21it [00:03,  8.07it/s][A
Epoch:1...   Batch:23...   Training Loss:1.357: 23it [00:03,  9.40it/s][A
Epoch:1...   Batch:25...   Training Loss:1.159: 25it [00:03, 10.68it/s][A
Epoch:1...   Batch:27...   Trai

Epoch:1/10...   Batch:100...   Validation Loss:1.090


Epoch:1...   Batch:203...   Training Loss:1.080: 203it [00:20,  2.88it/s]

Epoch:1/10...   Batch:200...   Validation Loss:1.023


Epoch:1...   Batch:303...   Training Loss:1.108: 303it [00:30,  2.84it/s]

Epoch:1/10...   Batch:300...   Validation Loss:0.932


Epoch:1...   Batch:403...   Training Loss:1.044: 403it [00:39,  2.92it/s]

Epoch:1/10...   Batch:400...   Validation Loss:0.924


Epoch:1...   Batch:503...   Training Loss:0.980: 503it [00:48,  2.86it/s]

Epoch:1/10...   Batch:500...   Validation Loss:0.908


Epoch:1...   Batch:601...   Training Loss:0.900: 601it [00:57,  2.12it/s]

Epoch:1/10...   Batch:600...   Validation Loss:0.809


Epoch:1...   Batch:703...   Training Loss:0.898: 703it [01:06,  2.89it/s]

Epoch:1/10...   Batch:700...   Validation Loss:0.760


Epoch:1...   Batch:803...   Training Loss:1.016: 803it [01:16,  2.93it/s]

Epoch:1/10...   Batch:800...   Validation Loss:0.734


Epoch:1...   Batch:903...   Training Loss:0.847: 903it [01:25,  2.85it/s]

Epoch:1/10...   Batch:900...   Validation Loss:0.756


Epoch:1...   Batch:1003...   Training Loss:0.923: 1003it [01:34,  2.87it/s]

Epoch:1/10...   Batch:1000...   Validation Loss:0.770


Epoch:1...   Batch:1101...   Training Loss:0.967: 1101it [01:43,  2.12it/s]

Epoch:1/10...   Batch:1100...   Validation Loss:0.753


Epoch:1...   Batch:1203...   Training Loss:0.951: 1203it [01:52,  2.85it/s]

Epoch:1/10...   Batch:1200...   Validation Loss:0.727


Epoch:1...   Batch:1303...   Training Loss:0.871: 1303it [02:02,  2.92it/s]

Epoch:1/10...   Batch:1300...   Validation Loss:0.731


Epoch:1...   Batch:1403...   Training Loss:0.945: 1403it [02:11,  2.93it/s]

Epoch:1/10...   Batch:1400...   Validation Loss:0.735


Epoch:1...   Batch:1503...   Training Loss:0.952: 1503it [02:20,  2.95it/s]

Epoch:1/10...   Batch:1500...   Validation Loss:0.744


Epoch:1...   Batch:1601...   Training Loss:0.817: 1601it [02:29,  2.19it/s]

Epoch:1/10...   Batch:1600...   Validation Loss:0.718


Epoch:1...   Batch:1703...   Training Loss:0.963: 1703it [02:38,  2.88it/s]

Epoch:1/10...   Batch:1700...   Validation Loss:0.705


Epoch:1...   Batch:1803...   Training Loss:0.793: 1803it [02:47,  2.90it/s]

Epoch:1/10...   Batch:1800...   Validation Loss:0.729


Epoch:1...   Batch:1903...   Training Loss:0.882: 1903it [02:56,  2.93it/s]

Epoch:1/10...   Batch:1900...   Validation Loss:0.716


Epoch:1...   Batch:2003...   Training Loss:0.975: 2003it [03:06,  2.85it/s]

Epoch:1/10...   Batch:2000...   Validation Loss:0.675


Epoch:1...   Batch:2101...   Training Loss:0.908: 2101it [03:15,  2.11it/s]

Epoch:1/10...   Batch:2100...   Validation Loss:0.721


Epoch:1...   Batch:2203...   Training Loss:0.971: 2203it [03:24,  2.84it/s]

Epoch:1/10...   Batch:2200...   Validation Loss:0.719


Epoch:1...   Batch:2303...   Training Loss:0.902: 2303it [03:33,  2.91it/s]

Epoch:1/10...   Batch:2300...   Validation Loss:0.729


Epoch:1...   Batch:2403...   Training Loss:0.818: 2403it [03:42,  2.88it/s]

Epoch:1/10...   Batch:2400...   Validation Loss:0.705


Epoch:1...   Batch:2503...   Training Loss:0.945: 2503it [03:52,  2.86it/s]

Epoch:1/10...   Batch:2500...   Validation Loss:0.713


Epoch:1...   Batch:2601...   Training Loss:0.836: 2601it [04:01,  2.11it/s]

Epoch:1/10...   Batch:2600...   Validation Loss:0.733


Epoch:1...   Batch:2703...   Training Loss:0.862: 2703it [04:10,  2.89it/s]

Epoch:1/10...   Batch:2700...   Validation Loss:0.724


Epoch:1...   Batch:2803...   Training Loss:1.014: 2803it [04:19,  2.86it/s]

Epoch:1/10...   Batch:2800...   Validation Loss:0.718


Epoch:1...   Batch:2903...   Training Loss:0.889: 2903it [04:28,  2.90it/s]

Epoch:1/10...   Batch:2900...   Validation Loss:0.718


Epoch:1...   Batch:3003...   Training Loss:0.879: 3003it [04:37,  2.91it/s]

Epoch:1/10...   Batch:3000...   Validation Loss:0.705


Epoch:1...   Batch:3101...   Training Loss:0.977: 3101it [04:46,  2.16it/s]

Epoch:1/10...   Batch:3100...   Validation Loss:0.694


Epoch:1...   Batch:3203...   Training Loss:0.901: 3203it [04:56,  2.93it/s]

Epoch:1/10...   Batch:3200...   Validation Loss:0.707


Epoch:1...   Batch:3303...   Training Loss:0.871: 3303it [05:05,  2.89it/s]

Epoch:1/10...   Batch:3300...   Validation Loss:0.704


Epoch:1...   Batch:3403...   Training Loss:0.829: 3403it [05:14,  2.89it/s]

Epoch:1/10...   Batch:3400...   Validation Loss:0.672


Epoch:1...   Batch:3503...   Training Loss:0.935: 3503it [05:23,  2.88it/s]

Epoch:1/10...   Batch:3500...   Validation Loss:0.696


Epoch:1...   Batch:3601...   Training Loss:0.932: 3601it [05:32,  2.14it/s]

Epoch:1/10...   Batch:3600...   Validation Loss:0.696


Epoch:1...   Batch:3703...   Training Loss:0.820: 3703it [05:42,  2.89it/s]

Epoch:1/10...   Batch:3700...   Validation Loss:0.687


Epoch:1...   Batch:3803...   Training Loss:0.939: 3803it [05:51,  2.88it/s]

Epoch:1/10...   Batch:3800...   Validation Loss:0.671


Epoch:1...   Batch:3903...   Training Loss:0.979: 3903it [06:00,  2.87it/s]

Epoch:1/10...   Batch:3900...   Validation Loss:0.658


Epoch:1...   Batch:4003...   Training Loss:0.700: 4003it [06:09,  2.92it/s]

Epoch:1/10...   Batch:4000...   Validation Loss:0.673


Epoch:1...   Batch:4101...   Training Loss:0.879: 4101it [06:18,  2.14it/s]

Epoch:1/10...   Batch:4100...   Validation Loss:0.659


Epoch:1...   Batch:4203...   Training Loss:0.834: 4203it [06:27,  2.90it/s]

Epoch:1/10...   Batch:4200...   Validation Loss:0.661


Epoch:1...   Batch:4303...   Training Loss:1.125: 4303it [06:36,  2.92it/s]

Epoch:1/10...   Batch:4300...   Validation Loss:0.648


Epoch:1...   Batch:4403...   Training Loss:0.823: 4403it [06:45,  2.99it/s]

Epoch:1/10...   Batch:4400...   Validation Loss:0.638


Epoch:1...   Batch:4503...   Training Loss:0.866: 4503it [06:55,  2.90it/s]

Epoch:1/10...   Batch:4500...   Validation Loss:0.659


Epoch:1...   Batch:4601...   Training Loss:0.849: 4601it [07:04,  2.13it/s]

Epoch:1/10...   Batch:4600...   Validation Loss:0.643


Epoch:1...   Batch:4703...   Training Loss:0.896: 4703it [07:13,  2.87it/s]

Epoch:1/10...   Batch:4700...   Validation Loss:0.658


Epoch:1...   Batch:4803...   Training Loss:0.768: 4803it [07:22,  2.91it/s]

Epoch:1/10...   Batch:4800...   Validation Loss:0.663


Epoch:1...   Batch:4901...   Training Loss:0.885: 4901it [07:31,  2.16it/s]

Epoch:1/10...   Batch:4900...   Validation Loss:0.665


Epoch:1...   Batch:5003...   Training Loss:0.959: 5003it [07:40,  2.88it/s]

Epoch:1/10...   Batch:5000...   Validation Loss:0.672


Epoch:1...   Batch:5101...   Training Loss:0.869: 5101it [07:49,  2.14it/s]

Epoch:1/10...   Batch:5100...   Validation Loss:0.678


Epoch:1...   Batch:5203...   Training Loss:0.774: 5203it [07:59,  2.88it/s]

Epoch:1/10...   Batch:5200...   Validation Loss:0.652


Epoch:1...   Batch:5303...   Training Loss:0.802: 5303it [08:08,  2.88it/s]

Epoch:1/10...   Batch:5300...   Validation Loss:0.657


Epoch:1...   Batch:5403...   Training Loss:0.847: 5403it [08:17,  2.89it/s]

Epoch:1/10...   Batch:5400...   Validation Loss:0.667


Epoch:1...   Batch:5503...   Training Loss:0.845: 5503it [08:26,  2.89it/s]

Epoch:1/10...   Batch:5500...   Validation Loss:0.673


Epoch:1...   Batch:5601...   Training Loss:0.821: 5601it [08:35,  2.11it/s]

Epoch:1/10...   Batch:5600...   Validation Loss:0.675


Epoch:1...   Batch:5703...   Training Loss:0.857: 5703it [08:45,  2.95it/s]

Epoch:1/10...   Batch:5700...   Validation Loss:0.686


Epoch:1...   Batch:5803...   Training Loss:0.783: 5803it [08:54,  2.86it/s]

Epoch:1/10...   Batch:5800...   Validation Loss:0.680


Epoch:1...   Batch:5903...   Training Loss:0.859: 5903it [09:03,  2.93it/s]

Epoch:1/10...   Batch:5900...   Validation Loss:0.666


Epoch:1...   Batch:6003...   Training Loss:1.055: 6003it [09:12,  2.83it/s]

Epoch:1/10...   Batch:6000...   Validation Loss:0.672


Epoch:1...   Batch:6101...   Training Loss:0.760: 6101it [09:21,  2.14it/s]

Epoch:1/10...   Batch:6100...   Validation Loss:0.679


Epoch:1...   Batch:6203...   Training Loss:0.767: 6203it [09:31,  2.87it/s]

Epoch:1/10...   Batch:6200...   Validation Loss:0.663


Epoch:1...   Batch:6303...   Training Loss:0.896: 6303it [09:40,  2.98it/s]

Epoch:1/10...   Batch:6300...   Validation Loss:0.668


Epoch:1...   Batch:6396...   Training Loss:0.732: 6396it [09:46, 10.91it/s]
0it [00:00, ?it/s]

Starting epoch 2


Epoch:2...   Batch:103...   Training Loss:0.682: 103it [00:11,  2.85it/s]

Epoch:2/10...   Batch:100...   Validation Loss:0.691


Epoch:2...   Batch:201...   Training Loss:0.704: 201it [00:20,  2.15it/s]

Epoch:2/10...   Batch:200...   Validation Loss:0.688


Epoch:2...   Batch:303...   Training Loss:0.911: 303it [00:29,  2.88it/s]

Epoch:2/10...   Batch:300...   Validation Loss:0.683


Epoch:2...   Batch:403...   Training Loss:0.818: 403it [00:38,  2.91it/s]

Epoch:2/10...   Batch:400...   Validation Loss:0.674


Epoch:2...   Batch:503...   Training Loss:0.961: 503it [00:48,  2.91it/s]

Epoch:2/10...   Batch:500...   Validation Loss:0.652


Epoch:2...   Batch:603...   Training Loss:0.667: 603it [00:57,  2.95it/s]

Epoch:2/10...   Batch:600...   Validation Loss:0.629


Epoch:2...   Batch:701...   Training Loss:0.646: 701it [01:06,  2.18it/s]

Epoch:2/10...   Batch:700...   Validation Loss:0.634


Epoch:2...   Batch:803...   Training Loss:0.942: 803it [01:15,  2.93it/s]

Epoch:2/10...   Batch:800...   Validation Loss:0.676


Epoch:2...   Batch:903...   Training Loss:0.825: 903it [01:24,  2.88it/s]

Epoch:2/10...   Batch:900...   Validation Loss:0.670


Epoch:2...   Batch:1003...   Training Loss:0.857: 1003it [01:33,  2.85it/s]

Epoch:2/10...   Batch:1000...   Validation Loss:0.669


Epoch:2...   Batch:1103...   Training Loss:0.863: 1103it [01:43,  2.90it/s]

Epoch:2/10...   Batch:1100...   Validation Loss:0.640


Epoch:2...   Batch:1201...   Training Loss:0.884: 1201it [01:52,  2.10it/s]

Epoch:2/10...   Batch:1200...   Validation Loss:0.637


Epoch:2...   Batch:1303...   Training Loss:0.677: 1303it [02:01,  2.89it/s]

Epoch:2/10...   Batch:1300...   Validation Loss:0.637


Epoch:2...   Batch:1403...   Training Loss:0.760: 1403it [02:10,  2.87it/s]

Epoch:2/10...   Batch:1400...   Validation Loss:0.647


Epoch:2...   Batch:1503...   Training Loss:0.892: 1503it [02:19,  2.92it/s]

Epoch:2/10...   Batch:1500...   Validation Loss:0.677


Epoch:2...   Batch:1603...   Training Loss:0.840: 1603it [02:28,  2.86it/s]

Epoch:2/10...   Batch:1600...   Validation Loss:0.671


Epoch:2...   Batch:1701...   Training Loss:0.902: 1701it [02:37,  2.17it/s]

Epoch:2/10...   Batch:1700...   Validation Loss:0.696


Epoch:2...   Batch:1803...   Training Loss:0.824: 1803it [02:47,  2.88it/s]

Epoch:2/10...   Batch:1800...   Validation Loss:0.676


Epoch:2...   Batch:1903...   Training Loss:0.816: 1903it [02:56,  2.92it/s]

Epoch:2/10...   Batch:1900...   Validation Loss:0.695


Epoch:2...   Batch:2003...   Training Loss:0.815: 2003it [03:05,  2.84it/s]

Epoch:2/10...   Batch:2000...   Validation Loss:0.680


Epoch:2...   Batch:2103...   Training Loss:0.881: 2103it [03:14,  2.97it/s]

Epoch:2/10...   Batch:2100...   Validation Loss:0.666


Epoch:2...   Batch:2201...   Training Loss:0.768: 2201it [03:23,  2.10it/s]

Epoch:2/10...   Batch:2200...   Validation Loss:0.687


Epoch:2...   Batch:2303...   Training Loss:0.928: 2303it [03:33,  2.92it/s]

Epoch:2/10...   Batch:2300...   Validation Loss:0.693


Epoch:2...   Batch:2403...   Training Loss:0.846: 2403it [03:42,  2.89it/s]

Epoch:2/10...   Batch:2400...   Validation Loss:0.681


Epoch:2...   Batch:2503...   Training Loss:0.740: 2503it [03:51,  2.93it/s]

Epoch:2/10...   Batch:2500...   Validation Loss:0.703


Epoch:2...   Batch:2603...   Training Loss:0.785: 2603it [04:00,  2.87it/s]

Epoch:2/10...   Batch:2600...   Validation Loss:0.707


Epoch:2...   Batch:2701...   Training Loss:0.736: 2701it [04:09,  2.16it/s]

Epoch:2/10...   Batch:2700...   Validation Loss:0.670


Epoch:2...   Batch:2803...   Training Loss:0.736: 2803it [04:18,  2.88it/s]

Epoch:2/10...   Batch:2800...   Validation Loss:0.674


Epoch:2...   Batch:2903...   Training Loss:0.900: 2903it [04:28,  2.89it/s]

Epoch:2/10...   Batch:2900...   Validation Loss:0.672


Epoch:2...   Batch:3003...   Training Loss:0.782: 3003it [04:37,  2.88it/s]

Epoch:2/10...   Batch:3000...   Validation Loss:0.673


Epoch:2...   Batch:3103...   Training Loss:0.921: 3103it [04:46,  2.92it/s]

Epoch:2/10...   Batch:3100...   Validation Loss:0.686


Epoch:2...   Batch:3201...   Training Loss:0.723: 3201it [04:55,  2.18it/s]

Epoch:2/10...   Batch:3200...   Validation Loss:0.721


Epoch:2...   Batch:3303...   Training Loss:0.740: 3303it [05:04,  2.91it/s]

Epoch:2/10...   Batch:3300...   Validation Loss:0.676


Epoch:2...   Batch:3403...   Training Loss:0.801: 3403it [05:13,  2.91it/s]

Epoch:2/10...   Batch:3400...   Validation Loss:0.650


Epoch:2...   Batch:3503...   Training Loss:0.696: 3503it [05:22,  2.91it/s]

Epoch:2/10...   Batch:3500...   Validation Loss:0.662


Epoch:2...   Batch:3603...   Training Loss:0.787: 3603it [05:32,  2.90it/s]

Epoch:2/10...   Batch:3600...   Validation Loss:0.660


Epoch:2...   Batch:3701...   Training Loss:0.878: 3701it [05:40,  2.19it/s]

Epoch:2/10...   Batch:3700...   Validation Loss:0.674


Epoch:2...   Batch:3803...   Training Loss:0.814: 3803it [05:50,  2.92it/s]

Epoch:2/10...   Batch:3800...   Validation Loss:0.666


Epoch:2...   Batch:3903...   Training Loss:0.721: 3903it [05:59,  2.95it/s]

Epoch:2/10...   Batch:3900...   Validation Loss:0.662


Epoch:2...   Batch:4003...   Training Loss:0.851: 4003it [06:08,  2.88it/s]

Epoch:2/10...   Batch:4000...   Validation Loss:0.676


Epoch:2...   Batch:4103...   Training Loss:0.919: 4103it [06:17,  2.92it/s]

Epoch:2/10...   Batch:4100...   Validation Loss:0.672


Epoch:2...   Batch:4201...   Training Loss:0.866: 4201it [06:26,  2.10it/s]

Epoch:2/10...   Batch:4200...   Validation Loss:0.667


Epoch:2...   Batch:4303...   Training Loss:0.808: 4303it [06:36,  2.93it/s]

Epoch:2/10...   Batch:4300...   Validation Loss:0.669


Epoch:2...   Batch:4403...   Training Loss:0.922: 4403it [06:45,  2.86it/s]

Epoch:2/10...   Batch:4400...   Validation Loss:0.657


Epoch:2...   Batch:4503...   Training Loss:0.727: 4503it [06:54,  2.92it/s]

Epoch:2/10...   Batch:4500...   Validation Loss:0.669


Epoch:2...   Batch:4603...   Training Loss:0.924: 4603it [07:03,  2.86it/s]

Epoch:2/10...   Batch:4600...   Validation Loss:0.664


Epoch:2...   Batch:4701...   Training Loss:0.729: 4701it [07:12,  2.15it/s]

Epoch:2/10...   Batch:4700...   Validation Loss:0.687


Epoch:2...   Batch:4803...   Training Loss:0.770: 4803it [07:21,  2.88it/s]

Epoch:2/10...   Batch:4800...   Validation Loss:0.691


Epoch:2...   Batch:4903...   Training Loss:0.843: 4903it [07:30,  2.96it/s]

Epoch:2/10...   Batch:4900...   Validation Loss:0.685


Epoch:2...   Batch:5003...   Training Loss:0.794: 5003it [07:40,  2.88it/s]

Epoch:2/10...   Batch:5000...   Validation Loss:0.684


Epoch:2...   Batch:5103...   Training Loss:0.811: 5103it [07:49,  2.99it/s]

Epoch:2/10...   Batch:5100...   Validation Loss:0.674


Epoch:2...   Batch:5201...   Training Loss:0.796: 5201it [07:58,  2.12it/s]

Epoch:2/10...   Batch:5200...   Validation Loss:0.663


Epoch:2...   Batch:5303...   Training Loss:0.775: 5303it [08:07,  2.91it/s]

Epoch:2/10...   Batch:5300...   Validation Loss:0.660


Epoch:2...   Batch:5403...   Training Loss:0.743: 5403it [08:16,  2.88it/s]

Epoch:2/10...   Batch:5400...   Validation Loss:0.664


Epoch:2...   Batch:5503...   Training Loss:0.863: 5503it [08:25,  2.92it/s]

Epoch:2/10...   Batch:5500...   Validation Loss:0.660


Epoch:2...   Batch:5603...   Training Loss:0.925: 5603it [08:34,  2.94it/s]

Epoch:2/10...   Batch:5600...   Validation Loss:0.673


Epoch:2...   Batch:5701...   Training Loss:0.827: 5701it [08:43,  2.18it/s]

Epoch:2/10...   Batch:5700...   Validation Loss:0.667


Epoch:2...   Batch:5803...   Training Loss:1.106: 5803it [08:53,  2.93it/s]

Epoch:2/10...   Batch:5800...   Validation Loss:0.662


Epoch:2...   Batch:5903...   Training Loss:0.822: 5903it [09:02,  2.92it/s]

Epoch:2/10...   Batch:5900...   Validation Loss:0.648


Epoch:2...   Batch:6003...   Training Loss:0.804: 6003it [09:11,  2.92it/s]

Epoch:2/10...   Batch:6000...   Validation Loss:0.640


Epoch:2...   Batch:6103...   Training Loss:0.731: 6103it [09:20,  2.89it/s]

Epoch:2/10...   Batch:6100...   Validation Loss:0.658


Epoch:2...   Batch:6201...   Training Loss:0.782: 6201it [09:29,  2.08it/s]

Epoch:2/10...   Batch:6200...   Validation Loss:0.691


Epoch:2...   Batch:6303...   Training Loss:0.737: 6303it [09:38,  2.91it/s]

Epoch:2/10...   Batch:6300...   Validation Loss:0.695


Epoch:2...   Batch:6396...   Training Loss:0.579: 6396it [09:44, 10.94it/s]
0it [00:00, ?it/s]

Starting epoch 3


Epoch:3...   Batch:103...   Training Loss:0.784: 103it [00:11,  2.86it/s]

Epoch:3/10...   Batch:100...   Validation Loss:0.651


Epoch:3...   Batch:203...   Training Loss:0.700: 203it [00:20,  2.90it/s]

Epoch:3/10...   Batch:200...   Validation Loss:0.674


Epoch:3...   Batch:301...   Training Loss:0.750: 301it [00:29,  2.13it/s]

Epoch:3/10...   Batch:300...   Validation Loss:0.665


Epoch:3...   Batch:403...   Training Loss:0.663: 403it [00:38,  2.85it/s]

Epoch:3/10...   Batch:400...   Validation Loss:0.657


Epoch:3...   Batch:503...   Training Loss:0.814: 503it [00:48,  2.88it/s]

Epoch:3/10...   Batch:500...   Validation Loss:0.677


Epoch:3...   Batch:603...   Training Loss:0.744: 603it [00:57,  2.94it/s]

Epoch:3/10...   Batch:600...   Validation Loss:0.662


Epoch:3...   Batch:703...   Training Loss:0.825: 703it [01:06,  2.92it/s]

Epoch:3/10...   Batch:700...   Validation Loss:0.682


Epoch:3...   Batch:803...   Training Loss:0.775: 803it [01:15,  2.91it/s]

Epoch:3/10...   Batch:800...   Validation Loss:0.665


Epoch:3...   Batch:901...   Training Loss:0.857: 901it [01:24,  2.15it/s]

Epoch:3/10...   Batch:900...   Validation Loss:0.670


Epoch:3...   Batch:1003...   Training Loss:0.828: 1003it [01:33,  2.91it/s]

Epoch:3/10...   Batch:1000...   Validation Loss:0.681


Epoch:3...   Batch:1103...   Training Loss:0.705: 1103it [01:42,  2.92it/s]

Epoch:3/10...   Batch:1100...   Validation Loss:0.672


Epoch:3...   Batch:1203...   Training Loss:0.823: 1203it [01:52,  2.87it/s]

Epoch:3/10...   Batch:1200...   Validation Loss:0.686


Epoch:3...   Batch:1303...   Training Loss:0.853: 1303it [02:01,  2.89it/s]

Epoch:3/10...   Batch:1300...   Validation Loss:0.679


Epoch:3...   Batch:1401...   Training Loss:0.656: 1401it [02:10,  2.12it/s]

Epoch:3/10...   Batch:1400...   Validation Loss:0.674


Epoch:3...   Batch:1503...   Training Loss:0.663: 1503it [02:19,  2.86it/s]

Epoch:3/10...   Batch:1500...   Validation Loss:0.685


Epoch:3...   Batch:1603...   Training Loss:0.756: 1603it [02:28,  2.91it/s]

Epoch:3/10...   Batch:1600...   Validation Loss:0.670


Epoch:3...   Batch:1703...   Training Loss:0.766: 1703it [02:37,  2.83it/s]

Epoch:3/10...   Batch:1700...   Validation Loss:0.673


Epoch:3...   Batch:1803...   Training Loss:0.682: 1803it [02:47,  2.86it/s]

Epoch:3/10...   Batch:1800...   Validation Loss:0.689


Epoch:3...   Batch:1901...   Training Loss:0.817: 1901it [02:56,  2.18it/s]

Epoch:3/10...   Batch:1900...   Validation Loss:0.666


Epoch:3...   Batch:2003...   Training Loss:0.843: 2003it [03:05,  2.87it/s]

Epoch:3/10...   Batch:2000...   Validation Loss:0.674


Epoch:3...   Batch:2103...   Training Loss:0.785: 2103it [03:14,  2.96it/s]

Epoch:3/10...   Batch:2100...   Validation Loss:0.681


Epoch:3...   Batch:2203...   Training Loss:0.715: 2203it [03:23,  2.97it/s]

Epoch:3/10...   Batch:2200...   Validation Loss:0.697


Epoch:3...   Batch:2303...   Training Loss:0.784: 2303it [03:32,  2.89it/s]

Epoch:3/10...   Batch:2300...   Validation Loss:0.688


Epoch:3...   Batch:2401...   Training Loss:0.679: 2401it [03:41,  2.13it/s]

Epoch:3/10...   Batch:2400...   Validation Loss:0.670


Epoch:3...   Batch:2503...   Training Loss:0.588: 2503it [03:51,  2.91it/s]

Epoch:3/10...   Batch:2500...   Validation Loss:0.668


Epoch:3...   Batch:2603...   Training Loss:0.745: 2603it [04:00,  2.96it/s]

Epoch:3/10...   Batch:2600...   Validation Loss:0.663


Epoch:3...   Batch:2703...   Training Loss:0.722: 2703it [04:09,  2.90it/s]

Epoch:3/10...   Batch:2700...   Validation Loss:0.667


Epoch:3...   Batch:2803...   Training Loss:0.820: 2803it [04:18,  2.94it/s]

Epoch:3/10...   Batch:2800...   Validation Loss:0.688


Epoch:3...   Batch:2901...   Training Loss:0.720: 2901it [04:27,  2.11it/s]

Epoch:3/10...   Batch:2900...   Validation Loss:0.692


Epoch:3...   Batch:3003...   Training Loss:0.821: 3003it [04:36,  2.90it/s]

Epoch:3/10...   Batch:3000...   Validation Loss:0.674


Epoch:3...   Batch:3103...   Training Loss:0.707: 3103it [04:45,  2.90it/s]

Epoch:3/10...   Batch:3100...   Validation Loss:0.703


Epoch:3...   Batch:3203...   Training Loss:0.739: 3203it [04:55,  2.86it/s]

Epoch:3/10...   Batch:3200...   Validation Loss:0.694


Epoch:3...   Batch:3303...   Training Loss:0.791: 3303it [05:04,  2.89it/s]

Epoch:3/10...   Batch:3300...   Validation Loss:0.689


Epoch:3...   Batch:3401...   Training Loss:0.706: 3401it [05:13,  2.16it/s]

Epoch:3/10...   Batch:3400...   Validation Loss:0.669


Epoch:3...   Batch:3503...   Training Loss:0.701: 3503it [05:22,  3.01it/s]

Epoch:3/10...   Batch:3500...   Validation Loss:0.669


Epoch:3...   Batch:3603...   Training Loss:0.904: 3603it [05:31,  2.92it/s]

Epoch:3/10...   Batch:3600...   Validation Loss:0.664


Epoch:3...   Batch:3703...   Training Loss:0.729: 3703it [05:40,  2.87it/s]

Epoch:3/10...   Batch:3700...   Validation Loss:0.678


Epoch:3...   Batch:3803...   Training Loss:0.837: 3803it [05:49,  2.88it/s]

Epoch:3/10...   Batch:3800...   Validation Loss:0.687


Epoch:3...   Batch:3901...   Training Loss:0.818: 3901it [05:58,  2.13it/s]

Epoch:3/10...   Batch:3900...   Validation Loss:0.680


Epoch:3...   Batch:4003...   Training Loss:0.786: 4003it [06:08,  2.97it/s]

Epoch:3/10...   Batch:4000...   Validation Loss:0.676


Epoch:3...   Batch:4103...   Training Loss:0.915: 4103it [06:17,  2.91it/s]

Epoch:3/10...   Batch:4100...   Validation Loss:0.685


Epoch:3...   Batch:4203...   Training Loss:0.700: 4203it [06:26,  2.98it/s]

Epoch:3/10...   Batch:4200...   Validation Loss:0.684


Epoch:3...   Batch:4303...   Training Loss:0.803: 4303it [06:35,  2.89it/s]

Epoch:3/10...   Batch:4300...   Validation Loss:0.678


Epoch:3...   Batch:4401...   Training Loss:0.760: 4401it [06:44,  2.09it/s]

Epoch:3/10...   Batch:4400...   Validation Loss:0.682


Epoch:3...   Batch:4503...   Training Loss:0.907: 4503it [06:53,  2.90it/s]

Epoch:3/10...   Batch:4500...   Validation Loss:0.701


Epoch:3...   Batch:4603...   Training Loss:0.727: 4603it [07:03,  2.93it/s]

Epoch:3/10...   Batch:4600...   Validation Loss:0.698


Epoch:3...   Batch:4703...   Training Loss:0.733: 4703it [07:12,  2.90it/s]

Epoch:3/10...   Batch:4700...   Validation Loss:0.704


Epoch:3...   Batch:4803...   Training Loss:0.759: 4803it [07:21,  2.86it/s]

Epoch:3/10...   Batch:4800...   Validation Loss:0.701


Epoch:3...   Batch:4901...   Training Loss:0.739: 4901it [07:30,  2.13it/s]

Epoch:3/10...   Batch:4900...   Validation Loss:0.693


Epoch:3...   Batch:5003...   Training Loss:0.722: 5003it [07:39,  2.90it/s]

Epoch:3/10...   Batch:5000...   Validation Loss:0.679


Epoch:3...   Batch:5103...   Training Loss:0.767: 5103it [07:48,  2.84it/s]

Epoch:3/10...   Batch:5100...   Validation Loss:0.668


Epoch:3...   Batch:5203...   Training Loss:0.766: 5203it [07:58,  2.91it/s]

Epoch:3/10...   Batch:5200...   Validation Loss:0.683


Epoch:3...   Batch:5303...   Training Loss:0.787: 5303it [08:07,  2.97it/s]

Epoch:3/10...   Batch:5300...   Validation Loss:0.691


Epoch:3...   Batch:5403...   Training Loss:0.743: 5403it [08:16,  2.94it/s]

Epoch:3/10...   Batch:5400...   Validation Loss:0.697


Epoch:3...   Batch:5503...   Training Loss:0.745: 5503it [08:25,  3.00it/s]

Epoch:3/10...   Batch:5500...   Validation Loss:0.689


Epoch:3...   Batch:5601...   Training Loss:0.790: 5601it [08:34,  2.12it/s]

Epoch:3/10...   Batch:5600...   Validation Loss:0.687


Epoch:3...   Batch:5703...   Training Loss:0.969: 5703it [08:43,  2.89it/s]

Epoch:3/10...   Batch:5700...   Validation Loss:0.701


Epoch:3...   Batch:5803...   Training Loss:0.862: 5803it [08:52,  2.92it/s]

Epoch:3/10...   Batch:5800...   Validation Loss:0.697


Epoch:3...   Batch:5903...   Training Loss:0.913: 5903it [09:01,  2.95it/s]

Epoch:3/10...   Batch:5900...   Validation Loss:0.682


Epoch:3...   Batch:6003...   Training Loss:0.734: 6003it [09:10,  2.94it/s]

Epoch:3/10...   Batch:6000...   Validation Loss:0.669


Epoch:3...   Batch:6101...   Training Loss:0.831: 6101it [09:19,  2.22it/s]

Epoch:3/10...   Batch:6100...   Validation Loss:0.686


Epoch:3...   Batch:6203...   Training Loss:0.835: 6203it [09:28,  2.99it/s]

Epoch:3/10...   Batch:6200...   Validation Loss:0.702


Epoch:3...   Batch:6303...   Training Loss:0.741: 6303it [09:37,  2.91it/s]

Epoch:3/10...   Batch:6300...   Validation Loss:0.700


Epoch:3...   Batch:6396...   Training Loss:0.934: 6396it [09:44, 10.95it/s]
0it [00:00, ?it/s]

Starting epoch 4


Epoch:4...   Batch:103...   Training Loss:0.645: 103it [00:11,  2.99it/s]

Epoch:4/10...   Batch:100...   Validation Loss:0.727


Epoch:4...   Batch:203...   Training Loss:0.554: 203it [00:20,  2.97it/s]

Epoch:4/10...   Batch:200...   Validation Loss:0.749


Epoch:4...   Batch:303...   Training Loss:0.545: 303it [00:29,  3.04it/s]

Epoch:4/10...   Batch:300...   Validation Loss:0.741


Epoch:4...   Batch:403...   Training Loss:0.533: 403it [00:37,  3.08it/s]

Epoch:4/10...   Batch:400...   Validation Loss:0.734


Epoch:4...   Batch:503...   Training Loss:0.742: 503it [00:46,  3.04it/s]

Epoch:4/10...   Batch:500...   Validation Loss:0.722


Epoch:4...   Batch:603...   Training Loss:0.698: 603it [00:55,  3.05it/s]

Epoch:4/10...   Batch:600...   Validation Loss:0.732


Epoch:4...   Batch:701...   Training Loss:0.637: 701it [01:04,  2.22it/s]

Epoch:4/10...   Batch:700...   Validation Loss:0.755


Epoch:4...   Batch:803...   Training Loss:0.666: 803it [01:13,  2.99it/s]

Epoch:4/10...   Batch:800...   Validation Loss:0.753


Epoch:4...   Batch:903...   Training Loss:0.704: 903it [01:22,  3.01it/s]

Epoch:4/10...   Batch:900...   Validation Loss:0.741


Epoch:4...   Batch:1003...   Training Loss:0.764: 1003it [01:31,  3.02it/s]

Epoch:4/10...   Batch:1000...   Validation Loss:0.718


Epoch:4...   Batch:1103...   Training Loss:0.672: 1103it [01:40,  2.96it/s]

Epoch:4/10...   Batch:1100...   Validation Loss:0.715


Epoch:4...   Batch:1203...   Training Loss:0.671: 1203it [01:49,  3.03it/s]

Epoch:4/10...   Batch:1200...   Validation Loss:0.718


Epoch:4...   Batch:1303...   Training Loss:0.651: 1303it [01:58,  2.98it/s]

Epoch:4/10...   Batch:1300...   Validation Loss:0.703


Epoch:4...   Batch:1403...   Training Loss:0.726: 1403it [02:07,  3.03it/s]

Epoch:4/10...   Batch:1400...   Validation Loss:0.726


Epoch:4...   Batch:1503...   Training Loss:0.634: 1503it [02:16,  2.98it/s]

Epoch:4/10...   Batch:1500...   Validation Loss:0.712


Epoch:4...   Batch:1603...   Training Loss:0.696: 1603it [02:25,  2.97it/s]

Epoch:4/10...   Batch:1600...   Validation Loss:0.734


Epoch:4...   Batch:1701...   Training Loss:0.747: 1701it [02:34,  2.13it/s]

Epoch:4/10...   Batch:1700...   Validation Loss:0.730


Epoch:4...   Batch:1803...   Training Loss:0.659: 1803it [02:43,  2.91it/s]

Epoch:4/10...   Batch:1800...   Validation Loss:0.742


Epoch:4...   Batch:1903...   Training Loss:0.693: 1903it [02:52,  2.96it/s]

Epoch:4/10...   Batch:1900...   Validation Loss:0.729


Epoch:4...   Batch:2003...   Training Loss:0.769: 2003it [03:01,  2.99it/s]

Epoch:4/10...   Batch:2000...   Validation Loss:0.733


Epoch:4...   Batch:2103...   Training Loss:0.710: 2103it [03:10,  2.97it/s]

Epoch:4/10...   Batch:2100...   Validation Loss:0.712


Epoch:4...   Batch:2201...   Training Loss:0.644: 2201it [03:19,  2.18it/s]

Epoch:4/10...   Batch:2200...   Validation Loss:0.712


Epoch:4...   Batch:2303...   Training Loss:0.630: 2303it [03:29,  2.92it/s]

Epoch:4/10...   Batch:2300...   Validation Loss:0.717


Epoch:4...   Batch:2403...   Training Loss:0.684: 2403it [03:38,  2.95it/s]

Epoch:4/10...   Batch:2400...   Validation Loss:0.731


Epoch:4...   Batch:2503...   Training Loss:0.787: 2503it [03:47,  2.98it/s]

Epoch:4/10...   Batch:2500...   Validation Loss:0.721


Epoch:4...   Batch:2603...   Training Loss:0.628: 2603it [03:56,  2.93it/s]

Epoch:4/10...   Batch:2600...   Validation Loss:0.743


Epoch:4...   Batch:2701...   Training Loss:0.681: 2701it [04:05,  2.12it/s]

Epoch:4/10...   Batch:2700...   Validation Loss:0.729


Epoch:4...   Batch:2803...   Training Loss:0.789: 2803it [04:14,  2.99it/s]

Epoch:4/10...   Batch:2800...   Validation Loss:0.709


Epoch:4...   Batch:2903...   Training Loss:0.770: 2903it [04:23,  2.97it/s]

Epoch:4/10...   Batch:2900...   Validation Loss:0.697


Epoch:4...   Batch:3003...   Training Loss:0.626: 3003it [04:32,  2.92it/s]

Epoch:4/10...   Batch:3000...   Validation Loss:0.696


Epoch:4...   Batch:3101...   Training Loss:0.662: 3101it [04:41,  2.19it/s]

Epoch:4/10...   Batch:3100...   Validation Loss:0.717


Epoch:4...   Batch:3201...   Training Loss:0.631: 3201it [04:51,  2.11it/s]

Epoch:4/10...   Batch:3200...   Validation Loss:0.712


Epoch:4...   Batch:3303...   Training Loss:0.610: 3303it [05:00,  2.88it/s]

Epoch:4/10...   Batch:3300...   Validation Loss:0.710


Epoch:4...   Batch:3403...   Training Loss:0.669: 3403it [05:09,  3.02it/s]

Epoch:4/10...   Batch:3400...   Validation Loss:0.712


Epoch:4...   Batch:3503...   Training Loss:0.615: 3503it [05:18,  2.88it/s]

Epoch:4/10...   Batch:3500...   Validation Loss:0.698


Epoch:4...   Batch:3603...   Training Loss:0.819: 3603it [05:27,  2.93it/s]

Epoch:4/10...   Batch:3600...   Validation Loss:0.711


Epoch:4...   Batch:3701...   Training Loss:0.749: 3701it [05:36,  2.16it/s]

Epoch:4/10...   Batch:3700...   Validation Loss:0.724


Epoch:4...   Batch:3803...   Training Loss:0.635: 3803it [05:46,  2.91it/s]

Epoch:4/10...   Batch:3800...   Validation Loss:0.714


Epoch:4...   Batch:3903...   Training Loss:0.634: 3903it [05:55,  2.92it/s]

Epoch:4/10...   Batch:3900...   Validation Loss:0.695


Epoch:4...   Batch:4003...   Training Loss:0.772: 4003it [06:04,  2.90it/s]

Epoch:4/10...   Batch:4000...   Validation Loss:0.709


Epoch:4...   Batch:4103...   Training Loss:0.626: 4103it [06:13,  2.92it/s]

Epoch:4/10...   Batch:4100...   Validation Loss:0.711


Epoch:4...   Batch:4203...   Training Loss:0.623: 4203it [06:22,  2.91it/s]

Epoch:4/10...   Batch:4200...   Validation Loss:0.708


Epoch:4...   Batch:4301...   Training Loss:0.774: 4301it [06:31,  2.09it/s]

Epoch:4/10...   Batch:4300...   Validation Loss:0.717


Epoch:4...   Batch:4403...   Training Loss:0.665: 4403it [06:41,  2.88it/s]

Epoch:4/10...   Batch:4400...   Validation Loss:0.726


Epoch:4...   Batch:4503...   Training Loss:0.867: 4503it [06:50,  2.93it/s]

Epoch:4/10...   Batch:4500...   Validation Loss:0.720


Epoch:4...   Batch:4603...   Training Loss:0.715: 4603it [06:59,  2.92it/s]

Epoch:4/10...   Batch:4600...   Validation Loss:0.724


Epoch:4...   Batch:4703...   Training Loss:0.641: 4703it [07:08,  2.85it/s]

Epoch:4/10...   Batch:4700...   Validation Loss:0.701


Epoch:4...   Batch:4801...   Training Loss:0.791: 4801it [07:17,  2.16it/s]

Epoch:4/10...   Batch:4800...   Validation Loss:0.687


Epoch:4...   Batch:4903...   Training Loss:0.850: 4903it [07:27,  2.86it/s]

Epoch:4/10...   Batch:4900...   Validation Loss:0.693


Epoch:4...   Batch:5003...   Training Loss:0.675: 5003it [07:36,  2.92it/s]

Epoch:4/10...   Batch:5000...   Validation Loss:0.690


Epoch:4...   Batch:5103...   Training Loss:0.732: 5103it [07:45,  2.90it/s]

Epoch:4/10...   Batch:5100...   Validation Loss:0.721


Epoch:4...   Batch:5203...   Training Loss:0.783: 5203it [07:54,  2.97it/s]

Epoch:4/10...   Batch:5200...   Validation Loss:0.709


Epoch:4...   Batch:5301...   Training Loss:0.780: 5301it [08:03,  2.16it/s]

Epoch:4/10...   Batch:5300...   Validation Loss:0.722


Epoch:4...   Batch:5403...   Training Loss:0.700: 5403it [08:12,  2.93it/s]

Epoch:4/10...   Batch:5400...   Validation Loss:0.722


Epoch:4...   Batch:5503...   Training Loss:0.787: 5503it [08:21,  2.94it/s]

Epoch:4/10...   Batch:5500...   Validation Loss:0.692


Epoch:4...   Batch:5603...   Training Loss:0.894: 5603it [08:31,  2.96it/s]

Epoch:4/10...   Batch:5600...   Validation Loss:0.694


Epoch:4...   Batch:5703...   Training Loss:0.694: 5703it [08:40,  2.93it/s]

Epoch:4/10...   Batch:5700...   Validation Loss:0.704


Epoch:4...   Batch:5803...   Training Loss:0.646: 5803it [08:49,  3.02it/s]

Epoch:4/10...   Batch:5800...   Validation Loss:0.697


Epoch:4...   Batch:5903...   Training Loss:0.755: 5903it [08:58,  3.06it/s]

Epoch:4/10...   Batch:5900...   Validation Loss:0.711


Epoch:4...   Batch:6003...   Training Loss:0.707: 6003it [09:06,  3.01it/s]

Epoch:4/10...   Batch:6000...   Validation Loss:0.693


Epoch:4...   Batch:6101...   Training Loss:0.736: 6101it [09:15,  2.18it/s]

Epoch:4/10...   Batch:6100...   Validation Loss:0.687


Epoch:4...   Batch:6203...   Training Loss:0.652: 6203it [09:25,  2.93it/s]

Epoch:4/10...   Batch:6200...   Validation Loss:0.699


Epoch:4...   Batch:6303...   Training Loss:0.660: 6303it [09:34,  2.91it/s]

Epoch:4/10...   Batch:6300...   Validation Loss:0.694


Epoch:4...   Batch:6396...   Training Loss:0.560: 6396it [09:40, 11.02it/s]
0it [00:00, ?it/s]

Starting epoch 5


Epoch:5...   Batch:103...   Training Loss:0.685: 103it [00:11,  2.91it/s]

Epoch:5/10...   Batch:100...   Validation Loss:0.707


Epoch:5...   Batch:201...   Training Loss:0.524: 201it [00:20,  2.12it/s]

Epoch:5/10...   Batch:200...   Validation Loss:0.715


Epoch:5...   Batch:303...   Training Loss:0.513: 303it [00:29,  2.96it/s]

Epoch:5/10...   Batch:300...   Validation Loss:0.710


Epoch:5...   Batch:403...   Training Loss:0.655: 403it [00:38,  2.95it/s]

Epoch:5/10...   Batch:400...   Validation Loss:0.724


Epoch:5...   Batch:503...   Training Loss:0.509: 503it [00:47,  2.90it/s]

Epoch:5/10...   Batch:500...   Validation Loss:0.750


Epoch:5...   Batch:603...   Training Loss:0.583: 603it [00:56,  2.91it/s]

Epoch:5/10...   Batch:600...   Validation Loss:0.759


Epoch:5...   Batch:701...   Training Loss:0.583: 701it [01:05,  2.14it/s]

Epoch:5/10...   Batch:700...   Validation Loss:0.724


Epoch:5...   Batch:803...   Training Loss:0.565: 803it [01:15,  2.87it/s]

Epoch:5/10...   Batch:800...   Validation Loss:0.752


Epoch:5...   Batch:903...   Training Loss:0.596: 903it [01:24,  2.89it/s]

Epoch:5/10...   Batch:900...   Validation Loss:0.754


Epoch:5...   Batch:1003...   Training Loss:0.617: 1003it [01:33,  2.94it/s]

Epoch:5/10...   Batch:1000...   Validation Loss:0.758


Epoch:5...   Batch:1103...   Training Loss:0.773: 1103it [01:42,  2.93it/s]

Epoch:5/10...   Batch:1100...   Validation Loss:0.783


Epoch:5...   Batch:1201...   Training Loss:0.478: 1201it [01:51,  2.18it/s]

Epoch:5/10...   Batch:1200...   Validation Loss:0.779


Epoch:5...   Batch:1303...   Training Loss:0.615: 1303it [02:01,  2.88it/s]

Epoch:5/10...   Batch:1300...   Validation Loss:0.802


Epoch:5...   Batch:1403...   Training Loss:0.581: 1403it [02:10,  2.89it/s]

Epoch:5/10...   Batch:1400...   Validation Loss:0.802


Epoch:5...   Batch:1503...   Training Loss:0.663: 1503it [02:19,  2.92it/s]

Epoch:5/10...   Batch:1500...   Validation Loss:0.783


Epoch:5...   Batch:1603...   Training Loss:0.625: 1603it [02:28,  2.91it/s]

Epoch:5/10...   Batch:1600...   Validation Loss:0.763


Epoch:5...   Batch:1703...   Training Loss:0.633: 1703it [02:37,  2.98it/s]

Epoch:5/10...   Batch:1700...   Validation Loss:0.777


Epoch:5...   Batch:1801...   Training Loss:0.696: 1801it [02:46,  2.16it/s]

Epoch:5/10...   Batch:1800...   Validation Loss:0.760


Epoch:5...   Batch:1903...   Training Loss:0.588: 1903it [02:56,  2.91it/s]

Epoch:5/10...   Batch:1900...   Validation Loss:0.758


Epoch:5...   Batch:2003...   Training Loss:0.625: 2003it [03:05,  2.88it/s]

Epoch:5/10...   Batch:2000...   Validation Loss:0.783


Epoch:5...   Batch:2103...   Training Loss:0.581: 2103it [03:14,  2.94it/s]

Epoch:5/10...   Batch:2100...   Validation Loss:0.808


Epoch:5...   Batch:2203...   Training Loss:0.718: 2203it [03:23,  2.91it/s]

Epoch:5/10...   Batch:2200...   Validation Loss:0.813


Epoch:5...   Batch:2301...   Training Loss:0.588: 2301it [03:32,  2.11it/s]

Epoch:5/10...   Batch:2300...   Validation Loss:0.790


Epoch:5...   Batch:2403...   Training Loss:0.713: 2403it [03:42,  2.94it/s]

Epoch:5/10...   Batch:2400...   Validation Loss:0.762


Epoch:5...   Batch:2503...   Training Loss:0.652: 2503it [03:51,  2.91it/s]

Epoch:5/10...   Batch:2500...   Validation Loss:0.757


Epoch:5...   Batch:2603...   Training Loss:0.667: 2603it [04:00,  2.91it/s]

Epoch:5/10...   Batch:2600...   Validation Loss:0.748


Epoch:5...   Batch:2703...   Training Loss:0.680: 2703it [04:09,  2.94it/s]

Epoch:5/10...   Batch:2700...   Validation Loss:0.770


Epoch:5...   Batch:2801...   Training Loss:0.777: 2801it [04:18,  2.13it/s]

Epoch:5/10...   Batch:2800...   Validation Loss:0.806


Epoch:5...   Batch:2903...   Training Loss:0.676: 2903it [04:27,  2.91it/s]

Epoch:5/10...   Batch:2900...   Validation Loss:0.797


Epoch:5...   Batch:3003...   Training Loss:0.648: 3003it [04:37,  2.90it/s]

Epoch:5/10...   Batch:3000...   Validation Loss:0.804


Epoch:5...   Batch:3103...   Training Loss:0.495: 3103it [04:46,  2.92it/s]

Epoch:5/10...   Batch:3100...   Validation Loss:0.794


Epoch:5...   Batch:3203...   Training Loss:0.726: 3203it [04:55,  2.86it/s]

Epoch:5/10...   Batch:3200...   Validation Loss:0.755


Epoch:5...   Batch:3301...   Training Loss:0.591: 3301it [05:04,  2.14it/s]

Epoch:5/10...   Batch:3300...   Validation Loss:0.744


Epoch:5...   Batch:3403...   Training Loss:0.746: 3403it [05:13,  2.91it/s]

Epoch:5/10...   Batch:3400...   Validation Loss:0.750


Epoch:5...   Batch:3503...   Training Loss:0.482: 3503it [05:23,  2.93it/s]

Epoch:5/10...   Batch:3500...   Validation Loss:0.755


Epoch:5...   Batch:3603...   Training Loss:0.709: 3603it [05:32,  2.91it/s]

Epoch:5/10...   Batch:3600...   Validation Loss:0.749


Epoch:5...   Batch:3703...   Training Loss:0.701: 3703it [05:41,  2.91it/s]

Epoch:5/10...   Batch:3700...   Validation Loss:0.730


Epoch:5...   Batch:3801...   Training Loss:0.688: 3801it [05:50,  2.14it/s]

Epoch:5/10...   Batch:3800...   Validation Loss:0.742


Epoch:5...   Batch:3903...   Training Loss:0.627: 3903it [05:59,  2.98it/s]

Epoch:5/10...   Batch:3900...   Validation Loss:0.777


Epoch:5...   Batch:4003...   Training Loss:0.623: 4003it [06:08,  2.93it/s]

Epoch:5/10...   Batch:4000...   Validation Loss:0.770


Epoch:5...   Batch:4103...   Training Loss:0.756: 4103it [06:18,  2.90it/s]

Epoch:5/10...   Batch:4100...   Validation Loss:0.744


Epoch:5...   Batch:4203...   Training Loss:0.747: 4203it [06:27,  2.93it/s]

Epoch:5/10...   Batch:4200...   Validation Loss:0.741


Epoch:5...   Batch:4301...   Training Loss:0.635: 4301it [06:36,  2.20it/s]

Epoch:5/10...   Batch:4300...   Validation Loss:0.738


Epoch:5...   Batch:4403...   Training Loss:0.669: 4403it [06:45,  2.93it/s]

Epoch:5/10...   Batch:4400...   Validation Loss:0.738


Epoch:5...   Batch:4503...   Training Loss:0.609: 4503it [06:54,  2.90it/s]

Epoch:5/10...   Batch:4500...   Validation Loss:0.718


Epoch:5...   Batch:4603...   Training Loss:0.554: 4603it [07:03,  2.92it/s]

Epoch:5/10...   Batch:4600...   Validation Loss:0.721


Epoch:5...   Batch:4703...   Training Loss:0.724: 4703it [07:12,  2.93it/s]

Epoch:5/10...   Batch:4700...   Validation Loss:0.752


Epoch:5...   Batch:4803...   Training Loss:0.737: 4803it [07:21,  2.97it/s]

Epoch:5/10...   Batch:4800...   Validation Loss:0.730


Epoch:5...   Batch:4901...   Training Loss:0.700: 4901it [07:30,  2.18it/s]

Epoch:5/10...   Batch:4900...   Validation Loss:0.724


Epoch:5...   Batch:5003...   Training Loss:0.691: 5003it [07:39,  2.88it/s]

Epoch:5/10...   Batch:5000...   Validation Loss:0.723


Epoch:5...   Batch:5103...   Training Loss:0.793: 5103it [07:49,  2.94it/s]

Epoch:5/10...   Batch:5100...   Validation Loss:0.738


Epoch:5...   Batch:5203...   Training Loss:0.658: 5203it [07:58,  2.92it/s]

Epoch:5/10...   Batch:5200...   Validation Loss:0.749


Epoch:5...   Batch:5303...   Training Loss:0.588: 5303it [08:07,  2.97it/s]

Epoch:5/10...   Batch:5300...   Validation Loss:0.737


Epoch:5...   Batch:5403...   Training Loss:0.748: 5403it [08:16,  3.02it/s]

Epoch:5/10...   Batch:5400...   Validation Loss:0.736


Epoch:5...   Batch:5503...   Training Loss:0.788: 5503it [08:24,  3.04it/s]

Epoch:5/10...   Batch:5500...   Validation Loss:0.722


Epoch:5...   Batch:5603...   Training Loss:0.800: 5603it [08:34,  2.95it/s]

Epoch:5/10...   Batch:5600...   Validation Loss:0.745


Epoch:5...   Batch:5703...   Training Loss:0.826: 5703it [08:43,  3.01it/s]

Epoch:5/10...   Batch:5700...   Validation Loss:0.739


Epoch:5...   Batch:5803...   Training Loss:0.532: 5803it [08:51,  3.00it/s]

Epoch:5/10...   Batch:5800...   Validation Loss:0.727


Epoch:5...   Batch:5903...   Training Loss:0.651: 5903it [09:00,  2.99it/s]

Epoch:5/10...   Batch:5900...   Validation Loss:0.740


Epoch:5...   Batch:6003...   Training Loss:0.651: 6003it [09:10,  2.93it/s]

Epoch:5/10...   Batch:6000...   Validation Loss:0.747


Epoch:5...   Batch:6101...   Training Loss:0.720: 6101it [09:19,  2.09it/s]

Epoch:5/10...   Batch:6100...   Validation Loss:0.755


Epoch:5...   Batch:6203...   Training Loss:0.624: 6203it [09:28,  2.89it/s]

Epoch:5/10...   Batch:6200...   Validation Loss:0.756


Epoch:5...   Batch:6303...   Training Loss:0.692: 6303it [09:37,  2.85it/s]

Epoch:5/10...   Batch:6300...   Validation Loss:0.730


Epoch:5...   Batch:6396...   Training Loss:0.904: 6396it [09:43, 10.96it/s]
0it [00:00, ?it/s]

Starting epoch 6


Epoch:6...   Batch:103...   Training Loss:0.417: 103it [00:11,  2.90it/s]

Epoch:6/10...   Batch:100...   Validation Loss:0.743


Epoch:6...   Batch:201...   Training Loss:0.557: 201it [00:20,  2.12it/s]

Epoch:6/10...   Batch:200...   Validation Loss:0.748


Epoch:6...   Batch:303...   Training Loss:0.527: 303it [00:29,  2.89it/s]

Epoch:6/10...   Batch:300...   Validation Loss:0.775


Epoch:6...   Batch:403...   Training Loss:0.531: 403it [00:38,  2.91it/s]

Epoch:6/10...   Batch:400...   Validation Loss:0.765


Epoch:6...   Batch:503...   Training Loss:0.514: 503it [00:47,  2.90it/s]

Epoch:6/10...   Batch:500...   Validation Loss:0.773


Epoch:6...   Batch:603...   Training Loss:0.427: 603it [00:57,  2.90it/s]

Epoch:6/10...   Batch:600...   Validation Loss:0.802


Epoch:6...   Batch:703...   Training Loss:0.558: 703it [01:06,  2.94it/s]

Epoch:6/10...   Batch:700...   Validation Loss:0.820


Epoch:6...   Batch:801...   Training Loss:0.476: 801it [01:15,  2.12it/s]

Epoch:6/10...   Batch:800...   Validation Loss:0.808


Epoch:6...   Batch:903...   Training Loss:0.561: 903it [01:24,  2.95it/s]

Epoch:6/10...   Batch:900...   Validation Loss:0.782


Epoch:6...   Batch:1003...   Training Loss:0.573: 1003it [01:33,  2.94it/s]

Epoch:6/10...   Batch:1000...   Validation Loss:0.783


Epoch:6...   Batch:1103...   Training Loss:0.573: 1103it [01:42,  2.94it/s]

Epoch:6/10...   Batch:1100...   Validation Loss:0.811


Epoch:6...   Batch:1203...   Training Loss:0.611: 1203it [01:51,  2.92it/s]

Epoch:6/10...   Batch:1200...   Validation Loss:0.771


Epoch:6...   Batch:1303...   Training Loss:0.585: 1303it [02:00,  3.00it/s]

Epoch:6/10...   Batch:1300...   Validation Loss:0.767


Epoch:6...   Batch:1401...   Training Loss:0.763: 1401it [02:09,  2.15it/s]

Epoch:6/10...   Batch:1400...   Validation Loss:0.777


Epoch:6...   Batch:1503...   Training Loss:0.606: 1503it [02:18,  2.99it/s]

Epoch:6/10...   Batch:1500...   Validation Loss:0.717


Epoch:6...   Batch:1603...   Training Loss:0.643: 1603it [02:28,  2.90it/s]

Epoch:6/10...   Batch:1600...   Validation Loss:0.734


Epoch:6...   Batch:1703...   Training Loss:0.585: 1703it [02:37,  2.91it/s]

Epoch:6/10...   Batch:1700...   Validation Loss:0.722


Epoch:6...   Batch:1803...   Training Loss:0.610: 1803it [02:46,  2.89it/s]

Epoch:6/10...   Batch:1800...   Validation Loss:0.706


Epoch:6...   Batch:1901...   Training Loss:0.495: 1901it [02:55,  2.20it/s]

Epoch:6/10...   Batch:1900...   Validation Loss:0.717


Epoch:6...   Batch:2003...   Training Loss:0.605: 2003it [03:04,  2.88it/s]

Epoch:6/10...   Batch:2000...   Validation Loss:0.709


Epoch:6...   Batch:2103...   Training Loss:0.675: 2103it [03:13,  3.04it/s]

Epoch:6/10...   Batch:2100...   Validation Loss:0.692


Epoch:6...   Batch:2203...   Training Loss:0.580: 2203it [03:22,  2.87it/s]

Epoch:6/10...   Batch:2200...   Validation Loss:0.675


Epoch:6...   Batch:2303...   Training Loss:0.525: 2303it [03:31,  2.89it/s]

Epoch:6/10...   Batch:2300...   Validation Loss:0.685


Epoch:6...   Batch:2401...   Training Loss:0.616: 2401it [03:40,  2.15it/s]

Epoch:6/10...   Batch:2400...   Validation Loss:0.735


Epoch:6...   Batch:2503...   Training Loss:0.650: 2503it [03:50,  2.91it/s]

Epoch:6/10...   Batch:2500...   Validation Loss:0.749


Epoch:6...   Batch:2603...   Training Loss:0.597: 2603it [03:59,  2.95it/s]

Epoch:6/10...   Batch:2600...   Validation Loss:0.692


Epoch:6...   Batch:2703...   Training Loss:0.547: 2703it [04:08,  2.86it/s]

Epoch:6/10...   Batch:2700...   Validation Loss:0.701


Epoch:6...   Batch:2803...   Training Loss:0.519: 2803it [04:17,  2.92it/s]

Epoch:6/10...   Batch:2800...   Validation Loss:0.722


Epoch:6...   Batch:2901...   Training Loss:0.591: 2901it [04:26,  2.14it/s]

Epoch:6/10...   Batch:2900...   Validation Loss:0.716


Epoch:6...   Batch:3003...   Training Loss:0.531: 3003it [04:35,  2.93it/s]

Epoch:6/10...   Batch:3000...   Validation Loss:0.680


Epoch:6...   Batch:3103...   Training Loss:0.588: 3103it [04:45,  2.93it/s]

Epoch:6/10...   Batch:3100...   Validation Loss:0.699


Epoch:6...   Batch:3203...   Training Loss:0.682: 3203it [04:54,  2.90it/s]

Epoch:6/10...   Batch:3200...   Validation Loss:0.698


Epoch:6...   Batch:3303...   Training Loss:0.550: 3303it [05:03,  2.91it/s]

Epoch:6/10...   Batch:3300...   Validation Loss:0.731


Epoch:6...   Batch:3403...   Training Loss:0.601: 3403it [05:12,  2.96it/s]

Epoch:6/10...   Batch:3400...   Validation Loss:0.719


Epoch:6...   Batch:3501...   Training Loss:0.659: 3501it [05:21,  2.13it/s]

Epoch:6/10...   Batch:3500...   Validation Loss:0.712


Epoch:6...   Batch:3603...   Training Loss:0.643: 3603it [05:30,  2.94it/s]

Epoch:6/10...   Batch:3600...   Validation Loss:0.734


Epoch:6...   Batch:3703...   Training Loss:0.616: 3703it [05:39,  2.93it/s]

Epoch:6/10...   Batch:3700...   Validation Loss:0.736


Epoch:6...   Batch:3803...   Training Loss:0.587: 3803it [05:48,  2.91it/s]

Epoch:6/10...   Batch:3800...   Validation Loss:0.740


Epoch:6...   Batch:3903...   Training Loss:0.633: 3903it [05:58,  2.92it/s]

Epoch:6/10...   Batch:3900...   Validation Loss:0.771


Epoch:6...   Batch:4001...   Training Loss:0.708: 4001it [06:07,  2.16it/s]

Epoch:6/10...   Batch:4000...   Validation Loss:0.757


Epoch:6...   Batch:4103...   Training Loss:0.698: 4103it [06:16,  2.92it/s]

Epoch:6/10...   Batch:4100...   Validation Loss:0.740


Epoch:6...   Batch:4201...   Training Loss:0.473: 4201it [06:25,  2.10it/s]

Epoch:6/10...   Batch:4200...   Validation Loss:0.748


Epoch:6...   Batch:4303...   Training Loss:0.668: 4303it [06:34,  2.94it/s]

Epoch:6/10...   Batch:4300...   Validation Loss:0.759


Epoch:6...   Batch:4403...   Training Loss:0.655: 4403it [06:43,  2.89it/s]

Epoch:6/10...   Batch:4400...   Validation Loss:0.737


Epoch:6...   Batch:4501...   Training Loss:0.661: 4501it [06:52,  2.15it/s]

Epoch:6/10...   Batch:4500...   Validation Loss:0.738


Epoch:6...   Batch:4603...   Training Loss:0.655: 4603it [07:01,  2.95it/s]

Epoch:6/10...   Batch:4600...   Validation Loss:0.710


Epoch:6...   Batch:4703...   Training Loss:0.665: 4703it [07:11,  2.91it/s]

Epoch:6/10...   Batch:4700...   Validation Loss:0.698


Epoch:6...   Batch:4803...   Training Loss:0.554: 4803it [07:20,  2.94it/s]

Epoch:6/10...   Batch:4800...   Validation Loss:0.726


Epoch:6...   Batch:4903...   Training Loss:0.619: 4903it [07:29,  2.92it/s]

Epoch:6/10...   Batch:4900...   Validation Loss:0.709


Epoch:6...   Batch:5001...   Training Loss:0.815: 5001it [07:38,  2.13it/s]

Epoch:6/10...   Batch:5000...   Validation Loss:0.708


Epoch:6...   Batch:5103...   Training Loss:0.656: 5103it [07:47,  2.93it/s]

Epoch:6/10...   Batch:5100...   Validation Loss:0.719


Epoch:6...   Batch:5203...   Training Loss:0.744: 5203it [07:56,  2.91it/s]

Epoch:6/10...   Batch:5200...   Validation Loss:0.769


Epoch:6...   Batch:5303...   Training Loss:0.624: 5303it [08:05,  2.89it/s]

Epoch:6/10...   Batch:5300...   Validation Loss:0.771


Epoch:6...   Batch:5403...   Training Loss:0.704: 5403it [08:15,  2.98it/s]

Epoch:6/10...   Batch:5400...   Validation Loss:0.759


Epoch:6...   Batch:5501...   Training Loss:0.648: 5501it [08:24,  2.14it/s]

Epoch:6/10...   Batch:5500...   Validation Loss:0.753


Epoch:6...   Batch:5603...   Training Loss:0.607: 5603it [08:33,  2.91it/s]

Epoch:6/10...   Batch:5600...   Validation Loss:0.771


Epoch:6...   Batch:5703...   Training Loss:0.643: 5703it [08:42,  2.94it/s]

Epoch:6/10...   Batch:5700...   Validation Loss:0.765


Epoch:6...   Batch:5803...   Training Loss:0.667: 5803it [08:51,  2.91it/s]

Epoch:6/10...   Batch:5800...   Validation Loss:0.767


Epoch:6...   Batch:5903...   Training Loss:0.651: 5903it [09:00,  2.91it/s]

Epoch:6/10...   Batch:5900...   Validation Loss:0.778


Epoch:6...   Batch:6001...   Training Loss:0.616: 6001it [09:09,  2.17it/s]

Epoch:6/10...   Batch:6000...   Validation Loss:0.767


Epoch:6...   Batch:6103...   Training Loss:0.703: 6103it [09:18,  2.92it/s]

Epoch:6/10...   Batch:6100...   Validation Loss:0.755


Epoch:6...   Batch:6203...   Training Loss:0.731: 6203it [09:28,  2.94it/s]

Epoch:6/10...   Batch:6200...   Validation Loss:0.766


Epoch:6...   Batch:6303...   Training Loss:0.708: 6303it [09:37,  2.92it/s]

Epoch:6/10...   Batch:6300...   Validation Loss:0.794


Epoch:6...   Batch:6396...   Training Loss:0.832: 6396it [09:43, 10.96it/s]
0it [00:00, ?it/s]

Starting epoch 7


Epoch:7...   Batch:101...   Training Loss:0.372: 101it [00:11,  2.13it/s]

Epoch:7/10...   Batch:100...   Validation Loss:0.872


Epoch:7...   Batch:203...   Training Loss:0.568: 203it [00:20,  2.92it/s]

Epoch:7/10...   Batch:200...   Validation Loss:0.857


Epoch:7...   Batch:303...   Training Loss:0.609: 303it [00:29,  2.94it/s]

Epoch:7/10...   Batch:300...   Validation Loss:0.878


Epoch:7...   Batch:403...   Training Loss:0.488: 403it [00:38,  2.93it/s]

Epoch:7/10...   Batch:400...   Validation Loss:0.894


Epoch:7...   Batch:503...   Training Loss:0.619: 503it [00:47,  2.92it/s]

Epoch:7/10...   Batch:500...   Validation Loss:0.857


Epoch:7...   Batch:603...   Training Loss:0.590: 603it [00:57,  2.95it/s]

Epoch:7/10...   Batch:600...   Validation Loss:0.860


Epoch:7...   Batch:701...   Training Loss:0.475: 701it [01:06,  2.13it/s]

Epoch:7/10...   Batch:700...   Validation Loss:0.841


Epoch:7...   Batch:803...   Training Loss:0.597: 803it [01:15,  2.91it/s]

Epoch:7/10...   Batch:800...   Validation Loss:0.844


Epoch:7...   Batch:903...   Training Loss:0.487: 903it [01:24,  2.93it/s]

Epoch:7/10...   Batch:900...   Validation Loss:0.852


Epoch:7...   Batch:1003...   Training Loss:0.675: 1003it [01:33,  2.90it/s]

Epoch:7/10...   Batch:1000...   Validation Loss:0.831


Epoch:7...   Batch:1103...   Training Loss:0.605: 1103it [01:42,  2.92it/s]

Epoch:7/10...   Batch:1100...   Validation Loss:0.825


Epoch:7...   Batch:1201...   Training Loss:0.565: 1201it [01:51,  2.19it/s]

Epoch:7/10...   Batch:1200...   Validation Loss:0.848


Epoch:7...   Batch:1303...   Training Loss:0.458: 1303it [02:01,  2.92it/s]

Epoch:7/10...   Batch:1300...   Validation Loss:0.828


Epoch:7...   Batch:1403...   Training Loss:0.571: 1403it [02:10,  2.91it/s]

Epoch:7/10...   Batch:1400...   Validation Loss:0.819


Epoch:7...   Batch:1503...   Training Loss:0.626: 1503it [02:19,  2.90it/s]

Epoch:7/10...   Batch:1500...   Validation Loss:0.853


Epoch:7...   Batch:1603...   Training Loss:0.527: 1603it [02:28,  2.90it/s]

Epoch:7/10...   Batch:1600...   Validation Loss:0.852


Epoch:7...   Batch:1701...   Training Loss:0.576: 1701it [02:37,  2.17it/s]

Epoch:7/10...   Batch:1700...   Validation Loss:0.848


Epoch:7...   Batch:1803...   Training Loss:0.543: 1803it [02:46,  2.88it/s]

Epoch:7/10...   Batch:1800...   Validation Loss:0.831


Epoch:7...   Batch:1903...   Training Loss:0.557: 1903it [02:55,  2.91it/s]

Epoch:7/10...   Batch:1900...   Validation Loss:0.832


Epoch:7...   Batch:2003...   Training Loss:0.615: 2003it [03:05,  2.93it/s]

Epoch:7/10...   Batch:2000...   Validation Loss:0.828


Epoch:7...   Batch:2103...   Training Loss:0.575: 2103it [03:14,  2.95it/s]

Epoch:7/10...   Batch:2100...   Validation Loss:0.838


Epoch:7...   Batch:2201...   Training Loss:0.437: 2201it [03:23,  2.11it/s]

Epoch:7/10...   Batch:2200...   Validation Loss:0.849


Epoch:7...   Batch:2303...   Training Loss:0.409: 2303it [03:32,  2.90it/s]

Epoch:7/10...   Batch:2300...   Validation Loss:0.854


Epoch:7...   Batch:2403...   Training Loss:0.584: 2403it [03:41,  2.90it/s]

Epoch:7/10...   Batch:2400...   Validation Loss:0.887


Epoch:7...   Batch:2503...   Training Loss:0.558: 2503it [03:50,  2.91it/s]

Epoch:7/10...   Batch:2500...   Validation Loss:0.882


Epoch:7...   Batch:2603...   Training Loss:0.542: 2603it [04:00,  2.96it/s]

Epoch:7/10...   Batch:2600...   Validation Loss:0.867


Epoch:7...   Batch:2701...   Training Loss:0.474: 2701it [04:09,  2.11it/s]

Epoch:7/10...   Batch:2700...   Validation Loss:0.866


Epoch:7...   Batch:2803...   Training Loss:0.595: 2803it [04:18,  2.91it/s]

Epoch:7/10...   Batch:2800...   Validation Loss:0.816


Epoch:7...   Batch:2903...   Training Loss:0.751: 2903it [04:27,  2.94it/s]

Epoch:7/10...   Batch:2900...   Validation Loss:0.835


Epoch:7...   Batch:3003...   Training Loss:0.681: 3003it [04:36,  2.96it/s]

Epoch:7/10...   Batch:3000...   Validation Loss:0.843


Epoch:7...   Batch:3103...   Training Loss:0.478: 3103it [04:45,  2.92it/s]

Epoch:7/10...   Batch:3100...   Validation Loss:0.835


Epoch:7...   Batch:3203...   Training Loss:0.598: 3203it [04:54,  2.93it/s]

Epoch:7/10...   Batch:3200...   Validation Loss:0.854


Epoch:7...   Batch:3301...   Training Loss:0.617: 3301it [05:03,  2.12it/s]

Epoch:7/10...   Batch:3300...   Validation Loss:0.837


Epoch:7...   Batch:3403...   Training Loss:0.634: 3403it [05:13,  2.91it/s]

Epoch:7/10...   Batch:3400...   Validation Loss:0.813


Epoch:7...   Batch:3503...   Training Loss:0.543: 3503it [05:22,  2.94it/s]

Epoch:7/10...   Batch:3500...   Validation Loss:0.815


Epoch:7...   Batch:3603...   Training Loss:0.614: 3603it [05:31,  2.90it/s]

Epoch:7/10...   Batch:3600...   Validation Loss:0.779


Epoch:7...   Batch:3703...   Training Loss:0.522: 3703it [05:40,  2.89it/s]

Epoch:7/10...   Batch:3700...   Validation Loss:0.780


Epoch:7...   Batch:3801...   Training Loss:0.479: 3801it [05:49,  2.16it/s]

Epoch:7/10...   Batch:3800...   Validation Loss:0.805


Epoch:7...   Batch:3903...   Training Loss:0.532: 3903it [05:58,  2.88it/s]

Epoch:7/10...   Batch:3900...   Validation Loss:0.816


Epoch:7...   Batch:4003...   Training Loss:0.428: 4003it [06:07,  2.89it/s]

Epoch:7/10...   Batch:4000...   Validation Loss:0.778


Epoch:7...   Batch:4103...   Training Loss:0.562: 4103it [06:16,  2.90it/s]

Epoch:7/10...   Batch:4100...   Validation Loss:0.786


Epoch:7...   Batch:4203...   Training Loss:0.450: 4203it [06:26,  2.94it/s]

Epoch:7/10...   Batch:4200...   Validation Loss:0.814


Epoch:7...   Batch:4303...   Training Loss:0.591: 4303it [06:35,  2.91it/s]

Epoch:7/10...   Batch:4300...   Validation Loss:0.759


Epoch:7...   Batch:4403...   Training Loss:0.646: 4403it [06:44,  2.90it/s]

Epoch:7/10...   Batch:4400...   Validation Loss:0.739


Epoch:7...   Batch:4503...   Training Loss:0.547: 4503it [06:53,  2.89it/s]

Epoch:7/10...   Batch:4500...   Validation Loss:0.723


Epoch:7...   Batch:4603...   Training Loss:0.606: 4603it [07:02,  2.92it/s]

Epoch:7/10...   Batch:4600...   Validation Loss:0.741


Epoch:7...   Batch:4703...   Training Loss:0.580: 4703it [07:11,  2.94it/s]

Epoch:7/10...   Batch:4700...   Validation Loss:0.744


Epoch:7...   Batch:4803...   Training Loss:0.631: 4803it [07:20,  2.93it/s]

Epoch:7/10...   Batch:4800...   Validation Loss:0.735


Epoch:7...   Batch:4901...   Training Loss:0.565: 4901it [07:29,  2.17it/s]

Epoch:7/10...   Batch:4900...   Validation Loss:0.722


Epoch:7...   Batch:5003...   Training Loss:0.619: 5003it [07:38,  2.92it/s]

Epoch:7/10...   Batch:5000...   Validation Loss:0.746


Epoch:7...   Batch:5103...   Training Loss:0.652: 5103it [07:48,  2.90it/s]

Epoch:7/10...   Batch:5100...   Validation Loss:0.777


Epoch:7...   Batch:5203...   Training Loss:0.642: 5203it [07:57,  2.92it/s]

Epoch:7/10...   Batch:5200...   Validation Loss:0.795


Epoch:7...   Batch:5303...   Training Loss:0.680: 5303it [08:06,  2.94it/s]

Epoch:7/10...   Batch:5300...   Validation Loss:0.806


Epoch:7...   Batch:5401...   Training Loss:0.706: 5401it [08:15,  2.16it/s]

Epoch:7/10...   Batch:5400...   Validation Loss:0.790


Epoch:7...   Batch:5503...   Training Loss:0.614: 5503it [08:24,  2.90it/s]

Epoch:7/10...   Batch:5500...   Validation Loss:0.797


Epoch:7...   Batch:5603...   Training Loss:0.629: 5603it [08:33,  2.92it/s]

Epoch:7/10...   Batch:5600...   Validation Loss:0.781


Epoch:7...   Batch:5703...   Training Loss:0.553: 5703it [08:42,  2.91it/s]

Epoch:7/10...   Batch:5700...   Validation Loss:0.751


Epoch:7...   Batch:5803...   Training Loss:0.675: 5803it [08:51,  2.94it/s]

Epoch:7/10...   Batch:5800...   Validation Loss:0.788


Epoch:7...   Batch:5901...   Training Loss:0.679: 5901it [09:00,  2.13it/s]

Epoch:7/10...   Batch:5900...   Validation Loss:0.767


Epoch:7...   Batch:6003...   Training Loss:0.593: 6003it [09:10,  2.90it/s]

Epoch:7/10...   Batch:6000...   Validation Loss:0.792


Epoch:7...   Batch:6103...   Training Loss:0.643: 6103it [09:19,  2.88it/s]

Epoch:7/10...   Batch:6100...   Validation Loss:0.795


Epoch:7...   Batch:6203...   Training Loss:0.642: 6203it [09:28,  2.86it/s]

Epoch:7/10...   Batch:6200...   Validation Loss:0.789


Epoch:7...   Batch:6303...   Training Loss:0.619: 6303it [09:37,  2.92it/s]

Epoch:7/10...   Batch:6300...   Validation Loss:0.808


Epoch:7...   Batch:6396...   Training Loss:0.748: 6396it [09:43, 10.96it/s]
0it [00:00, ?it/s]

Starting epoch 8


Epoch:8...   Batch:101...   Training Loss:0.538: 101it [00:11,  2.14it/s]

Epoch:8/10...   Batch:100...   Validation Loss:0.859


Epoch:8...   Batch:203...   Training Loss:0.460: 203it [00:20,  2.90it/s]

Epoch:8/10...   Batch:200...   Validation Loss:0.858


Epoch:8...   Batch:303...   Training Loss:0.597: 303it [00:29,  2.89it/s]

Epoch:8/10...   Batch:300...   Validation Loss:0.894


Epoch:8...   Batch:403...   Training Loss:0.502: 403it [00:38,  2.92it/s]

Epoch:8/10...   Batch:400...   Validation Loss:0.882


Epoch:8...   Batch:503...   Training Loss:0.466: 503it [00:47,  2.94it/s]

Epoch:8/10...   Batch:500...   Validation Loss:0.881


Epoch:8...   Batch:603...   Training Loss:0.544: 603it [00:57,  2.90it/s]

Epoch:8/10...   Batch:600...   Validation Loss:0.897


Epoch:8...   Batch:703...   Training Loss:0.454: 703it [01:06,  2.92it/s]

Epoch:8/10...   Batch:700...   Validation Loss:0.900


Epoch:8...   Batch:803...   Training Loss:0.479: 803it [01:15,  2.86it/s]

Epoch:8/10...   Batch:800...   Validation Loss:0.842


Epoch:8...   Batch:903...   Training Loss:0.452: 903it [01:24,  2.89it/s]

Epoch:8/10...   Batch:900...   Validation Loss:0.841


Epoch:8...   Batch:1003...   Training Loss:0.603: 1003it [01:33,  2.91it/s]

Epoch:8/10...   Batch:1000...   Validation Loss:0.847


Epoch:8...   Batch:1101...   Training Loss:0.624: 1101it [01:42,  2.12it/s]

Epoch:8/10...   Batch:1100...   Validation Loss:0.822


Epoch:8...   Batch:1203...   Training Loss:0.448: 1203it [01:51,  2.95it/s]

Epoch:8/10...   Batch:1200...   Validation Loss:0.845


Epoch:8...   Batch:1303...   Training Loss:0.521: 1303it [02:01,  2.94it/s]

Epoch:8/10...   Batch:1300...   Validation Loss:0.864


Epoch:8...   Batch:1403...   Training Loss:0.610: 1403it [02:10,  2.92it/s]

Epoch:8/10...   Batch:1400...   Validation Loss:0.882


Epoch:8...   Batch:1503...   Training Loss:0.477: 1503it [02:19,  2.93it/s]

Epoch:8/10...   Batch:1500...   Validation Loss:0.821


Epoch:8...   Batch:1601...   Training Loss:0.567: 1601it [02:28,  2.11it/s]

Epoch:8/10...   Batch:1600...   Validation Loss:0.859


Epoch:8...   Batch:1703...   Training Loss:0.485: 1703it [02:37,  2.92it/s]

Epoch:8/10...   Batch:1700...   Validation Loss:0.834


Epoch:8...   Batch:1803...   Training Loss:0.492: 1803it [02:46,  2.94it/s]

Epoch:8/10...   Batch:1800...   Validation Loss:0.836


Epoch:8...   Batch:1903...   Training Loss:0.558: 1903it [02:55,  2.95it/s]

Epoch:8/10...   Batch:1900...   Validation Loss:0.870


Epoch:8...   Batch:2003...   Training Loss:0.482: 2003it [03:04,  2.93it/s]

Epoch:8/10...   Batch:2000...   Validation Loss:0.836


Epoch:8...   Batch:2103...   Training Loss:0.507: 2103it [03:13,  2.95it/s]

Epoch:8/10...   Batch:2100...   Validation Loss:0.833


Epoch:8...   Batch:2201...   Training Loss:0.629: 2201it [03:22,  2.13it/s]

Epoch:8/10...   Batch:2200...   Validation Loss:0.836


Epoch:8...   Batch:2303...   Training Loss:0.666: 2303it [03:32,  2.89it/s]

Epoch:8/10...   Batch:2300...   Validation Loss:0.830


Epoch:8...   Batch:2403...   Training Loss:0.686: 2403it [03:41,  2.93it/s]

Epoch:8/10...   Batch:2400...   Validation Loss:0.825


Epoch:8...   Batch:2503...   Training Loss:0.596: 2503it [03:50,  2.92it/s]

Epoch:8/10...   Batch:2500...   Validation Loss:0.823


Epoch:8...   Batch:2603...   Training Loss:0.733: 2603it [03:59,  3.00it/s]

Epoch:8/10...   Batch:2600...   Validation Loss:0.785


Epoch:8...   Batch:2701...   Training Loss:0.570: 2701it [04:08,  2.14it/s]

Epoch:8/10...   Batch:2700...   Validation Loss:0.812


Epoch:8...   Batch:2803...   Training Loss:0.517: 2803it [04:17,  2.95it/s]

Epoch:8/10...   Batch:2800...   Validation Loss:0.832


Epoch:8...   Batch:2903...   Training Loss:0.492: 2903it [04:26,  2.89it/s]

Epoch:8/10...   Batch:2900...   Validation Loss:0.846


Epoch:8...   Batch:3003...   Training Loss:0.575: 3003it [04:36,  2.92it/s]

Epoch:8/10...   Batch:3000...   Validation Loss:0.880


Epoch:8...   Batch:3103...   Training Loss:0.525: 3103it [04:45,  2.90it/s]

Epoch:8/10...   Batch:3100...   Validation Loss:0.861


Epoch:8...   Batch:3201...   Training Loss:0.580: 3201it [04:54,  2.18it/s]

Epoch:8/10...   Batch:3200...   Validation Loss:0.853


Epoch:8...   Batch:3303...   Training Loss:0.620: 3303it [05:03,  2.91it/s]

Epoch:8/10...   Batch:3300...   Validation Loss:0.810


Epoch:8...   Batch:3403...   Training Loss:0.527: 3403it [05:12,  2.96it/s]

Epoch:8/10...   Batch:3400...   Validation Loss:0.803


Epoch:8...   Batch:3503...   Training Loss:0.522: 3503it [05:21,  2.95it/s]

Epoch:8/10...   Batch:3500...   Validation Loss:0.785


Epoch:8...   Batch:3603...   Training Loss:0.628: 3603it [05:30,  2.95it/s]

Epoch:8/10...   Batch:3600...   Validation Loss:0.771


Epoch:8...   Batch:3701...   Training Loss:0.586: 3701it [05:39,  2.16it/s]

Epoch:8/10...   Batch:3700...   Validation Loss:0.765


Epoch:8...   Batch:3803...   Training Loss:0.561: 3803it [05:49,  2.92it/s]

Epoch:8/10...   Batch:3800...   Validation Loss:0.751


Epoch:8...   Batch:3903...   Training Loss:0.617: 3903it [05:58,  2.92it/s]

Epoch:8/10...   Batch:3900...   Validation Loss:0.773


Epoch:8...   Batch:4003...   Training Loss:0.585: 4003it [06:07,  2.91it/s]

Epoch:8/10...   Batch:4000...   Validation Loss:0.779


Epoch:8...   Batch:4103...   Training Loss:0.688: 4103it [06:16,  2.86it/s]

Epoch:8/10...   Batch:4100...   Validation Loss:0.773


Epoch:8...   Batch:4201...   Training Loss:0.538: 4201it [06:25,  2.14it/s]

Epoch:8/10...   Batch:4200...   Validation Loss:0.796


Epoch:8...   Batch:4303...   Training Loss:0.518: 4303it [06:34,  2.90it/s]

Epoch:8/10...   Batch:4300...   Validation Loss:0.763


Epoch:8...   Batch:4403...   Training Loss:0.552: 4403it [06:44,  2.89it/s]

Epoch:8/10...   Batch:4400...   Validation Loss:0.786


Epoch:8...   Batch:4503...   Training Loss:0.632: 4503it [06:53,  2.84it/s]

Epoch:8/10...   Batch:4500...   Validation Loss:0.810


Epoch:8...   Batch:4603...   Training Loss:0.615: 4603it [07:02,  2.88it/s]

Epoch:8/10...   Batch:4600...   Validation Loss:0.791


Epoch:8...   Batch:4701...   Training Loss:0.547: 4701it [07:11,  2.16it/s]

Epoch:8/10...   Batch:4700...   Validation Loss:0.784


Epoch:8...   Batch:4803...   Training Loss:0.632: 4803it [07:20,  2.95it/s]

Epoch:8/10...   Batch:4800...   Validation Loss:0.784


Epoch:8...   Batch:4903...   Training Loss:0.647: 4903it [07:29,  2.93it/s]

Epoch:8/10...   Batch:4900...   Validation Loss:0.760


Epoch:8...   Batch:5003...   Training Loss:0.590: 5003it [07:38,  2.94it/s]

Epoch:8/10...   Batch:5000...   Validation Loss:0.775


Epoch:8...   Batch:5103...   Training Loss:0.622: 5103it [07:47,  2.93it/s]

Epoch:8/10...   Batch:5100...   Validation Loss:0.766


Epoch:8...   Batch:5201...   Training Loss:0.683: 5201it [07:56,  2.16it/s]

Epoch:8/10...   Batch:5200...   Validation Loss:0.755


Epoch:8...   Batch:5303...   Training Loss:0.679: 5303it [08:06,  2.92it/s]

Epoch:8/10...   Batch:5300...   Validation Loss:0.751


Epoch:8...   Batch:5403...   Training Loss:0.583: 5403it [08:15,  2.91it/s]

Epoch:8/10...   Batch:5400...   Validation Loss:0.744


Epoch:8...   Batch:5503...   Training Loss:0.790: 5503it [08:24,  2.92it/s]

Epoch:8/10...   Batch:5500...   Validation Loss:0.740


Epoch:8...   Batch:5603...   Training Loss:0.497: 5603it [08:33,  2.93it/s]

Epoch:8/10...   Batch:5600...   Validation Loss:0.749


Epoch:8...   Batch:5701...   Training Loss:0.654: 5701it [08:42,  2.17it/s]

Epoch:8/10...   Batch:5700...   Validation Loss:0.797


Epoch:8...   Batch:5803...   Training Loss:0.657: 5803it [08:51,  2.95it/s]

Epoch:8/10...   Batch:5800...   Validation Loss:0.775


Epoch:8...   Batch:5903...   Training Loss:0.632: 5903it [09:01,  2.91it/s]

Epoch:8/10...   Batch:5900...   Validation Loss:0.737


Epoch:8...   Batch:6003...   Training Loss:0.651: 6003it [09:10,  2.92it/s]

Epoch:8/10...   Batch:6000...   Validation Loss:0.777


Epoch:8...   Batch:6103...   Training Loss:0.709: 6103it [09:19,  2.94it/s]

Epoch:8/10...   Batch:6100...   Validation Loss:0.758


Epoch:8...   Batch:6201...   Training Loss:0.511: 6201it [09:28,  2.10it/s]

Epoch:8/10...   Batch:6200...   Validation Loss:0.773


Epoch:8...   Batch:6303...   Training Loss:0.633: 6303it [09:37,  2.91it/s]

Epoch:8/10...   Batch:6300...   Validation Loss:0.786


Epoch:8...   Batch:6396...   Training Loss:0.768: 6396it [09:43, 10.96it/s]
0it [00:00, ?it/s]

Starting epoch 9


Epoch:9...   Batch:103...   Training Loss:0.488: 103it [00:11,  2.93it/s]

Epoch:9/10...   Batch:100...   Validation Loss:0.851


Epoch:9...   Batch:203...   Training Loss:0.577: 203it [00:20,  2.89it/s]

Epoch:9/10...   Batch:200...   Validation Loss:0.833


Epoch:9...   Batch:301...   Training Loss:0.436: 301it [00:29,  2.13it/s]

Epoch:9/10...   Batch:300...   Validation Loss:0.876


Epoch:9...   Batch:403...   Training Loss:0.512: 403it [00:38,  2.89it/s]

Epoch:9/10...   Batch:400...   Validation Loss:0.889


Epoch:9...   Batch:503...   Training Loss:0.470: 503it [00:47,  2.96it/s]

Epoch:9/10...   Batch:500...   Validation Loss:0.882


Epoch:9...   Batch:603...   Training Loss:0.378: 603it [00:57,  2.95it/s]

Epoch:9/10...   Batch:600...   Validation Loss:0.912


Epoch:9...   Batch:703...   Training Loss:0.580: 703it [01:06,  2.94it/s]

Epoch:9/10...   Batch:700...   Validation Loss:0.884


Epoch:9...   Batch:801...   Training Loss:0.510: 801it [01:15,  2.12it/s]

Epoch:9/10...   Batch:800...   Validation Loss:0.859


Epoch:9...   Batch:903...   Training Loss:0.576: 903it [01:24,  2.92it/s]

Epoch:9/10...   Batch:900...   Validation Loss:0.871


Epoch:9...   Batch:1003...   Training Loss:0.592: 1003it [01:33,  2.90it/s]

Epoch:9/10...   Batch:1000...   Validation Loss:0.849


Epoch:9...   Batch:1103...   Training Loss:0.548: 1103it [01:42,  2.85it/s]

Epoch:9/10...   Batch:1100...   Validation Loss:0.846


Epoch:9...   Batch:1203...   Training Loss:0.531: 1203it [01:52,  2.91it/s]

Epoch:9/10...   Batch:1200...   Validation Loss:0.857


Epoch:9...   Batch:1301...   Training Loss:0.644: 1301it [02:00,  2.16it/s]

Epoch:9/10...   Batch:1300...   Validation Loss:0.876


Epoch:9...   Batch:1403...   Training Loss:0.562: 1403it [02:10,  2.88it/s]

Epoch:9/10...   Batch:1400...   Validation Loss:0.862


Epoch:9...   Batch:1503...   Training Loss:0.466: 1503it [02:19,  2.97it/s]

Epoch:9/10...   Batch:1500...   Validation Loss:0.883


Epoch:9...   Batch:1603...   Training Loss:0.578: 1603it [02:28,  2.91it/s]

Epoch:9/10...   Batch:1600...   Validation Loss:0.857


Epoch:9...   Batch:1703...   Training Loss:0.491: 1703it [02:37,  2.90it/s]

Epoch:9/10...   Batch:1700...   Validation Loss:0.892


Epoch:9...   Batch:1801...   Training Loss:0.479: 1801it [02:46,  2.15it/s]

Epoch:9/10...   Batch:1800...   Validation Loss:0.897


Epoch:9...   Batch:1903...   Training Loss:0.561: 1903it [02:55,  2.94it/s]

Epoch:9/10...   Batch:1900...   Validation Loss:0.918


Epoch:9...   Batch:2003...   Training Loss:0.523: 2003it [03:05,  2.88it/s]

Epoch:9/10...   Batch:2000...   Validation Loss:0.877


Epoch:9...   Batch:2103...   Training Loss:0.490: 2103it [03:14,  2.92it/s]

Epoch:9/10...   Batch:2100...   Validation Loss:0.879


Epoch:9...   Batch:2203...   Training Loss:0.608: 2203it [03:23,  2.85it/s]

Epoch:9/10...   Batch:2200...   Validation Loss:0.910


Epoch:9...   Batch:2301...   Training Loss:0.669: 2301it [03:32,  2.13it/s]

Epoch:9/10...   Batch:2300...   Validation Loss:0.920


Epoch:9...   Batch:2403...   Training Loss:0.558: 2403it [03:41,  2.92it/s]

Epoch:9/10...   Batch:2400...   Validation Loss:0.888


Epoch:9...   Batch:2503...   Training Loss:0.650: 2503it [03:50,  2.92it/s]

Epoch:9/10...   Batch:2500...   Validation Loss:0.883


Epoch:9...   Batch:2603...   Training Loss:0.421: 2603it [03:59,  2.91it/s]

Epoch:9/10...   Batch:2600...   Validation Loss:0.837


Epoch:9...   Batch:2703...   Training Loss:0.637: 2703it [04:09,  2.96it/s]

Epoch:9/10...   Batch:2700...   Validation Loss:0.792


Epoch:9...   Batch:2801...   Training Loss:0.599: 2801it [04:18,  2.14it/s]

Epoch:9/10...   Batch:2800...   Validation Loss:0.798


Epoch:9...   Batch:2903...   Training Loss:0.689: 2903it [04:27,  2.90it/s]

Epoch:9/10...   Batch:2900...   Validation Loss:0.791


Epoch:9...   Batch:3003...   Training Loss:0.466: 3003it [04:36,  2.89it/s]

Epoch:9/10...   Batch:3000...   Validation Loss:0.770


Epoch:9...   Batch:3103...   Training Loss:0.543: 3103it [04:45,  2.87it/s]

Epoch:9/10...   Batch:3100...   Validation Loss:0.782


Epoch:9...   Batch:3203...   Training Loss:0.477: 3203it [04:54,  2.91it/s]

Epoch:9/10...   Batch:3200...   Validation Loss:0.771


Epoch:9...   Batch:3301...   Training Loss:0.598: 3301it [05:03,  2.15it/s]

Epoch:9/10...   Batch:3300...   Validation Loss:0.784


Epoch:9...   Batch:3403...   Training Loss:0.529: 3403it [05:13,  2.94it/s]

Epoch:9/10...   Batch:3400...   Validation Loss:0.801


Epoch:9...   Batch:3503...   Training Loss:0.646: 3503it [05:22,  2.86it/s]

Epoch:9/10...   Batch:3500...   Validation Loss:0.839


Epoch:9...   Batch:3603...   Training Loss:0.616: 3603it [05:31,  2.90it/s]

Epoch:9/10...   Batch:3600...   Validation Loss:0.866


Epoch:9...   Batch:3703...   Training Loss:0.465: 3703it [05:40,  2.91it/s]

Epoch:9/10...   Batch:3700...   Validation Loss:0.837


Epoch:9...   Batch:3801...   Training Loss:0.520: 3801it [05:49,  2.16it/s]

Epoch:9/10...   Batch:3800...   Validation Loss:0.847


Epoch:9...   Batch:3903...   Training Loss:0.461: 3903it [05:58,  2.96it/s]

Epoch:9/10...   Batch:3900...   Validation Loss:0.845


Epoch:9...   Batch:4003...   Training Loss:0.631: 4003it [06:07,  2.91it/s]

Epoch:9/10...   Batch:4000...   Validation Loss:0.834


Epoch:9...   Batch:4103...   Training Loss:0.402: 4103it [06:17,  2.96it/s]

Epoch:9/10...   Batch:4100...   Validation Loss:0.857


Epoch:9...   Batch:4203...   Training Loss:0.518: 4203it [06:26,  2.95it/s]

Epoch:9/10...   Batch:4200...   Validation Loss:0.858


Epoch:9...   Batch:4303...   Training Loss:0.442: 4303it [06:35,  2.95it/s]

Epoch:9/10...   Batch:4300...   Validation Loss:0.873


Epoch:9...   Batch:4403...   Training Loss:0.584: 4403it [06:44,  2.94it/s]

Epoch:9/10...   Batch:4400...   Validation Loss:0.877


Epoch:9...   Batch:4501...   Training Loss:0.539: 4501it [06:53,  2.13it/s]

Epoch:9/10...   Batch:4500...   Validation Loss:0.904


Epoch:9...   Batch:4603...   Training Loss:0.563: 4603it [07:02,  2.95it/s]

Epoch:9/10...   Batch:4600...   Validation Loss:0.947


Epoch:9...   Batch:4703...   Training Loss:0.556: 4703it [07:11,  2.96it/s]

Epoch:9/10...   Batch:4700...   Validation Loss:0.955


Epoch:9...   Batch:4803...   Training Loss:0.506: 4803it [07:20,  2.92it/s]

Epoch:9/10...   Batch:4800...   Validation Loss:0.943


Epoch:9...   Batch:4903...   Training Loss:0.537: 4903it [07:29,  2.92it/s]

Epoch:9/10...   Batch:4900...   Validation Loss:0.951


Epoch:9...   Batch:5001...   Training Loss:0.584: 5001it [07:38,  2.16it/s]

Epoch:9/10...   Batch:5000...   Validation Loss:0.942


Epoch:9...   Batch:5103...   Training Loss:0.653: 5103it [07:48,  2.94it/s]

Epoch:9/10...   Batch:5100...   Validation Loss:0.912


Epoch:9...   Batch:5203...   Training Loss:0.505: 5203it [07:57,  3.01it/s]

Epoch:9/10...   Batch:5200...   Validation Loss:0.899


Epoch:9...   Batch:5303...   Training Loss:0.585: 5303it [08:06,  2.93it/s]

Epoch:9/10...   Batch:5300...   Validation Loss:0.880


Epoch:9...   Batch:5403...   Training Loss:0.582: 5403it [08:15,  3.01it/s]

Epoch:9/10...   Batch:5400...   Validation Loss:0.864


Epoch:9...   Batch:5503...   Training Loss:0.562: 5503it [08:23,  2.96it/s]

Epoch:9/10...   Batch:5500...   Validation Loss:0.885


Epoch:9...   Batch:5601...   Training Loss:0.516: 5601it [08:32,  2.21it/s]

Epoch:9/10...   Batch:5600...   Validation Loss:0.883


Epoch:9...   Batch:5703...   Training Loss:0.641: 5703it [08:42,  2.91it/s]

Epoch:9/10...   Batch:5700...   Validation Loss:0.896


Epoch:9...   Batch:5803...   Training Loss:0.636: 5803it [08:51,  2.90it/s]

Epoch:9/10...   Batch:5800...   Validation Loss:0.875


Epoch:9...   Batch:5903...   Training Loss:0.549: 5903it [09:00,  2.94it/s]

Epoch:9/10...   Batch:5900...   Validation Loss:0.873


Epoch:9...   Batch:6003...   Training Loss:0.599: 6003it [09:09,  2.92it/s]

Epoch:9/10...   Batch:6000...   Validation Loss:0.878


Epoch:9...   Batch:6101...   Training Loss:0.721: 6101it [09:18,  2.17it/s]

Epoch:9/10...   Batch:6100...   Validation Loss:0.867


Epoch:9...   Batch:6203...   Training Loss:0.559: 6203it [09:27,  2.92it/s]

Epoch:9/10...   Batch:6200...   Validation Loss:0.871


Epoch:9...   Batch:6303...   Training Loss:0.615: 6303it [09:36,  2.93it/s]

Epoch:9/10...   Batch:6300...   Validation Loss:0.826


Epoch:9...   Batch:6396...   Training Loss:0.554: 6396it [09:42, 10.98it/s]
0it [00:00, ?it/s]

Starting epoch 10


Epoch:10...   Batch:103...   Training Loss:0.505: 103it [00:11,  2.95it/s]

Epoch:10/10...   Batch:100...   Validation Loss:0.844


Epoch:10...   Batch:203...   Training Loss:0.377: 203it [00:20,  2.99it/s]

Epoch:10/10...   Batch:200...   Validation Loss:0.861


Epoch:10...   Batch:303...   Training Loss:0.600: 303it [00:29,  2.99it/s]

Epoch:10/10...   Batch:300...   Validation Loss:0.909


Epoch:10...   Batch:401...   Training Loss:0.340: 401it [00:38,  2.14it/s]

Epoch:10/10...   Batch:400...   Validation Loss:0.909


Epoch:10...   Batch:503...   Training Loss:0.369: 503it [00:47,  2.92it/s]

Epoch:10/10...   Batch:500...   Validation Loss:0.952


Epoch:10...   Batch:603...   Training Loss:0.488: 603it [00:56,  2.94it/s]

Epoch:10/10...   Batch:600...   Validation Loss:0.938


Epoch:10...   Batch:703...   Training Loss:0.493: 703it [01:05,  2.98it/s]

Epoch:10/10...   Batch:700...   Validation Loss:0.946


Epoch:10...   Batch:803...   Training Loss:0.442: 803it [01:14,  2.94it/s]

Epoch:10/10...   Batch:800...   Validation Loss:0.948


Epoch:10...   Batch:901...   Training Loss:0.577: 901it [01:23,  2.15it/s]

Epoch:10/10...   Batch:900...   Validation Loss:0.927


Epoch:10...   Batch:1003...   Training Loss:0.556: 1003it [01:33,  2.89it/s]

Epoch:10/10...   Batch:1000...   Validation Loss:0.895


Epoch:10...   Batch:1103...   Training Loss:0.482: 1103it [01:42,  2.90it/s]

Epoch:10/10...   Batch:1100...   Validation Loss:0.868


Epoch:10...   Batch:1203...   Training Loss:0.532: 1203it [01:51,  2.95it/s]

Epoch:10/10...   Batch:1200...   Validation Loss:0.875


Epoch:10...   Batch:1303...   Training Loss:0.455: 1303it [02:00,  2.89it/s]

Epoch:10/10...   Batch:1300...   Validation Loss:0.909


Epoch:10...   Batch:1401...   Training Loss:0.456: 1401it [02:09,  2.12it/s]

Epoch:10/10...   Batch:1400...   Validation Loss:0.943


Epoch:10...   Batch:1503...   Training Loss:0.512: 1503it [02:18,  2.91it/s]

Epoch:10/10...   Batch:1500...   Validation Loss:0.917


Epoch:10...   Batch:1603...   Training Loss:0.540: 1603it [02:27,  3.06it/s]

Epoch:10/10...   Batch:1600...   Validation Loss:0.946


Epoch:10...   Batch:1703...   Training Loss:0.409: 1703it [02:36,  3.04it/s]

Epoch:10/10...   Batch:1700...   Validation Loss:0.954


Epoch:10...   Batch:1803...   Training Loss:0.449: 1803it [02:45,  3.00it/s]

Epoch:10/10...   Batch:1800...   Validation Loss:0.954


Epoch:10...   Batch:1901...   Training Loss:0.493: 1901it [02:54,  2.24it/s]

Epoch:10/10...   Batch:1900...   Validation Loss:0.907


Epoch:10...   Batch:2003...   Training Loss:0.479: 2003it [03:03,  3.01it/s]

Epoch:10/10...   Batch:2000...   Validation Loss:0.891


Epoch:10...   Batch:2103...   Training Loss:0.557: 2103it [03:12,  2.99it/s]

Epoch:10/10...   Batch:2100...   Validation Loss:0.906


Epoch:10...   Batch:2203...   Training Loss:0.417: 2203it [03:21,  2.91it/s]

Epoch:10/10...   Batch:2200...   Validation Loss:0.901


Epoch:10...   Batch:2303...   Training Loss:0.535: 2303it [03:30,  2.91it/s]

Epoch:10/10...   Batch:2300...   Validation Loss:0.864


Epoch:10...   Batch:2401...   Training Loss:0.535: 2401it [03:39,  2.19it/s]

Epoch:10/10...   Batch:2400...   Validation Loss:0.880


Epoch:10...   Batch:2503...   Training Loss:0.410: 2503it [03:48,  2.96it/s]

Epoch:10/10...   Batch:2500...   Validation Loss:0.890


Epoch:10...   Batch:2603...   Training Loss:0.528: 2603it [03:57,  2.88it/s]

Epoch:10/10...   Batch:2600...   Validation Loss:0.903


Epoch:10...   Batch:2703...   Training Loss:0.519: 2703it [04:06,  2.90it/s]

Epoch:10/10...   Batch:2700...   Validation Loss:0.879


Epoch:10...   Batch:2803...   Training Loss:0.511: 2803it [04:15,  2.99it/s]

Epoch:10/10...   Batch:2800...   Validation Loss:0.889


Epoch:10...   Batch:2901...   Training Loss:0.659: 2901it [04:24,  2.13it/s]

Epoch:10/10...   Batch:2900...   Validation Loss:0.843


Epoch:10...   Batch:3003...   Training Loss:0.553: 3003it [04:34,  2.91it/s]

Epoch:10/10...   Batch:3000...   Validation Loss:0.842


Epoch:10...   Batch:3103...   Training Loss:0.562: 3103it [04:43,  2.89it/s]

Epoch:10/10...   Batch:3100...   Validation Loss:0.842


Epoch:10...   Batch:3203...   Training Loss:0.509: 3203it [04:52,  2.96it/s]

Epoch:10/10...   Batch:3200...   Validation Loss:0.878


Epoch:10...   Batch:3303...   Training Loss:0.496: 3303it [05:01,  3.05it/s]

Epoch:10/10...   Batch:3300...   Validation Loss:0.854


Epoch:10...   Batch:3403...   Training Loss:0.647: 3403it [05:10,  3.03it/s]

Epoch:10/10...   Batch:3400...   Validation Loss:0.870


Epoch:10...   Batch:3503...   Training Loss:0.524: 3503it [05:19,  3.00it/s]

Epoch:10/10...   Batch:3500...   Validation Loss:0.872


Epoch:10...   Batch:3601...   Training Loss:0.488: 3601it [05:28,  2.17it/s]

Epoch:10/10...   Batch:3600...   Validation Loss:0.858


Epoch:10...   Batch:3703...   Training Loss:0.594: 3703it [05:37,  3.05it/s]

Epoch:10/10...   Batch:3700...   Validation Loss:0.849


Epoch:10...   Batch:3803...   Training Loss:0.568: 3803it [05:46,  2.96it/s]

Epoch:10/10...   Batch:3800...   Validation Loss:0.852


Epoch:10...   Batch:3903...   Training Loss:0.541: 3903it [05:55,  2.93it/s]

Epoch:10/10...   Batch:3900...   Validation Loss:0.852


Epoch:10...   Batch:4003...   Training Loss:0.536: 4003it [06:04,  3.00it/s]

Epoch:10/10...   Batch:4000...   Validation Loss:0.853


Epoch:10...   Batch:4101...   Training Loss:0.532: 4101it [06:13,  2.21it/s]

Epoch:10/10...   Batch:4100...   Validation Loss:0.900


Epoch:10...   Batch:4203...   Training Loss:0.452: 4203it [06:22,  2.97it/s]

Epoch:10/10...   Batch:4200...   Validation Loss:0.928


Epoch:10...   Batch:4303...   Training Loss:0.605: 4303it [06:31,  2.94it/s]

Epoch:10/10...   Batch:4300...   Validation Loss:0.933


Epoch:10...   Batch:4403...   Training Loss:0.645: 4403it [06:40,  2.90it/s]

Epoch:10/10...   Batch:4400...   Validation Loss:0.937


Epoch:10...   Batch:4503...   Training Loss:0.664: 4503it [06:49,  2.99it/s]

Epoch:10/10...   Batch:4500...   Validation Loss:0.881


Epoch:10...   Batch:4601...   Training Loss:0.585: 4601it [06:58,  2.19it/s]

Epoch:10/10...   Batch:4600...   Validation Loss:0.884


Epoch:10...   Batch:4703...   Training Loss:0.493: 4703it [07:07,  3.06it/s]

Epoch:10/10...   Batch:4700...   Validation Loss:0.842


Epoch:10...   Batch:4803...   Training Loss:0.614: 4803it [07:16,  3.04it/s]

Epoch:10/10...   Batch:4800...   Validation Loss:0.881


Epoch:10...   Batch:4903...   Training Loss:0.591: 4903it [07:25,  3.05it/s]

Epoch:10/10...   Batch:4900...   Validation Loss:0.888


Epoch:10...   Batch:5003...   Training Loss:0.580: 5003it [07:34,  3.03it/s]

Epoch:10/10...   Batch:5000...   Validation Loss:0.852


Epoch:10...   Batch:5101...   Training Loss:0.517: 5101it [07:43,  2.12it/s]

Epoch:10/10...   Batch:5100...   Validation Loss:0.862


Epoch:10...   Batch:5203...   Training Loss:0.545: 5203it [07:52,  3.05it/s]

Epoch:10/10...   Batch:5200...   Validation Loss:0.857


Epoch:10...   Batch:5303...   Training Loss:0.530: 5303it [08:01,  3.05it/s]

Epoch:10/10...   Batch:5300...   Validation Loss:0.865


Epoch:10...   Batch:5403...   Training Loss:0.624: 5403it [08:10,  3.04it/s]

Epoch:10/10...   Batch:5400...   Validation Loss:0.863


Epoch:10...   Batch:5503...   Training Loss:0.595: 5503it [08:18,  3.05it/s]

Epoch:10/10...   Batch:5500...   Validation Loss:0.854


Epoch:10...   Batch:5603...   Training Loss:0.593: 5603it [08:27,  3.01it/s]

Epoch:10/10...   Batch:5600...   Validation Loss:0.870


Epoch:10...   Batch:5703...   Training Loss:0.673: 5703it [08:36,  3.08it/s]

Epoch:10/10...   Batch:5700...   Validation Loss:0.857


Epoch:10...   Batch:5803...   Training Loss:0.610: 5803it [08:45,  3.08it/s]

Epoch:10/10...   Batch:5800...   Validation Loss:0.852


Epoch:10...   Batch:5901...   Training Loss:0.475: 5901it [08:54,  2.24it/s]

Epoch:10/10...   Batch:5900...   Validation Loss:0.849


Epoch:10...   Batch:6003...   Training Loss:0.585: 6003it [09:03,  2.99it/s]

Epoch:10/10...   Batch:6000...   Validation Loss:0.871


Epoch:10...   Batch:6103...   Training Loss:0.529: 6103it [09:12,  2.91it/s]

Epoch:10/10...   Batch:6100...   Validation Loss:0.875


Epoch:10...   Batch:6203...   Training Loss:0.416: 6203it [09:21,  2.88it/s]

Epoch:10/10...   Batch:6200...   Validation Loss:0.882


Epoch:10...   Batch:6303...   Training Loss:0.491: 6303it [09:30,  2.97it/s]

Epoch:10/10...   Batch:6300...   Validation Loss:0.873


Epoch:10...   Batch:6396...   Training Loss:0.863: 6396it [09:36, 11.09it/s]


## Making Predictions
### Prediction 
Okay, now that you have a trained model, try it on some new twits and see if it works appropriately. Remember that for any new text, you'll need to preprocess it first before passing it to the network. Implement the `predict` function to generate the prediction vector from a message.

In [36]:
def predict(text, model, vocab):
    """ 
    Make a prediction on a single sentence.

    Parameters
    ----------
        text : The string to make a prediction on.
        model : The model to use for making the prediction.
        vocab : Dictionary for word to word ids. The key is the word and the value is the word id.

    Returns
    -------
        pred : Prediction vector
    """    
    # Get the preprocessed tokens
    tokens = preprocess(text)
    
    # Filter non-vocab words
    tokens = [i_word for i_word in tokens if i_word in vocab]

    # Convert words to ids
    tokens = [vocab[i_word] if (i_word is not None) else "sell" for i_word in tokens]
        
    # Adding a batch dimension
    text_input = torch.tensor(tokens).unsqueeze(1)

    # Get the NN output
    hidden = model.init_hidden(text_input.size(1))
    logps, _ = model.forward(text_input, hidden)

    # Take the exponent of the NN output to get a range of 0 to 1 for each label.
    pred = torch.exp(logps)
    
    return pred

In [37]:
text = "Google is working on self driving cars, I'm bullish on $goog"
model.eval()
model.to("cpu")
predict(text, model, vocab)

tensor([[ 0.0000,  0.0086,  0.0052,  0.6462,  0.3400]])

### Questions: What is the prediction of the model? What is the uncertainty of the prediction?

In the end, we should get a array of 5 digits. From left to right these are the meanings: super negative, negative, neutral, positive, super positive. Note that these are the probability distribution and they should add up to 1. Since in here the highest probability is the 4th digit then the model predicts that the model is positive. 

Now we have a trained model and we can make predictions. We can use this model to track the sentiments of various stocks by predicting the sentiments of twits as they are coming in. Now we have a stream of twits. For each of those twits, pull out the stocks mentioned in them and keep track of the sentiments. Remember that in the twits, ticker symbols are encoded with a dollar sign as the first character, all caps, and 2-4 letters, like $AAPL. Ideally, you'd want to track the sentiments of the stocks in your universe and use this as a signal in your larger model(s).

## Testing
### Load the Data 

In [38]:
# Delete this
from shutil import copyfile
copyfile(src = "./../../data/project_6_stocktwits/test_twits.json", dst = "./test_twits.json")

'./test_twits.json'

In [39]:
with open('test_twits.json', 'r') as f:
    test_data = json.load(f)

### Twit Stream

In [40]:
def twit_stream():
    for twit in test_data['data']:
        yield twit

next(twit_stream())

{'message_body': '$JWN has moved -1.69% on 10-31. Check out the movement and peers at  https://dividendbot.com?s=JWN',
 'timestamp': '2018-11-01T00:00:05Z'}

Using the `prediction` function, let's apply it to a stream of twits.

In [41]:
def score_twits(stream, model, vocab, universe):
    """ 
    Given a stream of twits and a universe of tickers, return sentiment scores for tickers in the universe.
    """
    for twit in stream:

        # Get the message text
        text = twit['message_body']
        symbols = re.findall('\$[A-Z]{2,4}', text)
        score = predict(text, model, vocab)

        for symbol in symbols:
            if symbol in universe:
                yield {'symbol': symbol, 'score': score, 'timestamp': twit['timestamp']}

In [42]:
universe = {'$BBRY', '$AAPL', '$AMZN', '$BABA', '$YHOO', '$LQMT', '$FB', '$GOOG', '$BBBY', '$JNUG', '$SBUX', '$MU'}
score_stream = score_twits(twit_stream(), model, vocab, universe)

next(score_stream)

{'symbol': '$AAPL',
 'score': tensor([[ 0.1503,  0.0844,  0.1629,  0.2773,  0.3252]]),
 'timestamp': '2018-11-01T00:00:18Z'}

That's it. You have successfully built a model for sentiment analysis! 

## Submission
Now that you're done with the project, it's time to submit it. Click the submit button in the bottom right. One of our reviewers will give you feedback on your project with a pass or not passed grade. You can continue to the next section while you wait for feedback.