# Movie Review Sentiment Classification

## Load the Data 

In [1]:
def print_review_label(i):
    print(labels[i] + "\t:\t" + reviews[i][:50]  +"...")

# Read the reviews, one per line, dropping the trailing newline from each
with open('reviews.txt', 'r') as g:
    reviews = [line[:-1] for line in g.readlines()]

# Read the labels, one per line, dropping the newline and upper-casing each
with open('labels.txt', 'r') as g:
    labels = [line[:-1].upper() for line in g.readlines()]
    

In [2]:
reviews[0]      # Check the first review in the list 

'bromwell high is a cartoon comedy . it ran at the same time as some other programs about school life  such as  teachers  . my   years in the teaching profession lead me to believe that bromwell high  s satire is much closer to reality than is  teachers  . the scramble to survive financially  the insightful students who can see right through their pathetic teachers  pomp  the pettiness of the whole situation  all remind me of the schools i knew and their students . when i saw the episode in which a student repeatedly tried to burn down the school  i immediately recalled . . . . . . . . . at . . . . . . . . . . high . a classic line inspector i  m here to sack one of your teachers . student welcome to bromwell high . i expect that many adults of my age think that bromwell high is far fetched . what a pity that it isn  t   '

In [3]:
labels[0]     # Check the first label in the list

'POSITIVE'

In [4]:
len(labels)   # Length of the labels list

25000

In [5]:
len(reviews)  # Length of the reviews list: 25K reviews in the dataset

25000
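One caveat with slicing off the last character of each line: it assumes every line ends with a newline, so a final line without one loses a real character. A minimal sketch of a more forgiving reader follows. The helper name `load_lines` is hypothetical, and the demo writes two tiny temporary files instead of assuming `reviews.txt` and `labels.txt` are present.

```python
import os
import tempfile

def load_lines(path, transform=None):
    """Read a text file into a list with one entry per line (hypothetical helper)."""
    with open(path, "r") as f:
        # rstrip("\n") removes only the newline, so a last line without one stays intact
        lines = [line.rstrip("\n") for line in f]
    return [transform(line) for line in lines] if transform else lines

# Demo on throwaway files
with tempfile.TemporaryDirectory() as tmp:
    reviews_path = os.path.join(tmp, "reviews.txt")
    labels_path = os.path.join(tmp, "labels.txt")
    with open(reviews_path, "w") as f:
        f.write("a fine film\na dull film\n")
    with open(labels_path, "w") as f:
        f.write("positive\nnegative")   # no trailing newline, on purpose

    demo_reviews = load_lines(reviews_path)
    demo_labels = load_lines(labels_path, str.upper)

# The two lists must line up one-to-one, as in the length checks above
assert len(demo_reviews) == len(demo_labels)
print(list(zip(demo_labels, demo_reviews)))
```

With `x[:-1]`, the last label in this demo would have come back as `NEGATIV`.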

## Data Exploration

In [6]:
print_review_label(25)    # Check that reviews and labels correspond to each other
print_review_label(52)

NEGATIVE	:	plot is not worth discussion even if it hints at c...
POSITIVE	:	after seeing several movies of villaronga  i had a...


### Import Libraries

In [7]:
from collections import Counter   
import numpy as np

In [8]:
# Create three counters: word counts in positive reviews, in negative reviews, and in all reviews
positive_counts = Counter()
negative_counts = Counter()
total_counts = Counter()

In [9]:
# Count each word in the counter that matches its review's label, and in the total counter
for i in range(len(reviews)):
    if labels[i] == "POSITIVE":
        for word in reviews[i].split(" "):   # Split the positive review into individual words
            positive_counts[word] += 1
            total_counts[word] += 1
    else:
        for word in reviews[i].split(" "):   # Split the negative review into individual words
            negative_counts[word] += 1
            total_counts[word] += 1
        

In [10]:
positive_counts.most_common()      # Words in positive reviews with their counts, most frequent first

[('', 550468),
 ('the', 173324),
 ('.', 159654),
 ('and', 89722),
 ('a', 83688),
 ('of', 76855),
 ('to', 66746),
 ('is', 57245),
 ('in', 50215),
 ('br', 49235),
 ('it', 48025),
 ('i', 40743),
 ('that', 35630),
 ('this', 35080),
 ('s', 33815),
 ('as', 26308),
 ('with', 23247),
 ('for', 22416),
 ('was', 21917),
 ('film', 20937),
 ('but', 20822),
 ('movie', 19074),
 ('his', 17227),
 ('on', 17008),
 ('you', 16681),
 ('he', 16282),
 ('are', 14807),
 ('not', 14272),
 ('t', 13720),
 ('one', 13655),
 ('have', 12587),
 ('be', 12416),
 ('by', 11997),
 ('all', 11942),
 ('who', 11464),
 ('an', 11294),
 ('at', 11234),
 ('from', 10767),
 ('her', 10474),
 ('they', 9895),
 ('has', 9186),
 ('so', 9154),
 ('like', 9038),
 ('about', 8313),
 ('very', 8305),
 ('out', 8134),
 ('there', 8057),
 ('she', 7779),
 ('what', 7737),
 ('or', 7732),
 ('good', 7720),
 ('more', 7521),
 ('when', 7456),
 ('some', 7441),
 ('if', 7285),
 ('just', 7152),
 ('can', 7001),
 ('story', 6780),
 ('time', 6515),
 ('my', 6488),
 ('g

In [11]:
negative_counts.most_common()   # Words in negative reviews with their counts, most frequent first

[('', 561462),
 ('.', 167538),
 ('the', 163389),
 ('a', 79321),
 ('and', 74385),
 ('of', 69009),
 ('to', 68974),
 ('br', 52637),
 ('is', 50083),
 ('it', 48327),
 ('i', 46880),
 ('in', 43753),
 ('this', 40920),
 ('that', 37615),
 ('s', 31546),
 ('was', 26291),
 ('movie', 24965),
 ('for', 21927),
 ('but', 21781),
 ('with', 20878),
 ('as', 20625),
 ('t', 20361),
 ('film', 19218),
 ('you', 17549),
 ('on', 17192),
 ('not', 16354),
 ('have', 15144),
 ('are', 14623),
 ('be', 14541),
 ('he', 13856),
 ('one', 13134),
 ('they', 13011),
 ('at', 12279),
 ('his', 12147),
 ('all', 12036),
 ('so', 11463),
 ('like', 11238),
 ('there', 10775),
 ('just', 10619),
 ('by', 10549),
 ('or', 10272),
 ('an', 10266),
 ('who', 9969),
 ('from', 9731),
 ('if', 9518),
 ('about', 9061),
 ('out', 8979),
 ('what', 8422),
 ('some', 8306),
 ('no', 8143),
 ('her', 7947),
 ('even', 7687),
 ('can', 7653),
 ('has', 7604),
 ('good', 7423),
 ('bad', 7401),
 ('would', 7036),
 ('up', 6970),
 ('only', 6781),
 ('more', 6730),
 ('

In [12]:
total_counts.most_common()   # Words in all reviews with their counts, most frequent first

[('', 1111930),
 ('the', 336713),
 ('.', 327192),
 ('and', 164107),
 ('a', 163009),
 ('of', 145864),
 ('to', 135720),
 ('is', 107328),
 ('br', 101872),
 ('it', 96352),
 ('in', 93968),
 ('i', 87623),
 ('this', 76000),
 ('that', 73245),
 ('s', 65361),
 ('was', 48208),
 ('as', 46933),
 ('for', 44343),
 ('with', 44125),
 ('movie', 44039),
 ('but', 42603),
 ('film', 40155),
 ('you', 34230),
 ('on', 34200),
 ('t', 34081),
 ('not', 30626),
 ('he', 30138),
 ('are', 29430),
 ('his', 29374),
 ('have', 27731),
 ('be', 26957),
 ('one', 26789),
 ('all', 23978),
 ('at', 23513),
 ('they', 22906),
 ('by', 22546),
 ('an', 21560),
 ('who', 21433),
 ('so', 20617),
 ('from', 20498),
 ('like', 20276),
 ('there', 18832),
 ('her', 18421),
 ('or', 18004),
 ('just', 17771),
 ('about', 17374),
 ('out', 17113),
 ('if', 16803),
 ('has', 16790),
 ('what', 16159),
 ('some', 15747),
 ('good', 15143),
 ('can', 14654),
 ('more', 14251),
 ('she', 14223),
 ('when', 14182),
 ('very', 14069),
 ('up', 13291),
 ('time', 127
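The empty string tops all three lists because `split(" ")` yields an empty token between every pair of consecutive spaces, and the reviews contain many double spaces. A minimal sketch on a made-up fragment shows the difference from `split()` with no argument. Whether to drop the empty token is a design choice; the rest of this notebook keeps it in the vocabulary.

```python
from collections import Counter

# Made-up fragment with double spaces, in the style of the reviews
sample = "bromwell high  s satire  is far fetched ."

with_empty = Counter(sample.split(" "))   # explicit separator: keeps empty tokens
without_empty = Counter(sample.split())   # no argument: splits on runs of whitespace

print(with_empty[""])      # 2 empty tokens, one per double space
print(without_empty[""])   # 0
```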

In [13]:
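# Calculate the positive-to-negative ratio for each word that occurs more than 200 times.
# The +1 in the denominator avoids division by zero for words absent from negative reviews.
pos_neg_ratios = Counter()

for word in total_counts:
    if total_counts[word] > 200:
        pos_neg_ratios[word] = float(positive_counts[word]) / float(negative_counts[word] + 1)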

In [14]:
pos_neg_ratios["the"]  # Check the ratio by word

1.0607993145235326

In [15]:
print("pos_neg_ratios of 'the' = {}".format(pos_neg_ratios['the']))
print("pos_neg_ratios of 'are' = {}".format(pos_neg_ratios['are']))
print("pos_neg_ratios of 'good' = {}".format(pos_neg_ratios['good']))
print("pos_neg_ratios of 'amazing' = {}".format(pos_neg_ratios['amazing']))
print("pos_neg_ratios of 'terrible' = {}".format(pos_neg_ratios['terrible']))

pos_neg_ratios of 'the' = 1.0607993145235326
pos_neg_ratios of 'are' = 1.0125136761487965
pos_neg_ratios of 'good' = 1.0398706896551724
pos_neg_ratios of 'amazing' = 4.022813688212928
pos_neg_ratios of 'terrible' = 0.17744252873563218


In [16]:
# Log transformation of pos_neg_ratio
for word in pos_neg_ratios:
    pos_neg_ratios[word] = np.log(pos_neg_ratios[word])

In [17]:
print("pos_neg_ratios of 'the' = {}".format(pos_neg_ratios['the']))
print("pos_neg_ratios of 'are' = {}".format(pos_neg_ratios['are']))
print("pos_neg_ratios of 'good' = {}".format(pos_neg_ratios['good']))
print("pos_neg_ratios of 'amazing' = {}".format(pos_neg_ratios['amazing']))
print("pos_neg_ratios of 'terrible' = {}".format(pos_neg_ratios['terrible']))

pos_neg_ratios of 'the' = 0.05902269426102881
pos_neg_ratios of 'are' = 0.012436027214787634
pos_neg_ratios of 'good' = 0.03909636855278532
pos_neg_ratios of 'amazing' = 1.3919815802404802
pos_neg_ratios of 'terrible' = -1.7291085042663878
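The log step matters because the raw ratio is asymmetric: words that lean positive range from 1 upward, while words that lean negative are squeezed between 0 and 1. Taking the logarithm centers neutral words near 0 and gives equally strong positive and negative words scores of equal magnitude and opposite sign. A small sketch with invented counts (the numbers below are illustrative, not taken from the dataset, and `log_ratio` is a hypothetical helper):

```python
import math

def log_ratio(pos_count, neg_count):
    # Same formula as above: the +1 in the denominator avoids division by zero
    return math.log(pos_count / (neg_count + 1))

# Invented counts for illustration only
print(log_ratio(400, 99))    # ratio 4    -> about  1.386
print(log_ratio(100, 399))   # ratio 0.25 -> about -1.386
print(log_ratio(100, 99))    # ratio 1    -> 0.0
```

This matches the pattern in the output above: 'amazing' and 'terrible' land well away from 0 on opposite sides, while 'the' and 'are' sit close to 0.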


In [18]:
pos_neg_ratios.most_common()

[('victoria', 2.681021528714291),
 ('captures', 2.038619547159581),
 ('wonderfully', 2.0218960560332353),
 ('powell', 1.978345424808467),
 ('refreshing', 1.8551812956655511),
 ('delightful', 1.8002701588959635),
 ('beautifully', 1.7626953362841438),
 ('underrated', 1.7197859696029656),
 ('superb', 1.7091514458966952),
 ('welles', 1.667706820558076),
 ('sinatra', 1.6389967146756448),
 ('touching', 1.637217476541176),
 ('stewart', 1.611998733295774),
 ('brilliantly', 1.5950491749820008),
 ('friendship', 1.5677652160335325),
 ('wonderful', 1.5645425925262093),
 ('magnificent', 1.54663701119507),
 ('finest', 1.546259010812569),
 ('jackie', 1.5439233053234738),
 ('freedom', 1.5091151908062312),
 ('fantastic', 1.5048433868558566),
 ('terrific', 1.5026699370083942),
 ('noir', 1.493925025312256),
 ('outstanding', 1.4910053152089213),
 ('nancy', 1.488077055429833),
 ('marie', 1.4825711915553104),
 ('excellent', 1.4647538505723599),
 ('chan', 1.423108334242607),
 ('gem', 1.3932148039644643),
 ('

In [25]:
list(reversed(pos_neg_ratios.most_common()))

[('unfunny', -2.6922395950755678),
 ('waste', -2.6193845640165536),
 ('pointless', -2.4553061800117097),
 ('redeeming', -2.3682390632154826),
 ('lousy', -2.307572634505085),
 ('worst', -2.286987896180378),
 ('laughable', -2.264363880173848),
 ('awful', -2.227194247027435),
 ('poorly', -2.2207550747464135),
 ('sucks', -1.987068221548821),
 ('lame', -1.981767458946166),
 ('insult', -1.978345424808467),
 ('horrible', -1.9102590939512902),
 ('amateurish', -1.9095425048844386),
 ('pathetic', -1.9003933102308506),
 ('wasted', -1.8382794848629478),
 ('crap', -1.8281271133989299),
 ('tedious', -1.802454758344803),
 ('dreadful', -1.7725281073001673),
 ('badly', -1.753626599532611),
 ('worse', -1.7372712839439852),
 ('terrible', -1.7291085042663878),
 ('embarrassing', -1.702147310538368),
 ('mess', -1.6900958154515549),
 ('garbage', -1.686913224602391),
 ('pile', -1.6682784124570338),
 ('stupid', -1.6552583827449077),
 ('vampires', -1.6191467265610613),
 ('dull', -1.5846733548097038),
 ('avoid',

## Creating the Input/Output Data

In [20]:
vocab = total_counts.keys()

In [21]:
vocab_size =len(vocab)

In [22]:
vocab_size

74074

In [23]:
layer_0 = np.zeros((1,vocab_size))  # Create the input layer: one row, one column per vocabulary word

In [24]:
layer_0.shape  # Check the shape of the input layer

(1, 74074)

In [26]:
# Create a dictionary of words in the vocabulary mapped to index positions
word2index = {}
for i , word in enumerate (vocab):
    word2index[word] = i
    
word2index

{'bromwell': 0,
 'high': 1,
 'is': 2,
 'a': 3,
 'cartoon': 4,
 'comedy': 5,
 '.': 6,
 'it': 7,
 'ran': 8,
 'at': 9,
 'the': 10,
 'same': 11,
 'time': 12,
 'as': 13,
 'some': 14,
 'other': 15,
 'programs': 16,
 'about': 17,
 'school': 18,
 'life': 19,
 '': 20,
 'such': 21,
 'teachers': 22,
 'my': 23,
 'years': 24,
 'in': 25,
 'teaching': 26,
 'profession': 27,
 'lead': 28,
 'me': 29,
 'to': 30,
 'believe': 31,
 'that': 32,
 's': 33,
 'satire': 34,
 'much': 35,
 'closer': 36,
 'reality': 37,
 'than': 38,
 'scramble': 39,
 'survive': 40,
 'financially': 41,
 'insightful': 42,
 'students': 43,
 'who': 44,
 'can': 45,
 'see': 46,
 'right': 47,
 'through': 48,
 'their': 49,
 'pathetic': 50,
 'pomp': 51,
 'pettiness': 52,
 'of': 53,
 'whole': 54,
 'situation': 55,
 'all': 56,
 'remind': 57,
 'schools': 58,
 'i': 59,
 'knew': 60,
 'and': 61,
 'when': 62,
 'saw': 63,
 'episode': 64,
 'which': 65,
 'student': 66,
 'repeatedly': 67,
 'tried': 68,
 'burn': 69,
 'down': 70,
 'immediately': 71,
 're

In [27]:
# Count how many times each word occurs in the given review
# and store those counts at the matching indices of `layer_0`.
def update_input_layer(review):
    global layer_0
    layer_0 *= 0                       # Clear the counts left over from the previous review
    for word in review.split(" "):
        layer_0[0][word2index[word]] += 1

In [28]:
update_input_layer(reviews[1])
layer_0[0]

array([0., 0., 2., ..., 0., 0., 0.])
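A toy illustration of this encoding, using an invented two-review vocabulary. The helper `encode` and the `toy_*` names are hypothetical and not part of the notebook. The `binary` flag switches from word counts, as in `update_input_layer` above, to 0/1 presence flags, which is what the `SentimentNetwork` class later in this notebook stores.

```python
import numpy as np

toy_reviews = ["good good film", "bad film"]   # invented examples
toy_vocab = sorted({w for r in toy_reviews for w in r.split(" ")})
toy_word2index = {word: i for i, word in enumerate(toy_vocab)}

def encode(review, word2index, binary=False):
    """Return a 1 x vocab_size vector of word counts (or 0/1 flags if binary)."""
    vec = np.zeros((1, len(word2index)))
    for word in review.split(" "):
        if word in word2index:               # skip words outside the vocabulary
            if binary:
                vec[0][word2index[word]] = 1
            else:
                vec[0][word2index[word]] += 1
    return vec

print(toy_word2index)                                          # {'bad': 0, 'film': 1, 'good': 2}
print(encode("good good film", toy_word2index))                # [[0. 1. 2.]]
print(encode("good good film", toy_word2index, binary=True))   # [[0. 1. 1.]]
```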

In [29]:
# Convert a label to `1` (POSITIVE) or `0` (NEGATIVE).
def get_target_for_label (label):
    if (label == "POSITIVE"):
        return 1
    else :
        return 0

In [30]:
labels[1]

'NEGATIVE'

In [31]:
get_target_for_label(labels[1])

0

## Building a Neural Network

In [41]:
import time , sys
import numpy as np

# Encapsulate our neural network in a class
class SentimentNetwork:
    def __init__(self ,reviews ,labels ,hidden_nodes =10 , learning_rate = 0.01):
        np.random.seed(1)
       
        # process the reviews and their associated labels so that everything is ready for training
        self.pre_process_data(reviews,labels)
        
        # Build the network to have the number of hidden nodes and the learning rate that
        # were passed into this initializer. Make the same number of input nodes as
        # there are vocabulary words and create a single output node.
        self.init_network(len(self.review_vocab) , hidden_nodes , 1 ,learning_rate)
    
    
    def pre_process_data (self , reviews , labels):
        review_vocab = set()
        
        for review in reviews :
            for word in review.split(" "):    # Split the review and add to vocab set.
                review_vocab.add(word)
        
        # Convert the vocabulary set to a list so we can access words via indices
        self.review_vocab = list(review_vocab)
        label_vocab = set()

        
        for label in labels:
            label_vocab.add(label)
        
        # Convert the label vocabulary set to a list so we can access labels via indices
        self.label_vocab = list(label_vocab)
        
        # Store the sizes of the review and label vocabularies.
        self.review_vocab_size = len(self.review_vocab)
        self.label_vocab_size = len(self.label_vocab)
        
        # Create a dictionary of words in the vocabulary mapped to index positions
        self.word2index = {}
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
            
            
        # Create a dictionary of labels mapped to index positions   
        self.label2index = {}
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i
            
    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Store the number of nodes in input, hidden, and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes
        
        # Store the learning rate
        self.learning_rate = learning_rate
        
        # Initialize weights
        
        # initialize self.weights_0_1 as a matrix of zeros. These are the weights between the input layer and the hidden layer.
        self.weights_0_1 = np.zeros([self.input_nodes, self.hidden_nodes])
        
        # initialize self.weights_1_2 as a matrix of random values. These are the weights between the hidden layer and the output layer.
        self.weights_1_2 = np.random.normal(0, 1, [self.hidden_nodes, self.output_nodes])
        
        #Create the input layer, a two-dimensional matrix with shape 
        #  1 x input_nodes, with all values initialized to zero
        self.layer_0 = np.zeros((1,input_nodes))
        
        
    # Reset the input layer, then set the entry for each known word in the review to 1
    def update_input_layer(self,review):
        self.layer_0 *= 0
        for word in review.split(' '):
            if word in self.word2index:
                self.layer_0[0][self.word2index[word]] = 1
    
    # Convert a label to the target value: 1 for POSITIVE, 0 otherwise
    def get_target_for_label(self,label):
        if label == 'POSITIVE': return 1
        else: return 0
     
    # Sigmoid activation function
    def sigmoid(self,x):
        return 1 / (1 + np.exp(-x))
    
    
    # Derivative of the sigmoid, expressed in terms of the sigmoid's output
    def sigmoid_output_2_derivative(self,output):
        return output * (1 - output)
    
    
    # Train the network on the given reviews and labels
    def train(self, training_reviews, training_labels):
        assert(len(training_reviews) == len(training_labels)) # make sure we have a matching number of reviews and labels
        correct_so_far = 0
        start = time.time()
        
        
        # loop through all the given reviews and run a forward and backward pass,
        # updating weights for every item
        for i in range(len(training_reviews)):
            
            review = training_reviews[i]
            label = training_labels[i]
            
            self.update_input_layer(review)
            
            # Forward pass: linear hidden layer (no activation), sigmoid output
            self.layer_1 = self.layer_0.dot(self.weights_0_1)
            self.layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))
            
            # Backward pass: compute the prediction error and scale it by the
            # slope of the sigmoid at the output
            err = self.layer_2 - self.get_target_for_label(label)
            layer_2_delta = err * self.sigmoid_output_2_derivative(self.layer_2)
            
            # Update the weights in proportion to their contribution to the
            # error (gradient descent via back propagation)
            self.weights_1_2 -= self.learning_rate * self.layer_1.T.dot(layer_2_delta)
            self.weights_0_1 -= self.learning_rate * self.layer_0.T.dot(layer_2_delta.dot(self.weights_1_2.T))
            
            
            if(self.layer_2 >= 0.5 and label == 'POSITIVE'):
                correct_so_far += 1
            elif(self.layer_2 < 0.5 and label == 'NEGATIVE'):
                correct_so_far += 1
                
            
            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(training_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct_so_far) + " #Trained:" + str(i+1) \
                             + " Training Accuracy:" + str(correct_so_far * 100 / float(i+1))[:4] + "%")
            if(i % 2500 == 0):
                print("")
                
        
    def test(self, testing_reviews, testing_labels):
        """
        Attempts to predict the labels for the given testing_reviews,
        and uses the testing_labels to calculate the accuracy of those predictions.
        """
        correct = 0
        start = time.time()
        
        # Loop through each of the given reviews and call run to predict
        # its label.
        for i in range(len(testing_reviews)):
            pred = self.run(testing_reviews[i])
            if(pred == testing_labels[i]):
                correct += 1
                
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the prediction process.
            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0

            sys.stdout.write("\rProgress:" + str(100 * i/float(len(testing_reviews)))[:4] \
                                 + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                                 + " #Correct:" + str(correct) + " #Tested:" + str(i+1) \
                                 + " Testing Accuracy:" + str(correct * 100 / float(i+1))[:4] + "%")

            
    def run(self, review):
        """
        Returns a POSITIVE or NEGATIVE prediction for the given review.
        """
        self.update_input_layer(review)
        layer_1 = self.layer_0.dot(self.weights_0_1)
        layer_2 = self.sigmoid(layer_1.dot(self.weights_1_2))
        
        if layer_2 >= 0.5: return 'POSITIVE'
        else: return 'NEGATIVE'
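The weight updates in `train` can be checked against a finite-difference estimate on a toy network. This sketch reads the update rule as gradient descent on the squared error `0.5 * (output - target) ** 2`; the notebook does not name its loss, so that reading is an assumption. The sizes, inputs, and the standalone `sigmoid` and `loss` functions are made up for the check, and both weight matrices are random here so that every gradient is non-zero.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(w01, w12, x, target):
    # Assumed loss: squared error on the sigmoid output of a linear hidden layer
    out = sigmoid(x.dot(w01).dot(w12))
    return 0.5 * ((out - target) ** 2).item()

rng = np.random.default_rng(0)
x = np.array([[1.0, 0.0, 1.0, 1.0]])   # toy presence vector for 4 "words"
w01 = rng.normal(0, 0.5, (4, 3))       # input -> hidden
w12 = rng.normal(0, 0.5, (3, 1))       # hidden -> output
target = 1

# Analytic gradients, using the same expressions as train
layer_1 = x.dot(w01)
layer_2 = sigmoid(layer_1.dot(w12))
layer_2_delta = (layer_2 - target) * layer_2 * (1 - layer_2)
grad_w12 = layer_1.T.dot(layer_2_delta)
grad_w01 = x.T.dot(layer_2_delta.dot(w12.T))

# Finite-difference estimate for one weight in each matrix
eps = 1e-6
w12_shift = w12.copy()
w12_shift[0, 0] += eps
num_w12 = (loss(w01, w12_shift, x, target) - loss(w01, w12, x, target)) / eps

w01_shift = w01.copy()
w01_shift[0, 0] += eps
num_w01 = (loss(w01_shift, w12, x, target) - loss(w01, w12, x, target)) / eps

print(grad_w12[0, 0], num_w12)   # the two values should agree closely
print(grad_w01[0, 0], num_w01)
```

One difference from the class: here the gradient for `w01` is computed from `w12` before any update, whereas `train` updates `weights_1_2` first and then uses the updated values in the `weights_0_1` step.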
            
            
            
            
            

Run the following cell to test the network's performance against the last 1000 reviews (the ones we held out from our training set). 

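**We have not trained the model yet, so the accuracy should be about 50%. Because `weights_0_1` starts as all zeros, the output is 0.5 for every review, so `run` predicts POSITIVE every time, which is right for about half of the reviews when the two labels are balanced.**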

In [42]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)

In [43]:
mlp.test(reviews[-1000:],labels[-1000:])

Progress:99.9% Speed(reviews/sec):363.2 #Correct:500 #Tested:1000 Testing Accuracy:50.0%
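The 500 correct out of 1000 is consistent with a network that gives the same answer for every review. A minimal sketch of the untrained forward pass, assuming the same initialization as `init_network` (zeros for `weights_0_1`, random normal values for `weights_1_2`); the layer sizes and the input vector are made up.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
input_nodes, hidden_nodes = 6, 3                       # toy sizes, not the real vocabulary
weights_0_1 = np.zeros((input_nodes, hidden_nodes))    # zeros, as in init_network
weights_1_2 = np.random.normal(0, 1, (hidden_nodes, 1))

layer_0 = np.array([[1.0, 0.0, 1.0, 0.0, 0.0, 1.0]])   # any input gives the same result
layer_1 = layer_0.dot(weights_0_1)                     # all zeros
layer_2 = sigmoid(layer_1.dot(weights_1_2))            # sigmoid(0) = 0.5

print(layer_2)                                         # [[0.5]]
print("POSITIVE" if layer_2 >= 0.5 else "NEGATIVE")    # POSITIVE
```

Since `run` uses `>= 0.5` as its threshold, an output of exactly 0.5 is labeled POSITIVE.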

Run the following cell to actually train the network. During training, it will display the model's accuracy repeatedly as it trains so you can see how well it's doing.

In [44]:
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:1 #Trained:1 Training Accuracy:100.%
Progress:10.4% Speed(reviews/sec):44.87 #Correct:1812 #Trained:2501 Training Accuracy:72.4%
Progress:20.8% Speed(reviews/sec):44.25 #Correct:3811 #Trained:5001 Training Accuracy:76.2%
Progress:31.2% Speed(reviews/sec):45.20 #Correct:5900 #Trained:7501 Training Accuracy:78.6%
Progress:41.6% Speed(reviews/sec):45.90 #Correct:8031 #Trained:10001 Training Accuracy:80.3%
Progress:52.0% Speed(reviews/sec):46.08 #Correct:10156 #Trained:12501 Training Accuracy:81.2%
Progress:62.5% Speed(reviews/sec):46.24 #Correct:12275 #Trained:15001 Training Accuracy:81.8%
Progress:72.9% Speed(reviews/sec):46.38 #Correct:14397 #Trained:17501 Training Accuracy:82.2%
Progress:83.3% Speed(reviews/sec):46.46 #Correct:16560 #Trained:20001 Training Accuracy:82.7%
Progress:93.7% Speed(reviews/sec):46.61 #Correct:18735 #Trained:22501 Training Accuracy:83.2%
Progress:99.9% Speed(reviews/sec):46.70 #Correct:20058 #Trained:24000 Training

With a learning rate of 0.1, training accuracy climbed to about 83% over the run, as the log above shows. To check whether a smaller learning rate does better, run the following cell to recreate the network with a learning rate of 0.01 and then train the new network.

In [45]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.01)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:1 #Trained:1 Training Accuracy:100.%
Progress:10.4% Speed(reviews/sec):46.30 #Correct:1962 #Trained:2501 Training Accuracy:78.4%
Progress:20.8% Speed(reviews/sec):45.90 #Correct:4002 #Trained:5001 Training Accuracy:80.0%
Progress:31.2% Speed(reviews/sec):45.10 #Correct:6120 #Trained:7501 Training Accuracy:81.5%
Progress:41.6% Speed(reviews/sec):40.75 #Correct:8271 #Trained:10001 Training Accuracy:82.7%
Progress:52.0% Speed(reviews/sec):39.56 #Correct:10431 #Trained:12501 Training Accuracy:83.4%
Progress:62.5% Speed(reviews/sec):40.29 #Correct:12565 #Trained:15001 Training Accuracy:83.7%
Progress:72.9% Speed(reviews/sec):40.79 #Correct:14670 #Trained:17501 Training Accuracy:83.8%
Progress:83.3% Speed(reviews/sec):41.13 #Correct:16833 #Trained:20001 Training Accuracy:84.1%
Progress:93.7% Speed(reviews/sec):41.44 #Correct:19015 #Trained:22501 Training Accuracy:84.5%
Progress:99.9% Speed(reviews/sec):41.70 #Correct:20335 #Trained:24000 Training

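That wasn't much different: after 22,501 reviews, training accuracy was 84.5%, compared with 83.2% at the same point with a learning rate of 0.1. Run the following cell to recreate the network one more time with an even smaller learning rate, 0.001, and then train the new network.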

In [46]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.001)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:1 #Trained:1 Training Accuracy:100.%
Progress:10.4% Speed(reviews/sec):46.92 #Correct:1941 #Trained:2501 Training Accuracy:77.6%
Progress:20.8% Speed(reviews/sec):46.93 #Correct:3988 #Trained:5001 Training Accuracy:79.7%
Progress:31.2% Speed(reviews/sec):46.58 #Correct:6086 #Trained:7501 Training Accuracy:81.1%
Progress:41.6% Speed(reviews/sec):46.51 #Correct:8205 #Trained:10001 Training Accuracy:82.0%
Progress:52.0% Speed(reviews/sec):46.23 #Correct:10338 #Trained:12501 Training Accuracy:82.6%
Progress:62.5% Speed(reviews/sec):45.95 #Correct:12424 #Trained:15001 Training Accuracy:82.8%
Progress:72.9% Speed(reviews/sec):45.90 #Correct:14525 #Trained:17501 Training Accuracy:82.9%
Progress:83.3% Speed(reviews/sec):46.02 #Correct:16698 #Trained:20001 Training Accuracy:83.4%
Progress:93.7% Speed(reviews/sec):45.98 #Correct:18857 #Trained:22501 Training Accuracy:83.8%
Progress:99.9% Speed(reviews/sec):45.98 #Correct:20173 #Trained:24000 Training

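With a learning rate of 0.001, training accuracy reached 83.8% after 22,501 reviews, in the same range as the 83.2% and 84.5% seen with learning rates of 0.1 and 0.01. The network learns at all three settings, though accuracy levels off in the low-to-mid 80s. There is still room to improve, but it shows that this solution has potential.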