# Sentiment Classification & How To "Frame Problems" for a Neural Network

by Andrew Trask

- **Twitter**: @iamtrask
- **Blog**: http://iamtrask.github.io

### What You Should Already Know

- neural networks, forward and back-propagation
- stochastic gradient descent
- mean squared error
- and train/test splits

### Where to Get Help if You Need it
- Re-watch previous Udacity Lectures
- Leverage the recommended Course Reading Material - [Grokking Deep Learning](https://www.manning.com/books/grokking-deep-learning) (Check inside your classroom for a discount code)
- Shoot me a tweet @iamtrask


### Tutorial Outline:

- Intro: The Importance of "Framing a Problem" (this lesson)

- [Curate a Dataset](#lesson_1)
- [Developing a "Predictive Theory"](#lesson_2)
- [**PROJECT 1**: Quick Theory Validation](#project_1)


- [Transforming Text to Numbers](#lesson_3)
- [**PROJECT 2**: Creating the Input/Output Data](#project_2)


- Putting it all together in a Neural Network (video only - nothing in notebook)
- [**PROJECT 3**: Building our Neural Network](#project_3)


- [Understanding Neural Noise](#lesson_4)
- [**PROJECT 4**: Making Learning Faster by Reducing Noise](#project_4)


- [Analyzing Inefficiencies in our Network](#lesson_5)
- [**PROJECT 5**: Making our Network Train and Run Faster](#project_5)


- [Further Noise Reduction](#lesson_6)
- [**PROJECT 6**: Reducing Noise by Strategically Reducing the Vocabulary](#project_6)


- [Analysis: What's going on in the weights?](#lesson_7)

# Lesson: Curate a Dataset<a id='lesson_1'></a>
The cells from here until Project 1 include code Andrew shows in the videos leading up to mini project 1. We've included them so you can run the code along with the videos without having to type in everything.

In [6]:
def pretty_print_review_and_label(i):
    print(labels[i] + "\t:\t" + reviews[i][:80] + "...")

g = open('reviews.txt','r') # What we know!
reviews = list(map(lambda x:x[:-1],g.readlines()))
g.close()

g = open('labels.txt','r') # What we WANT to know!
labels = list(map(lambda x:x[:-1].upper(),g.readlines()))
g.close()

**Note:** The data in `reviews.txt` we're using has already been preprocessed a bit and contains only lower case characters. If we were working from raw data, where we didn't know it was all lower case, we would want to add a step here to convert it. That's so we treat different variations of the same word, like `The`, `the`, and `THE`, all the same way.
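
If you were starting from raw text, the normalization step described in the note could look like the following minimal sketch. The `raw_reviews` list is made-up sample data and is not part of `reviews.txt`.

```python
# Hypothetical raw reviews with mixed casing (not taken from reviews.txt)
raw_reviews = ["The movie was GREAT", "the Plot was thin"]

# Lower-case every review so "The", "the" and "THE" map to the same token
normalized_reviews = [review.lower() for review in raw_reviews]

print(normalized_reviews)
```

Applying `lower()` once, before any counting, keeps later steps free of casing concerns.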

In [7]:
len(reviews)

25000

In [None]:
reviews[0]

In [None]:
labels[0]

# Lesson: Develop a Predictive Theory<a id='lesson_2'></a>

In [None]:
print("labels.txt \t : \t reviews.txt\n")
pretty_print_review_and_label(2137)
pretty_print_review_and_label(12816)
pretty_print_review_and_label(6267)
pretty_print_review_and_label(21934)
pretty_print_review_and_label(5297)
pretty_print_review_and_label(4998)

# Project 1: Quick Theory Validation<a id='project_1'></a>

There are multiple ways to implement these projects, but in order to get your code closer to what Andrew shows in his solutions, we've provided some hints and starter code throughout this notebook.

You'll find the [Counter](https://docs.python.org/2/library/collections.html#collections.Counter) class to be useful in this exercise, as well as the [numpy](https://docs.scipy.org/doc/numpy/reference/) library.

In [8]:
from collections import Counter
import numpy as np

We'll create three `Counter` objects, one for words from positive reviews, one for words from negative reviews, and one for all the words.

In [9]:
# Create three Counter objects to store positive, negative and total counts
positive_counts = Counter()
negative_counts = Counter()
total_counts = Counter()

**TODO:** Examine all the reviews. For each word in a positive review, increase the count for that word in both your positive counter and the total words counter; likewise, for each word in a negative review, increase the count for that word in both your negative counter and the total words counter.

**Note:** Throughout these projects, you should use `split(' ')` to divide a piece of text (such as a review) into individual words. If you use `split()` instead, you'll get slightly different results than what the videos and solutions show.
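
To see why the two calls can give different results, here is a small sketch on a made-up string that contains a double space and a trailing space.

```python
# Made-up text with a double space and a trailing space
text = "great  movie "

# split(' ') splits on every single space, so empty strings show up as tokens
tokens_single_space = text.split(' ')

# split() collapses runs of whitespace and drops leading/trailing whitespace
tokens_whitespace = text.split()

print(tokens_single_space)  # ['great', '', 'movie', '']
print(tokens_whitespace)    # ['great', 'movie']
```

With `split(' ')`, the empty string becomes a "word" of its own and gets counted, which is one reason vocabulary sizes and counts differ between the two approaches.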

In [10]:
# TODO: Loop over all the words in all the reviews and increment the counts in the appropriate counter objects
for review,label in zip(reviews,labels):
    words = review.split(' ')
    for word in words:
        if label == 'POSITIVE':
            positive_counts[word]+=1
            total_counts[word]+=1
        else: 
            negative_counts[word]+=1
            total_counts[word]+=1

Run the following two cells to list the words used in positive reviews and negative reviews, respectively, ordered from most to least commonly used. 

In [None]:
# Examine the counts of the most common words in positive reviews
positive_counts.most_common()

In [None]:
# Examine the counts of the most common words in negative reviews
negative_counts.most_common()

As you can see, common words like "the" appear very often in both positive and negative reviews. Instead of finding the most common words in positive or negative reviews, what you really want are the words found in positive reviews more often than in negative reviews, and vice versa. To accomplish this, you'll need to calculate the **ratios** of word usage between positive and negative reviews.

**TODO:** Check all the words you've seen, calculate the ratio of positive to negative uses, and store that ratio in `pos_neg_ratios`.
>Hint: the positive-to-negative ratio for a given word can be calculated with `positive_counts[word] / float(negative_counts[word]+1)`. Notice the `+1` in the denominator – that ensures we don't divide by zero for words that are only seen in positive reviews.
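
The effect of the `+1` can be checked on a pair of toy counters. The words and counts below are invented for illustration, and `pos_neg_ratio` is a hypothetical helper rather than part of the notebook.

```python
from collections import Counter

# Made-up counts for illustration
toy_positive = Counter({"superb": 10, "plot": 50})
toy_negative = Counter({"plot": 50})  # "superb" never appears in a negative review

def pos_neg_ratio(word):
    # Counter returns 0 for missing keys, so the +1 keeps the denominator non-zero
    return toy_positive[word] / float(toy_negative[word] + 1)

superb_ratio = pos_neg_ratio("superb")  # 10 / (0 + 1)
plot_ratio = pos_neg_ratio("plot")      # 50 / (50 + 1), slightly below 1

print(superb_ratio, plot_ratio)
```

A word seen only in positive reviews still gets a finite ratio, and an evenly used word lands just under 1 rather than exactly on it.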

In [11]:
# Create Counter object to store positive/negative ratios
pos_neg_ratios = Counter()

# TODO: Calculate the ratios of positive and negative uses of the most common words
#       Consider words to be "common" if they've been used at least 100 times

for word in total_counts:
    if total_counts[word] >= 100:
        pos_neg_ratios[word] = positive_counts[word] / float(negative_counts[word]+1)


Examine the ratios you've calculated for a few words:

In [None]:
print("Pos-to-neg ratio for 'the' = {}".format(pos_neg_ratios["the"]))
print("Pos-to-neg ratio for 'amazing' = {}".format(pos_neg_ratios["amazing"]))
print("Pos-to-neg ratio for 'terrible' = {}".format(pos_neg_ratios["terrible"]))

Looking closely at the values you just calculated, we see the following:

* Words that you would expect to see more often in positive reviews – like "amazing" – have a ratio greater than 1. The more skewed a word is toward positive, the farther from 1 its positive-to-negative ratio will be.
* Words that you would expect to see more often in negative reviews – like "terrible" – have positive values that are less than 1. The more skewed a word is toward negative, the closer to zero its positive-to-negative ratio will be.
* Neutral words, which don't really convey any sentiment because you would expect to see them in all sorts of reviews – like "the" – have values very close to 1. A perfectly neutral word – one that was used in exactly the same number of positive reviews as negative reviews – would be almost exactly 1. The `+1` we suggested you add to the denominator slightly biases words toward negative, but it won't matter because it will be a tiny bias and later we'll be ignoring words that are too close to neutral anyway.

Ok, the ratios tell us which words are used more often in positive or negative reviews, but the specific values we've calculated are a bit difficult to work with. A very positive word like "amazing" has a value above 4, whereas a very negative word like "terrible" has a value around 0.18. Those values aren't easy to compare for a couple of reasons:

* Right now, 1 is considered neutral, but very positive words sit much farther from 1 than very negative words do: positive ratios can grow without bound, while negative ratios are squeezed between 0 and 1. So there is no way to directly compare two numbers and see if one word conveys the same magnitude of positive sentiment as another word conveys negative sentiment. We should instead center all the values around neutral, so that a word's distance from neutral indicates how much sentiment (positive or negative) it conveys.
* When comparing absolute values it's easier to do that around zero than one. 

To fix these issues, we'll convert all of our ratios to new values using logarithms.

**TODO:** Go through all the ratios you calculated and convert them to logarithms. (i.e. use `np.log(ratio)`)

In the end, extremely positive and extremely negative words will have positive-to-negative ratios with similar magnitudes but opposite signs.
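
A quick worked example shows the symmetry the logarithm provides. The three ratios are illustrative values, not numbers taken from the dataset.

```python
import numpy as np

# Illustrative ratios: a word used 4x more often in positive reviews,
# a neutral word, and a word used 4x more often in negative reviews
raw_ratios = {"four_times_positive": 4.0, "neutral": 1.0, "four_times_negative": 0.25}

# The natural log maps 1 to 0 and maps r and 1/r to opposite values
log_ratios = {name: float(np.log(ratio)) for name, ratio in raw_ratios.items()}

for name, value in log_ratios.items():
    print("{}: {:.3f}".format(name, value))
```

The raw values 4.0 and 0.25 sit at very different distances from 1, but their logs are roughly +1.386 and -1.386, equally far from 0.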

In [12]:
# TODO: Convert ratios to logs
for word in pos_neg_ratios:
    pos_neg_ratios[word] = np.log(pos_neg_ratios[word])

Examine the new ratios you've calculated for the same words from before:

In [None]:
print("Pos-to-neg ratio for 'the' = {}".format(pos_neg_ratios["the"]))
print("Pos-to-neg ratio for 'amazing' = {}".format(pos_neg_ratios["amazing"]))
print("Pos-to-neg ratio for 'terrible' = {}".format(pos_neg_ratios["terrible"]))

If everything worked, now you should see neutral words with values close to zero. In this case, "the" is near zero but slightly positive, so it was probably used in more positive reviews than negative reviews. But look at "amazing"'s ratio - it's above `1`, showing it is clearly a word with positive sentiment. And "terrible" has a similar score, but in the opposite direction, so it's below `-1`. It's now clear that both of these words are associated with specific, opposing sentiments.

Now run the following cells to see more ratios. 

The first cell displays all the words, ordered by how associated they are with positive reviews. (Your notebook will most likely truncate the output so you won't actually see *all* the words in the list.)

The second cell displays the 30 words most associated with negative reviews by reversing the order of the first list and then looking at the first 30 words. (If you want the second cell to display all the words, ordered by how associated they are with negative reviews, you could just write `reversed(pos_neg_ratios.most_common())`.)

You should continue to see values similar to the earlier ones we checked – neutral words will be close to `0`, words will get more positive as their ratios approach and go above `1`, and words will get more negative as their ratios approach and go below `-1`. That's why we decided to use the logs instead of the raw ratios.

In [None]:
# words most frequently seen in a review with a "POSITIVE" label
pos_neg_ratios.most_common()

In [None]:
# words most frequently seen in a review with a "NEGATIVE" label
list(reversed(pos_neg_ratios.most_common()))[0:30]

# Note: Above is the code Andrew uses in his solution video, 
#       so we've included it here to avoid confusion.
#       If you explore the documentation for the Counter class, 
#       you will see you could also find the 30 least common
#       words like this: pos_neg_ratios.most_common()[:-31:-1]
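
The two ways of getting the lowest-scoring entries can be compared on a toy `Counter`. The words and scores below are made up and merely stand in for `pos_neg_ratios`.

```python
from collections import Counter

# Toy scores standing in for pos_neg_ratios (values are made up)
toy_ratios = Counter({"amazing": 1.4, "fine": 0.1, "boring": -0.9, "terrible": -1.7})

# Approach used in the cell above: reverse the full list, then take the first items
lowest_by_reversal = list(reversed(toy_ratios.most_common()))[0:2]

# Alternative: slice from the end with a negative step
lowest_by_slicing = toy_ratios.most_common()[:-3:-1]

print(lowest_by_reversal)
print(lowest_by_slicing)
```

Both calls return the same two entries, most negative first. The stop index in the slice is one past the number of items wanted, which is why the notebook's version uses `-31` to get 30 words.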

# End of Project 1. 
## Watch the next video to see Andrew's solution, then continue on to the next lesson.

# Transforming Text into Numbers<a id='lesson_3'></a>
The cells here include code Andrew shows in the next video. We've included it so you can run the code along with the video without having to type in everything.

In [None]:
from IPython.display import Image

review = "This was a horrible, terrible movie."

Image(filename='sentiment_network.png')

In [None]:
review = "The movie was excellent"

Image(filename='sentiment_network_pos.png')

# Project 2: Creating the Input/Output Data<a id='project_2'></a>

**TODO:** Create a [set](https://docs.python.org/3/tutorial/datastructures.html#sets) named `vocab` that contains every word in the vocabulary.

In [13]:
# TODO: Create set named "vocab" containing all of the words from all of the reviews
vocab = set(total_counts)

Run the following cell to check your vocabulary size. If everything worked correctly, it should print **74074**

In [14]:
vocab_size = len(vocab)
print(vocab_size)

74074


Take a look at the following image. It represents the layers of the neural network you'll be building throughout this notebook. `layer_0` is the input layer, `layer_1` is a hidden layer, and `layer_2` is the output layer.

In [None]:
from IPython.display import Image
Image(filename='sentiment_network_2.png')

**TODO:** Create a numpy array called `layer_0` and initialize it to all zeros. You will find the [zeros](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html) function particularly helpful here. Be sure you create `layer_0` as a 2-dimensional matrix with 1 row and `vocab_size` columns. 

In [15]:
# TODO: Create layer_0 matrix with dimensions 1 by vocab_size, initially filled with zeros
layer_0 = np.zeros((1,vocab_size))

Run the following cell. It should display `(1, 74074)`

In [16]:
layer_0.shape

(1, 74074)

In [None]:
from IPython.display import Image
Image(filename='sentiment_network.png')

`layer_0` contains one entry for every word in the vocabulary, as shown in the above image. We need to make sure we know the index of each word, so run the following cell to create a lookup table that stores the index of every word.

In [17]:
# Create a dictionary of words in the vocabulary mapped to index positions
# (to be used in layer_0)
word2index = {}
for i,word in enumerate(vocab):
    word2index[word] = i
    
# display the map of words to indices
#word2index

**TODO:**  Complete the implementation of `update_input_layer`. It should count 
          how many times each word is used in the given review, and then store
          those counts at the appropriate indices inside `layer_0`.

In [18]:
def update_input_layer(review):
    """ Modify the global layer_0 to represent the vector form of review.
    The element at a given index of layer_0 should represent
    how many times the given word occurs in the review.
    Args:
        review(string) - the string of the review
    Returns:
        None
    """
    global layer_0
    # clear out previous state by resetting the layer to be all 0s
    layer_0 *= 0
    
    # TODO: count how many times each word is used in the given review and store the results in layer_0 
    for word in review.split(" "):
        layer_0[0][word2index[word]]+=1
        

Run the following cell to test updating the input layer with the first review. The indices assigned may not be the same as in the solution, but hopefully you'll see some non-zero values in `layer_0`.  

In [19]:
update_input_layer(reviews[0])
layer_0

array([[18.,  0.,  0., ...,  0.,  0.,  0.]])
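
The same encoding can be followed by hand on a tiny vocabulary. This is a sketch with invented names (`toy_vocab`, `toy_word2index`, `encode_counts`); it sorts the vocabulary so the indices are stable, whereas the notebook enumerates a `set`, whose iteration order for strings can change from run to run.

```python
import numpy as np

# Tiny made-up vocabulary; sorted so the indices are the same on every run
toy_vocab = sorted({"the", "movie", "was", "great", "bad"})
toy_word2index = {word: i for i, word in enumerate(toy_vocab)}

def encode_counts(review):
    """Return a 1 x len(toy_vocab) row of word counts for the review."""
    row = np.zeros((1, len(toy_vocab)))
    for word in review.split(" "):
        row[0][toy_word2index[word]] += 1
    return row

toy_layer_0 = encode_counts("the movie was great great")

print(toy_vocab)    # ['bad', 'great', 'movie', 'the', 'was']
print(toy_layer_0)  # counts in the same order as toy_vocab
```

Here "great" appears twice, so its slot holds 2, while "bad" never appears and stays at 0.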

**TODO:** Complete the implementation of `get_target_for_label`. It should return `0` or `1`, 
          depending on whether the given label is `NEGATIVE` or `POSITIVE`, respectively.

In [20]:
def get_target_for_label(label):
    """Convert a label to `0` or `1`.
    Args:
        label(string) - Either "POSITIVE" or "NEGATIVE".
    Returns:
        `0` or `1`.
    """
    # TODO: Your code here
    if(label == 'POSITIVE'):
        return 1
    else:
        return 0

Run the following two cells. They should print out `'POSITIVE'` and `1`, respectively.

In [None]:
labels[0]

In [22]:
get_target_for_label(labels[0])

1

Run the following two cells. They should print out `'NEGATIVE'` and `0`, respectively.

In [None]:
labels[1]

In [21]:
get_target_for_label(labels[1])

0

# End of Project 2. 
## Watch the next video to see Andrew's solution, then continue on to the next lesson.

# Project 3: Building a Neural Network<a id='project_3'></a>

**TODO:** We've included the framework of a class called `SentimentNetwork`. Implement all of the items marked `TODO` in the code. These include doing the following:
- Create a basic neural network much like the networks you've seen in earlier lessons and in Project 1, with an input layer, a hidden layer, and an output layer. 
- Do **not** add a non-linearity in the hidden layer. That is, do not use an activation function when calculating the hidden layer outputs.
- Re-use the code from earlier in this notebook to create the training data (see `TODO`s in the code)
- Implement the `pre_process_data` function to create the vocabulary for our training data generating functions
- Ensure `train` trains over the entire corpus
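
The forward pass described in the list above, with a linear hidden layer and a sigmoid output, can be sketched on toy sizes to check the shapes. The sizes, the input row, and the random weights are assumptions for illustration only; the class below sizes the input layer from the vocabulary.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Toy sizes chosen for illustration only
input_nodes, hidden_nodes, output_nodes = 5, 3, 1

rng = np.random.RandomState(1)
layer_0 = np.array([[1.0, 0.0, 2.0, 0.0, 1.0]])  # 1 x input_nodes word counts
weights_0_1 = rng.normal(0.0, 0.1, (input_nodes, hidden_nodes))
weights_1_2 = rng.normal(0.0, 0.1, (hidden_nodes, output_nodes))

# Hidden layer: a plain matrix product, with no activation function applied
layer_1 = np.dot(layer_0, weights_0_1)           # shape (1, hidden_nodes)

# Output layer: sigmoid squashes the result into the range (0, 1)
layer_2 = sigmoid(np.dot(layer_1, weights_1_2))  # shape (1, output_nodes)

print(layer_1.shape, layer_2.shape)
```

Keeping every layer as a 2-dimensional row makes the matrix products line up without any reshaping.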

### Where to Get Help if You Need it
- Re-watch earlier Udacity lectures
- Chapters 3-5 - [Grokking Deep Learning](https://www.manning.com/books/grokking-deep-learning) - (Check inside your classroom for a discount code)

In [77]:
import time
import sys
import numpy as np

# Encapsulate our neural network in a class
class SentimentNetwork:
    def __init__(self, reviews, labels, hidden_nodes = 10, learning_rate = 0.1):
        """Create a SentimentNetwork with the given settings
        Args:
            reviews(list) - List of reviews used for training
            labels(list) - List of POSITIVE/NEGATIVE labels associated with the given reviews
            hidden_nodes(int) - Number of nodes to create in the hidden layer
            learning_rate(float) - Learning rate to use while training
        
        """
        # Assign a seed to our random number generator to ensure we get
        # reproducible results during development
        np.random.seed(1)

        # process the reviews and their associated labels so that everything
        # is ready for training
        self.pre_process_data(reviews, labels)
        
        # Build the network to have the number of hidden nodes and the learning rate that
        # were passed into this initializer. Make the same number of input nodes as
        # there are vocabulary words and create a single output node.
        self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)

    def pre_process_data(self, reviews, labels):
        
        review_vocab = set()
        # TODO: populate review_vocab with all of the words in the given reviews
        #       Remember to split reviews into individual words 
        #       using "split(' ')" instead of "split()".
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)
        
        # Convert the vocabulary set to a list so we can access words via indices
        self.review_vocab = list(review_vocab)
        
        label_vocab = set()
        # TODO: populate label_vocab with all of the words in the given labels.
        #       There is no need to split the labels because each one is a single word.
        for label in labels:
            label_vocab.add(label)
        
        # Convert the label vocabulary set to a list so we can access labels via indices
        self.label_vocab = list(label_vocab)
        
        # Store the sizes of the review and label vocabularies.
        self.review_vocab_size = len(self.review_vocab)
        self.label_vocab_size = len(self.label_vocab)
        
        # Create a dictionary of words in the vocabulary mapped to index positions
        self.word2index = {}
        # TODO: populate self.word2index with indices for all the words in self.review_vocab
        #       like you saw earlier in the notebook
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
        
        # Create a dictionary of labels mapped to index positions
        self.label2index = {}
        # TODO: do the same thing you did for self.word2index and self.review_vocab, 
        #       but for self.label2index and self.label_vocab instead
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i
         
        
    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Store the number of nodes in input, hidden, and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Store the learning rate
        self.learning_rate = learning_rate

        # Initialize weights
        
        # TODO: initialize self.weights_0_1 as a matrix of zeros. These are the weights between
        #       the input layer and the hidden layer.
        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))
        
        # TODO: initialize self.weights_1_2 as a matrix of random values. 
        #       These are the weights between the hidden layer and the output layer.
        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, 
                                                (self.hidden_nodes, self.output_nodes))
        
        # TODO: Create the input layer, a two-dimensional matrix with shape 
        #       1 x input_nodes, with all values initialized to zero
        self.layer_0 = np.zeros((1,input_nodes))
    
        
    def update_input_layer(self,review):
        # TODO: You can copy most of the code you wrote for update_input_layer 
        #       earlier in this notebook. 
        #
        #       However, MAKE SURE YOU CHANGE ALL VARIABLES TO REFERENCE
        #       THE VERSIONS STORED IN THIS OBJECT, NOT THE GLOBAL OBJECTS.
        #       For example, replace "layer_0 *= 0" with "self.layer_0 *= 0"
        self.layer_0 *= 0
        for word in review.split(" "):
            if word in self.word2index:
                self.layer_0[0][self.word2index[word]]+=1

                
    def get_target_for_label(self,label):
        # TODO: Copy the code you wrote for get_target_for_label 
        #       earlier in this notebook. 
        if(label == 'POSITIVE'):
            return 1
        else:
            return 0        
        
    def sigmoid(self,x):
        # TODO: Return the result of calculating the sigmoid activation function
        #       shown in the lectures
        return 1/(1+np.exp(-x))
    
    def sigmoid_output_2_derivative(self,output):
        # TODO: Return the derivative of the sigmoid activation function, 
        #       where "output" is the original output from the sigmoid function 
        return output*(1-output)

    def train(self, training_reviews, training_labels):
        
        # make sure we have a matching number of reviews and labels
        assert(len(training_reviews) == len(training_labels))
        
        # Keep track of correct predictions to display accuracy during training 
        correct_so_far = 0
        
        # Remember when we started for printing time statistics
        start = time.time()

        # loop through all the given reviews and run a forward and backward pass,
        # updating weights for every item
        for i in range(len(training_reviews)):
            
            # TODO: Get the next review and its correct label
            review = training_reviews[i]
            label = training_labels[i]
            
            # TODO: Implement the forward pass through the network. 
            #       That means use the given review to update the input layer, 
            #       then calculate values for the hidden layer,
            #       and finally calculate the output layer.
            # 
            #       Do not use an activation function for the hidden layer,
            #       but use the sigmoid activation function for the output layer.
            self.update_input_layer(review)
            layer_1 = np.dot(self.layer_0, self.weights_0_1)
            layer_2 = self.sigmoid(np.dot(layer_1, self.weights_1_2))
            
            # TODO: Implement the back propagation pass here. 
            #       That means calculate the error for the forward pass's prediction
            #       and update the weights in the network according to their
            #       contributions toward the error, as calculated via the
            #       gradient descent and back propagation algorithms you 
            #       learned in class.
            # Output error term: (target - prediction) scaled by the sigmoid's slope.
            # Because the error is target - prediction, the updates below are added (+=).
            layer_2_error_term = (self.get_target_for_label(label) - layer_2) * self.sigmoid_output_2_derivative(layer_2)
            # Hidden error term: no derivative factor, since the hidden layer has no activation
            layer_1_error_term = layer_2_error_term * self.weights_1_2
            # Single-example update, so there is no averaging over a batch
            self.weights_1_2 += self.learning_rate * layer_2_error_term * layer_1.T
            self.weights_0_1 += self.learning_rate * layer_1_error_term.T * self.layer_0.T

            # TODO: Keep track of correct predictions. To determine if the prediction was
            #       correct, check that the absolute value of the output error 
            #       is less than 0.5. If so, add one to the correct_so_far count.
            if np.abs(self.get_target_for_label(label) - layer_2) < 0.5:
                correct_so_far += 1
                
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the training process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(training_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct_so_far) + " #Trained:" + str(i+1) \
                             + " Training Accuracy:" + str(correct_so_far * 100 / float(i+1))[:4] + "%")
            if(i % 2500 == 0):
                print("")
    
    def test(self, testing_reviews, testing_labels):
        """
        Attempts to predict the labels for the given testing_reviews,
        and uses the test_labels to calculate the accuracy of those predictions.
        """
        
        # keep track of how many correct predictions we make
        correct = 0

        # we'll time how many predictions per second we make
        start = time.time()

        # Loop through each of the given reviews and call run to predict
        # its label. 
        for i in range(len(testing_reviews)):
            pred = self.run(testing_reviews[i])
            if(pred == testing_labels[i]):
                correct += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the prediction process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(testing_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct) + " #Tested:" + str(i+1) \
                             + " Testing Accuracy:" + str(correct * 100 / float(i+1))[:4] + "%")
    
    def run(self, review):
        """
        Returns a POSITIVE or NEGATIVE prediction for the given review.
        """
        # TODO: Run a forward pass through the network, like you did in the
        #       "train" function. That means use the given review to 
        #       update the input layer, then calculate values for the hidden layer,
        #       and finally calculate the output layer.
        #
        #       Note: The review passed into this function for prediction 
        #             might come from anywhere, so you should convert it 
        #             to lower case prior to using it.
        
        # TODO: The output layer should now contain a prediction. 
        #       Return `POSITIVE` for predictions greater-than-or-equal-to `0.5`, 
        #       and `NEGATIVE` otherwise.
        self.update_input_layer(review.lower())
        layer_1 = np.dot(self.layer_0, self.weights_0_1)
        layer_2 = self.sigmoid(np.dot(layer_1, self.weights_1_2))

        if layer_2 >= 0.5:
            return 'POSITIVE'
        else:
            return 'NEGATIVE'
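
The `train` method computes the error as target minus output and then adds the update to the weights. The sketch below checks that sign convention on one made-up example, using invented hidden activations and weights and only the hidden-to-output update. If the error were defined as output minus target, the same step would be written with `-=`.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

learning_rate = 0.1
target = 1.0
layer_1 = np.array([[0.5, -0.2, 0.1]])          # made-up hidden values, shape (1, 3)
weights_1_2 = np.array([[0.1], [0.4], [-0.3]])  # made-up weights, shape (3, 1)

layer_2 = sigmoid(np.dot(layer_1, weights_1_2))
error_before = float((target - layer_2[0][0]) ** 2)

# Same rule as in train(): error is (target - output), so the step is ADDED
layer_2_error_term = (target - layer_2) * layer_2 * (1 - layer_2)
weights_1_2 = weights_1_2 + learning_rate * layer_2_error_term * layer_1.T

layer_2_after = sigmoid(np.dot(layer_1, weights_1_2))
error_after = float((target - layer_2_after[0][0]) ** 2)

print(error_before, error_after)
```

The squared error after the step is smaller than before it, which is what a correctly signed gradient descent update should produce for a small learning rate.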


Run the following cell to create a `SentimentNetwork` that will train on all but the last 1000 reviews (we're saving those for testing). Here we use a learning rate of `0.1`.

In [70]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)

Run the following cell to test the network's performance against the last 1000 reviews (the ones we held out from our training set). 

**We have not trained the model yet, so the results should be about 50% as it will just be guessing and there are only two possible values to choose from.**

In [53]:
mlp.test(reviews[-1000:],labels[-1000:])

Progress:99.9% Speed(reviews/sec):387.2 #Correct:500 #Tested:1000 Testing Accuracy:50.0%
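
One way to see where exactly 50% comes from: `init_network` starts `weights_0_1` at zero, so the hidden layer is all zeros for any review, the output is `sigmoid(0)`, and `run` returns `POSITIVE` every time. The sketch below reproduces that on toy sizes (the sizes and input counts are assumptions). The 500 correct out of 1000 shown above is consistent with half of the held-out reviews being positive.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Toy sizes for illustration; the notebook uses the full vocabulary
layer_0 = np.array([[3.0, 0.0, 1.0, 2.0]])        # any word counts
weights_0_1 = np.zeros((4, 2))                    # zero-initialized, as in init_network
weights_1_2 = np.random.normal(0.0, 1.0, (2, 1))  # random, as in init_network

layer_1 = np.dot(layer_0, weights_0_1)            # all zeros, whatever the input
layer_2 = sigmoid(np.dot(layer_1, weights_1_2))   # sigmoid(0) = 0.5

# Same threshold rule as run()
prediction = 'POSITIVE' if layer_2 >= 0.5 else 'NEGATIVE'
print(layer_2, prediction)
```

So the untrained network is not guessing at random; it makes the same prediction for every input until training moves `weights_0_1` away from zero.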

Run the following cell to actually train the network. During training, it will display the model's accuracy repeatedly as it trains so you can see how well it's doing.

In [76]:
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:1 #Trained:1 Training Accuracy:100.%
Progress:10.4% Speed(reviews/sec):79.41 #Correct:1251 #Trained:2501 Training Accuracy:50.0%
Progress:20.8% Speed(reviews/sec):83.99 #Correct:2501 #Trained:5001 Training Accuracy:50.0%
Progress:31.2% Speed(reviews/sec):86.08 #Correct:3751 #Trained:7501 Training Accuracy:50.0%
Progress:41.6% Speed(reviews/sec):87.23 #Correct:5001 #Trained:10001 Training Accuracy:50.0%
Progress:52.0% Speed(reviews/sec):88.04 #Correct:6251 #Trained:12501 Training Accuracy:50.0%
Progress:62.5% Speed(reviews/sec):88.65 #Correct:7501 #Trained:15001 Training Accuracy:50.0%
Progress:72.9% Speed(reviews/sec):89.03 #Correct:8751 #Trained:17501 Training Accuracy:50.0%
Progress:83.3% Speed(reviews/sec):89.13 #Correct:10001 #Trained:20001 Training Accuracy:50.0%
Progress:93.7% Speed(reviews/sec):89.28 #Correct:11251 #Trained:22501 Training Accuracy:50.0%
Progress:99.9% Speed(reviews/sec):89.41 #Correct:12000 #Trained:24000 Training Ac

That most likely didn't train very well. Part of the reason may be because the learning rate is too high. Run the following cell to recreate the network with a smaller learning rate, `0.01`, and then train the new network.

In [75]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.01)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%
Progress:10.4% Speed(reviews/sec):87.36 #Correct:1247 #Trained:2501 Training Accuracy:49.8%
Progress:20.8% Speed(reviews/sec):88.55 #Correct:2497 #Trained:5001 Training Accuracy:49.9%
Progress:31.2% Speed(reviews/sec):87.81 #Correct:3747 #Trained:7501 Training Accuracy:49.9%
Progress:41.6% Speed(reviews/sec):88.55 #Correct:4997 #Trained:10001 Training Accuracy:49.9%
Progress:52.0% Speed(reviews/sec):88.78 #Correct:6247 #Trained:12501 Training Accuracy:49.9%
Progress:62.5% Speed(reviews/sec):88.73 #Correct:7490 #Trained:15001 Training Accuracy:49.9%
Progress:72.9% Speed(reviews/sec):88.98 #Correct:8757 #Trained:17501 Training Accuracy:50.0%
Progress:76.6% Speed(reviews/sec):89.06 #Correct:9203 #Trained:18395 Training Accuracy:50.0%
Progress:83.3% Speed(reviews/sec):88.96 #Correct:10006 #Trained:20001 Training Accuracy:50.0%
Progress:93.7% Speed(reviews/sec):88.84 #Correct:11274 #Trained:22501 Training Accuracy:50.1%
Progress:99.9% Speed(reviews/sec):88.76 #Correct:12023 #Trained:24000 Training Accuracy:50.0%

That probably wasn't much different. Run the following cell to recreate the network one more time with an even smaller learning rate, `0.001`, and then train the new network.

In [74]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.001)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%
Progress:10.4% Speed(reviews/sec):85.87 #Correct:1261 #Trained:2501 Training Accuracy:50.4%
Progress:20.8% Speed(reviews/sec):82.70 #Correct:2527 #Trained:5001 Training Accuracy:50.5%
Progress:31.2% Speed(reviews/sec):82.43 #Correct:3877 #Trained:7501 Training Accuracy:51.6%
Progress:41.6% Speed(reviews/sec):82.31 #Correct:5345 #Trained:10001 Training Accuracy:53.4%
Progress:52.0% Speed(reviews/sec):82.01 #Correct:6946 #Trained:12501 Training Accuracy:55.5%
Progress:62.5% Speed(reviews/sec):83.12 #Correct:8521 #Trained:15001 Training Accuracy:56.8%
Progress:72.9% Speed(reviews/sec):84.18 #Correct:10099 #Trained:17501 Training Accuracy:57.7%
Progress:83.3% Speed(reviews/sec):84.86 #Correct:11798 #Trained:20001 Training Accuracy:58.9%
Progress:93.7% Speed(reviews/sec):85.40 #Correct:13482 #Trained:22501 Training Accuracy:59.9%
Progress:99.9% Speed(reviews/sec):85.53 #Correct:14489 #Trained:24000 Training Ac

With a learning rate of `0.001`, the network should finally have started to improve during training. It's still not very good, but it shows that this solution has potential. We will improve it in the next lesson.
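
The three runs above differ only in `learning_rate`. As a rough intuition for why a step size that is too large can keep a model from improving, here is a toy sketch that has nothing to do with the sentiment data: plain gradient descent on `f(w) = w**2`. The helper `descend` and the three rates are illustrative assumptions, not part of the notebook's network.

```python
def descend(learning_rate, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2 and return the final w."""
    for _ in range(steps):
        gradient = 2 * w                  # derivative of w**2
        w -= learning_rate * gradient     # standard gradient descent step
    return w

# Hypothetical rates chosen only to show three different behaviours
for rate in (1.1, 0.1, 0.001):
    print(rate, descend(rate))
```

In this toy problem the largest rate overshoots the minimum on every step and moves further away, the middle rate converges quickly, and the smallest rate is stable but slow. A real network behaves less tidily, so treat this only as intuition for why lowering the rate was worth trying.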

# End of Project 3. 
## Watch the next video to see Andrew's solution, then continue on to the next lesson.

# Understanding Neural Noise<a id='lesson_4'></a>

The following cells include the code Andrew shows in the next video. We've included it here so you can run the cells along with the video without having to type in everything.

In [None]:
from IPython.display import Image
Image(filename='sentiment_network.png')

In [None]:
def update_input_layer(review):
    
    global layer_0
    
    # clear out previous state, reset the layer to be all 0s
    layer_0 *= 0
    for word in review.split(" "):
        layer_0[0][word2index[word]] += 1

update_input_layer(reviews[0])

In [None]:
layer_0

In [None]:
review_counter = Counter()

In [None]:
for word in reviews[0].split(" "):
    review_counter[word] += 1

In [None]:
review_counter.most_common()

# Project 4: Reducing Noise in Our Input Data<a id='project_4'></a>

**TODO:** Attempt to reduce the noise in the input data like Andrew did in the previous video. Specifically, do the following:
* Copy the `SentimentNetwork` class you created earlier into the following cell.
* Modify `update_input_layer` so it does not count how many times each word is used, but rather just stores whether or not a word was used. 
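
To make the second bullet concrete, here is a minimal sketch that contrasts the two encodings on a made-up review. The tiny `word2index` dictionary and the review string are hypothetical; the notebook builds its vocabulary from the full dataset.

```python
import numpy as np

# Hypothetical toy vocabulary; the notebook derives word2index from all reviews.
word2index = {"the": 0, "movie": 1, "was": 2, "great": 3, ".": 4}
review = "the movie was great . the the the . ."

counts = np.zeros((1, len(word2index)))
binary = np.zeros((1, len(word2index)))

for word in review.split(" "):
    if word in word2index:
        counts[0][word2index[word]] += 1   # how many times the word appears
        binary[0][word2index[word]] = 1    # only whether the word appears

print(counts)   # filler tokens such as "the" and "." get the largest values
print(binary)   # every word that appears carries the same weight
```

With counts, frequent but uninformative tokens dominate the input vector; with the binary version, a word like "great" contributes as much as "the".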

In [78]:
import time
import sys
import numpy as np

# Encapsulate our neural network in a class
class SentimentNetwork:
    def __init__(self, reviews, labels, hidden_nodes = 10, learning_rate = 0.1):
        """Create a SentimentNetwork with the given settings
        Args:
            reviews(list) - List of reviews used for training
            labels(list) - List of POSITIVE/NEGATIVE labels associated with the given reviews
            hidden_nodes(int) - Number of nodes to create in the hidden layer
            learning_rate(float) - Learning rate to use while training
        
        """
        # Assign a seed to our random number generator to ensure we get
        # reproducible results during development
        np.random.seed(1)

        # process the reviews and their associated labels so that everything
        # is ready for training
        self.pre_process_data(reviews, labels)
        
        # Build the network to have the number of hidden nodes and the learning rate that
        # were passed into this initializer. Make the same number of input nodes as
        # there are vocabulary words and create a single output node.
        self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)

    def pre_process_data(self, reviews, labels):
        
        review_vocab = set()
        # TODO: populate review_vocab with all of the words in the given reviews
        #       Remember to split reviews into individual words 
        #       using "split(' ')" instead of "split()".
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)
        
        # Convert the vocabulary set to a list so we can access words via indices
        self.review_vocab = list(review_vocab)
        
        label_vocab = set()
        # TODO: populate label_vocab with all of the words in the given labels.
        #       There is no need to split the labels because each one is a single word.
        for label in labels:
            label_vocab.add(label)
        
        # Convert the label vocabulary set to a list so we can access labels via indices
        self.label_vocab = list(label_vocab)
        
        # Store the sizes of the review and label vocabularies.
        self.review_vocab_size = len(self.review_vocab)
        self.label_vocab_size = len(self.label_vocab)
        
        # Create a dictionary of words in the vocabulary mapped to index positions
        self.word2index = {}
        # TODO: populate self.word2index with indices for all the words in self.review_vocab
        #       like you saw earlier in the notebook
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
        
        # Create a dictionary of labels mapped to index positions
        self.label2index = {}
        # TODO: do the same thing you did for self.word2index and self.review_vocab, 
        #       but for self.label2index and self.label_vocab instead
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i
         
        
    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Store the number of nodes in input, hidden, and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Store the learning rate
        self.learning_rate = learning_rate

        # Initialize weights
        
        # TODO: initialize self.weights_0_1 as a matrix of zeros. These are the weights between
        #       the input layer and the hidden layer.
        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))
        
        # TODO: initialize self.weights_1_2 as a matrix of random values. 
        #       These are the weights between the hidden layer and the output layer.
        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, 
                                                (self.hidden_nodes, self.output_nodes))
        
        # TODO: Create the input layer, a two-dimensional matrix with shape 
        #       1 x input_nodes, with all values initialized to zero
        self.layer_0 = np.zeros((1,input_nodes))
    
        
    def update_input_layer(self,review):
        # TODO: You can copy most of the code you wrote for update_input_layer 
        #       earlier in this notebook. 
        #
        #       However, MAKE SURE YOU CHANGE ALL VARIABLES TO REFERENCE
        #       THE VERSIONS STORED IN THIS OBJECT, NOT THE GLOBAL OBJECTS.
        #       For example, replace "layer_0 *= 0" with "self.layer_0 *= 0"
        self.layer_0 *= 0
        for word in review.split(" "):
            if(word in self.word2index.keys()):
                self.layer_0[0][self.word2index[word]]=1

                
    def get_target_for_label(self,label):
        # TODO: Copy the code you wrote for get_target_for_label 
        #       earlier in this notebook. 
        if(label == 'POSITIVE'):
            return 1
        else:
            return 0        
        
    def sigmoid(self,x):
        # TODO: Return the result of calculating the sigmoid activation function
        #       shown in the lectures
        return 1/(1+np.exp(-x))
    
    def sigmoid_output_2_derivative(self,output):
        # TODO: Return the derivative of the sigmoid activation function, 
        #       where "output" is the original output from the sigmoid function 
        return output*(1-output)

    def train(self, training_reviews, training_labels):
        
        # make sure we have a matching number of reviews and labels
        assert(len(training_reviews) == len(training_labels))
        
        # Keep track of correct predictions to display accuracy during training 
        correct_so_far = 0
        
        # Remember when we started for printing time statistics
        start = time.time()

        # loop through all the given reviews and run a forward and backward pass,
        # updating weights for every item
        for i in range(len(training_reviews)):
            
            # TODO: Get the next review and its correct label
            review = training_reviews[i]
            label = training_labels[i]
            
            # TODO: Implement the forward pass through the network. 
            #       That means use the given review to update the input layer, 
            #       then calculate values for the hidden layer,
            #       and finally calculate the output layer.
            # 
            #       Do not use an activation function for the hidden layer,
            #       but use the sigmoid activation function for the output layer.
            self.update_input_layer(review)
            layer_1 = np.dot(self.layer_0, self.weights_0_1)
            layer_2 = self.sigmoid(np.dot(layer_1, self.weights_1_2))
            
            # TODO: Implement the back propagation pass here. 
            #       That means calculate the error for the forward pass's prediction
            #       and update the weights in the network according to their
            #       contributions toward the error, as calculated via the
            #       gradient descent and back propagation algorithms you 
            #       learned in class.
            # The error is computed as (target - output), which is the negative of the
            # squared-error gradient, so the weight updates below use += rather than -=.
            layer_2_error_term = (self.get_target_for_label(label) - layer_2) * self.sigmoid_output_2_derivative(layer_2)
            layer_1_error_term = layer_2_error_term * self.weights_1_2 # no derivative factor: the hidden layer has no activation function
            self.weights_1_2 += self.learning_rate * layer_2_error_term * layer_1.T # single-sample update, so there is nothing to average over
            self.weights_0_1 += self.learning_rate * layer_1_error_term.T * self.layer_0.T

            # TODO: Keep track of correct predictions. To determine if the prediction was
            #       correct, check that the absolute value of the output error 
            #       is less than 0.5. If so, add one to the correct_so_far count.
            if np.abs(self.get_target_for_label(label) - layer_2) < 0.5: # same as thresholding the output at 0.5 (exactly 0.5 counts as wrong)
                correct_so_far += 1
                
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the training process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(training_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct_so_far) + " #Trained:" + str(i+1) \
                             + " Training Accuracy:" + str(correct_so_far * 100 / float(i+1))[:4] + "%")
            if(i % 2500 == 0):
                print("")
    
    def test(self, testing_reviews, testing_labels):
        """
        Attempts to predict the labels for the given testing_reviews,
        and uses the test_labels to calculate the accuracy of those predictions.
        """
        
        # keep track of how many correct predictions we make
        correct = 0

        # we'll time how many predictions per second we make
        start = time.time()

        # Loop through each of the given reviews and call run to predict
        # its label. 
        for i in range(len(testing_reviews)):
            pred = self.run(testing_reviews[i])
            if(pred == testing_labels[i]):
                correct += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the prediction process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(testing_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct) + " #Tested:" + str(i+1) \
                             + " Testing Accuracy:" + str(correct * 100 / float(i+1))[:4] + "%")
    
    def run(self, review):
        """
        Returns a POSITIVE or NEGATIVE prediction for the given review.
        """
        # TODO: Run a forward pass through the network, like you did in the
        #       "train" function. That means use the given review to 
        #       update the input layer, then calculate values for the hidden layer,
        #       and finally calculate the output layer.
        #
        #       Note: The review passed into this function for prediction 
        #             might come from anywhere, so you should convert it 
        #             to lower case prior to using it.
        
        # TODO: The output layer should now contain a prediction. 
        #       Return `POSITIVE` for predictions greater-than-or-equal-to `0.5`, 
        #       and `NEGATIVE` otherwise.
        self.update_input_layer(review.lower())
        layer_1 = np.dot(self.layer_0, self.weights_0_1)
        layer_2 = self.sigmoid(np.dot(layer_1, self.weights_1_2))

        if layer_2 >= 0.5:
            return 'POSITIVE'
        else:
            return 'NEGATIVE'
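
The `train` method above computes the error as `target - output` and adds the update, while the Project 5 solution later in this notebook computes `output - target` and subtracts it. The sketch below suggests the two conventions produce the same weight step. The random values, shapes, and variable names `step_a` and `step_b` are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(1)            # arbitrary seed for repeatability
layer_1 = rng.standard_normal((1, 10))    # stand-in hidden layer values
weights_1_2 = rng.standard_normal((10, 1))
target, learning_rate = 1, 0.1

layer_2 = sigmoid(layer_1.dot(weights_1_2))
slope = layer_2 * (1 - layer_2)           # sigmoid derivative from its output

# Convention A: error = target - output, then ADD the update
step_a = learning_rate * (target - layer_2) * slope * layer_1.T
# Convention B: error = output - target, then SUBTRACT the update
step_b = learning_rate * layer_1.T.dot((layer_2 - target) * slope)

print(np.allclose(weights_1_2 + step_a, weights_1_2 - step_b))
```

Either way the weights move in the direction that reduces the squared error; what matters is that the sign of the error and the sign of the update agree.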


Run the following cell to recreate the network and train it. Notice we've gone back to the higher learning rate of `0.1`.

In [79]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0.0 #Correct:0 #Trained:1 Training Accuracy:0.0%
Progress:10.4% Speed(reviews/sec):88.18 #Correct:1833 #Trained:2501 Training Accuracy:73.2%
Progress:20.8% Speed(reviews/sec):83.27 #Correct:3841 #Trained:5001 Training Accuracy:76.8%
Progress:31.2% Speed(reviews/sec):81.34 #Correct:5914 #Trained:7501 Training Accuracy:78.8%
Progress:41.6% Speed(reviews/sec):81.45 #Correct:8042 #Trained:10001 Training Accuracy:80.4%
Progress:52.0% Speed(reviews/sec):79.01 #Correct:10184 #Trained:12501 Training Accuracy:81.4%
Progress:62.5% Speed(reviews/sec):79.64 #Correct:12311 #Trained:15001 Training Accuracy:82.0%
Progress:72.9% Speed(reviews/sec):80.18 #Correct:14444 #Trained:17501 Training Accuracy:82.5%
Progress:83.3% Speed(reviews/sec):80.61 #Correct:16626 #Trained:20001 Training Accuracy:83.1%
Progress:93.7% Speed(reviews/sec):81.22 #Correct:18806 #Trained:22501 Training Accuracy:83.5%
Progress:99.9% Speed(reviews/sec):81.40 #Correct:20129 #Trained:24000 Training 

That should have trained much better than the earlier attempts. It's still not wonderful, but it should have improved dramatically. Run the following cell to test your model with 1000 predictions.

In [80]:
mlp.test(reviews[-1000:],labels[-1000:])

Progress:99.9% Speed(reviews/sec):467.9 #Correct:848 #Tested:1000 Testing Accuracy:84.8%

# End of Project 4. 
## Andrew's solution was actually in the previous video, so rewatch that video if you had any problems with that project. Then continue on to the next lesson.

# Analyzing Inefficiencies in our Network<a id='lesson_5'></a>
The following cells include the code Andrew shows in the next video. We've included it here so you can run the cells along with the video without having to type in everything.

In [None]:
Image(filename='sentiment_network_sparse.png')

In [None]:
layer_0 = np.zeros(10)

In [None]:
layer_0

In [None]:
layer_0[4] = 1
layer_0[9] = 1

In [None]:
layer_0

In [None]:
weights_0_1 = np.random.randn(10,5)

In [None]:
layer_0.dot(weights_0_1)

In [None]:
indices = [4,9]

In [None]:
layer_1 = np.zeros(5)

In [None]:
for index in indices:
    layer_1 += (1 * weights_0_1[index])

In [None]:
layer_1

In [None]:
Image(filename='sentiment_network_sparse_2.png')

In [None]:
layer_1 = np.zeros(5)

In [None]:
for index in indices:
    layer_1 += (weights_0_1[index])

In [None]:
layer_1
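
The cells above build the hidden layer two ways: with a full dot product and by adding only the weight rows for the active inputs. A compact check that the two agree, assuming the same toy sizes (10 inputs, 5 hidden nodes) and an arbitrary random seed:

```python
import numpy as np

rng = np.random.default_rng(0)             # arbitrary seed, chosen for repeatability
weights_0_1 = rng.standard_normal((10, 5))

layer_0 = np.zeros(10)
indices = [4, 9]
layer_0[indices] = 1                       # only two inputs are "on"

dense = layer_0.dot(weights_0_1)           # multiplies every input, mostly by zero
sparse = weights_0_1[indices].sum(axis=0)  # adds just the two rows that matter

print(np.allclose(dense, sparse))
```

Because every active input is exactly 1, skipping the multiplication changes nothing numerically; it only avoids work on the zero entries.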

# Project 5: Making our Network More Efficient<a id='project_5'></a>
**TODO:** Make the `SentimentNetwork` class more efficient by eliminating unnecessary multiplications and additions that occur during forward and backward propagation. To do that, you can do the following:
* Copy the `SentimentNetwork` class from the previous project into the following cell.
* Remove the `update_input_layer` function - you will not need it in this version.
* Modify `init_network`:
>* You no longer need a separate input layer, so remove any mention of `self.layer_0`
>* You will be dealing with the old hidden layer more directly, so create `self.layer_1`, a two-dimensional matrix with shape 1 x hidden_nodes, with all values initialized to zero
* Modify `train`:
>* Change the name of the input parameter `training_reviews` to `training_reviews_raw`. This will help with the next step.
>* At the beginning of the function, you'll want to preprocess your reviews to convert them to a list of indices (from `word2index`) that are actually used in the review. This is equivalent to what you saw in the video when Andrew set specific indices to 1. Your code should create a local `list` variable named `training_reviews` that should contain a `list` for each review in `training_reviews_raw`. Those lists should contain the indices for words found in the review.
>* Remove call to `update_input_layer`
>* Use `self`'s  `layer_1` instead of a local `layer_1` object.
>* In the forward pass, replace the code that updates `layer_1` with new logic that only adds the weights for the indices used in the review.
>* When updating `weights_0_1`, only update the individual weights that were used in the forward pass.
* Modify `run`:
>* Remove call to `update_input_layer` 
>* Use `self`'s  `layer_1` instead of a local `layer_1` object.
>* Much like you did in `train`, you will need to pre-process the `review` so you can work with word indices, then update `layer_1` by adding weights for the indices used in the review.
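
The pre-processing step described for `train` and `run` can be sketched as a small stand-alone helper. The function name `reviews_to_indices` and the toy data are hypothetical; in the class you would use `self.word2index` and may prefer to inline the loop.

```python
def reviews_to_indices(reviews_raw, word2index):
    """Convert each raw review into a list of unique vocabulary indices."""
    converted = []
    for review in reviews_raw:
        indices = set()                      # a set keeps each word index once
        for word in review.split(" "):
            if word in word2index:           # skip words missing from the vocabulary
                indices.add(word2index[word])
        converted.append(sorted(indices))    # sorted only to make the output stable
    return converted

# Hypothetical toy data for illustration only
toy_word2index = {"good": 0, "bad": 1, "movie": 2}
toy_reviews = ["good good movie", "bad movie unknown"]
print(reviews_to_indices(toy_reviews, toy_word2index))
```

Using a set mirrors the binary input from Project 4: a word that appears several times still contributes its weights only once.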

In [81]:
import time
import sys
import numpy as np

# Encapsulate our neural network in a class
class SentimentNetwork:
    def __init__(self, reviews, labels, hidden_nodes = 10, learning_rate = 0.1):
        """Create a SentimentNetwork with the given settings
        Args:
            reviews(list) - List of reviews used for training
            labels(list) - List of POSITIVE/NEGATIVE labels associated with the given reviews
            hidden_nodes(int) - Number of nodes to create in the hidden layer
            learning_rate(float) - Learning rate to use while training
        
        """
        # Assign a seed to our random number generator to ensure we get
        # reproducible results during development
        np.random.seed(1)

        # process the reviews and their associated labels so that everything
        # is ready for training
        self.pre_process_data(reviews, labels)
        
        # Build the network to have the number of hidden nodes and the learning rate that
        # were passed into this initializer. Make the same number of input nodes as
        # there are vocabulary words and create a single output node.
        self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)

    def pre_process_data(self, reviews, labels):
        
        # populate review_vocab with all of the words in the given reviews
        review_vocab = set()
        for review in reviews:
            for word in review.split(" "):
                review_vocab.add(word)

        # Convert the vocabulary set to a list so we can access words via indices
        self.review_vocab = list(review_vocab)
        
        # populate label_vocab with all of the words in the given labels.
        label_vocab = set()
        for label in labels:
            label_vocab.add(label)
        
        # Convert the label vocabulary set to a list so we can access labels via indices
        self.label_vocab = list(label_vocab)
        
        # Store the sizes of the review and label vocabularies.
        self.review_vocab_size = len(self.review_vocab)
        self.label_vocab_size = len(self.label_vocab)
        
        # Create a dictionary of words in the vocabulary mapped to index positions
        self.word2index = {}
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
        
        # Create a dictionary of labels mapped to index positions
        self.label2index = {}
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i

    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Set number of nodes in input, hidden and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Store the learning rate
        self.learning_rate = learning_rate

        # Initialize weights

        # These are the weights between the input layer and the hidden layer.
        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))

        # These are the weights between the hidden layer and the output layer.
        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, 
                                                (self.hidden_nodes, self.output_nodes))
        
        ## New for Project 5: Removed self.layer_0; added self.layer_1
        # The hidden layer, a two-dimensional matrix with shape 1 x hidden_nodes
        self.layer_1 = np.zeros((1,hidden_nodes))
    
    ## New for Project 5: Removed update_input_layer function
    
    def get_target_for_label(self,label):
        if(label == 'POSITIVE'):
            return 1
        else:
            return 0
        
    def sigmoid(self,x):
        return 1 / (1 + np.exp(-x))
    
    def sigmoid_output_2_derivative(self,output):
        return output * (1 - output)
    
    ## New for Project 5: changed name of first parameter from 'training_reviews' 
    #                     to 'training_reviews_raw'
    def train(self, training_reviews_raw, training_labels):

        ## New for Project 5: pre-process training reviews so we can deal 
        #                     directly with the indices of non-zero inputs
        training_reviews = list()
        for review in training_reviews_raw:
            indices = set()
            for word in review.split(" "):
                if(word in self.word2index.keys()):
                    indices.add(self.word2index[word])
            training_reviews.append(list(indices))

        # make sure we have a matching number of reviews and labels
        assert(len(training_reviews) == len(training_labels))
        
        # Keep track of correct predictions to display accuracy during training 
        correct_so_far = 0

        # Remember when we started for printing time statistics
        start = time.time()
        
        # loop through all the given reviews and run a forward and backward pass,
        # updating weights for every item
        for i in range(len(training_reviews)):
            
            # Get the next review and its correct label
            review = training_reviews[i]
            label = training_labels[i]
            
            ### Forward pass ###

            ## New for Project 5: Removed call to 'update_input_layer' function
            #                     because 'layer_0' is no longer used

            # Hidden layer
            ## New for Project 5: Add in only the weights for non-zero items
            self.layer_1 *= 0
            for index in review:
                self.layer_1 += self.weights_0_1[index]

            # Output layer
            ## New for Project 5: changed to use 'self.layer_1' instead of local 'layer_1'
            layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))            
            
            ### Backward pass ###

            # Output error
            layer_2_error = layer_2 - self.get_target_for_label(label) # output error: actual output minus desired target
            layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)

            # Backpropagated error
            layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer
            layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error

            # Update the weights
            ## New for Project 5: changed to use 'self.layer_1' instead of local 'layer_1'
            self.weights_1_2 -= self.layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step
            
            ## New for Project 5: Only update the weights that were used in the forward pass
            for index in review:
                self.weights_0_1[index] -= layer_1_delta[0] * self.learning_rate # update input-to-hidden weights with gradient descent step

            # Keep track of correct predictions.
            if(layer_2 >= 0.5 and label == 'POSITIVE'):
                correct_so_far += 1
            elif(layer_2 < 0.5 and label == 'NEGATIVE'):
                correct_so_far += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the training process. 
            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(training_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct_so_far) + " #Trained:" + str(i+1) \
                             + " Training Accuracy:" + str(correct_so_far * 100 / float(i+1))[:4] + "%")
            if(i % 2500 == 0):
                print("")
    
    def test(self, testing_reviews, testing_labels):
        """
        Attempts to predict the labels for the given testing_reviews,
        and uses the test_labels to calculate the accuracy of those predictions.
        """
        
        # keep track of how many correct predictions we make
        correct = 0

        # we'll time how many predictions per second we make
        start = time.time()

        # Loop through each of the given reviews and call run to predict
        # its label. 
        for i in range(len(testing_reviews)):
            pred = self.run(testing_reviews[i])
            if(pred == testing_labels[i]):
                correct += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the prediction process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(testing_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct) + " #Tested:" + str(i+1) \
                             + " Testing Accuracy:" + str(correct * 100 / float(i+1))[:4] + "%")
    
    def run(self, review):
        """
        Returns a POSITIVE or NEGATIVE prediction for the given review.
        """
        # Run a forward pass through the network, like in the "train" function.
        
        ## New for Project 5: Removed call to update_input_layer function
        #                     because layer_0 is no longer used

        # Hidden layer
        ## New for Project 5: Identify the indices used in the review and then add
        #                     just those weights to layer_1 
        self.layer_1 *= 0
        unique_indices = set()
        for word in review.lower().split(" "):
            if word in self.word2index.keys():
                unique_indices.add(self.word2index[word])
        for index in unique_indices:
            self.layer_1 += self.weights_0_1[index]
        
        # Output layer
        ## New for Project 5: changed to use self.layer_1 instead of local layer_1
        layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))
        
        # Return POSITIVE for values greater than or equal to 0.5 in the output layer;
        # return NEGATIVE for other values
        if(layer_2[0] >= 0.5):
            return "POSITIVE"
        else:
            return "NEGATIVE"


Run the following cell to recreate the network and train it once again.

In [82]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000], learning_rate=0.1)
mlp.train(reviews[:-1000],labels[:-1000])

Progress:0.0% Speed(reviews/sec):0 #Correct:1 #Trained:1 Training Accuracy:100.%


Progress:0.04% Speed(reviews/sec):703.9 #Correct:5 #Trained:12 Training Accuracy:41.6%
Progress:0.96% Speed(reviews/sec):590.5 #Correct:130 #Trained:232 Training Accuracy:56.0%
Progress:1.4% Speed(reviews/sec):559.5 #Correct:206 #Trained:337 Training Accuracy:61.1%
Progress:1.96% Speed(reviews/sec):574.7 #Correct:291 #Trained:473 Training Accuracy:61.5%
Progress:2.48% Speed(reviews/sec):580.3 #Correct:371 #Trained:598 Training Accuracy:62.0%

Progress:3.00% Speed(reviews/sec):573.1 #Correct:450 #Trained:722 Training Accuracy:62.3%Progress:3.00% Speed(reviews/sec):573.8 #Correct:451 #Trained:723 Training Accuracy:62.3%Progress:3.01% Speed(reviews/sec):574.6 #Correct:452 #Trained:724 Training Accuracy:62.4%Progress:3.01% Speed(reviews/sec):575.4 #Correct:453 #Trained:725 Training Accuracy:62.4%Progress:3.02% Speed(reviews/sec):576.2 #Correct:453 #Trained:726 Training Accuracy:62.3%Progress:3.02% Speed(reviews/sec):577.0 #Correct:454 #Trained:727 Training Accuracy:62.4%Progress:3.02% Speed(reviews/sec):577.8 #Correct:454 #Trained:728 Training Accuracy:62.3%Progress:3.03% Speed(reviews/sec):578.6 #Correct:455 #Trained:729 Training Accuracy:62.4%Progress:3.03% Speed(reviews/sec):572.3 #Correct:455 #Trained:730 Training Accuracy:62.3%Progress:3.04% Speed(reviews/sec):573.1 #Correct:456 #Trained:731 Training Accuracy:62.3%Progress:3.04% Speed(reviews/sec):573.9 #Correct:457 #Trained:732 Training Accuracy:62.4%Progress:

Progress:3.58% Speed(reviews/sec):576.2 #Correct:551 #Trained:862 Training Accuracy:63.9%Progress:3.59% Speed(reviews/sec):576.9 #Correct:551 #Trained:863 Training Accuracy:63.8%Progress:3.59% Speed(reviews/sec):577.6 #Correct:552 #Trained:864 Training Accuracy:63.8%Progress:3.6% Speed(reviews/sec):578.3 #Correct:552 #Trained:865 Training Accuracy:63.8%Progress:3.60% Speed(reviews/sec):578.9 #Correct:553 #Trained:866 Training Accuracy:63.8%Progress:3.60% Speed(reviews/sec):579.6 #Correct:554 #Trained:867 Training Accuracy:63.8%Progress:3.61% Speed(reviews/sec):580.3 #Correct:555 #Trained:868 Training Accuracy:63.9%Progress:3.61% Speed(reviews/sec):580.9 #Correct:556 #Trained:869 Training Accuracy:63.9%Progress:3.62% Speed(reviews/sec):581.6 #Correct:557 #Trained:870 Training Accuracy:64.0%Progress:3.62% Speed(reviews/sec):582.3 #Correct:558 #Trained:871 Training Accuracy:64.0%Progress:3.62% Speed(reviews/sec):576.9 #Correct:559 #Trained:872 Training Accuracy:64.1%Progress:3

Progress:4.02% Speed(reviews/sec):571.3 #Correct:627 #Trained:967 Training Accuracy:64.8%Progress:4.02% Speed(reviews/sec):571.9 #Correct:628 #Trained:968 Training Accuracy:64.8%Progress:4.03% Speed(reviews/sec):572.4 #Correct:629 #Trained:969 Training Accuracy:64.9%Progress:4.03% Speed(reviews/sec):573.0 #Correct:629 #Trained:970 Training Accuracy:64.8%Progress:4.04% Speed(reviews/sec):573.6 #Correct:630 #Trained:971 Training Accuracy:64.8%Progress:4.04% Speed(reviews/sec):574.2 #Correct:631 #Trained:972 Training Accuracy:64.9%Progress:4.05% Speed(reviews/sec):574.8 #Correct:632 #Trained:973 Training Accuracy:64.9%Progress:4.05% Speed(reviews/sec):575.4 #Correct:633 #Trained:974 Training Accuracy:64.9%Progress:4.05% Speed(reviews/sec):570.7 #Correct:634 #Trained:975 Training Accuracy:65.0%Progress:4.06% Speed(reviews/sec):571.3 #Correct:634 #Trained:976 Training Accuracy:64.9%Progress:4.06% Speed(reviews/sec):571.9 #Correct:634 #Trained:977 Training Accuracy:64.8%Progress:

Progress:4.64% Speed(reviews/sec):578.7 #Correct:748 #Trained:1116 Training Accuracy:67.0%Progress:4.65% Speed(reviews/sec):579.2 #Correct:749 #Trained:1117 Training Accuracy:67.0%Progress:4.65% Speed(reviews/sec):579.7 #Correct:750 #Trained:1118 Training Accuracy:67.0%Progress:4.65% Speed(reviews/sec):580.3 #Correct:750 #Trained:1119 Training Accuracy:67.0%Progress:4.66% Speed(reviews/sec):580.8 #Correct:750 #Trained:1120 Training Accuracy:66.9%Progress:4.66% Speed(reviews/sec):581.3 #Correct:750 #Trained:1121 Training Accuracy:66.9%Progress:4.67% Speed(reviews/sec):581.8 #Correct:751 #Trained:1122 Training Accuracy:66.9%Progress:4.67% Speed(reviews/sec):582.3 #Correct:752 #Trained:1123 Training Accuracy:66.9%Progress:4.67% Speed(reviews/sec):582.9 #Correct:753 #Trained:1124 Training Accuracy:66.9%Progress:4.68% Speed(reviews/sec):583.4 #Correct:754 #Trained:1125 Training Accuracy:67.0%Progress:4.68% Speed(reviews/sec):579.2 #Correct:755 #Trained:1126 Training Accuracy:67.0

Progress:5.16% Speed(reviews/sec):578.0 #Correct:844 #Trained:1241 Training Accuracy:68.0%Progress:5.17% Speed(reviews/sec):577.7 #Correct:844 #Trained:1242 Training Accuracy:67.9%Progress:5.17% Speed(reviews/sec):577.9 #Correct:844 #Trained:1243 Training Accuracy:67.9%Progress:5.17% Speed(reviews/sec):577.8 #Correct:844 #Trained:1244 Training Accuracy:67.8%Progress:5.18% Speed(reviews/sec):577.4 #Correct:845 #Trained:1245 Training Accuracy:67.8%Progress:5.18% Speed(reviews/sec):577.4 #Correct:845 #Trained:1246 Training Accuracy:67.8%Progress:5.19% Speed(reviews/sec):576.8 #Correct:845 #Trained:1247 Training Accuracy:67.7%Progress:5.19% Speed(reviews/sec):576.7 #Correct:845 #Trained:1248 Training Accuracy:67.7%Progress:5.2% Speed(reviews/sec):576.9 #Correct:846 #Trained:1249 Training Accuracy:67.7%Progress:5.20% Speed(reviews/sec):577.1 #Correct:847 #Trained:1250 Training Accuracy:67.7%Progress:5.20% Speed(reviews/sec):576.8 #Correct:848 #Trained:1251 Training Accuracy:67.7%

Progress:5.64% Speed(reviews/sec):578.9 #Correct:925 #Trained:1355 Training Accuracy:68.2%Progress:5.64% Speed(reviews/sec):579.3 #Correct:926 #Trained:1356 Training Accuracy:68.2%Progress:5.65% Speed(reviews/sec):579.7 #Correct:927 #Trained:1357 Training Accuracy:68.3%Progress:5.65% Speed(reviews/sec):580.1 #Correct:928 #Trained:1358 Training Accuracy:68.3%Progress:5.65% Speed(reviews/sec):580.6 #Correct:928 #Trained:1359 Training Accuracy:68.2%Progress:5.66% Speed(reviews/sec):581.0 #Correct:929 #Trained:1360 Training Accuracy:68.3%Progress:5.66% Speed(reviews/sec):581.4 #Correct:930 #Trained:1361 Training Accuracy:68.3%Progress:5.67% Speed(reviews/sec):581.8 #Correct:930 #Trained:1362 Training Accuracy:68.2%Progress:5.67% Speed(reviews/sec):582.3 #Correct:930 #Trained:1363 Training Accuracy:68.2%Progress:5.67% Speed(reviews/sec):582.7 #Correct:931 #Trained:1364 Training Accuracy:68.2%Progress:5.68% Speed(reviews/sec):579.3 #Correct:931 #Trained:1365 Training Accuracy:68.2

Progress:6.24% Speed(reviews/sec):584.8 #Correct:1025 #Trained:1500 Training Accuracy:68.3%Progress:6.25% Speed(reviews/sec):585.2 #Correct:1026 #Trained:1501 Training Accuracy:68.3%Progress:6.25% Speed(reviews/sec):585.6 #Correct:1026 #Trained:1502 Training Accuracy:68.3%Progress:6.25% Speed(reviews/sec):586.0 #Correct:1026 #Trained:1503 Training Accuracy:68.2%Progress:6.26% Speed(reviews/sec):586.3 #Correct:1027 #Trained:1504 Training Accuracy:68.2%Progress:6.26% Speed(reviews/sec):586.7 #Correct:1028 #Trained:1505 Training Accuracy:68.3%Progress:6.27% Speed(reviews/sec):587.1 #Correct:1029 #Trained:1506 Training Accuracy:68.3%Progress:6.27% Speed(reviews/sec):587.5 #Correct:1030 #Trained:1507 Training Accuracy:68.3%Progress:6.27% Speed(reviews/sec):587.9 #Correct:1030 #Trained:1508 Training Accuracy:68.3%Progress:6.28% Speed(reviews/sec):584.7 #Correct:1030 #Trained:1509 Training Accuracy:68.2%Progress:6.28% Speed(reviews/sec):585.1 #Correct:1031 #Trained:1510 Training Ac

Progress:6.81% Speed(reviews/sec):588.8 #Correct:1131 #Trained:1636 Training Accuracy:69.1%Progress:6.81% Speed(reviews/sec):589.1 #Correct:1132 #Trained:1637 Training Accuracy:69.1%Progress:6.82% Speed(reviews/sec):589.5 #Correct:1133 #Trained:1638 Training Accuracy:69.1%Progress:6.82% Speed(reviews/sec):589.9 #Correct:1133 #Trained:1639 Training Accuracy:69.1%Progress:6.82% Speed(reviews/sec):590.2 #Correct:1133 #Trained:1640 Training Accuracy:69.0%Progress:6.83% Speed(reviews/sec):590.6 #Correct:1134 #Trained:1641 Training Accuracy:69.1%Progress:6.83% Speed(reviews/sec):590.9 #Correct:1135 #Trained:1642 Training Accuracy:69.1%Progress:6.84% Speed(reviews/sec):591.3 #Correct:1136 #Trained:1643 Training Accuracy:69.1%Progress:6.84% Speed(reviews/sec):591.7 #Correct:1136 #Trained:1644 Training Accuracy:69.0%Progress:6.85% Speed(reviews/sec):592.0 #Correct:1137 #Trained:1645 Training Accuracy:69.1%Progress:6.85% Speed(reviews/sec):592.4 #Correct:1137 #Trained:1646 Training Ac

Progress:7.33% Speed(reviews/sec):592.1 #Correct:1230 #Trained:1761 Training Accuracy:69.8%Progress:7.33% Speed(reviews/sec):592.4 #Correct:1230 #Trained:1762 Training Accuracy:69.8%Progress:7.34% Speed(reviews/sec):592.7 #Correct:1231 #Trained:1763 Training Accuracy:69.8%Progress:7.34% Speed(reviews/sec):593.1 #Correct:1231 #Trained:1764 Training Accuracy:69.7%Progress:7.35% Speed(reviews/sec):593.4 #Correct:1231 #Trained:1765 Training Accuracy:69.7%Progress:7.35% Speed(reviews/sec):593.7 #Correct:1232 #Trained:1766 Training Accuracy:69.7%Progress:7.35% Speed(reviews/sec):594.1 #Correct:1233 #Trained:1767 Training Accuracy:69.7%Progress:7.36% Speed(reviews/sec):594.4 #Correct:1234 #Trained:1768 Training Accuracy:69.7%Progress:7.36% Speed(reviews/sec):594.7 #Correct:1235 #Trained:1769 Training Accuracy:69.8%Progress:7.37% Speed(reviews/sec):595.1 #Correct:1236 #Trained:1770 Training Accuracy:69.8%Progress:7.37% Speed(reviews/sec):595.4 #Correct:1237 #Trained:1771 Training Ac

Progress:7.89% Speed(reviews/sec):595.5 #Correct:1333 #Trained:1895 Training Accuracy:70.3%Progress:7.89% Speed(reviews/sec):595.2 #Correct:1334 #Trained:1896 Training Accuracy:70.3%Progress:7.9% Speed(reviews/sec):595.2 #Correct:1335 #Trained:1897 Training Accuracy:70.3%Progress:7.90% Speed(reviews/sec):595.1 #Correct:1335 #Trained:1898 Training Accuracy:70.3%Progress:7.90% Speed(reviews/sec):595.0 #Correct:1336 #Trained:1899 Training Accuracy:70.3%Progress:7.91% Speed(reviews/sec):595.0 #Correct:1337 #Trained:1900 Training Accuracy:70.3%Progress:7.91% Speed(reviews/sec):595.1 #Correct:1338 #Trained:1901 Training Accuracy:70.3%Progress:7.92% Speed(reviews/sec):595.0 #Correct:1339 #Trained:1902 Training Accuracy:70.3%Progress:7.92% Speed(reviews/sec):595.2 #Correct:1340 #Trained:1903 Training Accuracy:70.4%Progress:7.92% Speed(reviews/sec):595.2 #Correct:1341 #Trained:1904 Training Accuracy:70.4%Progress:7.93% Speed(reviews/sec):595.5 #Correct:1341 #Trained:1905 Training Acc

Progress:8.41% Speed(reviews/sec):597.7 #Correct:1427 #Trained:2020 Training Accuracy:70.6%Progress:8.41% Speed(reviews/sec):598.0 #Correct:1428 #Trained:2021 Training Accuracy:70.6%Progress:8.42% Speed(reviews/sec):598.3 #Correct:1429 #Trained:2022 Training Accuracy:70.6%Progress:8.42% Speed(reviews/sec):598.5 #Correct:1430 #Trained:2023 Training Accuracy:70.6%Progress:8.42% Speed(reviews/sec):597.3 #Correct:1431 #Trained:2024 Training Accuracy:70.7%Progress:8.43% Speed(reviews/sec):597.2 #Correct:1432 #Trained:2025 Training Accuracy:70.7%Progress:8.43% Speed(reviews/sec):597.4 #Correct:1433 #Trained:2026 Training Accuracy:70.7%Progress:8.44% Speed(reviews/sec):597.5 #Correct:1434 #Trained:2027 Training Accuracy:70.7%Progress:8.44% Speed(reviews/sec):597.4 #Correct:1435 #Trained:2028 Training Accuracy:70.7%Progress:8.45% Speed(reviews/sec):597.2 #Correct:1436 #Trained:2029 Training Accuracy:70.7%Progress:8.45% Speed(reviews/sec):597.1 #Correct:1436 #Trained:2030 Training Ac

Progress:8.93% Speed(reviews/sec):597.7 #Correct:1526 #Trained:2146 Training Accuracy:71.1%Progress:8.94% Speed(reviews/sec):597.8 #Correct:1527 #Trained:2147 Training Accuracy:71.1%Progress:8.94% Speed(reviews/sec):597.7 #Correct:1528 #Trained:2148 Training Accuracy:71.1%Progress:8.95% Speed(reviews/sec):597.9 #Correct:1529 #Trained:2149 Training Accuracy:71.1%Progress:8.95% Speed(reviews/sec):597.3 #Correct:1530 #Trained:2150 Training Accuracy:71.1%Progress:8.95% Speed(reviews/sec):597.3 #Correct:1531 #Trained:2151 Training Accuracy:71.1%Progress:8.96% Speed(reviews/sec):597.4 #Correct:1532 #Trained:2152 Training Accuracy:71.1%Progress:8.96% Speed(reviews/sec):597.5 #Correct:1533 #Trained:2153 Training Accuracy:71.2%Progress:8.97% Speed(reviews/sec):597.6 #Correct:1534 #Trained:2154 Training Accuracy:71.2%Progress:8.97% Speed(reviews/sec):597.5 #Correct:1535 #Trained:2155 Training Accuracy:71.2%Progress:8.97% Speed(reviews/sec):597.5 #Correct:1536 #Trained:2156 Training Ac

Progress:9.37% Speed(reviews/sec):595.5 #Correct:1616 #Trained:2250 Training Accuracy:71.8%Progress:9.37% Speed(reviews/sec):595.8 #Correct:1617 #Trained:2251 Training Accuracy:71.8%Progress:9.37% Speed(reviews/sec):596.0 #Correct:1618 #Trained:2252 Training Accuracy:71.8%Progress:9.38% Speed(reviews/sec):596.3 #Correct:1619 #Trained:2253 Training Accuracy:71.8%Progress:9.38% Speed(reviews/sec):596.6 #Correct:1620 #Trained:2254 Training Accuracy:71.8%Progress:9.39% Speed(reviews/sec):596.8 #Correct:1621 #Trained:2255 Training Accuracy:71.8%Progress:9.39% Speed(reviews/sec):597.1 #Correct:1622 #Trained:2256 Training Accuracy:71.8%Progress:9.4% Speed(reviews/sec):597.3 #Correct:1623 #Trained:2257 Training Accuracy:71.9%Progress:9.40% Speed(reviews/sec):597.6 #Correct:1623 #Trained:2258 Training Accuracy:71.8%Progress:9.40% Speed(reviews/sec):597.9 #Correct:1624 #Trained:2259 Training Accuracy:71.8%Progress:9.41% Speed(reviews/sec):598.1 #Correct:1625 #Trained:2260 Training Acc

Progress:9.83% Speed(reviews/sec):592.8 #Correct:1699 #Trained:2361 Training Accuracy:71.9%Progress:9.83% Speed(reviews/sec):593.1 #Correct:1699 #Trained:2362 Training Accuracy:71.9%Progress:9.84% Speed(reviews/sec):593.3 #Correct:1700 #Trained:2363 Training Accuracy:71.9%Progress:9.84% Speed(reviews/sec):593.6 #Correct:1700 #Trained:2364 Training Accuracy:71.9%Progress:9.85% Speed(reviews/sec):593.8 #Correct:1701 #Trained:2365 Training Accuracy:71.9%Progress:9.85% Speed(reviews/sec):594.1 #Correct:1702 #Trained:2366 Training Accuracy:71.9%Progress:9.85% Speed(reviews/sec):594.3 #Correct:1703 #Trained:2367 Training Accuracy:71.9%Progress:9.86% Speed(reviews/sec):594.6 #Correct:1704 #Trained:2368 Training Accuracy:71.9%Progress:9.86% Speed(reviews/sec):594.8 #Correct:1705 #Trained:2369 Training Accuracy:71.9%Progress:9.87% Speed(reviews/sec):595.1 #Correct:1706 #Trained:2370 Training Accuracy:71.9%Progress:9.87% Speed(reviews/sec):595.3 #Correct:1706 #Trained:2371 Training Ac

Progress:10.3% Speed(reviews/sec):590.0 #Correct:1781 #Trained:2475 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):589.9 #Correct:1782 #Trained:2476 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):589.9 #Correct:1783 #Trained:2477 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):590.0 #Correct:1784 #Trained:2478 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):589.9 #Correct:1785 #Trained:2479 Training Accuracy:72.0%Progress:10.3% Speed(reviews/sec):589.9 #Correct:1785 #Trained:2480 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):590.0 #Correct:1786 #Trained:2481 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):590.1 #Correct:1787 #Trained:2482 Training Accuracy:71.9%Progress:10.3% Speed(reviews/sec):590.0 #Correct:1788 #Trained:2483 Training Accuracy:72.0%Progress:10.3% Speed(reviews/sec):590.1 #Correct:1789 #Trained:2484 Training Accuracy:72.0%Progress:10.3% Speed(reviews/sec):590.1 #Correct:1790 #Trained:2485 Training Ac

Progress:10.4% Speed(reviews/sec):588.7 #Correct:1804 #Trained:2502 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):588.6 #Correct:1805 #Trained:2503 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):588.5 #Correct:1806 #Trained:2504 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):588.2 #Correct:1807 #Trained:2505 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):588.1 #Correct:1808 #Trained:2506 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):588.1 #Correct:1809 #Trained:2507 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):587.6 #Correct:1809 #Trained:2508 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):587.2 #Correct:1809 #Trained:2509 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):587.0 #Correct:1810 #Trained:2510 Training Accuracy:72.1%Progress:10.4% Speed(reviews/sec):586.9 #Correct:1810 #Trained:2511 Training Accuracy:72.0%Progress:10.4% Speed(reviews/sec):587.0 #Correct:1811 #Trained:2512 Training Ac

Progress:20.8% Speed(reviews/sec):560.8 #Correct:3788 #Trained:5001 Training Accuracy:75.7%
Progress:31.2% Speed(reviews/sec):575.0 #Correct:5868 #Trained:7501 Training Accuracy:78.2%
Progress:41.6% Speed(reviews/sec):583.5 #Correct:8002 #Trained:10001 Training Accuracy:80.0%
Progress:52.0% Speed(reviews/sec):583.9 #Correct:10125 #Trained:12501 Training Accuracy:80.9%
Progress:62.5% Speed(reviews/sec):584.1 #Correct:12259 #Trained:15001 Training Accuracy:81.7%
Progress:72.9% Speed(reviews/sec):576.6 #Correct:14385 #Trained:17501 Training Accuracy:82.1%
Progress:83.3% Speed(reviews/sec):578.5 #Correct:16565 #Trained:20001 Training Accuracy:82.8%
Progress:93.7% Speed(reviews/sec):577.1 #Correct:18744 #Trained:22501 Training Accuracy:83.3%
Progress:99.9% Speed(reviews/sec):574.1 #Correct:20065 #Trained:24000 Training Accuracy:83.6%

That should have trained much better than the earlier attempts. Run the following cell to test your model with 1000 predictions.

In [83]:
mlp.test(reviews[-1000:],labels[-1000:])

Progress:16.6% Speed(reviews/sec):830.4 #Correct:148 #Tested:167 Testing Accuracy:88.6%
Progress:35.5% Speed(reviews/sec):891.1 #Correct:312 #Tested:356 Testing Accuracy:87.6%
Progress:53.4% Speed(reviews/sec):891.9 #Correct:467 #Tested:535 Testing Accuracy:87.2%
Progress:71.9% Speed(reviews/sec):916.9 #Correct:615 #Tested:720 Testing Accuracy:85.4%
Progress:89.3% Speed(reviews/sec):923.6 #Correct:760 #Tested:894 Testing Accuracy:85.0%

# End of Project 5. 
## Watch the next video to see Andrew's solution, then continue on to the next lesson.
# Further Noise Reduction<a id='lesson_6'></a>

In [None]:
Image(filename='sentiment_network_sparse_2.png')

In [None]:
# words most frequently seen in a review with a "POSITIVE" label
pos_neg_ratios.most_common()

In [None]:
# words most frequently seen in a review with a "NEGATIVE" label
list(reversed(pos_neg_ratios.most_common()))[0:30]

In [None]:
from bokeh.models import ColumnDataSource, LabelSet
from bokeh.plotting import figure, show, output_file
from bokeh.io import output_notebook
output_notebook()

In [None]:
# `normed` was removed from np.histogram in recent NumPy releases; `density=True` covers it
hist, edges = np.histogram(list(map(lambda x:x[1],pos_neg_ratios.most_common())), density=True, bins=100)

p = figure(tools="pan,wheel_zoom,reset,save",
           toolbar_location="above",
           title="Word Positive/Negative Affinity Distribution")
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color="#555555")
show(p)

In [None]:
frequency_frequency = Counter()

for word, cnt in total_counts.most_common():
    frequency_frequency[cnt] += 1

In [None]:
# `normed` was removed from np.histogram in recent NumPy releases; `density=True` covers it
hist, edges = np.histogram(list(map(lambda x:x[1],frequency_frequency.most_common())), density=True, bins=100)

p = figure(tools="pan,wheel_zoom,reset,save",
           toolbar_location="above",
           title="The frequency distribution of the words in our corpus")
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color="#555555")
show(p)

# Project 6: Reducing Noise by Strategically Reducing the Vocabulary<a id='project_6'></a>

**TODO:** Improve `SentimentNetwork`'s performance by reducing more noise in the vocabulary. Specifically, do the following:
* Copy the `SentimentNetwork` class from the previous project into the following cell.
* Modify `pre_process_data`:
>* Add two additional parameters: `min_count` and `polarity_cutoff`
>* Calculate the positive-to-negative ratios of words used in the reviews. (You can use code you've written elsewhere in the notebook, but we are moving it into the class like we did with other helper code earlier.)
>* Andrew's solution only calculates a positive-to-negative ratio for words that occur at least 50 times. This keeps the network from attributing too much sentiment to rarer words. You can choose to add this to your solution if you would like.
>* Change the code so that words are only added to the vocabulary if they occur in the reviews more than `min_count` times.
>* Change the code so that words are only added to the vocabulary if the absolute value of their positive-to-negative ratio is at least `polarity_cutoff`.
* Modify `__init__`:
>* Add the same two parameters (`min_count` and `polarity_cutoff`) and use them when you call `pre_process_data`
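
Before modifying the class, it can help to see the two filters on a tiny example. The following is only a minimal sketch, not Andrew's solution: the toy reviews, the `build_vocab` helper, and the cutoff values are all made up for illustration, and unlike the solution it computes a ratio for every word instead of only for words seen at least 50 times.

```python
from collections import Counter
import math

# Hypothetical toy corpus (the real notebook reads reviews.txt / labels.txt)
toy_reviews = ["great great movie", "great fun movie",
               "terrible boring movie", "terrible movie"]
toy_labels = ["POSITIVE", "POSITIVE", "NEGATIVE", "NEGATIVE"]

def build_vocab(reviews, labels, min_count, polarity_cutoff):
    """Illustrative helper: return (vocab, polarity) after both filters."""
    positive_counts = Counter()
    negative_counts = Counter()
    total_counts = Counter()
    for review, label in zip(reviews, labels):
        for word in review.split(" "):
            total_counts[word] += 1
            if label == "POSITIVE":
                positive_counts[word] += 1
            else:
                negative_counts[word] += 1

    # Log of the smoothed positive-to-negative ratio, using the same
    # formula as the notebook: positive words end up > 0, negative words < 0
    polarity = {}
    for word in total_counts:
        ratio = positive_counts[word] / float(negative_counts[word] + 1)
        if ratio > 1:
            polarity[word] = math.log(ratio)
        else:
            polarity[word] = -math.log(1 / (ratio + 0.01))

    # Keep a word only if it is frequent enough AND polarized enough
    vocab = {word for word, cnt in total_counts.items()
             if cnt > min_count and abs(polarity[word]) >= polarity_cutoff}
    return vocab, polarity

vocab, polarity = build_vocab(toy_reviews, toy_labels,
                              min_count=1, polarity_cutoff=0.5)
print(sorted(vocab))
```

Here "fun" and "boring" are dropped by `min_count` because they occur only once, and "movie" is dropped by `polarity_cutoff` because it appears equally often in both classes, so only the frequent, clearly polarized words survive.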

In [None]:
import time
import sys
import numpy as np
from collections import Counter  # used by pre_process_data below

# Encapsulate our neural network in a class
class SentimentNetwork:
    ## New for Project 6: added min_count and polarity_cutoff parameters
    def __init__(self, reviews,labels,min_count = 10,polarity_cutoff = 0.1,hidden_nodes = 10, learning_rate = 0.1):
        """Create a SentimentNetwork with the given settings
        Args:
            reviews(list) - List of reviews used for training
            labels(list) - List of POSITIVE/NEGATIVE labels associated with the given reviews
            min_count(int) - Words should only be added to the vocabulary 
                             if they occur more than this many times
            polarity_cutoff(float) - The absolute value of a word's positive-to-negative
                                     ratio must be at least this big to be considered.
            hidden_nodes(int) - Number of nodes to create in the hidden layer
            learning_rate(float) - Learning rate to use while training
        
        """
        # Assign a seed to our random number generator to ensure we get
        # reproducible results during development
        np.random.seed(1)

        # process the reviews and their associated labels so that everything
        # is ready for training
        ## New for Project 6: added min_count and polarity_cutoff arguments to pre_process_data call
        self.pre_process_data(reviews, labels, polarity_cutoff, min_count)
        
        # Build the network to have the number of hidden nodes and the learning rate that
        # were passed into this initializer. Make the same number of input nodes as
        # there are vocabulary words and create a single output node.
        self.init_network(len(self.review_vocab),hidden_nodes, 1, learning_rate)

    ## New for Project 6: added min_count and polarity_cutoff parameters
    def pre_process_data(self, reviews, labels, polarity_cutoff, min_count):
        
        ## ----------------------------------------
        ## New for Project 6: Calculate positive-to-negative ratios for words before
        #                     building vocabulary
        #
        positive_counts = Counter()
        negative_counts = Counter()
        total_counts = Counter()

        for i in range(len(reviews)):
            if(labels[i] == 'POSITIVE'):
                for word in reviews[i].split(" "):
                    positive_counts[word] += 1
                    total_counts[word] += 1
            else:
                for word in reviews[i].split(" "):
                    negative_counts[word] += 1
                    total_counts[word] += 1

        pos_neg_ratios = Counter()

        for term,cnt in list(total_counts.most_common()):
            if(cnt >= 50):
                pos_neg_ratio = positive_counts[term] / float(negative_counts[term]+1)
                pos_neg_ratios[term] = pos_neg_ratio

        for word,ratio in pos_neg_ratios.most_common():
            if(ratio > 1):
                pos_neg_ratios[word] = np.log(ratio)
            else:
                pos_neg_ratios[word] = -np.log((1 / (ratio + 0.01)))
        #
        ## end New for Project 6
        ## ----------------------------------------

        # populate review_vocab with all of the words in the given reviews
        review_vocab = set()
        for review in reviews:
            for word in review.split(" "):
                ## New for Project 6: only add words that occur more than min_count times
                #                     and for words with pos/neg ratios, only add words
                #                     that meet the polarity_cutoff
                if(total_counts[word] > min_count):
                    if(word in pos_neg_ratios.keys()):
                        if((pos_neg_ratios[word] >= polarity_cutoff) or (pos_neg_ratios[word] <= -polarity_cutoff)):
                            review_vocab.add(word)
                    else:
                        review_vocab.add(word)

        # Convert the vocabulary set to a list so we can access words via indices
        self.review_vocab = list(review_vocab)
        
        # populate label_vocab with all of the words in the given labels.
        label_vocab = set()
        for label in labels:
            label_vocab.add(label)
        
        # Convert the label vocabulary set to a list so we can access labels via indices
        self.label_vocab = list(label_vocab)
        
        # Store the sizes of the review and label vocabularies.
        self.review_vocab_size = len(self.review_vocab)
        self.label_vocab_size = len(self.label_vocab)
        
        # Create a dictionary of words in the vocabulary mapped to index positions
        self.word2index = {}
        for i, word in enumerate(self.review_vocab):
            self.word2index[word] = i
        
        # Create a dictionary of labels mapped to index positions
        self.label2index = {}
        for i, label in enumerate(self.label_vocab):
            self.label2index[label] = i

    def init_network(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # Set number of nodes in input, hidden and output layers.
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes

        # Store the learning rate
        self.learning_rate = learning_rate

        # Initialize weights

        # These are the weights between the input layer and the hidden layer.
        self.weights_0_1 = np.zeros((self.input_nodes,self.hidden_nodes))

        # These are the weights between the hidden layer and the output layer.
        self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, 
                                                (self.hidden_nodes, self.output_nodes))
        
        ## New for Project 5: Removed self.layer_0; added self.layer_1
        # The hidden layer, a two-dimensional matrix with shape 1 x hidden_nodes
        self.layer_1 = np.zeros((1,hidden_nodes))
    
    ## New for Project 5: Removed update_input_layer function
    
    def get_target_for_label(self,label):
        if(label == 'POSITIVE'):
            return 1
        else:
            return 0
        
    def sigmoid(self,x):
        return 1 / (1 + np.exp(-x))
    
    def sigmoid_output_2_derivative(self,output):
        return output * (1 - output)
    
    ## New for Project 5: changed name of first parameter from 'training_reviews' 
    #                     to 'training_reviews_raw'
    def train(self, training_reviews_raw, training_labels):

        ## New for Project 5: pre-process training reviews so we can deal 
        #                     directly with the indices of non-zero inputs
        training_reviews = list()
        for review in training_reviews_raw:
            indices = set()
            for word in review.split(" "):
                if(word in self.word2index.keys()):
                    indices.add(self.word2index[word])
            training_reviews.append(list(indices))

        # make sure we have a matching number of reviews and labels
        assert(len(training_reviews) == len(training_labels))
        
        # Keep track of correct predictions to display accuracy during training 
        correct_so_far = 0

        # Remember when we started for printing time statistics
        start = time.time()
        
        # loop through all the given reviews and run a forward and backward pass,
        # updating weights for every item
        for i in range(len(training_reviews)):
            
            # Get the next review and its correct label
            review = training_reviews[i]
            label = training_labels[i]
            
            #### Implement the forward pass here ####
            ### Forward pass ###

            ## New for Project 5: Removed call to 'update_input_layer' function
            #                     because 'layer_0' is no longer used

            # Hidden layer
            ## New for Project 5: Add in only the weights for non-zero items
            self.layer_1 *= 0
            for index in review:
                self.layer_1 += self.weights_0_1[index]

            # Output layer
            ## New for Project 5: changed to use 'self.layer_1' instead of 'local layer_1'
            layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))            
            
            #### Implement the backward pass here ####
            ### Backward pass ###

            # Output error
            layer_2_error = layer_2 - self.get_target_for_label(label) # Output layer error is the actual output minus the desired target.
            layer_2_delta = layer_2_error * self.sigmoid_output_2_derivative(layer_2)

            # Backpropagated error
            layer_1_error = layer_2_delta.dot(self.weights_1_2.T) # errors propagated to the hidden layer
            layer_1_delta = layer_1_error # hidden layer gradients - no nonlinearity so it's the same as the error

            # Update the weights
            ## New for Project 5: changed to use 'self.layer_1' instead of local 'layer_1'
            self.weights_1_2 -= self.layer_1.T.dot(layer_2_delta) * self.learning_rate # update hidden-to-output weights with gradient descent step
            
            ## New for Project 5: Only update the weights that were used in the forward pass
            for index in review:
                self.weights_0_1[index] -= layer_1_delta[0] * self.learning_rate # update input-to-hidden weights with gradient descent step

            # Keep track of correct predictions.
            if(layer_2 >= 0.5 and label == 'POSITIVE'):
                correct_so_far += 1
            elif(layer_2 < 0.5 and label == 'NEGATIVE'):
                correct_so_far += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the training process. 
            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(training_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct_so_far) + " #Trained:" + str(i+1) \
                             + " Training Accuracy:" + str(correct_so_far * 100 / float(i+1))[:4] + "%")
            if(i % 2500 == 0):
                print("")
    
    def test(self, testing_reviews, testing_labels):
        """
        Attempts to predict the labels for the given testing_reviews,
        and uses the testing_labels to calculate the accuracy of those predictions.
        """
        
        # keep track of how many correct predictions we make
        correct = 0

        # we'll time how many predictions per second we make
        start = time.time()

        # Loop through each of the given reviews and call run to predict
        # its label. 
        for i in range(len(testing_reviews)):
            pred = self.run(testing_reviews[i])
            if(pred == testing_labels[i]):
                correct += 1
            
            # For debug purposes, print out our prediction accuracy and speed 
            # throughout the prediction process. 

            elapsed_time = float(time.time() - start)
            reviews_per_second = i / elapsed_time if elapsed_time > 0 else 0
            
            sys.stdout.write("\rProgress:" + str(100 * i/float(len(testing_reviews)))[:4] \
                             + "% Speed(reviews/sec):" + str(reviews_per_second)[0:5] \
                             + " #Correct:" + str(correct) + " #Tested:" + str(i+1) \
                             + " Testing Accuracy:" + str(correct * 100 / float(i+1))[:4] + "%")
    
    def run(self, review):
        """
        Returns a POSITIVE or NEGATIVE prediction for the given review.
        """
        # Run a forward pass through the network, like in the "train" function.
        
        ## New for Project 5: Removed call to update_input_layer function
        #                     because layer_0 is no longer used

        # Hidden layer
        ## New for Project 5: Identify the indices used in the review and then add
        #                     just those weights to layer_1 
        self.layer_1 *= 0
        unique_indices = set()
        for word in review.lower().split(" "):
            if word in self.word2index.keys():
                unique_indices.add(self.word2index[word])
        for index in unique_indices:
            self.layer_1 += self.weights_0_1[index]
        
        # Output layer
        ## New for Project 5: changed to use self.layer_1 instead of local layer_1
        layer_2 = self.sigmoid(self.layer_1.dot(self.weights_1_2))
         
        # Return POSITIVE for values greater than or equal to 0.5 in the output layer;
        # return NEGATIVE for other values
        if(layer_2[0] >= 0.5):
            return "POSITIVE"
        else:
            return "NEGATIVE"
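
The `train` and `run` methods above never build the full input vector. They add up only the rows of `weights_0_1` that belong to words present in the review. The sketch below checks, on a tiny made-up example, that this gives the same hidden layer as multiplying a 0/1 input vector by the weight matrix. The sizes, the random weights, and the variable names are assumptions for illustration, and plain Python lists stand in for the NumPy arrays used in the class.

```python
import random

# Toy sizes and weights are illustrative, not taken from the notebook's data.
random.seed(0)
vocab_size, hidden_nodes = 6, 3
weights_0_1 = [[random.uniform(-1, 1) for _ in range(hidden_nodes)]
               for _ in range(vocab_size)]

# A review represented by the indices of the distinct words it contains.
review_indices = [1, 4, 5]

# Dense route: build the full 0/1 input vector and multiply by the weights.
layer_0 = [1 if i in review_indices else 0 for i in range(vocab_size)]
dense_layer_1 = [sum(layer_0[i] * weights_0_1[i][j] for i in range(vocab_size))
                 for j in range(hidden_nodes)]

# Sparse route: add up only the weight rows for words that are present.
sparse_layer_1 = [0.0] * hidden_nodes
for index in review_indices:
    for j in range(hidden_nodes):
        sparse_layer_1[j] += weights_0_1[index][j]

# The two routes should agree up to floating-point rounding.
same = all(abs(d - s) < 1e-12 for d, s in zip(dense_layer_1, sparse_layer_1))
print(same)
```

The sparse route touches 3 rows instead of all 6 here. With a vocabulary of tens of thousands of words and reviews of a few hundred words, the saving is proportionally much larger.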


Run the following cell to train your network with a small polarity cutoff.

In [None]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=20,polarity_cutoff=0.05,learning_rate=0.01)
mlp.train(reviews[:-1000],labels[:-1000])

And run the following cell to test its performance.

In [None]:
mlp.test(reviews[-1000:],labels[-1000:])

Run the following cell to train your network with a much larger polarity cutoff.

In [None]:
mlp = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=20,polarity_cutoff=0.8,learning_rate=0.01)
mlp.train(reviews[:-1000],labels[:-1000])

And run the following cell to test its performance.

In [None]:
mlp.test(reviews[-1000:],labels[-1000:])
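
The two runs above differ only in `polarity_cutoff`, which decides how many words survive into the vocabulary. The following is a minimal sketch of that filter on hypothetical word counts, using the same smoothing as `pre_process_data`. The counts, `log_ratio`, and `build_vocab` are made up for illustration. It also simplifies one detail: in the class, words seen fewer than 50 times have no ratio and are kept regardless of the cutoff, whereas every toy word here has a ratio.

```python
from collections import Counter
from math import log

# Hypothetical word counts, invented for illustration only.
positive_counts = Counter({"excellent": 90, "the": 500, "awful": 5, "plot": 60})
negative_counts = Counter({"excellent": 10, "the": 510, "awful": 95, "plot": 55})
total_counts = positive_counts + negative_counts

def log_ratio(word):
    """Log of the positive-to-negative ratio, smoothed the same way as the class."""
    ratio = positive_counts[word] / float(negative_counts[word] + 1)
    return log(ratio) if ratio > 1 else -log(1 / (ratio + 0.01))

def build_vocab(min_count, polarity_cutoff):
    """Keep words seen more than min_count times whose |log ratio| meets the cutoff."""
    return sorted(word for word in total_counts
                  if total_counts[word] > min_count
                  and abs(log_ratio(word)) >= polarity_cutoff)

small_cutoff_vocab = build_vocab(min_count=20, polarity_cutoff=0.05)
large_cutoff_vocab = build_vocab(min_count=20, polarity_cutoff=0.8)
print(small_cutoff_vocab)
print(large_cutoff_vocab)
```

In this toy data, "the" is nearly neutral and is dropped by both cutoffs, while the mildly positive "plot" survives only the small one. A larger cutoff shrinks the vocabulary, which speeds up training, at the cost of discarding words that carry a weaker signal.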

# End of Project 6. 
## Watch the next video to see Andrew's solution, then continue on to the next lesson.

# Analysis: What's Going on in the Weights?<a id='lesson_7'></a>

In [None]:
mlp_full = SentimentNetwork(reviews[:-1000],labels[:-1000],min_count=0,polarity_cutoff=0,learning_rate=0.01)

In [None]:
mlp_full.train(reviews[:-1000],labels[:-1000])

In [None]:
Image(filename='sentiment_network_sparse.png')

In [None]:
def get_most_similar_words(focus = "horrible"):
    most_similar = Counter()

    for word in mlp_full.word2index.keys():
        most_similar[word] = np.dot(mlp_full.weights_0_1[mlp_full.word2index[word]],mlp_full.weights_0_1[mlp_full.word2index[focus]])
    
    return most_similar.most_common()
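
`get_most_similar_words` ranks words by the raw dot product of their weight vectors, so a word with a long vector can score highly even when it points in a different direction. One possible variation, not part of the original notebook, is to rank by cosine similarity instead. The sketch below uses hypothetical 2-d vectors in place of rows of `mlp_full.weights_0_1`, and `most_similar_by_cosine` is an illustrative name.

```python
from collections import Counter
from math import sqrt

# Hypothetical 2-d word vectors standing in for rows of mlp_full.weights_0_1.
toy_vectors = {
    "excellent": [0.9, 0.1],
    "great": [0.45, 0.05],     # same direction as "excellent", half the length
    "movie": [3.0, 3.0],       # long vector pointing in a different direction
    "terrible": [-0.8, -0.1],
}

def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def most_similar_by_cosine(focus, vectors):
    """Rank words by cosine similarity to the focus word's vector."""
    scores = Counter()
    focus_vec = vectors[focus]
    focus_norm = sqrt(dot(focus_vec, focus_vec))
    for word, vec in vectors.items():
        norm = sqrt(dot(vec, vec))
        # Guard against all-zero vectors to avoid dividing by zero.
        if norm and focus_norm:
            scores[word] = dot(vec, focus_vec) / (norm * focus_norm)
        else:
            scores[word] = 0.0
    return scores.most_common()

ranking = most_similar_by_cosine("excellent", toy_vectors)
print(ranking)
```

With the raw dot product, "movie" would outrank "excellent" itself in this toy data (3.0 versus 0.82) purely because of its length. Cosine similarity puts it below both "excellent" and "great".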

In [None]:
get_most_similar_words("excellent")

In [None]:
get_most_similar_words("terrible")

In [None]:
import matplotlib.colors as colors

words_to_visualize = list()
for word, ratio in pos_neg_ratios.most_common(500):
    if(word in mlp_full.word2index.keys()):
        words_to_visualize.append(word)
    
for word, ratio in list(reversed(pos_neg_ratios.most_common()))[0:500]:
    if(word in mlp_full.word2index.keys()):
        words_to_visualize.append(word)
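
The cell above takes the 500 most positive words from the front of `pos_neg_ratios.most_common()` and the 500 most negative from the reversed list. A small sketch of the same slicing on a hypothetical `Counter` (the words and values are invented for illustration):

```python
from collections import Counter

# Hypothetical log ratios, invented for illustration.
toy_ratios = Counter({"superb": 2.1, "fine": 0.2, "plot": 0.0,
                      "dull": -1.4, "awful": -2.8})

n = 2
# most_common(n) returns the n highest-valued entries, highest first.
most_positive = [word for word, _ in toy_ratios.most_common(n)]
# most_common() sorts from highest to lowest, so reversing it
# puts the most negative words first.
most_negative = [word for word, _ in list(reversed(toy_ratios.most_common()))[0:n]]

print(most_positive)
print(most_negative)
```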

In [None]:
pos = 0
neg = 0

colors_list = list()
vectors_list = list()
for word in words_to_visualize:
    if word in pos_neg_ratios.keys():
        vectors_list.append(mlp_full.weights_0_1[mlp_full.word2index[word]])
        if(pos_neg_ratios[word] > 0):
            pos+=1
            colors_list.append("#00ff00")
        else:
            neg+=1
            colors_list.append("#000000")

In [None]:
from sklearn.manifold import TSNE
tsne = TSNE(n_components=2, random_state=0)
words_top_ted_tsne = tsne.fit_transform(vectors_list)

In [None]:
p = figure(tools="pan,wheel_zoom,reset,save",
           toolbar_location="above",
           title="vector T-SNE for most polarized words")

source = ColumnDataSource(data=dict(x1=words_top_ted_tsne[:,0],
                                    x2=words_top_ted_tsne[:,1],
                                    names=words_to_visualize,
                                    color=colors_list))

p.scatter(x="x1", y="x2", size=8, source=source, fill_color="color")

word_labels = LabelSet(x="x1", y="x2", text="names", y_offset=6,
                  text_font_size="8pt", text_color="#555555",
                  source=source, text_align='center')
p.add_layout(word_labels)

show(p)

# green indicates positive words, black indicates negative words