<a href="https://colab.research.google.com/github/martin-fabbri/colab-notebooks/blob/master/deeplearning.ai/nlp/c3_w1_assigment_deep_neural_networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Assignment 1:  Sentiment with Deep Neural Networks

Welcome to the first assignment of course 3. In this assignment, you will explore sentiment analysis using deep neural networks.

## Outline
- [Part 1:  Import libraries and try out Trax](#1)
- [Part 2:  Importing the data](#2)
    - [2.1  Loading in the data](#2.1)
    - [2.2  Building the vocabulary](#2.2)
    - [2.3  Converting a tweet to a tensor](#2.3)
        - [Exercise 01](#ex01)
    - [2.4  Creating a batch generator](#2.4)
        - [Exercise 02](#ex02)
- [Part 3:  Defining classes](#3)
    - [3.1  ReLU class](#3.1)
        - [Exercise 03](#ex03)
    - [3.2  Dense class ](#3.2)
        - [Exercise 04](#ex04)
    - [3.3  Model](#3.3)
        - [Exercise 05](#ex05)
- [Part 4:  Training](#4)
    - [4.1  Training the model](#4.1)
        - [Exercise 06](#ex06)
    - [4.2  Practice Making a prediction](#4.2)
- [Part 5:  Evaluation  ](#5)
    - [5.1  Computing the accuracy on a batch](#5.1)
        - [Exercise 07](#ex07)
    - [5.2  Testing your model on Validation Data](#5.2)
        - [Exercise 08](#ex08)
- [Part 6:  Testing with your own input](#6)

In course 1, you implemented logistic regression and Naive Bayes for sentiment analysis. However, if you were to give your old models an example like:

<center> <span style='color:blue'> <b>This movie was almost good.</b> </span> </center>

Your model would have predicted a positive sentiment for that review. However, that sentence has a negative sentiment and indicates that the movie was not good. To solve those kinds of misclassifications, you will write a program that uses deep neural networks to identify sentiment in text. By completing this assignment, you will: 

- Understand how you can build/design a model using layers
- Train a model using a training loop
- Use a binary cross-entropy loss function
- Compute the accuracy of your model
- Predict using your own input

This model follows a structure similar to the one you implemented in the second course of this specialization. Indeed, most of the deep nets you will implement share that structure; only the model architecture, the inputs, and the outputs change. Before starting the assignment, we will introduce you to `trax`, the Google library that we use for building and training models.


Now we will show you how to compute the gradient of a function `f` by calling `trax.fastmath.grad(f)`.

- Trax source code can be found on GitHub: [Trax](https://github.com/google/trax)
- The Trax code also uses the JAX library: [JAX](https://jax.readthedocs.io/en/latest/index.html)

<a name="1"></a>
# Part 1:  Import libraries and try out Trax

- Let's import libraries and look at an example of using the Trax library.

In [19]:
%%capture
!pip install trax==1.3.1

In [20]:
import random as rnd

# import relevant libraries
import trax

# set random seeds to make this notebook easier to replicate
trax.supervised.trainer_lib.init_random_number_generators(31)

import os
import re

# Layer, load_tweets and process_tweet are defined in a later cell
# instead of being imported from the utils.py file
import string

import nltk

# import trax.fastmath.numpy
import trax.fastmath.numpy as np

# import trax.layers
from trax import layers as tl

nltk.download('twitter_samples')
nltk.download('stopwords')
from nltk.corpus import stopwords, twitter_samples
from nltk.tokenize import TweetTokenizer

!pip list | grep 'trax\|nltk\|jax'

[nltk_data] Downloading package twitter_samples to /root/nltk_data...
[nltk_data]   Package twitter_samples is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
jax                           0.2.7                
jaxlib                        0.1.57+cuda101       
nltk                          3.2.5                
trax                          1.3.1                


In [21]:
# Stop words are messy and not that compelling; 
# "very" and "not" are considered stop words, but they are obviously expressing sentiment

# The Porter stemmer reduces "was" to "wa".  Seriously???

# I'm not sure we want to get into stop words
stopwords_english = stopwords.words('english')

# Also have my doubts about stemming...
from nltk.stem import PorterStemmer
stemmer = PorterStemmer()

def process_tweet(tweet):
    '''
    Input: 
        tweet: a string containing a tweet
    Output:
        tweets_clean: a list of words containing the processed tweet
    
    '''
    # remove stock market tickers like $GE
    tweet = re.sub(r'\$\w*', '', tweet)
    # remove old style retweet text "RT"
    tweet = re.sub(r'^RT[\s]+', '', tweet)
    # remove hyperlinks
    tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)
    # remove hashtags
    # only removing the hash # sign from the word
    tweet = re.sub(r'#', '', tweet)
    # tokenize tweets
    tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True, reduce_len=True)
    tweet_tokens = tokenizer.tokenize(tweet)
    ### START CODE HERE ###
    tweets_clean = []
    for word in tweet_tokens:
        if (word not in stopwords_english and # remove stopwords
            word not in string.punctuation): # remove punctuation
            #tweets_clean.append(word)
            stem_word = stemmer.stem(word) # stemming word
            tweets_clean.append(stem_word)
    ### END CODE HERE ###
    return tweets_clean


# let's not reuse variables
#all_positive_tweets = twitter_samples.strings('positive_tweets.json')
#all_negative_tweets = twitter_samples.strings('negative_tweets.json')

def load_tweets():
    all_positive_tweets = twitter_samples.strings('positive_tweets.json')
    all_negative_tweets = twitter_samples.strings('negative_tweets.json')  
    return all_positive_tweets, all_negative_tweets
    
# Layers have weights and a forward function.
# They create weights when layer.init is called and use them.
# remove this or make it optional 

class Layer(object):
    """Base class for layers."""
    def __init__(self):
        self.weights = None

    def forward(self, x):
        raise NotImplementedError
  
    def init_weights_and_state(self, input_signature, random_key):
        pass

    def init(self, input_signature, random_key):
        self.init_weights_and_state(input_signature, random_key)
        return self.weights
    
    def __call__(self, x):
        return self.forward(x)
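
As a rough illustration of how the `Layer` base class above is meant to be used, the sketch below subclasses it. The `Scale` layer, its fixed weight of `2.0`, and the `None` placeholders passed to `init` are hypothetical and are not part of the assignment; the base class is repeated only so that the snippet runs on its own.

```python
class Layer(object):
    """Base class for layers (copied from the cell above)."""
    def __init__(self):
        self.weights = None

    def forward(self, x):
        raise NotImplementedError

    def init_weights_and_state(self, input_signature, random_key):
        pass

    def init(self, input_signature, random_key):
        self.init_weights_and_state(input_signature, random_key)
        return self.weights

    def __call__(self, x):
        return self.forward(x)


class Scale(Layer):
    """Hypothetical layer that multiplies its input by a single weight."""
    def init_weights_and_state(self, input_signature, random_key):
        # A real layer would use random_key to draw weights; here it is fixed.
        self.weights = 2.0

    def forward(self, x):
        return self.weights * x


layer = Scale()
weights = layer.init(input_signature=None, random_key=None)  # creates the weights
output = layer(3.0)  # __call__ dispatches to forward
print(weights, output)
```

The point of the pattern is that weights do not exist until `init` is called, and calling the layer object simply runs `forward`.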

In [22]:
# Create an array using trax.fastmath.numpy
a = np.array(5.0)

# View the returned array
display(a)

print(type(a))

DeviceArray(5., dtype=float32)

<class 'jax.interpreters.xla._DeviceArray'>


Notice that `trax.fastmath.numpy` returns a `DeviceArray` from the JAX library.

In [23]:
# Define a function that will use the trax.fastmath.numpy array
def f(x):
    
    # f = x^2
    return (x**2)

In [24]:
# Call the function
print(f"f(a) for a={a} is {f(a)}")

f(a) for a=5.0 is 25.0


The gradient (derivative) of function `f` with respect to its input `x` is the derivative of $x^2$.
- The derivative of $x^2$ is $2x$.  
- When x is 5, then $2x=10$.

You can calculate the gradient of a function by using `trax.fastmath.grad(fun=)` and passing in the name of the function.
- In this case the function you want to take the gradient of is `f`.
- The object returned (saved in `grad_f` in this example) is a function that can calculate the gradient of f for a given trax.fastmath.numpy array.

In [25]:
grad_f = trax.fastmath.grad(fun=f)
type(grad_f)

function

In [26]:
grad_calculation = grad_f(a)
display(grad_calculation)

DeviceArray(10., dtype=float32)

The function returned by `trax.fastmath.grad` takes in $x=5$ and evaluates the gradient of `f`, which is $2x$, giving 10. The value is also stored as a `DeviceArray` from the JAX library.
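
One way to sanity-check a gradient like this without Trax is a finite-difference approximation. The sketch below uses plain Python floats; the helper `central_difference` and the step size `h` are hypothetical names chosen for illustration and are not part of Trax or JAX.

```python
def f(x):
    # Same function as in the notebook: f(x) = x^2
    return x ** 2


def central_difference(fun, x, h=1e-5):
    # Approximates fun'(x) as (fun(x + h) - fun(x - h)) / (2h)
    return (fun(x + h) - fun(x - h)) / (2 * h)


approx_grad = central_difference(f, 5.0)
print(approx_grad)  # should be very close to 2 * 5 = 10
```

Automatic differentiation computes the derivative exactly (up to floating-point precision), whereas the finite-difference value is only an approximation, so the two should agree closely but not necessarily bit for bit.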

<a name="2"></a>
# Part 2:  Importing the data

<a name="2.1"></a>
## 2.1  Loading in the data

Import the data set.  
- You may recognize this from earlier assignments in the specialization.
- The `process_tweet` function is defined in the helper cell in Part 1 (it comes from the course's `utils.py` file).

In [27]:
## DO NOT EDIT THIS CELL

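# load_tweets is defined in the helper cell above (originally in the utils.py file)

# Note: this rebinds `np` from trax.fastmath.numpy to standard NumPy
import numpy as np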

# Load positive and negative tweets
all_positive_tweets, all_negative_tweets = load_tweets()

# View the total number of positive and negative tweets.
print(f"The number of positive tweets: {len(all_positive_tweets)}")
print(f"The number of negative tweets: {len(all_negative_tweets)}")

# Split positive set into validation and training
val_pos = all_positive_tweets[4000:]    # validation set for positive tweets
train_pos = all_positive_tweets[:4000]  # training set for positive tweets

# Split negative set into validation and training
val_neg = all_negative_tweets[4000:]    # validation set for negative tweets
train_neg = all_negative_tweets[:4000]  # training set for negative tweets

# Combine training data into one set
train_x = train_pos + train_neg 

# Combine validation data into one set
val_x  = val_pos + val_neg

# Set the labels for the training set (1 for positive, 0 for negative)
train_y = np.append(np.ones(len(train_pos)), np.zeros(len(train_neg)))

# Set the labels for the validation set (1 for positive, 0 for negative)
val_y  = np.append(np.ones(len(val_pos)), np.zeros(len(val_neg)))

print(f"length of train_x {len(train_x)}")
print(f"length of val_x {len(val_x)}")

The number of positive tweets: 5000
The number of negative tweets: 5000
length of train_x 8000
length of val_x 2000
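
The split and labeling logic above can be sketched on toy data. This version swaps plain Python lists in for the NumPy label arrays, and the item names and the split point of 4 are made up for illustration.

```python
# Toy stand-ins for the real tweet lists (hypothetical data, 5 items each)
positive = ["p0", "p1", "p2", "p3", "p4"]
negative = ["n0", "n1", "n2", "n3", "n4"]

split = 4  # the notebook uses 4000 out of 5000

train_pos, val_pos = positive[:split], positive[split:]
train_neg, val_neg = negative[:split], negative[split:]

# Positives come first, then negatives, so the labels follow the same order
train_x = train_pos + train_neg
val_x = val_pos + val_neg
train_y = [1.0] * len(train_pos) + [0.0] * len(train_neg)
val_y = [1.0] * len(val_pos) + [0.0] * len(val_neg)

print(len(train_x), len(val_x))
print(train_y)
```

Because the examples are concatenated class by class, the data is not shuffled at this point; each label lines up with its tweet only through the shared ordering.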


Now import a function that processes tweets (we've provided this in the utils.py file).
- `process_tweet` removes unwanted text such as hash signs, hyperlinks, and stock tickers from the tweet.
- It also returns a list of words (it tokenizes the original string).

In [28]:
# Import a function that processes the tweets
# from utils import process_tweet

# Try out function that processes tweets
print("original tweet at training position 0")
print(train_pos[0])

print("Tweet at training position 0 after processing:")
process_tweet(train_pos[0])

original tweet at training position 0
#FollowFriday @France_Inte @PKuchly57 @Milipol_Paris for being top engaged members in my community this week :)
Tweet at training position 0 after processing:


['followfriday', 'top', 'engag', 'member', 'commun', 'week', ':)']

In [29]:
# Try out function that processes tweets
print("original tweet at training position 1")
print(train_pos[1])

print("Tweet at training position 1 after processing:")
process_tweet(train_pos[1])

original tweet at training position 1
@Lamb2ja Hey James! How odd :/ Please call our Contact Centre on 02392441234 and we will be able to assist you :) Many thanks!
Tweet at training position 1 after processing:


['hey',
 'jame',
 'odd',
 ':/',
 'pleas',
 'call',
 'contact',
 'centr',
 '02392441234',
 'abl',
 'assist',
 ':)',
 'mani',
 'thank']

Notice that the function `process_tweet` keeps key words, removes the hash # symbol, and ignores usernames (words that begin with '@').  It also returns a list of the words.
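
To see what the regular-expression steps inside `process_tweet` do on their own, the sketch below applies them one at a time using only the standard `re` module. The sample tweet is made up for illustration, and the tokenization, handle stripping, stop-word removal, and stemming steps are left out because they need NLTK.

```python
import re

# Hypothetical tweet, made up for illustration
tweet = "RT $GE is up #winning https://example.com/news"

# Same patterns as in process_tweet, applied in the same order
no_ticker = re.sub(r'\$\w*', '', tweet)               # drop stock tickers like $GE
no_rt = re.sub(r'^RT[\s]+', '', no_ticker)            # drop a leading "RT"
no_link = re.sub(r'https?:\/\/.*[\r\n]*', '', no_rt)  # drop the hyperlink
no_hash = re.sub(r'#', '', no_link)                   # drop only the '#' sign

print(repr(no_hash))
```

Note that the hyperlink pattern ends in `.*`, so it removes everything from the start of the URL to the end of that line, not just the URL itself.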