# Sentiment Analysis with Deep Neural Networks

- Understand how to build/design a model using layers
- Train a model using a training loop
- Use a binary cross entropy loss function
- Compute the accuracy of your model
- Predict using your own input

Note the following,  
- Model Architecture
- Inputs
- Outputs

In [1]:
import os
import random as rnd
import trax

# set random seeds to make this notebook easier to replicate
trax.supervised.trainer_lib.init_random_number_generators(31)

import trax.fastmath.numpy as np
from trax import layers as tl

from utils import Layer, load_tweets, process_tweet

INFO:tensorflow:tokens_length=568 inputs_length=512 targets_length=114 noise_density=0.15 mean_noise_span_length=3.0 


[nltk_data] Downloading package twitter_samples to
[nltk_data]     /Users/shankar/nltk_data...
[nltk_data]   Package twitter_samples is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/shankar/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [2]:
# Create an array using trax.fastmath.numpy
a = np.array(5.0)
display(a)
print(type(a))

DeviceArray(5., dtype=float32)

<class 'jax.interpreters.xla.DeviceArray'>


In [3]:
def f(x):
    return (x**2)

In [4]:
# Call the function
print(f"f(a) for a={a} is {f(a)}")

f(a) for a=5.0 is 25.0


### Gradient (Derivative)
Gradient of a function is calculated using `trax.fastmath.grad(fun=)` and passing the name of the function

In [5]:
# Directly use trax.fastmath.grad to calculate the gradient (derivative) of the function
grad_f = trax.fastmath.grad(fun=f)
type(grad_f)

function

In [6]:
# Call the newly created func and pass in a value for x
grad_calculation = grad_f(a)
display(grad_calculation)

DeviceArray(10., dtype=float32)

### Importing Data

In [7]:
import numpy as np

# Load positive and negative tweets
all_positive_tweets, all_negative_tweets = load_tweets()

# View the total number of positive and negative tweets.
print(f"The number of positive tweets: {len(all_positive_tweets)}")
print(f"The number of negative tweets: {len(all_negative_tweets)}")

# Split positive set into validation and training
val_pos   = all_positive_tweets[4000:] # generating validation set for positive tweets
train_pos  = all_positive_tweets[:4000]# generating training set for positive tweets

# Split negative set into validation and training
val_neg   = all_negative_tweets[4000:] # generating validation set for negative tweets
train_neg  = all_negative_tweets[:4000] # generating training set for nagative tweets

# Combine training data into one set
train_x = train_pos + train_neg 

# Combine validation data into one set
val_x  = val_pos + val_neg

# Set the labels for the training set (1 for positive, 0 for negative)
train_y = np.append(np.ones(len(train_pos)), np.zeros(len(train_neg)))

# Set the labels for the validation set (1 for positive, 0 for negative)
val_y  = np.append(np.ones(len(val_pos)), np.zeros(len(val_neg)))

print(f"length of train_x {len(train_x)}")
print(f"length of val_x {len(val_x)}")

The number of positive tweets: 5000
The number of negative tweets: 5000
length of train_x 8000
length of val_x 2000


In [8]:
# Import a function that processes the tweets
# from utils import process_tweet

# Try out function that processes tweets
print("original tweet at training position 0")
print(train_pos[0])

print("Tweet at training position 0 after processing:")
process_tweet(train_pos[0])

original tweet at training position 0
#FollowFriday @France_Inte @PKuchly57 @Milipol_Paris for being top engaged members in my community this week :)
Tweet at training position 0 after processing:


['followfriday', 'top', 'engag', 'member', 'commun', 'week', ':)']

In [9]:
# Build the vocabulary
# Unit Test Note - There is no test set here only train/val

# Include special tokens 
# started with pad, end of line and unk tokens
Vocab = {'__PAD__': 0, '__</e>__': 1, '__UNK__': 2} 

# Note that we build vocab using training data
for tweet in train_x: 
    processed_tweet = process_tweet(tweet)
    for word in processed_tweet:
        if word not in Vocab: 
            Vocab[word] = len(Vocab)
    
print("Total words in vocab are",len(Vocab))
#display(Vocab)

Total words in vocab are 9092


### Converting a Tweet to a Tensor
Input a tweet:
```CPP
'@happypuppy, is Maria happy?'
```

The tweet_to_tensor will first conver the tweet into a list of tokens (including only relevant words)
```CPP
['maria', 'happi']
```

Then it will convert each word into its unique integer

```CPP
[2, 56]
```
- Notice that the word "maria" is not in the vocabulary, so it is assigned the unique integer associated with the `__UNK__` token, because it is considered "unknown."


In [10]:
def tweet_to_tensor(tweet, vocab_dict, unk_token='__UNK__', verbose=False):
    
    # Process the tweet into a list of words
    # where only important words are kept
    word_l = process_tweet(tweet)
    if(verbose):
        print("List of words from the processed tweet:")
        print(word_l)
        
    # Initialize the list that will contain the unique integers IDs of each word
    tensor_l = []
    
    # Get the unique integer ID of the __UNK__ token
    unk_ID = vocab_dict[unk_token]
    
    if verbose:
        print(f"The unique integer ID for the unk_token is {unk_ID}")    
        
    for word in word_l:
        # Get the unique integer ID
        # If the word doesnt exist in the vocab dictionary
        # use the unique ID for __UNK__ instead
        word_ID = vocab_dict[word] if word in vocab_dict else unk_ID
        
        tensor_l.append(word_ID)
        
    return tensor_l

In [11]:
print("Actual tweet is\n", val_pos[0])
print("\nTensor of tweet:\n", tweet_to_tensor(val_pos[0], vocab_dict=Vocab))

Actual tweet is
 Bro:U wan cut hair anot,ur hair long Liao bo
Me:since ord liao,take it easy lor treat as save $ leave it longer :)
Bro:LOL Sibei xialan

Tensor of tweet:
 [1065, 136, 479, 2351, 745, 8146, 1123, 745, 53, 2, 2672, 791, 2, 2, 349, 601, 2, 3489, 1017, 597, 4559, 9, 1065, 157, 2, 2]


In [12]:
# test tweet_to_tensor

def test_tweet_to_tensor():
    test_cases = [
        
        {
            "name":"simple_test_check",
            "input": [val_pos[1], Vocab],
            "expected":[444, 2, 304, 567, 56, 9],
            "error":"The function gives bad output for val_pos[1]. Test failed"
        },
        {
            "name":"datatype_check",
            "input":[val_pos[1], Vocab],
            "expected":type([]),
            "error":"Datatype mismatch. Need only list not np.array"
        },
        {
            "name":"without_unk_check",
            "input":[val_pos[1], Vocab],
            "expected":6,
            "error":"Unk word check not done- Please check if you included mapping for unknown word"
        }
    ]
    count = 0
    for test_case in test_cases:
        
        try:
            if test_case['name'] == "simple_test_check":
                assert test_case["expected"] == tweet_to_tensor(*test_case['input'])
                count += 1
            if test_case['name'] == "datatype_check":
                assert isinstance(tweet_to_tensor(*test_case['input']), test_case["expected"])
                count += 1
            if test_case['name'] == "without_unk_check":
                assert None not in tweet_to_tensor(*test_case['input'])
                count += 1
                
            
            
        except:
            print(test_case['error'])
    if count == 3:
        print("\033[92m All tests passed")
    else:
        print(count," Tests passed out of 3")
test_tweet_to_tensor()            

[92m All tests passed


### Create a Batch Generator
- Training single models will take a lot of time, so batches are preferred
- A generator is built that takes in +ve/-ve tweets and returns a batch of training examples
- It returns model inputs, the targets and the weights
- ie Input vector, Labels(+ve, -ve) and weight of each target

Single batch can be obtained using next operator  
- The generator returns the data in a format that you could directly use in your model

In [13]:
def data_generator(data_pos, data_neg, batch_size, loop, vocab_dict, shuffle=False):
    """
    Input: 
        data_pos - Set of positive examples
        data_neg - Set of negative examples
        batch_size - number of samples per batch, even
        loop - True/False
        vocab_dict
        shuffle
        
    Yield:
        inputs - Subset of +ve and -ve examples
        targets - The corresponding labels for the subset
        example_weights - An array specifying the importance of each example
    """
    # Make sure the batch size is an even umber
    # to allow an equal number of positive and negative sampels
    assert batch_size % 2 == 0
    
    # Number of +ve examples in each batch is half of the batch size
    # same with number of negative examples in each batch
    n_to_take = batch_size // 2
    
    # Use pos_index to walk trhought the data_pos array
    # same with neg_index and data_neg
    pos_index = 0
    neg_index = 0
    
    len_data_pos = len(data_pos)
    len_data_neg = len(data_neg)
    
    # Get an array with data indexes
    pos_index_lines = list(range(len_data_pos))
    neg_index_lines = list(range(len_data_neg))
    
    # Shuffle lines if shuffle is set to True
    if(shuffle):
        rnd.shuffle(pos_index_lines)
        rnd.shuffle(neg_index_lines)
        
    stop = False
    
    # Loop indefinitely
    while not stop:
        # Create a batch with +ve and -ve examples
        batch = []
        #Pack n_to_take positive examples
        # Start from pos_index and increment i upto n_to_take
        
        for i in range(n_to_take):
            # If the positive index goes past the positive dataset lenght,
            if pos_index >= len_data_pos: 
            # If the positive index goes past the positive dataset length
                if not loop:
                    stop = True
                    break
                    # If user wants to keep reusing the data, reset the index
                pos_index = 0
                if(shuffle):
                    # Shuffle the index of the positive sample
                    rnd.shuffle(pos_index_lines)
            
            # Get the tweet as pos_index
            tweet = data_pos[pos_index_lines[pos_index]]
            # Convert the tweet into tensors of integers representing the processed words
            tensor = tweet_to_tensor(tweet, vocab_dict)
            # Append the tensor to the batch list
            batch.append(tensor)
            # Increment pos_index by one
            pos_index = pos_index + 1
            
        for i in range(n_to_take):
            if(neg_index >= len_data_neg):
                if not loop:
                    stop = True
                    break
                neg_index = 0
                if(shuffle):
                    rnd.shuffle(neg_index_lines)
                    
            tweet = data_neg[neg_index_lines[neg_index]]
            tensor = tweet_to_tensor(tweet, vocab_dict)
            batch.append(tensor)
            neg_index = neg_index + 1
            
        if(stop):
            break
            
        # Update the start index for +ve/-ve data
        # So that its n_to_take positions after the current pos_index
        pos_index += n_to_take
        neg_index += n_to_take
        
        # Get the max tweet length (the length of the longest tweet)
        # (you will pad all shorter tweets to have this length)
        max_len = max([len(t) for t in batch])
        
        # Initialize the input_l, which will
        # store the padded versions of the tensors
        tensor_pad_l = []
        for tensor in batch:
            # Get the number of positions to pad for this tensor
            # so taht it will be max_len long
            n_pad = max_len - len(tensor)
            pad_l = [0] * n_pad
            # concatenate the tensor
            tensor_pad = tensor + pad_l
            tensor_pad_l.append(tensor_pad)
            
        # Convert the list of padded tensors to a numpy array
        # and store this as the model inputs
        inputs = np.array(tensor_pad_l)
        
        # Generate the list of targets for the +Ve samples - a list of ones
        # The length is the number of positive examples in the batch
        target_pos = [1] * n_to_take
        target_neg = [0] * n_to_take
        
        target_l = target_pos + target_neg
        targets = np.array(target_l)
        
        # Treat all examples equally important. IT 
        example_weigths = np.ones_like(targets)
                
        yield inputs, targets, example_weigths

In [14]:
# Set the random number generator for the shuffle procedure
rnd.seed(30) 

# Create the training data generator
def train_generator(batch_size, shuffle = False):
    return data_generator(train_pos, train_neg, batch_size, True, Vocab, shuffle)

# Create the validation data generator
def val_generator(batch_size, shuffle = False):
    return data_generator(val_pos, val_neg, batch_size, True, Vocab, shuffle)

# Create the validation data generator
def test_generator(batch_size, shuffle = False):
    return data_generator(val_pos, val_neg, batch_size, False, Vocab, shuffle)

# Get a batch from the train_generator and inspect.
inputs, targets, example_weights = next(train_generator(4, shuffle=True))

# this will print a list of 4 tensors padded with zeros
print(f'Inputs: {inputs}')
print(f'Targets: {targets}')
print(f'Example Weights: {example_weights}')

Inputs: [[2005 4451 3201    9    0    0    0    0    0    0    0]
 [4954  567 2000 1454 5174 3499  141 3499  130  459    9]
 [3761  109  136  583 2930 3969    0    0    0    0    0]
 [ 250 3761    0    0    0    0    0    0    0    0    0]]
Targets: [1 1 0 0]
Example Weights: [1 1 1 1]


In [15]:
# Test the train_generator

# Create a data generator for training data,
# which produces batches of size 4 (for tensors and their respective targets)
tmp_data_gen = train_generator(batch_size = 4)

# Call the data generator to get one batch and its targets
tmp_inputs, tmp_targets, tmp_example_weights = next(tmp_data_gen)

print(f"The inputs shape is {tmp_inputs.shape}")
print(f"The targets shape is {tmp_targets.shape}")
print(f"The example weights shape is {tmp_example_weights.shape}")

for i,t in enumerate(tmp_inputs):
    print(f"input tensor: {t}; target {tmp_targets[i]}; example weights {tmp_example_weights[i]}")

The inputs shape is (4, 14)
The targets shape is (4,)
The example weights shape is (4,)
input tensor: [3 4 5 6 7 8 9 0 0 0 0 0 0 0]; target 1; example weights 1
input tensor: [10 11 12 13 14 15 16 17 18 19 20  9 21 22]; target 1; example weights 1
input tensor: [5738 2901 3761    0    0    0    0    0    0    0    0    0    0    0]; target 0; example weights 1
input tensor: [ 858  256 3652 5739  307 4458  567 1230 2767  328 1202 3761    0    0]; target 0; example weights 1


In [16]:
class Relu(Layer):
    def forward(self, x):
        activation = np.maximum(0, x)
        return activation

In [17]:
# Test your relu function
x = np.array([[-2.0, -1.0, 0.0], [0.0, 1.0, 2.0]], dtype=float)
relu_layer = Relu()
print("Test data is:")
print(x)
print("Output of Relu is:")
print(relu_layer(x))

Test data is:
[[-2. -1.  0.]
 [ 0.  1.  2.]]
Output of Relu is:
[[0. 0. 0.]
 [0. 1. 2.]]


## Dense Class
#### Forward Function
$$forward\left(x, W\right)=xW$$

In [18]:
from trax import fastmath
np = fastmath.numpy
# use the fastmath.random module from trax
random = fastmath.random

In [19]:
# See how the fastmath.trax.random.normal function works
tmp_key = random.get_prng(seed=1)
print("The random seed generated by random.get_prng")
display(tmp_key)

print("choose a matrix with 2 rows and 3 columns")
tmp_shape=(2,3)
display(tmp_shape)

# Generate a weight matrix
# Note that you'll get an error if you try to set dtype to tf.float32, where tf is tensorflow
# Just avoid setting the dtype and allow it to use the default data type
tmp_weight = trax.fastmath.random.normal(key=tmp_key, shape=tmp_shape)

print("Weight matrix generated with a normal distribution with mean 0 and stdev of 1")
display(tmp_weight)

The random seed generated by random.get_prng


DeviceArray([0, 1], dtype=uint32)

choose a matrix with 2 rows and 3 columns


(2, 3)

Weight matrix generated with a normal distribution with mean 0 and stdev of 1


DeviceArray([[ 0.957307  , -0.9699291 ,  1.0070664 ],
             [ 0.36619022,  0.17294823,  0.29092228]], dtype=float32)

In [20]:
help(random.normal)

Help on method normal in module trax.fastmath.ops:

normal(*args, **kwargs) method of trax.fastmath.ops.RandomBackend instance



In [21]:
class Dense(Layer):
    def __init__(self, n_units, init_stdev=0.1):
        self._n_units = n_units
        self._init_stdev = init_stdev
        
    def forward(self, x):
        dense = np.dot(x, self.weights)
        return dense
        
    def init_weights_and_state(self, input_signature, random_key):
        input_shape = input_signature.shape
        w = random.normal(key=random_key, shape=(input_shape[-1], self._n_units))
        self.weights = w * self._init_stdev
        return self.weights

In [22]:
# Testing your Dense layer 
#sets  number of units in dense layer
dense_layer = Dense(n_units=10)  
# sets random seed
random_key = random.get_prng(seed=0)  
# input array 
z = np.array([[2.0, 7.0, 25.0]]) 

dense_layer.init(z, random_key)
#Returns randomly generated weights
print("Weights are\n ",dense_layer.weights) 
# Returns multiplied values of units and weights
print("Foward function output is ", dense_layer(z)) 

Weights are
  [[-0.02837108  0.09368162 -0.10050076  0.14165013  0.10543301  0.09108126
  -0.04265672  0.0986188  -0.05575325  0.00153249]
 [-0.20785688  0.0554837   0.09142365  0.05744595  0.07227863  0.01210617
  -0.03237354  0.16234995  0.02450038 -0.13809784]
 [-0.06111237  0.01403724  0.08410042 -0.1094358  -0.10775021 -0.11396459
  -0.05933381 -0.01557652 -0.03832145 -0.11144515]]
Foward function output is  [[-3.0395496   0.9266802   2.5414743  -2.050473   -1.9769388  -2.582209
  -1.7952735   0.94427425 -0.8980402  -3.7497487 ]]


In [23]:
# Pretend the embedding matrix uses 
# 2 elements for embedding the meaning of a word
# and has a vocabulary size of 3
# So it has shape (2,3)
tmp_embed = np.array([[1,2,3,],
                    [4,5,6]
                   ])

# take the mean along axis 0
print("The mean along axis 0 creates a vector whose length equals the vocabulary size")
display(np.mean(tmp_embed,axis=0))

print("The mean along axis 1 creates a vector whose length equals the number of elements in a word embedding")
display(np.mean(tmp_embed,axis=1))

The mean along axis 0 creates a vector whose length equals the vocabulary size


DeviceArray([2.5, 3.5, 4.5], dtype=float32)

The mean along axis 1 creates a vector whose length equals the number of elements in a word embedding


DeviceArray([2., 5.], dtype=float32)

In [24]:
def classifier(vocab_size=len(Vocab), embedding_dim=256, output_dim=2, mode='train'):
    embed_layer = tl.Embedding(
        vocab_size=vocab_size,
        d_feature=embedding_dim
    )
    mean_layer = tl.Mean(axis=1)
    dense_output_layer = tl.Dense(n_units=2)
    log_softmax_layer = tl.LogSoftmax()
    
    model = tl.Serial(
        embed_layer,
        mean_layer,
        dense_output_layer,
        log_softmax_layer
    )
    return model

In [25]:
tmp_model = classifier()

In [26]:
print(type(tmp_model))
display(tmp_model)

<class 'trax.layers.combinators.Serial'>


Serial[
  Embedding_9092_256
  Mean
  Dense_2
  LogSoftmax
]

## Training
To train a model on a task, Trax defines an abstraction [`trax.supervised.training.TrainTask`](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.TrainTask) which packages the train data, loss and optimizer (among other things) together into an object.

Similarly to evaluate a model, Trax defines an abstraction [`trax.supervised.training.EvalTask`](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.EvalTask) which packages the eval data and metrics (among other things) into another object.

The final piece tying things together is the [`trax.supervised.training.Loop`](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.Loop) abstraction that is a very simple and flexible way to put everything together and train the model, all the while evaluating it and saving checkpoints.
Using `Loop` will save you a lot of code compared to always writing the training loop by hand, like you did in courses 1 and 2. More importantly, you are less likely to have a bug in that code that would ruin your training.

In [27]:
from trax.supervised import training

batch_size = 16
rnd.seed(271)

train_task = training.TrainTask(
    labeled_data=train_generator(batch_size=batch_size, shuffle=True),
    loss_layer=tl.CrossEntropyLoss(),
    optimizer=trax.optimizers.Adam(0.01),
    n_steps_per_checkpoint=10
)
eval_task = training.EvalTask(
    labeled_data=val_generator(batch_size=batch_size, shuffle=True),
    metrics=[tl.CrossEntropyLoss(), tl.Accuracy()]
)
model = classifier()

In [28]:
output_dir = './model'
output_dir_expand = os.path.expanduser(output_dir)
print(output_dir_expand)

./model


In [29]:
def train_model(classifier, train_task, eval_task, n_steps, output_dir):
    training_loop = training.Loop(
        classifier,
        train_task,
        eval_task=eval_task,
        output_dir=output_dir
    )
    training_loop.run(n_steps=n_steps)
    
    return training_loop

In [30]:
training_loop = train_model(model, train_task, eval_task, 100, output_dir_expand)

Step      1: train CrossEntropyLoss |  0.60789275
Step      1: eval  CrossEntropyLoss |  0.76009548
Step      1: eval          Accuracy |  0.56250000
Step     10: train CrossEntropyLoss |  0.64948821
Step     10: eval  CrossEntropyLoss |  0.45854422
Step     10: eval          Accuracy |  0.87500000
Step     20: train CrossEntropyLoss |  0.40723333
Step     20: eval  CrossEntropyLoss |  0.33322647
Step     20: eval          Accuracy |  0.93750000
Step     30: train CrossEntropyLoss |  0.25177222
Step     30: eval  CrossEntropyLoss |  0.18621588
Step     30: eval          Accuracy |  1.00000000
Step     40: train CrossEntropyLoss |  0.24520011
Step     40: eval  CrossEntropyLoss |  0.44792810
Step     40: eval          Accuracy |  0.62500000
Step     50: train CrossEntropyLoss |  0.17519739
Step     50: eval  CrossEntropyLoss |  0.15775831
Step     50: eval          Accuracy |  0.93750000
Step     60: train CrossEntropyLoss |  0.11804956
Step     60: eval  CrossEntropyLoss |  0.11095993


### Practice Making a Prediction

In [31]:
tmp_train_generator = train_generator(16)
tmp_batch = next(tmp_train_generator)

# Position 0 has the model inputs - tweets as tensors
# Position 1 has the targets - the actual labels
tmp_inputs, tmp_targets, tmp_example_weights = tmp_batch

print(f"The batch is a tuple of length {len(tmp_batch)} because position 0 contains the tweets, and position 1 contains the targets.") 
print(f"The shape of the tweet tensors is {tmp_inputs.shape} (num of examples, length of tweet tensors)")
print(f"The shape of the labels is {tmp_targets.shape}, which is the batch size.")
print(f"The shape of the example_weights is {tmp_example_weights.shape}, which is the same as inputs/targets size.")

The batch is a tuple of length 3 because position 0 contains the tweets, and position 1 contains the targets.
The shape of the tweet tensors is (16, 15) (num of examples, length of tweet tensors)
The shape of the labels is (16,), which is the batch size.
The shape of the example_weights is (16,), which is the same as inputs/targets size.


In [32]:
# feed the tweet tensors into the model to get a prediction
tmp_pred = training_loop.eval_model(tmp_inputs)
print(f"The prediction shape is {tmp_pred.shape}, num of tensor_tweets as rows")
print("Column 0 is the probability of a negative sentiment (class 0)")
print("Column 1 is the probability of a positive sentiment (class 1)")
print()
print("View the prediction array")
tmp_pred

The prediction shape is (16, 2), num of tensor_tweets as rows
Column 0 is the probability of a negative sentiment (class 0)
Column 1 is the probability of a positive sentiment (class 1)

View the prediction array


DeviceArray([[-6.5867472e+00, -1.3794899e-03],
             [-4.9628477e+00, -7.0173740e-03],
             [-5.6064124e+00, -3.6809444e-03],
             [-5.3729963e+00, -4.6510696e-03],
             [-4.6580648e+00, -9.5300674e-03],
             [-5.2760839e+00, -5.1255226e-03],
             [-4.9576702e+00, -7.0540905e-03],
             [-3.9602642e+00, -1.9242048e-02],
             [-6.3717365e-03, -5.0590839e+00],
             [-3.7033916e-02, -3.3143821e+00],
             [-1.1131763e-02, -4.5035138e+00],
             [-4.4822693e-05, -1.0009674e+01],
             [-7.7550411e-03, -4.8632913e+00],
             [-4.0259361e-03, -5.5169826e+00],
             [-1.7308950e-02, -4.0651746e+00],
             [-9.0484619e-03, -4.7096877e+00]], dtype=float32)

In [33]:
# Turn probabilities into category predictions
tmp_is_positive = tmp_pred[:, 1] > tmp_pred[:, 0]
for i, p in enumerate(tmp_is_positive):
    print(f"Neg log prob {tmp_pred[i, 0]:.4f}\tPos log prob {tmp_pred[i, 1]:.4f}\t is positive?{p}\t actual {tmp_targets[i]}")

Neg log prob -6.5867	Pos log prob -0.0014	 is positive?True	 actual 1
Neg log prob -4.9628	Pos log prob -0.0070	 is positive?True	 actual 1
Neg log prob -5.6064	Pos log prob -0.0037	 is positive?True	 actual 1
Neg log prob -5.3730	Pos log prob -0.0047	 is positive?True	 actual 1
Neg log prob -4.6581	Pos log prob -0.0095	 is positive?True	 actual 1
Neg log prob -5.2761	Pos log prob -0.0051	 is positive?True	 actual 1
Neg log prob -4.9577	Pos log prob -0.0071	 is positive?True	 actual 1
Neg log prob -3.9603	Pos log prob -0.0192	 is positive?True	 actual 1
Neg log prob -0.0064	Pos log prob -5.0591	 is positive?False	 actual 0
Neg log prob -0.0370	Pos log prob -3.3144	 is positive?False	 actual 0
Neg log prob -0.0111	Pos log prob -4.5035	 is positive?False	 actual 0
Neg log prob -0.0000	Pos log prob -10.0097	 is positive?False	 actual 0
Neg log prob -0.0078	Pos log prob -4.8633	 is positive?False	 actual 0
Neg log prob -0.0040	Pos log prob -5.5170	 is positive?False	 actual 0
Neg log prob 

In [34]:
# View the array of booleans
print("Array of Booleans")
display(tmp_is_positive)

# Convert boolean to type int32
# True is converted to 1
# False is converted to 0
tmp_is_positive_int = tmp_is_positive.astype(np.int32)

# View the array of integers
print("Array of Integers")
display(tmp_is_positive_int)

# Convert boolean to type float32
tmp_is_positive_float = tmp_is_positive.astype(np.float32)

# View the array of floats
print("Array of floats")
display(tmp_is_positive_float)

Array of Booleans


DeviceArray([ True,  True,  True,  True,  True,  True,  True,  True,
             False, False, False, False, False, False, False, False],            dtype=bool)

Array of Integers


DeviceArray([1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)

Array of floats


DeviceArray([1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0.,
             0.], dtype=float32)

In [35]:
tmp_pred.shape

(16, 2)

In [36]:
print(f"True == 1: {True == 1}")
print(f"True == 2: {True == 2}")
print(f"False == 0: {False == 0}")
print(f"False == 2: {False == 2}")

True == 1: True
True == 2: False
False == 0: True
False == 2: False


## Evaluation
### Computing the Accuracy on a Batch

In [54]:
def compute_accuracy(preds, y, y_weights):
    # Create an array of booleans
    # True if the probability of positive sentiment is greater than
    # teh probability of negative sentiment
    # else False
    is_pos = preds[:, 1] > preds[:, 0]
    is_pos_int = is_pos.astype(np.int32)
    correct = is_pos_int != y
    sum_weights = np.sum(y_weights)
    correct_float = correct.astype(np.float32)
    
    # Multiply each predictions with its corresponding weight
    weigthed_correct_float = np.multiply(y_weights, correct_float)
    
    # Sum up the weighted correct predictions to go in the denominator
    weighted_num_correct = np.sum(weigthed_correct_float)
    
    # Divide the number of weighted correct predictions by the sum of the weights
    accuracy = weighted_num_correct / np.sum(y_weights)
    
    return accuracy, weighted_num_correct, sum_weights

In [55]:
# test your function
tmp_val_generator = val_generator(64)

# get one batch
tmp_batch = next(tmp_val_generator)

# Position 0 has the model inputs (tweets as tensors)
# position 1 has the targets (the actual labels)
tmp_inputs, tmp_targets, tmp_example_weights = tmp_batch

# feed the tweet tensors into the model to get a prediction
tmp_pred = training_loop.eval_model(tmp_inputs)

tmp_acc, tmp_num_correct, tmp_num_predictions = compute_accuracy(preds=tmp_pred, y=tmp_targets, y_weights=tmp_example_weights)

print(f"Model's prediction accuracy on a single training batch is: {100 * tmp_acc}%")
print(f"Weighted number of correct predictions {tmp_num_correct}; weighted number of total observations predicted {tmp_num_predictions}")

Model's prediction accuracy on a single training batch is: 1.5625%
Weighted number of correct predictions 1.0; weighted number of total observations predicted 64


In [56]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.multiply(a, b)

DeviceArray([ 4, 10, 18], dtype=int32)

In [57]:
tmp_example_weights

DeviceArray([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
             1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
             1, 1, 1, 1], dtype=int32)

In [58]:
def test_model(generator, model):
    accuracy = 0
    total_num_correct = 0
    total_num_pred = 0
    
    for batch in generator:
        # Retrieve the inputs from the batch
        inputs = batch[0]
        targets = batch[1]
        example_weight = batch[2]
        
        pred = model(inputs)
        
        batch_accuracy, batch_num_correct, batch_num_pred = compute_accuracy(pred, targets, example_weight)
        
        total_num_correct = batch_num_correct
        total_num_pred += batch_num_pred
        
    accuracy = total_num_correct / total_num_pred
    return accuracy

In [59]:
model = training_loop.eval_model
accuracy = test_model(test_generator(16), model)

print(f'The accuracy of your model on the validation set is {accuracy:.4f}', )

The accuracy of your model on the validation set is 0.0000


In [43]:
model.

SyntaxError: invalid syntax (<ipython-input-43-e5c9403477cf>, line 1)