# Sentiment Analysis: Deep Neural Networks

We will explore sentiment analysis on tweets using deep neural networks. Given a tweet, we will decide if it has a positive sentiment or a negative one.

Given an example like "This movie was almost good.", logistic regression and Naive Bayes models are likely to predict a positive sentiment for that review. However, the sentence has a negative sentiment: it indicates that the movie was not good. To address these kinds of misclassifications, you will write a program that uses deep neural networks to identify sentiment in text.


Let's first download the necessary datasets.
- ``twitter_samples``: Check out the documentation for the [``twitter_samples`` dataset](http://www.nltk.org/howto/twitter.html).
- ``stopwords``

Uncomment the next cell if you have not downloaded these datasets.


In [None]:
# import nltk
# nltk.download('twitter_samples')
# nltk.download('stopwords')

In [2]:
import re
import string
import random as rnd

from nltk.corpus import twitter_samples
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import TweetTokenizer

import numpy as np

import trax
from trax import layers as tl
from trax.supervised import training



In [3]:
# set random seeds to make the results easier to replicate
manual_seed = 123
rnd.seed(manual_seed)

## Prepare the data

The ``twitter_samples`` contains subsets of 5,000 positive tweets, 5,000 negative tweets, and the full set of 10,000 tweets.
- If we used all three datasets, we would introduce duplicates of the positive tweets and negative tweets.
- We will select just the five thousand positive tweets and five thousand negative tweets.


In [4]:
# select the set of positive and negative tweets
all_positive_tweets = twitter_samples.strings('positive_tweets.json')
all_negative_tweets = twitter_samples.strings('negative_tweets.json')

Train test split: 20% will be in the test set, and 80% in the training set.


In [5]:
# split the data into two pieces, one for training and one for testing (validation set)
train_pos = all_positive_tweets[:4000]
test_pos = all_positive_tweets[4000:]
train_neg = all_negative_tweets[:4000]
test_neg = all_negative_tweets[4000:]

train_x = train_pos + train_neg 
test_x = test_pos + test_neg

Create the numpy array of positive labels and negative labels.


In [6]:
# combine positive and negative labels (1 for positive, 0 for negative)
train_y = np.append(np.ones(len(train_pos)), np.zeros(len(train_neg)))
test_y = np.append(np.ones(len(test_pos)), np.zeros(len(test_neg)))

# Print the shapes of the train and test label arrays
print("train_y.shape = " + str(train_y.shape))
print("test_y.shape = " + str(test_y.shape))

train_y.shape = (8000,)
test_y.shape = (2000,)


## Preprocessing

Data preprocessing is one of the critical steps in any machine learning project. It includes cleaning and formatting the data before feeding it into a machine learning algorithm. For NLP, the preprocessing steps comprise the following tasks:
- Tokenizing the string
- Lowercasing
- Removing stop words and punctuation
- Stemming

Since we have a Twitter dataset, we'd like to remove some substrings commonly used on the platform like the hashtag, retweet marks, and hyperlinks.


In [7]:
def process_tweet(tweet):
    """Process tweet function.
    Input:
        tweet: a string containing a tweet
    Output:
        tweets_clean: a list of words containing the processed tweet
    """
    stemmer = PorterStemmer()
    stopwords_english = stopwords.words('english')
    # remove stock market tickers like $GE
    tweet = re.sub(r'\$\w*', '', tweet)
    # remove old style retweet text "RT"
    tweet = re.sub(r'^RT[\s]+', '', tweet)
    # remove hyperlinks
    tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)
    # remove hashtags
    # only removing the hash # sign from the word
    tweet = re.sub(r'#', '', tweet)
    # tokenize tweets
    tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True,
                               reduce_len=True)
    tweet_tokens = tokenizer.tokenize(tweet)

    tweets_clean = []
    for word in tweet_tokens:
        if (word not in stopwords_english and     # remove stopwords
                word not in string.punctuation):  # remove punctuation
            # tweets_clean.append(word)
            stem_word = stemmer.stem(word)        # stemming word
            tweets_clean.append(stem_word)

    return tweets_clean

In [8]:
# test the function below
print('\033[0mAn example of a positive tweet: \n\033[34m', train_x[0])
print('\033[0m\nAn example of the processed version of the tweet: \n\033[32m', process_tweet(train_x[0]))
print('\033[0m')

An example of a positive tweet: 
 #FollowFriday @France_Inte @PKuchly57 @Milipol_Paris for being top engaged members in my community this week :)

An example of the processed version of the tweet: 
 ['followfriday', 'top', 'engag', 'member', 'commun', 'week', ':)']


Notice that the function ``process_tweet`` keeps key words, removes the hash # symbol, and ignores usernames (words that begin with '@').  It also returns a list of the words.
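
The regular-expression cleanup inside ``process_tweet`` can be tried on its own, without NLTK. The snippet below is a minimal sketch: the helper name ``strip_twitter_markup`` and the sample tweet are made up for illustration, and it covers only the four ``re.sub`` steps. Handle removal, stop-word filtering, and stemming are left to the tokenizer and stemmer in the real function.

```python
import re

def strip_twitter_markup(tweet):
    """Apply only the regex cleanup steps from process_tweet (hypothetical helper)."""
    tweet = re.sub(r'\$\w*', '', tweet)                 # stock tickers like $GE
    tweet = re.sub(r'^RT[\s]+', '', tweet)              # old-style retweet marker
    tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)  # hyperlinks (to end of line)
    tweet = re.sub(r'#', '', tweet)                     # hash sign only, keep the word
    return tweet.strip()

# Made-up tweet used purely for illustration
sample = "RT @user: Loving the new #python release $GE https://example.com/x"
cleaned = strip_twitter_markup(sample)
print(cleaned)  # @user: Loving the new python release
```

Note that the ``@user`` handle survives this step; in ``process_tweet`` it is dropped later by ``TweetTokenizer(strip_handles=True)``.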


## Building the vocabulary

Now, let's build the vocabulary based on the training data. 
We will map each word in each tweet to an integer (an "index"). 
The vocabulary will also include some special tokens:
- `__PAD__`: padding
- `__</e>__`: end of line
- `__UNK__`: a token representing any word that is not in the vocabulary.

In [9]:
# Include special tokens
Vocab = {'__PAD__': 0, '__</e>__': 1, '__UNK__': 2} 

# Note that we build vocab using training data
for tweet in train_x: 
    processed_tweet = process_tweet(tweet)
    for word in processed_tweet:
        if word not in Vocab: 
            Vocab[word] = len(Vocab)
    
print("Total words in vocab are",len(Vocab))

Total words in vocab are 9088


The dictionary `Vocab` will look like this:
```python
{'__PAD__': 0,
 '__</e>__': 1,
 '__UNK__': 2,
 'followfriday': 3,
 'top': 4,
 'engag': 5,
 ...
```

- Each unique word has a unique integer associated with it.
- The total number of words in Vocab: 9088

## Converting a tweet to a tensor

Now, let's write a function that will convert each tweet to a tensor (a list of unique integer IDs representing the processed tweet). 
- Note, the returned data type will be a **regular Python ``list()``**, not a numpy array.
- For words in the tweet that are not in the vocabulary, we'll set them to the unique ID for the token ``__UNK__``.

#### Example
Input a tweet:
```python
'@happypuppy, is Maria happy?'
```

The ``tweet_to_tensor`` function will first convert the tweet into a list of tokens (including only relevant words):
```python
['maria', 'happi']
```

Then, it will convert each word into its unique integer
```python
[2, 56]
```
- Notice that the word "maria" is not in the vocabulary, so it is assigned the unique integer associated with the ``__UNK__`` token, because it is considered "unknown."
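
The lookup step can be sketched in isolation with a toy vocabulary. The words and IDs in ``toy_vocab`` and the helper name ``tokens_to_ids`` are hypothetical and do not come from the real ``Vocab``; the point is only to show the fallback to ``__UNK__`` through ``dict.get``.

```python
# Toy vocabulary; the words and IDs are illustrative only.
toy_vocab = {'__PAD__': 0, '__</e>__': 1, '__UNK__': 2, 'happi': 3, 'day': 4}

def tokens_to_ids(tokens, vocab, unk_token='__UNK__'):
    """Map each token to its ID, falling back to the ID of unk_token."""
    unk_id = vocab[unk_token]
    return [vocab.get(token, unk_id) for token in tokens]

ids = tokens_to_ids(['maria', 'happi'], toy_vocab)
print(ids)  # [2, 3]
```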


In [10]:
def tweet_to_tensor(tweet, vocab_dict, unk_token='__UNK__', verbose=False):
    """
    Input: 
        tweet:      A string containing a tweet
        vocab_dict: The words dictionary
        unk_token:  The special string for unknown tokens
        verbose:    Print info during runtime
    Output:
        tensor_l:   A python list with unique integer IDs representing 
                    the processed tweet
    """
    # Process the tweet into a list of words (stop words removed)
    word_l = process_tweet(tweet)
    if verbose:
        print("List of words from the processed tweet:")
        print(word_l)
        
    # Get the unique integer ID of the __UNK__ token
    unk_ID = vocab_dict[unk_token]
    if verbose:
        print(f"The unique integer ID for the unk_token is {unk_ID}")
        
    # Initialize the list that will contain the unique integer IDs of each word
    tensor_l = []
    
    # for each word in the list:
    for word in word_l:
        # Get the unique integer ID of the word, or 
        # the unique ID for __UNK__ if the word doesn't exist in the vocab dictionary.
        word_ID = vocab_dict.get(word, unk_ID)
        
        # Append the unique integer ID to the tensor list.
        tensor_l.append(word_ID) 
    
    return tensor_l

In [11]:
print("Actual tweet is:\n", test_pos[0])
print("\nTensor of tweet:\n", tweet_to_tensor(test_pos[0], vocab_dict=Vocab))

Actual tweet is:
 Bro:U wan cut hair anot,ur hair long Liao bo
Me:since ord liao,take it easy lor treat as save $ leave it longer :)
Bro:LOL Sibei xialan

Tensor of tweet:
 [1065, 136, 479, 2351, 745, 8148, 1123, 745, 53, 2, 2672, 791, 2, 2, 349, 601, 2, 3489, 1017, 597, 4559, 9, 1065, 157, 2, 2]


## Creating a batch generator

Most of the time in Natural Language Processing, and AI in general, we use batches when training our models. 
- If instead of training with batches of examples, you were to train a model with one example at a time, it would take a very long time to train the model. 
- We will now build a data generator that takes in the positive/negative tweets and returns a batch of training examples. It returns the model inputs, the targets (positive or negative labels) and the weight for each target (e.g., this allows us to treat some examples as more important to get right than others, but commonly the weights will all be 1.0). 

Once we have created the generator, we can include it in a for loop like this:
```python
for batch_inputs, batch_targets, batch_example_weights in data_generator:
    ...
```

We can also get a single batch like this:
```python
batch_inputs, batch_targets, batch_example_weights = next(data_generator)
```

The generator returns the next batch each time it's called. 
- This generator returns the data in a format (tensors) that we could directly use in our model.
- It returns a triple: the inputs, targets, and loss weights:
  - Inputs is a tensor that contains the batch of tweets we put into the model.
  - Targets is the corresponding batch of labels that the model is trained to predict.
  - Loss weights here are just 1s with same shape as targets.
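
Because tweets differ in length, each batch has to be padded to a common length before it can become a rectangular array. The snippet below is a minimal sketch of that padding step on made-up token IDs; ``pad_batch`` is a hypothetical helper name, and the padding ID of 0 is assumed to match ``__PAD__`` in the vocabulary.

```python
import numpy as np

def pad_batch(batch, pad_id=0):
    """Right-pad every sequence with pad_id up to the longest length in the batch."""
    max_len = max(len(seq) for seq in batch)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    return np.array(padded)

# Made-up token IDs for three tweets of different lengths
toy_batch = [[5, 8, 3], [7], [4, 9]]
padded = pad_batch(toy_batch)
print(padded.shape)  # (3, 3)
```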


In [12]:
def data_generator(data_pos, data_neg, batch_size, loop, vocab_dict, shuffle=False):
    """
    Input: 
        data_pos:   Set of positive examples
        data_neg:   Set of negative examples
        batch_size: number of samples per batch. Must be even
        loop:       whether to loop over data or not (True or False)
        vocab_dict: The words dictionary
        shuffle:    Shuffle the data order (True or False)
    Yield:
        inputs:          Subset of positive and negative examples
        targets:         The corresponding labels for the subset
        example_weights: An array specifying the importance of each example
    """
    # make sure the batch size is an even number
    # to allow an equal number of positive and negative samples
    assert batch_size % 2 == 0
    
    # Number of positive and negative examples in each batch
    n_to_take = batch_size // 2
    
    # pos_index and neg_index to walk through the data_pos and data_neg arrays
    pos_index = 0
    neg_index = 0
    
    len_data_pos = len(data_pos)
    len_data_neg = len(data_neg)
    
    # Get a list with the data indexes
    pos_index_lines = list(range(len_data_pos))
    neg_index_lines = list(range(len_data_neg))
    
    # shuffle lines if shuffle is set to True
    if shuffle:
        rnd.shuffle(pos_index_lines)
        rnd.shuffle(neg_index_lines)
        
    stop = False
    # Loop indefinitely
    while not stop:
        # create a batch with positive and negative examples
        batch = []
        
        # First part: Pack n_to_take positive examples
        for i in range(n_to_take):
            # If the positive index goes past the positive dataset length
            if pos_index >= len_data_pos: 
                # If loop is set to False, break once we reach the end of the dataset
                if not loop:
                    stop = True
                    break
                # If user wants to keep re-using the data, reset the index
                pos_index = 0
                if shuffle:
                    # Shuffle the index of the positive sample
                    rnd.shuffle(pos_index_lines)
            # get the tweet as pos_index
            tweet = data_pos[pos_index_lines[pos_index]]
            # convert the tweet into tensors of integers representing the processed words
            tensor = tweet_to_tensor(tweet, vocab_dict)
            # append the tensor to the batch list
            batch.append(tensor)
            # Increment pos_index by one
            pos_index = pos_index + 1

        # Second part: Pack n_to_take negative examples
        for i in range(n_to_take):
            # If the negative index goes past the negative dataset length
            if neg_index >= len_data_neg:
                # If loop is set to False, break once we reach the end of the dataset
                if not loop:
                    stop = True
                    break
                # If user wants to keep re-using the data, reset the index
                neg_index = 0
                if shuffle:
                    # Shuffle the index of the negative sample
                    rnd.shuffle(neg_index_lines)
            # get the tweet as neg_index
            tweet = data_neg[neg_index_lines[neg_index]]
            # convert the tweet into tensors of integers representing the processed words
            tensor = tweet_to_tensor(tweet, vocab_dict)
            # append the tensor to the batch list
            batch.append(tensor)
            # Increment neg_index by one
            neg_index = neg_index + 1

        if stop:
            break

        # # Update the start index for positive and negative data 
        # # so that it's n_to_take positions after the current pos_index and neg_index
        # pos_index += n_to_take
        # neg_index += n_to_take
        
        # Get the max tweet length (the length of the longest tweet) 
        # (we will pad all shorter tweets to have this length)
        max_len = max([len(t) for t in batch])
        
        # Initialize the tensor_pad_l, which will store the padded versions of the tensors
        tensor_pad_l = []
        # Pad shorter tweets with zeros
        for tensor in batch:
            # Get the number of positions to pad for this tensor
            n_pad = max_len - len(tensor)
            # Generate a list of zeros, with length n_pad
            pad_l = [0] * n_pad
            # Concatenate the tensor and the list of padded zeros
            tensor_pad = tensor + pad_l
            # Append the padded tensor to the list of padded tensors
            tensor_pad_l.append(tensor_pad)

        # The list of targets for the positive examples (a list of ones)
        target_pos = [1] * n_to_take
        # The list of targets for the negative examples (a list of zeros)
        target_neg = [0] * n_to_take
        # Concatenate the positive and negative targets
        target_l = target_pos + target_neg
        
        # Convert the list of padded tensors and the target list to numpy arrays
        inputs = np.array(tensor_pad_l)
        targets = np.array(target_l)
        # Example weights: Treat all examples equally important.
        example_weights = np.ones_like(targets)
        
        yield inputs, targets, example_weights

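Now, we can use our ``data_generator`` to create a data generator for the training data, and another data generator for the validation data. We will create a third data generator that does not loop, for testing the final accuracy of the model. Note that the validation and test generators both draw from the same held-out tweets (``test_pos`` and ``test_neg``).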
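
The difference between a looping and a non-looping generator can be seen on a much smaller example. The sketch below is a simplified stand-in, not the ``data_generator`` above: ``toy_generator`` is a hypothetical name, it works on a plain list, and it restarts only at batch boundaries.

```python
from itertools import islice

def toy_generator(data, batch_size, loop):
    """Yield consecutive batches; restart from the beginning only if loop is True.

    Assumes batch_size <= len(data).
    """
    index = 0
    while True:
        if index + batch_size > len(data):
            if not loop:
                return          # one pass over the data, then stop
            index = 0           # start over and keep yielding
        yield data[index:index + batch_size]
        index += batch_size

data = list(range(6))
one_pass = list(toy_generator(data, batch_size=2, loop=False))
looping = list(islice(toy_generator(data, batch_size=2, loop=True), 4))
print(one_pass)  # [[0, 1], [2, 3], [4, 5]]
print(looping)   # [[0, 1], [2, 3], [4, 5], [0, 1]]
```

A looping generator never raises ``StopIteration``, which suits training for a fixed number of steps; a single-pass generator suits a final evaluation where each example should be counted once.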

In [13]:
# Set the random number generator for the shuffle procedure
rnd.seed(manual_seed) 

# Create the training data generator
def train_generator(batch_size, shuffle=False):
    return data_generator(train_pos, train_neg, batch_size, True, Vocab, shuffle)

# Create the validation data generator
def val_generator(batch_size, shuffle=False):
    return data_generator(test_pos, test_neg, batch_size, True, Vocab, shuffle)

# Create the test data generator (single pass, does not loop)
def test_generator(batch_size, shuffle=False):
    return data_generator(test_pos, test_neg, batch_size, False, Vocab, shuffle)

In [14]:
# Get a batch from the train_generator and inspect.
inputs, targets, example_weights = next(train_generator(4, shuffle=True))

# This will print a list of 4 tensors padded with zeros
print(f'Inputs:\n {inputs}')
print(f'Targets:\n {targets}')
print(f'Example Weights:\n {example_weights}')

Inputs:
 [[3327   94   57 5660   75    0    0    0]
 [ 417 1397   22   95    9    0    0    0]
 [ 774  172 6558  377  621 1571  307 3761]
 [ 710  236  293 2193 2306 8093  486 3761]]
Targets:
 [1 1 0 0]
Example Weights:
 [1 1 1 1]


Now that we have our train/val generators, each call to ``next`` returns a batch: the padded tensors that represent our tweets, their corresponding labels, and the example weights. Now we can go ahead and start building our neural network.


## Model

Now we will implement a classifier using neural networks. Here is the model architecture we will be implementing. 

<img src = "images/nn.jpg" style="width:400px;height:250px;"/>

For the model implementation, we will use the Trax layers library `tl`. 
Note that the second character of `tl` is the lowercase of letter `L`, not the number 1.


In [15]:
def classifier(vocab_size=len(Vocab), embedding_dim=256, output_dim=2, mode='train'):
    """
    Input:
        vocab_size:    size of the vocabulary
        embedding_dim: number of features in embedding layer
        output_dim:    number of output classes of the classifier
        mode:          mode of classifier
    Output:
        model: the model of type trax.layers.combinators.Serial
    """
    # Embedding layer
    embed_layer = tl.Embedding(
        vocab_size=vocab_size,    # Size of the vocabulary
        d_feature=embedding_dim)  # Embedding dimension
    # Mean layer, to create an "average" over word embedding
    mean_layer = tl.Mean(axis=1)
    # Dense layer, one unit for each output
    dense_output_layer = tl.Dense(n_units=output_dim)
    # Log softmax layer (no parameters needed)
    log_softmax_layer = tl.LogSoftmax()
    
    # Use tl.Serial to combine all layers and create the classifier
    model = tl.Serial(
      embed_layer,        # embedding layer
      mean_layer,         # mean layer
      dense_output_layer, # dense output layer 
      log_softmax_layer   # log softmax layer
    )
    
    return model

In [16]:
model = classifier()
display(model)

Serial[
  Embedding_9088_256
  Mean
  Dense_2
  LogSoftmax
]
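
To make the data flow through these four layers concrete, here is a plain NumPy sketch of the same forward pass. It is not the Trax implementation: the sizes are toy values, the weights are random stand-ins for trained parameters, and the names ``embedding``, ``W``, ``b``, and ``forward`` are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_dim, output_dim = 10, 4, 2  # toy sizes, not the real ones

# Randomly initialised parameters stand in for trained weights
embedding = rng.normal(size=(vocab_size, embedding_dim))
W = rng.normal(size=(embedding_dim, output_dim))
b = np.zeros(output_dim)

def forward(token_ids):
    """Embedding lookup -> mean over the sequence axis -> dense -> log softmax."""
    embedded = embedding[token_ids]    # (batch, seq_len, embedding_dim)
    averaged = embedded.mean(axis=1)   # (batch, embedding_dim)
    logits = averaged @ W + b          # (batch, output_dim)
    # Numerically stable log softmax
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

toy_inputs = np.array([[3, 5, 0], [7, 2, 9]])  # two made-up padded tweets
log_probs = forward(toy_inputs)
print(log_probs.shape)  # (2, 2)
```

Exponentiating each row of ``log_probs`` gives two probabilities that sum to 1, one per sentiment class.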

## Training the model

To train a model on a task, Trax defines an abstraction [``trax.supervised.training.TrainTask``](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.TrainTask) which packages the train data, loss and optimizer (among other things) together into an object.

Similarly, to evaluate a model, Trax defines an abstraction [``trax.supervised.training.EvalTask``](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.EvalTask) which packages the eval data and metrics (among other things) into another object.

The final piece tying things together is the [``trax.supervised.training.Loop``](https://trax-ml.readthedocs.io/en/latest/trax.supervised.html#trax.supervised.training.Loop) abstraction that is a very simple and flexible way to put everything together and train the model, all the while evaluating it and saving checkpoints.
Using ``Loop`` will save us a lot of code compared to always writing the training loop by hand. More importantly, we are less likely to have a bug in that code that would ruin our training.


In [17]:
batch_size = 16
rnd.seed(manual_seed)

train_task = training.TrainTask(
    labeled_data=train_generator(batch_size=batch_size, shuffle=True),
    loss_layer=tl.CrossEntropyLoss(),
    optimizer=trax.optimizers.Adam(0.01),
    n_steps_per_checkpoint=10,
)

eval_task = training.EvalTask(
    labeled_data=val_generator(batch_size=batch_size, shuffle=True),
    metrics=[tl.CrossEntropyLoss(), tl.Accuracy()],
)



This defines a model trained using [``tl.CrossEntropyLoss``](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.metrics.CrossEntropyLoss) optimized with the [``trax.optimizers.Adam``](https://trax-ml.readthedocs.io/en/latest/trax.optimizers.html#trax.optimizers.adam.Adam) optimizer, all the while tracking the accuracy using [``tl.Accuracy``](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.metrics.Accuracy) metric. We also track ``tl.CrossEntropyLoss`` on the validation set.
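
As a rough illustration of what a cross-entropy loss measures when the model outputs log probabilities, the sketch below averages the negative log probability assigned to the true class. It is a generic NumPy version under that assumption, not Trax's own code, and ``cross_entropy_from_log_probs`` is a hypothetical helper name.

```python
import numpy as np

def cross_entropy_from_log_probs(log_probs, targets, weights=None):
    """Weighted mean of the negative log probability of the true class."""
    if weights is None:
        weights = np.ones(len(targets))
    # Pick, for each example, the log probability of its true label
    picked = log_probs[np.arange(len(targets)), targets]
    return -(picked * weights).sum() / weights.sum()

# Made-up predictions for two examples
log_probs = np.log(np.array([[0.9, 0.1],
                             [0.2, 0.8]]))
targets = np.array([0, 1])
loss = cross_entropy_from_log_probs(log_probs, targets)
print(round(float(loss), 4))  # 0.1643
```

The loss approaches 0 as the probability assigned to the correct class approaches 1, and grows without bound as that probability approaches 0.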


In [18]:
def train_model(classifier, train_task, eval_task, n_steps, output_dir):
    """
    Input: 
        classifier: the model
        train_task: Training task
        eval_task:  Evaluation task
        n_steps:    the number of training steps to run
        output_dir: directory to save files
    Output:
        training_loop: trax trainer
    """
    training_loop = training.Loop(classifier,              # The learning model
                                  train_task,              # The training task
                                  eval_tasks=[eval_task],  # The evaluation task
                                  output_dir=output_dir,   # The output directory
                                  random_seed=manual_seed) # random seed
    
    training_loop.run(n_steps = n_steps)

    # Return the training_loop, since it has the model.
    return training_loop

Now let's make an output directory and train the model.


In [19]:
output_dir = "./model/"
training_loop = train_model(model, train_task, eval_task, 100, output_dir)


Step    110: Ran 10 train steps in 3.70 secs
Step    110: train CrossEntropyLoss |  0.01230888
Step    110: eval  CrossEntropyLoss |  0.01557323
Step    110: eval          Accuracy |  1.00000000

Step    120: Ran 10 train steps in 3.15 secs
Step    120: train CrossEntropyLoss |  0.02942829
Step    120: eval  CrossEntropyLoss |  0.17076454
Step    120: eval          Accuracy |  0.93750000

Step    130: Ran 10 train steps in 1.68 secs
Step    130: train CrossEntropyLoss |  0.03062473
Step    130: eval  CrossEntropyLoss |  0.06471900
Step    130: eval          Accuracy |  0.93750000

Step    140: Ran 10 train steps in 1.31 secs
Step    140: train CrossEntropyLoss |  0.02525533
Step    140: eval  CrossEntropyLoss |  0.01430054
Step    140: eval          Accuracy |  1.00000000

Step    150: Ran 10 train steps in 0.68 secs
Step    150: train CrossEntropyLoss |  0.04391470
Step    150: eval  CrossEntropyLoss |  0.01726753
Step    150: eval          Accuracy |  1.00000000

Step    160: Ran 10

Now that we have trained a model, we can access it as the ``training_loop.model`` object. We will actually use ``training_loop.eval_model``, because the model used for evaluation can differ from the one used for training, e.g., by omitting dropout.


## Computing the accuracy on a batch

We will now write a function that evaluates our model on the validation set and returns the accuracy. 
- ``preds`` contains the predictions.
    - Its dimensions are ``(batch_size, output_dim)``. ``output_dim`` is two in this case. Because the model ends with a ``LogSoftmax`` layer, the values are log probabilities rather than probabilities: column 0 contains the log probability that the tweet belongs to class 0 (negative sentiment), and column 1 contains the log probability that it belongs to class 1 (positive sentiment).
    - Since the logarithm is monotonically increasing, comparing log probabilities gives the same result as comparing probabilities. If the value in column 1 is greater than the value in column 0, interpret this as the model's prediction that the example has label 1 (positive sentiment).  
    - Otherwise, if the values are equal or the value in column 0 is higher, the model's prediction is 0 (negative sentiment).
- ``y`` contains the actual labels.
- ``y_weights`` contains the weights to give to predictions.

In [20]:
def compute_accuracy(preds, y, y_weights):
    """
    Input: 
        preds:     a tensor of shape (dim_batch, output_dim) 
        y:         a tensor of shape (dim_batch,) with the true labels
        y_weights: a (np.ndarray) with a weight for each example
    Output: 
        accuracy:             a float between 0-1 
        weighted_num_correct: Sum of the weighted correct predictions (np.float32)
        sum_weights:          Sum of the weights (np.float32)
    """
    # An array of np.int32: `1` if the probability of positive sentiment
    # is greater than the probability of negative sentiment, else `0`
    is_pos_int = (preds[:, 0] < preds[:, 1]).astype(np.int32)
    # The array of correct predictions of np.float32
    correct_float = (is_pos_int == y).astype(np.float32)
    # Multiply each prediction with its corresponding weight
    weighted_correct_float = np.multiply(correct_float, y_weights)

    # Sum up the weighted correct predictions (of type np.float32)
    weighted_num_correct = np.sum(weighted_correct_float)
    # The sum of the weights
    sum_weights = np.sum(y_weights)
    # Divide the number of weighted correct predictions by the sum of the weights
    accuracy = weighted_num_correct / sum_weights

    return accuracy, weighted_num_correct, sum_weights
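
The same steps can be traced by hand on a tiny batch. The numbers below are made up, and the computation is written inline rather than calling ``compute_accuracy`` so that the snippet runs on its own. It also includes a tie, which counts as a negative prediction.

```python
import numpy as np

# Made-up log probabilities for four examples (columns: negative, positive)
log_probs = np.log(np.array([[0.3, 0.7],
                             [0.6, 0.4],
                             [0.5, 0.5],   # tie: predicted as 0 (negative)
                             [0.1, 0.9]]))
labels = np.array([1, 0, 1, 0])
weights = np.ones(4)

# Predict 1 when the positive column is strictly larger
predictions = (log_probs[:, 0] < log_probs[:, 1]).astype(np.int32)
correct = (predictions == labels).astype(np.float32)

weighted_num_correct = np.sum(correct * weights)
sum_weights = np.sum(weights)
accuracy = weighted_num_correct / sum_weights

print(predictions)  # [1 0 0 1]
print(accuracy)     # 0.5
```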

## Testing the model

Now we will test our model's prediction accuracy on the validation data.


In [21]:
def test_model(generator, model):
    """
    Input: 
        generator: an iterator instance that provides batches of inputs and targets
        model:     a model instance 
    Output: 
        accuracy: float corresponding to the accuracy
    """
    accuracy = 0.
    total_num_correct = 0
    total_num_pred = 0
    
    for batch in generator: 
        # Retrieve the inputs, the targets (actual labels) and the example weight
        inputs, targets, example_weight = batch
        # predictions using the inputs
        pred = model(inputs)
        # Accuracy for the batch by comparing its predictions and targets
        batch_accuracy, batch_num_correct, batch_num_pred = compute_accuracy(pred, targets, example_weight)
        # Update the total number of predictions and correct predictions
        total_num_correct += batch_num_correct
        total_num_pred += batch_num_pred
        
    # Accuracy over all examples
    accuracy = total_num_correct / total_num_pred
    
    return accuracy

In [22]:
# Testing the accuracy of the model
model = training_loop.eval_model
accuracy = test_model(test_generator(16), model)

print(f'The accuracy of the model on the validation set is {accuracy:.4f}')

The accuracy of the model on the validation set is 0.9940


## Predicting a tweet

Finally, we will test the model on new input tweets.


In [23]:
def predict(sentence):
    inputs = np.array(tweet_to_tensor(sentence, vocab_dict=Vocab))
    # Batch size 1, add dimension for batch, to work with the model
    inputs = inputs[None, :]
    # predict with the model
    preds_probs = model(inputs)
    # Turn probabilities into categories
    preds = int(preds_probs[0, 1] > preds_probs[0, 0])
    if preds == 1:
        sentiment = "positive"
    else:
        sentiment = "negative"

    return preds, sentiment

In [24]:
# Try a positive sentence
sentence = "It's such a nice day, think i'll be taking Sid to Ramsgate fish and chips for lunch at Peter's fish factory and then the beach maybe"
tmp_pred, tmp_sentiment = predict(sentence)
print(f"The sentiment of the sentence \n***\n\"{sentence}\"\n***\nis {tmp_sentiment}.")

print()
# try a negative sentence
sentence = "I hated my day, it was the worst, I'm so sad."
tmp_pred, tmp_sentiment = predict(sentence)
print(f"The sentiment of the sentence \n***\n\"{sentence}\"\n***\nis {tmp_sentiment}.")

The sentiment of the sentence 
***
"It's such a nice day, think i'll be taking Sid to Ramsgate fish and chips for lunch at Peter's fish factory and then the beach maybe"
***
is positive.

The sentiment of the sentence 
***
"I hated my day, it was the worst, I'm so sad."
***
is negative.


Notice that the model classifies both example sentences correctly, including the longer one.

Deep nets allow you to capture dependencies that you would not have been able to capture with a simple linear model such as logistic regression.
- They also make it easier to use pre-trained embeddings for classification and tend to generalize better.
