## Import NLTK sample dataset and Numpy library

In this project, we use a Twitter dataset that comes with NLTK. This dataset has been manually annotated and serves to establish baselines for models quickly. 

The sample dataset from NLTK is separated into positive and negative tweets and contains exactly 5,000 of each. The classes are balanced on purpose, because balanced datasets simplify the design of most computational methods used for sentiment analysis. Keep in mind, however, that this balance is artificial.

In [1]:
import nltk                                # Python library for NLP
import numpy as np                         # Python Numpy library
from nltk.corpus import twitter_samples    # sample Twitter dataset from NLTK

## Download Twitter dataset

In [3]:
# Download the sample Twitter dataset. This only needs to run once per machine.
nltk.download('twitter_samples')

[nltk_data] Downloading package twitter_samples to
[nltk_data]     /Users/user/nltk_data...
[nltk_data]   Unzipping corpora/twitter_samples.zip.


True

Then load the text fields of the tweets from the JSON files using the provided `strings()` method.

In [4]:
# load the set of positive and negative tweets
all_positive_tweets = twitter_samples.strings('positive_tweets.json')
all_negative_tweets = twitter_samples.strings('negative_tweets.json')

Then print some information about the dataset

In [38]:
print('Number of positive tweets: ', len(all_positive_tweets))
print('Number of negative tweets: ', len(all_negative_tweets))

print('\nThe type of all_positive_tweets is: ', type(all_positive_tweets))
print('The type of a tweet entry is: ', type(all_negative_tweets[0]))

Number of positive tweets:  5000
Number of negative tweets:  5000

The type of all_positive_tweets is:  <class 'list'>
The type of a tweet entry is:  <class 'str'>


## Check random raw tweet text

In [15]:
import random                              # pseudo-random number generator

# print a random positive tweet in green
# (random.randint(0, 5000) includes 5000, which is one past the last valid index)
print('\033[92m' + random.choice(all_positive_tweets))

# print a random negative tweet in red
print('\033[91m' + random.choice(all_negative_tweets))

[92m@x22AEW @candlerosa @qvcuk @miceal can't wait Rosa love xmas :) welcome back ann hope u had a good one back just in time :) x
[91m@EmperorJepp yeah, they are just the next prey :( poor girls


## Prepare train data and test data
* The two files loaded above provide 5,000 positive tweets and 5,000 negative tweets, which gives 10,000 labeled tweets in total.

* Train/test split: 80% of each class (4,000 tweets) goes into the training set, and the remaining 20% (1,000 tweets) goes into the test set.



In [39]:
# split the data into two pieces, one for training and one for testing (validation set) 
test_pos = all_positive_tweets[4000:]
train_pos = all_positive_tweets[:4000]
test_neg = all_negative_tweets[4000:]
train_neg = all_negative_tweets[:4000]

train_x = train_pos + train_neg 
test_x = test_pos + test_neg

* Create the numpy array of positive labels and negative labels.

Each label identifies the corpus a tweet came from: `1` for the positive tweets and `0` for the negative tweets. The labels are stored as a column vector with one row per tweet.

In [43]:
# combine positive and negative labels
train_y = np.append(np.ones((len(train_pos), 1)), np.zeros((len(train_neg), 1)), axis=0)
test_y = np.append(np.ones((len(test_pos), 1)), np.zeros((len(test_neg), 1)), axis=0)

# Print the shapes of the train and test label arrays
print("train_y.shape = " + str(train_y.shape))
print("test_y.shape = " + str(test_y.shape))

train_y.shape = (8000, 1)
test_y.shape = (2000, 1)


## Preprocess raw tweet for Sentiment analysis

We will process the data in these five steps:

- Removing URLs, Twitter markup (retweet marks, handles, hash signs) and stock tickers
- Lowercasing
- Tokenizing the string
- Removing stop words and punctuation
- Stemming
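
As a rough illustration of the first step only, the sketch below applies regular-expression cleanup to a made-up tweet using just the standard library. The sample text and the helper name `strip_twitter_markup` are hypothetical, and the URL pattern here is an assumption: it stops at the next whitespace, so text following a URL is kept. Tokenizing, stop word removal and stemming rely on NLTK and are not shown.

```python
import re

def strip_twitter_markup(text):
    """Remove tickers, a leading 'RT', URLs and '#' signs (illustrative helper)."""
    text = re.sub(r'\$\w*', '', text)          # stock tickers such as $GE
    text = re.sub(r'^RT\s+', '', text)         # old-style retweet prefix
    text = re.sub(r'https?://\S+', '', text)   # a URL, up to the next whitespace
    text = re.sub(r'#', '', text)              # drop '#', keep the hashtag word
    return text

sample = "RT Loving the new $GE report #happy http://example.com/x :)"
cleaned = strip_twitter_markup(sample)

# A plain lowercase/whitespace split stands in for the real tokenizer here.
print(cleaned.lower().split())
# ['loving', 'the', 'new', 'report', 'happy', ':)']
```

Each substitution is independent, so the steps can be tried one at a time on a sample string to see exactly what each pattern removes.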


In [19]:
import re                                  # library for regular expression operations
import string                              # for string operations

# nltk.download('stopwords')               # run once if the stop word corpus is not installed yet
from nltk.corpus import stopwords          # module for stop words that come with NLTK
from nltk.stem import PorterStemmer        # module for stemming
from nltk.tokenize import TweetTokenizer   # module for tokenizing strings

### Define a process function
    Input:
        tweet: a string containing a tweet
    Output:
        tweets_clean: a list of words containing the processed tweet

In [20]:
def process_tweet(tweet):
    """Clean, tokenize and stem a tweet; return the list of processed tokens."""
    stemmer = PorterStemmer()
    stopwords_english = stopwords.words('english')

    # remove stock market tickers like $GE
    tweet = re.sub(r'\$\w*', '', tweet)

    # remove old style retweet text "RT"
    tweet = re.sub(r'^RT[\s]+', '', tweet)

    # remove hyperlinks; note that `.*` is greedy, so everything from the URL
    # to the end of the line is removed, including any text after the URL
    tweet = re.sub(r'https?:\/\/.*[\r\n]*', '', tweet)

    # remove only the hash sign (#) and keep the hashtag word itself
    tweet = re.sub(r'#', '', tweet)

    # tokenize: lowercase the text, strip @handles, shorten repeated characters
    tokenizer = TweetTokenizer(preserve_case=False, strip_handles=True,
                               reduce_len=True)
    tweet_tokens = tokenizer.tokenize(tweet)

    tweets_clean = []
    for word in tweet_tokens:
        if (word not in stopwords_english and  # remove stopwords
                word not in string.punctuation):  # remove punctuation
            stem_word = stemmer.stem(word)  # stemming word
            tweets_clean.append(stem_word)

    return tweets_clean


### Test the process function

In [37]:

# pick a random positive tweet (random.choice cannot go out of range)
tweet = random.choice(all_positive_tweets)

print()
print('\033[92m')
print(tweet)
print('\033[94m')

# call the function defined above
tweets_stem = process_tweet(tweet)  # preprocess the given tweet

print('preprocessed tweet:')
print(tweets_stem)  # print the result


[92m
@roma_cream that's the spirit :) #WsaleLove
[94m
preprocessed tweet:
["that'", 'spirit', ':)', 'wsalelov']


## Build word frequency dictionary

In this dictionary, each key is a 2-element tuple containing a `(word, y)` pair: `word` is a token from a processed tweet, and `y` is the sentiment label of that tweet. The function that builds the dictionary has the following interface:

    """
    Input:
        tweets: a list of tweets
        ys: an array with the sentiment label of each tweet (either 0 or 1)
    Output:
        freqs: a dictionary mapping each (word, sentiment) pair to its frequency
    """

The value associated with each key is the number of times that word appears in the tweets with that label. For example:

```
# "followfriday" appears 25 times in the positive tweets
('followfriday', 1.0): 25

# "shame" appears 19 times in the negative tweets
('shame', 0.0): 19 
```
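
To make the key structure concrete, here is a small sketch that counts `(word, label)` pairs for three made-up tweets that are already tokenized. The tokens, the labels and the names `toy_tweets` and `toy_freqs` are hypothetical, and `collections.Counter` is used only for brevity; it is not what the notebook's function uses.

```python
from collections import Counter

# Hypothetical, already-tokenized tweets with labels (1 = positive, 0 = negative).
toy_tweets = [
    (['happi', 'day', ':)'], 1),
    (['sad', 'day', ':('], 0),
    (['happi', 'happi', ':)'], 1),
]

toy_freqs = Counter()
for tokens, label in toy_tweets:
    for token in tokens:
        toy_freqs[(token, label)] += 1   # one count per (word, label) occurrence

print(toy_freqs[('happi', 1)])   # 3
print(toy_freqs[('day', 0)])     # 1
print(toy_freqs[('happi', 0)])   # 0, a Counter reports missing pairs as zero
```

The same word can appear under both labels, which is why the label has to be part of the key.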

In [45]:
def build_freqs(tweets, ys):
    """Count how often each (word, label) pair occurs in the processed tweets."""
    # Flatten the (m, 1) label array into a plain list of floats, so that each
    # label is hashable and can be used inside a dictionary key.
    # Without the squeeze, tolist() would return a list of one-element lists.
    # If ys is already a flat list, this is effectively a no-op.
    yslist = np.squeeze(ys).tolist()

    # Start with an empty dictionary and populate it by looping over all tweets
    # and over all processed words in each tweet.
    freqs = {}
    for y, tweet in zip(yslist, tweets):
        for word in process_tweet(tweet):
            pair = (word, y)
            freqs[pair] = freqs.get(pair, 0) + 1

    return freqs

In [46]:
# create frequency dictionary
freqs = build_freqs(train_x, train_y)

# check the output
print("type(freqs) = " + str(type(freqs)))
print("len(freqs) = " + str(len(freqs.keys())))

type(freqs) = <class 'dict'>
len(freqs) = 11340


## Test the frequency dictionary

In [47]:
# select some words to appear in the report. we will assume that each word is unique (i.e. no duplicates)
keys = ['happi', 'merri', 'nice', 'good', 'bad', 'sad', 'mad', 'best', 'pretti',
        '❤', ':)', ':(', '😒', '😬', '😄', '😍', '♛',
        'song', 'idea', 'power', 'play', 'magnific']

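# list representing our table of word counts.
# each element consists of a sublist with this pattern: [<word>, <positive_count>, <negative_count>]
# note: the keys in freqs use float labels (1.0, 0.0); looking them up with
# the ints 1 and 0 works because equal numbers hash to the same value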
data = []

# loop through our selected words
for word in keys:
    
    # initialize positive and negative counts
    pos = 0
    neg = 0
    
    # retrieve number of positive counts
    if (word, 1) in freqs:
        pos = freqs[(word, 1)]
        
    # retrieve number of negative counts
    if (word, 0) in freqs:
        neg = freqs[(word, 0)]
        
    # append the word counts to the table
    data.append([word, pos, neg])
    
data

[['happi', 161, 18],
 ['merri', 1, 0],
 ['nice', 70, 17],
 ['good', 191, 83],
 ['bad', 14, 54],
 ['sad', 5, 100],
 ['mad', 3, 8],
 ['best', 49, 16],
 ['pretti', 17, 12],
 ['❤', 21, 15],
 [':)', 2847, 2],
 [':(', 1, 3663],
 ['😒', 1, 3],
 ['😬', 0, 2],
 ['😄', 3, 1],
 ['😍', 1, 0],
 ['♛', 0, 210],
 ['song', 16, 25],
 ['idea', 23, 8],
 ['power', 6, 5],
 ['play', 37, 39],
 ['magnific', 1, 0]]

## Logistic regression model


In [50]:
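def sigmoid(z):
    """Logistic function; z can be a scalar or a numpy array."""
    h = 1/(1 + np.exp(-z))
    return h


def gradientDescent(x, y, theta, alpha, num_iters):
    """Run num_iters steps of batch gradient descent; return the final cost and theta."""
    m = len(x)  # number of training examples (rows of x)
    for i in range(0, num_iters):

        # get z, the dot product of x and theta
        z = np.dot(x,theta)
        print(z)  # debug output, printed at every iteration

        # get the sigmoid of z
        h = sigmoid(z)
        print(h)  # debug output, printed at every iteration

        # calculate the cost function
        J = (-1/m)*(np.dot(y.transpose(),np.log(h)) + np.dot((1 - y).transpose(),np.log(1-h)))
        print(J)  # debug output, printed at every iteration

        # update the weights theta
        theta = theta - ((alpha/m)*np.dot(x.transpose(),(h-y)))

    # J is a 1 x 1 array; squeeze it before converting to a Python float
    J = float(np.squeeze(J))
    return J, theta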
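
To see the update rule at work without the full tweet pipeline, here is a minimal sketch on four made-up, linearly separable points. The data, the learning rate and the iteration count are arbitrary choices for illustration, and the helper names `toy_sigmoid` and `toy_cost` are hypothetical. It applies the same cost and update formulas as `gradientDescent`, but prints only the cost before and after training.

```python
import numpy as np

def toy_sigmoid(z):
    return 1 / (1 + np.exp(-z))

def toy_cost(x, y, theta):
    """Average cross-entropy cost, returned as a Python float."""
    h = toy_sigmoid(x @ theta)
    m = x.shape[0]
    return float(np.squeeze(-(y.T @ np.log(h) + (1 - y).T @ np.log(1 - h)) / m))

# Made-up data: a bias column plus one feature; the label is 1 when the feature is positive.
x = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([[0.0], [0.0], [1.0], [1.0]])

theta = np.zeros((2, 1))
alpha = 0.1
cost_before = toy_cost(x, y, theta)

for _ in range(100):
    h = toy_sigmoid(x @ theta)
    theta = theta - (alpha / x.shape[0]) * (x.T @ (h - y))   # same update rule

cost_after = toy_cost(x, y, theta)
print(f"cost before: {cost_before:.4f}, cost after: {cost_after:.4f}")
```

With all weights at zero every prediction is 0.5, so the starting cost is ln 2, about 0.6931. That is also the first cost value reported when training starts from a zero vector.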

## Extracting the features

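* Given a tweet, extract its features and store them in a 1 x 3 vector. The vectors of all tweets are later stacked into a matrix.
    * The first element is the bias term, which is always 1.
    * The second element is the sum, over the words of the processed tweet, of each word's count in the positive tweets.
    * The third element is the same sum using each word's count in the negative tweets.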
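
As a worked example, the sketch below builds the 1 x 3 vector (bias term, positive sum, negative sum) for a hand-made frequency dictionary. The dictionary `toy_freqs` and the token list are hypothetical values chosen so that the arithmetic is easy to check by hand.

```python
import numpy as np

# Hypothetical frequency dictionary and tokens, chosen only for illustration.
toy_freqs = {('happi', 1): 10, ('happi', 0): 2, ('bad', 0): 5}
tokens = ['happi', 'bad', 'unknown']

x = np.zeros((1, 3))
x[0, 0] = 1                                   # bias term
for token in tokens:
    x[0, 1] += toy_freqs.get((token, 1), 0)   # counts in positive tweets
    x[0, 2] += toy_freqs.get((token, 0), 0)   # counts in negative tweets

print(x.tolist())   # [[1.0, 10.0, 7.0]]
```

Here `happi` contributes 10 to the positive sum and 2 to the negative sum, `bad` contributes 5 to the negative sum, and the unseen token contributes nothing.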

In [63]:
def extract_features(tweet, freqs):
    # process_tweet tokenizes, stems, and removes stopwords
    word_l = process_tweet(tweet)
    
    print(word_l)
    
    # 3 elements in the form of a 1 x 3 vector
    x = np.zeros((1, 3)) 
    
    #bias term is set to 1
    x[0,0] = 1 
    
    
    # loop through each word in the list of words
    for word in word_l:
        
        # increment the word count for the positive label 1
        if((word,1) in freqs.keys()):
            x[0,1] += freqs[(word,1)]
        
        # increment the word count for the negative label 0
        if((word,0) in freqs.keys()):
            x[0,2] += freqs[(word,0)]
        
    assert(x.shape == (1, 3))
    return x

## Training the Model

To train the model:
* Stack the features for all training examples into a matrix `X`. 
* Use `gradientDescent`, which was implemented above.


In [59]:
# collect the features 'x' and stack them into a matrix 'X'
X = np.zeros((len(train_x), 3))
for i in range(len(train_x)):
    X[i, :]= extract_features(train_x[i], freqs)

# training labels corresponding to X
Y = train_y

# Apply gradient descent
J, theta = gradientDescent(X, Y, np.zeros((3, 1)), 1e-9, 1500)

print(f"The cost after training is {J:.8f}.")
print(f"The resulting vector of theta is {[round(t, 8) for t in np.squeeze(theta)]}")

[[0.]
 [0.]
 [0.]
 ...
 [0.]
 [0.]
 [0.]]
[[0.5]
 [0.5]
 [0.5]
 ...
 [0.5]
 [0.5]
 [0.5]]
[[0.69314718]]
[... intermediate iterations omitted ...]
[[ 1.50642324]
 [ 1.58013801]
 [ 1.46942141]
 ...
 [-0.35223531]
 [-2.01013144]
 [-2.06395134]]
[[0.81853053]
 [0.82922406]
 [0.81296943]
 ...
 [0.41284047]
 [0.11814328]
 [0.11265025]]
[[0.24728207]]
[... output truncated ...]

## Testing the Model

* Implement a predict function

Predict whether a tweet is positive or negative.
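
The prediction step can be sketched in isolation as follows. The feature vector and the weights are made-up numbers, not values learned in this notebook, and `toy_sigmoid` is a hypothetical stand-in for the sigmoid function. The sketch shows how the probability comes from the dot product and how a 0.5 threshold turns it into a label.

```python
import numpy as np

def toy_sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical feature vector [bias, positive sum, negative sum] and weights.
x = np.array([[1.0, 10.0, 7.0]])
theta = np.array([[0.0], [0.5], [-0.5]])

# x @ theta is a 1 x 1 array; squeeze it to get a plain float probability.
prob = float(np.squeeze(toy_sigmoid(x @ theta)))
label = 1 if prob > 0.5 else 0   # 1 = positive, 0 = negative
print(f"{prob:.4f} -> {label}")
```

A positive weight on the second feature and a negative weight on the third push the probability above 0.5 whenever the positive sum outweighs the negative sum.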


In [60]:
def predict_tweet(tweet, freqs, theta):

    print(tweet)
    # extract the features of the tweet and store it into x
    x = extract_features(tweet, freqs)
    
    # make the prediction using x and theta
    y_pred = sigmoid(np.dot(x,theta))
        
    return y_pred

In [64]:
# Test the function
for tweet in ['I am a happy student', 'I am happy to learn NLP', 'this movie should have been great.', 'bad', 'I hate playing this game,shit', 'Awesome!! That is great!', 'suck suck suck suck','good great happy nice']:
    print( '%s -> %f' % (tweet, predict_tweet(tweet, freqs, theta)))

I am a happy student
['happi', 'student']
I am a happy student -> 0.518818
I am happy to learn NLP
['happi', 'learn', 'nlp']
I am happy to learn NLP -> 0.518949
this movie should have been great.
['movi', 'great']
this movie should have been great. -> 0.515331
bad
['bad']
bad -> 0.494339
I hate playing this game,shit
['hate', 'play', 'game', 'shit']
I hate playing this game,shit -> 0.491742
Awesome!! That is great!
['awesom', 'great']
Awesome!! That is great! -> 0.519349
suck suck suck suck
['suck', 'suck', 'suck', 'suck']
suck suck suck suck -> 0.493088
good great happy nice
['good', 'great', 'happi', 'nice']
good great happy nice -> 0.554149


## Check performance using the test set
After training the model using the training set above, check how the model might perform on real, unseen data, by testing it against the test set.
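
Accuracy is the fraction of predictions that match the labels. As a hedged aside, the sketch below computes it in vectorized form on made-up arrays (`y_hat` and `labels` are hypothetical), and shows why both sides should be flattened first when the labels are stored as a column vector.

```python
import numpy as np

# Hypothetical predictions (flat array) and labels stored as a column vector.
y_hat = np.array([1.0, 0.0, 1.0, 1.0])
labels = np.array([[1.0], [0.0], [0.0], [1.0]])

# Flatten both sides first: comparing shapes (4,) and (4, 1) directly would
# broadcast to a 4 x 4 matrix instead of an element-wise comparison.
accuracy = np.mean(y_hat.ravel() == labels.ravel())
print(accuracy)   # 0.75
```

An explicit loop over indices avoids the broadcasting pitfall as well, at the cost of a few more lines.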


In [67]:
def test_logistic_regression(test_x, test_y, freqs, theta):
    
    # the list for storing predictions
    y_result = []
    
    for tweet in test_x:
        # get the label prediction for the tweet
        y_pred = predict_tweet(tweet, freqs, theta)
        
        if y_pred > 0.5:
            # append 1.0 to the list
            y_result.append(1.0)
        else:
            # append 0 to the list
            y_result.append(0.0)

    # Cast the y_result list to array
    y_result_array = np.array(y_result)
    
    equal = 0
    
    for i in range(0,len(test_y)):
        if(y_result_array[i] == test_y[i]):
            equal += 1
    
    accuracy = equal/len(test_y)

    
    return accuracy

In [68]:
tmp_accuracy = test_logistic_regression(test_x, test_y, freqs, theta)
print(f"Logistic regression model's accuracy = {tmp_accuracy:.4f}")

Bro:U wan cut hair anot,ur hair long Liao bo
Me:since ord liao,take it easy lor treat as save $ leave it longer :)
Bro:LOL Sibei xialan
['bro', 'u', 'wan', 'cut', 'hair', 'anot', 'ur', 'hair', 'long', 'liao', 'bo', 'sinc', 'ord', 'liao', 'take', 'easi', 'lor', 'treat', 'save', 'leav', 'longer', ':)', 'bro', 'lol', 'sibei', 'xialan']
@heyclaireee is back! thnx God!!! i'm so happy :)
['back', 'thnx', 'god', "i'm", 'happi', ':)']
@BBCRadio3 thought it was my ears which were malfunctioning, thank goodness you cleared that one up with an apology :-)
['thought', 'ear', 'malfunct', 'thank', 'good', 'clear', 'one', 'apolog', ':-)']
@HumayAG 'Stuck in the centre right with you. Clowns to the right, jokers to the left...' :) @orgasticpotency @ahmedshaheed @AhmedSaeedGahaa
['stuck', 'centr', 'right', 'clown', 'right', 'joker', 'left', '...', ':)']
Happy Friday :-) http://t.co/iymPIlWXFY
['happi', 'friday', ':-)']
@Sazzi91 we are following you now :) x
['follow', ':)', 'x']
[... output truncated ...]
['ff', 'happyfriday', 'great', 'friday', ':-)']
@JavsNH sure :)
['sure', ':)']
@SBS_MTV #다쇼 #GOT7  

Let's have got7 facts :)
['다쇼', 'got', '7', "let'", 'got', '7', 'fact', ':)']
@monkeymademe the @raspjamberlin is still on tomorrow? A work colleague is interested in bringing his kid along :) can point him to EB page
['still', 'tomorrow', 'work', 'colleagu', 'interest', 'bring', 'kid', 'along', ':)', 'point', 'eb', 'page']
@krigsmamma Tack &lt;3 :D
['tack', '<3', ':D']
T'would not be a TweetUp without you @coleman_21 You are  booked on @Sabrina_Boat rest assured. @MButlerCo

['break', 'shell', ':-)']
Truth. :D http://t.co/EI7VtynLOx
['truth', ':D']
@chuckaikens Fantastic. Thank you :)
['fantast', 'thank', ':)']
@polarizehes heyy :) can u rt this link https://t.co/WztNf8e6cO and tag michael? please. thank you
['heyi', ':)', 'u', 'rt', 'link']
@MariaSharapova The hashtag says it all ................ :)
['hashtag', 'say', '...', ':)']
@JonsCrazyTweets still more green tea blends with other flavors. :)
['still', 'green', 'tea', 'blend', 'flavor', ':)']
@MaslowFanArmy i'm very excited for what's coming :) he really deserves it.
["i'm", 'excit', "what'", 'come', ':)', 'realli', 'deserv']
What you do today can improve all your tomorrows..... :-) :)
['today', 'improv', 'tomorrow', '...', ':-)', ':)']
@ChrisMitchell91 sometimes :-)
['sometim', ':-)']
@AlexCarranza21 I'm so sorry! I ran out with a friend after having a rough day. I'll try to stream on Saturday! :)
["i'm", 'sorri', 'ran', 'friend', 'rough', 'day', "i'll", 'tri', 'stream', 'saturday', ':)']
@ciaela ge

['oh', 'wow', 'thank', ':)', 'skin', 'go', 'perfect', 'time', 'head', 'lago', 'travel']
wow luxord looks really amazing here in his new kingdom hearts 3 promo art !! :) http://t.co/FFJU1rlJpC
['wow', 'luxord', 'look', 'realli', 'amaz', 'new', 'kingdom', 'heart', '3', 'promo', 'art', ':)']
New potatos from the garden - and hundreds more to dig up :)
['new', 'potato', 'garden', 'hundr', 'dig', ':)']
@HowToFixMyCar Love it! :D @CrienaLDavies @HelenRoseTerry1 @PrestigeDiesels @star_aline1 @LeahRebeccaUK
['love', ':D']
Thank you. Have a lovely weekend everyone. :-)) https://t.co/1CcpcLlzic
['thank', 'love', 'weekend', 'everyon', ':-)']
Well this totally blew my mind this morning : ) https://t.co/jjXcm44YPJ
['well', 'total', 'blew', 'mind', 'morn']
@thevieweast Of course! Now I've been cited in an academic paper I feel I've arrived :-)
['cours', "i'v", 'cite', 'academ', 'paper', 'feel', "i'v", 'arriv', ':-)']
Life is smile :)
['life', 'smile', ':)']
@twigtwisters thanks for the info :)
['tha

['mention', 'south', 'africa', 'mani', 'time', 'song', 'time', 'come', 'south', 'africa', ':(']
Hate when I can't remember my dreams, I love sharing them :(
['hate', "can't", 'rememb', 'dream', 'love', 'share', ':(']
What is going on in America man R.I.P to all the victims in #Louisiana :(
['go', 'america', 'man', 'r', 'p', 'victim', 'louisiana', ':(']
When school comes between me and twitter :(
['school', 'come', 'twitter', ':(']
Taking 190 at such a horrible timing :(
['take', '190', 'horribl', 'time', ':(']
@myungfart ella :( cheer up pls
['ella', ':(', 'cheer', 'pl']
@Lizarrdz That's no excuse for launching an attack Lizardz, you should feel like shit for doing that.  I am deeply ashamed of you.  &gt;:(
["that'", 'excus', 'launch', 'attack', 'lizardz', 'feel', 'like', 'shit', 'deepli', 'asham', '>:(']
@dtaylor5633  Nothing worse :(
['noth', 'wors', ':(']
@cockneyradish Sorry about that :( We need to get an emergency engineer out to you! Will you be home for the next 4 hours? ^Laura

['never', 'give', 'best', 'thing', 'take', 'time', 'weh', ':(']
GUYS add my KIK - sprevelink633 #kik #hornykik #chat #porno #orgasm #free #webcamsex :( http://t.co/ZepUgc5yJ3
['guy', 'add', 'kik', 'sprevelink', '633', 'kik', 'hornykik', 'chat', 'porno', 'orgasm', 'free', 'webcamsex', ':(']
@xsushie why you're now lana? :(
['lana', ':(']
Missed 11:11 :( btw my wish was for @carterreynolds to follow me♥ #LoveYouTillTheEndCarter ☺
['miss', '11:11', ':(', 'btw', 'wish', 'follow', '♥', 'loveyoutilltheendcart', '☺']
@Shonette @HartsBakery Aw that sounds great, better than what i  have for lunch :(
['aw', 'sound', 'great', 'better', 'lunch', ':(']
I want to play the SFV!!!
Capcom plz :(
['want', 'play', 'sfv', 'capcom', 'plz', ':(']
Tips ONLINE!
6/7 WINNERS yesterday!
Just 1 goal off a nice 20/1 :(
Hopefully a similar strike rate today!
http://t.co/lJEB3EPZVt
['tip', 'onlin', '6/7', 'winner', 'yesterday', '1', 'goal', 'nice', '20/1', ':(', 'hope', 'similar', 'strike', 'rate', 'today']
@phenom

['cant', 'find', ':(']
@remnantsofme Oh no :( What building &amp; suite are you in!?
['oh', ':(', 'build', 'suit']
I wanna cut k but I'm with my friends and they threw me a sleepover party :( so I'm pretending to be ok like usual
['wanna', 'cut', 'k', "i'm", 'friend', 'threw', 'sleepov', 'parti', ':(', "i'm", 'pretend', 'ok', 'like', 'usual']
@swcgtbh leave me alone :(((
['leav', 'alon', ':(']
praying for you , KHAMIS ... :( ... https://t.co/2Qg4dhxaGg #Kadhafi
['pray', 'khami', '...', ':(', '...']
damn forgot about my hot chocolate :(
['damn', 'forgot', 'hot', 'chocol', ':(']
I'm craving Oreos and milk :(((((
["i'm", 'crave', 'oreo', 'milk', ':(']
And now that love island has finished I feel like summer has gone :(
['love', 'island', 'finish', 'feel', 'like', 'summer', 'gone', ':(']
@sugarymgc i miss you :(
['miss', ':(']
@KrystalHosting Was just about to push a client your way for some hosting. Maybe I had better wait till next week :(
['push', 'client', 'way', 'host', 'mayb', 'bette

['oh', 'ndabenhl', ':-(', 'pleas', 'let', 'us', 'know', 'feel', 'way', 'sa']
I've got one poorly doggy :(
["i'v", 'got', 'one', 'poorli', 'doggi', ':(']
Reply please antagal :(
['repli', 'pleas', 'antag', ':(']
@CrabbyCrabsBB Windows 7 here :(
['window', '7', ':(']
@Miss_J_Hart @staffrm well, I'm only on P.41 &amp; 8 pages of notes thus far :-( and I've read twice before
['well', "i'm", 'p', '41', '8', 'page', 'note', 'thu', 'far', ':-(', "i'v", 'read', 'twice']
i want all of kylie jenners clothes :(
['want', 'kyli', 'jenner', 'cloth', ':(']
@SkyHelpTeam yes unfortunately I am, I have done the whole troubleshooting of the box too and I'm still having issues :(
['ye', 'unfortun', 'done', 'whole', 'troubleshoot', 'box', "i'm", 'still', 'issu', ':(']
I miss those convo's so bad damn :(
['miss', "convo'", 'bad', 'damn', ':(']
Dem free tix to Big Bang concert :(
['dem', 'free', 'tix', 'big', 'bang', 'concert', ':(']
@TinyKyrus So youre not gonna guess whats inside my box? :(((
['your', 'gon

['worri', ':-(']
“@TheShoeBibles: Its mine.... http://t.co/I9rRNjyvUq” damnnn fineeeeeeeee omaigoshhhhhhh can I have you :((((
['“', 'mine', '...']
They didn't care that I was a Muslim :(
['care', 'muslim', ':(']
@sneaksontaylor no it's not :(
[':(']
When Jessica calls and quits on power abs at 5:15 :-(
['jessica', 'call', 'quit', 'power', 'ab', '5:15', ':-(']
She lost us, her friends... :( she's the one who started the argument.
['lost', 'us', 'friend', '...', ':(', 'one', 'start', 'argument']
@MsMeghanMakeup hope you're having fun at vidcon!! rlly bummed i couldnt go :(( love you sunshine 💛💛💛💛
['hope', 'fun', 'vidcon', 'rlli', 'bum', 'couldnt', 'go', ':(', 'love', 'sunshin', '💛', '💛', '💛']
@iTaimikhan yepp.. I'm getting bore :(
['yepp', '..', "i'm", 'get', 'bore', ':(']
last night was so good :( 😺💒💎🎉
['last', 'night', 'good', ':(', '😺', '💒', '💎', '🎉']
@PrinceOfRnbZJM Can you follow me ? :(
['follow', ':(']
feelin ' Sick :((
['feelin', 'sick', ':(']
If you want a biscuit slathered in 

['oh', 'realli', ':(', 'saw', 'gif', 'post', 'seen', 'realli', 'happi', 'mayb', 'peopl', "who'd", 'happi', 'naruhina']
@Errata_0 you can't :(
["can't", ':(']
@hausofbekah @JudasDiamonds Thats an English nameeee :(( Haiqal isnt an English name but you remember him
['that', 'english', 'namee', ':(', 'haiqal', 'isnt', 'english', 'name', 'rememb']
@GFuelEnergy i want that but i dont have paypal :(
['want', 'dont', 'paypal', ':(']
And that's how my 360hrs of "summer" ends..... HUHUHU I WILL MISS PICC A LOT :(((((
["that'", '360hr', 'summer', 'end', '...', 'huhuhu', 'miss', 'picc', 'lot', ':(']
my heart hurts :(
['heart', 'hurt', ':(']
@nextofficial my instore card expired in June &amp; I havent been sent a new one :(
['instor', 'card', 'expir', 'june', 'havent', 'sent', 'new', 'one', ':(']
@loyalistofSoShi MCountdown pre-voting is begin, we are at 5th place :(
['mcountdown', 'pre-vot', 'begin', '5th', 'place', ':(']
I felt bad when Irene forgot what she was going to say :(
['felt', 'bad', '

## Check the accuracy of the model

In [69]:
print(f"Logistic regression model's accuracy = {tmp_accuracy:.4f}")

Logistic regression model's accuracy = 0.9950
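The accuracy reported above is the fraction of test tweets whose predicted label matches the true label. A minimal sketch of that calculation follows, assuming predictions are probabilities thresholded at 0.5 (the same cutoff used in the prediction cell). The helper `accuracy_from_probs` and the toy values are hypothetical and are not the notebook's own test function or data.

```python
import numpy as np

def accuracy_from_probs(probs, y_true, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    probs = np.asarray(probs, dtype=float).reshape(-1)
    y_true = np.asarray(y_true).reshape(-1)
    # Probabilities strictly above the threshold are labelled positive (1)
    y_hat = (probs > threshold).astype(int)
    return float(np.mean(y_hat == y_true))

# Toy values for illustration only (not taken from the Twitter dataset)
toy_probs = [0.9, 0.2, 0.6, 0.4]
toy_labels = [1, 0, 0, 0]
toy_accuracy = accuracy_from_probs(toy_probs, toy_labels)
print(f"accuracy = {toy_accuracy:.4f}")
```

Because the sample dataset is balanced, accuracy is a reasonable summary here; on imbalanced data it can be misleading.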


## Predict my tweet

In [77]:
# Change the text of the tweet below
my_tweet = 'I hate you! You are an idiot!'

print(process_tweet(my_tweet))
y_result = predict_tweet(my_tweet, freqs, theta)
print(y_result)
if y_result > 0.5:
    print('\nPositive sentiment')
else: 
    print('\nNegative sentiment')

['hate', 'idiot']
I hate you! You are an idiot!
['hate', 'idiot']
[[0.49506426]]

Negative sentiment
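Note that the score of about 0.495 sits just below the 0.5 cutoff, so this tweet is classified as negative by a narrow margin. The sketch below illustrates how such a score could arise, assuming the model applies a sigmoid to a three-element feature vector (bias, summed positive counts, summed negative counts) multiplied by `theta`. The functions `sigmoid` and `toy_features`, the dictionary `toy_freqs`, and the weights `toy_theta` are all hypothetical stand-ins; they are not the notebook's `predict_tweet`, `freqs`, or trained `theta`.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def toy_features(tokens, freqs):
    """Build [bias, sum of positive counts, sum of negative counts]."""
    x = np.zeros((1, 3))
    x[0, 0] = 1.0  # bias term
    for word in tokens:
        x[0, 1] += freqs.get((word, 1.0), 0)  # count in positive tweets
        x[0, 2] += freqs.get((word, 0.0), 0)  # count in negative tweets
    return x

# Made-up counts and weights, for illustration only
toy_freqs = {("hate", 0.0): 50, ("hate", 1.0): 5, ("idiot", 0.0): 10}
toy_theta = np.array([[0.0], [0.001], [-0.001]])

x = toy_features(["hate", "idiot"], toy_freqs)
y_prob = sigmoid(x @ toy_theta)  # shape (1, 1), like the printed [[...]] above
label = "Positive sentiment" if y_prob.item() > 0.5 else "Negative sentiment"
print(y_prob, label)
```

With small weights, the sigmoid input stays close to zero, which keeps the output close to 0.5 even for strongly worded tweets.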


# Thank you!!!