In [0]:
from google.colab import drive
drive.mount('/gdrive')

Mounted at /gdrive


In [0]:
cd /gdrive/My\ Drive

/gdrive/My Drive


In [0]:
cd Practice/Sentiment-Analysis-Movie-Review/notebook/

/gdrive/My Drive/Practice/Sentiment-Analysis-Movie-Review/notebook


## Sentiment Analysis with Python 

<hr>

**[Classifying IMDb Movie Reviews](https://towardsdatascience.com/sentiment-analysis-with-python-part-1-5ce197074184)**

### Step1: Read into Python

In [0]:
path = "../data/movie_data"

In [0]:
reviews_train = []
# Use a context manager so the file handle is closed after reading.
with open(path + '/full_train.txt', 'r') as f:
    for line in f:
        reviews_train.append(line.strip())

In [0]:
reviews_train[1]

'Homelessness (or Houselessness as George Carlin stated) has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school, work, or vote for the matter. Most people think of the homeless as just a lost cause while worrying about things such as racism, the war on Iraq, pressuring kids to succeed, technology, the elections, inflation, or worrying if they\'ll be next to end up on the streets.<br /><br />But what if you were given a bet to live on the streets for a month without the luxuries you once had from a home, the entertainment sets, a bathroom, pictures on the wall, a computer, and everything you once treasure to see what it\'s like to be homeless? That is Goddard Bolt\'s lesson.<br /><br />Mel Brooks (who directs) who stars as Bolt plays a rich man who has everything in the world until deciding to make a bet with a sissy rival (Jeffery Tambor) to see if he can live in the streets for thirty days withou

In [0]:
reviews_test = []
# Use a context manager so the file handle is closed after reading.
with open(path + '/full_test.txt', 'r') as f:
    for line in f:
        reviews_test.append(line.strip())

In [0]:
print(reviews_test[1])

Actor turned director Bill Paxton follows up his promising debut, the Gothic-horror "Frailty", with this family friendly sports drama about the 1913 U.S. Open where a young American caddy rises from his humble background to play against his Bristish idol in what was dubbed as "The Greatest Game Ever Played." I'm no fan of golf, and these scrappy underdog sports flicks are a dime a dozen (most recently done to grand effect with "Miracle" and "Cinderella Man"), but some how this film was enthralling all the same.<br /><br />The film starts with some creative opening credits (imagine a Disneyfied version of the animated opening credits of HBO's "Carnivale" and "Rome"), but lumbers along slowly for its first by-the-numbers hour. Once the action moves to the U.S. Open things pick up very well. Paxton does a nice job and shows a knack for effective directorial flourishes (I loved the rain-soaked montage of the action on day two of the open) that propel the plot further or add some unexpected

### Step2: Clean and Preprocess

We will do very basic text processing like removing punctuation and HTML tags and making everything lower-case.

**Note:** Understanding and being able to use regular expressions is a prerequisite for doing any Natural Language Processing task. If you’re unfamiliar with them perhaps start here: [Regex Tutorial](https://medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285)

In [0]:
import re

# Raw strings keep regex escapes such as \s and \[ from being read as (invalid) string escapes.
REPLACE_NO_SPACE = re.compile(r"[.;:!'?,\"()\[\]]")
REPLACE_WITH_SPACE = re.compile(r"(<br\s*/><br\s*/>)|(\-)|(\/)")

def preprocess_reviews(reviews):
    reviews = [REPLACE_NO_SPACE.sub("", line.lower()) for line in reviews]
    reviews = [REPLACE_WITH_SPACE.sub(" ", line) for line in reviews]
    
    return reviews

reviews_train_clean = preprocess_reviews(reviews_train)
reviews_test_clean = preprocess_reviews(reviews_test)

In [0]:
reviews_train_clean[1]

'homelessness or houselessness as george carlin stated has been an issue for years but never a plan to help those on the street that were once considered human who did everything from going to school work or vote for the matter most people think of the homeless as just a lost cause while worrying about things such as racism the war on iraq pressuring kids to succeed technology the elections inflation or worrying if theyll be next to end up on the streets but what if you were given a bet to live on the streets for a month without the luxuries you once had from a home the entertainment sets a bathroom pictures on the wall a computer and everything you once treasure to see what its like to be homeless that is goddard bolts lesson mel brooks who directs who stars as bolt plays a rich man who has everything in the world until deciding to make a bet with a sissy rival jeffery tambor to see if he can live in the streets for thirty days without the luxuries if bolt succeeds he can do what he w
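To see what the two patterns do in isolation, here is a small sketch that applies the same two substitutions to a short string. The sample sentence and the names `STRIP`, `TO_SPACE` and `clean` are made up for illustration; the patterns themselves are the ones from the cell above, written as raw strings.

```python
import re

# Same two patterns as in the preprocessing cell, written as raw strings.
STRIP = re.compile(r"[.;:!'?,\"()\[\]]")
TO_SPACE = re.compile(r"(<br\s*/><br\s*/>)|(\-)|(\/)")

def clean(text):
    # 1) lower-case and delete punctuation, 2) turn breaks, hyphens and slashes into spaces
    text = STRIP.sub("", text.lower())
    return TO_SPACE.sub(" ", text)

sample = "Great film!<br /><br />Well-acted, 10/10."  # hypothetical review text
cleaned = clean(sample)
print(cleaned)
```

Note that the first pattern only matches a doubled `<br /><br />`, so a single `<br />` tag would not be removed as a unit by this step.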

### Step3: Vectorization

The simplest form of this is to create one very large matrix with one column for every unique word in your corpus (the article's corpus is all 50k reviews; the vectorizer below is fit on the 25k training reviews). Then we transform each review into one row containing 0s and 1s, where 1 means that the word corresponding to that column appears in that review. 
Each row of the matrix will therefore be very sparse (mostly zeros). The source article calls this process **one hot encoding**; it is also commonly described as a binary bag-of-words representation.

In [0]:
from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer(binary=True)
cv.fit(reviews_train_clean)
X = cv.transform(reviews_train_clean)
X_test = cv.transform(reviews_test_clean)
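To make the matrix layout concrete, here is a hand-rolled sketch of the same idea on three toy "reviews". It is plain Python rather than scikit-learn, and the toy sentences are invented; `CountVectorizer` additionally applies its own tokenization and stores the result as a sparse matrix instead of nested lists.

```python
# Three toy "reviews" (hypothetical data, already cleaned).
docs = ["great movie great cast", "boring movie", "great fun"]

# One column per unique word, in sorted order.
vocab = sorted({word for doc in docs for word in doc.split()})

# One row per review: 1 if the word occurs at all, else 0.
rows = []
for doc in docs:
    present = set(doc.split())
    rows.append([int(word in present) for word in vocab])

print(vocab)
for row in rows:
    print(row)
```

The repeated "great" in the first review still produces a single 1, which is the effect of `binary=True`.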

### Step4: Build Classifier

Now that we’ve transformed our dataset into a format suitable for modeling, we can start building a classifier.
Logistic Regression is a good baseline model for us to use for several reasons: 

1. Linear models are easy to interpret,
2. Linear models tend to perform well on sparse datasets like this one,
3. They learn very fast compared to other algorithms.

**Note:** The targets/labels we use will be the same for training and testing because both datasets are structured the same, where the first 12.5k are positive and the last 12.5k are negative.

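**Note:** The hyperparameter C adjusts the regularization. In scikit-learn, C is the inverse of the regularization strength, so smaller values mean stronger regularization.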
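For reference, the role of C can be read off the L2-penalized logistic regression objective. This is a sketch of the form given in scikit-learn's user guide as I understand it (labels coded as $y_i \in \{-1, +1\}$, up to scaling conventions), not something stated in the original article:

$$\min_{w,\,b}\; \frac{1}{2} w^\top w \;+\; C \sum_{i=1}^{n} \log\!\left(1 + \exp\!\left(-y_i \left(x_i^\top w + b\right)\right)\right)$$

C multiplies the data-fit term, so a small C lets the penalty $\frac{1}{2} w^\top w$ dominate and shrinks the coefficients, while a large C lets the model fit the training data more closely.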

In [0]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [0]:
target = [1 if i < 12500 else 0 for i in range(25000)]

In [0]:
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75
)



In [0]:
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    
    lr = LogisticRegression(C=c)
    lr.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, lr.predict(X_val))))
    
# Accuracy for C=0.01: 0.87248
# Accuracy for C=0.05: 0.88272
# Accuracy for C=0.25: 0.88048
# Accuracy for C=0.5: 0.87824
# Accuracy for C=1: 0.87568



Accuracy for C=0.01: 0.86864
Accuracy for C=0.05: 0.87728
Accuracy for C=0.25: 0.88224
Accuracy for C=0.5: 0.87984
Accuracy for C=1: 0.8752


### Step5: Train Final Model

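Now that we’ve found a good value for C, we should train a model using the entire training set and evaluate our accuracy on the 25k test reviews. The cell below uses C=0.05, the value from the original article's run; because the validation split is random, the best-scoring C can shift between runs (in the run shown above, C=0.25 scored slightly higher).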

In [0]:
final_model = LogisticRegression(C=0.05)
final_model.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final_model.predict(X_test)))

# Final Accuracy: 0.88152



Final Accuracy: 0.88152


As a sanity check, let’s look at the 5 most discriminating words for both positive and negative reviews. 
We’ll do this by looking at the largest and smallest coefficients, respectively.

In [0]:
final_model.intercept_

array([0.15757334])

In [0]:
final_model.coef_

array([[ 1.49484259e-03, -3.52969400e-06, -3.71951487e-03, ...,
         2.90177097e-04, -2.69632880e-02, -6.63970470e-03]])

In [0]:
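# get_feature_names_out() is the scikit-learn >= 1.0 name;
# older versions (as used when this notebook was run) call it get_feature_names().
feature_to_coef = {
    word: coef for word, coef in zip(
        cv.get_feature_names_out(), final_model.coef_[0]
    )
}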

In [0]:
for best_positive in sorted(
    feature_to_coef.items(), 
    key=lambda x: x[1], 
    reverse=True)[:5]:
    print (best_positive)
    
# ('excellent', 0.9292549002494034)
# ('perfect', 0.7907005736625977)
# ('great', 0.6745323581303191)
# ('amazing', 0.612703981446081)
# ('superb', 0.6019367936694553)

('excellent', 0.9292549017181694)
('perfect', 0.7907005565370882)
('great', 0.6745323515415729)
('amazing', 0.6127039824916363)
('superb', 0.6019368131550034)


In [0]:
for best_negative in sorted(
    feature_to_coef.items(), 
    key=lambda x: x[1])[:5]:
    print (best_negative)
    
# ('worst', -1.3645958618890326)
# ('waste', -1.166424181103741)
# ('awful', -1.032418905297706)
# ('poorly', -0.8752018666767407)
# ('boring', -0.8563543336107031)

('worst', -1.3645958840794268)
('waste', -1.166424244219479)
('awful', -1.0324190211775237)
('poorly', -0.8752018744646883)
('boring', -0.8563543419889986)
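To read these coefficients: with binary features, the model's log-odds for "positive" are the intercept plus the coefficients of the words present in the review. The sketch below plugs in rounded values from the outputs above and, as a simplifying assumption, pretends every other word in the review has a coefficient of zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Values rounded from the outputs above.
intercept = 0.1576
coef = {"excellent": 0.9293, "worst": -1.3646}

def prob_positive(words_present):
    # Each present word contributes its coefficient once (binary features).
    # Assumption: words not listed in `coef` contribute nothing.
    z = intercept + sum(coef.get(w, 0.0) for w in words_present)
    return sigmoid(z)

p_excellent = prob_positive(["excellent"])
p_worst = prob_positive(["worst"])
p_both = prob_positive(["excellent", "worst"])
print(round(p_excellent, 3), round(p_worst, 3), round(p_both, 3))
```

Because 'worst' has the larger magnitude, a review containing both words still leans negative under this simplified reading.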


### What to do next

<hr>

1. **Text Processing**: Stemming/Lemmatizing to convert different forms of each word into one.
2. **n-grams**: Instead of just single-word tokens (1-gram/unigram) we can also include word pairs.
3. **Representations**: Instead of simple, binary vectors we can use word counts or TF-IDF to transform those counts.
4. **Algorithms**: In addition to Logistic Regression, we’ll see how Support Vector Machines perform.

## Sentiment Analysis with Python (Part 2)

<hr>

**[Improving a Movie Review Sentiment Classifier](https://towardsdatascience.com/sentiment-analysis-with-python-part-2-4f71e7bde59a)**

### Enhance1: Text Processing

We can clean things up further by removing stop words and normalizing the text.
To make these transformations we’ll use libraries from the [Natural Language Toolkit](https://www.nltk.org/) (NLTK).

#### Removing Stop Words

Stop words are the very common words like ‘if’, ‘but’, ‘we’, ‘he’, ‘she’, and ‘they’. 
We can usually remove these words without changing the semantics of a text and doing so often (but not always) improves the performance of a model.
Removing these stop words becomes a lot more useful when we start using longer word sequences as model features (see n-grams below).

In [0]:
from nltk.corpus import stopwords

In [0]:
import nltk

In [0]:
# Needed once per environment; the output below shows the download having run.
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

In [0]:
# A set makes the per-word membership test below much faster than a list.
english_stop_words = set(stopwords.words('english'))

In [0]:
def remove_stop_words(corpus):
    removed_stop_words = []
    for review in corpus:
        removed_stop_words.append(
            ' '.join([word for word in review.split() 
                      if word not in english_stop_words])
        )
    return removed_stop_words

In [0]:
no_stop_words = remove_stop_words(reviews_train_clean)

In [0]:
reviews_train_clean[2]

'brilliant over acting by lesley ann warren best dramatic hobo lady i have ever seen and love scenes in clothes warehouse are second to none the corn on face is a classic as good as anything in blazing saddles the take on lawyers is also superb after being accused of being a turncoat selling out his boss and being dishonest the lawyer of pepto bolt shrugs indifferently im a lawyer he says three funny words jeffrey tambor a favorite from the later larry sanders show is fantastic here too as a mad millionaire who wants to crush the ghetto his character is more malevolent than usual the hospital scene and the scene where the homeless invade a demolition site are all time classics look for the legs scene and the two big diggers fighting one bleeds this movie gets better each time i see it which is quite often'

In [0]:
no_stop_words[2]

'brilliant acting lesley ann warren best dramatic hobo lady ever seen love scenes clothes warehouse second none corn face classic good anything blazing saddles take lawyers also superb accused turncoat selling boss dishonest lawyer pepto bolt shrugs indifferently im lawyer says three funny words jeffrey tambor favorite later larry sanders show fantastic mad millionaire wants crush ghetto character malevolent usual hospital scene scene homeless invade demolition site time classics look legs scene two big diggers fighting one bleeds movie gets better time see quite often'

#### Normalization

A common next step in text preprocessing is to normalize the words in your corpus by trying to convert all of the different forms of a given word into one. 
Two methods that exist for this are _Stemming_ and _Lemmatization_.

##### Stemming

Stemming is considered to be the more crude/brute-force approach to normalization (although this doesn’t necessarily mean that it will perform worse). 
There are several algorithms, but in general they all use basic rules to chop off the ends of words.

NLTK has several stemming algorithm implementations. We’ll use the Porter stemmer here but you can explore all of the options with examples here: [NLTK Stemmers](http://www.nltk.org/howto/stem.html)

In [0]:
def get_stemmed_text(corpus):
    from nltk.stem.porter import PorterStemmer
    stemmer = PorterStemmer()
    return [' '.join([stemmer.stem(word) for word in review.split()]) for review in corpus]

In [0]:
stemmed_reviews = get_stemmed_text(reviews_train_clean)

In [0]:
stemmed_reviews[2]

'brilliant over act by lesley ann warren best dramat hobo ladi i have ever seen and love scene in cloth warehous are second to none the corn on face is a classic as good as anyth in blaze saddl the take on lawyer is also superb after be accus of be a turncoat sell out hi boss and be dishonest the lawyer of pepto bolt shrug indiffer im a lawyer he say three funni word jeffrey tambor a favorit from the later larri sander show is fantast here too as a mad millionair who want to crush the ghetto hi charact is more malevol than usual the hospit scene and the scene where the homeless invad a demolit site are all time classic look for the leg scene and the two big digger fight one bleed thi movi get better each time i see it which is quit often'
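To illustrate the "chop off the ends of words" idea, here is a deliberately crude suffix stripper. It is not the Porter algorithm (which applies several ordered rule phases with conditions on the remaining stem); the suffix list and the `min_stem` threshold are arbitrary choices made for this sketch.

```python
SUFFIXES = ("ing", "ed", "ly", "es", "s")  # arbitrary illustrative list

def crude_stem(word, min_stem=3):
    # Strip the first matching suffix, but only if enough of the word remains.
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word

stems = [crude_stem(w) for w in ["fighting", "wanted", "lawyers", "is", "classics"]]
print(stems)
```

Even this toy version shows why stemmed output can contain non-words: the rules look only at spelling, not at meaning.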

##### Lemmatization

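Lemmatization works by identifying the part-of-speech of a given word and then applying more complex rules to transform the word into its true root. Note that `WordNetLemmatizer.lemmatize` treats every word as a noun unless a `pos` argument is passed, so the function below, which calls it without part-of-speech tags, mostly normalizes plural nouns.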

In [0]:
def get_lemmatized_text(corpus):
    from nltk.stem import WordNetLemmatizer
    lemmatizer = WordNetLemmatizer()
    return [' '.join([lemmatizer.lemmatize(word) for word in review.split()]) for review in corpus]

In [0]:
import nltk
# Needed once per environment; the output below shows the download having run.
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Unzipping corpora/wordnet.zip.


True

In [0]:
lemmatized_reviews = get_lemmatized_text(reviews_train_clean)

### Enhance2: n-grams

We can potentially add more predictive power to our model by adding two or three word sequences (bigrams or trigrams) as well. 
The scikit-learn library makes this really easy to play around with. Just use the ngram_range argument with any of the ‘Vectorizer’ classes.

In [0]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [0]:
ngram_vectorizer = CountVectorizer(binary=True, ngram_range=(1, 2))
ngram_vectorizer.fit(reviews_train_clean)
X = ngram_vectorizer.transform(reviews_train_clean)
X_test = ngram_vectorizer.transform(reviews_test_clean)

X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75
)



In [0]:
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    
    lr = LogisticRegression(C=c)
    lr.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, lr.predict(X_val))))
    
# Accuracy for C=0.01: 0.8776
# Accuracy for C=0.05: 0.88576
# Accuracy for C=0.25: 0.88816
# Accuracy for C=0.5: 0.88816
# Accuracy for C=1: 0.88816



Accuracy for C=0.01: 0.88128
Accuracy for C=0.05: 0.8904
Accuracy for C=0.25: 0.89088
Accuracy for C=0.5: 0.89184
Accuracy for C=1: 0.89168


In [0]:
final_ngram = LogisticRegression(C=0.5)
final_ngram.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final_ngram.predict(X_test)))

# Final Accuracy: 0.8976



Final Accuracy: 0.8976


Getting pretty close to 90%! So, simply considering 2-word sequences in addition to single words increased our accuracy by more than 1.6 percentage points.

**Note**: There’s technically no limit on the size that n can be for your model, but there are several things to consider. 
First, increasing the number of grams will not necessarily give you better performance. 
Second, the number of distinct n-grams, and with it the number of columns in your matrix, can grow very quickly as you increment n, so if you have a large corpus that consists of large documents your model may take a very long time to train.
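A hand-rolled sketch of what an n-gram range enumerates, and of how the feature count grows with n, on one toy sentence. The helper `word_ngrams` is hypothetical and splits on whitespace only; it is not scikit-learn's tokenizer.

```python
def word_ngrams(tokens, n_min, n_max):
    """Return all contiguous word sequences of length n_min..n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

tokens = "not a good movie".split()
unigrams_and_bigrams = word_ngrams(tokens, 1, 2)
print(unigrams_and_bigrams)

# Number of distinct features for n up to 1, 2 and 3.
sizes = {n_max: len(set(word_ngrams(tokens, 1, n_max))) for n_max in (1, 2, 3)}
print(sizes)
```

The bigram "not a" and trigram "not a good" are the kind of features a unigram model cannot see, which is where the extra predictive power can come from.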

### Enhance3: Representations

There are ways that we can encode more information into the vector.

#### Word Counts

Instead of simply noting whether a word appears in the review or not, we can include the number of times a given word appears. This can give our sentiment classifier a lot more predictive power. 
For example, if a movie reviewer says ‘amazing’ or ‘terrible’ multiple times in a review it is considerably more probable that the review is positive or negative, respectively.
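As a minimal sketch of the difference, the snippet below contrasts raw counts with binary flags for one invented review, using only the standard library. It corresponds to the `binary=False` versus `binary=True` setting on the vectorizer.

```python
from collections import Counter

review = "amazing cast amazing story amazing ending"  # hypothetical review text

counts = Counter(review.split())        # word-count representation
binary = {word: 1 for word in counts}   # binary representation

print(counts["amazing"], binary["amazing"])
```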

In [0]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [0]:
wc_vectorizer = CountVectorizer(binary=False, ngram_range=(1, 2))
wc_vectorizer.fit(reviews_train_clean)
X = wc_vectorizer.transform(reviews_train_clean)
X_test = wc_vectorizer.transform(reviews_test_clean)

In [0]:
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75, 
)



In [0]:
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    
    lr = LogisticRegression(C=c)
    lr.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, lr.predict(X_val))))
    
# Accuracy for C=0.01: 0.8832
# Accuracy for C=0.05: 0.89168
# Accuracy for C=0.25: 0.88704
# Accuracy for C=0.5: 0.88288
# Accuracy for C=1: 0.88176



Accuracy for C=0.01: 0.79264
Accuracy for C=0.05: 0.82352
Accuracy for C=0.25: 0.85936
Accuracy for C=0.5: 0.87248
Accuracy for C=1: 0.88496


In [0]:
final_wc = LogisticRegression(C=1)
final_wc.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final_wc.predict(X_test)))

# Final Accuracy: 0.8822



Final Accuracy: 0.8824


#### TF-IDF


Another common way to represent each document in a corpus is to use the [tf-idf statistic](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) (term frequency-inverse document frequency) for each word, which is a weighting factor that we can use in place of binary or word count representations.

There are several ways to do the tf-idf transformation, but in a nutshell, tf-idf aims to represent the number of times a given word appears in a document (a movie review in our case) relative to the number of documents in the corpus that the word appears in, where words that appear in many documents have a value closer to zero and words that appear in fewer documents have values closer to 1.

**Note**: Now that we’ve gone over n-grams, when I refer to ‘words’ I really mean any n-gram (sequence of words) if the model is using an n greater than one.
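The sketch below computes one common tf-idf variant by hand on three toy documents: raw term count times a smoothed idf, `ln((1 + n) / (1 + df)) + 1`, followed by L2 normalization of each document vector. To my understanding this matches the `TfidfVectorizer` defaults, but treat that as an assumption; the toy documents and the helper `tfidf` are invented for illustration.

```python
import math
from collections import Counter

docs = [
    "great movie great cast".split(),
    "boring movie".split(),
    "great fun movie".split(),
]
n_docs = len(docs)

# Document frequency: in how many documents each word appears.
df = Counter(word for doc in docs for word in set(doc))

def tfidf(doc):
    tf = Counter(doc)
    # Term count times smoothed idf, then scale the vector to unit length.
    raw = {w: tf[w] * (math.log((1 + n_docs) / (1 + df[w])) + 1) for w in tf}
    norm = math.sqrt(sum(v * v for v in raw.values()))
    return {w: v / norm for w, v in raw.items()}

weights = tfidf(docs[1])
print(weights)
```

In the second document, "movie" appears in every document and "boring" in only one, so "boring" ends up with the larger weight.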

In [0]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [0]:
tfidf_vectorizer = TfidfVectorizer()
tfidf_vectorizer.fit(reviews_train_clean)
X = tfidf_vectorizer.transform(reviews_train_clean)
X_test = tfidf_vectorizer.transform(reviews_test_clean)

In [0]:
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75
)



In [0]:
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    
    lr = LogisticRegression(C=c)
    lr.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, lr.predict(X_val))))

# Accuracy for C=0.01: 0.79632
# Accuracy for C=0.05: 0.83168
# Accuracy for C=0.25: 0.86768
# Accuracy for C=0.5: 0.8736
# Accuracy for C=1: 0.88432



Accuracy for C=0.01: 0.79264
Accuracy for C=0.05: 0.82352
Accuracy for C=0.25: 0.85936
Accuracy for C=0.5: 0.87248
Accuracy for C=1: 0.88496


In [0]:
final_tfidf = LogisticRegression(C=1)
final_tfidf.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final_tfidf.predict(X_test)))

# Final Accuracy: 0.882



Final Accuracy: 0.8824


### Enhance4: Algorithms

So far we’ve chosen to represent each review as a very sparse vector (lots of zeros!). Linear classifiers typically perform better than other algorithms on data that is represented in this way.



#### Support Vector Machines (SVM)

Another algorithm that can produce great results with a quick training time is the Support Vector Machine with a linear kernel.

In [0]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

In [0]:
ngram_vectorizer = CountVectorizer(binary=True, ngram_range=(1, 2))
ngram_vectorizer.fit(reviews_train_clean)
X = ngram_vectorizer.transform(reviews_train_clean)
X_test = ngram_vectorizer.transform(reviews_test_clean)

In [0]:
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75
)



In [0]:
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    
    svm = LinearSVC(C=c)
    svm.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, svm.predict(X_val))))
    
# Accuracy for C=0.01: 0.89104
# Accuracy for C=0.05: 0.88736
# Accuracy for C=0.25: 0.8856
# Accuracy for C=0.5: 0.88608
# Accuracy for C=1: 0.88592

Accuracy for C=0.01: 0.89344
Accuracy for C=0.05: 0.89376
Accuracy for C=0.25: 0.89248
Accuracy for C=0.5: 0.89216
Accuracy for C=1: 0.892




In [0]:
final_svm_ngram = LinearSVC(C=0.01)
final_svm_ngram.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final_svm_ngram.predict(X_test)))

# Final Accuracy: 0.8974

Final Accuracy: 0.89708


There are many great explanations of Support Vector Machines that do a much better job of explaining how they work than this notebook does.
If you’re interested in learning more, this is a great tutorial:

https://blog.statsbot.co/support-vector-machines-tutorial-c1618e635e93

## Final Model

<hr>

The goal of this tutorial was to give you a toolbox of things to try and mix together when trying to find the right model + data transformation for your project. 

We found that removing a small set of stop words, along with an n-gram range from 1 to 3 and a linear support vector classifier, gave the best results.

In [0]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.svm import LinearSVC

In [0]:
stop_words = ['in', 'of', 'at', 'a', 'the']
ngram_vectorizer = CountVectorizer(binary=True, ngram_range=(1, 3), stop_words=stop_words)
ngram_vectorizer.fit(reviews_train_clean)
X = ngram_vectorizer.transform(reviews_train_clean)
X_test = ngram_vectorizer.transform(reviews_test_clean)
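A plain-Python sketch of how the stop-word list and the n-gram range interact. As far as I know, `CountVectorizer` drops stop words from the token stream before it forms n-grams, so words on either side of a removed stop word become neighbours; the helper `features` below mimics that order and is hypothetical, not the library's code.

```python
STOP_WORDS = {"in", "of", "at", "a", "the"}  # same small list as the cell above

def features(text, n_max=3):
    # Drop stop words first, then build 1..n_max-grams from what is left.
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]

feats = features("the end of the film")
print(feats)
```

Keeping the stop list this short leaves words such as "not" in place, so negations can still show up inside bigrams and trigrams.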

In [0]:
X_train, X_val, y_train, y_val = train_test_split(
    X, target, train_size = 0.75
)



In [0]:
for c in [0.001, 0.005, 0.01, 0.05, 0.1]:
    
    svm = LinearSVC(C=c)
    svm.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s" 
           % (c, accuracy_score(y_val, svm.predict(X_val))))
    
# Accuracy for C=0.001: 0.88784
# Accuracy for C=0.005: 0.89456
# Accuracy for C=0.01: 0.89376
# Accuracy for C=0.05: 0.89264
# Accuracy for C=0.1: 0.8928

Accuracy for C=0.001: 0.88624
Accuracy for C=0.005: 0.8912
Accuracy for C=0.01: 0.89264
Accuracy for C=0.05: 0.89344
Accuracy for C=0.1: 0.89392


In [0]:
final = LinearSVC(C=0.01)
final.fit(X, target)
print ("Final Accuracy: %s" 
       % accuracy_score(target, final.predict(X_test)))

# Final Accuracy: 0.90064

Final Accuracy: 0.90024


## Summary

We’ve gone over several options for transforming text that can improve the accuracy of an NLP model. Which combination of these techniques will yield the best results will depend on the task, data representation, and algorithms you choose. 

It’s always a good idea to try out many different combinations to see what works.

## What to do next

Explore deep learning approaches to building a sentiment classifier.