## Greedy Sentiment Transformation
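11/13/16 - Flip the sentiment of a review with a greedy switching method: each adjective or adverb whose sentiment matches the review's polarity is replaced by an antonym, or prefixed with "not" when no antonym is found.
Uses the IMDB Large Movie Review dataset folder (http://ai.stanford.edu/~amaas/data/sentiment/).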

In [1]:
from sentiment_utils import *
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import word_tokenize
from nltk.corpus import wordnet as wn



In [27]:
def greedy_transform_func(filename, review, score):
    """Flip sentiment-bearing adjectives and adverbs in a single pass over the review."""
    sentiment_analyzer = SentimentIntensityAnalyzer()
    tagged_review = nltk.pos_tag(word_tokenize(review))
    transformed_review = []
    for word, tag in tagged_review:
        if tag in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
            # 1) Score the sentiment of the word on its own.
            # 2) If it has the same sign as the review polarity (score - 5),
            #    get the antonyms of the word.
            # 3) Open question: score each antonym and keep the best, or pick one
            #    at random? For now the first antonym returned is used.
            # 4) With no antonym available, fall back to "not <word>".
            word_sentiment = sentiment_analyzer.polarity_scores(word)['compound']
            if word_sentiment*(score - 5) > 0:
                antonyms = get_antonyms(word, tag)
                if len(antonyms) == 0:
                    transformed_review.append('not ' + word)
                else:
                    transformed_review.append(antonyms[0])
            else:
                transformed_review.append(word)
        else:
            transformed_review.append(word)
    return " ".join(transformed_review)

def get_antonyms(word, word_pos):
    """Return the distinct WordNet antonyms of `word` for a Penn Treebank adjective/adverb tag."""
    pos_dict = {'JJ': wn.ADJ, 'JJR': wn.ADJ, 'JJS': wn.ADJ, 'RB': wn.ADV, 'RBR': wn.ADV, 'RBS': wn.ADV}
    wn_pos = pos_dict[word_pos]
    all_antonyms = []
    for syn in wn.synsets(word, pos=wn_pos):
        for lemma in syn.lemmas():
            if lemma.antonyms():
                all_antonyms.append(lemma.antonyms()[0].name())
    # set() removes duplicates but does not preserve order
    antonyms = list(set(all_antonyms))
    return antonyms
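
The greedy function takes the first antonym that `get_antonyms` returns, and because that list is built from a set, which antonym comes first is arbitrary. One possible alternative, sketched here under assumptions, is to score every candidate and keep the one that pushes hardest against the review. `pick_antonym`, `score_fn` and `toy_scores` are hypothetical names that are not part of this notebook, and the numbers in `toy_scores` are made up for illustration; in practice `score_fn` could wrap VADER's compound score.

```python
def pick_antonym(antonyms, score_fn, review_score):
    """Return the candidate whose sentiment most opposes the review polarity.

    antonyms:     list of candidate words, e.g. the output of get_antonyms
    score_fn:     callable mapping a word to a sentiment value (hypothetical
                  stand-in for a VADER compound score)
    review_score: rating of the review, assumed to be on a 1-10 scale
    """
    if not antonyms:
        return None
    direction = 1 if review_score > 5 else -1
    # Sorting first makes ties resolve alphabetically, so the result is
    # reproducible from run to run.
    return min(sorted(antonyms), key=lambda w: direction * score_fn(w))


# Made-up sentiment values, for illustration only.
toy_scores = {'bad': -0.5, 'awful': -0.5, 'ordinary': 0.0, 'good': 0.4, 'great': 0.6}

best_for_positive = pick_antonym(['ordinary', 'bad', 'awful'], toy_scores.get, 8)
best_for_negative = pick_antonym(['good', 'great'], toy_scores.get, 2)
```

For a positive review the most negative candidate wins, and for a negative review the most positive one. Whether this reads better than a random or first-found antonym would need to be checked on real reviews.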

# Greedy Test: One Review

In [29]:
for i, (filename, review, score) in enumerate(imdb_sentiment_reader(dataset_type='val', sentiment='pos')):
    # Only look at the fifth positive validation review
    if i != 4:
        continue
    print("Original review: ")
    print(review)
    print("Transformed review:")
    transformed = greedy_transform_func(filename, review, score)
    print(transformed)
    break

Original review: 
This movie was sadly under-promoted but proved to be truly exceptional. Entering the theatre I knew nothing about the film except that a friend wanted to see it.<br /><br />I was caught off guard with the high quality of the film. I couldn't image Ashton Kutcher in a serious role, but his performance truly exemplified his character. This movie is exceptional and deserves our monetary support, unlike so many other movies. It does not come lightly for me to recommend any movie, but in this case I highly recommend that everyone see it.<br /><br />This films is Truly Exceptional!
Transformed review:
This movie was sadly under-promoted but proved to be not truly exceptional . Entering the theatre I knew nothing about the film except that a friend wanted to see it. < br / > < br / > I was caught off guard with the high quality of the film . I could n't image Ashton Kutcher in a serious role , but his performance insincerely exemplified his character . This movie is exceptio
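
The transformed output above still carries `< br / >` fragments and split contractions such as `could n't`, because the raw HTML breaks are tokenized along with the text and the tokens are re-joined with single spaces. A minimal cleanup sketch, using only the standard `re` module, could look like the following. `strip_html_breaks` and `detokenize` are hypothetical helpers that are not part of `sentiment_utils`, and the detokenizer is a rough heuristic rather than an exact inverse of `word_tokenize`.

```python
import re


def strip_html_breaks(review):
    """Replace each run of <br>, <br/> or <br /> tags with a single space."""
    return re.sub(r'(?:<br\s*/?>\s*)+', ' ', review)


def detokenize(tokens):
    """Join tokens, reattaching punctuation and contractions to the previous word."""
    text = " ".join(tokens)
    text = re.sub(r" ([.,!?;:])", r"\1", text)   # "film ." -> "film."
    text = re.sub(r" (n't|'\w+)", r"\1", text)   # "could n't" -> "couldn't"
    return text


cleaned = strip_html_breaks("see it.<br /><br />I was caught off guard")
rejoined = detokenize(["I", "could", "n't", "image", "it", "."])
```

Calling `strip_html_breaks` on the review before tokenizing, and `detokenize` in place of `" ".join(...)` at the end, would be one way to wire these in.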

# Rule-Based

In [80]:
def newWord(word):
    ant = get_antonyms(word[0], word[1])
    if len(ant) == 0:
        return "not " + word[0]
    else:
        return ant[0]


def rule_based_trans_func(filename, review, score):
    sentiment_analyzer = SentimentIntensityAnalyzer()
    tagged_review = nltk.pos_tag(word_tokenize(review))
    transformed_review = []

    adjectives = ['JJ', 'JJR', 'JJS']
    adverbs = ['RB', 'RBR', 'RBS']
    modifiers = adjectives + adverbs

    def matches_review(word):
        # True when the word's sentiment has the same sign as the review polarity
        return sentiment_analyzer.polarity_scores(word[0])['compound']*(score - 5) > 0

    i = 0
    appended = False  # True when word1 was already emitted as the previous pair's word2

    while i < (len(tagged_review) - 1):
        word1 = tagged_review[i]
        word2 = tagged_review[i+1]
        flip1 = matches_review(word1)
        flip2 = matches_review(word2)

        if word1[1] in modifiers or word2[1] in modifiers:

            # (adverb, adj/adv) special case
            if word1[1] in adverbs and word2[1] in modifiers:
                if flip1 and not flip2:
                    w1, w2 = newWord(word1), word2[0]
                elif not flip1 and flip2:
                    w1, w2 = word1[0], newWord(word2)
                else:
                    # both words match the review polarity, or neither does
                    w1, w2 = newWord(word1), newWord(word2)
                if not appended: transformed_review.append(w1)
                transformed_review.append(w2)

            # drop a leading negation
            elif word1[0].lower() in ("not", "never"):
                transformed_review.append(word2[0])

            # final special case: "but"/"yet" becomes "and" and the modifier after it is flipped
            # (word2 must be a modifier, otherwise get_antonyms has no POS mapping for it)
            elif word1[0].lower() in ("but", "yet") and word2[1] in modifiers:
                if not appended: transformed_review.append("and")
                transformed_review.append(newWord(word2))

            else:
                w1 = word1[0]
                w2 = word2[0]
                if flip1 and word1[1] in modifiers:
                    w1 = newWord(word1)
                if flip2 and word2[1] in modifiers:
                    w2 = newWord(word2)
                transformed_review.append(w1)
                transformed_review.append(w2)

            # If word2 is an adverb followed by an adjective, advance by one token so
            # (word2, word3) goes through the adverb special case on the next pass.
            # The bounds check avoids an IndexError on the last pair.
            has_next = i + 2 < len(tagged_review)
            if has_next and word2[1] in adverbs and tagged_review[i+2][1] in adjectives and flip1:
                i += 1
                appended = True
            else:
                i += 2
                appended = False

        else:
            transformed_review.append(word1[0])
            transformed_review.append(word2[0])
            i += 2
            appended = False

    # A single token left over after the pairwise scan is copied unchanged
    if i == len(tagged_review) - 1:
        transformed_review.append(tagged_review[i][0])

    return " ".join(transformed_review)
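
Both transformation functions decide whether a word should be flipped with the same sign test, `sentiment*(score - 5) > 0`. The sketch below names that test so its behavior is easier to see in isolation. `agrees_with_review` and its `midpoint` parameter are hypothetical and not part of the notebook; the sketch assumes `score` is the 1-10 rating attached to each IMDB review, so that `score - 5` is positive for positive reviews and negative for negative ones.

```python
def agrees_with_review(word_sentiment, review_score, midpoint=5):
    """Return True when a word's sentiment has the same sign as the review polarity.

    word_sentiment: sentiment of a single word, e.g. a VADER compound score
    review_score:   rating of the review (assumed to be on a 1-10 scale)
    midpoint:       rating treated as neutral (assumption: 5, as in the notebook)
    """
    return word_sentiment * (review_score - midpoint) > 0


# A positive word in a positive review agrees with it, so it is a flip candidate.
positive_in_positive = agrees_with_review(0.5859, 8)
# A neutral word (sentiment 0.0) never agrees, so it is left alone.
neutral_in_positive = agrees_with_review(0.0, 8)
# A negative word in a negative review also agrees with it.
negative_in_negative = agrees_with_review(-0.4, 2)
```

One consequence of the strict inequality is that words with a sentiment of exactly 0.0, and reviews rated exactly at the midpoint, are never selected for flipping.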

In [82]:
for i, (filename, review, score) in enumerate(imdb_sentiment_reader(dataset_type='val', sentiment='pos')):
    # Skip the first review and any review longer than 900 characters
    if i == 0 or len(review) > 900:
        continue
    print("Original review: ")
    print(review)
    print("Transformed review:")
    transformed = rule_based_trans_func(filename, review, score)
    print(transformed)
    break

Original review: 
I went and saw this movie last night after being coaxed to by a few friends of mine. I'll admit that I was reluctant to see it because from what I knew of Ashton Kutcher he was only able to do comedy. I was wrong. Kutcher played the character of Jake Fischer very well, and Kevin Costner played Ben Randall with such professionalism. The sign of a good movie is that it can toy with our emotions. This one did exactly that. The entire theater (which was sold out) was overcome by laughter during the first half of the movie, and were moved to tears during the second half. While exiting the theater I not only saw many women in tears, but many full grown men as well, trying desperately not to let anyone see them crying. This movie was great, and I suggest that you go see it before you judge.
Transformed review:
I went and saw this movie last night after being coaxed to by a few friends of mine . I 'll admit that I was reluctant to see it because from what I knew of Ashton Kut

In [76]:
sentiment_analyzer = SentimentIntensityAnalyzer()
sentiment_analyzer.polarity_scores("Brilliant")['compound']


0.5859

In [72]:
print(transformed)

Brilliant and moving performances by Tom Courtenay and Peter Finch
