In [1]:
from multiprocessing import Pool
import wss_extract

# Example

In this notebook we will show how to use the sentiment analysis systems within the wss_extract package.

The sentiment systems that currently work:
1. SentiStrength
2. Text Analysis Online
3. Text Processing

The sentiment systems that do not work and might never work:
1. Christopher Potts, which contains:
  * WordNet
  * Opinion Lexicon
  * IMDB
  * SentiWordNet
  * MPQA
2. Repustate
  
## How to use the working systems

We demonstrate this by running the systems over a number of test sentences:

In [2]:
test_sentences = '''vending machine guy gave me free chips, what a nice guy
vending machine guy gave me free chips, what a nice guy!!!
vending machine guy gave me free chips, what a nice guy!!!!!!
VENDING MACHINE GUY GAVE ME FREE CHIPS, WHAT A NICE GUY
vending machine guy gave me free chips, what a NICE guy
vending machine guy gave me free chips, what a niiiiiiice guy
vending machine guy gave me free chips, what a nice guy :-)
vending machine guy gave me free chips, what a nice guy :-('''
test_sentences = test_sentences.split('\n')
print(test_sentences)

['vending machine guy gave me free chips, what a nice guy', 'vending machine guy gave me free chips, what a nice guy!!!', 'vending machine guy gave me free chips, what a nice guy!!!!!!', 'VENDING MACHINE GUY GAVE ME FREE CHIPS, WHAT A NICE GUY', 'vending machine guy gave me free chips, what a NICE guy', 'vending machine guy gave me free chips, what a niiiiiiice guy', 'vending machine guy gave me free chips, what a nice guy :-)', 'vending machine guy gave me free chips, what a nice guy :-(']


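Below we define how each system gets the sentiment of a sentence. The function takes the system's class rather than an instance, so the system is created inside the process that runs it: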

In [3]:
def process_sentence(sentiment_system, sentence):
    return sentiment_system().sentiment(sentence)

Below we show how to run the systems with multiprocessing, as they can take a long time to run due to the timeouts that are set. We do not remove the timeouts; we just use a separate process for each sentiment system so that the systems run in parallel for each sentence.

In [4]:
working_systems = [wss_extract.SentiStrength,
                   wss_extract.TextAnalysisOnline, wss_extract.TextProcessing]
number_systems = len(working_systems)
sentiment_output = []
# One worker process per sentiment system; the pool is created once and reused.
with Pool(number_systems) as pool:
    for test_sentence in test_sentences:
        # Pair every system with the same sentence: [(system, sentence), ...]
        multi_process_input = [(system, test_sentence)
                               for system in working_systems]
        sentiment_output.append(pool.starmap(process_sentence, multi_process_input))
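The call to `pool.starmap` above unpacks each `(system, sentence)` pair into the two arguments of `process_sentence`. As a rough illustration of that unpacking, the sketch below does the same thing serially with `itertools.starmap`. The classes `AlwaysPositive` and `LengthScore` are hypothetical stand-ins invented for this example, not part of `wss_extract`; they only mimic the `sentiment` method so that the sketch runs without network access.

```python
from itertools import starmap


class AlwaysPositive:
    """Hypothetical stand-in system: labels every sentence 'pos'."""

    def sentiment(self, sentence):
        return 'pos'


class LengthScore:
    """Hypothetical stand-in system: returns the sentence length."""

    def sentiment(self, sentence):
        return len(sentence)


def process_sentence(sentiment_system, sentence):
    # Same shape as the notebook function: build the system, then score.
    return sentiment_system().sentiment(sentence)


stand_in_systems = [AlwaysPositive, LengthScore]
sentence = 'nice guy'

# Each tuple is unpacked into process_sentence(system, sentence).
pairs = [(system, sentence) for system in stand_in_systems]
results = list(starmap(process_sentence, pairs))
print(results)
```

The results come back in the same order as the input pairs, which is also true of `Pool.starmap`. That ordering is why each position in an output list can be tied to a specific system.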

In [5]:
print(sentiment_output)

[[1, 0.5, 'pos'], [1, 0.7, 'pos'], [1, 0.7, 'pos'], [1, 0.5, 'pos'], [1, 0.5, 'pos'], [1, 0.4, 'neg'], [1, 0.5, 'pos'], [1, 0.083333, 'pos']]


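As we can see above, there is one list per test sentence, and each list holds three sentiment values: the first comes from `SentiStrength`, the second from `TextAnalysisOnline`, and the third from `TextProcessing`. Note that each system reports sentiment in its own format (an integer, a float, and a string label respectively), so the values are not directly comparable.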

We can now merge these lists with their corresponding test sentences:

In [6]:
print(list(zip(test_sentences, sentiment_output)))

[('vending machine guy gave me free chips, what a nice guy', [1, 0.5, 'pos']), ('vending machine guy gave me free chips, what a nice guy!!!', [1, 0.7, 'pos']), ('vending machine guy gave me free chips, what a nice guy!!!!!!', [1, 0.7, 'pos']), ('VENDING MACHINE GUY GAVE ME FREE CHIPS, WHAT A NICE GUY', [1, 0.5, 'pos']), ('vending machine guy gave me free chips, what a NICE guy', [1, 0.5, 'pos']), ('vending machine guy gave me free chips, what a niiiiiiice guy', [1, 0.4, 'neg']), ('vending machine guy gave me free chips, what a nice guy :-)', [1, 0.5, 'pos']), ('vending machine guy gave me free chips, what a nice guy :-(', [1, 0.083333, 'pos'])]
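Positional lists like the ones above are easy to misread, so it may help to label each value with the system that produced it. The following is a minimal sketch, assuming the order of values always matches the order of `working_systems`. The names `system_names` and `labelled_output` are introduced here for illustration, and only the first two rows of the output are copied in so that the example runs on its own.

```python
# Names listed in the same order as `working_systems` in the notebook.
system_names = ['SentiStrength', 'TextAnalysisOnline', 'TextProcessing']

# First two sentences and their scores, copied from the output above.
test_sentences = [
    'vending machine guy gave me free chips, what a nice guy',
    'vending machine guy gave me free chips, what a nice guy!!!',
]
sentiment_output = [[1, 0.5, 'pos'], [1, 0.7, 'pos']]

# Map each sentence to a {system name: value} dictionary.
labelled_output = {
    sentence: dict(zip(system_names, scores))
    for sentence, scores in zip(test_sentences, sentiment_output)
}

for sentence, scores in labelled_output.items():
    print(sentence, scores)
```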


A sentiment system can also be used on its own, without multiprocessing:

In [8]:
senti_strength = wss_extract.SentiStrength()
print(senti_strength.sentiment('vending machine guy gave me free chips, what a nice guy'))

1


## What happens when you use a non-working system

The cells below show the error you get when a system does not work:

In [9]:
wordnet_system = wss_extract.ChrisPotts('wordnet')

In [10]:
wordnet_system.sentiment('vending machine guy gave me free chips, what a nice guy')

Reppy cache fetch error on http://sentiment.christopherpotts.net/robots.txt
Traceback (most recent call last):
  File "/home/andrew/miniconda/envs/wss/lib/python3.7/site-packages/reppy/cache/__init__.py", line 66, in factory
    return self.fetch(url)
  File "/home/andrew/miniconda/envs/wss/lib/python3.7/site-packages/reppy/cache/__init__.py", line 109, in fetch
    url, ttl_policy=self.ttl_policy, *self.args, **self.kwargs)
  File "reppy/robots.pyx", line 100, in reppy.robots.FetchMethod
  File "reppy/robots.pyx", line 123, in reppy.robots.FetchMethod
reppy.exceptions.BadStatusCode: ('Got 500 for http://sentiment.christopherpotts.net/robots.txt', 500)


URLError: <urlopen error Bad web page. Status code 500>
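Since the failure surfaces as a `URLError`, one way to keep a batch run going is to catch that exception and record a missing value instead. The sketch below is only an illustration of that idea: `safe_sentiment` is a hypothetical helper, not part of `wss_extract`, and `UnavailableSystem` is a made-up stand-in that raises the same kind of error so that the example runs without contacting any website.

```python
from urllib.error import URLError


class UnavailableSystem:
    """Hypothetical stand-in for a system whose website is down."""

    def sentiment(self, sentence):
        raise URLError('Bad web page. Status code 500')


def safe_sentiment(system, sentence):
    """Return the system's sentiment, or None if the website fails."""
    try:
        return system.sentiment(sentence)
    except URLError as error:
        # Report which system failed and why, then carry on.
        print(f'Skipping {type(system).__name__}: {error.reason}')
        return None


result = safe_sentiment(
    UnavailableSystem(),
    'vending machine guy gave me free chips, what a nice guy')
print(result)
```

Returning `None` keeps the output aligned with the input sentences, at the cost of having to filter out the missing values later.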