
IBMPredictiveAnalytics/TwitterBlueMix

 
 


Receive Twitter Decahose Data

With this extension for IBM SPSS Modeler, you can retrieve Twitter Decahose data using powerful queries. The Decahose is a random 10% sample of all Tweets, which is sufficient for most use cases. This node uses the Insights for Twitter service on BlueMix.


Quickstart

  1. You need to set up the Insights for Twitter BlueMix Service here.

  2. If you signed up for the Entry Plan of the service, go to your Services --> Insights for Twitter --> Service Credentials --> View Credentials and copy the 'url' from there.

  3. Download and install the TwitterBlueMix.mpe file in SPSS Modeler or search for 'TwitterBlueMix' in the Predictive Extensions Hub (click here for detailed instructions).

  4. Download the example stream and paste your URL from step 2 into the TwitterBlueMix Node to get started.

UI

The user interface gives you the ability to automatically generate straightforward queries.
You always need to paste your BlueMix Insights for Twitter service URL. Enter the number of Tweets you want to receive in the respective field, or type 0 to receive all Tweets available via the Twitter Decahose. The other three boxes let you select specific search terms, specific authors, and/or specific date ranges. If you need more advanced queries than the UI provides, check only 'Create Custom Query' and paste a more complex query string that matches the pattern explained here: http://ibm.biz/TwitterQuery.
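As a rough illustration of what a custom query string might look like, the sketch below combines search terms, authors, and a date range into one string. The helper names `build_query` and `_group` are hypothetical, and the operator spellings (`OR`, `from:`, `posted:start,end`) are assumptions that should be checked against the pattern documented at http://ibm.biz/TwitterQuery; this is not the code the node itself runs.

```python
def build_query(terms=(), authors=(), date_range=None):
    """Assemble a query string from the three kinds of filter the UI offers.

    ASSUMPTION: the operators used here (OR, from:, posted:) are not taken
    from this README; verify them at http://ibm.biz/TwitterQuery.
    """
    parts = []
    if terms:
        # Quote multi-word terms so they are kept together as one phrase.
        quoted = ['"%s"' % t if " " in t else t for t in terms]
        parts.append(_group(quoted))
    if authors:
        # Accept handles with or without a leading '@'.
        parts.append(_group(["from:" + a.lstrip("@") for a in authors]))
    if date_range:
        start, end = date_range
        parts.append("posted:%s,%s" % (start, end))
    return " ".join(parts)


def _group(items):
    """Join alternatives with OR; parenthesize only when there are several."""
    return items[0] if len(items) == 1 else "(" + " OR ".join(items) + ")"


if __name__ == "__main__":
    print(build_query(["spss", "predictive analytics"], ["@ibm"],
                      ("2016-01-01", "2016-02-01")))
```

A string built this way would go into the custom query field with 'Create Custom Query' checked.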

Output

The output is always a list of Tweets with the following information:
"author", "gender", "sentimentPolarity", "verb", "postedTime", "generatorDisplayName", "link", "body", "favoritesCount", "twitter_filter_level", "twitter_lang", "retweetCount", "longitude", "latitude", "country", "city", "state"

Some of the fields like gender or the geo locations are only filled if known.
The sentimentPolarity field is pre-scored for you by Twitter. Feel free to compare it with your own sentiment analysis (e.g., I have seen that it does not recognize negations such as 'not good').
The body field contains the Tweet itself.
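The following minimal sketch shows one way to work with these fields once the Tweets are available as Python dictionaries keyed by the field names above. It skips Tweets without coordinates and flags Tweets where a toy scorer disagrees with sentimentPolarity. The helper names, the tiny word lists, the sample rows, and the label values ("POSITIVE", "NEGATIVE", "NEUTRAL") are all assumptions made for illustration, not part of the extension.

```python
def has_location(tweet):
    """True when both coordinates are filled (geo fields are optional)."""
    return all(tweet.get(k) not in (None, "") for k in ("longitude", "latitude"))


def naive_polarity(text):
    """Toy lexicon scorer; a negation word flips only the next word."""
    positive = {"good", "great", "love"}
    negative = {"bad", "awful", "hate"}
    score = 0
    negate = False
    for word in text.lower().split():
        word = word.strip(".,!?'\"")
        if word in ("not", "no", "never"):
            negate = True
            continue
        value = (word in positive) - (word in negative)
        score += -value if negate else value
        negate = False
    # ASSUMPTION: label spellings are illustrative, not confirmed by the README.
    return "POSITIVE" if score > 0 else "NEGATIVE" if score < 0 else "NEUTRAL"


def disagreements(tweets):
    """Tweets whose pre-scored polarity differs from the toy scorer's result."""
    return [t for t in tweets
            if t.get("sentimentPolarity") != naive_polarity(t.get("body", ""))]


if __name__ == "__main__":
    # Made-up sample rows using the documented field names.
    sample = [
        {"body": "This is not good", "sentimentPolarity": "POSITIVE",
         "longitude": "", "latitude": ""},
        {"body": "I love it", "sentimentPolarity": "POSITIVE",
         "longitude": "8.68", "latitude": "50.11"},
    ]
    print([t["body"] for t in disagreements(sample)])
    print([t["body"] for t in sample if has_location(t)])
```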

Requirements

  • SPSS Modeler v18.0 or later
  • PySpark in Modeler set up as outlined here under the Prerequisites section.
  • Python installation (Anaconda recommended; if you use another Python installation, make sure to run pip install pandas and pip install numpy).
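To confirm that the Python installation you point Modeler at can import the required packages, a small check like the one below can help. It is only a sketch: the function name `missing_packages` is hypothetical, and it must be run with the same interpreter that Modeler is configured to use.

```python
import importlib


def missing_packages(names=("pandas", "numpy")):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("pandas and numpy are importable")
```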

Installation

In SPSS Modeler, click on 'Extensions' --> 'Install Local Extensions Bundle...' and navigate to the TwitterBlueMix.mpe file in this folder.
or...
In SPSS Modeler, click on 'Extensions' --> 'Extensions Hub...', search for 'TwitterBlueMix', and click 'Install...'.

Packages used

pandas: https://pypi.python.org/pypi/pandas
numpy: https://pypi.python.org/pypi/numpy

Authors

Jonathan Langefeld - IBM

About

SPSS Modeler nodes enabling the Twitter BlueMix API
