A sentiment analysis pipeline for Twitter

A bonobo pipeline for analysing sentiment in a live tweet stream. It supports two lexicon-based analysers (pattern and VADER) and a Naive Bayes analyser (textblob), and prints each tweet's sentiment to the terminal.

This project was developed for RebelCon 2019. The accompanying presentation is available here.

Prerequisites

Dependencies

Dependencies are managed with pipenv. Run pipenv install to create a virtual environment with the packages you need to run the pipeline.

API credentials

Apply for Developer API access, register a new app and make a note of your credentials. The pipeline will look for these in the environment variables TWITTER_CONSUMER_KEY, TWITTER_CONSUMER_SECRET, TWITTER_ACCESS_TOKEN and TWITTER_ACCESS_TOKEN_SECRET, so be sure to export these before running.

A handy way to avoid doing this every time is to define your credentials in a .env file. pipenv will automatically load them into the environment before running the pipeline.
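To confirm that the credentials are visible before starting the pipeline, a small check along these lines can help. This is only a sketch: the four variable names come from this README, but the missing_credentials helper is hypothetical and not part of the project.

```python
import os

# Variable names as listed in this README.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def missing_credentials(environ=os.environ):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; the pipeline itself may read
    the variables differently.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling missing_credentials() inside pipenv run python should return an empty list once the variables are exported or defined in .env.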

Usage

Export the API credentials or define them in a .env file as described above. Then run:

pipenv run ./app.py

This will load the default configuration (config.yml) and run it as a pipeline. You can specify alternate config files, if you like. Run pipenv run ./app.py -h for more info on how to do this.

Filter tweets by topic or hashtag

The tweet stream can be filtered by topic by editing the graph.tweetstream.config.track property in your config file before running the pipeline. By default, #brexit is tracked but any query can be used, e.g. boris johnson.
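Property names such as graph.tweetstream.config.track are written here as dotted paths. Assuming each segment corresponds to one level of nesting in the config file (an assumption, since the file's layout is not shown in this README), the edit can be pictured with the sketch below. The set_by_path helper and the in-memory dictionary are hypothetical stand-ins for editing config.yml by hand.

```python
def set_by_path(config, dotted_path, value):
    """Set a value in nested dicts, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return config


# Hypothetical in-memory equivalent of the relevant part of the config file.
config = {"graph": {"tweetstream": {"config": {"track": "#brexit"}}}}

# Track a different query instead of the default hashtag.
set_by_path(config, "graph.tweetstream.config.track", "boris johnson")
```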

Use a different sentiment analyser

Currently, the pattern, textblob and VADER analysers are supported. VADER is used by default; to switch, set the graph.sentiment_analyser.class property to one of the following:

  • pattern: TextBlobPatternAnalyzer
  • textblob: TextBlobNaiveBayesAnalyzer
  • VADER: VaderAnalyzer

Change the terminal window colour thresholds

Edit graph.pprint.config.pos to change the threshold over which tweets are printed in green. Edit graph.pprint.config.neg to change the threshold below which tweets are printed in red.
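The effect of the two thresholds can be sketched as follows. This is an illustration rather than the project's printing code: the colourise function, the default threshold values and the use of raw ANSI escape codes are all assumptions.

```python
# ANSI escape sequences for terminal colours.
GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"


def colourise(text, score, pos=0.5, neg=-0.5):
    """Wrap text in green above pos, in red below neg, else leave it plain.

    pos and neg play the role of graph.pprint.config.pos and
    graph.pprint.config.neg; the defaults here are made up.
    """
    if score > pos:
        return GREEN + text + RESET
    if score < neg:
        return RED + text + RESET
    return text
```

Raising pos makes fewer tweets appear in green, and lowering neg makes fewer appear in red. Scores between the two thresholds are printed without colour.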

Do something else

The pipeline is written as a bonobo directed acyclic graph. You can add new functionality by adding a new node that instantiates whatever class or function you like.
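As a rough illustration of what a node looks like in bonobo, the sketch below defines a filter that drops retweets and wires it into a small graph. The node names are hypothetical, the items are assumed to be plain strings, and the graph is built directly in Python; how this project declares nodes in its config file is not shown in this README.

```python
def sample_tweets():
    """Hypothetical source node standing in for the live tweet stream."""
    yield "RT @someone: what a great day"
    yield "what a terrible day"


def drop_retweets(text):
    """Hypothetical filter node: pass on only items that are not retweets.

    A generator node that yields nothing for an item drops it.
    """
    if not text.startswith("RT "):
        yield text


def build_graph(source=sample_tweets, sink=print):
    """Chain source -> drop_retweets -> sink into a bonobo graph."""
    import bonobo  # imported here so the nodes can be tested without bonobo

    graph = bonobo.Graph()
    graph.add_chain(source, drop_retweets, sink)
    return graph
```

With bonobo installed, bonobo.run(build_graph()) executes the graph. Because nodes are ordinary callables, they can be tested on their own before being added to the pipeline.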
