A sentiment analysis pipeline for Twitter
Dependencies are managed with pipenv. To create a virtual environment with the packages you need to run the pipeline, run:
pipenv install
Apply for Developer API access, register a new app and make a note of your credentials. The pipeline looks for these in environment variables, including TWITTER_ACCESS_TOKEN_SECRET, so be sure to export them before running.
A handy way to avoid doing this every time is to define your credentials in a .env file; pipenv will automatically load them into the environment before running the pipeline.
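A quick check that the credentials are actually visible to Python can save a failed run. The sketch below is not part of the pipeline: missing_credentials and REQUIRED_VARS are hypothetical names, and only TWITTER_ACCESS_TOKEN_SECRET is taken from this README, so extend the list with whatever other variables your setup uses.

```python
import os

# Only TWITTER_ACCESS_TOKEN_SECRET is named in this README; the list is an
# assumption and should be extended with your other credential variables.
REQUIRED_VARS = ["TWITTER_ACCESS_TOKEN_SECRET"]


def missing_credentials(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_credentials(env={"TWITTER_ACCESS_TOKEN_SECRET": "abc"}))
```

Calling missing_credentials() with no arguments inspects the real environment, so running it under pipenv run should also reflect anything defined in .env.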
Export the API credentials or define them in a .env file as described above. Then run:
pipenv run ./app.py
This will load the default configuration (config.yml) and run it as a pipeline. You can specify alternate config files if you like. For more information on how to do this, run:
pipenv run ./app.py -h
Filter tweets by topic or hashtag
The tweet stream can be filtered by topic by editing the graph.tweetstream.config.track property in your config file before running the pipeline. By default, #brexit is tracked, but any query can be used.
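As a rough illustration of what the dotted property name refers to, the sketch below assumes that graph.tweetstream.config.track corresponds to nested keys in the config file. The helper set_by_path, the in-memory dictionary and the #python query are all hypothetical; in practice you would simply edit the value in your config file.

```python
def set_by_path(config, dotted_path, value):
    """Set a value in a nested dict, creating intermediate dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return config


# Hypothetical in-memory stand-in for the relevant part of the config file.
config = {"graph": {"tweetstream": {"config": {"track": "#brexit"}}}}

# Track a different (hypothetical) hashtag instead of the default.
set_by_path(config, "graph.tweetstream.config.track", "#python")
print(config["graph"]["tweetstream"]["config"]["track"])
```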
Use a different sentiment analyser
Change the terminal window colour thresholds
Edit graph.pprint.config.pos to change the threshold above which tweets are printed in green. Edit graph.pprint.config.neg to change the threshold below which tweets are printed in red.
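The effect of the two thresholds can be sketched as a small function. This is an illustration of the rule described above, not the pipeline's own printing code: colour_for is a hypothetical name, and the threshold values 0.5 and -0.5 are made up for the example.

```python
def colour_for(score, pos, neg):
    """Pick a colour for a sentiment score given the two thresholds.

    Scores above `pos` map to green, scores below `neg` map to red,
    and anything in between gets no colour (None).
    """
    if score > pos:
        return "green"
    if score < neg:
        return "red"
    return None


# Hypothetical thresholds and scores, for illustration only.
for score in (0.8, 0.1, -0.7):
    print(score, colour_for(score, pos=0.5, neg=-0.5))
```

Raising pos or lowering neg narrows the set of tweets that get coloured, since more scores fall into the uncoloured middle band.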
Do something else
The pipeline is written as a bonobo directed acyclic graph. You can add new functionality by adding a new node that instantiates whatever class or function you like.
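Because bonobo nodes can be plain Python callables or generators, a new step can be very small. The sketch below shows a hypothetical filter node, drop_retweets, and assumes each tweet reaches the node as a plain string, which may not match the data this pipeline actually passes between nodes. How nodes are registered in the config file is not shown here.

```python
def drop_retweets(text):
    """Pass a tweet downstream unless it looks like a retweet.

    Written as a generator: yielding forwards the tweet to the next node,
    and yielding nothing drops it.
    """
    if not text.startswith("RT "):
        yield text


# Exercise the node on its own, outside any graph.
sample = ["RT @someone: old news", "fresh opinion"]
kept = [out for tweet in sample for out in drop_retweets(tweet)]
print(kept)

# With bonobo installed, a node like this can be chained between two other
# nodes using bonobo's Graph.add_chain; `source` and `sink` are placeholders:
#   graph = bonobo.Graph()
#   graph.add_chain(source, drop_retweets, sink)
```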