Twista is a Twitter streaming and analysis command line tool suite implemented in Python 3. It provides the following core features:
- to record Tweets (statuses, replies, retweets) from the public Twitter streaming API in a standardized way,
- to import collected chunks of Tweets into a Neo4j graph database,
- to analyze the resulting graph database. We recommend using tools like Jupyter for this.
Twista provides integrated support for Jupyter. Try the
twista lab
command to start Jupyter with the current config file.
Twista is hosted on PyPI, so it can be installed with pip:
pip3 install twista
Type
$ twista
to get an overview of existing Twista commands.
Usage: twista [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
import Imports Twitter records into a Neo4j graph database
init Initializes a directory to be used with Twista
lab Starts Jupyter lab for analysis
record Records a Twitter stream
stop Stops the Neo4j database
version Reports the version of Twista
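The commands listed above can also be driven from a Python script, for example to check the installed version before starting a recording. The following is a minimal sketch, not part of Twista itself: the helper names `twista_argv` and `run_twista` are hypothetical, and the only facts taken from the help output are the executable name `twista` and its six subcommands.

```python
import shutil
import subprocess

# Subcommands as listed in the `twista` help output above.
TWISTA_COMMANDS = ("import", "init", "lab", "record", "stop", "version")


def twista_argv(command, *args):
    """Build the argument list for a Twista subcommand (hypothetical helper)."""
    if command not in TWISTA_COMMANDS:
        raise ValueError(f"unknown Twista command: {command!r}")
    return ["twista", command, *args]


def run_twista(command, *args):
    """Run a Twista subcommand (hypothetical helper).

    Returns the command's standard output, or None if the `twista`
    executable is not found on the PATH.
    """
    if shutil.which("twista") is None:
        return None
    result = subprocess.run(
        twista_argv(command, *args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Prints the reported version, or None when Twista is not installed.
    print(run_twista("version"))
```

Validating the subcommand name before spawning a process keeps typos from reaching the shell; any further arguments are passed through unchanged, so their meaning depends on the Twista version installed.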
We recommend studying the Wiki to learn how to record and analyze public Twitter streams using Twista and graph databases.
Twista (0.3.0) has been used to record a sample of the complete German Twitter stream since April 2019. This dataset is open access, updated monthly, and available here:
Twista (0.2.0) was evaluated by recording Tweets during the German federal election campaigns of 2017. Over four months, Twista recorded 10 GB of data without any operator interaction! This dataset is open access and available here: