Python workers that collect tweets from the Twitter streaming API and track deletions

Install Beanstalkd

Requires installing the libevent-dev package on apt-based systems.
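
On an apt-based system, the install might look like this (package names taken from the note above; the beanstalkd package name is an assumption for your distribution):

```
# libevent-dev is the build dependency noted above; beanstalkd is the queue daemon itself.
sudo apt-get install libevent-dev beanstalkd
```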

Install Python dependencies

Install pip if you don't already have it, then run:

pip install -r requirements.txt

Edit config file


cp conf/tweets-client.ini.example conf/tweets-client.ini

In the [tweets-client] section, add your Twitter account's username and password. All API requests will be authenticated as this account.

In the [beanstalk] section, set "tweets_tube" and "screenshot_tube". The exact values don't matter, but they must be unique.

In the [database] section, update the "host", "port", "username", "password", and "database" settings with your own details, if the defaults are not appropriate.

In the [aws] section, add your access key, secret access key, bucket name, and any path prefix inside the bucket you want to use. This is for archiving images and screenshots of tweeted links.
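
Putting the sections above together, the finished tweets-client.ini might look roughly like this. All values are placeholders, and the exact option names may differ from the shipped example file — treat this as a sketch, not the canonical format:

```ini
[tweets-client]
username = your_twitter_username
password = your_twitter_password

[beanstalk]
host = localhost
port = 11300
tweets_tube = myproject_tweets
screenshot_tube = myproject_screenshots

[database]
host = localhost
port = 3306
username = tweets
password = secret
database = tweets

[aws]
access_key = AKIA...
secret_access_key = ...
bucket = my-archive-bucket
path_prefix = tweets/
```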


Run the streaming worker to start streaming items from Twitter into the beanstalk queue. Append the lib directory to the PYTHONPATH, either persistently or as part of the command:
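
For example, to append lib to the PYTHONPATH for the current shell session (assuming you are in the repository root):

```shell
# Make the project's lib/ modules importable; add this line to your shell
# profile to make it persistent instead of per-command.
export PYTHONPATH="$PYTHONPATH:$(pwd)/lib"
```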


Then run the consumer to start pulling the tweets out of beanstalk and loading them into MySQL:

PYTHONPATH=$PYTHONPATH:`pwd`/lib ./bin/ --images
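
The consumer's core task is turning raw queue messages into database rows while honoring deletion notices from the stream. A minimal sketch of that mapping, assuming the Twitter v1.1 streaming JSON format — the helper name and column set here are illustrative, not the project's actual schema:

```python
import json

def tweet_to_row(message):
    """Map a raw streaming-API message to either a tweet row or a
    deletion record. Returns ("tweet", row) or ("delete", status_id).
    The chosen columns are an example, not the real table layout."""
    data = json.loads(message)
    if "delete" in data:
        # Deletion notice: the referenced status should be removed or flagged.
        return ("delete", data["delete"]["status"]["id"])
    row = (data["id"], data["user"]["screen_name"], data["text"], data["created_at"])
    return ("tweet", row)
```

In the real worker, the "tweet" rows would feed an INSERT and the "delete" ids an UPDATE or DELETE against MySQL.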

Finally, if you ran the consumer with the --images option turned on, run the screenshot worker to grab screenshots of webpages and mirror images linked in tweets.
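
Conceptually, that worker has to pull two kinds of URLs out of each tweet: linked pages (for screenshots) and attached media (for mirroring). A sketch of the extraction, assuming the Twitter v1.1 entities format — the function name is hypothetical:

```python
def archivable_urls(tweet):
    """Collect URLs worth archiving from a parsed tweet: expanded links
    (screenshot candidates) and attached media (images to mirror).
    Field names follow the Twitter v1.1 entities format."""
    entities = tweet.get("entities", {})
    links = [u["expanded_url"] for u in entities.get("urls", []) if u.get("expanded_url")]
    media = [m["media_url"] for m in entities.get("media", [])]
    return links, media
```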


These three scripts all accept the following options:

  • --loglevel - Sets the verbosity of logging.
  • --output - Destination for log files.
  • --restart - Restart if the script encounters an error that cannot be handled.