Scrapes Twitter for a given term
Twitter search tool

A simple command-line tool for interacting with Twitter's search API. It is built on top of python-twitter and exposes a simpler interface to just the GetSearch method; I built it mostly because I needed a tool for historical searches. It accepts a list of terms, a language, and a start ID, and searches historical tweets. It cannot go back further than 7 days, as that is the oldest the open Search API will return.

Once done, it saves a pickled pandas DataFrame with the resulting tweets. It also saves intermediate checkpoints (every 50k tweets by default) in case the program crashes for any reason. It can take a long time to run: the API has a rate limit, and python-twitter will sleep when it is reached, which happens about every 5k tweets downloaded. A 1.5M-tweet download took around 20 hours to run (probably 90% of that time was spent sleeping).
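The search-and-checkpoint behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `collect_tweets`, `fetch_page`, and `save_checkpoint` are hypothetical names, tweets are represented as plain dicts with an "id" key, and `start_id` is assumed to act as a lower bound on tweet IDs.

```python
def collect_tweets(fetch_page, save_checkpoint, start_id=None,
                   checkpoint_every=50_000):
    """Walk backwards through search results, checkpointing along the way.

    fetch_page(max_id) is a hypothetical callable returning a list of
    dicts with an "id" key, newest first (max_id=None means "latest").
    save_checkpoint(rows) is a hypothetical callable that persists
    everything collected so far.
    """
    rows = []
    max_id = None
    next_checkpoint = checkpoint_every
    while True:
        page = fetch_page(max_id)
        if start_id is not None:
            # Assumption: start_id is the oldest tweet ID we care about.
            page = [t for t in page if t["id"] > start_id]
        if not page:
            break  # nothing older is available (or we passed start_id)
        rows.extend(page)
        # Ask for strictly older tweets on the next request.
        max_id = min(t["id"] for t in page) - 1
        if len(rows) >= next_checkpoint:
            save_checkpoint(rows)
            next_checkpoint += checkpoint_every
    save_checkpoint(rows)  # final save once the search is exhausted
    return rows
```

In a real run, `fetch_page` would presumably wrap python-twitter's `GetSearch` (which accepts `term`, `lang`, `max_id`, and `count`), and `save_checkpoint` could call `pandas.DataFrame(rows).to_pickle(path)`. Passing both in as callables keeps the loop easy to test without network access.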

Sample usage:

# Search for all tweets that have the terms 'Chile' or 'Santiago', in Spanish, going as far back as possible
python --terms Chile,Santiago --lang es

The only mandatory argument is --terms, which must be a comma-separated string. The optional arguments are --lang, for the language, and --start_id, to define how far back to search. The defaults are:

  • start_id: As far back as possible (around 7 days)
  • lang: en
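For illustration, the arguments and defaults listed above could be parsed along these lines. This is a sketch using the standard library's argparse, not the tool's own parser; the `parse_args` function name and the choice to parse `--start_id` as an integer are assumptions.

```python
import argparse


def split_terms(value):
    """Turn 'Chile,Santiago' into ['Chile', 'Santiago']."""
    return [term.strip() for term in value.split(",") if term.strip()]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Search Twitter for a list of terms")
    # Only --terms is mandatory; it arrives as one comma-separated string.
    parser.add_argument("--terms", required=True, type=split_terms,
                        help="comma-separated search terms")
    parser.add_argument("--lang", default="en",
                        help="language code (default: en)")
    # None stands for "as far back as possible" (around 7 days).
    parser.add_argument("--start_id", type=int, default=None,
                        help="tweet ID that bounds how far back to search")
    return parser.parse_args(argv)
```

Calling `parse_args(["--terms", "Chile,Santiago", "--lang", "es"])` yields the two terms as a list, `lang` set to "es", and `start_id` left as None.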


You must create a file in the working directory with the following form:

from collections import namedtuple

ApiKey = namedtuple('ApiKey', [

# Replace these strings with the corresponding keys/tokens
api_key = ApiKey(
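As a rough guide to what a complete version of such a file might look like, here is a sketch. The field names are an assumption: they mirror the keyword arguments of python-twitter's `twitter.Api` constructor and may not match the names the tool actually expects. The placeholder values are hypothetical as well.

```python
from collections import namedtuple

# Assumed field names, borrowed from the keyword arguments of
# python-twitter's twitter.Api constructor (not confirmed by the tool).
ApiKey = namedtuple('ApiKey', [
    'consumer_key',
    'consumer_secret',
    'access_token_key',
    'access_token_secret',
])

# Replace these placeholder strings with the corresponding keys/tokens
api_key = ApiKey(
    consumer_key='YOUR_CONSUMER_KEY',
    consumer_secret='YOUR_CONSUMER_SECRET',
    access_token_key='YOUR_ACCESS_TOKEN_KEY',
    access_token_secret='YOUR_ACCESS_TOKEN_SECRET',
)
```

With fields named this way, the tuple could be handed to python-twitter as `twitter.Api(**api_key._asdict())`. Keep this file out of version control, since it holds secrets.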
