Some Python to process the Wikileaks Cablegate data.
Latest commit fbfcee2 Nov 30, 2010 @anarchivist update README :D


Wikileaks CableGate Processing (based on work by

Used Andrew's existing code as the starting point for a generic parser
that scrapes data from the cables in HTML form and creates Python objects.
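
As a rough illustration of the idea (scrape cable HTML, get Python objects back), here is a minimal sketch using only the standard library. It assumes, without confirmation from this repository, that the cable text sits inside <pre> elements of the mirrored page. The names `Cable`, `PreTextCollector` and `parse_cable` are hypothetical stand-ins, not the project's own classes.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Cable:
    """Hypothetical container for one parsed cable."""
    reference_id: str
    body: str


class PreTextCollector(HTMLParser):
    """Collects the text found inside each <pre> element."""

    def __init__(self):
        super().__init__()
        self.blocks = []    # text of every finished <pre>
        self._depth = 0     # greater than 0 while inside a <pre>
        self._chunks = []   # text pieces of the <pre> being read

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append("".join(self._chunks).strip())
                self._chunks = []

    def handle_data(self, data):
        # Text outside <pre> (navigation, tables, ...) is ignored.
        if self._depth:
            self._chunks.append(data)


def parse_cable(reference_id, html):
    """Build a Cable from one HTML page (assumed layout, see above)."""
    collector = PreTextCollector()
    collector.feed(html)
    collector.close()
    return Cable(reference_id=reference_id,
                 body="\n\n".join(collector.blocks))
```

Calling `parse_cable("SOME-ID", page_html)` returns a `Cable` whose `body` joins all <pre> blocks of the page. A real parser would go on to split that text into header fields.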

for an example, please see

The content of typecode's original README follows.


  HTTrack (
    *to update mirror of cablegate site
  MongoDB (
    *database to store parsed cables
    *outputs JSON dump

  Updates the Cablegate web mirror and pulls all existing cables into a
    MongoDB collection, where each cable's 'Reference ID' is its '_id' and
    each document contains the following properties:
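
The '_id' convention described above could look roughly like the sketch below. It is not the project's code: `cable_to_document` and `store_cable` are hypothetical helpers, the "body" key in the usage note is made up for illustration, and `collection` is assumed to be a pymongo-style collection that offers `replace_one`.

```python
def cable_to_document(cable):
    """Return a copy of `cable` in which 'Reference ID' has become '_id'."""
    doc = dict(cable)
    doc["_id"] = doc.pop("Reference ID")
    return doc


def store_cable(collection, cable):
    """Insert or overwrite one cable in a pymongo-style collection."""
    doc = cable_to_document(cable)
    # With upsert=True the document is inserted when the _id is new and
    # overwritten otherwise, so re-running an import does not duplicate it.
    collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
    return doc["_id"]
```

With pymongo installed and mongod running at its default location, `store_cable(MongoClient()["wikileaks"]["cables"], {"Reference ID": "EXAMPLE-ID", "body": "..."})` would write one document. The database and collection names are the ones used in the mongoexport command.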

  Make sure that mongod is running. The software is configured to access
    mongod at its default location, so change that if necessary.
  Confirm that httrack and mongoexport are accessible in your PATH.
  run 'python'
  after that
  run 'mongoexport -d wikileaks -c cables -o dump/cables.json'
  The Tornado app that is there right now serves no function. To come..?
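
Two of the steps above lend themselves to a small Python sketch: checking that httrack and mongoexport are on the PATH, and reading the dump afterwards. Both helpers (`missing_tools`, `iter_dump`) are hypothetical and not part of this repository. The reader relies on mongoexport's default output of one JSON document per line, which is what the command above produces since it does not pass --jsonArray.

```python
import json
import shutil


def missing_tools(tools=("httrack", "mongoexport")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


def iter_dump(path="dump/cables.json"):
    """Yield one dict per cable from a line-delimited mongoexport dump."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

A typical use would be to call `missing_tools()` before starting and stop if the returned list is not empty, then loop over `iter_dump()` once the export has finished.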