This tool converts various blog dumps to a standard JSON format.

For example:

  • WordPress XML dumps to JSON
  • Disqus XML dumps to JSON

It is opinionated and may at times remove data.


git clone
cd blog-to-json
python setup.py develop

WordPress XML to JSON

How to use:

wordpress-xml-to-json example.xml

Example of the output schema:

{
  "homegrown-python-bread-crumb-module": {
    "name": "a-homegrown-python-bread-crumb-module",
    "title": "A homegrown python bread crumb module",
    "timestamp": 1293995686,
    "comments": [
      {
        "date": "2011-04-03 10:33:07",
        "timestamp": 1301841187,
        "content": "Hi, this was just what I needed, did a few modifications but basically worked out of the box. Thanks for posting",
        "email": "oops",
        "author": "Kristian",
        "author_ip": ""
      },
      {
        "date": "2011-04-03 14:19:46",
        "timestamp": 1301854786,
        "content": "I'm interested in the modifications, I just placed the code into bitbucket.  Feel free to branch it.  \n\nI'm also interested in seeing your project that you used it in.  Thanks",
        "email": "oops",
        "author": "Russell Ballestrini",
        "author_ip": ""
      }
    ],
    "content": "<p><strong>I wrote <a href=\"\"></a> a few days ago.</strong> <a href=\"\"></a> is a simple to use python breadcrumb module. \n</p>\n\n<p>\nThe bread object accepts a url string and grants access to the url crumbs (parts) or url links (list of hrefs to each crumb) .\n</p>\n\n<p>\nI have released <a href=\"\"></a> into the public domain and you may view the full source code here: <a href=\"\"></a>\n</p>\n\n<p>\n<strong>Update</strong>\n</p>\n\n<p>\nI recently revisited this module and wrote a tutorial on how to <a href=\"\">Add a Breadcrumb Subscriber to a Pyramid project using 4 simple steps</a>.\n</p>\n\n<ul>\n<li>Demo of <a href=\"\"></a></li>\n<li>Pyrawiki will use</li> \n</ul>\n\n<br />\n\n<strong>You should follow me on twitter <a href=\"\" target=\"_blank\">here</a></strong>\n\n<span style=\"font-size: 10px;\">\n<script src=\"\"></script>\n</span>",
    "link": "",
    "date": "2011-01-02 14:14:46"
  }
}
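
Reading the converted data back from Python might look like the sketch below. It assumes the output is a single JSON object keyed by post slug, as in the example; the `SAMPLE` string is a shortened copy of that example, and `summarize` is a hypothetical helper, not part of this tool.

```python
import json
from datetime import datetime, timezone

# A trimmed-down document in the shape shown above (long values shortened).
SAMPLE = """
{
  "homegrown-python-bread-crumb-module": {
    "name": "a-homegrown-python-bread-crumb-module",
    "title": "A homegrown python bread crumb module",
    "timestamp": 1293995686,
    "comments": [
      {"date": "2011-04-03 10:33:07", "timestamp": 1301841187,
       "content": "Hi, this was just what I needed", "email": "oops",
       "author": "Kristian", "author_ip": ""},
      {"date": "2011-04-03 14:19:46", "timestamp": 1301854786,
       "content": "I'm interested in the modifications", "email": "oops",
       "author": "Russell Ballestrini", "author_ip": ""}
    ],
    "content": "<p>shortened</p>",
    "link": "",
    "date": "2011-01-02 14:14:46"
  }
}
"""


def summarize(posts):
    """Return one (title, utc_iso, comment_authors) tuple per post, oldest first."""
    rows = []
    for _slug, post in sorted(posts.items(), key=lambda kv: kv[1]["timestamp"]):
        # "timestamp" is treated here as seconds since the Unix epoch.
        published = datetime.fromtimestamp(post["timestamp"], tz=timezone.utc)
        authors = [comment["author"] for comment in post.get("comments", [])]
        rows.append((post["title"], published.isoformat(), authors))
    return rows


posts = json.loads(SAMPLE)
summary = summarize(posts)
for title, published, authors in summary:
    print(f"{title} ({published}): {len(authors)} comments by {', '.join(authors)}")
```

For a real dump, replace `json.loads(SAMPLE)` with `json.load` on whatever file holds the converted output.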


It's your data, that's why!

I created and used this tool during my Migration from WordPress to Pelican. Others have used this tool to migrate comments from Disqus to Remarkbox.