Convert various blog dumps to a standard JSON
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
blog_to_json
.gitignore
README.rst
disqus-xml-to-json
requirements.txt
setup.py
wordpress-xml-to-json

README.rst

blog-to-json

This tool will convert various blog dumps to a standard JSON format.

For example:

  • Wordpress XML dumps to JSON
  • Disqus XML dumps to JSON

It is opinionated and may at times remove data.

install

git clone https://github.com/russellballestrini/blog-to-json.git
cd blog-to-json
python setup.py develop

Wordpress XML to JSON

how to use:

wordpress-xml-to-json example.xml

example of schema:

{
  "homegrown-python-bread-crumb-module": {
    "name": "a-homegrown-python-bread-crumb-module",
    "title": "A homegrown python bread crumb module",
    "timestamp": 1293995686,
    "comments": [
      {
        "date": "2011-04-03 10:33:07",
        "timestamp": 1301841187,
        "content": "Hi, this was just what I needed, did a few modifications but basically worked out of the box. Thanks for posting",
        "email": "oops",
        "author": "Kristian",
        "author_ip": "192.168.1.5"
      },
      {
        "date": "2011-04-03 14:19:46",
        "timestamp": 1301854786,
        "content": "I'm interested in the modifications, I just placed the code into bitbucket.  Feel free to branch it.  \n\nI'm also interested in seeing your project that you used it in.  Thanks",
        "email": "oops",
        "author": "Russell Ballestrini",
        "author_ip": "192.168.1.6"
      }
    ],
    "content": "<p><strong>I wrote <a href=\"https://bitbucket.org/russellballestrini/bread/raw/tip/bread.py\">bread.py</a> a few days ago.</strong> <a href=\"https://bitbucket.org/russellballestrini/bread/raw/tip/bread.py\">Bread.py</a> is a simple to use python breadcrumb module. \n</p>\n\n<p>\nThe bread object accepts a url string and grants access to the url crumbs (parts) or url links (list of hrefs to each crumb) .\n</p>\n\n<p>\nI have released <a href=\"https://bitbucket.org/russellballestrini/bread/raw/tip/bread.py\">bread.py</a> into the public domain and you may view the full source code here: <a href=\"https://bitbucket.org/russellballestrini/bread/src\">https://bitbucket.org/russellballestrini/bread/src</a>\n</p>\n\n<p>\n<strong>Update</strong>\n</p>\n\n<p>\nI recently revisited this module and wrote a tutorial on how to <a href=\"http://russell.ballestrini.net/add-a-breadcrumb-subscriber-to-a-pyramid-project-using-4-simple-steps/\">Add a Breadcrumb Subscriber to a Pyramid project using 4 simple steps</a>.\n</p>\n\n<ul>\n<li>Demo of bread.py: <a href=\"http://school.yohdah.com/\">http://school.yohdah.com/</a></li>\n<li>Pyrawiki will use bread.py</li> \n</ul>\n\n<br />\n\n<strong>You should follow me on twitter <a href=\"http://twitter.com/russellbal\" target=\"_blank\">here</a></strong>\n\n<span style=\"font-size: 10px;\">\n<script src=\"https://bitbucket.org/russellballestrini/bread/src/50a1a20fc3f3/bread.py?embed=t\"></script>\n</span>",
    "link": "http://russell.ballestrini.net/a-homegrown-python-bread-crumb-module/",
    "date": "2011-01-02 14:14:46"
  }
}

Why?

It's your data, thats why!

I created and used this tool during my Migration from WordPress to Pelican. Others have used this tool to migrate comments from Disqus to Remarkbox.