WordPress (version 3.x, and I think there's a plugin for 2.x) can export an XML file with all your blog posts, comments, etc. This tool parses that file and gives you convenient access to your posts, comments, categories, and so on from a Ruby script that you write yourself.
The cool thing is that you don't even need access to your own WordPress database. All you need is to download the XML file from your admin panel.
    gem install wpxml_parser
    require 'wpxml_parser'
    include WpxmlParser

    # Blog
    blog = Blog.new('dump_of_your_blog.xml')
    blog.posts.size      # => 51
    post = blog.posts.first  # => <Post>

    # Post
    post.title           # => "Title of your post"
    post.categories      # => ["ruby", "rails"]
    post.date            # => <Time>
    post.slug            # => "title-of-your-post"
    post.body            # => "...shitload of text..."
    post.post_id         # => 27

    # Comments
    post.comments        # => [<Comment>, <Comment>, ...]
    comment = post.comments.first
    comment.author       # => 'dude'
    comment.date         # => <Time>

    # Comments have parent and children
    comment.parent_id    # => 32
    comment.parent       # => <Comment>
    comment.children     # => [<Comment>, <Comment>]

    # Conveniences
    # You can quickly find a post or a comment by post_id or comment_id respectively
    blog.find_post(27)       # => <Post>
    post.find_comment(32)    # => <Comment>

    # For more info and more accessible properties check out the source code,
    # it's pretty straightforward.
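As an example of the kind of script you might write on top of this, here is a minimal sketch that prints a comment thread with replies indented under their parents. It only relies on `author` and `children` from the listing above. `FakeComment` and `thread_lines` are hypothetical names invented for this sketch; `FakeComment` stands in for the gem's comment objects so the code runs without an export file.

```ruby
# Hypothetical stand-in for the gem's comment objects. Only `author` and
# `children` are used, both of which appear in the usage listing.
FakeComment = Struct.new(:author, :children)

# Returns one line per comment, indented two spaces per nesting level.
def thread_lines(comments, depth = 0)
  comments.flat_map do |comment|
    ["#{'  ' * depth}#{comment.author}"] + thread_lines(comment.children, depth + 1)
  end
end

reply     = FakeComment.new('dude', [])
top_level = FakeComment.new('blog_owner', [reply])

puts thread_lines([top_level])
```

With the real gem you would probably pass in only the top-level comments of a post. How a top-level comment reports its parent is not covered above, so check the source before relying on something like `comment.parent.nil?`.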
A note on posts
Currently only published posts are included. Drafts, private posts, etc. are skipped. This was done on purpose, since I didn't really use those WordPress features.
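To make the filtering concrete, here is a rough sketch of the same idea done by hand with REXML. It is not the gem's actual implementation. It assumes the export marks each item with a `wp:status` element whose value is `publish` for published posts. The sample XML and the `published_titles` helper are made up for illustration.

```ruby
require 'rexml/document'

# Tiny made-up stand-in for an export file; a real one has many more fields.
SAMPLE_POSTS_XML = <<~XML
  <rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
    <channel>
      <item>
        <title>Hello</title>
        <wp:status>publish</wp:status>
      </item>
      <item>
        <title>Unfinished</title>
        <wp:status>draft</wp:status>
      </item>
      <item>
        <title>Secret</title>
        <wp:status>private</wp:status>
      </item>
    </channel>
  </rss>
XML

# Returns the titles of items whose <wp:status> is "publish" (assumed value).
def published_titles(xml)
  doc = REXML::Document.new(xml)
  items = REXML::XPath.match(doc, '/rss/channel/item')
  published = items.select do |item|
    status = item.elements['wp:status']
    status && status.text == 'publish'
  end
  published.map { |item| item.elements['title'].text }
end

puts published_titles(SAMPLE_POSTS_XML)
```

If you do need drafts or private posts, a small script like this run against the same export file is one way to get at them without touching the gem.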
A note on comments
Currently only approved comments are included. This was done on purpose, since I didn't need spam.
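For the curious, the same kind of check can be sketched for comments. This is an illustration with REXML, not the gem's own code. It assumes each `wp:comment` element in the export carries a `wp:comment_approved` child that is `1` for approved comments and something else (for example `0` or `spam`) otherwise. The sample XML and the `approved_authors` helper are hypothetical.

```ruby
require 'rexml/document'

# Made-up export fragment with one approved, one spam and one pending comment.
SAMPLE_COMMENTS_XML = <<~XML
  <rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
    <channel>
      <item>
        <title>Hello</title>
        <wp:comment>
          <wp:comment_author>dude</wp:comment_author>
          <wp:comment_approved>1</wp:comment_approved>
        </wp:comment>
        <wp:comment>
          <wp:comment_author>bot</wp:comment_author>
          <wp:comment_approved>spam</wp:comment_approved>
        </wp:comment>
        <wp:comment>
          <wp:comment_author>newcomer</wp:comment_author>
          <wp:comment_approved>0</wp:comment_approved>
        </wp:comment>
      </item>
    </channel>
  </rss>
XML

# Returns the authors of comments whose approval flag is "1" (assumed value).
def approved_authors(xml)
  doc = REXML::Document.new(xml)
  comments = REXML::XPath.match(doc, '/rss/channel/item/wp:comment')
  approved = comments.select do |comment|
    flag = comment.elements['wp:comment_approved']
    flag && flag.text == '1'
  end
  approved.map { |comment| comment.elements['wp:comment_author'].text }
end

puts approved_authors(SAMPLE_COMMENTS_XML)
```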