WordPress (version 3.x and I think there's a plugin for 2.x) can export xml with all your blog posts, comments, etc. This tool can parse the xml file and give you a convenient access to all your posts, comments, categories, etc in a ruby script that you have to write.
The cool thing is — you don't even need to have access to your own WordPress database, all you need is to download the xml file from your admin panel.
gem install wpxml_parser
require 'wpxml_parser'
include WpxmlParser
# Blog
blog = Blog.new('dump_of_your_blog.xml')
blog.posts.size # => 51
post = blog.posts.first # => <Post>
# Post
post.title # => "Title of your post"
post.categories # => ["ruby", "rails"]
post.date # => <Time>
post.slug # => "title-of-your-post"
post.body # => "...shitload of text..."
post.post_id # => 27
# Comments
post.comments # => [<Comment>, <Comment>, ...]
comment = post.comments.first
comment.author # => 'dude'
comment.date # => <Time>
# Comments have parent and children
comment.parent_id # => 32
comment.parent # => <Comment>
comment.children # => [<Comment>, <Comment>]
# Conveniences
# You can quickly find a post or a comment by post_id or comment_id respectively
blog.find_post(27) # => <Post>
post.find_comment(32) # => <Comment>
# For more info and more accessible properties check out the source code, it's pretty straightforward.
Currently only published posts are included. Drafts, private, etc are skipped. This was done on purpose, since I didn't really use those wordpress features.
Currently only approved comments are included. This was done on purpose, since I didn't need spam.