This is a basic EZBoard scraping script that uses mechanize to spider a specified ezboard forum, download the forum and topic pages, scrape the HTML, and create users, topics, and posts while attempting to preserve the original post metadata. User creation times and post creation times are preserved; topic view/read counts currently are not.
The script was written to migrate an ezboard forum into a custom Rails-based forum. As such, it expects to be run from the environment of a Rails application, and uses ActiveRecord to write to the database. However, it should be easy to modify to write to a database directly, or even to plaintext.
RAILS_ROOT/scripts directory of your Rails app.
- Ensure that you have a migration_cache directory under your
ezboard_migrate.rb script to point to your forum URL (should look something like http://p098.ezboard.com/bYOURFORUM)
admin_password variables to those of an existing forum user, preferably an admin who has access to all subforums and topics.
- Check that you have a User, Topic, Post, and Forum model set up in your Rails application.
Scraped topic pages are cached in the RAILS_ROOT/tmp/migration_cache directory in case you need to tweak the script or modify the post parsing/cleanup code. If you see strange things going on with posts, or have markup you wish to clean up, just make your changes, wipe the database, and re-run the script: it will re-download only new topics, or topics that have had replies since your last import attempt.
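The caching behaviour described above could be sketched roughly as follows. This is not the script's actual code: the helper names cache_path and fetch_topic and the file naming scheme are hypothetical, and the block passed to fetch_topic stands in for the real mechanize download. The idea is that the cache key includes the reply count seen on the forum index, so a topic with new replies misses the cache and is downloaded again.

```ruby
require "fileutils"

# Hypothetical helper: the cache file name encodes the topic id and the reply
# count, so a topic that has gained replies maps to a new file name.
def cache_path(cache_dir, topic_id, reply_count)
  File.join(cache_dir, "topic_#{topic_id}_r#{reply_count}.html")
end

# Returns the cached HTML when a matching file exists. Otherwise it calls the
# given block to fetch the page (the real script would use mechanize there),
# stores the result in the cache directory, and returns it.
def fetch_topic(cache_dir, topic_id, reply_count)
  path = cache_path(cache_dir, topic_id, reply_count)
  return File.read(path) if File.exist?(path)

  html = yield
  FileUtils.mkdir_p(cache_dir)
  File.write(path, html)
  html
end
```

With a scheme like this, wiping the database and re-running only costs network requests for topics whose reply count has changed; everything else is read from disk.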
Once you're certain the data migrates successfully, lock down the ezboard forum to make sure that users do not post anything while you're migrating, and run the script one final time. Once all data is moved successfully, feel free to remove the
The database models that the script tries to use have the following fields:
- Forum: name
- User: login, password, created_at
- Topic: title, created_at (related to user and forum)
- Post: body, created_at (related to forum, topic and user)
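To make the relationships concrete, here is a plain-Ruby sketch of those fields using Struct. It is only an illustration: in the Rails app these would be ActiveRecord models, the association fields (user, forum, topic) would be foreign keys, and the sample values are made up. Note that created_at is set explicitly from the scraped data rather than left to default to the import time.

```ruby
# Plain-Ruby stand-ins for the four models, with the fields listed above.
# The last members of Topic and Post model the "related to" associations.
Forum = Struct.new(:name)
User  = Struct.new(:login, :password, :created_at)
Topic = Struct.new(:title, :created_at, :user, :forum)
Post  = Struct.new(:body, :created_at, :forum, :topic, :user)

# Illustrative records: timestamps come from the scraped pages, not Time.now.
forum = Forum.new("General")
user  = User.new("alice", "secret", Time.utc(2005, 1, 2))
topic = Topic.new("Hello", Time.utc(2006, 3, 4), user, forum)
post  = Post.new("First post", Time.utc(2006, 3, 4, 12, 30), forum, topic, user)
```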
- 2009-03-03 - 0.1 - cleanup and initial public release
This script was written as a one-off. As such, I can promise that the code is messy, poorly documented, and full of Ruby newbie mistakes. It works for me, but will probably require changes before it works for you. Please feel free to fork it or incorporate it into your own projects. I am making it public in the hope that someone may find it useful when trying to migrate away from ezboard.