Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Tool that aids migration from a drupal-based blog to pyblosxom
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
== DEPENDENCIES == * python2 * mysql-python * Drupal 5 or any version with similar database schema * php-cli == INTRO == This assumes you run drupal on mysql converts drupal nodes: * type 'blog' -> 'entries' directory * other (in my case always 'page') -> 'static' directory, for consumption with PB 'pages' plugin per node some specific things are processed: * tags in a tags.py plugin compatible format * publish date for Tim Gray's pymetatime plugin) * breakpoints ('<!--break-->') are left intact, you can use the readmore.py plugin; just note that drupal can automatically break, pyblosxom doesn't, so for posts without breakpoints, you might want to insert breakpoints manually. * GUID is stored in guid.py compatible metadata if you specify the base_url of the drupal blog with -B (if unsure about the base_url, look at your guid's in your drupal feed) * comments. note: * manual review needed, because all html tags currently accepted * PB doesn't support comment threading and comment titles, though the data is exported * Comments with uid 1 (blog owner) have no url/email set, just nick. PB should support automatically rendering the correct url * all entities get replaced by their references (except stuff in CDATA tags), because PB comments plugin expects literal text in description * comments support source code hilighting (if you have a patched PB comments plugin to call the syntaxhilight plugin) * source code: exported for syntaxhilight.py plugin (which uses pygments). note: * most of the 'lang' attributes map perfectly ('php', 'bash', 'sql', etc), but for php embedded in html, you need to manually update to html+php * also, pygments is fairly strict and we can't control options like startinline (just the lexer name), so you might need to manually wrap your php code in <?php ..?> ignores: * trackbacks (is there a sane way to do trackbacks anyway? maybe with mollom/akismet/..? if so we could add this feature) * viewcount (i use google analytics) * all static content (images etc) (you should copy that data yourself and ultimately publish it on the same - or a similar - url. sed is your friend) == HOWTO == * start with a clean drupal database: make sure there isn't too much spam in it etc. session and cache tables can get big too DELETE FROM sessions WHERE timestamp < unix_timestamp(now()) -1000 after this cleanup, you might want to optimize your tables aswell * backup your database and/or work on a copy * run `d2pb.py --help` for usage instructions * in the `checkscripts` directory you'll find various scripts that help you check the accuracy of the conversion