Tool that aids migration from a drupal-based blog to pyblosxom

== DEPENDENCIES ==
* python2
* mysql-python
* Drupal 5 or any version with similar database schema
* php-cli

== INTRO ==
This assumes you run drupal on mysql.
It converts drupal nodes:
* type 'blog'                      -> 'entries' directory
* other (in my case always 'page') -> 'static' directory, for consumption with PB 'pages' plugin
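
The mapping above can be sketched as a small helper. This is only an illustration of the rule, not the code in d2pb.py; the function name `target_dir` and the `out_root` parameter are made up for the example.

```python
import os


def target_dir(node_type, out_root="."):
    """Return the output directory for a drupal node type.

    'blog' nodes become PB entries; every other type (e.g. 'page') is
    treated as static content for the 'pages' plugin.
    NOTE: name and signature are hypothetical, for illustration only.
    """
    subdir = "entries" if node_type == "blog" else "static"
    return os.path.join(out_root, subdir)
```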

per node, some specific things are processed:
	* tags, in a tags.py plugin compatible format
	* publish date (for Tim Gray's pymetatime plugin)
	* breakpoints ('<!--break-->') are left intact, so
	you can use the readmore.py plugin.
	just note that drupal can break posts automatically and pyblosxom can't,
	so for posts without breakpoints you might want to insert
	breakpoints manually.
	* GUID is stored in guid.py compatible metadata if you specify the base_url of the drupal blog with -B
	(if unsure about the base_url, look at the GUIDs in your drupal feed)
	* comments. note:
		* manual review is needed, because all html tags are currently accepted
		* PB doesn't support comment threading or comment titles, though that data is exported anyway
		* comments with uid 1 (blog owner) have no url/email set, just the nick.  PB should support automatically rendering the correct url
		* all entities get replaced by their references (except stuff in CDATA tags), because the PB comments plugin expects literal text in description
		* comments support source code highlighting (if you have a patched PB comments plugin that calls the syntaxhilight plugin)
	* source code: exported for the syntaxhilight.py plugin (which uses pygments).  note:
		* most of the 'lang' attributes map perfectly ('php', 'bash', 'sql', etc), but for php embedded in html you need to manually change the lang to 'html+php'
		* also, pygments is fairly strict and we can't control lexer options like startinline (just the lexer name), so you might need to manually wrap your php code
		  in <?php ..?>
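
Two of the manual fix-ups mentioned above (missing breakpoints, and php blocks that pygments won't accept) could be semi-automated along these lines. This is a rough sketch, not part of d2pb.py: the helper names `ensure_breakpoint` and `fix_php` are made up, and the heuristics (first paragraph ends at `</p>` or a blank line; markup before the first `<?` means php embedded in html) are assumptions you should check against your own posts.

```python
import re

BREAK = "<!--break-->"
# assumption: a paragraph ends at a closing </p> tag or at a blank line
_PARA_END = re.compile(r"</p>|\n\s*\n", re.IGNORECASE)


def ensure_breakpoint(body):
    """Insert a breakpoint after the first paragraph if the post has none."""
    if BREAK in body:
        return body
    match = _PARA_END.search(body)
    if match is None or match.end() >= len(body):
        return body  # single paragraph: nothing to cut off
    return body[:match.end()] + BREAK + body[match.end():]


def fix_php(code, lang):
    """Return (code, lang) adjusted so a pygments php lexer can handle it."""
    if lang != "php":
        return code, lang
    stripped = code.strip()
    if "<?" not in stripped:
        # bare php without an open tag: wrap it, since startinline can't be set
        return "<?php\n" + stripped + "\n?>", "php"
    if not stripped.startswith("<?"):
        # markup before the first open tag: treat as php embedded in html
        return code, "html+php"
    return code, "php"
```

Review the results by hand afterwards; both helpers only guess.
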
ignores:
	* trackbacks (is there a sane way to do trackbacks anyway?
	maybe with mollom/akismet/..? if so, we could add this feature)
	* viewcount (i use google analytics)
	* all static content (images etc).  you should copy that data yourself and ultimately publish it on the same - or a similar - url.  sed is your friend
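
If you'd rather not use sed for the url fix-up of static content, something like the following would do the same job. It is a minimal sketch in Python 3 (the tool itself targets python2); the function name `rewrite_urls` is made up, and it assumes your exported files are UTF-8 text.

```python
import os


def rewrite_urls(root, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every file under root.

    Returns the sorted list of files that were changed.
    Assumes all files under root are UTF-8 text (hypothetical helper).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as fh:
                text = fh.read()
            new_text = text.replace(old_prefix, new_prefix)
            if new_text != text:
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(new_text)
                changed.append(path)
    return sorted(changed)
```

Run it on a copy of the exported 'entries' and 'static' directories first, with your old and new url prefixes as arguments.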

== HOWTO ==

* start with a clean drupal database: make sure there isn't too much spam in it etc.  session and cache tables can get big too, e.g.:
DELETE FROM sessions WHERE timestamp < unix_timestamp(now()) - 1000;
after this cleanup, you might want to optimize your tables as well
* backup your database and/or work on a copy
* run `d2pb.py --help` for usage instructions
* in the `checkscripts` directory you'll find various scripts that help you check the accuracy of the conversion
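
The session cleanup from the first step can also be run from Python through any DB-API cursor (e.g. one from mysql-python). A minimal sketch, assuming only the `sessions` table and `timestamp` column used in the query above; the function name and its parameters are made up, and the cutoff is computed in Python instead of with unix_timestamp(now()).

```python
import time


def delete_stale_sessions(cursor, max_age=1000, now=None, placeholder="%s"):
    """Delete sessions older than max_age seconds; return the row count.

    cursor      -- any DB-API cursor on the drupal database
    placeholder -- parameter marker of the driver ('%s' for mysql-python)
    NOTE: hypothetical helper; work on a copy of your database.
    """
    current = time.time() if now is None else now
    cutoff = int(current) - max_age
    cursor.execute(
        "DELETE FROM sessions WHERE timestamp < " + placeholder, (cutoff,)
    )
    return cursor.rowcount
```

Remember to commit on the connection afterwards if your driver doesn't autocommit.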