Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
tree: 1f36b5c3a3
Fetching contributors…

Cannot retrieve contributors at this time

25 lines (16 sloc) 0.999 kb
This project aims at allowing people to browse Wikipedia offline.
- download a database dump at (you want the "pages-articles" one)
- run "$ perl berkeleyify [lang]wikipedia-page-articles-YYYYMMDD.xml.bz2" into your CGI directory
Requires: a local webserver (I recommend thttpd), a recent version of Perl (5.14 will do),
Text::Mediawiki, and probably other stuff.
The script berkeleyify will create three files:
- [lang]wikipedia-pages-articles-YYYYMMDD.db:
This is the actual database. Articles are stored in compressed blocks of 256 articles.
This is a read-only database. It is not designed to be modified.
- [lang]wikipedia-pages-articles-YYYYMMDD.index:
This berkeley database associates each article title to the position in the .db file.
- [lang]wikipedia-pages-articles-YYYYMMDD.titles:
This is a plain text list of titles. It is used with grep to search for titles.
License information is repeated in each individual files.
Jump to Line
Something went wrong with that request. Please try again.