This project aims to let people browse Wikipedia offline.
HOWTO:
- download a database dump from http://download.wikimedia.org (you want the "pages-articles" one)
- run "$ perl berkeleyify [lang]wikipedia-pages-articles-YYYYMMDD.xml.bz2" in your CGI directory
Requires: a local webserver (I recommend thttpd), a recent version of Perl (5.14 will do),
Text::Mediawiki, and probably other stuff.
The script berkeleyify will create three files (see the lookup sketch after this list):
- [lang]wikipedia-pages-articles-YYYYMMDD.db:
This is the actual database. Articles are stored in compressed blocks of 256 articles.
The database is read-only; it is not designed to be modified.
- [lang]wikipedia-pages-articles-YYYYMMDD.index:
This Berkeley DB file maps each article title to its position in the .db file.
- [lang]wikipedia-pages-articles-YYYYMMDD.titles:
This is a plain text list of titles. It is used with grep to search for titles.
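To make the layout above concrete, here is a minimal lookup sketch in Perl. It is
not part of the project: the index value format ("offset:length:rank"), the bzip2
compression of blocks, and the NUL byte separating articles within a block are
all assumptions made for illustration; the real on-disk layout is whatever
berkeleyify writes.

    #!/usr/bin/perl
    # Illustrative lookup sketch only; layout assumptions noted above.
    use strict;
    use warnings;
    use Fcntl qw(O_RDONLY);
    use DB_File;
    use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

    my ($base, $title) = @ARGV;   # e.g. [lang]wikipedia-pages-articles-YYYYMMDD Perl

    # The .index file is a Berkeley DB hash: title -> position in the .db file.
    tie my %index, 'DB_File', "$base.index", O_RDONLY, 0644, $DB_HASH
        or die "cannot open $base.index: $!";
    my $entry = $index{$title} or die "no such title: $title\n";

    # Assumed value format: "offset:length:rank" (an assumption, see above).
    my ($offset, $length, $rank) = split /:/, $entry;

    # Read the compressed 256-article block out of the .db file.
    open my $db, '<:raw', "$base.db" or die "cannot open $base.db: $!";
    seek $db, $offset, 0 or die "seek failed: $!";
    read($db, my $compressed, $length) == $length or die "short read";

    # Assumed compression: bzip2 (matching the .bz2 input dump).
    bunzip2 \$compressed => \my $block or die "bunzip2 failed: $Bunzip2Error";

    # Assumed separator: one NUL byte between the block's 256 articles.
    my @articles = split /\0/, $block;
    print $articles[$rank];

Title search needs no such code: the .titles file is meant to be scanned
directly, e.g. "$ grep -i '^Perl' [lang]wikipedia-pages-articles-YYYYMMDD.titles".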
LICENSE:
License information is repeated in each individual file.