This project aims to let people browse Wikipedia offline.

HOWTO:
- Download a database dump from http://download.wikimedia.org (you want the "pages-articles" one).
- Run "$ perl berkeleyify [lang]wikipedia-pages-articles-YYYYMMDD.xml.bz2" in your CGI directory.

Requires: a local webserver (I recommend thttpd), a recent version of Perl (5.14 will do), Text::Mediawiki, and probably other stuff.

The script berkeleyify will create three files:
- [lang]wikipedia-pages-articles-YYYYMMDD.db: This is the actual database. Articles are stored in compressed blocks of 256 articles. The database is read-only; it is not designed to be modified.
- [lang]wikipedia-pages-articles-YYYYMMDD.index: This Berkeley database maps each article title to its position in the .db file.
- [lang]wikipedia-pages-articles-YYYYMMDD.titles: This is a plain-text list of titles. It is used with grep to search for titles.

LICENSE: License information is repeated in each individual file.
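
To make the three-file layout described above more concrete, here is a minimal Python sketch of the general idea: articles packed into compressed blocks of 256, an index from title to block position, and a flat title list for searching. It is an illustration only, not the format berkeleyify actually writes. The function names (build, lookup, search_titles), the use of zlib, the in-memory dict standing in for the Berkeley database, and the NUL separator are all assumptions made for the example.

```python
import zlib

BLOCK_SIZE = 256  # articles per compressed block, as stated in the README


def build(articles, block_size=BLOCK_SIZE):
    """Pack (title, text) pairs into compressed blocks (hypothetical layout).

    Returns (db, index, titles):
      db     - bytes: the concatenated compressed blocks (stands in for .db)
      index  - dict: title -> (block offset, block length, slot in block)
               (stands in for the Berkeley .index file)
      titles - list of titles, one per article (stands in for .titles)
    """
    db = bytearray()
    index = {}
    titles = []
    for start in range(0, len(articles), block_size):
        chunk = articles[start:start + block_size]
        # Assumption: NUL never occurs in article text, so it can separate articles.
        raw = "\0".join(text for _, text in chunk).encode("utf-8")
        packed = zlib.compress(raw)
        offset = len(db)
        db += packed
        for slot, (title, _) in enumerate(chunk):
            index[title] = (offset, len(packed), slot)
            titles.append(title)
    return bytes(db), index, titles


def lookup(db, index, title):
    """Decompress only the block that holds the article, then pick its slot."""
    offset, length, slot = index[title]
    raw = zlib.decompress(db[offset:offset + length]).decode("utf-8")
    return raw.split("\0")[slot]


def search_titles(titles, needle):
    """Case-insensitive substring search, similar in spirit to grep -i on .titles."""
    needle = needle.lower()
    return [t for t in titles if needle in t.lower()]
```

Grouping articles into blocks trades a little read time (one block must be decompressed per lookup) for much better compression than compressing each article separately, which fits the read-only design noted above.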