MWoffliner
MWoffliner is a tool for making a local offline HTML snapshot of any online MediaWiki instance. It goes through all articles (or a selection, if specified) and creates the corresponding ZIM file in a local directory. It has mainly been tested against Wikimedia projects such as Wikipedia and Wiktionary, but it should also work with any recent MediaWiki.
Read CONTRIBUTING.md to learn more about MWoffliner development.
Prerequisites
- *NIX Operating System (GNU/Linux, macOS, ...)
- NodeJS
- Redis
- Libzim (on Linux, binaries are downloaded automatically)
- Various build tools that are probably already installed on your machine (libjpeg, gcc)
See Environment setup hints to learn how to install them.
Usage
To install MWoffliner globally:
npm i -g mwoffliner
You might need to run this command with sudo, depending on how your npm is configured.
Then to run it:
mwoffliner --help
To use MWoffliner with an S3 cache, provide an S3 URL like this:
--optimisationCacheUrl="https://wasabisys.com/?bucketName=my-bucket&keyId=my-key-id&secretAccessKey=my-sac"
API
MWoffliner also provides an API and can therefore be used as a NodeJS library. Here is a stub example:
const mwoffliner = require('mwoffliner');
const parameters = {
  mwUrl: "https://es.wikipedia.org",
  adminEmail: "foo@bar.net",
  verbose: true,
  format: "nopic",
  articleList: "./articleList"
};
mwoffliner.execute(parameters); // returns a Promise
Background
Complementary information about MWoffliner:
- MediaWiki software is used by tens of thousands of wikis, the most famous being the Wikimedia ones, including Wikipedia.
- MediaWiki is a wiki engine written in PHP.
- Wikitext is the name of the markup language that MediaWiki uses.
- MediaWiki includes a parser that converts Wikitext into HTML; this parser creates the HTML pages displayed in your browser.
- There is another Wikitext parser, called Parsoid, implemented in JavaScript/NodeJS. MWoffliner uses Parsoid.
- Parsoid is planned to eventually become the main parser for MediaWiki.
- MWoffliner calls Parsoid and then post-processes the results for offline use.
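The last point can be pictured as a two-step pipeline: get Parsoid HTML, then adapt it for offline reading. The sketch below is an illustration under assumptions, not MWoffliner's actual code. It assumes the wiki exposes Parsoid HTML through the Wikimedia-style REST route /api/rest_v1/page/html/{title} and that Parsoid writes internal links as href="./Title". The names parsoidHtmlUrl and postProcessForOffline are hypothetical, and the regex-based link handling is a toy stand-in for real post-processing.

```javascript
// Step 1 (assumption): build the REST URL that serves Parsoid HTML for a title.
function parsoidHtmlUrl(mwUrl, title) {
  const dbKey = title.trim().replace(/ /g, '_'); // MediaWiki titles use underscores
  return new URL(`/api/rest_v1/page/html/${encodeURIComponent(dbKey)}`, mwUrl).href;
}

// Step 2 (toy post-processing): keep links to articles that are part of the
// snapshot and replace links to missing articles with their plain text.
function postProcessForOffline(html, includedTitles) {
  const internalLink = /<a\b[^>]*\bhref="\.\/([^"#]*)[^"]*"[^>]*>(.*?)<\/a>/g;
  return html.replace(internalLink, (link, target, text) =>
    includedTitles.has(decodeURIComponent(target)) ? link : text);
}

console.log(parsoidHtmlUrl('https://es.wikipedia.org', 'Albert Einstein'));
// → https://es.wikipedia.org/api/rest_v1/page/html/Albert_Einstein

const sample = '<a rel="mw:WikiLink" href="./Madrid">Madrid</a> y ' +
  '<a rel="mw:WikiLink" href="./Toledo">Toledo</a>';
console.log(postProcessForOffline(sample, new Set(['Madrid'])));
// → <a rel="mw:WikiLink" href="./Madrid">Madrid</a> y Toledo
```

Unwrapping links to articles outside the selection is only one plausible kind of offline post-processing; it is shown here because a snapshot built from an article list would otherwise contain dead links.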
Environment setup hints
macOS
Install NodeJS:
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version
Install Redis:
brew install redis
Install libzim: Read these instructions
GNU/Linux - Debian based distributions
Install NodeJS:
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version
Install Redis:
sudo apt-get install redis-server
Releasing
- Update package.json
- Commit :package: Release version vX.X.X
- Run git tag vX.X.X
- Run git push origin master --tags
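The release steps above could be scripted. The sketch below only builds the commands as strings so they can be reviewed before being run; it is not part of MWoffliner. It assumes that updating package.json means bumping its version field (done here with npm version --no-git-tag-version, which edits the file without committing or tagging) and that versions follow the plain X.X.X pattern. The function name releaseCommands is hypothetical.

```javascript
// Hypothetical helper: returns the shell commands for releasing `version` (e.g. "1.2.3").
function releaseCommands(version) {
  // Assumption: plain X.X.X versions only, matching the vX.X.X pattern above.
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`Expected a version like 1.2.3, got "${version}"`);
  }
  const tag = `v${version}`;
  return [
    `npm version ${version} --no-git-tag-version`, // update package.json
    `git commit -am ":package: Release version ${tag}"`,
    `git tag ${tag}`,
    'git push origin master --tags',
  ];
}

// Print the commands for review instead of executing them.
console.log(releaseCommands('1.2.3').join('\n'));
```

Printing rather than executing keeps the sketch side-effect free; the output can be pasted into a shell once it looks right.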
