mwdumps

install

pip3 install mwdumps

usage

> mwdumps
Usage:
    mwdumps --wiki=<wiki_name> [--date=<date>] [--threads=<threads>]
        [--config=<config_file>] [--verbose] <output_path>
    mwdumps (-h | --help)
Options:
    --config=<config_file>       Configuration file containing a set of regexes,
                                    one per line, that matches dump files to be
                                    downloaded.
    --wiki=<wiki_name>           Abbreviation for wiki of interest.
    --date=<date>                Get dump on <date>. Defaults to most recent.
    --threads=<threads>          Number of parallel downloads [default: 3].
    -v, --verbose                Generate verbose output.
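
For scripted use, the option syntax above can be assembled programmatically. The following is a minimal sketch, not part of mwdumps itself: `build_command` is a hypothetical helper, and the file name `patterns.txt` and output directory `dumps` are placeholders. It only builds the argument list from the flags shown in the usage text and does not run anything.

```python
def build_command(wiki, output_path, date=None, threads=None,
                  config=None, verbose=False):
    """Build an mwdumps argument list from the flags in the usage text.

    Hypothetical helper for illustration; mwdumps does not ship it.
    """
    args = ["mwdumps", f"--wiki={wiki}"]
    if date is not None:
        # The expected <date> format is not stated in the usage text,
        # so the value is passed through unchanged.
        args.append(f"--date={date}")
    if threads is not None:
        args.append(f"--threads={threads}")
    if config is not None:
        args.append(f"--config={config}")
    if verbose:
        args.append("--verbose")
    # <output_path> is the single positional argument.
    args.append(output_path)
    return args


cmd = build_command("enwiki", "dumps", threads=4, config="patterns.txt")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once mwdumps is installed.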

config_file

The configuration file should contain a set of regexes, one per line, that match the filenames of the dump files to download. If it is omitted, all available files in the dump are downloaded.

Examples

English Wikipedia revision metadata, no page text:
enwiki-\d+-stub-meta-history\d+\.xml\.gz

Wikidata, all pages, current version only:
wikidatawiki-\d+-pages-meta-current\d+\.xml-p\d+p\d+\.bz2
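
To check a pattern before starting a large download, it can help to try it against a few file names locally. The sketch below is an approximation of the filtering described above, not the tool's actual code: it assumes each pattern is applied with `re.search` to the bare file name, which may differ from what mwdumps does internally. The helper names and the sample file names (including the date) are made up for illustration.

```python
import re


def load_patterns(text):
    """Compile one regex per non-empty line of a config file's contents."""
    return [re.compile(line.strip())
            for line in text.splitlines() if line.strip()]


def select_files(filenames, patterns):
    """Keep names matched by any pattern; with no patterns, keep them all."""
    if not patterns:
        return list(filenames)
    return [name for name in filenames
            if any(p.search(name) for p in patterns)]


# Contents of a hypothetical config file with a single pattern.
config_text = r"""
enwiki-\d+-stub-meta-history\d+\.xml\.gz
"""

# Illustrative file names, not taken from a real dump listing.
available = [
    "enwiki-20200101-stub-meta-history1.xml.gz",
    "enwiki-20200101-stub-meta-current1.xml.gz",
    "enwiki-20200101-pages-articles.xml.bz2",
]

selected = select_files(available, load_patterns(config_text))
print(selected)
```

Passing an empty pattern list returns every name, mirroring the documented behaviour when the configuration file is omitted.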

About

A simple utility for downloading Wikimedia dumps.
