DominicBurkart/wikipedia-revisions-server

  _      ___ __    _   ___           _     _                 ____                    
 | | /| / (_) /__ (_) / _ \___ _  __(_)__ (_)__  ___  ___   / __/__ _____  _____ ____
 | |/ |/ / /  '_// / / , _/ -_) |/ / (_-</ / _ \/ _ \(_-<  _\ \/ -_) __/ |/ / -_) __/
 |__/|__/_/_/\_\/_/ /_/|_|\__/|___/_/___/_/\___/_//_/___/ /___/\__/_/  |___/\__/_/   

This project serves Wikipedia revision differences for a given time period. It takes an HTTP request with a start datetime and an end datetime, and sends the matching revisions as a Brotli-compressed stream. Each line of the response stream is a JSON-encoded revision.
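
Until the documentation lands, here is a minimal client-side sketch of how such a response could be consumed. It is written under assumptions, not taken from the server's code: the query parameter names `start` and `end`, the ISO 8601 datetime format, and the base URL are all hypothetical. Brotli decompression is left to the caller (for example via a third-party package), so `iter_revisions` expects byte chunks that have already been decompressed.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_url(base_url, start, end):
    """Build a request URL.

    The `start` and `end` parameter names and the ISO 8601 format are
    assumptions; check the server source for the real request shape.
    """
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{base_url}?{query}"


def iter_revisions(chunks):
    """Yield one parsed revision per line from already-decompressed byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may end mid-line, so keep the trailing partial line buffered.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    # The stream may end without a final newline.
    if buffer.strip():
        yield json.loads(buffer)
```

Parsing line by line keeps memory use flat regardless of how many revisions the requested period contains, which matters because a long time range can produce a very large stream.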

documentation coming soon 🥧⏲️

Build the project:

docker build -t wikipedia-revisions-server .

Run (specifying working & storage directories, plus dump date):

docker run -it -v /local/path:/fast_dir -v /other/local/path:/big_dir wikipedia-revisions-server -d 20200601

If the data and index files have already been built, you can start the server without having to rebuild:

docker run -it -v /local/path:/fast_dir -v /other/local/path:/big_dir wikipedia-revisions-server

To find a valid date (the -d parameter), go to the wiki archives and pick a date that has .xml.bz2 files available to download under "All pages with complete page edit history".

See the Python wikipedia revisions repo for download targets and schemes other than those available here.

Thanks to JetBrains for providing an open source license to their IDEs for developing this project!

About

store and serve every wikipedia edit
