scalawiki

scalawiki is an experimental MediaWiki client written in Scala, currently in the early stages of development.

Join the chat at https://gitter.im/intracer/scalawiki

Why another client library for MediaWiki?

I didn't know of any Java client that supported generators (fetching properties of the articles returned by a list query in a single request). JWBF only recently (eldur/jwbf#21) got the ability to query more than one page at a time.

With Wikipedia sites being real Big Data, this is a show stopper. Fetching information about Wiki Loves Monuments uploads one page per request takes almost a day even for a single country, whereas fetching the same data in batches takes only several minutes.
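To make the generator idea concrete, here is a minimal sketch of the kind of request it refers to. It only builds the URL and makes no network call. The `GeneratorQuery` object and `queryUrl` helper are hypothetical names for illustration and are not part of scalawiki; the category title and the Commons endpoint are placeholder assumptions. The query parameters (`generator`, `gcmtitle`, `gcmlimit`, `prop`, `iiprop`) are standard MediaWiki API parameters.

```scala
import java.net.URLEncoder

// Hypothetical illustration, not scalawiki's API.
object GeneratorQuery {

  // Joins key/value pairs into a URL-encoded query string.
  def queryUrl(endpoint: String, params: Seq[(String, String)]): String = {
    val query = params
      .map { case (k, v) => s"$k=${URLEncoder.encode(v, "UTF-8")}" }
      .mkString("&")
    s"$endpoint?$query"
  }

  // One request: list up to 50 category members (the generator)
  // and fetch image info for each of them (the property).
  val params: Seq[(String, String)] = Seq(
    "action"    -> "query",
    "format"    -> "json",
    "generator" -> "categorymembers",
    "gcmtitle"  -> "Category:Example", // placeholder category
    "gcmlimit"  -> "50",
    "prop"      -> "imageinfo",
    "iiprop"    -> "timestamp|user"
  )

  val url: String = queryUrl("https://commons.wikimedia.org/w/api.php", params)
}
```

Without a generator, the same data would need one list request plus a separate property request per page, which is where the day-versus-minutes difference comes from.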

This library uses Scala Futures for easy job parallelization.
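As a rough illustration of that approach, the sketch below splits a list of titles into batches and processes the batches concurrently with standard-library Futures. It does not show scalawiki's actual API: `ParallelBatches`, `fetchBatch` and `fetchAll` are hypothetical names, and `fetchBatch` is a stand-in that computes title lengths where a real client would perform an HTTP request.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical illustration, not scalawiki's API.
object ParallelBatches {

  // Stand-in for one API request covering a batch of titles.
  def fetchBatch(titles: Seq[String]): Future[Seq[(String, Int)]] =
    Future(titles.map(t => t -> t.length))

  // Starts one Future per batch, then combines the results in the original order.
  def fetchAll(titles: Seq[String], batchSize: Int): Future[Seq[(String, Int)]] = {
    val batches = titles.grouped(batchSize).toList
    Future.sequence(batches.map(fetchBatch)).map(_.flatten)
  }

  def main(args: Array[String]): Unit = {
    val titles = (1 to 120).map(i => s"File:Example $i.jpg")
    val results = Await.result(fetchAll(titles, batchSize = 50), 10.seconds)
    println(s"Fetched ${results.size} results")
  }
}
```

`Future.sequence` turns the list of per-batch Futures into a single Future, so the caller waits once for all batches instead of blocking on each request in turn.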

Goals

  • Fully support the MediaWiki API
  • Support different backends: the MediaWiki API, XML dumps, and the MediaWiki database. Support copying data between backends (importing and exporting XML dumps to a database, storing data retrieved through the MediaWiki API in XML dumps or a database).
  • Good test coverage