Skip to content


Switch branches/tags

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Build Status Test coverage GitHub license PyPI version

Revision Scoring

A generic, machine learning-based revision scoring system designed to help automate critical wiki-work — for example, vandalism detection and removal. This library powers ORES.


Using a scorer_model to score a revision::

  import mwapi
  from revscoring import Model
  from revscoring.extractors.api.extractor import Extractor

  with open("models/enwiki.damaging.linear_svc.model") as f:
       scorer_model = Model.load(f)

  extractor = Extractor(mwapi.Session(host="",
                                          user_agent="revscoring demo"))

  feature_values = list(extractor.extract(123456789, scorer_model.features))

  {'prediction': True, 'probability': {False: 0.4694409344514984, True: 0.5305590655485017}}


The easiest way to install is via the Python package installer (pip).

pip install revscoring

You may find that some of the dependencies fail to compile (namely scipy, numpy and sklearn). In that case, you'll need to install some dependencies in your operating system.

Ubuntu & Debian:

  • Run sudo apt-get install python3-dev g++ gfortran liblapack-dev libopenblas-dev enchant
  • Run sudo apt-get install aspell-ar aspell-bn aspell-el aspell-id aspell-is aspell-pl aspell-ro aspell-sv aspell-ta aspell-uk myspell-cs myspell-de-at myspell-de-ch myspell-de-de myspell-es myspell-et myspell-fa myspell-fr myspell-he myspell-hr myspell-hu myspell-lv myspell-nb myspell-nl myspell-pt-pt myspell-pt-br myspell-ru myspell-hr hunspell-bs hunspell-ca hunspell-en-au hunspell-en-us hunspell-en-gb hunspell-eu hunspell-gl hunspell-it hunspell-hi hunspell-sr hunspell-vi voikko-fi


Using Homebrew and pip, installing revscoring and enchant can be accomplished as follows::

brew install aspell --with-all-languages
brew install enchant
pip install --no-binary pyenchant revscoring

Adding languages in aspell (MacOS only)

cd /tmp
bzip2 -dc aspell-pt-0.50-2.tar.bz2 | tar xvf -
cd aspell-pt-0.50-2
sudo make install

The differences between the aspell and myspell dictionaries can cause some of the tests to fail

Finally, in order to make use of language features, you'll need to download some NLTK data. The following command will get the necessary corpora.

python -m nltk.downloader omw sentiwordnet stopwords wordnet

You'll also need to install enchant-compatible dictionaries of the languages you'd like to use. We recommend the following:

  • languages.arabic: aspell-ar
  • languages.basque: hunspell-eu
  • languages.bengali: aspell-bn
  • languages.bosnian: hunspell-bs
  • languages.catalan: myspell-ca
  • languages.czech: myspell-cs
  • languages.croatian: myspell-hr
  • languages.dutch: myspell-nl
  • languages.english: myspell-en-us myspell-en-gb myspell-en-au
  • languages.estonian: myspell-et
  • languages.finnish: voikko-fi
  • languages.french: myspell-fr
  • languages.galician: hunspell-gl
  • languages.german: myspell-de-at myspell-de-ch myspell-de-de
  • languages.greek: aspell-el
  • languages.hebrew: myspell-he
  • languages.hindi: aspell-hi
  • languages.hungarian: myspell-hu
  • languages.icelandic: aspell-is
  • languages.indonesian: aspell-id
  • languages.italian: myspell-it
  • languages.latvian: myspell-lv
  • languages.norwegian: myspell-nb
  • languages.persian: myspell-fa
  • languages.polish: aspell-pl
  • languages.portuguese: myspell-pt-pt myspell-pt-br
  • languages.serbian: hunspell-sr
  • languages.spanish: myspell-es
  • languages.swedish: aspell-sv
  • languages.tamil: aspell-ta
  • languages.russian: myspell-ru
  • languages.ukrainian: aspell-uk
  • languages.vietnamese: hunspell-vi


To contribute, ensure to install the dependencies:

$ pip install -r requirements.txt

Install necessary NLTK data:

python -m nltk.downloader omw sentiwordnet stopwords wordnet

Running tests

Make sure you install test dependencies:

$ pip install -r test-requirements.txt

Then run:

$ pytest . -vv

Reporting bugs

To report a bug, please use Phabricator