The shared codebase for experiments on Wikimedia, part of the Study of Harassment and its Impact, and the WikiDetox Project.

This project aims to reconstruct conversation structure from Wikipedia diffs, annotate the structured conversations (by humans and machines), analyze the impact of harassment and other toxic contributions, and experiment with ways to visualize this information so that it is interesting and useful to Wikipedians.

For large scale machine scoring of comments, we use the Perspective API.
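As a rough sketch of what scoring a comment with the Perspective API looks like, the snippet below sends one comment to the `comments:analyze` endpoint and reads back the summary TOXICITY probability. The helper names (`build_toxicity_request`, `extract_toxicity`, `score_comment`) are illustrative, not part of this repository; you would need your own API key.

```python
import json
import urllib.request

ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_toxicity_request(comment_text):
    """Build the JSON body for a Perspective API TOXICITY request."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_toxicity(response):
    """Pull the summary TOXICITY probability (0.0-1.0) out of a response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

def score_comment(comment_text, api_key):
    """Send one comment to the Perspective API and return its toxicity score."""
    body = json.dumps(build_toxicity_request(comment_text)).encode("utf-8")
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return extract_toxicity(json.load(resp))
```

For large-scale scoring you would batch these calls and respect the API's rate limits; the request and response shapes above follow the public Perspective API schema.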

Below is an idea for how we might visualize toxic comments on Wikipedia (as red objects) and show when they are reverted (as grey objects), where object size represents the toxicity probability.
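The mapping described above can be sketched as a small function that turns one scored comment into a plot-marker spec. This is a hypothetical helper, not code from this repository; the color names and size scaling are illustrative choices.

```python
def comment_marker(toxicity, reverted):
    """Map one scored comment to a marker spec for the visualization:
    reverted comments are grey, live toxic comments red, and marker
    size scales with the toxicity probability (0.0-1.0).
    Hypothetical sketch; scaling constants are illustrative."""
    return {
        "color": "grey" if reverted else "red",
        # Keep a small minimum size so low-toxicity comments stay visible.
        "size": max(4.0, 40.0 * toxicity),
    }
```

A plotting library such as matplotlib could then draw each comment using the returned `color` and `size`.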


We're still exploring how we might visualize more information, such as:

  • Which are the more recent comments? Can we visualize when comments happen?
  • Can we cluster comments by topic in a meaningful way?
  • Can we make the visualization live, so that new comments 'fly' in when they are sent to Wikipedia?
  • Other cool ideas? File an issue with your idea :)


This code is an experimental Wikimedia & Conversation AI project; it is not an official Google product.