Plagiabot is a copyright violation detection bot.
Repository for Turnitin-based plagiarism detection for Wikipedia. See https://en.wikipedia.org/wiki/Wikipedia:Turnitin for details.
Running the bot
The bot supports the standard pywikibot page generators; for most of them it checks the latest revision of each page. The bot also supports special generators that check a specific edit based on its diff:
- recentchanges (DB based)
- recentchanges_api (API based)
- live - recent changes using streaming or IRC
See the command-line help for more details.
    valhallasw@lisilwen:~/src/plagiabot$ python -i plagiabot.py
    Logging in...
    Finding folder to upload into, with name 'Wikipedia'...
    Upload test text to iThenticate...
    Polling iThenticate until document has been processed... . . .
    Part #14558041 has a 62% match. Getting details...
    Details are available on https://api.ithenticate.com/report/14557806/similarity
    Sources found were:
    * I 62% 42 words at http://lrd.yahooapis.com/_ylc=X3oDMTVnb2
    * I 62% 42 words at http://lrd.yahooapis.com/_ylc=X3oDMTVncn
    * I 62% 42 words at http://lrd.yahooapis.com/_ylc=X3oDMTU4aD
    * I 62% 42 words at http://www.games2.about2006.com/aboutsit
    * I 62% 42 words at http://medlibrary.org/medwiki/All_your_b
    * I 62% 42 words at http://plumbot.com/All_your_base_are_bel
    * I 62% 42 words at http://lembolies.com/Your
    * I 62% 42 words at http://dvdradix.com/capture-flash-video-
    * I 62% 42 words at http://www.dvdradix.com/capture-flash-vi
    * I 62% 42 words at http://www.reachinformation.com/define/A
    * I 60% 41 words at http://www.reference.com/browse/wiki/se/
    * I 60% 41 words at http://www.buellersdownunder.com/archive
    * I 38% 26 words at http://lrd.yahooapis.com/_ylc=X3oDMTVnbm
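The "Sources found were" lines in the sample output follow a regular shape (a marker, a match percentage, a word count, and a URL), so they can be turned into structured records for further processing. The following is a minimal sketch only: `SourceMatch` and `parse_sources` are hypothetical names that are not part of plagiabot, and the pattern is inferred from the sample run above rather than from a documented output format. The meaning of the `I` column is not stated in this README, so it is matched as an opaque token.

```python
import re
from typing import List, NamedTuple


class SourceMatch(NamedTuple):
    """One line of the 'Sources found were' list (hypothetical helper type)."""
    percent: int
    words: int
    url: str


# Pattern inferred from the sample output, e.g.
#   * I 62% 42 words at http://lembolies.com/Your
# The token after '*' is skipped because its meaning is not documented here.
_SOURCE_LINE = re.compile(r"^\*\s+\S+\s+(\d+)%\s+(\d+)\s+words\s+at\s+(\S+)$")


def parse_sources(output: str) -> List[SourceMatch]:
    """Collect every source line found in the bot's console output."""
    matches = []
    for line in output.splitlines():
        m = _SOURCE_LINE.match(line.strip())
        if m:
            matches.append(SourceMatch(int(m.group(1)), int(m.group(2)), m.group(3)))
    return matches


sample = """Sources found were:
* I 62% 42 words at http://lembolies.com/Your
* I 60% 41 words at http://www.reference.com/browse/wiki/se/
"""

for source in parse_sources(sample):
    print(source.percent, source.words, source.url)
```

Lines that do not match the pattern (such as the "Sources found were:" header) are ignored, so the whole console output can be passed in unchanged.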
You can query suspected diffs using the API available at: http://tools.wmflabs.org/eranbot/plagiabot/api.py
The bot supports English, French, Portuguese and Hebrew.
To run the bot on a new language:
- Make sure the iThenticate backend indexes pages in the desired language: http://www.ithenticate.com/products/faqs
- Add the relevant messages for that language, which help the bot skip reverted edits
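As an illustration of the second step, one way per-language messages could be used to skip reverted edits is to match an edit summary against a small list of revert phrases for that language. This is a sketch under stated assumptions, not the bot's actual mechanism: `REVERT_PATTERNS` and `is_revert` are hypothetical names, the phrases are only examples of typical revert summaries, and where plagiabot really stores its messages is not described in this README.

```python
import re
from typing import Dict, List

# Hypothetical per-language revert phrases (illustrative only; the real
# messages used by plagiabot may differ and live elsewhere).
REVERT_PATTERNS: Dict[str, List[str]] = {
    "en": [r"\breverted\b", r"\bundid revision\b"],
    "fr": [r"\brévocation\b", r"\bannulation\b"],
}


def is_revert(lang: str, edit_summary: str) -> bool:
    """Return True if the summary looks like a revert in the given language."""
    patterns = REVERT_PATTERNS.get(lang, [])
    return any(re.search(p, edit_summary, re.IGNORECASE) for p in patterns)


print(is_revert("en", "Reverted edits by Example (talk)"))
print(is_revert("en", "Added two sources"))
```

With this layout, supporting another language would mean adding one more key to the dictionary; a language with no entry is treated as having no known revert messages.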
The bot can either generate simple wiki report pages, or write to a database to be used by other tools.
See also: https://github.com/wikimedia/CopyPatrol