This repository has been archived by the owner on Mar 1, 2023. It is now read-only.

Add mechanism for detecting suspicious download spikes #238

Open
rabdill opened this issue Feb 20, 2019 · 0 comments
Labels
spider Issue with the web crawler

Comments

@rabdill
Collaborator

rabdill commented Feb 20, 2019

The motivating case is this paper:
https://rxivist.org/papers/8472
It had 33,000+ downloads added by a bot. A sample size of 1 is a terrible basis for detecting these things going forward, but can we develop some kind of rule that flags suspicious patterns? Could the rule simply be "an unreasonable increase in the download count of a single month, compared to the months on either side"? Is there a tight enough correlation between tweets and downloads that we could use that?
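To make the "single month versus its neighbors" idea concrete, here is a minimal sketch of such a rule. Everything in it is an assumption for illustration: the function name `flag_spikes`, the `ratio` and `min_downloads` thresholds, and the sample numbers are hypothetical and not taken from the Rxivist codebase or its data.

```python
def flag_spikes(monthly_downloads, ratio=10.0, min_downloads=1000):
    """Return indices of months that look like suspicious spikes.

    A month is flagged when its count is at least `min_downloads` and
    more than `ratio` times the larger of its two neighboring months.
    Both thresholds are placeholder guesses, not tuned values.
    The first and last months are skipped because they lack a
    neighbor on one side.
    """
    flagged = []
    for i in range(1, len(monthly_downloads) - 1):
        current = monthly_downloads[i]
        # Compare against the busier neighbor so that a genuine,
        # sustained rise in interest is not flagged.
        neighbor = max(monthly_downloads[i - 1], monthly_downloads[i + 1])
        # Treat a zero-download neighbor as 1 so the ratio test stays meaningful.
        if current >= min_downloads and current > ratio * max(neighbor, 1):
            flagged.append(i)
    return flagged


# Made-up monthly counts: one isolated month dwarfs the months around it.
suspicious = [120, 150, 33500, 140, 130]
# Made-up counts with a gradual rise and fall, which should not be flagged.
organic = [800, 2000, 5000, 4000, 1500]

print(flag_spikes(suspicious))
print(flag_spikes(organic))
```

Comparing against the larger neighbor keeps the rule conservative: a paper that gains popularity over several months is left alone, while a one-month jump that vanishes the next month is flagged. A rule like this cannot evaluate a paper's newest month until the following month's count is available.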

rabdill added the spider (Issue with the web crawler) label Feb 20, 2019