Public crawler and data from Tripwire: Inferring Internet Site Compromise
Latest commit 206ddb9 Nov 3, 2017
casperjs/iframe
de-captcher/api_php
deployed
identities
logger
mail
proxy-watch
redbeat
runners
README.md
alexa.src
public_data.csv
schema.sql

Tripwire

This repository stores the public source code for the registration crawler and the data used in the paper Tripwire: Inferring Internet Site Compromise, presented at IMC 2017.

Please direct questions to Joe DeBlasio.

Crawler Source

While we provide complete source for the crawler, I highly discourage you from actually trying to run it, and you do so at your own risk. If, however, you are interested in the heuristics that our crawler uses, or how the system works, the code is all here!

But really, if you've been tasked with getting this crawler running: abandon all hope, ye who enter here. This code is very old, very fragile, and requires a lot of moving parts to get working well.

Data

See public_data.csv for a dump of the login events database. This CSV has headers, but a fuller description of the fields is forthcoming.
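
Until the field descriptions are available, one way to get oriented is to read the header row and count how often each column is actually filled in. The following is a minimal sketch using only Python's standard library. The function name summarize_csv is hypothetical and not part of this repository, and the sketch assumes the file is UTF-8 encoded and comma-delimited. It makes no assumptions about the column names themselves.

```python
import csv
import os
from collections import Counter


def summarize_csv(path, max_rows=None):
    """Return (header, row_count, filled) for a CSV file with a header row.

    `filled` maps each column name to the number of rows in which that
    column holds a non-blank value. Hypothetical helper, not repo code.
    """
    # Assumption: UTF-8 text, comma-delimited.
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file is empty
        filled = Counter()
        rows = 0
        for row in reader:
            if max_rows is not None and rows >= max_rows:
                break
            rows += 1
            # zip() stops at the shorter sequence, so short rows are tolerated.
            for name, value in zip(header, row):
                if value.strip():
                    filled[name] += 1
    return header, rows, filled


if __name__ == "__main__" and os.path.exists("public_data.csv"):
    header, rows, filled = summarize_csv("public_data.csv")
    print(f"{rows} rows, {len(header)} columns")
    for name in header:
        print(f"{name}: {filled[name]} non-empty values")
```

Run from the repository root, this prints the column names in file order along with how many rows carry a value in each, which is usually enough to guess which fields are identifiers, timestamps, or optional annotations. Pass max_rows to sample only the start of the file.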