No description, website, or topics provided.
Python
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
README
anon_sim.py
data_analyzer.py
irc_crawler.py
irc_parse.py
parse.py
run.sh
twitter_crawl.py
twitter_parse.py

README

AnonymitySimulator with Web Crawlers

Overview
===============================================================================
This package contains both an AnonymitySimulator along with data set gathers
for creating datasets that can be used in the AnonymitySimulator.

All utilities have been built uisng Python2 and have not been tested on
Python3.

irc_crawl.py requires the Python irc package to be installed:
http://python-irclib.sourceforge.net

twitter_crawl.py requires Python Twitter package to be installed:
https://github.com/bear/python-twitter

irc_crawl.py
===============================================================================
python2 irc_crawl.py [--server=irc.freenode.org] [--port=6667] 
                     [--channel=#ubuntu] [--username=dedis] [--output=data]
                     [--debug]
  server - IRC server to connect to
  port - Port for the IRC server
  channel - IRC channel to connect to
  username - Preferred IRC username
  output - Where to store the crawl data upon completion
  debug - Print debug (output information)

The data will be stored in a pickled list using the following tuple format:
(float time, string event, var data)
Event types (and data):
  - join : data = name
  - quit : data = name
  - nick : data = (old name, new name)
  - msg : data = (name, msg)
  - whois : data = (name, host)

twitter_crawl.py
===============================================================================
python2 twitter_crawl.py --consumer_key= --consumer_secret=
                         --access_token_key= --access_token_secret=
                         [--output=data] [--min_followers=1000000] [--debug]
  consumer_key - A user's consumer key
  consumer_secret - A user's consumer secret
  access_token_key - OAuth access token key
  access_token_key - OAuth access token secret
  output - Where to store the crawl data upon completion
  min_followers - FOllow a user with at least this many followers
  debug - Print debug (output information)

The data will be stored in a pickled dictionary containing two lists:
"statuses" - the list of statuses
"userids" - the list of followed userids

main.py
===============================================================================
Brings together the parsing engines and the AnonymitySimulator to evaluate
the data set for the effects on churn over time on the anonymity set.

python2 main.py [--input=data] [--type=irc] [--min_anon=0] [--debug]
  input - Where the crawled data is
  type - the parsing engine to use for the dataset
  min_anon - the minimum anonymity as an integer allowed before the system
    would automatically ignore future messages coming from that slot
  debug - Print debug (output information)

Parsers: irc, twitter

Parsers should produce a set of users with UIDs ranging from 0 to
len(users) - 1 and data in the format:
(float time, string event, var data)
Event types (and data):
  - join : data = uid
  - quit : data = uid
  - msg : data = (uid, msg)

Other files
===============================================================================
irc_parse.py - Contains the Irc parser
anon_sim.py - Class library for evaluating anonymity sets over a data set