Mass IPv4 WHOIS Collection Tool

There is a blog post that accompanies this code base.

Running this code base locally

The following allows you to run this code base locally. These commands were tested on a fresh installation of Ubuntu 14.04.3 LTS.

The following will install the required dependencies via Debian packages.

$ sudo apt-get update
$ sudo apt-get install -y \
    default-jre \
    jq \
    python-dev \
    python-pip \
    python-virtualenv \
    redis-server \
    zip \
    zookeeperd
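
After installation it can be useful to confirm that the expected executables are on the PATH before continuing. The sketch below is one way to do that with the Python 3 standard library. The executable names in ASSUMED_COMMANDS are assumptions about what the packages above provide, not something this code base defines, and the helper name missing_commands is hypothetical.

```python
import shutil

# Executable names assumed to be provided by the packages above; adjust them
# if your distribution installs different names.
ASSUMED_COMMANDS = ["java", "jq", "pip", "virtualenv", "redis-server", "zip"]


def missing_commands(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands(ASSUMED_COMMANDS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All expected commands were found.")
```

ZooKeeper is left out of the list because the zookeeperd package runs it as a service rather than providing a command to look up.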

The following will install and launch a known-good version of Kafka.

$ cd /tmp
$ curl -O http://mirror.cc.columbia.edu/pub/software/apache/kafka/0.8.2.1/kafka_2.11-0.8.2.1.tgz
$ tar -xzf kafka_2.11-0.8.2.1.tgz
$ cd kafka_2.11-0.8.2.1/
$ nohup bin/kafka-server-start.sh \
    config/server.properties \
    > ~/kafka.log 2>&1 &
$ export PATH="`pwd`/bin:$PATH"

The following will create the results and metrics topics in Kafka.

$ kafka-topics.sh \
    --zookeeper 127.0.0.1:2181 \
    --create \
    --partitions 1 \
    --replication-factor 1 \
    --topic results

$ kafka-topics.sh \
    --zookeeper 127.0.0.1:2181 \
    --create \
    --partitions 1 \
    --replication-factor 1 \
    --topic metrics
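
The two commands above differ only in the topic name, so they can also be driven from a short loop. The following is a minimal sketch in Python 3 that builds the same argument list for each topic. The function name create_topic_command is hypothetical; the flags are the ones shown above. If kafka-topics.sh is not on the PATH, the sketch only prints the command it would have run.

```python
import shutil
import subprocess


def create_topic_command(topic, zookeeper="127.0.0.1:2181"):
    """Build the argument list for creating one single-partition topic."""
    return [
        "kafka-topics.sh",
        "--zookeeper", zookeeper,
        "--create",
        "--partitions", "1",
        "--replication-factor", "1",
        "--topic", topic,
    ]


if __name__ == "__main__":
    for topic in ("results", "metrics"):
        command = create_topic_command(topic)
        if shutil.which(command[0]) is None:
            # Kafka's bin directory is not on PATH, so only show the command.
            print("Would run: " + " ".join(command))
        else:
            subprocess.check_call(command)
```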

The following will launch Redis.

$ redis-server &

The following will create a virtual environment and install various Python-based dependencies.

$ virtualenv .ips
$ source .ips/bin/activate
$ pip install -r requirements.txt -r requirements-dev.txt

The following will bootstrap a local database.

$ cd ips
$ python manage.py migrate

The following will generate 4.7 million seed IP addresses that will be used by workers.

$ python manage.py gen_ips

Set the coordinator IP address:

$ python manage.py set_config 127.0.0.1

The following launches the web interface for the coordinator.

$ python manage.py runserver &

The following launches the lookup workers, the telemetry reporter and the process that collects IP addresses from the coordinator.

$ python manage.py celeryd --concurrency=5 &
$ python manage.py celerybeat &
$ python manage.py get_ips_from_coordinator &

The following launches the process that collects the WHOIS records and stores unique CIDR blocks in Redis.

$ python manage.py collect_whois &
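
To illustrate what storing unique CIDR blocks involves, here is a minimal sketch in Python 3. It is not the project's collect_whois code: a plain Python set stands in for a Redis set, and the function names add_cidr and is_covered are hypothetical. The is_covered helper shows one possible use of such a set, namely checking whether an address already falls inside a known block.

```python
import ipaddress


def add_cidr(seen, cidr):
    """Record a CIDR block; return True only if it was not seen before.

    `seen` is a plain Python set standing in for a Redis set (an assumption
    made for this sketch, not the project's actual storage code).
    """
    # strict=False lets "192.0.2.77/24" normalise to "192.0.2.0/24".
    network = str(ipaddress.ip_network(cidr, strict=False))
    if network in seen:
        return False
    seen.add(network)
    return True


def is_covered(seen, ip):
    """Return True if `ip` falls inside any block recorded so far."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(block) for block in seen)


if __name__ == "__main__":
    blocks = set()
    for cidr in ["192.0.2.0/24", "192.0.2.77/24", "198.51.100.0/25"]:
        print(cidr, "new" if add_cidr(blocks, cidr) else "duplicate")
    print(is_covered(blocks, "198.51.100.5"))
```

Normalising each block before storing it means the same network written with different host bits is counted once.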

Monitoring

To see aggregated telemetry:

$ python manage.py telemetry

To monitor Celery's activity:

$ watch 'python manage.py celery inspect stats'

To see the results of successful WHOIS queries:

$ kafka-console-consumer.sh \
    --zookeeper localhost:2181 \
    --topic results \
    --from-beginning

To continuously dump results to a file:

$ kafka-console-consumer.sh \
    --zookeeper localhost:2181 \
    --topic results \
    --from-beginning > output &

To see per-minute metrics from the workers:

$ kafka-console-consumer.sh \
    --zookeeper localhost:2181 \
    --topic metrics \
    --from-beginning

Deployment

To run Ansible against hosts on a cloud service, first create an inventory file like the following, replacing each x.x.x.x with the address of the corresponding host.

$ vi devops/inventory

[coordinator]
coord1 ansible_host=x.x.x.x ansible_user=ubuntu ansible_private_key_file=~/.ssh/ec2.pem

[worker]
worker1 ansible_host=x.x.x.x ansible_user=ubuntu ansible_private_key_file=~/.ssh/ec2.pem
worker2 ansible_host=x.x.x.x ansible_user=ubuntu ansible_private_key_file=~/.ssh/ec2.pem
worker3 ansible_host=x.x.x.x ansible_user=ubuntu ansible_private_key_file=~/.ssh/ec2.pem
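
With more than a handful of workers, writing the inventory by hand becomes tedious. The sketch below, in Python 3, renders the same layout from a list of addresses. The function name render_inventory is hypothetical, the default user and key file are copied from the example above, and the addresses in the usage section are placeholders from the 203.0.113.0/24 documentation range.

```python
def render_inventory(coordinator_ip, worker_ips,
                     user="ubuntu", key_file="~/.ssh/ec2.pem"):
    """Return inventory text with one coordinator and numbered workers."""
    def host_line(name, ip):
        return ("{0} ansible_host={1} ansible_user={2} "
                "ansible_private_key_file={3}").format(name, ip, user, key_file)

    lines = ["[coordinator]", host_line("coord1", coordinator_ip), "", "[worker]"]
    for number, ip in enumerate(worker_ips, start=1):
        lines.append(host_line("worker{0}".format(number), ip))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder addresses; substitute the real addresses of your hosts.
    print(render_inventory("203.0.113.10", ["203.0.113.11", "203.0.113.12"]), end="")
```

The printed text can be redirected into devops/inventory.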

To provision and deploy, run the following from the repository root:

$ zip -r \
    app.zip \
    ips/ *.txt \
    -x '*.sqlite3' \
    -x '*.pid' \
    -x '*.pyc'

$ cd devops
$ ansible-playbook bootstrap.yml
