ServeWPM is a docker image for running OpenWPM.
ServeWPM can run OpenWPM locally with machines that have Docker installed. It can also be run remotely on AWS.
There are three main components: the OpenWPM framework, the Jupyter Notebook server used to run the tests, and the Django ORM (Object-Relational Mapping) layer for working with OpenWPM's output.
This step can take more than 10 minutes without any caching.
You will need to have docker installed, as well as django and jupyter.
The passwd()
function will prompt for a new password that is used to protect the Jupyter Notebook.
git clone https://github.com/ughe/ServeWPM ServeWPM
cd ServeWPM
docker build \
--build-arg DJANGO_SECRET_KEY=$(python -c 'from django.core.management.utils import get_random_secret_key; print get_random_secret_key()') \
--build-arg HOST_IP=0.0.0.0 \
--build-arg JUPYTER_PASSWORD_HASH=$(python -c 'from notebook.auth import passwd; print passwd()') \
-t servewpm .
N.B.: The Dockerfile
pulls the latest files from the OpenWPM
git repo, and it can be modified to use a tagged release curl -L https://github.com/citp/OpenWPM/archive/v0.8.0.zip -o OpenWPM.zip
etc... if needed.
This deploys the Jupyter notebook to Port 80. Also opens port 8000 for the optional Django server. The -d
flag can be added to run the docker container in the background.
docker run -p 80:80 -p 8000:8000 servewpm
Go to the URL (i.e. 127.0.0.1), and open the README.ipynb
Jupyter notebook. The notebook starts by running a version of the demo.py
example program from OpenWPM. Once the website tests are run, the tutorial continues with examining the output files in a directory called $PROJECT_NAME/
using the Django ORM and Sqlite3.
docker exec -it $(docker ps -q) bash
This works if there is only one container running. Otherwise, first run docker ps
and then docker exec -it ef2e47fbfa1b bash
where ef2e47fbfa1b
is the CONTAINER ID
of the correct container instance.
This will expose the secret key or password hash if public.
docker save servewpm -o servewpm.tar
docker load -i servewpm.tar
docker kill $(docker ps -q)
The docker image creates three primary sibling directories: OpenWPM
, ServeWPM
(django), and notebooks
.
In notebooks
, there is a special directory called exports
where Django looks for the crawl-data.sqlite
file.
In order to import the OpenWPM
project files from a Jupyter notebook in notebooks
, the path is inserted before any OpenWPM imports:
sys.path.insert(0, os.environ['FRAMEWORK'])
# OpenWPM imports below:
from automation import TaskManager, CommandSequence
The Django Server provides a framework for interacting with the ORM and serving web pages.
Static files are currently not being served. To see css and other static files, set DEBUG = True
in the file ServeWPM/ServeWPM/settings.py
.
To log into the admin site, the credentials need to be set up. First, create a new IPython notebook in Jupyter by using the Django Shell-Plus
option.
Next, create a superuser (admin):
from django.contrib.auth.models import User
User.objects.create_superuser('admin', 'admin@example.com', 'CHANGE_ME')
Finally, go to admin on the 8000 port (i.e. 127.0.0.1:8000/admin) and log in.
For reference, the Django export
models were generated with python $DJANGO/manage.py inspectdb > $DJANGO/export/models.py
. The output file models.py
needs to be edited to conform to Django's rules about id
's.
ServeWPM is fine for local development. It is not advised to deploy online without caution. Security concerns include Jupyter Notebook's root access, password handling, and more.
To deploy to EC2, first build the image locally as described above. Then run docker save servewpm -o servewpm.tar
.
Next launch a new EC2 instance, and configure it, including the Security Group to expose ports 80 (jupyter) and 8000 (django). Also create a key pair when prompted and save it in the current working directory.
Then select the instance on the dashboard and choose Connect
. Replace EC2
, below, with the Public DNS shown (i.e. ec2-user@ec2-48-458-381-20.compute-1.amazonaws.com) for the instace. Also replace CERT
with the path to the certificate generated previously. servewpm.tar
should be in the current working directory. Then run the script below.
# Replace the values below
CERT=certificate.pem
EC2=ec2-user@ec2-xx-xxx-xxx-xx.compute-1.amazonaws.com
scp -i $CERT servewpm.tar $EC2:/home/ec2-user
ssh -i $CERT $EC2 sudo yum update -y
ssh -i $CERT $EC2 sudo yum install -y docker
ssh -i $CERT $EC2 sudo service docker start
ssh -i $CERT $EC2 sudo docker load -i /home/ec2-user/servewpm.tar
ssh -i $CERT $EC2 sudo docker run -d -p 80:80 -p 8000:8000 servewpm
ElasticBeanstalk does not fully support Docker because there is no option for setup Args and the OpenWPM
build actually fails when deployed by EB with a WebDriverException: Message: connection refused
. This issue does not occur in the docker container locally on Mac, Windows 10, or a manually-configure Amazon or Ubuntu EC2 instance.
To attempt an EB deployment, compress the ServeWPM files. In unix you might use: zip -r -X ServeWPM.zip .ebextensions Dockerfile notebooks ServeWPM
. On Windows, too, make sure to zip each individual item together, do not compress the parent directory.
In the AWS console, go to ElasticBeanstalk. Create a new environment in a new application. For the Platform
choose Preconfigured platform
then Docker
. For Application Code
choose Upload your code
and upload the compressed project. Finally, choose Create environment
. The deploy may take at least fifteen minutes as it builds the image instead of using a pre-built one.