MVP of an OpenWPM-based crawl setup for Webcompat analysis


Webcompat Crawls

Configuration and instructions used to crawl top sites with a specially instrumented version of Firefox, gathering data for Webcompat analysis.

Generate the seed list via a series of pre-crawls

See ./crawl-prep/README.md.

Run an OpenWPM crawl in Google Cloud Platform

See ./crawl-engineering/gcp/README.md.

Developer notes

To update the OpenWPM Crawler and crawl-prep submodules to the latest commits in the remotely tracked branches:

git submodule update --remote
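
The command above only changes the submodule checkouts in the working tree; the parent repository still records the old commits until the new pointers are committed. The following is a minimal sketch of a wrapper that runs the update and then reports which submodules moved. The function name `update_submodules` and its optional directory argument are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with this repository):
# update every submodule to the tip of its remotely tracked branch,
# then list the submodules so changed pointers are easy to spot.
update_submodules() {
    # Directory of the parent repository; defaults to the current directory.
    repo_dir="${1:-.}"

    # Fetch and check out the latest commit of each submodule's tracked branch.
    git -C "$repo_dir" submodule update --remote || return 1

    # A leading "+" in the output marks a submodule whose checked-out commit
    # differs from the commit recorded in the parent repository.
    git -C "$repo_dir" submodule status
}
```

After calling `update_submodules` from the repository root, the changes can be reviewed with `git diff --submodule` and then staged and committed in the parent repository like any other change.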