This repo downloads, unpacks, and analyses web extensions, looking at their security, privacy, and accessibility.
- TODO Add paper link
- Dataset
- Supplementary Material
Python 3.11.5
- Conda
- environment.yaml
- PIP
- Adblockparse
Other
- Docker
- docker-compose
Post-Processor
- Rust
- Go
Cloning including submodules:
git clone --recurse-submodules git@github.com:JamesMClarke/accessibility_extensions.git

All files will be saved to the extensions/<name_of_txt> folder.
First, we need to get a list of extensions. This can be done with get_urls.mjs, which needs only the URL of the page to collect extension URLs from, for example https://chromewebstore.google.com/search/example.
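When this step is scripted, the two ways of calling get_urls.mjs differ only in the -c flag used for category pages. The following is a minimal sketch of building the command from Python. The helper name `build_get_urls_command`, the output file names, and the rule that category pages have a path starting with `/category/` are assumptions made for illustration, not part of the repository.

```python
from urllib.parse import urlparse


def build_get_urls_command(url: str, file_name: str) -> list[str]:
    """Return the argument list for get_urls.mjs (hypothetical helper)."""
    cmd = ["node", "get_urls.mjs"]
    # Assumption: Chrome Web Store category pages live under /category/,
    # while search pages live under /search/.
    if urlparse(url).path.startswith("/category/"):
        cmd.append("-c")
    cmd += [url, file_name]
    return cmd


# The resulting list can be passed to subprocess.run(cmd, check=True).
search_cmd = build_get_urls_command(
    "https://chromewebstore.google.com/search/example", "example.txt"
)
```

The script itself is invoked as follows: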
# For a search
node get_urls.mjs <url> <file_name>
# For a category
node get_urls.mjs -c <url> <file_name>

Next, we need to download the CRX files for each extension. While doing this, the script also gathers data about each extension, such as its name, developer, rating, and number of downloads. This can be done using the script download_extensions.py, for example:
python3 download_extensions.py <file.txt>
# See below command for full details
python3 download_extensions.py -h

After this, we need to prepare the CRX files for analysis. We start by extracting them, which turns each CRX file into a normal folder. After this, we try to deobfuscate and beautify the code where possible using JS Beautify. To run:
python3 preprocess.py <file.sqlite>
# See below for full details
python3 preprocess.py -h

We then extract the permissions, manifest version, and hosts of each extension and write them to two CSV files.
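To illustrate what this kind of extraction involves, here is a minimal sketch that reads an unpacked extension's manifest.json and separates named permissions from host match patterns. It is not the repository's get_manifest.py: the function names and the returned dictionary layout are assumptions, and it only looks at the `permissions` and `host_permissions` keys (Manifest V2 mixes host patterns into `permissions`, Manifest V3 lists them under `host_permissions`).

```python
import json
from pathlib import Path


def is_host_pattern(entry: str) -> bool:
    """Treat match patterns such as https://*/* or <all_urls> as hosts."""
    return entry == "<all_urls>" or "://" in entry


def summarise_manifest(manifest: dict) -> dict:
    """Split a parsed manifest into version, API permissions, and hosts."""
    entries = [p for p in manifest.get("permissions", []) if isinstance(p, str)]
    permissions = [p for p in entries if not is_host_pattern(p)]
    hosts = [p for p in entries if is_host_pattern(p)]
    # Manifest V3 declares host patterns in a separate key.
    hosts += manifest.get("host_permissions", [])
    return {
        "manifest_version": manifest.get("manifest_version"),
        "permissions": sorted(set(permissions)),
        "hosts": sorted(set(hosts)),
    }


def load_manifest(extension_dir: str) -> dict:
    """Read manifest.json from an unpacked extension folder."""
    text = Path(extension_dir, "manifest.json").read_text(encoding="utf-8-sig")
    return json.loads(text)
```

For example, `summarise_manifest(load_manifest("path/to/unpacked_extension"))` returns one dictionary per extension, which could then be written out with `csv.DictWriter`. In this repository the step itself is run with: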
python3 get_manifest.py <file.sqlite>

We can then run a crawl of a test site using all extensions. Doing so records the HTML of the page, the network traffic while visiting it (captured with mitmproxy), VisibleV8 logs, the accessibility tree, and WAVE and Google Lighthouse results.
# Run the crawl
python3 run_crawl.py <file.sqlite> <time in seconds>

We can now run our post-processor. We rely on VisibleV8's post-processor to format VisibleV8's results.
First, we need to build VV8's post-processor:
cd post-processor/VisibleV8/post-processor
make
cd AXECC/post-processor
cp VisibleV8/post-processor/artifacts/* .

We can now run the post-processor:
# Change directory to the post-processor
cd post-processor
# Run the post-processor
# Note: if running on an HPC system or similar, there needs to be storage available in /tmp
python3 postProcess.py ../extensions/folder/file.sqlite
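
Because the post-processor needs storage in /tmp, it can help to check the free space before starting a long run, especially on shared HPC nodes. The following is a minimal sketch using only the standard library. The function name `has_free_space` and the 5 GiB threshold are assumptions chosen for illustration; the actual space required depends on the size of the crawl.

```python
import shutil

# Hypothetical threshold; adjust to the size of your crawl output.
REQUIRED_BYTES = 5 * 1024**3


def has_free_space(path: str = "/tmp", required_bytes: int = REQUIRED_BYTES) -> bool:
    """Return True if the filesystem holding `path` has enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes
```

Calling `has_free_space()` before launching postProcess.py gives an early warning instead of a failure partway through processing.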