Official repository for "Knowledge Expansion and Counterfactual Interaction for Reference-Based Phishing Detection". Published in USENIX Security 2023.
DynaPD 6K Phishing Kits Online Version • DynaPD 6K Phishing Kits Source Code • WebInteraction Driver: MyXdriver • Paper • Website • Citation
Existing reference-based phishing detectors:
- ❌ Rely on a limited reference list which cannot adapt to temporal (e.g. emerging cryptocurrency brands) and regional (e.g. local brands) interests
- ❌ Are unable to address logo-less phishing webpages
- ❌ Use non-interactive benchmark datasets as the test environment
In this work, we propose DynaPhish, a framework that serves as a complementary module for any reference-based phishing detector. Our contributions are threefold:
- ✅ We automatically expand the reference list on the fly, using a popularity-based validation mechanism to ensure that each added reference is benign.
- ✅ We are the first to introduce behavioral intention, which makes phishing decisions by observing suspicious behaviors during the login action.
- ✅ We publish DynaPD, which includes 6K clean and live phishing kits that are safe and interactable. Download from here: DynaPD. Visit the online demo here: DynaPD Dataset Demo.
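The popularity-based validation idea from the first contribution can be illustrated with a minimal sketch. It is not the repository's implementation: the function names `hostname` and `is_popular`, the `top_k` cutoff, and the rule "a domain counts as popular if it appears among the top search results" are all assumptions made for illustration, and the search results are passed in as a plain list rather than fetched from a search API.

```python
from urllib.parse import urlparse


def hostname(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    # Bare domains such as "example.com" need a "//" prefix to be parsed as a host.
    parsed = urlparse(url if "://" in url else "//" + url)
    return (parsed.hostname or "").removeprefix("www.")


def is_popular(candidate_url, search_result_urls, top_k=10):
    """Hypothetical check: does the candidate's host appear in the top-k results?

    search_result_urls is assumed to be the ranked list of URLs that a search
    engine returned when queried for the candidate domain.
    """
    target = hostname(candidate_url)
    if not target:
        return False
    for result in search_result_urls[:top_k]:
        host = hostname(result)
        # Accept an exact match or a subdomain of the candidate host.
        if host == target or host.endswith("." + target):
            return True
    return False
```

Under these assumptions, a domain that never shows up in the ranked results for its own name would not be added to the reference list.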
We include the knowledge expansion part in this repository.
|_ knowledge_expansion: Knowledge Expansion Module
|_ brand_knowledge_online.py: Knowledge Expansion Class
Tested on Ubuntu, CUDA 11
- Install the required packages by
conda create -n dynaphish python=3.10
conda activate dynaphish
pip install -r requirements.txt
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install --no-build-isolation git+https://github.com/facebookresearch/detectron2.git
cd knowledge_expansion/phishintention
chmod +x setup.sh
./setup.sh
- Create a Google Cloud service account and set the billing details
- Create a project, enable "Custom Search API", "Cloud Vision API"
- For "Custom Search API", get the API Key and Search Engine ID following this guide.
- Create a blank text file at knowledge_expansion/api_key.txt, then copy and paste your API Key and Search Engine ID into it like the following:
[YOUR_API_KEY]
[YOUR_SEARCH_ENGINE_ID]
- Create a service account and a key for it following this guide, and save the JSON to knowledge_expansion/discoverylabel.json:
{
  "type": "service_account",
  "project_id": "PROJECT_ID",
  "private_key_id": "KEY_ID",
  "private_key": "-----BEGIN PRIVATE KEY-----\nPRIVATE_KEY\n-----END PRIVATE KEY-----\n",
  "client_email": "SERVICE_ACCOUNT_EMAIL",
  "client_id": "CLIENT_ID",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/SERVICE_ACCOUNT_EMAIL"
}
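Before running the pipeline, it may help to confirm that both credential files are in place and well-formed. The following is a minimal sketch, not part of the repository: the helper names `read_search_credentials` and `missing_service_account_fields` are hypothetical, and the set of fields treated as required is an assumption based on the JSON template above. The text file is parsed as two whitespace-separated tokens, so it works whether the key and ID sit on one line or two.

```python
import json
from pathlib import Path

# Assumed subset of fields worth checking; taken from the JSON template above.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email", "token_uri"}


def read_search_credentials(path):
    """Return (api_key, search_engine_id) read from the text file at `path`."""
    tokens = Path(path).read_text().split()
    if len(tokens) != 2:
        raise ValueError(f"{path}: expected 2 values, found {len(tokens)}")
    return tokens[0], tokens[1]


def missing_service_account_fields(path):
    """Return a sorted list of required fields absent from the JSON key file."""
    data = json.loads(Path(path).read_text())
    return sorted(REQUIRED_FIELDS - set(data))
```

For example, `read_search_credentials("knowledge_expansion/api_key.txt")` should return the two values, and `missing_service_account_fields("knowledge_expansion/discoverylabel.json")` should return an empty list when the key file contains every field checked here.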
- Knowledge expansion
conda activate dynaphish
python knowledge_expansion/main.py --folder [folder_to_test, e.g. datasets/test_sites]
If you find our work useful, please consider citing our paper :)
@inproceedings {291106,
author = {Ruofan Liu and Yun Lin and Yifan Zhang and Penn Han Lee and Jin Song Dong},
title = {Knowledge Expansion and Counterfactual Interaction for {Reference-Based} Phishing Detection},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {4139--4156},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/liu-ruofan},
publisher = {USENIX Association},
month = aug,
}
If you encounter any issues with code deployment, please reach us via email or create an issue in the repository: liu.ruofan16@u.nus.edu, lin_yun@sjtu.edu.cn
