Demonstrating a new technique against spoofed phishing.
This project is an experiment in rating the similarity between pieces of text, with the goal of estimating how likely it is that a message has been spoofed in a specific way. It focuses on phishers who don't completely disguise themselves as someone else (to avoid being caught by DMARC et al.), but who make it easy for their messages to be mistaken as coming from someone else (that is, the names and addresses are similar).
The overall purpose is to use machine learning to better address spoofing, a technique used in phishing. Many cyber crimes are instances of phishing; the goal of this project is to demonstrate that machines can adapt to threats against the human component of security. The desired result is increased development of security technologies based on or inspired by this work.
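To make the "similar but not identical" idea concrete, here is a minimal sketch that scores how close a sender address is to a known contact. It uses `difflib.SequenceMatcher` from the Python standard library purely as a stand-in; it is not the similarity measure implemented in this repository. The function names, the example addresses, and the `0.8` threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def looks_spoofed(sender: str, known: str, threshold: float = 0.8) -> bool:
    """Flag a sender that is very close to, but not the same as, a known contact.

    `threshold` is an arbitrary illustrative cutoff, not a value from this project.
    """
    if sender.lower() == known.lower():
        return False  # exact match: not the look-alike case described above
    return similarity(sender, known) >= threshold


if __name__ == "__main__":
    known = "support@example.com"  # hypothetical trusted address
    candidates = (
        "support@examp1e.com",   # one character swapped
        "support@example.com",   # identical
        "newsletter@other.org",  # unrelated
    )
    for candidate in candidates:
        score = round(similarity(candidate, known), 2)
        print(candidate, score, looks_spoofed(candidate, known))
```

A plain edit-based ratio like this only catches character substitutions; it knows nothing about characters that merely look alike when rendered, which is one reason a learned or visual similarity measure may do better.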
- python3
- virtualenv:
pip install virtualenv
- debian:
sudo apt-get install python3 libxext-dev python3-pip virtualenv python3-pyqt4 fonts-arkpandora
- To build qt4:
sudo apt-get install build-essential libxext-dev python-qt4-dev python3-pip virtualenv libqt4-dev fonts-arkpandora
- ubuntu (not tested):
- To build qt4:
sudo apt-get install build-essential libxext-dev python3 python3-pip virtualenv qt4-dev-tools libqt4-dev fonts-arkpandora
- fedora (under development)

cd path/to/WordSimilarity
virtualenv --system-site-packages --always-copy -p python3 $(pwd)
- macOS:
bash setup.sh --macos
- Linux, if you preinstalled pyqt4:
bash setup.sh --linux --premade
- Linux, otherwise:
bash setup.sh --linux
After following the instructions above, to properly use the code in this directory:
- Before you start running the software here, run
source bin/activate
from the WordSimilarity folder.
- After using the software here, run
deactivate
- All of these instructions assume you are running commands from this directory (WordSimilarity).
- I would recommend you go through the folders in this order: charSim, then wordSim.
- Don't worry about the stuff in the characters folder (unless you want to look at the code there).
- Read the README.md files in the two folders to find out how to run the software.
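If you are unsure whether the environment is active before running anything, a small check like the following can help. This is a sketch and not a script shipped with this repository; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the interpreter appears to be running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.base_prefix differ from sys.prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment detected; run `source bin/activate` first.")
```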
Author - Josh Danielpour.
All of the code here is distributed under the GNU Affero General Public License (wow, that's long!).
See COPYING for more information.
- Fork me!
- Commit & push your changes to your fork.
- Create a Pull Request (in a timely and organized manner).
- See more.
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.