PhishTriage

PhishTriage is an explainable, offline phishing triage CLI for raw .eml files. It’s built to surface suspicious patterns quickly and produce analyst-readable findings with a simple risk score.

At A Glance

Offline CLI that triages raw emails and explains why something looks suspicious.
Focused on clear, readable findings instead of black-box scoring.

What It Does

Takes raw .eml emails and surfaces the signals an analyst would look for.
Checks for common phishing patterns and outputs evidence-backed findings.
Produces a simple risk score and a triage verdict to guide next steps.
Generates human-readable and machine-readable reports for review.
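As a rough illustration of the first step (reading a raw .eml and pulling out the links an analyst would check), here is a minimal sketch built only on the Python standard library. The helper names `load_message`, `body_text`, and `find_urls`, the URL regex, and the sample message are assumptions made for this example, not PhishTriage's actual internals.

```python
import re
from email import message_from_bytes, policy

# Deliberately simple URL pattern; an assumption for this sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

# Hypothetical sample message, using reserved example domains and addresses.
SAMPLE = (
    b"From: IT Support <support@example.com>\r\n"
    b"Reply-To: helpdesk@example.net\r\n"
    b"Subject: Password expires today\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Verify your account at http://203.0.113.7/login.\r\n"
)


def load_message(raw_bytes):
    """Parse raw .eml bytes into an EmailMessage."""
    return message_from_bytes(raw_bytes, policy=policy.default)


def body_text(msg):
    """Concatenate every text/plain and text/html part of the message."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() in ("text/plain", "text/html"):
            parts.append(part.get_content())
    return "\n".join(parts)


def find_urls(text):
    """Return unique URLs in order of appearance, without trailing punctuation."""
    seen = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,);")
        if url not in seen:
            seen.append(url)
    return seen


msg = load_message(SAMPLE)
print(msg["Subject"])
print(find_urls(body_text(msg)))
```

A regex like this misses obfuscated or encoded links, which is consistent with the limitations listed later in this README.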

Core Functions

parse_eml: Turns a raw email into a consistent record for inspection.
extract_urls: Surfaces destinations so links can be reviewed quickly.
run_detectors: Applies targeted checks that map to known phishing behaviors.
compute_risk_score: Summarizes how many high-risk signals were found.
verdict_for_score: Translates the score into an action-oriented verdict.
render_terminal / render_json: Delivers results for humans or automation.
write_findings_csv / write_summary_md: Supports bulk triage and trend spotting.
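To make the scoring step concrete, the sketch below shows one plausible way a score and verdict could be derived from findings. Only the names `compute_risk_score` and `verdict_for_score` come from the list above; their signatures, the `Finding` record, the severity weights, the thresholds, and the verdict labels are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights; the tool's real weights are not documented here.
SEVERITY_WEIGHTS = {"low": 10, "medium": 25, "high": 40}


@dataclass
class Finding:
    detector: str  # e.g. "URL_IP_HOST"
    severity: str  # "low", "medium", or "high"
    evidence: str  # text an analyst can verify against the email


def compute_risk_score(findings):
    """Sum the severity weights of all findings, capped at 100."""
    total = sum(SEVERITY_WEIGHTS[f.severity] for f in findings)
    return min(total, 100)


def verdict_for_score(score):
    """Map a score to a verdict using assumed thresholds."""
    if score >= 70:
        return "malicious"
    if score >= 30:
        return "suspicious"
    return "benign"


findings = [
    Finding("REPLY_TO_MISMATCH", "medium", "From example.com, Reply-To example.net"),
    Finding("URL_IP_HOST", "high", "http://203.0.113.7/login"),
]
score = compute_risk_score(findings)
print(score, verdict_for_score(score))  # prints: 65 suspicious
```

An additive, capped score keeps the result explainable: every point can be traced back to a specific finding and its evidence.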

Detectors And What They Signal

Each detector represents a known phishing pattern. The goal is explainability over cleverness.

REPLY_TO_MISMATCH: Flags when the Reply-To domain doesn’t match the From domain. This is a classic indicator of sender impersonation and reply hijacking.
URL_SHORTENER: Shortened links can hide the true destination and bypass casual inspection.
URL_IP_HOST: URLs that use a raw IP address as the host are often used to avoid domain-based reputation checks.
URL_PUNYCODE: Punycode domains can enable homograph attacks (lookalike characters).
SUSPICIOUS_TLD: Certain TLDs are disproportionately abused in phishing campaigns.
URGENCY_KEYWORDS: Urgent language is a social-engineering pressure tactic.
SENDER_LINK_DOMAIN_MISMATCH: If a credential-themed email links off-domain, that is suspicious even without a known brand.
HTML_LOGIN_FORM: Password fields embedded in an HTML email are a strong indicator of credential harvesting.
EML_ATTACHMENT_CREDENTIAL_PHISH: Attached EMLs containing credential language and the recipient address are common in phishing kits.
SHORT_BODY_WITH_EML_ATTACHMENT: Minimal bodies with EML attachments often hide the real lure in the attachment.
SUSPICIOUS_ATTACHMENT: Executables, scripts, macro-enabled documents, and archives are common malware delivery methods.
DOUBLE_EXTENSION: Double extensions are used to disguise executables as safe documents.
DOMAIN_TYPOSQUAT: A domain within a small edit distance of a known brand domain suggests typosquatting.
BRAND_LINK_MISMATCH: Brand mention plus off-brand links is a strong impersonation signal.
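Several of the detectors above reduce to small, inspectable checks. The sketch below shows how a handful of them (REPLY_TO_MISMATCH, URL_SHORTENER, URL_IP_HOST, URL_PUNYCODE, DOUBLE_EXTENSION, DOMAIN_TYPOSQUAT) might be written with the standard library. Every function name, list, and threshold here is an assumption for illustration; the real detectors may use different lists and logic.

```python
import ipaddress
from email.utils import parseaddr
from urllib.parse import urlsplit

# Illustrative lists only; assumptions, not the tool's real data.
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}
KNOWN_BRAND_DOMAINS = {"paypal.com", "microsoft.com"}
DOC_EXTS = {"pdf", "doc", "docx", "xls", "xlsx", "txt", "jpg"}
RISKY_EXTS = {"exe", "scr", "js", "vbs", "bat"}


def domain_of(header_value):
    """Extract the lowercased domain from an address header value."""
    _, addr = parseaddr(header_value)
    return addr.rpartition("@")[2].lower()


def reply_to_mismatch(from_header, reply_to_header):
    """True when both headers have a domain and the domains differ."""
    from_dom, reply_dom = domain_of(from_header), domain_of(reply_to_header)
    return bool(from_dom and reply_dom and from_dom != reply_dom)


def url_flags(url):
    """Return the URL-based detector names that fire for one URL."""
    host = (urlsplit(url).hostname or "").lower()
    flags = []
    if host in SHORTENERS:
        flags.append("URL_SHORTENER")
    try:
        ipaddress.ip_address(host)  # raises ValueError for non-IP hosts
        flags.append("URL_IP_HOST")
    except ValueError:
        pass
    if any(label.startswith("xn--") for label in host.split(".")):
        flags.append("URL_PUNYCODE")
    return flags


def has_double_extension(filename):
    """True for names like invoice.pdf.exe (document ext, then risky ext)."""
    parts = filename.lower().rsplit(".", 2)
    return len(parts) == 3 and parts[1] in DOC_EXTS and parts[2] in RISKY_EXTS


def edit_distance(a, b):
    """Levenshtein distance computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def typosquat_of(domain, max_distance=1):
    """Return the brand domain that `domain` nearly matches, or None."""
    for brand in KNOWN_BRAND_DOMAINS:
        if 0 < edit_distance(domain.lower(), brand) <= max_distance:
            return brand
    return None


print(reply_to_mismatch("IT <support@example.com>", "helpdesk@example.net"))
print(url_flags("http://203.0.113.7/login"))
print(url_flags("https://xn--pypal-4ve.example/reset"))
print(has_double_extension("invoice.pdf.exe"))
print(typosquat_of("paypa1.com"))
```

Note that this sketch compares full host names rather than registrable domains, so a legitimate subdomain difference would count as a mismatch; a real detector has to decide how to normalize domains before comparing them.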

Quick Start

pip install -e .
phishtriage analyze tests/fixtures/phish_1.eml
phishtriage analyze tests/fixtures/phish_1.eml --json
phishtriage analyze-dir tests/fixtures --out reports/

Output Notes

Terminal output includes verdict, score, and evidence for each finding.
JSON output is structured for automation.
findings.csv supports bulk triage.
summary.md provides batch-level statistics.
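For readers who want a feel for the machine-readable outputs, the sketch below serializes a single report to JSON and flattens findings into CSV rows. The field names, the helper names `to_json` and `to_csv`, and the sample values are placeholders chosen for this example; they are not the tool's actual schema or real output.

```python
import csv
import io
import json

# Placeholder report; field names and values are assumptions.
report = {
    "file": "phish_1.eml",
    "score": 65,
    "verdict": "suspicious",
    "findings": [
        {"detector": "REPLY_TO_MISMATCH", "evidence": "From example.com, Reply-To example.net"},
        {"detector": "URL_IP_HOST", "evidence": "http://203.0.113.7/login"},
    ],
}

CSV_FIELDS = ["file", "verdict", "score", "detector", "evidence"]


def to_json(rep):
    """Serialize one report as stable, indented JSON."""
    return json.dumps(rep, indent=2, sort_keys=True)


def to_csv(reports):
    """Write one CSV row per finding across all reports."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=CSV_FIELDS)
    writer.writeheader()
    for rep in reports:
        for finding in rep["findings"]:
            writer.writerow({
                "file": rep["file"],
                "verdict": rep["verdict"],
                "score": rep["score"],
                "detector": finding["detector"],
                "evidence": finding["evidence"],
            })
    return buf.getvalue()


print(to_json(report))
print(to_csv([report]))
```

One row per finding (rather than one row per email) makes it easy to count how often each detector fires across a batch.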

Potential Bypasses And Drawbacks

This tool is intentionally lightweight and offline. That makes it portable and explainable, but it also creates gaps attackers can exploit.

Heuristic-based only. Well-crafted phishing that avoids these signals can pass.
No SPF/DKIM/DMARC checks. Sender authentication signals aren’t evaluated.
No live enrichment. Compromised legitimate domains or recently registered lookalikes won’t be flagged by reputation.
Naive registrable-domain parsing (no Public Suffix List). Some subdomain tricks may evade checks or cause false positives.
Limited brand list. Phishers impersonating lesser-known brands may not trigger brand-specific rules.
HTML and URL tricks like encoded/obfuscated links, redirection chains, or JavaScript-based redirects are only partially addressed.
Social engineering without links or attachments can evade many checks.
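The registrable-domain limitation is easy to demonstrate. The sketch below assumes the naive approach keeps only the last two labels of a host name (an assumption about what "naive" means here, not a description of the actual code) and contrasts it with a version that knows a tiny, hand-picked set of multi-label suffixes. A real fix would consult the full Public Suffix List.

```python
def naive_registrable_domain(host):
    """Keep the last two labels, ignoring multi-label public suffixes."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


# Tiny illustrative subset; the real Public Suffix List is far larger.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def suffix_aware_domain(host):
    """Keep three labels when the last two form a known public suffix."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])


for host in ("login.bank.co.uk", "evil.co.uk", "mail.example.com"):
    print(host, naive_registrable_domain(host), suffix_aware_domain(host))
```

With the naive version, `login.bank.co.uk` and `evil.co.uk` both collapse to `co.uk`, so a sender-versus-link comparison would wrongly treat them as the same domain.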

Bottom line: this is useful for general phishing triage and educational analysis, but it is not a comprehensive anti-phishing system.

Project Structure

phishtriage/core/parse_email.py: Email parsing and attachment extraction.
phishtriage/core/extract_urls.py: URL parsing and normalization.
phishtriage/core/detectors.py: Rule-based phishing detectors.
phishtriage/core/scoring.py: Risk scoring and verdict mapping.
phishtriage/core/report.py: Terminal, JSON, CSV, and summary outputs.
phishtriage/cli.py: CLI entrypoint.
tests/: Unit tests and .eml fixtures.
