An explainable, cautious AI-generated text risk analyzer for coding agents and local workflows.
简体中文 · Install · Evaluation · Use Cases
This project is intentionally modest. It estimates AI-like signals, not proof of authorship.
Most AI text detectors are either overconfident, opaque, or awkward to embed inside agent workflows.
ai-detector-skill takes the opposite approach:
- explainable weighted signals instead of hidden model claims
- a local CLI and Python API that are easy to script
- a short-text guardrail that refuses to overstate weak evidence
- skill-ready packaging for Codex, Claude Code, and other repo-aware agents
- reproducible benchmark and dataset evaluation scripts
If you want a triage tool that stays cautious and leaves room for human review, this repo is built for that.
pip install -e .
ai-detect examples/sample_ai_like.txt --jsonOr bootstrap a local environment:
bash scripts/setup.shExample output:
Conclusion: AI-like signals are present, but this medium-confidence score is a risk estimate rather than proof.
Score: 84/100
Confidence: medium
Verdict: high_ai_likelihood
Words analyzed: 256
score: 0-100 AI-like writing risk estimateverdict:insufficient_text,low_ai_likelihood,mixed_or_uncertain, orhigh_ai_likelihoodconfidence: currentlylowormediumsignals: strongest weighted evidence signalscaveats: warnings that stay attached to the resultnext_steps: practical follow-up actions when useful
This is designed to help an agent say:
- "AI-like signals are present."
- "The sample is too short for a meaningful estimate."
- "This should be reviewed against known writing samples."
Not:
- "This was definitely written by AI."
- "The detector proves misconduct."
ai-detect examples/sample_ai_like.txt
cat essay.txt | ai-detect --json
python scripts/detect.py examples/sample_human_like.txt --jsonfrom aidetect import analyze_text
text = open("essay.txt", encoding="utf-8").read()
result = analyze_text(text)
print(result.score, result.confidence, result.verdict)
for signal in result.strongest_signals():
print(signal.name, signal.note)cp -R ai-detector-skill "$CODEX_HOME/skills/"Use the root SKILL.md as the portable skill definition, and keep AGENTS.md at the repo root for repo-aware agents.
We ran a reproducible evaluation on the public HC3 dataset, using the English finance, medicine, and open_qa subsets with the first 100 rows from each subset.
Snapshot of the current detector on that slice:
- Human mean score:
5.4 - AI mean score:
18.4 - Mean separation:
13.0points - Human coverage:
0.427 - AI coverage:
0.920 - Covered accuracy at
score >= 45:0.317
What that means in practice:
- the detector separates human and AI answers on average, but only weakly on HC3
- the short-text guardrail is doing useful work, especially on shorter human answers
- the current thresholds are conservative, which keeps false confidence down but also lowers recall
- this tool works better as triage + explanation than as a stand-alone classifier
See the full report in docs/HC3_EVALUATION.md.
Reproduce it with:
make eval-hc3A teacher receives a polished 400-word reflection and wants a cautious signal before doing manual review.
Suggested workflow:
- Run
ai-detect submission.txt --json. - Read the strongest signals and caveats.
- Compare the passage with known writing samples before making any judgment.
An editor wants to spot formulaic product reviews or guest posts before spending time on manual edits.
Why it fits:
- medium-length prose works better than short snippets
- explainable signals help justify why a draft feels templated
A moderation team wants to sort suspicious long-form posts into a manual review queue, not auto-remove them.
Why it fits:
- the tool is conservative by design
- it helps with prioritization more than enforcement
A team compares human drafts and AI-assisted drafts to see where language starts sounding too generic.
Why it fits:
- the score is useful as a relative signal across versions
- strongest signals can guide rewriting
- disciplinary decisions about a named student or employee
- treating a single score as proof of cheating or fraud
- very short samples under about 80 words
- high-stakes authorship disputes without known-sample comparison
ai-detector-skill/
├── SKILL.md
├── scripts/
│ ├── detect.py
│ ├── setup.sh
│ ├── benchmark.py
│ └── evaluate_hc3.py
├── references/
│ └── api-reference.md
├── assets/
│ ├── hero.svg
│ ├── score-bands.svg
│ ├── workflow.svg
│ └── templates/
│ └── report.md
├── src/aidetect/
├── tests/
├── AGENTS.md
└── README.md
make test
make demo
make benchmark
make eval-hc3GitHub Actions automatically runs:
make teston Python3.9,3.11, and3.13make benchmarkto regenerate the synthetic benchmark reportmake eval-hc3to regenerate the HC3 evaluation report- upload of
docs/BENCHMARK.mdanddocs/HC3_EVALUATION.mdas workflow artifacts
See CONTRIBUTING.md.
Good contributions usually improve one of these:
- clearer evidence signals
- safer wording and UX
- multilingual handling that stays explainable
- reproducible evaluation coverage
- agent integration ergonomics
MIT