A lightweight tool that scans text for likely "AIisms" and reports what it found.
Given a block of text, the tool should:
- Detect patterns that feel overtly AI-generated.
- Count how many AIisms appear.
- Return a readable report with matched phrases and definitions.
Example summary:
- "Detected 6 AIisms"
AIisms are language patterns that are unusually common in model-generated writing, including:
- overused transition phrases
- verbose boilerplate framing
- hedging or certainty disclaimers where the context does not call for them
- repetitive sentence structures
- generic, non-committal wording
Start with a hybrid scoring approach:
- Phrase lexicon: weighted list of common AI-ish phrases.
- Structural patterns: regex checks for repeated templates.
- Stylistic signals: counts of hedge words, stacked qualifiers, and repetitive transitions.
- Score + threshold: classify text as low/moderate/high AIism density.
This keeps v1 fast and explainable. Later versions can add an ML classifier.
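The hybrid approach above can be sketched in a few dozen lines. This is a minimal illustration, not the project's implementation: the phrases, weights, hedge words, thresholds, and the names `LEXICON`, `STRUCTURAL`, `HEDGES`, and `scoreText` are all assumptions made up for the example.

```javascript
// Hypothetical phrase lexicon: phrases and weights are illustrative, not tuned.
const LEXICON = [
  { phrase: "it is important to note", weight: 2.0 },
  { phrase: "delve into", weight: 1.5 },
  { phrase: "in conclusion", weight: 1.0 },
];

// Hypothetical structural pattern: the "not just X, but Y" template.
const STRUCTURAL = [
  { name: "not-just-but", regex: /\bnot (?:just|only)\b[^.!?]*\bbut\b/gi, weight: 1.5 },
];

// Hypothetical stylistic signal: a short list of hedge words.
const HEDGES = ["arguably", "perhaps", "generally", "somewhat"];
const HEDGE_WEIGHT = 0.5;

// Hypothetical thresholds on the length-normalized score.
const THRESHOLDS = { moderate: 2, high: 5 };

// Escape regex metacharacters so lexicon phrases are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count all non-overlapping matches of a global regex in the text.
function countMatches(text, regex) {
  const found = text.match(regex);
  return found ? found.length : 0;
}

function scoreText(text) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  let total = 0;
  let weighted = 0;
  const add = (count, weight) => {
    total += count;
    weighted += count * weight;
  };

  for (const { phrase, weight } of LEXICON) {
    add(countMatches(text, new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "gi")), weight);
  }
  for (const { regex, weight } of STRUCTURAL) {
    add(countMatches(text, regex), weight);
  }
  for (const hedge of HEDGES) {
    add(countMatches(text, new RegExp(`\\b${hedge}\\b`, "gi")), HEDGE_WEIGHT);
  }

  // Weighted hits per 100 words, so long texts are not penalized for length alone.
  const score = words === 0 ? 0 : (weighted / words) * 100;
  const density =
    score >= THRESHOLDS.high ? "high" : score >= THRESHOLDS.moderate ? "moderate" : "low";
  return { total, score, density };
}
```

Because every hit comes from a named phrase, pattern, or hedge word, each contribution to the score can be traced back to a rule, which is what keeps this kind of v1 explainable.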
- total_ai_isms: integer
- matches: list of matched phrases/patterns
- definitions: brief explanation for each match
- density: low | moderate | high
- aiism_score: non-negative number (length-normalized weighted AIism intensity)
- aiism_ratio: detected AIisms per 100 words
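As a rough illustration of how these output fields might fit together, the sketch below assembles a report object and renders it as readable text. The field names come from the list above; everything else is an assumption for the example, including the `buildReport` and `renderReport` helpers, the shape of the `found` input, the per-100-words normalization of `aiism_score`, and the threshold values.

```javascript
// Hypothetical thresholds on aiism_score; real values would come from tuning.
const REPORT_THRESHOLDS = { moderate: 2, high: 5 };

// Round to two decimals for display-friendly numbers.
function round2(x) {
  return Math.round(x * 100) / 100;
}

// `found` is an assumed input shape: [{ match, definition, weight }, ...]
function buildReport(text, found) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const per100 = (n) => (words === 0 ? 0 : (n / words) * 100);

  const weighted = found.reduce((sum, f) => sum + f.weight, 0);
  const score = per100(weighted);
  const density =
    score >= REPORT_THRESHOLDS.high
      ? "high"
      : score >= REPORT_THRESHOLDS.moderate
        ? "moderate"
        : "low";

  return {
    total_ai_isms: found.length,
    matches: found.map((f) => f.match),
    definitions: found.map((f) => f.definition), // parallel to `matches`
    density,
    aiism_score: round2(score),
    aiism_ratio: round2(per100(found.length)),
  };
}

// Render the report as plain text, one line per match.
function renderReport(report) {
  const lines = [`Detected ${report.total_ai_isms} AIisms (density: ${report.density})`];
  report.matches.forEach((m, i) => {
    lines.push(`- "${m}": ${report.definitions[i]}`);
  });
  return lines.join("\n");
}
```

In this sketch `aiism_ratio` counts matches per 100 words, while `aiism_score` weights them first, so a text with a few heavily weighted phrases can have a low ratio but a high score.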
- Build a small baseline lexicon and weighted rules.
- Implement a simple CLI (stdin or file input).
- Add fixture texts and expected reports.
- Tune thresholds against real examples.
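One way the fixture step above might look is a small runner that compares a detector's output with expected report fields. This is only a sketch: `runFixtures`, the fixture shape, and the stand-in `demoDetect` function are hypothetical and exist purely for illustration.

```javascript
// Compare detector output with expected fields for each fixture.
// Only the keys listed in `expected` are checked, so fixtures can stay small.
function runFixtures(detect, fixtures) {
  const failures = [];
  for (const { name, text, expected } of fixtures) {
    const actual = detect(text);
    for (const key of Object.keys(expected)) {
      if (JSON.stringify(actual[key]) !== JSON.stringify(expected[key])) {
        failures.push({ name, key, expected: expected[key], actual: actual[key] });
      }
    }
  }
  return failures;
}

// Stand-in detector for the demo only: counts a single hypothetical phrase.
function demoDetect(text) {
  const n = (text.match(/\bdelve into\b/gi) || []).length;
  return { total_ai_isms: n, density: n > 0 ? "moderate" : "low" };
}

// Hypothetical fixtures: a text plus the report fields it should produce.
const demoFixtures = [
  { name: "clean", text: "The cat sat on the mat.", expected: { total_ai_isms: 0, density: "low" } },
  { name: "one-hit", text: "Let us delve into the data.", expected: { total_ai_isms: 1 } },
];

const demoFailures = runFixtures(demoDetect, demoFixtures);
console.log(demoFailures.length === 0 ? "all fixtures pass" : demoFailures);
```

Re-running the same fixtures after each threshold change gives a quick check that tuning for one example has not broken the expected report for another.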
Run:
npm run validate:taxonomy
Or validate a custom file:
npm run validate:taxonomy -- path/to/taxonomy.yml