sanjoxtech/penprint

🖋️ Penprint

Penprint — your voice, by the numbers

A deterministic voice gate for writing. Turn "does this sound like me?" into a number — with no LLM.


The problem

Code has tests, so coding agents can run in a loop until the tests pass. Writing has no test. So people gate writing loops with an LLM judge — "score this post 1–10." But an LLM judge is noisy: same draft, different score each run. Loops gated that way oscillate (6 → 7 → 6 → 7) and never converge.

The idea

Penprint is a guitar tuner for your writing voice. It learns a numeric fingerprint from your past posts and scores any new draft against it — the same draft always gets the same score, because there's no model in the loop, just math.

That determinism removes scoring noise, so a writing loop gated by Penprint can converge the way a code test loop does.

your past posts ──► penprint fingerprint ──► fingerprint.json  (a few numbers)
                                                    │
new draft ──► penprint score ──► 0–100 + exactly which metrics are off
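To make the flow above concrete, here is a minimal sketch of one way a fingerprint and a deterministic score could work. This README does not spell out Penprint's actual formula, so everything here is an assumption for illustration: the function names `build_fingerprint` and `score`, the mean/stdev fingerprint layout, the `tolerance` parameter, and the rule that a metric fails when it sits more than one standard deviation from your average.

```python
import statistics


def build_fingerprint(per_post_metrics):
    """Summarise past posts as a mean and spread per metric (hypothetical layout)."""
    names = list(per_post_metrics[0].keys())
    fingerprint = {}
    for name in names:
        values = [post[name] for post in per_post_metrics]
        fingerprint[name] = {
            "mean": statistics.mean(values),
            "stdev": statistics.pstdev(values),
        }
    return fingerprint


def score(draft_metrics, fingerprint, tolerance=2.0):
    """Return (score 0-100, list of failing metric names). Pure arithmetic, no randomness."""
    credits = []
    fails = []
    for name, stats in fingerprint.items():
        # Floor the spread so a metric with zero variance does not divide by zero.
        spread = max(stats["stdev"], 1e-9)
        z = abs(draft_metrics[name] - stats["mean"]) / spread
        # Full credit at the mean, falling linearly to zero at `tolerance` deviations.
        credits.append(max(0.0, 1.0 - z / tolerance))
        if z > 1.0:
            fails.append(name)
    return round(100 * statistics.mean(credits)), fails
```

Because the function only does arithmetic on its inputs, calling it twice on the same draft metrics returns the same score and the same list of failing metrics.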

What it measures (all computed, no LLM)

| Metric | What it captures |
| --- | --- |
| sentence rhythm | your average sentence length |
| burstiness | your mix of short + long sentences |
| first-person rate | peer "I / we" energy vs detached prose |
| hook length | is the opening line short enough to stop the scroll |
| build-signal | concrete "I built / ran / shipped" markers |
| banned phrases | generic AI/corporate words you'd never use (hard fail) |
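As a rough illustration of how metrics like these can be computed with the standard library alone, here is a small sketch. It is not Penprint's own code: the `measure` function, the naive sentence splitter, the `FIRST_PERSON` word set, and the choice of coefficient of variation for burstiness are all assumptions.

```python
import re
import statistics

# Assumed first-person word set; a real tool may use a different list.
FIRST_PERSON = {"i", "we", "me", "us", "my", "our"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def words(text):
    return re.findall(r"[a-z']+", text.lower())


def measure(text):
    sentences = split_sentences(text)
    lengths = [len(words(s)) for s in sentences]
    all_words = words(text)
    mean_len = statistics.mean(lengths) if lengths else 0.0
    return {
        # Average words per sentence.
        "sentence_rhythm": mean_len,
        # Spread of sentence lengths relative to the average (coefficient of variation).
        "burstiness": statistics.pstdev(lengths) / mean_len if mean_len else 0.0,
        # Share of words that are first-person pronouns.
        "first_person_rate": (
            sum(w in FIRST_PERSON for w in all_words) / len(all_words)
            if all_words else 0.0
        ),
        # Word count of the opening sentence.
        "hook_length": len(words(sentences[0])) if sentences else 0,
        # Count of "I built / ran / shipped" markers.
        "build_signal": len(re.findall(r"\bi (?:built|ran|shipped)\b", text.lower())),
    }
```

Each value is a plain number derived from the text, which is what lets the same draft produce the same measurements on every run.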

Install

pip install penprint          # once published
# or, right now, from source:
git clone https://github.com/<you>/penprint && cd penprint
pip install -e .

Zero dependencies — it's Python stdlib only. No API key. Runs offline.

Quickstart

# 1. learn your voice from your past posts
penprint fingerprint examples/corpus/*.md

# 2. score a draft
penprint score examples/good_draft.md
#  -> "SCORE": 79, "FAILS": [ "burstiness ...", "first-person ..." ]

# 3. use it as a CI / loop gate (exit 1 if below threshold)
penprint score draft.md --min 85

# 4. bring your own banned-phrase list (a word normal in YOUR voice shouldn't be banned)
penprint score draft.md --banned examples/banned.txt

The banned-phrase list

A sane default ships in the tool. A fuller, sourced list lives in examples/banned.txt — curated from proselint (BSD), write-good (MIT), anti-ai-slop-writing, and the Max Planck 2024 study on post-ChatGPT word inflation. Override or extend it with --banned <file> — it's your voice, so trim anything that's genuinely yours.
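For a sense of what a banned-phrase check involves, here is a minimal sketch. The file format (one phrase per line, `#` for comments) and the helper names `load_banned` and `find_banned` are assumptions for illustration; check examples/banned.txt for the format Penprint actually expects.

```python
import re


def load_banned(lines):
    """Parse phrases from an iterable of lines, skipping blanks and '#' comments (assumed format)."""
    phrases = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            phrases.append(stripped.lower())
    return phrases


def find_banned(text, phrases):
    """Return every banned phrase that appears in the text as a whole word or phrase."""
    hits = []
    for phrase in phrases:
        # Word boundaries keep "delve" from matching inside "delved".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits
```

A non-empty result would correspond to the hard fail described in the metrics table, whatever the other metrics say.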

Use it as a loop gate (the fun part)

Penprint has no LLM — but you can put any writing agent in front of it. The agent drafts, Penprint scores, you feed the failing metrics back, the agent fixes, repeat until it passes. Because the gate is deterministic, it converges. See examples/loop.sh for a Claude Code / Codex example.

builder agent ─► draft.md ─► penprint score ─► fails? feed them back ─► fix ─► repeat ─► ✅ ≥ threshold
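The loop in the diagram can be sketched in a few lines. This is an illustration, not the contents of examples/loop.sh: `run_loop`, `score_fn`, and `revise_fn` are hypothetical names. In practice `score_fn` might shell out to `penprint score draft.md --min 85` and read the exit code, and `revise_fn` might call the writing agent of your choice.

```python
def run_loop(draft, score_fn, revise_fn, threshold=85, max_rounds=10):
    """Draft, score, feed failures back, repeat until the score clears the threshold.

    score_fn(draft) -> (score, fails)        # deterministic gate
    revise_fn(draft, fails) -> new draft     # the writer, e.g. an LLM agent
    """
    history = []
    for _ in range(max_rounds):
        score, fails = score_fn(draft)
        history.append(score)
        if score >= threshold:
            break
        # Only the failing metrics go back to the writer.
        draft = revise_fn(draft, fails)
    return draft, history
```

The `max_rounds` cap is a safety net: a deterministic gate removes score noise, but the writer can still fail to improve, so the loop should not run forever.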

Does Penprint use an LLM?

No. Penprint itself is pure Python (re, json, statistics). It's a measuring tape, not a brain. The optional writer in a loop can be an LLM of your choice — Penprint just scores the result.

Prior art / credits

Penprint stands on long-standing work — it's a new combination, not a new primitive:

  • Stylometry (writing-as-numbers, since the 1880s) — e.g. jpotts18/stylometry, StyloMetrix. Used for forensics; Penprint uses it as a loop gate.
  • Vale — prose linter ("style guide as code"). Great rules; no personal voice score.
  • conorbronsdon/avoid-ai-writing — preset voice profiles + iterate-to-convergence via an LLM. Penprint differs: a personal, computed, LLM-free score from your own posts.

The gap Penprint fills: a personal computed voice score used as a converging loop's gate.

Author

Built by Sanjay (sanjoxtech) — sanjox.tech · LinkedIn · sanjox.tech@gmail.com

License

MIT — see LICENSE.
