Whorl identifies which LLM generated a piece of text by analyzing character-level patterns in password generation. Ask any model to generate a few random passwords, feed them to whorl, and it'll tell you which model made them — even if the model refuses to identify itself.
92% exact-model accuracy with 5 passwords.
# Single password
% ./whorl 'K#m9vQx2$nL7pRw'
Query: 'K#m9vQx2$nL7pRw' [mode=ensemble]
Rank  Model              Score      Bar
---------------------------------------------------------------------------
1     claude-4.6-sonnet   -84.5609  ████████████████████
2     claude-4-opus      -148.2465  ██████████░░░░░░░░░░
3     claude-4.1-opus    -151.5831  █████████░░░░░░░░░░░
4     claude-4.5-haiku   -152.7688  █████████░░░░░░░░░░░
5     claude-4-sonnet    -159.0414  ████████░░░░░░░░░░░░

For the full technical writeup, see https://bountyplz.xyz/ai,/security/2026/03/15/Model-Fingerprinting-With-Whorl.html
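The Score column is a log-likelihood: each candidate model is a character-level distribution estimated from its sample passwords, and the query is scored under each one. The sketch below illustrates the general idea with a character-bigram model and made-up sample passwords; it is not whorl's actual implementation, and `model-a`/`model-b` are hypothetical names.

```python
import math
from collections import Counter

def bigram_model(samples):
    """Build an add-one-smoothed character-bigram scorer from sample strings."""
    pair_counts, char_counts = Counter(), Counter()
    for s in samples:
        for a, b in zip(s, s[1:]):
            pair_counts[a + b] += 1
            char_counts[a] += 1
    vocab = 95  # printable ASCII alphabet size, used for smoothing
    return lambda q: sum(
        math.log((pair_counts[a + b] + 1) / (char_counts[a] + vocab))
        for a, b in zip(q, q[1:])
    )

# Hypothetical fingerprints built from made-up sample passwords:
models = {
    "model-a": bigram_model(["K#m9vQx2$nL7pRw", "T&r4bYz8!qW3mNv"]),
    "model-b": bigram_model(["correct-horse-battery", "staple-mango-kite"]),
}
query = "J#k8wPz3$mQ6rTx"
ranking = sorted(models, key=lambda m: models[m](query), reverse=True)
for rank, name in enumerate(ranking, 1):
    print(rank, name, round(models[name](query), 4))
```

Scores are sums of log-probabilities, so they are always negative; higher (closer to zero) means the query looks more like that model's samples.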
Requires Python 3.9+ with numpy and scipy:
pip install numpy scipy

Get the prompt to use with your target model:

./whorl --prompt

Send that prompt to your target model a few times, then fingerprint the output:
# Single password
./whorl "K#m9vQx2$nL7pRw"
# Multiple passwords (more accurate) — one per line in a file
./whorl passwords.txt
# See a plain-English explanation of what you're seeing
./whorl passwords.txt --explain

# Compare all classifier modes against the test set
./whorl --compare
# Full per-model evaluation
./whorl --eval
# Test against a single random model from the test data
./whorl --eval-one
# Use a specific n-gram order instead of the default ensemble
./whorl passwords.txt --order 1 # unigram only
./whorl passwords.txt --order 2 # bigram only
./whorl passwords.txt --order 1,2 # custom combination

The fingerprint database is just flat text files. Adding a new model takes about two minutes.
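One plausible way the default ensemble mode could combine the per-order scores selected by `--order` is a plain sum of per-order log-likelihoods. This is an assumption for illustration, not whorl's documented internals; the real classifier may weight orders differently.

```python
import math
from collections import Counter

def ngram_score(query, samples, n):
    """Add-one-smoothed n-gram log-likelihood of `query` under sample passwords."""
    grams, contexts = Counter(), Counter()
    for s in samples:
        for i in range(len(s) - n + 1):
            grams[s[i:i + n]] += 1
            contexts[s[i:i + n - 1]] += 1
    vocab = 95  # printable ASCII alphabet size, used for smoothing
    return sum(
        math.log((grams[query[i:i + n]] + 1) / (contexts[query[i:i + n - 1]] + vocab))
        for i in range(len(query) - n + 1)
    )

def ensemble_score(query, samples, orders=(1, 2, 3)):
    # Equal-weight sum across n-gram orders; a real ensemble might weight them.
    return sum(ngram_score(query, samples, n) for n in orders)
```

With `orders=(2,)` this reduces to bigram-only scoring, mirroring `--order 2` above.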
- Get the prompt: `./whorl --prompt`
- Send it to your model 100 times and save the output to `data/model-name.log` (one password per line)
- Generate 5 more and save them to `test/model-name.test` (use different samples for test than the ones in data)
- Open a PR
Naming convention: files are named `family-version-tag`:

- `family` — provider or base model: `claude`, `gpt`, `llama`, `gemini`, etc.
- `version` — release series: `4.6`, `5.4`, `3.3`, etc.
- `tag` (optional) — variant: `sonnet`, `opus`, `mini`, `nano`, `turbo`, etc.
Examples: `claude-4.6-sonnet.log`, `gpt-5.4.log`, `llama-3.3-70b-instruct.log`
The classifier automatically picks up new files. No code changes needed.
Anthropic: claude-3-haiku, claude-4-opus, claude-4-sonnet, claude-4.1-opus, claude-4.5-haiku, claude-4.5-opus, claude-4.5-sonnet, claude-4.6-opus, claude-4.6-sonnet

OpenAI: gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.2, gpt-5.4, gpt-o1, gpt-o3, gpt-o3-mini, gpt-o4-mini

Other: composer-1, composer-1.5, deepseek-r1-distill-llama-70b, gemini-3-flash, gemini-3-pro, gemini-3.1-pro, grok, kimi-k2.5, llama-3-8b-instruct, llama-3.3-70b-instruct, mistral-nemo-instruct-2407, qwen3-32b