tehryanx/whorl

Whorl

Whorl identifies which LLM generated a set of passwords by analyzing their character-level patterns. Ask any model to generate a few random passwords, feed them to whorl, and it will tell you which model made them, even if the model refuses to identify itself.

92% exact-model accuracy with 5 passwords.

% ./whorl 'K#m9vQx2$nL7pRw'

# Single password
Query: 'K#m9vQx2$nL7pRw'  [mode=ensemble]

Rank  Model                               Score                  Bar
---------------------------------------------------------------------------
1     claude-4.6-sonnet                   -84.5609               ████████████████████
2     claude-4-opus                       -148.2465              ██████████░░░░░░░░░░
3     claude-4.1-opus                     -151.5831              █████████░░░░░░░░░░░
4     claude-4.5-haiku                    -152.7688              █████████░░░░░░░░░░░
5     claude-4-sonnet                     -159.0414              ████████░░░░░░░░░░░░

For the full technical writeup, see https://bountyplz.xyz/ai,/security/2026/03/15/Model-Fingerprinting-With-Whorl.html

Setup

Requires Python 3.9+ with numpy and scipy:

pip install numpy scipy

Usage

Get the prompt to use with your target model:

./whorl --prompt

Send that prompt to your target model a few times, then fingerprint the output:

# Single password (single quotes keep the shell from expanding $nL7pRw as a variable)
./whorl 'K#m9vQx2$nL7pRw'

# Multiple passwords (more accurate) — one per line in a file
./whorl passwords.txt

# See a plain-English explanation of the results
./whorl passwords.txt --explain
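
The file form expects one password per line. As a minimal sketch, assuming you have collected the raw model replies as strings, the helper below trims whitespace, drops blank lines, and writes the result in that layout. The function name write_password_file is hypothetical and is not part of whorl.

```python
from pathlib import Path


def write_password_file(replies, path):
    """Write one password per line, trimming whitespace and skipping blanks."""
    # A single reply may contain several lines, so flatten them first.
    passwords = [line.strip() for reply in replies for line in reply.splitlines()]
    passwords = [p for p in passwords if p]
    Path(path).write_text("\n".join(passwords) + "\n", encoding="utf-8")
    return passwords
```

Writing to a file also sidesteps shell quoting, since characters such as $ and # are never seen by the shell.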

Other commands

# Compare all classifier modes against the test set
./whorl --compare

# Full per-model evaluation
./whorl --eval

# Test against a single random model from the test data
./whorl --eval-one

# Use a specific n-gram order instead of the default ensemble
./whorl passwords.txt --order 1        # unigram only
./whorl passwords.txt --order 2        # bigram only
./whorl passwords.txt --order 1,2      # custom combination
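
To give a feel for what unigram and bigram scoring can look like, here is a rough sketch of a character n-gram classifier. It is not whorl's actual implementation: the function names, the add-one smoothing, the vocabulary size, and the toy training strings are all assumptions made for illustration. Each candidate model gets a summed log-likelihood over the chosen orders, and the highest (least negative) score ranks first.

```python
import math
from collections import Counter

VOCAB_SIZE = 95  # assumption: printable ASCII characters


def ngrams(text, n):
    """Return all character n-grams of length n in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(samples, n):
    """Count n-grams of order n across a list of sample passwords."""
    counts = Counter()
    for sample in samples:
        counts.update(ngrams(sample, n))
    return counts


def log_likelihood(password, counts, n):
    """Add-one-smoothed log-probability of the password's n-grams."""
    denom = sum(counts.values()) + VOCAB_SIZE ** n
    return sum(math.log((counts[g] + 1) / denom) for g in ngrams(password, n))


def rank(password, corpora, orders=(1, 2)):
    """Rank model names by summed log-likelihood over the given orders."""
    scores = {
        name: sum(log_likelihood(password, train(samples, n), n) for n in orders)
        for name, samples in corpora.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up training data; a real fingerprint needs many more samples.
corpora = {
    "model-a": ["K#m9vQx2$nL7pRw", "T#k4vQz8$mL2pRx"],
    "model-b": ["aaaa1111bbbb", "abcd1234abcd"],
}
ranking = rank("K#m9vQx2$nL7", corpora)
```

In this sketch, passing orders=(1,) or orders=(2,) would correspond to a single order, and (1, 2) to a combination of both.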

Contributing

The fingerprint database is just flat text files. Adding a new model takes about two minutes.

  1. Get the prompt: ./whorl --prompt
  2. Send it to your model 100 times and save the output to data/model-name.log (one password per line)
  3. Generate 5 more and save them to test/model-name.test (test samples must not overlap with the samples in data)
  4. Open a PR
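
Steps 2 and 3 can be scripted. The sketch below assumes you already have the collected passwords in a Python list; it keeps the first 100 for training and picks 5 later samples that do not appear in the training set. The helper names split_samples and write_split are hypothetical, and the code only mirrors the file layout described above.

```python
from pathlib import Path


def split_samples(passwords, n_train=100, n_test=5):
    """Return (train, test) lists where no test sample appears in train."""
    train = passwords[:n_train]
    seen = set(train)
    test = []
    for candidate in passwords[n_train:]:
        if candidate not in seen:
            test.append(candidate)
            seen.add(candidate)
        if len(test) == n_test:
            break
    if len(train) < n_train or len(test) < n_test:
        raise ValueError("not enough distinct samples for the requested split")
    return train, test


def write_split(train, test, model_name, root="."):
    """Write data/<model_name>.log and test/<model_name>.test under root."""
    root = Path(root)
    for subdir, ext, rows in (("data", ".log", train), ("test", ".test", test)):
        target = root / subdir / f"{model_name}{ext}"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("\n".join(rows) + "\n", encoding="utf-8")
```

Training samples are left untouched, including any repeats, on the assumption that repeated passwords are themselves part of a model's signature.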

Naming convention: files are named family-version-tag:

  • family — provider or base model: claude, gpt, llama, gemini, etc.
  • version — release series: 4.6, 5.4, 3.3, etc.
  • tag (optional) — variant: sonnet, opus, mini, nano, turbo, etc.

Examples: claude-4.6-sonnet.log, gpt-5.4.log, llama-3.3-70b-instruct.log
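
A quick way to check a filename against this convention is to split it on the first two hyphens. The sketch below is illustrative only: parse_model_filename and ModelName are hypothetical names, and the parser is stricter than the existing database, since it rejects names that carry no version component.

```python
from typing import NamedTuple, Optional


class ModelName(NamedTuple):
    family: str
    version: str
    tag: Optional[str]


def parse_model_filename(filename):
    """Split 'family-version[-tag].log' (or .test) into its components."""
    for ext in (".log", ".test"):
        if filename.endswith(ext):
            filename = filename[: -len(ext)]
            break
    # Split at most twice so a multi-part tag such as '70b-instruct' stays whole.
    parts = filename.split("-", 2)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected family-version[-tag], got {filename!r}")
    tag = parts[2] if len(parts) == 3 else None
    return ModelName(parts[0], parts[1], tag)
```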

The classifier automatically picks up new files. No code changes needed.
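
One plausible way such automatic pickup can work is a directory scan at startup. The sketch below assumes fingerprints live in a data directory as .log files, as described above; discover_models is a hypothetical helper, not whorl's own code.

```python
from pathlib import Path


def discover_models(data_dir="data"):
    """Return the sorted model names found as <name>.log files in data_dir."""
    # Path.stem drops only the final suffix, so 'gpt-5.4.log' yields 'gpt-5.4'.
    return sorted(p.stem for p in Path(data_dir).glob("*.log"))
```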

Currently supported models

Anthropic    claude-3-haiku, claude-4-opus, claude-4-sonnet, claude-4.1-opus,
             claude-4.5-haiku, claude-4.5-opus, claude-4.5-sonnet,
             claude-4.6-opus, claude-4.6-sonnet

OpenAI       gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini,
             gpt-4.1-nano, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.2,
             gpt-5.4, gpt-o1, gpt-o3, gpt-o3-mini, gpt-o4-mini

Other        composer-1, composer-1.5, deepseek-r1-distill-llama-70b,
             gemini-3-flash, gemini-3-pro, gemini-3.1-pro, grok, kimi-k2.5,
             llama-3-8b-instruct, llama-3.3-70b-instruct,
             mistral-nemo-instruct-2407, qwen3-32b
