The rewrite engine that gives LLM text a pulse.
Brotherizer steps in when the model lands the facts but goes flat on feeling.
It pulls from donor writing, rewrites for the right surface, and reranks until something actually sticks.
Think of it as voice middleware for teams that want less committee and more human.
No detector theater. No fake warmth. No polished-for-no-reason copy.
Just text that sounds awake instead of overmanaged.
The point is not to remove the human. It is to give them a better machine.
Brotherizer stays narrow by design:
- it retrieves donor writing patterns
- it rewrites for the mode and surface you actually need
- it reranks multiple candidates
- it lets the client keep the winner or choose another option later
Think of it as voice middleware for LLM output.
If your model already knows what to say but keeps saying it like it had to clear legal first, this is the lane.
Brotherizer is not:
- a general chat model
- a giant prompt-management suite
- a full writing app
- a detector-evasion gimmick
The point is not to make text look less AI just to win a benchmark.
The point is to make it sound more like a person actually meant it.
Brotherizer runs a five-part pipeline:

1. Retrieve donor texture
   - pull donor snippets from local packs or the corpus database
   - optionally use local embeddings for semantic lookup
2. Resolve mode + surface
   - choose the right voice family
   - apply surface-aware formatting and style directives
3. Generate multiple rewrites
   - produce several candidates instead of pretending the first shot is always the best shot
4. Rerank
   - score candidates for semantic fidelity, mode fit, surface fit, anti-generic behavior, and composition quality
   - optionally run an xAI/Grok judge pass for harder selection calls
5. Persist the decision
   - keep the winner
   - allow a client or user to choose a different candidate later
   - store job, candidate, and choice history in the runtime DB
The result is simple:
- send text in
- get ranked options back
- keep the winner, or override it
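The flow above can be sketched as a plain pipeline function. Everything here is illustrative: the helper names (`retrieve_donor_snippets`, `resolve_mode_surface`, and so on) are hypothetical stand-ins for the real retrieval/rewrite/rerank code, stubbed out so the shape is visible.

```python
# Illustrative sketch of the five-part pipeline. Every helper below is a
# hypothetical stand-in for the real retrieval/rewrite/rerank machinery.

def retrieve_donor_snippets(text, mode):
    # 1. Retrieve donor texture (stub: would query packs / corpus DB).
    return ["donor snippet"]

def resolve_mode_surface(mode, surface):
    # 2. Resolve mode + surface into style/formatting directives.
    return {"mode": mode, "surface": surface}

def generate_rewrite(text, donors, directives, seed):
    # 3. One candidate rewrite (stub: would call the generation model).
    return {"text": f"{text} (candidate {seed})", "score": 1.0 / (seed + 1)}

def persist_decision(text, ranked, winner):
    # 5. Store job, candidates, and the default winner (stub).
    return {"winner": winner, "candidates": ranked}

def brotherize(text, mode, surface, candidate_count=3):
    donors = retrieve_donor_snippets(text, mode)
    directives = resolve_mode_surface(mode, surface)
    candidates = [
        generate_rewrite(text, donors, directives, i)
        for i in range(candidate_count)
    ]
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)  # 4. rerank
    persist_decision(text, ranked, winner=ranked[0])
    return ranked
```

The only structural commitment is that the caller gets all ranked candidates back, not just the winner, so a later `choose` call stays possible.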
Brotherizer is explicit about the model split it ships with today:
- Generation lane: Perplexity Sonar
- Judge lane: xAI Grok reasoning models
- Optional semantic retrieval lane: local Ollama embeddings
In practice, that split looks like this:
- Perplexity Sonar handles the fast rewrite pass
- Grok handles the optional judgment-heavy pass when selection quality matters more than speed
- Ollama is there if you want local semantic retrieval for the donor corpus
Current defaults in the repo:
- generation model: `sonar`
- judge model: `grok-4.20-reasoning`
- embedding model: `nomic-embed-text`
You can still point the judge lane at earlier Grok reasoning variants by setting `BROTHERIZER_XAI_MODEL`. The public docs explain the split in more detail in `docs/wiki/MODEL_ROUTING_AND_PROVIDERS.md`.
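Routing by environment variable can look like this minimal sketch. The defaults match the repo's stated defaults, and `BROTHERIZER_XAI_MODEL` is the documented override; the other two variable names and the helper itself are illustrative assumptions, not the repo's actual API.

```python
import os

# Defaults as stated above; only BROTHERIZER_XAI_MODEL is documented —
# the other two env var names here are hypothetical.
DEFAULT_GENERATION_MODEL = "sonar"
DEFAULT_JUDGE_MODEL = "grok-4.20-reasoning"
DEFAULT_EMBEDDING_MODEL = "nomic-embed-text"

def resolve_models(env=None):
    """Illustrative helper: pick the three model lanes from the environment."""
    env = os.environ if env is None else env
    return {
        "generation": env.get("BROTHERIZER_GENERATION_MODEL", DEFAULT_GENERATION_MODEL),
        "judge": env.get("BROTHERIZER_XAI_MODEL", DEFAULT_JUDGE_MODEL),
        "embedding": env.get("BROTHERIZER_EMBEDDING_MODEL", DEFAULT_EMBEDDING_MODEL),
    }
```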
Brotherizer ships with a public research substrate. It is the part contributors can inspect, rebuild, and extend:
- donor packs under `data/donor_packs/`
- corpus DB builder
- optional embedding index builder
- style radar seed signals and DB builder
- formatting / internet-symbol packs
- retrieval selectors that feed the rewrite engine
What is intentionally not public is the private collection layer.
That is deliberate:
- the public repo still shows how the system thinks
- it just does not include collection machinery or internal ops lanes
If you want the longer public explanation, start here:
- `docs/wiki/HOW_IT_WORKS.md`
- `docs/wiki/RETRIEVAL_ARCHITECTURE.md`
- `docs/wiki/LOCAL_SETUP_AND_DATABASES.md`
- `RESEARCH/README.md`
Brotherizer ships with multiple voice families, including:
- `british_banter_mode`
- `worldwide_ironic_mode`
- `en_reflective_human_mode`
- `en_professional_human_mode`
- `british_professional_human_mode`
- `casual_us_human_mode`
- `ptbr_twitter_mode`
- `ptbr_narrative_human_mode`
- `ptbr_professional_human_mode`
- `seriously_english_mode`
- `seriously_ptbr_mode`
All of them are defined in `configs/brotherizer_modes.json`.
Quick mode picker:
- use `casual_us_human_mode` for lines that need to feel current and lived-in
- use `en_reflective_human_mode` when you want the text to breathe a bit more
- use `british_professional_human_mode` for restraint, without brochure polish
- use the `seriously_*` modes if the source already carries weight; no extra performance needed
- use the PT-BR modes to keep things culturally native, not flattened into generic international Portuguese
Brotherizer can condition the rewrite for:
- `reply`
- `post`
- `thread`
- `bio`
- `caption`
- `note`
That changes more than formatting. It changes rhythm, looseness, compression, and reranking behavior.
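One way to picture surface conditioning: each surface maps to its own rhythm and compression knobs that both the rewrite pass and the reranker read. The profile table below is purely illustrative; the knob names and values are assumptions, not the repo's actual configuration.

```python
# Hypothetical surface profiles. The real repo drives this from configs;
# the point is that a surface changes behavior, not just formatting.
SURFACE_PROFILES = {
    "reply":   {"max_words": 60,  "looseness": "high",   "compression": "tight"},
    "post":    {"max_words": 120, "looseness": "medium", "compression": "medium"},
    "thread":  {"max_words": 280, "looseness": "medium", "compression": "loose"},
    "bio":     {"max_words": 30,  "looseness": "low",    "compression": "tight"},
    "caption": {"max_words": 40,  "looseness": "high",   "compression": "tight"},
    "note":    {"max_words": 200, "looseness": "medium", "compression": "loose"},
}

def surface_profile(surface):
    """Look up conditioning knobs; unknown surfaces fall back to 'post'."""
    return SURFACE_PROFILES.get(surface, SURFACE_PROFILES["post"])
```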
Brotherizer does not rely on prompt adjectives alone.
It retrieves donor snippets from real writing packs and uses them as texture, pressure, and voice reference without copying them verbatim.
Brotherizer also draws on the style radar DB and the formatting / internet-symbol packs.
That helps it reason about:
- internet-native markers
- compact reaction language
- reflective vs casual surfaces
- profile/bio cleanliness
- reply vs thread vs note behavior
Brotherizer does not emit one rewrite and pray.
It generates several candidates and reranks them with:
- semantic preservation
- mode fit
- surface fit
- anti-generic heuristics
- composition penalties
- optional xAI judge scoring
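Conceptually, the reranker folds those signals into one composite score, with the optional judge pass blended in on top. This is a minimal weighted sketch; the signal names, weights, and blend are illustrative assumptions, not the repo's actual scoring code.

```python
# Illustrative composite score. Signals are assumed normalized to [0, 1];
# composition penalties would show up as a low "composition" signal.
WEIGHTS = {
    "semantic_preservation": 0.35,
    "mode_fit": 0.20,
    "surface_fit": 0.15,
    "anti_generic": 0.20,
    "composition": 0.10,
}

def composite_score(signals, judge_score=None, judge_weight=0.5):
    """Weighted sum of heuristic signals, optionally blended with a judge score."""
    base = sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)
    if judge_score is not None:
        # Optional xAI judge pass, weighted in for harder selection calls.
        return (1 - judge_weight) * base + judge_weight * judge_score
    return base
```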
The runtime persists:
- jobs
- candidates
- choices
- runtime errors
- idempotency keys
That gives you:
- stable `job_id`s
- a clear `winner` vs `chosen` distinction
- replay-safe reads of completed jobs
- idempotent rewrite submission
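Idempotent submission boils down to a key-to-job lookup: resubmitting with the same idempotency key replays the stored job instead of creating a new one. This is a toy in-memory version of that contract; the runtime keeps these records in its DB.

```python
import uuid

class JobStore:
    """Toy in-memory stand-in for the runtime DB's idempotency handling."""

    def __init__(self):
        self.jobs = {}          # job_id -> job record
        self.idempotency = {}   # idempotency key -> job_id

    def submit(self, text, idempotency_key=None):
        if idempotency_key and idempotency_key in self.idempotency:
            # Replay-safe: the same key returns the original job untouched.
            return self.jobs[self.idempotency[idempotency_key]]
        job_id = str(uuid.uuid4())
        job = {"job_id": job_id, "text": text, "candidates": [], "chosen": None}
        self.jobs[job_id] = job
        if idempotency_key:
            self.idempotency[idempotency_key] = job_id
        return job
```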
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e .
```

That gives you installable entrypoints such as:
- `brotherizer-api`
- `brotherize`
- `brotherizer-build-corpus`
- `brotherizer-build-style-radar`
- `brotherizer-build-embeddings`
```bash
export PERPLEXITY_API_KEY=your_key_here
export XAI_API_KEY=your_key_here
```

Notes:
- `PERPLEXITY_API_KEY` is required for generation
- `XAI_API_KEY` is only required if you want the judge lane
- local embeddings require a running Ollama instance if you choose to build them
You can also copy the example env:

```bash
cp .runtime/brotherizer.env.example .runtime/brotherizer.env
```

Build the corpus DB:

```bash
brotherizer-build-corpus \
  --inputs data/donor_packs/english_v3.ndjson data/donor_packs/ptbr_v2.ndjson \
  --db data/corpus/brotherizer.db
```

Build the style radar DB:

```bash
brotherizer-build-style-radar \
  --input configs/style_radar_seed_signals.json \
  --db data/corpus/style_radar.db
```

Optional: build embeddings for semantic retrieval:

```bash
brotherizer-build-embeddings \
  --db data/corpus/brotherizer.db
```

Recommended mode-driven example:
```bash
brotherize \
  --mode casual_us_human_mode \
  --text "This still sounds too polished and generic." \
  --use-xai-judge
```

Grounded, more restrained example:

```bash
brotherize \
  --mode seriously_english_mode \
  --text "I think this still sounds too polished and generic." \
  --use-xai-judge
```

Run the API directly:

```bash
brotherizer-api
```

Or use the helper script:

```bash
bash scripts/start_brotherizer_api.sh
```

By default, Brotherizer serves on http://127.0.0.1:5555.
Rewrite via API:

```bash
curl -X POST http://127.0.0.1:5555/v1/rewrite \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "I think this sounds too polished and generic.",
    "mode": "casual_us_human_mode",
    "surface_mode": "reply",
    "candidate_count": 3,
    "use_xai_judge": false
  }'
```

Choose a non-winner candidate later:

```bash
curl -X POST http://127.0.0.1:5555/v1/jobs/<job_id>/choose \
  -H 'Content-Type: application/json' \
  -d '{
    "candidate_id": "<candidate_id>",
    "actor": { "type": "client", "id": "codex" },
    "reason": "User preferred the alternate"
  }'
```

Canonical endpoints:
- `GET /`
- `GET /v1/health`
- `GET /v1/modes`
- `GET /v1/capabilities`
- `POST /v1/rewrite`
- `GET /v1/jobs/:id`
- `POST /v1/jobs/:id/choose`
Legacy wrappers:
- `GET /health`
- `GET /modes`
- `POST /rewrite`
The real contract lives under `/v1/*`.
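For clients that would rather skip curl, the same call works from stdlib Python. The payload mirrors the `/v1/rewrite` body shown above; `post_json` is an illustrative helper and only succeeds against a running API on the default port.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:5555"  # Brotherizer's default bind address

def rewrite_payload(text, mode, surface_mode="reply",
                    candidate_count=3, use_xai_judge=False):
    """Build the /v1/rewrite JSON body, mirroring the curl example."""
    return {
        "text": text,
        "mode": mode,
        "surface_mode": surface_mode,
        "candidate_count": candidate_count,
        "use_xai_judge": use_xai_judge,
    }

def post_json(path, payload):
    """POST a JSON body and decode the JSON response (needs a live server)."""
    req = urllib.request.Request(
        API_BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example (only works against a running instance):
# job = post_json("/v1/rewrite",
#                 rewrite_payload("I think this sounds too polished.",
#                                 "casual_us_human_mode"))
```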
Brotherizer now ships with a small build baseline.
Useful local commands:
```bash
make dev-install
make test
make run-api
make build-corpus
make build-style-radar
make build-embeddings
```

Container build:

```bash
docker build -t brotherizer:local .
docker run -p 5555:5555 --env-file .runtime/brotherizer.env brotherizer:local
```

Start here:
Most useful pages:
- `docs/wiki/HOW_IT_WORKS.md`
- `docs/wiki/POSITIONING.md`
- `docs/wiki/MODEL_ROUTING_AND_PROVIDERS.md`
- `docs/wiki/API_REFERENCE.md`
- `docs/wiki/RUNTIME_LIFECYCLE_AND_RECOVERY.md`
- `docs/wiki/LEGACY_WRAPPERS_AND_COMPATIBILITY.md`
- `docs/wiki/RETRIEVAL_ARCHITECTURE.md`
- `docs/wiki/FORMATTING_PACKS_AND_SYMBOL_LIBRARY.md`
- `docs/wiki/SECURITY_AND_SECRETS.md`
Research and corpus-building docs:
- `RESEARCH/README.md`
- `RESEARCH/BUILDING_DATABASES.md`
- `RESEARCH/DONOR_PACKS.md`
- `RESEARCH/PROVIDERS.md`
- `RESEARCH/CONTRIBUTING.md`
- `RESEARCH/SHIPPED_VS_NOT_SHIPPED.md`
Clean machine. Human output. Builder energy.
Brotherizer only gets as good as the voice library.
That means the best contributions are usually not another endpoint.
They are:
- a cleaner donor pack
- a sharper register
- a language the repo barely covers today
- a better note / reply / caption surface
We especially want:
- more languages
- more registers
- cleaner professional voices
- better note / reply / caption coverage
If you can build a clean, text-only donor pack in your language, we want it.
If you can build two, even better. The machine has no shame and would like to sound less generic in more countries.
Please keep identity out of the data:
- no handles
- no names
- no emails
- no signatures
- no `source_ref`
- no metadata that can reveal the author
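A quick pre-submission scan catches the obvious identity leaks before a pack goes out. The patterns below are illustrative, not exhaustive, and are no substitute for the sanitization guide; `scan_donor_line` is a hypothetical helper name.

```python
import json
import re

# Illustrative leak patterns; extend per the sanitization docs.
LEAK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "handle": re.compile(r"(?<!\w)@\w{2,}"),
}

def scan_donor_line(ndjson_line):
    """Return a list of problems found in one NDJSON donor record."""
    record = json.loads(ndjson_line)
    problems = []
    if "source_ref" in record:
        problems.append("source_ref field present")
    text = record.get("text", "")
    for name, pattern in LEAK_PATTERNS.items():
        if pattern.search(text):
            problems.append(f"possible {name} in text")
    return problems
```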
Start here:
- `RESEARCH/CONTRIBUTING.md`
- `RESEARCH/PROVIDERS.md`
- `RESEARCH/SAFETY_AND_SANITIZATION.md`
- `RESEARCH/LANGUAGE_COVERAGE.md`
Brotherizer lives between brand-voice systems and LLM middleware.
It is closer to:
- a style-retrieval runtime
- a rewrite-and-rerank engine
- a choice layer for agent output
It is not trying to be:
- Jasper
- Grammarly
- PromptLayer
- LangSmith
- an "undetectable AI" circus
Those sit nearby. Brotherizer's lane stays narrow:
retrieve the right texture, rewrite the line, rerank the options, and keep what sounds alive.
Core regression checks:

```bash
python3 -m py_compile api/brotherizer_api.py brotherize.py runtime/service.py storage/runtime_db.py tests/test_runtime_service.py tests/test_runtime_api.py
python3 -m unittest tests/test_runtime_service.py tests/test_runtime_api.py
```

