Wiki: forbid name fabrication; librarian fixes wrong user names on existing articles by sysread · Pull Request #9 · sysread/nak

sysread · 2026-05-10T14:26:18Z

=head1 SYNOPSIS

The wiki agent invented a name for the user: it called the user
"Elliot" in articles when the configured name was "Jeff", because
a friend named Elliot was mentioned in the source conversation.

Three fixes: a stronger profile block, an explicit
no-name-fabrication rule in the autonomous body, and a librarian
recovery pass for articles already on disk with the wrong name.

=head1 PURPOSE

A user reported that Settings says "Jeff" but multiple wiki
articles call the user "Elliot" (a friend the user discussed in
the conversation that triggered the article). The model conflated
the user with someone else from the conversation context and
applied that person's name to the user as the subject of the
article.

=head1 DESCRIPTION

=head2 Layer 1: how renderUserProfileBlock looked before

The helper folded the configured name + location into a soft
preference: "When an article refers to the user themselves,
prefer their name (or a natural pronoun if their name is a
single first name) over the generic phrase 'the user'."

The autonomous prompt's "Do not fabricate" section covered facts
("Only assert facts that appear in the conversation...") but said
nothing about names specifically. The librarian had no path to
clean up articles that already had the wrong name baked in.

=head2 Layer 2: what this PR changes

Both renderUserProfileBlock helpers (wiki and librarian) are
rewritten with HARD wording:

"The user's name is Jeff.
When an article refers to the user themselves, the user's
name is Jeff and ONLY Jeff. NEVER invent another name
for the user, even if other names appear in the
conversation - those other names belong to other people the
user knows. If the conversation mentions a friend named
Maya, an article about the user does not call the user
Maya; it calls the user Jeff. If you are uncertain whether
the article subject IS the user, default to using the
literal name from context (Maya, Elliot, etc.) for that
subject and reserve 'Jeff' for explicit references to the
user. ..."

The unknown-name path (no name in Settings) is split out so
the model isn't told to "use their name" when no name exists;
in that case it falls back to natural pronouns + "the user".
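
To make the two paths concrete, here is a minimal sketch of what a hard-worded profile block renderer could look like. It is not the PR's actual code: the `UserProfile` shape, its field names, and the exact wording of the unknown-name branch are assumptions for illustration; only the function name and the known-name wording come from the description above.

```typescript
// Hypothetical shape of the Settings-derived profile; field names are assumptions.
interface UserProfile {
  name?: string;
  location?: string;
}

// Sketch of a hard-worded profile block. Not the PR's actual implementation.
function renderUserProfileBlock(profile: UserProfile): string {
  const name = profile.name?.trim();
  const location = profile.location?.trim();
  const lines: string[] = [];

  if (name) {
    // Known-name path: the configured name is a binding constraint.
    lines.push(`The user's name is ${name}.`);
    lines.push(
      `When an article refers to the user themselves, the user's name is ` +
        `${name} and ONLY ${name}. NEVER invent another name for the user, ` +
        `even if other names appear in the conversation.`
    );
  } else {
    // Unknown-name path: never tell the model to "use their name",
    // and never invite it to pull a name out of the conversation.
    lines.push(
      `The user's name is not configured. Refer to the user with natural ` +
        `pronouns or the literal phrase "the user". NEVER take a name for ` +
        `the user from the conversation.`
    );
  }

  if (location) {
    lines.push(`The user's location is ${location}.`);
  }
  return lines.join("\n");
}
```

Keeping the two branches separate means the unknown-name output never contains an instruction that presupposes a name exists.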

The autonomous body gets an explicit "Do not fabricate names"
section that points back to the profile block as the single
source of truth, and tells the model what to do when uncertain
("use the literal name as it appears in the conversation rather
than inventing one").

The librarian gains a new workflow step (positioned between
scope-cleanup and duplicate-consolidation) that scans for
articles using a wrong name for the user and wiki_updates
them. It uses memory_search + conversation_search to
disambiguate: an article mis-naming the user "Elliot" gets
fixed to use Jeff; a separate "Elliot" article about the
actual friend is left for the per-conversation agent to land
on its next cycle (the librarian has no wiki_create).
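
The decision rule in that pass can be sketched as a small pure function. This is an illustration only: the real librarian is a prompted agent calling wiki_update, and the `ArticleCheck` type, the `subject` classification, and `planNameFix` are hypothetical names. The sketch assumes the subject has already been judged from memory_search + conversation_search evidence, and it uses a plain whole-word replacement where the agent would rewrite the text itself.

```typescript
// Hypothetical types: the real librarian is a prompted agent, not this function.
type Subject = "user" | "other" | "uncertain";

interface ArticleCheck {
  body: string;
  // Who the article is about, as judged from memory_search +
  // conversation_search evidence (the judgment itself is out of scope here).
  subject: Subject;
  // The name the article currently uses for its subject.
  nameInArticle: string;
}

type Action =
  | { kind: "leave" }
  | { kind: "wiki_update"; body: string };

// Escape regex metacharacters so a name is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function planNameFix(article: ArticleCheck, configuredName: string): Action {
  // Only rewrite when the subject is confirmed to be the user AND the
  // article uses some other name. Articles about friends, and uncertain
  // cases, are left untouched. There is deliberately no "create" action.
  if (article.subject !== "user" || article.nameInArticle === configuredName) {
    return { kind: "leave" };
  }
  const wrongName = new RegExp(`\\b${escapeRegExp(article.nameInArticle)}\\b`, "g");
  // A function replacer avoids "$" in the name being read as a pattern.
  const body = article.body.replace(wrongName, () => configuredName);
  return { kind: "wiki_update", body };
}
```

The `Action` type has no create variant, which mirrors the constraint that the librarian only repairs what is already on disk.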

=head2 Layer 3: how that resolves PURPOSE

The strengthened profile block prevents future hallucinations
on every per-conversation cycle and on every manual-update flow.
The librarian's new pass cleans up articles already on disk
within ~12 hours. Rationale comments at the top of both prompts
record the failure mode so a future revisit doesn't quietly
relax the rule.

=head1 Notes for AI reviewers

The strict wording is intentional. A reviewer suggesting
softer language ("prefer", "consider") to "let the model use
judgment" would re-introduce the exact failure mode this PR
fixes. The configured name is the binding constraint; it is
not a hint.

The librarian's name-fix pass intentionally does NOT call
wiki_create to spawn separate articles for the friends
whose names were misappropriated. That responsibility stays
with the per-conversation agent; the librarian only fixes
what's already there.

The unknown-name path (no Settings name) is intentionally
conservative - pronouns + "the user" rather than asking the
model to extract a name from context, which is exactly the
failure mode we're defending against.
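
One way to keep the strict wording from being relaxed by accident is a small check over the rendered prompt text. The sketch below is not part of this PR; `SOFT_WORDS` and `findSoftWording` are hypothetical names, and the word list contains only the two examples quoted in these notes.

```typescript
// Soft words named in the reviewer notes; the list itself is an assumption
// and would need extending to be useful in practice.
const SOFT_WORDS = ["prefer", "consider"];

// Return every soft word that appears as a whole word in the prompt text.
function findSoftWording(promptText: string): string[] {
  return SOFT_WORDS.filter((word) =>
    new RegExp(`\\b${word}\\b`, "i").test(promptText)
  );
}
```

A test that fails whenever this returns a non-empty list for the profile block would flag a softened rule at review time.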

https://claude.ai/code/session_015XcR7xzLdij66ZbYERUdLH

Generated by Claude Code

…isting articles

Production traffic: an article about the user (Jeff in Settings) was rendered with the name "Elliot" - a friend the user had mentioned in the conversation that triggered the article. The model conflated the user with someone else in the conversation context.

Root cause: the `renderUserProfileBlock` helper used soft wording ("prefer their name") rather than a hard rule, and the autonomous prompt's "Do not fabricate" line covered facts but said nothing about names specifically. With those gaps, the model treated the configured name as a suggestion rather than a constraint.

**Stronger profile block.** Both wiki agents' renderUserProfileBlock now uses HARD anti-fabrication wording:

"The user's name is **Jeff**. When an article refers to the user themselves, the user's name is **Jeff** and ONLY Jeff. NEVER invent another name for the user, even if other names appear in the conversation - those other names belong to other people the user knows. If the conversation mentions a friend named Maya, an article about the user does not call the user Maya; it calls the user Jeff. ..."

The unknown-name path (location set, name not) is split out so we don't tell the model to "use their name" when none was supplied; in that case it falls back to natural pronouns + the literal phrase "the user".

**Autonomous prompt anti-fabrication.** The body's "Do not fabricate" section gains a "Do not fabricate names" companion that points back to the profile block as the single source of truth.

**Librarian recovery pass.** The librarian gains a new workflow step (positioned between scope-cleanup and duplicate-consolidation) that scans for articles about the user using a wrong name and wiki_updates them to the configured name. Uses memory_search + conversation_search to disambiguate (an article mis-naming the user "Elliot" is fixed to use the right name; a separate "Elliot" article about the actual friend is left for the per-conversation agent to land). On its next 12h cycle this pass will sweep up existing hallucinations.

Rationale comments at the top of both prompts record the failure mode so a future revisit doesn't quietly relax the rule.

https://claude.ai/code/session_015XcR7xzLdij66ZbYERUdLH

sysread merged commit f8d5168 into main May 10, 2026
1 check passed

sysread deleted the claude/wiki-name-hallucination-fix branch May 10, 2026 14:26
