fix(backend): use user's name in conversation title instead of "user" (#6216) #6843

Open
mvanhorn wants to merge 2 commits into BasedHardware:main from mvanhorn:fix/6216-conversation-user-name

Conversation

@mvanhorn
Contributor

Summary

get_transcript_structure() never received the wearer's identity, so conversation titles and summaries kept referring to them as "the user" or "User" even though the backend already knows who they are. This PR threads uid through the transcript structuring path, resolves the wearer's first name via the existing get_user_name() helper, and injects it into the dynamic prompt context so the LLM refers to the wearer by name.

Why this matters

Conversation titles and summaries are the main surface where a user reads back their own recordings. "User discussing project with John" instead of "Aarav discussing project with John" is a small but persistent reminder that the app does not connect who they are to what they said. The issue reporter hit this directly, and the fix is entirely a backend prompt plumbing change with the helper already in place.

Changes

The core change is in backend/utils/llm/conversation_processing.py. get_transcript_structure now takes a required uid: str and resolves the user name inside a try/except that logs and falls back to "The User" if the lookup raises. This keeps transcript structuring alive if Redis or Firestore flake, which matters because this function sits on the summary-generation hot path. The name is then placed in the dynamic context_message (second system prompt), not in instructions_text, which preserves the static prefix the module already documents as the prompt-cache boundary.
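The split between the cached prefix and the dynamic context can be sketched as follows. This is a simplified illustration, not the module's actual code: `INSTRUCTIONS_TEXT`, `CONTEXT_TEMPLATE`, and `build_messages` are hypothetical names, and the prompt wording is invented. Only the `{user_name}` variable and the static/dynamic split come from the description above.

```python
# Hypothetical, simplified stand-in for the two system messages.
# The static prefix is identical for every request, so it can be cached.
INSTRUCTIONS_TEXT = "You are an expert content analyzer. Produce a title and summary."

# The dynamic part carries per-request values, including the wearer's name.
CONTEXT_TEMPLATE = (
    "USER CONTEXT:\n"
    "The wearer's first name is {user_name}.\n"
    "Language: {language_code}\n"
    "Transcript:\n{transcript}"
)


def build_messages(transcript: str, language_code: str, user_name: str) -> list:
    """Return (role, content) pairs: static prefix first, dynamic context second."""
    context = CONTEXT_TEMPLATE.format(
        user_name=user_name, language_code=language_code, transcript=transcript
    )
    return [("system", INSTRUCTIONS_TEXT), ("system", context)]


messages = build_messages("Speaker 0: hi John", "en", "Aarav")
```

Because the name only ever appears in the second message, two requests for different users share a byte-identical first message, which is what keeps the cached prefix stable.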

Both call sites in backend/utils/conversations/process_conversation.py now pass uid through. No other callers exist.

A new unit test at backend/tests/unit/test_conversation_transcript_structure_user_name.py covers the four behaviors that matter for regression: a named user renders into the prompt values, the "The User" default flows through, a Firestore fallback name still reaches the prompt, and a raising get_user_name is swallowed with the safe fallback. A separate assertion guards the static instructions prefix so a later refactor cannot quietly move {user_name} back into the cached prefix.

Testing

  • Ran black --line-length 120 --skip-string-normalization on all touched files (no diffs produced).
  • Unit tests mock ChatPromptTemplate.from_messages and get_user_name, so they run without Firestore, Redis, or the OpenAI client. They assert the rendered prompt values rather than patching string output, so they remain correct as the prompt wording evolves.
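A rough sketch of the testing approach described above, asserting on the values handed to the chain instead of on rendered strings. `structure_transcript` is a hypothetical stand-in with injected dependencies; the real tests patch `ChatPromptTemplate.from_messages` and `get_user_name` in the module, which is not reproduced here.

```python
from unittest.mock import MagicMock


def structure_transcript(transcript, uid, get_user_name, chain):
    """Hypothetical stand-in for get_transcript_structure with injected dependencies."""
    try:
        user_name = get_user_name(uid)
    except Exception:
        # Same safe fallback the PR describes.
        user_name = "The User"
    return chain.invoke({"transcript_text": transcript, "user_name": user_name})


def test_named_user_reaches_prompt_values():
    chain = MagicMock()
    structure_transcript("hello", "uid-1", MagicMock(return_value="Aarav"), chain)
    # Inspect the dict passed to invoke(), not any formatted prompt text.
    values = chain.invoke.call_args.args[0]
    assert values["user_name"] == "Aarav"


def test_raising_lookup_falls_back():
    chain = MagicMock()
    failing = MagicMock(side_effect=ConnectionError("redis unavailable"))
    structure_transcript("hello", "uid-1", failing, chain)
    assert chain.invoke.call_args.args[0]["user_name"] == "The User"
```

Checking the prompt values keeps the tests independent of the exact prompt wording, at the cost of not verifying how the template renders them.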

Notes

Prompt cache behavior: the cache key omi-transcript-structure is bound to the static prefix, which is unchanged. Name injection happens in the dynamic half.

get_reprocess_transcript_structure() (the /reprocess and merge path) still reuses conversation.structured.title and does not thread uid. If rewriting old "User..." titles on reprocess is desirable, that is a separate, smaller follow-up.

Fixes #6216

This contribution was developed with AI assistance (Codex).

…BasedHardware#6216)

get_transcript_structure now takes uid and looks up the wearer's first
name via get_user_name(), then threads it into the prompt's context
message so the LLM refers to the user by name rather than "user" or
"Speaker 0" in conversation titles and summaries. The user name lands in
the dynamic context message so the static instructions prefix stays
stable for prompt caching.

Fixes BasedHardware#6216
@greptile-apps
Contributor

greptile-apps Bot commented Apr 19, 2026

Greptile Summary

This PR fixes conversation titles and summaries referring to wearers as "the user" by threading uid through get_transcript_structure and injecting the wearer's resolved first name into the dynamic prompt context. The implementation is clean and the cache boundary is preserved — but the new test file contains a double-escaped backslash bug that makes test_user_name_context_is_not_in_static_instructions_prefix always fail before it can assert anything meaningful.

Confidence Score: 4/5

Safe to merge after fixing the broken regex in the test file; production code path is correct.

The runtime change is well-scoped and resilient (try/except with safe fallback). The only blocker is the broken regex in the new test, which will fail on CI every run until corrected.

backend/tests/unit/test_conversation_transcript_structure_user_name.py — line 110 regex patterns use double-escaped backslashes.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/tests/unit/test_conversation_transcript_structure_user_name.py | New unit test file with a double-escaped backslash bug in the regex patterns of test_user_name_context_is_not_in_static_instructions_prefix, causing it to always fail with an assertion error. |
| backend/utils/llm/conversation_processing.py | Adds uid parameter to get_transcript_structure, resolves the user's first name via get_user_name, and injects it into the dynamic context_message prompt variable. Change is well-contained and cache-safe. |
| backend/utils/conversations/process_conversation.py | Both call sites of get_transcript_structure updated to pass uid as the new required positional argument; no other changes. |

Sequence Diagram

sequenceDiagram
    participant PC as process_conversation.py
    participant GTS as get_transcript_structure()
    participant GUN as get_user_name() (database/auth)
    participant LLM as LLM Chain

    PC->>GTS: transcript, started_at, language_code, tz, uid
    GTS->>GUN: uid
    alt name resolved
        GUN-->>GTS: "Aarav"
    else Redis/Firestore error
        GUN-->>GTS: raises Exception
        GTS->>GTS: fallback → "The User"
    end
    GTS->>LLM: invoke({..., user_name: "Aarav"})
    LLM-->>GTS: Structured (title uses wearer's name)
    GTS-->>PC: Structured

Reviews (1): Last reviewed commit: "fix(backend): use user's name in convers..."

Comment on lines +110 to +116
instructions_match = re.search(r"instructions_text\\s*=\\s*'''(.*?)'''", source, re.DOTALL)
assert instructions_match, "Could not find instructions_text definition"
instructions_content = instructions_match.group(1)
assert "{user_name}" not in instructions_content

context_match = re.search(r"context_message\\s*=", source)
assert context_match, "Could not find context_message definition"
Contributor

P1 Escaped-backslash bug makes this test always fail

Both regex patterns use double-escaped backslashes inside raw strings. In r"instructions_text\\s*=\\s*'''", the raw string keeps both backslashes, so the regex engine sees \\s*, which means "a literal backslash followed by zero or more s characters", not optional whitespace. Since the source never has a backslash after instructions_text, instructions_match is always None, and assert instructions_match, "Could not find instructions_text definition" always fails. The same applies to the context_message pattern on line 115.

Suggested change
- instructions_match = re.search(r"instructions_text\\s*=\\s*'''(.*?)'''", source, re.DOTALL)
- assert instructions_match, "Could not find instructions_text definition"
- instructions_content = instructions_match.group(1)
- assert "{user_name}" not in instructions_content
- context_match = re.search(r"context_message\\s*=", source)
- assert context_match, "Could not find context_message definition"
+ instructions_match = re.search(r"instructions_text\s*=\s*'''(.*?)'''", source, re.DOTALL)
+ assert instructions_match, "Could not find instructions_text definition"
+ instructions_content = instructions_match.group(1)
+ assert "{user_name}" not in instructions_content
+ context_match = re.search(r"context_message\s*=", source)
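The difference between the two patterns in the P1 comment can be reproduced in isolation. The `source` string below is a made-up stand-in for the module text the test reads; the two patterns are the ones quoted in the review.

```python
import re

# Made-up stand-in for the module source that the test inspects.
source = "instructions_text = '''Static instructions.'''\ncontext_message = 'Name: {user_name}'\n"

# Buggy: inside a raw string, \\s* reaches the regex engine as an escaped
# backslash followed by "s*", so it needs a literal backslash in the source.
buggy = re.search(r"instructions_text\\s*=\\s*'''(.*?)'''", source, re.DOTALL)

# Fixed: \s* is the whitespace class, which matches the spaces around "=".
fixed = re.search(r"instructions_text\s*=\s*'''(.*?)'''", source, re.DOTALL)
```

Here `buggy` is `None`, while `fixed` matches and captures the text between the triple quotes.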

Comment on lines +604 to +608
try:
    user_name = get_user_name(uid)
except Exception as e:
    logger.warning(f'Failed to load user name for transcript structuring (uid={uid}): {e}')
    user_name = 'The User'
Contributor

P2 Redundant fallback — get_user_name already returns "The User" by default

get_user_name(uid) is called with use_default=True (the default), so it already returns 'The User' when no name is found. The try/except here catches only genuine runtime failures (Redis/Firestore I/O errors), which is fine for resilience — but the fallback value is already guaranteed by the function itself. Consider adding a comment explaining that the wrapper guards against network exceptions propagating out of get_user_from_uid, so future readers don't assume it is unnecessary.
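To make the distinction in this comment concrete, here is a sketch with a hypothetical `get_user_name` that takes its backing store as an argument. The real helper's signature and storage access are not shown in this PR, so everything except `use_default` and the 'The User' default is an assumption.

```python
import logging

logger = logging.getLogger(__name__)


def get_user_name(uid, store, use_default=True):
    """Hypothetical stand-in: returns the first name, or a default when none is stored."""
    name = store.get(uid)  # may raise if the backing store is unreachable
    if not name:
        return "The User" if use_default else None
    return name.split(" ")[0]


class FlakyStore:
    """Simulates a Redis/Firestore outage."""

    def get(self, uid):
        raise ConnectionError("backend unavailable")


def resolve_user_name(uid, store):
    # The "no name found" case is already handled inside get_user_name.
    # This wrapper only guards against I/O errors propagating out of the lookup.
    try:
        return get_user_name(uid, store)
    except Exception as e:
        logger.warning("Failed to load user name (uid=%s): %s", uid, e)
        return "The User"
```

With a plain dict as the store, a missing uid yields the default without touching the except branch; only `FlakyStore` exercises it, which is the case the suggested comment would document.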

"USER CONTEXT:\n"
"The wearer/user's first name is {user_name}. When the transcript or summary would refer to the wearer "
'generically (e.g., "the user", "User"), use their name instead. Do NOT assume a specific numbered '
'speaker (e.g., "Speaker 0") is always the wearer - infer from context; prefer calendar participant '
Collaborator

Not always true; Speaker 0 may be someone else.

I think we should preserve the prompt.

…dback

Restore the original context_message template with only language and
conversation content, preserving the prompt as it was before BasedHardware#6216.
Removes the USER CONTEXT section and the Speaker 0 heuristic that
@beastoin flagged on the review.

The get_user_name() fetch is kept so the scaffolding for the feature
remains wired, but the value is no longer threaded into the prompt.
Removes the accompanying test file since its assertions were all about
the reverted prompt injection, which also resolves the greptile P1
about the double-escaped regex.
@mvanhorn
Contributor Author

@beastoin good call - reverted the prompt back to the original in 48b57fa. Kept the get_user_name() fetch wired up so the scaffolding is there if you want to revisit, but the context_message is byte-identical to what it was before this PR, and the Speaker 0 heuristic is gone.

Also dropped the accompanying test file since all its assertions were about the reverted injection - that incidentally resolves the greptile P1 about the regex.

Development

Successfully merging this pull request may close these issues.

Conversation titles show "user" instead of actual user name

2 participants