feat(user_interviews): render interview transcripts as chat bubbles#60683
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1a2ef3a7d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
products/user_interviews/frontend/TranscriptChat.tsx:19
The `g` flag on `TURN_SPLIT_RE` is superfluous: `String.prototype.split()` does not use `lastIndex` and ignores `g` (it resets the index internally per the spec). Leaving `g` on a module-level regex constant adds noise and is a trap for any future caller that uses `exec()` or `test()` on the same constant, since those methods will resume from `lastIndex` and produce wrong results on the second call.
```suggestion
const TURN_SPLIT_RE = /\b(AI|Assistant|Interviewer|User|Interviewee):\s+/
```
### Issue 2 of 2
products/user_interviews/frontend/TranscriptChat.test.tsx:34-41
**Speaker label inside content causes incorrect split**
The `'colons in user content do not split'` test only guards against bare colons; it doesn't protect against a recognised speaker word appearing inside spoken text (e.g. `'AI: I asked the User: about it. User: Sure.'`). Because `TURN_SPLIT_RE` uses `\b` rather than anchoring to line-start or sentence-start, "User:" mid-sentence matches exactly like a turn delimiter, splitting the AI's reply and creating a spurious turn. Adding a parameterised case for this scenario would document the limitation and catch any future regex change that makes it worse.
Reviews (1): Last reviewed commit: "feat(user_interviews): render interview ..." | Re-trigger Greptile |
|
Size Change: +4.23 kB (+0.01%) Total Size: 80.8 MB 📦 View Changed
ℹ️ View Unchanged
|
There was a problem hiding this comment.
Purely additive frontend component replacing a plain markdown renderer with a chat-bubble UI. Degrades gracefully to markdown when no recognised speaker labels are found, so audio-transcribed interviews (which use #### Speaker N headings) remain unbroken. Bot comments raise valid code-quality notes (superfluous /g flag, \b-anchored regex edge cases) but none are showstoppers — no data loss, no API contract changes, no security risk.
pauldambra
left a comment
There was a problem hiding this comment.
QA Swarm review complete. See inline comments and the top-level summary.
|
Note 🤖 Automated comment by QA Swarm — not written by a human Multi-perspective review: qa-team (specialists + generalists), paul-reviewer, xp-reviewer, security-audit Verdict:
|
| Reviewer | Assessment |
|---|---|
| 🔍 qa-team | Component is safe (LemonMarkdown is skipHtml); parser's mid-sentence false splits and leading-prose drop are the substantive concerns. |
| 👤 paul | Stamp anyway — the two things to chat about are the upstream coupling/fragility of recovering structure from free-text and the lack of observability on the fallback path. |
| 📐 xp | Small, well-shaped change in the right XP spirit; tighten the parser contract and add two more parameterized cases to lock the behaviour in. |
| 🛡 security-audit | No new vulnerabilities introduced — XSS surface unchanged from baseline, MD5 Gravatar URL safe, no new endpoints or trust boundaries. |
Automated by QA Swarm — not a human review
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
Purely additive read-only display component that replaces a plain markdown renderer with a chat-bubble UI; degrades gracefully to markdown when no speaker labels are found. No data writes, no API contract changes, no security risk. Unresolved bot comments raise valid edge-case display bugs (mid-sentence regex split, preamble before first speaker discarded) but neither causes data loss or crashes — the original transcript is always preserved in the database.
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
Purely additive read-only display component. The two showstopper-level bot concerns (mid-sentence regex split, superfluous g flag) were fixed in follow-up commits and covered by new parameterized tests. Remaining unresolved comments are all automated QA NITs/LOWs — display edge cases (preamble before first speaker lost in rendering, cosmetic bubble color, helper deduplication) with no data loss risk since the transcript is persisted independently. No API contract, data model, or security changes.
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
There was a problem hiding this comment.
Purely additive read-only display component replacing a plain markdown renderer with a chat-bubble UI. No backend, data model, or API contract changes. All substantive bot review comments were addressed in follow-up commits and are resolved; tests cover the key edge cases (mid-turn speaker keyword, leading prose, lowercase prefixes). Graceful markdown fallback preserved.
|
✅ Visual changes approved by @pauldambra — baseline updated in 6 changed, 16 new. |
New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.
Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
String.prototype.split() does not consult lastIndex on a RegExp, so the global flag is irrelevant for split usage and would silently break any future exec()/test() call against the same module-level constant. Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
…urns Vapi's `transcript` field is newline-delimited (confirmed against three real interview responses on the live project). The previous word-boundary regex would have mis-split a turn body containing a speaker keyword (e.g. "How did the User: feel?") into a spurious turn. Anchor the split to start-of-string or a newline so only true turn boundaries match, and accept lowercase prefixes for resilience against provider drift. Update tests to reflect the real input shape (newline-delimited turns) and add new cases pinning down the keyword-in-body, lowercase, and inline-no-newlines behaviour. Storybook fixtures updated to match. Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
…, add observability hooks - Replace inline ProfilePicture wiring with PersonIcon + asDisplay so the avatar/label fallback chain lives where it already exists (kills the triple-duplicate "name || email || distinct_id" fallback). - Fall back to plain markdown rendering when parseTranscript returns non-empty leading prose, so any preamble before the first speaker prefix is preserved rather than silently dropped. - Add data-attr="transcript-chat" + data-parsed="true|false" on the wrapper so we can measure how often the bubble layout fires vs the markdown fallback. - Use turn.name as the bubble label (so Assistant:/Interviewer: in the raw transcript show through faithfully). - Drop useMemo on the cheap parser, inline it. - Move parseTranscript into its own file so the tests can import the pure parser without dragging the React/Kea import graph in. Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
The new Storybook story and Jest test in this package pulled in `@storybook/react` and `jest` types without the package.json declaring them as peer dependencies, so the isolated typecheck failed. Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
…peer deps The previous commit added @storybook/react, jest, and typescript to the product's peerDependencies but did not update pnpm-lock.yaml, so CI's `pnpm install --frozen-lockfile` step failed on the manifest/lockfile mismatch. Record the three resolved versions (all already present in the lockfile's package graph) against the user_interviews importer. Generated-By: PostHog Code Task-Id: 9c7e17df-cee3-4307-81cd-ffa8a4fcac6d
1bbb454 to
449aa73
Compare
|
🎭 Playwright report · View test results →
These issues are not necessarily caused by your changes. |
22 updated Run: 627e0ad6-a019-4158-8299-533cac993d5e Co-authored-by: pauldambra <984817+pauldambra@users.noreply.github.com>

Problem
Interview response transcripts are stored as a single inline string with
AI:/User:speaker prefixes (and sometimesAssistant:/Interviewee:/Interviewer:), then rendered as one wall of markdown. That makes it hard to follow who said what — the speaker labels and turns are visually indistinguishable from the body text.Changes
TranscriptChatcomponent (products/user_interviews/frontend/TranscriptChat.tsx) that:RobotHogavatar, the interviewee on the right withProfilePicture(gravatar via the loaded person's email, falling back to the URL identifier when it is an email).UserInterviewResponse.tsxnow mountsTranscriptChatin place of the rawLemonMarkdownblock inside the Transcript widget.Scenes-App/User Interviews/Transcript Chatwith seven stories covering: known/unknown person, single-paragraph (Vapi-style) input, newline-separated input,Assistant/Intervieweealiases, the no-speaker fallback, and a non-email identifier. All story data is synthetic.How did you test this code?
I'm an agent. I did not drive a browser against the live response page.
TranscriptChat.test.tsx(8 cases viadescribe.each, all passing underjest). Cases cover: empty input, no speaker markers, single AI turn, AI-then-User on one line, multi-turn with newlines,Assistant/Intervieweealiases, embedded colons that must not split a turn, and empty turns being skipped.pnpm --filter=@posthog/frontend formatandoxlintover the changed files — clean.tsc --noEmit(OOMs on this machine), but the affected files compile cleanly when checked.The Storybook scene is the recommended way for a human reviewer to eyeball the rendering before merge.
Automatic notifications
Docs update
No docs changes — this is an internal UX tweak.
🤖 Agent context
Authored by PostHog Code. Investigated the existing transcript renderer (
UserInterviewResponse.tsx), the Vapi webhook ingestion path that storestranscriptverbatim, and howProfilePicturealready does gravatar lookups by email — so the chat bubble work could reuse it rather than introduce a new avatar component. Considered placing the parser inline inUserInterviewResponse.tsxbut split it out so it could be unit-tested without React rendering and showcased in Storybook with fake data.Picked
\b(AI|Assistant|Interviewer|User|Interviewee):\s+/gfor the split: an earlier(?:^|\s)-based regex consumed the whitespace between turns and missed the second speaker when multiple spaces separated them. Word boundary is robust to that and to inline-vs-newline formatting, at the small risk of false-splitting a sentence that ends in a literal speaker token followed by:— accepted as an acceptable edge case for now.Created with PostHog Code