feat(call): speaker diarization by whutchinson98 · Pull Request #2846 · macro-inc/macro

whutchinson98 · 2026-04-24T17:09:06Z

No description provided.

coderabbitai · 2026-04-24T17:09:55Z

Warning

Rate limit exceeded

@whutchinson98 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 19 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bf251174-6a01-44bc-ab79-fed0a430a23a

📥 Commits

Reviewing files that changed from the base of the PR and between c54f72d and d072bd1.

⛔ Files ignored due to path filters (8)

js/app/packages/service-clients/service-cognition/generated/tools/schemas.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-cognition/generated/tools/types.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/schemas/callRecordTranscriptSegment.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/schemas/callRecordTranscriptSegmentDiarizedSpeakerId.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/schemas/index.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/schemas/transcriptSegmentRequest.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/schemas/transcriptSegmentRequestDiarizedSpeakerId.ts is excluded by !**/generated/**
js/app/packages/service-clients/service-storage/generated/zod.ts is excluded by !**/generated/**

📒 Files selected for processing (1)

js/app/packages/service-clients/service-storage/openapi.json

📝 Walkthrough

This PR adds speaker diarization support to the transcription system. The Python agent now requests Deepgram diarization, computes dominant speakers per turn, and generates stable identifiers. The Rust backend extends data models and database schemas to persist and retrieve these speaker identifiers throughout the transcript lifecycle.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Python Agent Diarization | `agents/transcription/transcriber.py` | Requests Deepgram diarization, extracts per-word speaker identifiers from FINAL_TRANSCRIPT events, accumulates speaker counts per turn, resolves dominant speaker to stable UUID, and includes `diarizedSpeakerId` in segment payloads. |
| Rust Data Models | `rust/cloud-storage/call/src/domain/models.rs`, `rust/cloud-storage/call/src/inbound/toolset/read_call_record.rs` | Adds optional `diarized_speaker_id` field to `TranscriptSegmentRequest`, `CallRecordTranscriptSegment`, and `TranscriptSegment` structs with appropriate serde attributes for request/response handling. |
| Rust Persistence Layer | `rust/cloud-storage/call/src/outbound/pg_call_repo.rs`, `rust/cloud-storage/call/src/outbound/pg_call_repo/test.rs` | Extends SQL queries to insert, copy, and retrieve `diarized_speaker_id` from transcript tables; updates repository mapping logic and test assertions to verify persistence across active and archived records. |
| Database Schema & Fixtures | `rust/cloud-storage/macro_db_client/migrations/20260424130000_call_transcripts_diarized_speaker_id.sql`, `rust/cloud-storage/call/fixtures/call_repo.sql` | Adds nullable `diarized_speaker_id` TEXT column to `call_transcripts` and `call_record_transcripts` tables; updates test fixture data to include sample diarized speaker identifiers. |
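
To make the agent-side flow in the summary more concrete, here is a minimal Python sketch of accumulating per-word speaker indices over a turn and resolving the dominant one to a stable identifier. It is not the code from `transcriber.py`: the `TurnSpeakerTracker` class, the `uuid5` derivation from a session ID, and the input shape (integer speaker indices per final transcript) are assumptions made for illustration. Only the ideas of per-turn counts, a dominant speaker, and a stable UUID come from the summary above.

```python
import uuid
from collections import Counter
from typing import Iterable, Optional


class TurnSpeakerTracker:
    """Hypothetical helper: tracks diarized speakers within one user turn."""

    def __init__(self, session_id: str) -> None:
        # Assumption: IDs are derived deterministically from the session and the
        # provider's speaker index, so one speaker always maps to the same UUID.
        self._namespace = uuid.uuid5(uuid.NAMESPACE_URL, session_id)
        self._pending_speakers: Counter = Counter()

    def add_final_transcript(self, word_speakers: Iterable[int]) -> None:
        # One count per word, keyed by the speaker index attached to that word.
        self._pending_speakers.update(word_speakers)

    def resolve_diarized_speaker_id(self, speaker: int) -> str:
        return str(uuid.uuid5(self._namespace, f"speaker-{speaker}"))

    def complete_turn(self) -> Optional[str]:
        # Pick the speaker with the most words in this turn, if any were seen.
        diarized_speaker_id = None
        if self._pending_speakers:
            dominant, _ = self._pending_speakers.most_common(1)[0]
            diarized_speaker_id = self.resolve_diarized_speaker_id(dominant)
        # Reset so counts never carry over into the next turn.
        self._pending_speakers.clear()
        return diarized_speaker_id


tracker = TurnSpeakerTracker("call-123")
tracker.add_final_transcript([0, 0, 1])  # two words from speaker 0, one from speaker 1
tracker.add_final_transcript([1, 0])
print(tracker.complete_turn())           # UUID string for speaker 0
```

Deriving the ID with `uuid5` keeps it reproducible for a given session and speaker index; a per-session dictionary of random UUIDs would be an equally plausible design.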

Possibly related PRs

feat(calls): move transcript to agent #2377: Directly related—both PRs touch the transcription agent's outbound transcript flow and the call service's server-side ingestion, routing, and storage of transcript fields.

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to verify relevance to the changeset. | Add a pull request description explaining the diarization feature, its purpose, and how the changes across the Python and Rust codebases work together. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat(call): speaker diarization' follows conventional commits format with the feat prefix, includes a scope (call), and is 31 characters, well under the 72-character limit. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

agents/transcription/transcriber.py (1)
128-142: ⚠️ Potential issue | 🟡 Minor

Speaker counts leak across turns when content is falsy.

`self._pending_speakers.clear()` sits inside the `if content:` branch. If a turn completes with empty or `None` `text_content` (punctuation-only STT finals, or any edge case where the framework calls `on_user_turn_completed` without emitting content), the accumulated per-word speaker counts from that turn persist and skew the dominant-speaker computation for the next turn.

This contradicts the invariant documented at lines 70–71 ("accumulated across final transcripts inside a single user turn and cleared on turn completion").
🛡️ Proposed fix: clear unconditionally on turn completion
```diff
     async def on_user_turn_completed(
         self, chat_ctx: llm.ChatContext, new_message: llm.ChatMessage
     ):
         content = new_message.text_content
         if content:
             ...
             diarized_speaker_id = None
             if self._pending_speakers:
                 dominant_speaker, _ = self._pending_speakers.most_common(1)[0]
                 diarized_speaker_id = self._resolve_diarized_speaker_id(dominant_speaker)
-            self._pending_speakers.clear()
-
             segment = {
                 ...
             }
             ...
+        self._pending_speakers.clear()
         raise StopResponse()
```
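
The effect of the `clear()` placement can be reproduced without the agent framework. The sketch below is a simplified stand-in, not the real `on_user_turn_completed`: `complete_turn`, `run`, the `clear_unconditionally` flag, and the integer speaker indices are hypothetical, and the function returns the dominant speaker index directly instead of resolving a UUID.

```python
from collections import Counter
from typing import Optional


def complete_turn(pending: Counter, content: Optional[str],
                  clear_unconditionally: bool) -> Optional[int]:
    """Return the dominant speaker for a turn under either clear() placement."""
    dominant = None
    if content:
        if pending:
            dominant, _ = pending.most_common(1)[0]
        if not clear_unconditionally:
            pending.clear()  # original placement: only reached when content is truthy
    if clear_unconditionally:
        pending.clear()  # proposed placement: runs on every turn completion
    return dominant


def run(clear_unconditionally: bool) -> Optional[int]:
    pending: Counter = Counter()
    # Turn 1: three words from speaker 0, but the turn completes with empty content.
    pending.update([0, 0, 0])
    complete_turn(pending, "", clear_unconditionally)
    # Turn 2: two words from speaker 1, with real content.
    pending.update([1, 1])
    return complete_turn(pending, "hello there", clear_unconditionally)
```

With the original placement, `run(False)` attributes the second turn to speaker 0, whose words all belong to the first turn. With the unconditional clear, `run(True)` yields speaker 1.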
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/transcription/transcriber.py` around lines 128 - 142, The per-turn
speaker counts in self._pending_speakers are only being cleared when content is
truthy, causing counts to leak into the next turn; always clear
self._pending_speakers at turn completion regardless of content (move or add the
self._pending_speakers.clear() so it executes unconditionally after you compute
diarized_speaker_id and build the segment), ensuring the logic that uses
_resolve_diarized_speaker_id and the segment construction (segmentId, speakerId,
diarizedSpeakerId, content, startedAt, endedAt, isFinal) remains unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8e2a4e80-de73-4eb6-a00f-39c1e68d40b6

📥 Commits

Reviewing files that changed from the base of the PR and between e3ba550 and c54f72d.

⛔ Files ignored due to path filters (6)

rust/cloud-storage/call/.sqlx/query-12c11b297513c688c8daf5bf541aa13c5f3bfa941859a5fdb9c7bad1654a0ec7.json is excluded by !**/.sqlx/**
rust/cloud-storage/call/.sqlx/query-2a43904e78a134b93cef3613fa69ed67bdd5fe2f059e97d3e74a11d478647c86.json is excluded by !**/.sqlx/**
rust/cloud-storage/call/.sqlx/query-669b1e8428a2dff7724bb47104efc739eb157f1b1dbe21cb607a0fc1f87ebe30.json is excluded by !**/.sqlx/**
rust/cloud-storage/call/.sqlx/query-9b0034065161dec2180a9714d46745b36e32ab06043cd636bbfab029d173bf72.json is excluded by !**/.sqlx/**
rust/cloud-storage/call/.sqlx/query-b564cb72641ccbd1fede2f0fac3167e192665f8dda8043f7a27c22a2eec49662.json is excluded by !**/.sqlx/**
rust/cloud-storage/call/.sqlx/query-b683cd2f7a93ff5eb1264823b9a4ce8e500abd7c2e170ce4efc4fa06123e82f6.json is excluded by !**/.sqlx/**

📒 Files selected for processing (7)

agents/transcription/transcriber.py
rust/cloud-storage/call/fixtures/call_repo.sql
rust/cloud-storage/call/src/domain/models.rs
rust/cloud-storage/call/src/inbound/toolset/read_call_record.rs
rust/cloud-storage/call/src/outbound/pg_call_repo.rs
rust/cloud-storage/call/src/outbound/pg_call_repo/test.rs
rust/cloud-storage/macro_db_client/migrations/20260424130000_call_transcripts_diarized_speaker_id.sql

github-actions · 2026-04-24T17:32:51Z

Preview: https://hutch-feat-call-diarization-27qysy.preview.macro.com/app (bb8b6eb)

feat(call): speaker diarization

c54f72d

whutchinson98 requested a review from a team as a code owner April 24, 2026 17:09

github-actions Bot assigned whutchinson98 Apr 24, 2026

github-actions Bot added the cloud-storage label Apr 24, 2026

coderabbitai Bot reviewed Apr 24, 2026

typegen

d072bd1

github-actions Bot added the web-app label Apr 24, 2026

whutchinson98 merged commit e68dd3c into main Apr 24, 2026
40 checks passed

whutchinson98 deleted the hutch/feat-call-diarization branch April 24, 2026 17:40

coderabbitai Bot mentioned this pull request Apr 29, 2026

feat(calls): custom speaker overrides #2939

Merged

whutchinson98 merged 2 commits into main from hutch/feat-call-diarization
