feat(call): speaker diarization #2846

Merged
whutchinson98 merged 2 commits into main from hutch/feat-call-diarization on Apr 24, 2026

Conversation

@whutchinson98
Member

No description provided.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 24, 2026

Warning

Rate limit exceeded

@whutchinson98 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 38 minutes and 19 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 38 minutes and 19 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bf251174-6a01-44bc-ab79-fed0a430a23a

📥 Commits

Reviewing files that changed from the base of the PR and between c54f72d and d072bd1.

⛔ Files ignored due to path filters (8)
  • js/app/packages/service-clients/service-cognition/generated/tools/schemas.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-cognition/generated/tools/types.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/callRecordTranscriptSegment.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/callRecordTranscriptSegmentDiarizedSpeakerId.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/index.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/transcriptSegmentRequest.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/schemas/transcriptSegmentRequestDiarizedSpeakerId.ts is excluded by !**/generated/**
  • js/app/packages/service-clients/service-storage/generated/zod.ts is excluded by !**/generated/**
📒 Files selected for processing (1)
  • js/app/packages/service-clients/service-storage/openapi.json
📝 Walkthrough


This PR adds speaker diarization support to the transcription system. The Python agent now requests Deepgram diarization, computes dominant speakers per turn, and generates stable identifiers. The Rust backend extends data models and database schemas to persist and retrieve these speaker identifiers throughout the transcript lifecycle.
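
As a rough illustration of the dominant-speaker step described above, the sketch below tallies per-word speaker indices for one turn and maps the most frequent index to a stable identifier. The names `dominant_speaker`, `resolve_diarized_speaker_id`, and `SESSION_NAMESPACE`, and the use of `uuid.uuid5`, are assumptions made for illustration; the PR's own helper may derive its identifiers differently.

```python
import uuid
from collections import Counter
from typing import Iterable, Optional

# Hypothetical namespace; a real agent would likely scope this per call or session.
SESSION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example-call-session")


def dominant_speaker(word_speakers: Iterable[int]) -> Optional[int]:
    """Return the speaker index that labels the most words, or None if empty."""
    counts = Counter(word_speakers)
    if not counts:
        return None
    speaker, _ = counts.most_common(1)[0]
    return speaker


def resolve_diarized_speaker_id(speaker: int) -> str:
    """Map a speaker index to a deterministic UUID string (assumed scheme)."""
    return str(uuid.uuid5(SESSION_NAMESPACE, f"speaker-{speaker}"))


# Per-word speaker indices for one turn: speaker 0 labels three words, speaker 1 one.
turn = [0, 0, 1, 0]
speaker = dominant_speaker(turn)
diarized_speaker_id = (
    resolve_diarized_speaker_id(speaker) if speaker is not None else None
)
```

A deterministic mapping such as `uuid5` keeps the same speaker index tied to the same identifier across turns, which is one plausible reading of "stable identifiers" here.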

Changes

| Cohort / File(s) | Summary |
|---|---|
| Python Agent Diarization: `agents/transcription/transcriber.py` | Requests Deepgram diarization, extracts per-word speaker identifiers from FINAL_TRANSCRIPT events, accumulates speaker counts per turn, resolves dominant speaker to stable UUID, and includes diarizedSpeakerId in segment payloads. |
| Rust Data Models: `rust/cloud-storage/call/src/domain/models.rs`, `rust/cloud-storage/call/src/inbound/toolset/read_call_record.rs` | Adds optional diarized_speaker_id field to TranscriptSegmentRequest, CallRecordTranscriptSegment, and TranscriptSegment structs with appropriate serde attributes for request/response handling. |
| Rust Persistence Layer: `rust/cloud-storage/call/src/outbound/pg_call_repo.rs`, `rust/cloud-storage/call/src/outbound/pg_call_repo/test.rs` | Extends SQL queries to insert, copy, and retrieve diarized_speaker_id from transcript tables; updates repository mapping logic and test assertions to verify persistence across active and archived records. |
| Database Schema & Fixtures: `rust/cloud-storage/macro_db_client/migrations/20260424130000_call_transcripts_diarized_speaker_id.sql`, `rust/cloud-storage/call/fixtures/call_repo.sql` | Adds nullable diarized_speaker_id TEXT column to call_transcripts and call_record_transcripts tables; updates test fixture data to include sample diarized speaker identifiers. |
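
To make the persistence change concrete, here is a minimal sketch of what a nullable column implies for existing and archived rows. It uses Python's built-in `sqlite3` as a stand-in for Postgres, and the table layouts, the `ALTER TABLE` statements, and the sample values are all assumptions; the actual migration and repository queries are not shown in this conversation.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; table layouts are simplified assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE call_transcripts (segment_id TEXT PRIMARY KEY, content TEXT NOT NULL);
    CREATE TABLE call_record_transcripts (segment_id TEXT PRIMARY KEY, content TEXT NOT NULL);
    INSERT INTO call_transcripts VALUES ('seg-old', 'recorded before the migration');

    -- Assumed shape of the migration: a nullable TEXT column on both tables.
    ALTER TABLE call_transcripts ADD COLUMN diarized_speaker_id TEXT;
    ALTER TABLE call_record_transcripts ADD COLUMN diarized_speaker_id TEXT;
    """
)

# New segments carry the identifier; the pre-migration row keeps NULL.
conn.execute(
    "INSERT INTO call_transcripts (segment_id, content, diarized_speaker_id) "
    "VALUES (?, ?, ?)",
    ("seg-new", "recorded after the migration", "speaker-uuid-placeholder"),
)

# Archiving copies the column along with the rest of the segment.
conn.execute(
    """
    INSERT INTO call_record_transcripts (segment_id, content, diarized_speaker_id)
    SELECT segment_id, content, diarized_speaker_id FROM call_transcripts
    """
)

rows = conn.execute(
    "SELECT segment_id, diarized_speaker_id "
    "FROM call_record_transcripts ORDER BY segment_id"
).fetchall()
```

Because the column is nullable, rows written before the migration read back as `None`, which lines up with the optional `diarized_speaker_id` field on the Rust structs.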

Possibly related PRs

  • feat(calls): move transcript to agent #2377: Directly related—both PRs touch the transcription agent's outbound transcript flow and the call service's server-side ingestion, routing, and storage of transcript fields.
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to verify relevance to the changeset. | Add a pull request description explaining the diarization feature, its purpose, and how the changes across the Python and Rust codebases work together. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat(call): speaker diarization' follows conventional commits format with the feat prefix, includes a scope (call), and is 31 characters, well under the 72-character limit. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agents/transcription/transcriber.py (1)

128-142: ⚠️ Potential issue | 🟡 Minor

Speaker counts leak across turns when content is falsy.

self._pending_speakers.clear() sits inside the if content: branch. If a turn completes with empty/None text_content (punctuation-only STT finals, or any edge case where the framework calls on_user_turn_completed without emitting content), the accumulated per-word speaker counts from that turn persist and will skew the dominant-speaker computation for the next turn.

This contradicts the invariant documented at lines 70–71 ("accumulated across final transcripts inside a single user turn and cleared on turn completion").
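
The leak can be reproduced with a small stand-alone model. The `TurnTracker` class below is a hypothetical toy, not the transcriber itself; it only mirrors the two placements of `clear()` under discussion and assumes speaker indices arrive as plain integers.

```python
from collections import Counter
from typing import List, Optional


class TurnTracker:
    """Toy stand-in for the transcriber's per-turn speaker bookkeeping."""

    def __init__(self, clear_unconditionally: bool) -> None:
        self.pending_speakers: Counter = Counter()
        self.clear_unconditionally = clear_unconditionally

    def on_final_transcript(self, word_speakers: List[int]) -> None:
        # Accumulate per-word speaker indices for the current turn.
        self.pending_speakers.update(word_speakers)

    def on_turn_completed(self, content: Optional[str]) -> Optional[int]:
        dominant = None
        if content:
            if self.pending_speakers:
                dominant, _ = self.pending_speakers.most_common(1)[0]
            if not self.clear_unconditionally:
                # Original placement: cleared only when content is truthy.
                self.pending_speakers.clear()
        if self.clear_unconditionally:
            # Proposed placement: cleared on every turn completion.
            self.pending_speakers.clear()
        return dominant


def run(tracker: TurnTracker) -> Optional[int]:
    # Turn 1: speaker 0 dominates, but the turn completes with empty content.
    tracker.on_final_transcript([0, 0, 0])
    tracker.on_turn_completed("")
    # Turn 2: only speaker 1 talks.
    tracker.on_final_transcript([1, 1])
    return tracker.on_turn_completed("hello there")


leaky = run(TurnTracker(clear_unconditionally=False))
fixed = run(TurnTracker(clear_unconditionally=True))
```

In this model `leaky` is `0`, because the three stale counts from the empty turn outvote speaker 1, while `fixed` is `1`.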

🛡️ Proposed fix: clear unconditionally on turn completion
     async def on_user_turn_completed(
         self, chat_ctx: llm.ChatContext, new_message: llm.ChatMessage
     ):
         content = new_message.text_content
         if content:
             ...
             diarized_speaker_id = None
             if self._pending_speakers:
                 dominant_speaker, _ = self._pending_speakers.most_common(1)[0]
                 diarized_speaker_id = self._resolve_diarized_speaker_id(dominant_speaker)
-            self._pending_speakers.clear()
-
             segment = {
                 ...
             }
             ...
+        self._pending_speakers.clear()
         raise StopResponse()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/transcription/transcriber.py` around lines 128 - 142, The per-turn
speaker counts in self._pending_speakers are only being cleared when content is
truthy, causing counts to leak into the next turn; always clear
self._pending_speakers at turn completion regardless of content (move or add the
self._pending_speakers.clear() so it executes unconditionally after you compute
diarized_speaker_id and build the segment), ensuring the logic that uses
_resolve_diarized_speaker_id and the segment construction (segmentId, speakerId,
diarizedSpeakerId, content, startedAt, endedAt, isFinal) remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@agents/transcription/transcriber.py`:
- Around line 128-142: The per-turn speaker counts in self._pending_speakers are
only being cleared when content is truthy, causing counts to leak into the next
turn; always clear self._pending_speakers at turn completion regardless of
content (move or add the self._pending_speakers.clear() so it executes
unconditionally after you compute diarized_speaker_id and build the segment),
ensuring the logic that uses _resolve_diarized_speaker_id and the segment
construction (segmentId, speakerId, diarizedSpeakerId, content, startedAt,
endedAt, isFinal) remains unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8e2a4e80-de73-4eb6-a00f-39c1e68d40b6

📥 Commits

Reviewing files that changed from the base of the PR and between e3ba550 and c54f72d.

⛔ Files ignored due to path filters (6)
  • rust/cloud-storage/call/.sqlx/query-12c11b297513c688c8daf5bf541aa13c5f3bfa941859a5fdb9c7bad1654a0ec7.json is excluded by !**/.sqlx/**
  • rust/cloud-storage/call/.sqlx/query-2a43904e78a134b93cef3613fa69ed67bdd5fe2f059e97d3e74a11d478647c86.json is excluded by !**/.sqlx/**
  • rust/cloud-storage/call/.sqlx/query-669b1e8428a2dff7724bb47104efc739eb157f1b1dbe21cb607a0fc1f87ebe30.json is excluded by !**/.sqlx/**
  • rust/cloud-storage/call/.sqlx/query-9b0034065161dec2180a9714d46745b36e32ab06043cd636bbfab029d173bf72.json is excluded by !**/.sqlx/**
  • rust/cloud-storage/call/.sqlx/query-b564cb72641ccbd1fede2f0fac3167e192665f8dda8043f7a27c22a2eec49662.json is excluded by !**/.sqlx/**
  • rust/cloud-storage/call/.sqlx/query-b683cd2f7a93ff5eb1264823b9a4ce8e500abd7c2e170ce4efc4fa06123e82f6.json is excluded by !**/.sqlx/**
📒 Files selected for processing (7)
  • agents/transcription/transcriber.py
  • rust/cloud-storage/call/fixtures/call_repo.sql
  • rust/cloud-storage/call/src/domain/models.rs
  • rust/cloud-storage/call/src/inbound/toolset/read_call_record.rs
  • rust/cloud-storage/call/src/outbound/pg_call_repo.rs
  • rust/cloud-storage/call/src/outbound/pg_call_repo/test.rs
  • rust/cloud-storage/macro_db_client/migrations/20260424130000_call_transcripts_diarized_speaker_id.sql

@github-actions

@whutchinson98 merged commit e68dd3c into main on Apr 24, 2026
40 checks passed
@whutchinson98 deleted the hutch/feat-call-diarization branch on April 24, 2026 at 17:40