feat(call): speaker diarization #2846
📝 Walkthrough
This PR adds speaker diarization support to the transcription system. The Python agent now requests Deepgram diarization, computes dominant speakers per turn, and generates stable identifiers. The Rust backend extends data models and database schemas to persist and retrieve these speaker identifiers throughout the transcript lifecycle.
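To make "dominant speaker per turn" concrete, here is a minimal sketch, assuming each recognized word carries an integer speaker label (or `None` when unlabelled). The function name `dominant_speaker` and the sample data are hypothetical and are not taken from the PR; the `Counter.most_common(1)` step is the same call that appears in the review's proposed fix.

```python
from collections import Counter
from typing import Iterable, Optional


def dominant_speaker(word_speakers: Iterable[Optional[int]]) -> Optional[int]:
    """Return the speaker label attached to the most words, or None if no word is labelled."""
    # Count only words that actually carry a speaker label.
    counts = Counter(s for s in word_speakers if s is not None)
    if not counts:
        return None
    # most_common(1) yields [(label, count)] for the most frequent label.
    speaker, _ = counts.most_common(1)[0]
    return speaker


# Hypothetical per-word labels for one turn: speaker 0 has three words, speaker 1 has two.
turn = [0, 0, 1, 0, None, 1]
print(dominant_speaker(turn))  # 0
```

On a tie, `Counter.most_common` returns the label that was encountered first, so the result depends on word order; whether the PR handles ties differently is not stated here.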
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (3 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agents/transcription/transcriber.py (1)
128-142: ⚠️ Potential issue | 🟡 Minor
Speaker counts leak across turns when `content` is falsy.

`self._pending_speakers.clear()` sits inside the `if content:` branch. If a turn completes with empty/`None` `text_content` (punctuation-only STT finals, or any edge case where the framework calls `on_user_turn_completed` without emitting content), the accumulated per-word speaker counts from that turn persist and will skew the dominant-speaker computation for the next turn.

This contradicts the invariant documented at lines 70–71 ("accumulated across final transcripts inside a single user turn and cleared on turn completion").
🛡️ Proposed fix: clear unconditionally on turn completion
     async def on_user_turn_completed(
         self, chat_ctx: llm.ChatContext, new_message: llm.ChatMessage
     ):
         content = new_message.text_content
         if content:
             ...
             diarized_speaker_id = None
             if self._pending_speakers:
                 dominant_speaker, _ = self._pending_speakers.most_common(1)[0]
                 diarized_speaker_id = self._resolve_diarized_speaker_id(dominant_speaker)
    -        self._pending_speakers.clear()
    -
             segment = { ... }
             ...
    +    self._pending_speakers.clear()
         raise StopResponse()

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/transcription/transcriber.py` around lines 128 - 142, The per-turn speaker counts in self._pending_speakers are only being cleared when content is truthy, causing counts to leak into the next turn; always clear self._pending_speakers at turn completion regardless of content (move or add the self._pending_speakers.clear() so it executes unconditionally after you compute diarized_speaker_id and build the segment), ensuring the logic that uses _resolve_diarized_speaker_id and the segment construction (segmentId, speakerId, diarizedSpeakerId, content, startedAt, endedAt, isFinal) remains unchanged.
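The leak and the proposed fix above can be reproduced in isolation. The sketch below is a toy stand-in, not the transcriber itself: `TurnTracker`, `on_final_transcript`, and the `clear_unconditionally` flag are hypothetical names introduced only to contrast the two placements of `clear()`.

```python
from collections import Counter
from typing import List, Optional


class TurnTracker:
    """Toy model of per-turn speaker bookkeeping (hypothetical, for illustration only)."""

    def __init__(self, clear_unconditionally: bool) -> None:
        self._pending_speakers: Counter = Counter()
        self._clear_unconditionally = clear_unconditionally

    def on_final_transcript(self, word_speakers: List[int]) -> None:
        # Accumulate per-word speaker labels for the current turn.
        self._pending_speakers.update(word_speakers)

    def on_turn_completed(self, content: Optional[str]) -> Optional[int]:
        dominant = None
        if content:
            if self._pending_speakers:
                dominant, _ = self._pending_speakers.most_common(1)[0]
            if not self._clear_unconditionally:
                # Original placement: cleared only when content is truthy.
                self._pending_speakers.clear()
        if self._clear_unconditionally:
            # Proposed placement: cleared on every turn completion.
            self._pending_speakers.clear()
        return dominant


def run(tracker: TurnTracker) -> Optional[int]:
    tracker.on_final_transcript([1, 1, 1])  # turn 1: speaker 1 talks, but no text content
    tracker.on_turn_completed("")
    tracker.on_final_transcript([0, 0])     # turn 2: only speaker 0 talks
    return tracker.on_turn_completed("hello there")


print(run(TurnTracker(clear_unconditionally=False)))  # 1: stale counts from turn 1 win
print(run(TurnTracker(clear_unconditionally=True)))   # 0
```

With the conditional placement, the three stale words from the empty turn outvote the two words of the next turn, so the second turn is attributed to the wrong speaker.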
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@agents/transcription/transcriber.py`:
- Around line 128-142: The per-turn speaker counts in self._pending_speakers are
only being cleared when content is truthy, causing counts to leak into the next
turn; always clear self._pending_speakers at turn completion regardless of
content (move or add the self._pending_speakers.clear() so it executes
unconditionally after you compute diarized_speaker_id and build the segment),
ensuring the logic that uses _resolve_diarized_speaker_id and the segment
construction (segmentId, speakerId, diarizedSpeakerId, content, startedAt,
endedAt, isFinal) remains unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8e2a4e80-de73-4eb6-a00f-39c1e68d40b6
⛔ Files ignored due to path filters (6)
- rust/cloud-storage/call/.sqlx/query-12c11b297513c688c8daf5bf541aa13c5f3bfa941859a5fdb9c7bad1654a0ec7.json is excluded by `!**/.sqlx/**`
- rust/cloud-storage/call/.sqlx/query-2a43904e78a134b93cef3613fa69ed67bdd5fe2f059e97d3e74a11d478647c86.json is excluded by `!**/.sqlx/**`
- rust/cloud-storage/call/.sqlx/query-669b1e8428a2dff7724bb47104efc739eb157f1b1dbe21cb607a0fc1f87ebe30.json is excluded by `!**/.sqlx/**`
- rust/cloud-storage/call/.sqlx/query-9b0034065161dec2180a9714d46745b36e32ab06043cd636bbfab029d173bf72.json is excluded by `!**/.sqlx/**`
- rust/cloud-storage/call/.sqlx/query-b564cb72641ccbd1fede2f0fac3167e192665f8dda8043f7a27c22a2eec49662.json is excluded by `!**/.sqlx/**`
- rust/cloud-storage/call/.sqlx/query-b683cd2f7a93ff5eb1264823b9a4ce8e500abd7c2e170ce4efc4fa06123e82f6.json is excluded by `!**/.sqlx/**`
📒 Files selected for processing (7)
- agents/transcription/transcriber.py
- rust/cloud-storage/call/fixtures/call_repo.sql
- rust/cloud-storage/call/src/domain/models.rs
- rust/cloud-storage/call/src/inbound/toolset/read_call_record.rs
- rust/cloud-storage/call/src/outbound/pg_call_repo.rs
- rust/cloud-storage/call/src/outbound/pg_call_repo/test.rs
- rust/cloud-storage/macro_db_client/migrations/20260424130000_call_transcripts_diarized_speaker_id.sql
No description provided.