
Offline sync transcription missing user preferences, vocabulary, language, and speaker identification #6172

@beastoin

Description

Problem

Offline-synced recordings skip user transcription preferences entirely. The sync path in `backend/routers/sync.py` calls `deepgram_prerecorded()` with hardcoded defaults, while the live streaming path in `backend/routers/transcribe.py` correctly fetches and applies user preferences, vocabulary, language, speaker identification, and translation.

Consolidates: #6140 (diarization + vocabulary), #5907 (speaker diarization), #5912 (custom vocabulary)

Code Comparison

Sync Path (sync.py:711)

```python
words, language = deepgram_prerecorded(url, speakers_count=3, attempts=0, return_language=True)
```

- No user preferences fetched
- No vocabulary/keywords passed
- No language parameter
- No model selection based on language
- No speaker identification

Realtime Path (transcribe.py:297-306, 1011-1035)

```python
transcription_prefs = get_user_transcription_preferences(uid)
vocabulary = list({"Omi"} | set(transcription_prefs.get('vocabulary', [])))
single_language_mode = transcription_prefs.get('single_language_mode', False)
```

Plus:

- Language-based model selection
- Speech profile support
- Speaker embedding extraction and person matching
- Translation service
- Text-based speaker detection

Feature Gap Table

| Feature | Realtime | Sync | Impact |
| --- | --- | --- | --- |
| Custom vocabulary/keywords | Yes | No | Domain terms transcribed incorrectly |
| User language preference | Yes | No | Ignores the user's language setting |
| Single language mode | Yes | No | Always multi-language; accuracy loss |
| Language-based model selection | Yes | No (always nova-3) | Chinese/Thai get the wrong model |
| Speaker identification (embeddings) | Yes | No | Cannot identify who is speaking |
| Speaker-to-person mapping | Yes | No | All segments have `person_id=None` |
| Text-based speaker detection | Yes | No | Missed "I am X" name detection |
| Translation service | Yes | No | No conversation translation |
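The missing speaker-to-person mapping could be recovered in post-processing by matching per-segment voice embeddings against the user's stored people embeddings. A minimal sketch, assuming embeddings are plain float vectors; `match_speaker_to_person` and its threshold are hypothetical, not existing backend helpers:

```python
from math import sqrt


def _cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def match_speaker_to_person(segment_embedding, people_embeddings, threshold=0.7):
    """Return the person_id whose stored voice embedding is closest to the
    segment's embedding, or None when nothing clears the threshold."""
    best_id, best_score = None, threshold
    for person_id, embedding in people_embeddings.items():
        score = _cosine(segment_embedding, embedding)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id
```

With this, sync-path segments could get a `person_id` instead of always `None`, at the cost of one extra pass over the transcript.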

Root Cause

`process_segment()` in `sync.py` never calls `get_user_transcription_preferences(uid)` and passes no user-specific parameters to `deepgram_prerecorded()`. The function signature of `deepgram_prerecorded()` also lacks a `keywords` parameter.

Key Files

  • `backend/routers/sync.py` — `process_segment()` at line 693
  • `backend/routers/transcribe.py` — `_stream_handler()` at line 219
  • `backend/utils/stt/pre_recorded.py` — `deepgram_prerecorded()` at line 109
  • `backend/database/users.py` — `get_user_transcription_preferences()` at line 995

Feasible Fixes (pre-recorded API compatible)

These features can be added to the pre-recorded API path:

  1. Fetch user preferences — call `get_user_transcription_preferences(uid)`
  2. Pass vocabulary/keywords — add `keywords` param to `deepgram_prerecorded()`, use `keyterm` for nova-3 / `keywords` for nova-2
  3. Pass language — respect user language preference, apply model selection via `get_deepgram_model_for_language()`
  4. Speaker identification — post-process: extract embeddings from Deepgram segments, match against user's people embeddings
  5. Translation — post-process: translate segments if user has language preference
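For fix 2, the model-dependent parameter name could be isolated in a small helper. A sketch, assuming (per the fix list above) that nova-3 takes `keyterm` and nova-2 takes `keywords`; `build_keyword_params` is a hypothetical name, not an existing function:

```python
def build_keyword_params(model: str, vocabulary: list) -> dict:
    """Map a user's custom vocabulary onto the query parameter the
    selected Deepgram model expects for pre-recorded transcription."""
    if not vocabulary:
        return {}
    if model.startswith("nova-3"):
        # nova-3 uses keyterm prompting
        return {"keyterm": vocabulary}
    # nova-2 and earlier use the keywords parameter
    return {"keywords": vocabulary}
```

`deepgram_prerecorded()` could call this once the `keywords` parameter from fix 2 is added to its signature.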

Not feasible (streaming API only): speech profiles/preseconds, VAD gating, multi-channel, onboarding mode.
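Fixes 1-3 could be combined into one preference-assembly step in `process_segment()`. A sketch under stated assumptions: the extended `deepgram_prerecorded()` signature does not exist yet, and the inline zh/th model mapping is a placeholder for the real `get_deepgram_model_for_language()` in `transcribe.py`:

```python
def build_sync_transcription_kwargs(prefs: dict) -> dict:
    """Assemble the user-specific arguments the sync path currently omits,
    mirroring the realtime path's use of transcription preferences."""
    language = prefs.get('language')  # None -> keep multi-language detection
    # Realtime always seeds the vocabulary with "Omi" (transcribe.py:298)
    vocabulary = sorted({"Omi"} | set(prefs.get('vocabulary', [])))
    # Placeholder mapping; the real helper is get_deepgram_model_for_language()
    model = "nova-2" if language in ("zh", "th") else "nova-3"
    return {
        "language": language,
        "model": model,
        "keywords": vocabulary,
        "single_language_mode": prefs.get('single_language_mode', False),
    }
```

The call site in `process_segment()` would then become something like `deepgram_prerecorded(url, speakers_count=3, attempts=0, return_language=True, **build_sync_transcription_kwargs(get_user_transcription_preferences(uid)))`, once `deepgram_prerecorded()` accepts those parameters.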

Ref


Filed by @kelvin-agent, 2026-03-30. Code analysis from backend at HEAD.

Metadata


Assignees

No one assigned

Labels

`bug` (Something isn't working) · `capture` (Layer: Audio recording, device pairing, BLE) · `maintainer` (Lane: High-risk, cross-system changes) · `p2` (Priority: Important, score 14-21)
