## Problem
Offline-synced recordings skip user transcription preferences entirely. The sync path in `backend/routers/sync.py` calls `deepgram_prerecorded()` with hardcoded defaults, while the live streaming path in `backend/routers/transcribe.py` correctly fetches user preferences and applies custom vocabulary, language settings, speaker identification, and translation.
Consolidates: #6140 (diarization + vocabulary), #5907 (speaker diarization), #5912 (custom vocabulary)
## Code Comparison

### Sync Path (sync.py:711)

```python
words, language = deepgram_prerecorded(url, speakers_count=3, attempts=0, return_language=True)
# No user preferences fetched
# No vocabulary/keywords passed
# No language parameter
# No model selection based on language
# No speaker identification
```
### Realtime Path (transcribe.py:297-306, 1011-1035)

```python
transcription_prefs = get_user_transcription_preferences(uid)
vocabulary = list({"Omi"} | set(transcription_prefs.get('vocabulary', [])))
single_language_mode = transcription_prefs.get('single_language_mode', False)
# + language-based model selection
# + speech profile support
# + speaker embedding extraction + person matching
# + translation service
# + text-based speaker detection
```
## Feature Gap Table

| Feature | Realtime | Sync | Impact |
|---|---|---|---|
| Custom vocabulary/keywords | YES | NO | Domain terms transcribed wrong |
| User language preference | YES | NO | Ignores user's language setting |
| Single language mode | YES | NO | Always multi-lang, accuracy loss |
| Language-based model selection | YES | NO (always nova-3) | Chinese/Thai get wrong model |
| Speaker identification (embeddings) | YES | NO | Cannot identify who is speaking |
| Speaker-to-person mapping | YES | NO | All segments have `person_id=None` |
| Text-based speaker detection | YES | NO | Missed "I am X" name detection |
| Translation service | YES | NO | No conversation translation |
## Root Cause
`process_segment()` in `sync.py` never calls `get_user_transcription_preferences(uid)` and passes no user-specific parameters to `deepgram_prerecorded()`. The function signature of `deepgram_prerecorded()` also lacks a `keywords` parameter.
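A minimal sketch of what the fix could look like, mirroring the realtime path's preference handling. The helper name `build_prerecorded_kwargs` is hypothetical, and it assumes `deepgram_prerecorded()` gains `language` and `keywords` parameters (it currently has neither):

```python
def build_prerecorded_kwargs(prefs: dict) -> dict:
    """Translate user transcription preferences into deepgram_prerecorded kwargs.

    Hypothetical helper: assumes deepgram_prerecorded() is extended with
    `language` and `keywords` parameters, which it does not have today.
    """
    # Mirror transcribe.py: always include "Omi", then the user's custom terms.
    vocabulary = sorted({"Omi"} | set(prefs.get("vocabulary", [])))
    kwargs = {
        "speakers_count": 3,
        "attempts": 0,
        "return_language": True,
        "keywords": vocabulary,
    }
    # Only pin a language when the user opted into single-language mode;
    # otherwise keep today's behavior of letting Deepgram detect it.
    if prefs.get("single_language_mode") and prefs.get("language"):
        kwargs["language"] = prefs["language"]
    return kwargs
```

`process_segment()` would then call `get_user_transcription_preferences(uid)` once and splat these kwargs into the existing `deepgram_prerecorded(url, ...)` call.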
## Key Files
- `backend/routers/sync.py` — `process_segment()` at line 693
- `backend/routers/transcribe.py` — `_stream_handler()` at line 219
- `backend/utils/stt/pre_recorded.py` — `deepgram_prerecorded()` at line 109
- `backend/database/users.py` — `get_user_transcription_preferences()` at line 995
## Feasible Fixes (pre-recorded API compatible)
These features can be added to the pre-recorded API path:
- Fetch user preferences — call `get_user_transcription_preferences(uid)`
- Pass vocabulary/keywords — add `keywords` param to `deepgram_prerecorded()`, use `keyterm` for nova-3 / `keywords` for nova-2
- Pass language — respect user language preference, apply model selection via `get_deepgram_model_for_language()`
- Speaker identification — post-process: extract embeddings from Deepgram segments, match against user's people embeddings
- Translation — post-process: translate segments if user has language preference
Not feasible (streaming API only): speech profiles/preseconds, VAD gating, multi-channel, onboarding mode.
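The vocabulary bullet above implies a model-dependent option name. A sketch of that mapping, assuming the parameter spellings match Deepgram's documented query options (`keyterm` for nova-3, `keywords` for nova-2 and earlier); verify against the SDK version in use:

```python
def keyword_params_for_model(model: str, terms: list[str]) -> dict:
    """Map custom vocabulary onto the right Deepgram option per model family.

    Assumption: nova-3 accepts `keyterm`, older nova-2 models accept
    `keywords`. Plain terms are passed through; Deepgram's "term:boost"
    intensifier syntax for `keywords` is not applied here.
    """
    if not terms:
        return {}
    if model.startswith("nova-3"):
        return {"keyterm": terms}
    return {"keywords": terms}
```

Inside `deepgram_prerecorded()`, these params would be merged into the request options after `get_deepgram_model_for_language()` picks the model.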
## Ref
Filed by @kelvin-agent, 2026-03-30. Code analysis from backend at HEAD.