Desktop: Batch transcription 413 on long speech chunks (3.2MB, 50s+) — 5.5K events #6195

## Problem

The desktop batch transcription fails with HTTP 413 ("Failed to buffer the request body: length limit exceeded") when the VAD gate accumulates long speech chunks:

- **OMI-DESKTOP-10**: 413 Payload Too Large
- **5,574 events** across **70 users**
- Sentry breadcrumb: VADGate batch speech chunk complete **3,208,532 bytes (50.1s)** -> TranscriptionService batch transcribing -> 413 rejected

The VAD gate enforces no size limit, so it accumulated ~3.2MB of stereo PCM audio in a single chunk, and the backend (or its reverse proxy) rejects payloads above a threshold.

## Root Cause Analysis

Traced through `VADGateService.swift`, `TranscriptionService.swift`, and `AppState.swift`:

### 1. VAD gate has no maximum chunk size
`VADGateService.swift`: Speech audio is accumulated in `batchAudioBuffer` (line 262) until 2+ seconds of silence (hangover timeout, line 207: `batchHangoverMs = 2000`). There is **no maximum duration or size limit**. A user speaking continuously for 50+ seconds produces a single 3.2MB chunk.

### 2. Stereo format doubles payload size
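Audio format: stereo Int16 PCM at 16kHz, i.e. 16,000 frames/s x 4 bytes per frame (2 channels x 2 bytes) = 64 KB/s. For 50.1 seconds: `50.1 x 16000 x 4 bytes = 3,206,400 bytes (~3.2 MB)`, in line with the 3,208,532 bytes reported in the Sentry breadcrumb.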
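The arithmetic above can be reproduced with a small helper. This is only a sketch: `pcmByteCount` is a hypothetical name, not a function from the codebase, and the defaults mirror the format described in this section.

```swift
/// Size in bytes of raw interleaved PCM for a given duration.
/// Hypothetical helper; defaults match the stereo Int16 16kHz format above.
func pcmByteCount(seconds: Double,
                  sampleRate: Int = 16_000,
                  channels: Int = 2,
                  bytesPerSample: Int = 2) -> Int {
    let bytesPerSecond = sampleRate * channels * bytesPerSample  // 64,000 for this format
    return Int((seconds * Double(bytesPerSecond)).rounded())
}

print(pcmByteCount(seconds: 50.1))              // 3206400 (the failing chunk)
print(pcmByteCount(seconds: 30))                // 1920000 (a 30 s stereo chunk)
print(pcmByteCount(seconds: 50.1, channels: 1)) // 1603200 (same duration in mono)
```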

### 3. Single HTTP POST with no chunking
`TranscriptionService.batchTranscribeFull()` (line 639-737) sends the entire buffer as a single `POST` request with `Content-Type: application/octet-stream` to `/v1/proxy/deepgram/v1/listen`. There is **no logic to split large audio into smaller chunks** before upload.

### 4. Backend body size limit
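The 413 indicates a body size limit at the backend or its reverse proxy (nginx, GCP Cloud Load Balancer, or Cloud Run). The exact limit has not been confirmed. The initial estimate is 1-5MB, and since a 3.2MB payload was rejected, the limit must be below roughly 3.2MB. Cloud Run accepts requests up to 32MB, but an nginx proxy (`client_max_body_size` defaults to 1MB) or application middleware may impose a tighter limit.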

### 5. No retry with smaller chunks
When a 413 is received, the client logs the error and throws `TranscriptionError.invalidResponse`. There is no fallback that splits the payload and retries, so the audio is lost.
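For illustration, a split-and-retry fallback of the kind that is missing could look like the sketch below. Everything here is an assumption rather than existing code: `UploadError`, `transcribeWithSplitRetry`, and the injected `upload` closure are hypothetical, and the closure is synchronous only to keep the example short (the real request would be async).

```swift
import Foundation

/// Hypothetical error type standing in for the client's real error handling.
enum UploadError: Error {
    case payloadTooLarge   // HTTP 413
}

/// Uploads `audio`; on a 413, splits it at a frame boundary and retries each half.
/// `upload` is a stand-in for the real request and returns the transcript text.
/// Errors other than 413 propagate to the caller unchanged.
func transcribeWithSplitRetry(
    _ audio: Data,
    bytesPerFrame: Int = 4,    // stereo Int16: 2 channels x 2 bytes
    minBytes: Int = 64_000,    // stop splitting once a half would be under ~1 s
    upload: (Data) throws -> String
) throws -> [String] {
    do {
        return [try upload(audio)]
    } catch UploadError.payloadTooLarge {
        // Give up rather than recurse forever if the server rejects tiny payloads too.
        guard audio.count / 2 >= minBytes else { throw UploadError.payloadTooLarge }
        // Round the midpoint down to a whole frame so neither half starts mid-sample.
        let mid = (audio.count / 2 / bytesPerFrame) * bytesPerFrame
        let first = Data(audio.prefix(mid))
        let second = Data(audio.dropFirst(mid))
        return try transcribeWithSplitRetry(first, bytesPerFrame: bytesPerFrame,
                                            minBytes: minBytes, upload: upload)
             + transcribeWithSplitRetry(second, bytesPerFrame: bytesPerFrame,
                                        minBytes: minBytes, upload: upload)
    }
}
```

A plain midpoint split can cut a word in half, so a production version would likely want a small overlap between the halves and some de-duplication when the transcripts are merged.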

## Proposed Fix

### Client-side (recommended primary fix)
1. **Add max chunk duration in VAD gate** — cap at 30 seconds (1.92MB stereo). When buffer exceeds this, emit the chunk as complete and start a new accumulation
2. **Add chunk splitting in TranscriptionService** — if audio exceeds a size threshold (e.g., 2MB), split into overlapping segments (with 1-2s overlap for context) and transcribe separately, then merge results
3. **Handle 413 gracefully** — on 413 response, split the payload in half and retry each half
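Items 1 and 2 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `VADGateService` or `TranscriptionService` code: `CappedChunkBuffer` and `splitPCM` are hypothetical names, and the sizes in the usage notes come from the figures in this issue (30 s = 1,920,000 bytes, 1 s of overlap = 64,000 bytes).

```swift
import Foundation

/// Item 1: an accumulation buffer that emits a chunk whenever the cap is reached.
struct CappedChunkBuffer {
    let maxBytes: Int
    private(set) var buffer: Data

    init(maxBytes: Int) {
        precondition(maxBytes > 0, "cap must be positive")
        self.maxBytes = maxBytes
        self.buffer = Data()
    }

    /// Appends audio and returns any chunks completed by hitting the cap.
    mutating func append(_ audio: Data) -> [Data] {
        buffer.append(audio)
        var completed: [Data] = []
        while buffer.count >= maxBytes {
            completed.append(Data(buffer.prefix(maxBytes)))
            buffer = Data(buffer.dropFirst(maxBytes))   // keep the remainder accumulating
        }
        return completed
    }
}

/// Item 2: split interleaved PCM into segments of at most `maxBytes`, where each
/// segment after the first repeats the last `overlapBytes` of the previous one.
func splitPCM(_ audio: Data, maxBytes: Int, overlapBytes: Int,
              bytesPerFrame: Int = 4) -> [Data] {
    // Align both sizes down to whole frames so segments never start mid-sample.
    let maxLen = (maxBytes / bytesPerFrame) * bytesPerFrame
    let overlap = (overlapBytes / bytesPerFrame) * bytesPerFrame
    precondition(maxLen > overlap, "segment size must exceed the overlap")
    guard audio.count > maxLen else { return [audio] }

    var segments: [Data] = []
    var start = 0
    while true {
        let end = min(start + maxLen, audio.count)
        let range = (audio.startIndex + start)..<(audio.startIndex + end)
        segments.append(audio.subdata(in: range))
        if end == audio.count { break }
        start = end - overlap   // step back so consecutive segments share context
    }
    return segments
}
```

With the 3,208,532-byte chunk from the breadcrumb, `splitPCM(audio, maxBytes: 1_920_000, overlapBytes: 64_000)` yields two segments of 1,920,000 and 1,352,532 bytes. Merging the overlapping transcripts is not shown, and whether 1.92MB fits under the backend limit still depends on confirming that limit.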

### Backend-side (defense in depth)
4. **Increase body size limit** — if the proxy or middleware has a limit below 5MB, increase it to at least 10MB
5. **Add streaming upload support** — accept chunked transfer encoding for large audio payloads

## Key Files

- `desktop/Desktop/Sources/VADGateService.swift` — lines 554-684 (batch audio accumulation, no size limit)
- `desktop/Desktop/Sources/TranscriptionService.swift` — lines 639-737 (batchTranscribeFull, single POST)
- `desktop/Desktop/Sources/AppState.swift` — lines 1434-1438 (batchTranscribeChunk caller)

_by AI for @beastoin_

