Add STT circuit breaker + degraded mode for Deepgram connection failures

When Deepgram (DG) hits its concurrent connection limit, or during any DG outage, backend-listen closes the client WebSocket with code 1011. The client immediately reconnects, spawning a new session that tries DG again. The result is a self-amplifying reconnect storm, which escalated the Mar 25 incident from a DG capacity issue into a 2.5-hour total outage. Related: #6030 (pusher circuit breaker).

### Current Behavior
- `connect_to_deepgram_with_backoff()` retries 3x with exponential backoff (~8-10s total) — if all fail, the exception propagates up and the session closes with code 1011 (`transcribe.py:1125`)
- Client reconnects on 1011, creating a new session → new DG connection attempt → rejected again → infinite loop
- Each failed cycle also creates a new pusher connection, producing zombie connections. The pusher circuit breaker (CB) stays CLOSED because `record_success()` clears its recorded failures
- No fallback to alternative STT providers when DG is unavailable
- Existing DG connections stay alive via `SafeDeepgramSocket` keepalive thread, so old connections are never released while new ones are rejected

### Expected Behavior
When DG connections fail, keep the client WebSocket open in a degraded transcription mode and retry DG server-side with backoff. This breaks the feedback loop that drives the client reconnect storm.
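
The server-side retry described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `backoff_delays`, `retry_until_connected`, and the `connect` callable are hypothetical names, not existing functions in `backend/`, and the base and cap values are placeholders. It uses "full jitter" (a uniform delay below an exponentially growing ceiling) so that many sessions on one pod do not retry DG in lockstep.

```python
import asyncio
import random


def backoff_delays(base=1.0, cap=30.0, rng=random.random):
    """Yield full-jitter delays: rng() * min(cap, base * 2**attempt).

    base and cap are placeholder values, not tuned for Deepgram.
    """
    attempt = 0
    while True:
        # Clamp the exponent so very long outages cannot overflow a float.
        ceiling = min(cap, base * (2 ** min(attempt, 16)))
        yield rng() * ceiling
        attempt += 1


async def retry_until_connected(connect, delays, sleep=asyncio.sleep, max_attempts=None):
    """Await `connect()` until it succeeds, sleeping a jittered delay after each failure.

    `connect` is a hypothetical async callable that raises on failure.
    Returns (connection, attempts). The client WebSocket is not touched here,
    so the session stays open while this loop runs.
    """
    attempts = 0
    for delay in delays:
        attempts += 1
        try:
            return await connect(), attempts
        except Exception:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            await sleep(delay)
```

In a real handler this loop would presumably run as a background task per session, with the DG circuit breaker consulted before each attempt.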

### Affected Areas
| File | Line | Description |
|------|------|-------------|
| `backend/utils/stt/streaming.py` | 412-440 | `connect_to_deepgram_with_backoff()` — raises on exhaustion, no CB |
| `backend/routers/transcribe.py` | 1123-1127 | Exception handler closes session with 1011 on DG init failure |
| `backend/routers/transcribe.py` | 710-729 | `_create_conversation_fallback()` — SYNC `process_conversation()` blocks event loop |
| `backend/utils/pusher.py` | 35-103 | Existing `PusherCircuitBreaker` — pattern to follow for DG |
| `backend/utils/stt/safe_socket.py` | 75-84 | Keepalive thread holds DG connections alive indefinitely |

### Solution
1. **DG circuit breaker** (pod-level singleton, same pattern as `PusherCircuitBreaker`): track DG connection failures across sessions. When failure rate exceeds threshold, trip OPEN and fast-fail new DG attempts instead of burning 8-10s on retries that will fail.
2. **Degraded transcription mode**: when the DG CB is OPEN, keep the client WS alive, send a `service_status: stt_degraded` event to the client, buffer audio, and retry DG server-side with jittered backoff, capping the number of concurrent probes while the CB is HALF_OPEN.
3. **Don't close with 1011 on DG failure**: remove the `websocket.close(code=1011)` path for STT init errors. Instead, enter degraded mode and keep the session open.
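
A minimal sketch of the breaker in item 1, to make the state transitions concrete. This is not the existing `PusherCircuitBreaker` API: the class body, method names, thresholds, and cooldown are assumptions for illustration. It also simplifies "failure rate" to a count of consecutive failures, and it allows a single probe at a time in HALF_OPEN as one way to cap probe concurrency.

```python
import threading
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class DeepgramCircuitBreaker:
    """Hypothetical sketch: consecutive-failure breaker with a single HALF_OPEN probe."""

    def __init__(self, failure_threshold=5, cooldown_s=30.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._threshold = failure_threshold  # placeholder value
        self._cooldown = cooldown_s          # placeholder value
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self._probe_in_flight = False
        self.state = State.CLOSED

    def allow_request(self):
        """Return True if a DG connection attempt may proceed right now."""
        with self._lock:
            if self.state is State.CLOSED:
                return True
            if self.state is State.OPEN:
                if self._clock() - self._opened_at < self._cooldown:
                    return False  # fast-fail instead of spending seconds on retries
                self.state = State.HALF_OPEN
                self._probe_in_flight = False
            # HALF_OPEN: let exactly one probe through at a time.
            if self._probe_in_flight:
                return False
            self._probe_in_flight = True
            return True

    def record_success(self):
        with self._lock:
            self.state = State.CLOSED
            self._failures = 0
            self._probe_in_flight = False

    def record_failure(self):
        with self._lock:
            self._probe_in_flight = False
            if self.state is State.OPEN:
                return  # late failure from an older attempt; keep the cooldown as-is
            if self.state is State.HALF_OPEN:
                self._trip()
                return
            self._failures += 1
            if self._failures >= self._threshold:
                self._trip()

    def _trip(self):
        self.state = State.OPEN
        self._opened_at = self._clock()


# Pod-level singleton shared by all sessions in the process.
dg_circuit_breaker = DeepgramCircuitBreaker()
```

A caller would check `dg_circuit_breaker.allow_request()` before connecting, enter degraded mode when it returns `False`, and report the outcome of each attempt with `record_success()` or `record_failure()`.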

### Files to Modify
- `backend/utils/stt/streaming.py` — add `DeepgramCircuitBreaker` singleton, check before connect attempts
- `backend/routers/transcribe.py` — replace 1011 close with degraded mode entry on DG failure, add server-side DG retry loop
- `backend/utils/stt/safe_socket.py` — release DG connection on CB trip (call `finish()` to free connection slots)
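
The degraded mode entry in `transcribe.py` needs somewhere to hold audio while DG is unavailable. One possible shape is a bounded buffer that drops the oldest chunks first, so a long outage cannot grow memory without limit. The class name, the byte limit, and the drop-oldest policy are all assumptions for illustration, not part of the current code.

```python
from collections import deque


class DegradedAudioBuffer:
    """Hypothetical bounded buffer for audio received while STT is degraded."""

    def __init__(self, max_bytes=2_000_000):  # placeholder limit
        self._chunks = deque()
        self._size = 0
        self._max = max_bytes
        self.dropped_bytes = 0  # exposed so the drop volume can be logged or metered

    def append(self, chunk: bytes):
        """Store a chunk, evicting the oldest audio once the byte limit is exceeded."""
        self._chunks.append(chunk)
        self._size += len(chunk)
        while self._size > self._max and self._chunks:
            old = self._chunks.popleft()
            self._size -= len(old)
            self.dropped_bytes += len(old)

    def drain(self):
        """Return all buffered chunks in arrival order and reset the buffer."""
        chunks = list(self._chunks)
        self._chunks.clear()
        self._size = 0
        return chunks
```

Once a DG connection is re-established, the session could send the result of `drain()` before resuming live audio. Whether replaying stale audio is desirable at all is a product decision this sketch does not settle.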

### Impact
Breaks the self-amplifying feedback loop that turned a DG capacity limit into a full outage. Client sessions stay alive during DG degradation, eliminating the reconnect storm. No impact on normal operation — CB stays CLOSED when DG is healthy.

---
_by AI for @beastoin_


Add STT circuit breaker + degraded mode for Deepgram connection failures #6052
