feat(rag): improve multi-worker support and add search type options #29

Merged: larryro merged 2 commits into main from skip-local-kuzu (Dec 19, 2025)

@larryro (Collaborator) commented Dec 19, 2025

Summary

This PR improves the RAG service's multi-worker support and adds configurable search type options.

Changes

Multi-Worker Lock Conflict Fix

  • Add _patch_remote_kuzu_adapter() to override RemoteKuzuAdapter.__init__ and skip local Kuzu database creation that causes file lock conflicts when running multiple uvicorn workers
  • Increase RAG_WORKERS from 1 to 2 in Dockerfile now that the lock conflict issue is resolved
  • The patch initializes only the minimal attributes needed for the remote adapter without calling the parent's _initialize_connection()
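A minimal sketch of the patching idea described above, not the code from this PR. `KuzuAdapter`, the constructor arguments, and the `_patched` guard attribute are stand-ins invented for illustration; the real classes live in Cognee and their signatures may differ.

```python
class KuzuAdapter:
    """Stand-in for the parent adapter (hypothetical, not Cognee's class)."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._initialize_connection()

    def _initialize_connection(self) -> None:
        # In a real adapter this would open a local database file, which is
        # what several workers would contend over. Here it simply fails so
        # the sketch can show that the patched path never reaches it.
        raise RuntimeError("local database opened")


class RemoteKuzuAdapter(KuzuAdapter):
    """Stand-in for the remote adapter that inherits the local setup."""

    def __init__(self, api_url: str) -> None:
        super().__init__(db_path="/tmp/local.kuzu")
        self.api_url = api_url


def patch_remote_adapter() -> bool:
    """Replace RemoteKuzuAdapter.__init__; return True if a patch was applied."""
    if getattr(RemoteKuzuAdapter, "_patched", False):
        return False  # already patched, so repeated calls are harmless

    def patched_init(self, api_url: str) -> None:
        # Set only the attributes the remote path needs and deliberately
        # skip super().__init__(), and with it _initialize_connection().
        self.api_url = api_url
        self.db_path = None

    RemoteKuzuAdapter.__init__ = patched_init
    RemoteKuzuAdapter._patched = True
    return True
```

The guard attribute keeps the patch idempotent, which matters when each worker process imports the configuration module on startup.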

Search Type Support

  • Add SearchType enum with CHUNKS, GRAPH_COMPLETION, RAG_COMPLETION, SUMMARIES, GRAPH_SUMMARY_COMPLETION, and TEMPORAL options
  • Update search endpoint to accept search_type parameter (defaults to CHUNKS)
  • Map API search types to Cognee SearchType for proper query routing
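The enum and mapping could look roughly like the sketch below. The member names come from the list above; the string values, the `BACKEND_QUERY_TYPES` table, and `parse_search_type` are assumptions made for illustration, and the real code maps to Cognee's own `SearchType` rather than to plain strings.

```python
from enum import Enum
from typing import Optional


class SearchType(str, Enum):
    """API-level search types; the string values are assumed."""

    CHUNKS = "CHUNKS"
    GRAPH_COMPLETION = "GRAPH_COMPLETION"
    RAG_COMPLETION = "RAG_COMPLETION"
    SUMMARIES = "SUMMARIES"
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    TEMPORAL = "TEMPORAL"


# Hypothetical mapping table. An explicit dict means a newly added enum
# member without a backend counterpart is easy to detect.
BACKEND_QUERY_TYPES = {
    SearchType.CHUNKS: "CHUNKS",
    SearchType.GRAPH_COMPLETION: "GRAPH_COMPLETION",
    SearchType.RAG_COMPLETION: "RAG_COMPLETION",
    SearchType.SUMMARIES: "SUMMARIES",
    SearchType.GRAPH_SUMMARY_COMPLETION: "GRAPH_SUMMARY_COMPLETION",
    SearchType.TEMPORAL: "TEMPORAL",
}


def parse_search_type(raw: Optional[str]) -> SearchType:
    """Return the requested search type, defaulting to CHUNKS."""
    if raw is None:
        return SearchType.CHUNKS
    # Raises ValueError for values that are not enum members.
    return SearchType(raw.upper())
```

Checking that `set(BACKEND_QUERY_TYPES) == set(SearchType)` in a unit test is one way to catch an unmapped search type before it causes a runtime failure.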

Ingestion Timeout Handling

  • Add configurable ingestion_timeout_seconds setting (default: 3 hours)
  • Wrap cognee.add() and cognee.cognify() with asyncio.wait_for timeout
  • Improve logging for ingestion progress and search operations
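As a rough illustration of the timeout wrapping, the sketch below bounds a single awaited step with `asyncio.wait_for`. The helper name `run_with_timeout` and its parameters are hypothetical; in the service the awaited coroutines would be the `cognee.add()` and `cognee.cognify()` calls.

```python
import asyncio
from typing import Any, Awaitable


async def run_with_timeout(
    step_name: str, step: Awaitable[Any], timeout_seconds: float
) -> Any:
    """Await one ingestion step, failing with a labelled error on timeout."""
    try:
        # wait_for cancels the awaited step once the timeout elapses.
        return await asyncio.wait_for(step, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        raise TimeoutError(
            f"{step_name} exceeded {timeout_seconds} seconds"
        ) from None
```

Naming the step in the error makes it clear from the logs whether the add phase or the cognify phase ran out of time.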

Files Changed

  • services/rag/Dockerfile
  • services/rag/app/config.py
  • services/rag/app/models.py
  • services/rag/app/routers/search.py
  • services/rag/app/services/cognee/config.py
  • services/rag/app/services/cognee/service.py

Pull Request opened by Augment Code with guidance from the PR author

Summary by CodeRabbit

Release Notes

  • New Features

    • Multiple search strategies now available: comprehensive text chunks, knowledge graph-based reasoning, concise AI-generated answers, document summaries, and insights.
    • Configurable document ingestion timeout (default 3 hours).
  • Improvements

    • Enhanced stability for multi-worker deployments with improved timeout handling during document ingestion.

@larryro larryro merged commit c58b424 into main Dec 19, 2025
1 check was pending
@larryro larryro deleted the skip-local-kuzu branch December 19, 2025 16:42

coderabbitai Bot commented Dec 19, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request introduces multi-search-type support and timeout-based document ingestion protection to the RAG service. Changes include: increasing Docker default workers from 1 to 2, adding a configurable ingestion timeout setting, introducing a SearchType enum to support different query strategies (CHUNKS, GRAPH_COMPLETION, RAG_COMPLETION, SUMMARIES, INSIGHTS), implementing asyncio-based timeout wrapping around Cognee ingestion operations, patching RemoteKuzuAdapter to prevent file-lock conflicts in multi-worker setups, and plumbing search type mappings through the service layer.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring extra attention:

  • services/rag/app/services/cognee/config.py — The RemoteKuzuAdapter patching mechanism with the _tale_patched guard flag requires verification that the patch is applied correctly and idempotently before Cognee import
  • services/rag/app/services/cognee/service.py — The timeout handling logic, particularly the remaining-time calculation between add and cognify operations, needs validation to ensure timeouts propagate correctly and the 60-second floor is appropriate
  • services/rag/app/services/cognee/service.py — The mapping from API SearchType to Cognee query_type values should be cross-referenced against Cognee's expected input to prevent runtime failures
  • services/rag/app/routers/search.py — Verify that the search_type parameter is correctly threaded from the request through to the service layer and that all search type values are documented
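One way to read the remaining-time calculation flagged above is sketched below. The function name and the exact formula are assumptions; only the idea of a shared budget with a 60-second floor comes from the review notes.

```python
def remaining_timeout(
    total_seconds: float, elapsed_seconds: float, floor_seconds: float = 60.0
) -> float:
    """Return the time left for the next step, never less than the floor."""
    return max(total_seconds - elapsed_seconds, floor_seconds)
```

With a floor, the combined run can overshoot the configured total by up to `floor_seconds` when the first step uses nearly the whole budget, which is the trade-off worth checking during review.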

📜 Recent review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 211baa2 and 7c43d96.

📒 Files selected for processing (6)
  • services/rag/Dockerfile (1 hunk)
  • services/rag/app/config.py (1 hunk)
  • services/rag/app/models.py (1 hunk)
  • services/rag/app/routers/search.py (1 hunk)
  • services/rag/app/services/cognee/config.py (2 hunks)
  • services/rag/app/services/cognee/service.py (6 hunks)

larryro added a commit that referenced this pull request May 17, 2026
Closes #3, #19, #20, #21, #22, #23, #24, #25, #26, #29, #39 — frontend
audio UX + resolver tests.

- `message-bubble.tsx` renders a single stable `<VoiceOutputIndicator>`
  per assistant message instead of three separate mounts (inline-
  streaming + two toolbar copies). The previous shape unmounted the
  inline indicator at streaming-end → triggered `stop()` → mounted a
  fresh toolbar indicator with a `mountTimeRef` captured AFTER all
  chunks were created → auto-play short-circuited and the user heard
  silence at the stream-end boundary. The single mount keeps
  `mountTimeRef` stable across both phases. (#3)
- `use-voice-output.ts` tracks every retry `setTimeout` id in a `Set`
  ref and clears them on unmount + on message change. The prior code
  let the 1.5s backoff timer fire after unmount and re-invoke
  `synthesize` against a dead component. (#19)
- `use-voice-output.ts` caps the synthesis queue at
  `MAX_TTS_QUEUE_DEPTH = 50`. When full, drops the new task and
  surfaces `QUEUE_OVERFLOW` via the error sink so the user sees why
  playback paused. `MAX_IN_FLIGHT` previously throttled concurrent
  dispatch but did not bound queue depth. (#20)
- `use-voice-output.ts` catch branch now falls back to
  `'UNKNOWN_NETWORK'` when `extractConvexErrorCode` returns undefined
  (network drop, action timeout). Previously the only signal was
  `console.error`; the indicator stayed stuck with no actionable
  message. (#21)
- `use-voice-output-player.ts` re-calls `primeAudio(el)` at the start
  of every `play()` invocation and drops the `el.load()` in `stop()`.
  Together these stop iOS Safari from expiring the user-activation
  token between messages of a session. (#22)
- `voice-output-context.tsx` + `prime-audio.ts`: per-provider audio
  element ownership. Each `<VoiceOutputProvider>` constructs its own
  `<audio>` via `useMemo` and exposes it via `useVoiceAudioElement()`.
  The prior module-level singleton meant arena split-view's two
  providers stomped each other's `src` mid-playback. `primeAudio(el?)`
  now takes the element to pre-warm; callers without a provider scope
  (settings page) call it with `undefined` and only the AudioContext
  is banked. (#23)
- `voice-output-indicator.tsx` classifies error codes into
  `retryable | config | terminal`. Config codes (NO_PROVIDER,
  HOST_POLICY, forbidden) render a `<Link>` to Settings → AI
  providers; terminal codes (BUDGET_EXCEEDED, QUEUE_OVERFLOW, char-
  cap) render a non-interactive `<Badge>`. Only retryable codes keep
  the click-to-retry button. Stops the tap→fail→tap→fail loop on
  unrecoverable errors. (#24)
- `voice-output-announcer.tsx` now reads `{ state, errorCode }` from
  the announcer store and speaks the per-code reason on transitions
  into `'error'` (e.g. "Voice provider not configured"). Screen-
  reader users on touch devices — where the indicator's per-code
  tooltip is unreachable — now hear the actionable reason instead of
  the generic "Voice output failed". (#25)
- `personalization-settings.tsx` composes the `providerUnavailable`
  hint into the Switch's `description` prop (a ReactNode) when
  `providerAvailable === false`. The hint now lands in the same
  `aria-describedby` block as the base description, so SR focus on
  the Switch reads it. The duplicate sibling `<Text>` is removed. (#26)
- `voice-output-announcer.tsx` drains announcements through a small
  queue with a 1500ms hold per entry. Rapid transitions
  (playing → blocked → error in <1500ms) no longer clobber the
  previous text mid-utterance; each entry plays in order. (#39)
- `resolve_tts_model.test.ts` adds the missing call-contract assertions
  (tag=text-to-speech, orgSlug propagation, providerName propagation
  on a pinned-provider call) and three failure-path tests that pin
  the resolver's re-throw behaviour for UNKNOWN_MODEL,
  UNKNOWN_PROVIDER, and plain rejections. Without these, a regression
  that hard-coded `tag: 'chat'` or dropped `orgSlug` would have passed
  every prior test silently. (#29)
- i18n: `voiceOutputErrorConfig`, `voiceOutputErrorOpenSettings`,
  `voiceOutputErrorQueueOverflow`, `voiceOutputErrorNetwork` added to
  en/de/fr. The pre-existing orphan `voiceOutputErrorProvider` is
  removed (superseded by `voiceOutputErrorConfig`).