refactor(rag): modularize main.py into routers and utilities#15
Conversation
- Extract endpoint handlers into separate router modules (health, documents, search, jobs) and register them via app.include_router() - Move cognee_service from services/cognee_service.py to services/cognee/ - Extract memory cleanup utilities to utils module - Add vector_db_url and vector_db_collection_name settings to config - Export routers from routers/__init__.py - Remove obsolete test-setup.sh script - Clean up imports and prefix unused parameters with underscore
📝 WalkthroughWalkthroughThis pull request refactors the RAG service architecture through modularization and simplification. The changes include: (1) restructuring the monolithic Cognee service into a package with separated concerns (initialization, configuration, service logic, cleanup utilities); (2) migrating endpoint implementations from main.py to a router-based architecture (health, documents, search, jobs routers); (3) simplifying configuration by removing database pooling fields, provider-specific auth keys, and legacy multi-provider support, while adding methods for database URL validation and LLM configuration; (4) introducing application lifecycle management via an async context manager for Cognee initialization at startup; (5) adding memory cleanup instrumentation for debugging; and (6) removing the test setup bash script and simplifying Docker environment variables. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Comment |
There was a problem hiding this comment.
Actionable comments posted: 27
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
services/rag/Dockerfile (1)
47-72: Fix HEALTHCHECK to respectRAG_PORT(currently hardcoded to 8001).
IfRAG_PORTis overridden, the container can be marked unhealthy despite serving correctly.-HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \ - CMD curl -f http://localhost:8001/health || exit 1 +HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \ + CMD sh -c 'curl -fsS "http://localhost:${RAG_PORT:-8001}/health" || exit 1'Also consider pinning/verifying the
uvinstall script (Line 17) to reduce supply-chain risk (e.g., versioned URL + checksum).services/rag/app/config.py (2)
144-148:get_allowed_origins_list()should drop empty entries.
Trailing commas will produce""origins.- return [origin.strip() for origin in self.allowed_origins.split(",")] + return [o for o in (origin.strip() for origin in self.allowed_origins.split(",")) if o]
85-142:get_llm_config()incorrectly omitsbase_urlwhen no API key is set, breaking Ollama and other local OpenAI-compatible setups.The
base_urlis only added to the config dict whenapi_keyis truthy (lines 138–141). Local providers like Ollama that useOPENAI_BASE_URLwithout an API key will not receive the base URL configuration, causing requests to route to the wrong endpoint.Additionally, lines 103–104 and 110–111 call
int(max_tokens_env)andfloat(temperature_env)without error handling. Invalid environment variable values will raiseValueErrorat startup with an unclear message; wrap these conversions in try/except and raise a descriptiveValueErrorlike"OPENAI_MAX_TOKENS must be a valid integer".- if api_key: - config["api_key"] = api_key - config["base_url"] = base_url + if api_key: + config["api_key"] = api_key + + # Base URL may be required even without an API key (e.g., Ollama/local gateways) + if base_url: + config["base_url"] = base_urlAnd guard the conversions:
if max_tokens_env is not None: try: max_tokens = int(max_tokens_env) except ValueError: raise ValueError("OPENAI_MAX_TOKENS must be a valid integer") if temperature_env is not None: try: temperature = float(temperature_env) except ValueError: raise ValueError("OPENAI_TEMPERATURE must be a valid float")services/rag/app/main.py (3)
26-42: Uselogger.exception(...)(orlogger.opt(exception=e)) for startup failures.
Inlifespan(),logger.error(f"...{e}")loses the stack trace; this makes init failures hard to debug.
72-83: Ensure memory cleanup runs even when request handling raises.
cleanup_memory(...)won’t run ifcall_nextraises. Wrap intry/finallyand make cleanup best-effort.@app.middleware("http") async def memory_cleanup_middleware(request: Request, call_next): ... - response = await call_next(request) - cleanup_memory(context=f"after request {request.url.path}") - return response + try: + return await call_next(request) + finally: + try: + cleanup_memory(context=f"after request {request.url.path}") + except Exception: + logger.exception("cleanup_memory failed")
98-109:logurudoesn’t honorexc_info=True; fix stacktrace logging + debug check.
Also,settings.log_level == "debug"is case-sensitive; if config is"DEBUG"you’ll never includedetails.@app.exception_handler(Exception) async def general_exception_handler(_request, exc): """Handle general exceptions.""" - logger.error(f"Unhandled exception: {exc}", exc_info=True) + logger.opt(exception=exc).error("Unhandled exception") return JSONResponse( status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, content=ErrorResponse( error=exc.__class__.__name__, message="Internal server error", - details={"error": str(exc)} if settings.log_level == "debug" else None, + details={"error": str(exc)} if settings.log_level.lower() == "debug" else None, ).model_dump(), )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro (Legacy)
📒 Files selected for processing (17)
services/rag/Dockerfile(1 hunks)services/rag/app/config.py(3 hunks)services/rag/app/main.py(4 hunks)services/rag/app/models.py(0 hunks)services/rag/app/routers/__init__.py(1 hunks)services/rag/app/routers/documents.py(1 hunks)services/rag/app/routers/health.py(1 hunks)services/rag/app/routers/jobs.py(1 hunks)services/rag/app/routers/search.py(1 hunks)services/rag/app/services/cognee/__init__.py(1 hunks)services/rag/app/services/cognee/cleanup.py(1 hunks)services/rag/app/services/cognee/config.py(1 hunks)services/rag/app/services/cognee/service.py(1 hunks)services/rag/app/services/cognee/utils.py(1 hunks)services/rag/app/services/cognee_service.py(0 hunks)services/rag/app/utils.py(1 hunks)services/rag/test-setup.sh(0 hunks)
💤 Files with no reviewable changes (3)
- services/rag/test-setup.sh
- services/rag/app/models.py
- services/rag/app/services/cognee_service.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*
📄 CodeRabbit inference engine (.cursor/rules/workspace_rules.mdc)
Use English only for ALL user-facing content including UI components, labels, buttons, dialogs, forms, toast messages, error messages, success messages, comments, documentation, README files, variable names, function names, and type names
Files:
services/rag/app/routers/__init__.pyservices/rag/app/routers/health.pyservices/rag/app/services/cognee/cleanup.pyservices/rag/Dockerfileservices/rag/app/routers/search.pyservices/rag/app/routers/documents.pyservices/rag/app/utils.pyservices/rag/app/services/cognee/utils.pyservices/rag/app/services/cognee/__init__.pyservices/rag/app/services/cognee/service.pyservices/rag/app/services/cognee/config.pyservices/rag/app/main.pyservices/rag/app/routers/jobs.pyservices/rag/app/config.py
🧬 Code graph analysis (6)
services/rag/app/routers/__init__.py (3)
services/rag/app/routers/health.py (1)
health(30-36)services/rag/app/routers/search.py (1)
search(21-58)services/rag/app/services/cognee/service.py (1)
search(108-167)
services/rag/app/routers/search.py (2)
services/rag/app/models.py (4)
QueryRequest(150-168)QueryResponse(185-191)GenerateRequest(198-216)GenerateResponse(219-225)services/rag/app/services/cognee/service.py (2)
search(108-167)generate(169-281)
services/rag/app/routers/documents.py (3)
services/rag/app/models.py (6)
DocumentAddRequest(43-57)DocumentAddResponse(60-84)DocumentDeleteRequest(87-90)DocumentDeleteResponse(93-97)BatchAddRequest(232-237)BatchAddResponse(240-246)services/rag/app/utils.py (1)
cleanup_memory(33-43)services/rag/app/services/cognee/service.py (1)
add_document(57-106)
services/rag/app/services/cognee/service.py (2)
services/rag/app/services/cognee/utils.py (2)
normalize_add_result(12-53)normalize_search_results(96-105)services/rag/app/config.py (1)
get_llm_config(85-142)
services/rag/app/services/cognee/config.py (1)
services/rag/app/config.py (2)
get_llm_config(85-142)get_database_url(79-83)
services/rag/app/routers/jobs.py (1)
services/rag/app/models.py (1)
JobStatus(114-143)
🪛 Ruff (0.14.8)
services/rag/app/routers/__init__.py
8-8: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
services/rag/app/routers/health.py
3-3: typing.Dict is deprecated, use dict instead
(UP035)
services/rag/app/services/cognee/cleanup.py
55-55: Do not catch blind exception: Exception
(BLE001)
128-128: Do not catch blind exception: Exception
(BLE001)
services/rag/app/routers/search.py
53-53: Do not catch blind exception: Exception
(BLE001)
55-58: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
57-57: Use explicit conversion flag
Replace with conversion flag
(RUF010)
91-91: Do not catch blind exception: Exception
(BLE001)
93-96: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
95-95: Use explicit conversion flag
Replace with conversion flag
(RUF010)
services/rag/app/routers/documents.py
5-5: typing.Dict is deprecated, use dict instead
(UP035)
34-34: Async functions should not open files with blocking methods like open
(ASYNC230)
96-96: Do not catch blind exception: Exception
(BLE001)
127-127: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
151-154: Abstract raise to an inner function
(TRY301)
161-161: Async functions should not open files with blocking methods like open
(ASYNC230)
170-173: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
211-211: Do not catch blind exception: Exception
(BLE001)
226-226: Do not catch blind exception: Exception
(BLE001)
257-257: Do not catch blind exception: Exception
(BLE001)
259-262: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
261-261: Use explicit conversion flag
Replace with conversion flag
(RUF010)
276-276: Do not catch blind exception: Exception
(BLE001)
278-281: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
280-280: Use explicit conversion flag
Replace with conversion flag
(RUF010)
307-307: Do not catch blind exception: Exception
(BLE001)
314-314: Use explicit conversion flag
Replace with conversion flag
(RUF010)
services/rag/app/utils.py
29-29: Do not catch blind exception: Exception
(BLE001)
services/rag/app/services/cognee/utils.py
7-7: typing.Dict is deprecated, use dict instead
(UP035)
7-7: typing.List is deprecated, use list instead
(UP035)
13-13: Dynamically typed expressions (typing.Any) are disallowed in result
(ANN401)
48-48: Do not catch blind exception: Exception
(BLE001)
50-50: Do not catch blind exception: Exception
(BLE001)
56-56: Dynamically typed expressions (typing.Any) are disallowed in result
(ANN401)
85-85: Do not catch blind exception: Exception
(BLE001)
services/rag/app/services/cognee/__init__.py
21-21: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
services/rag/app/services/cognee/service.py
8-8: typing.Dict is deprecated, use dict instead
(UP035)
8-8: typing.List is deprecated, use list instead
(UP035)
60-60: Unused method argument: metadata
(ARG002)
97-102: Consider moving this statement to an else block
(TRY300)
113-113: Unused method argument: filters
(ARG002)
154-154: Consider moving this statement to an else block
(TRY300)
258-263: Consider moving this statement to an else block
(TRY300)
300-303: Consider moving this statement to an else block
(TRY300)
services/rag/app/services/cognee/config.py
30-30: Dynamically typed expressions (typing.Any) are disallowed in *args
(ANN401)
30-30: Dynamically typed expressions (typing.Any) are disallowed in **kwargs
(ANN401)
30-30: Dynamically typed expressions (typing.Any) are disallowed in _safe_encoding_for_model
(ANN401)
33-33: Do not catch blind exception: Exception
(BLE001)
61-61: Avoid specifying long messages outside the exception class
(TRY003)
139-143: Avoid specifying long messages outside the exception class
(TRY003)
147-147: Avoid specifying long messages outside the exception class
(TRY003)
149-149: Avoid specifying long messages outside the exception class
(TRY003)
151-151: Avoid specifying long messages outside the exception class
(TRY003)
153-153: Avoid specifying long messages outside the exception class
(TRY003)
228-228: Do not catch blind exception: Exception
(BLE001)
247-247: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
249-249: Consider moving this statement to an else block
(TRY300)
services/rag/app/config.py
59-59: Probable insecure usage of temporary file or directory: "/tmp/tale-rag-data"
(S108)
🔇 Additional comments (11)
services/rag/app/services/cognee/utils.py (1)
96-105:normalize_search_results()is fine.services/rag/app/routers/health.py (1)
39-55: Double-check/configis safe to expose in your deployment model.
It includes runtime knobs (and model names). If this service is internet-facing, consider gating behind auth or restricting to internal networks.
Based on coding guidelines, keep all exposed error/config strings in English (looks good here).services/rag/app/routers/documents.py (1)
65-123: Queued ingestion flow looks good; ensure job_id/document_id semantics are intentional.
Using a stabledoc_idand immediately creating a queued job record is a good UX for callers polling status.services/rag/app/services/cognee/config.py (2)
206-230: Fail-fast vs continue: decide if base_config preconfiguration is required.
configure_cognee_base_config()swallows all exceptions; if cognee requires these directories, you may want to re-raise to avoid a partially configured runtime.
232-253: Handle non-ImportError failures duringimport cogneeintentionally.
Ifcogneeraises at import time for reasons other thanImportError(bad env, incompatible deps), you currently crash the process. Confirm whether that’s desired; otherwise catch broader import-time exceptions and returnFalsewith a logged stack trace.services/rag/app/services/cognee/service.py (6)
18-22: LGTM!Good design decision to use a single shared dataset. The comment clearly explains the rationale for avoiding per-document datasets to keep memory usage bounded.
25-30: LGTM!Clean initialization with a simple state flag.
32-55: LGTM!Good initialization workflow with proper cleanup of legacy data and error handling.
76-106: LGTM!Good use of incremental loading to avoid reprocessing the entire dataset. The timing instrumentation and result normalization are well implemented.
309-321: LGTM!Clean implementation with proper error handling for the reset operation.
156-167: Error handling is appropriate given Cognee's API constraints.The string-based check for
"DatabaseNotCreatedError"is necessary because this exception is not exposed as a public, importable type in Cognee's API (v0.4.0). The defensive error message handling is pragmatic and correct.
- config.py: Change EMBEDDING_PROVIDER and EMBEDDING_MODEL to use
os.environ.setdefault() for consistency with other env vars
- config.py: Fix all logger calls to use brace-style {} formatting
instead of percent-style %s
- cleanup.py: Fix remaining logger call to use brace-style formatting
Additional Fixes Pushed (Commit 0b716de)Addressed remaining CodeRabbit review feedback: config.py
cleanup.py
All 27 CodeRabbit review comments have been addressed. The changes are now complete and ready for final review. |
When the cognee package is not installed, provide a _UnavailableCogneeService placeholder class instead of None. This allows: - cognee_service.initialized to return False (no AttributeError) - Health endpoint to report 'degraded' status safely - Startup to call initialize() without crashing - All service methods to return appropriate failure responses Addresses CodeRabbit review feedback on cognee/__init__.py.
Closes #15, #16, #17, #18, #41 — secret sanitiser + audio-fetch headers. - `sanitize_secrets.ts` JSON-value class now honours backslash-escaped characters: `(?:[^"\\]|\\.)*` instead of `[^"]*`. Previously a value containing `\"` (an escaped quote inside the string) terminated the redaction at the first inner quote and left the rest of the JSON object exposed. Test added that pins the new behaviour. - `sanitize_secrets.ts` patterns extended with Stripe (`sk_/pk_/rk_` × `live/test`), OpenAI org/project identifiers (`org-…`, `proj_…`), Cookie / Set-Cookie header lines, and bare Better Auth session tokens. An idempotency check (sanitize ∘ sanitize = sanitize) and a 100k-char ReDoS guard round out the new coverage. - `http.ts /api/tts-audio` moves the per-IP rate limit ABOVE the Better Auth session lookup. An unauthenticated attacker hammering the route was previously forcing a DB session-query per request — mirrors the `/storage` route's order. - `http.ts /api/tts-audio` Cache-Control changes from `private, max-age=600` to `private, max-age=0, must-revalidate`. A removed member can no longer replay cached audio for up to 10 minutes past their removal; the revalidating round-trip re-hits the membership gate on every replay. - `http.ts /api/tts-audio` sets `Accept-Ranges: none` so iOS Safari `<audio preload="auto">` range probes don't trigger audible mid-chunk restarts.
Closes #15, #16, #17, #18, #41 — secret sanitiser + audio-fetch headers. - `sanitize_secrets.ts` JSON-value class now honours backslash-escaped characters: `(?:[^"\\]|\\.)*` instead of `[^"]*`. Previously a value containing `\"` (an escaped quote inside the string) terminated the redaction at the first inner quote and left the rest of the JSON object exposed. Test added that pins the new behaviour. - `sanitize_secrets.ts` patterns extended with Stripe (`sk_/pk_/rk_` × `live/test`), OpenAI org/project identifiers (`org-…`, `proj_…`), Cookie / Set-Cookie header lines, and bare Better Auth session tokens. An idempotency check (sanitize ∘ sanitize = sanitize) and a 100k-char ReDoS guard round out the new coverage. - `http.ts /api/tts-audio` moves the per-IP rate limit ABOVE the Better Auth session lookup. An unauthenticated attacker hammering the route was previously forcing a DB session-query per request — mirrors the `/storage` route's order. - `http.ts /api/tts-audio` Cache-Control changes from `private, max-age=600` to `private, max-age=0, must-revalidate`. A removed member can no longer replay cached audio for up to 10 minutes past their removal; the revalidating round-trip re-hits the membership gate on every replay. - `http.ts /api/tts-audio` sets `Accept-Ranges: none` so iOS Safari `<audio preload="auto">` range probes don't trigger audible mid-chunk restarts.
Summary by CodeRabbit
Release Notes
New Features
Changes
✏️ Tip: You can customize this high-level summary in your review settings.