refactor(phase 8): service layer extraction#396
Conversation
Extract the OIDC Authorization-Code + PKCE flow and the auth-policy helpers out of the fat routers/auth.py into a testable orchestrator. services/orchestrators/auth_service.py owns every decision: PKCE/state generation, state-cookie round-trip, code exchange, user lookup / auto-provisioning, claim-mapping sync, opaque session issuance with Fernet-encrypted IdP tokens, back-channel + RP-initiated logout, plus pure policy helpers (ROLE_HIERARCHY, check_partition_access, require_admin, validate_file_quota, open-redirect sanitisation). It talks to the Phase 7 domain repositories (UserRepository, OIDCSessionRepository) instead of the Ray vectordb actor and deals in User / OIDCSession models rather than ad-hoc dicts. routers/auth.py drops from 636 to 233 lines and keeps HTTP transport only: the AUTH_MODE gate, cookie set/clear, the Secure-flag heuristic, and mapping OIDCFlowError to the previous responses verbatim. /auth/me is unchanged. OIDC config is built from env into OIDCConfig (added auto_provision_login field) in the container; it is not yet wired into the root Settings. ServiceContainer gains a lazy auth_service property; di/providers.py exposes the one-liner get_auth_service reading app.state.container, which main.py now attaches as best-effort minimal wiring. Activating the live OIDC flow (container.initialize) remains a Phase 11 concern; token mode is unaffected (those routes still 400 before any service). 19 new unit tests cover login, callback (happy / CSRF / unregistered / auto-provision / masked exchange failure / userinfo claim mapping), back-channel logout, logout URL construction, and the policy helpers. Full services/di/auth sweep: 291 passed. Layer guard clean.
Extract the non-delegation parts of routers/users.py into a testable
orchestrator: input validation (display_name length, email format), the
default-quota creation rule, and existence / not-found semantics. The
service talks to the Phase 7 UserRepository directly; response shapes
are unchanged (the repo *_dict helpers reproduce the legacy
PartitionFileManager dict contract) so clients are unaffected. Typed
core exceptions surface through the existing global OpenRAGError handler,
matching the legacy 404 behaviour that already went through it.
routers/users.py drops to a thin layer: request-scoped authz Depends
wrappers (unchanged, retired later), the two id==1 guard rules whose
exact HTTPException body the legacy returned, and response shaping.
GET /users/info stays in the router — it reads the TaskStateManager Ray
actor, which orchestrators must not touch (8H); it moves once the queue
is de-Ray'd.
Known temporary gap (deferred to 8B / PartitionService, per decision):
the legacy Ray delete_user cascade-deleted every partition the user
owned (Milvus + Postgres) before removing the row. UserService.delete_user
is a plain repo delete; the owner-partition cascade is reintroduced when
PartitionService lands. So deleting a partition-owning user via
DELETE /users/{id} temporarily leaves their owned partitions' Milvus
data in place until 8B.
ServiceContainer gains a lazy user_service property (constructor extended
with default_file_quota from settings.rdb for the creation default
rule); di/providers.py exposes the get_user_service one-liner.
21 new unit tests (create default-quota matrix + validation, read /
delete-without-cascade / regenerate / update dict shape). Full
services/di/auth sweep: 307 passed. Ruff + layer guard + 8H clean.
…ore delete-user cascade Extract partition CRUD, membership management, file/chunk reads and the relationship queries from routers/partition.py + the partition slice of the legacy Ray vectordb shim into PartitionService. It talks to the Phase 7 repos and the VectorStore port directly — delete_partition (the one cross-cutting op) drops vectors via the clean port (query_ids_by_filter + delete) then the relational rows, staying backend-agnostic. Chunk reads return plain dicts (no LangChain Document in orchestrators, 8H); the thin router keeps request.url_for link building and the byte-identical 409/404 HTTPException guards. Constructor takes two args beyond the plan's four, both to preserve legacy behaviour without reaching into Ray/config: collection (vector store collection name, from settings.vectordb) and user_repo (to reproduce the VDBUserNotFound 404 add_partition_member raised). Restores the owner-partition cascade dropped in 8A.2: UserService now composes PartitionService — delete_user deletes every partition the user owns (vectors + rows) before removing the user row, matching the legacy Ray delete_user. UserService ctor extended with partition_service + membership_repo; container wires it (lazy props resolve the order). routers/partition.py: 481 -> thin delegations. di adds the partition_service lazy property + get_partition_service provider. 23 new PartitionService unit tests + updated UserService cascade tests. Full orchestrators/di sweep: 97 passed. Ruff + layer guard + 8H clean.
Extract workspace CRUD, file association and the cross-cutting delete-with-orphan-cleanup from routers/workspaces.py + the workspace slice of the legacy Ray vectordb shim into WorkspaceService. The simple endpoints were already 1:1 repo delegations; the substantive piece is delete_workspace — it removes the workspace, then for every file the removal orphaned, deletes the chunks from the vector store (clean port: query_ids_by_filter + delete) and detaches/removes the relational rows, concurrently with per-file failures collected (not raised), matching the legacy router's asyncio.gather(..., return_exceptions=True). routers/workspaces.py drops all Ray plumbing (call_ray_actor_with_timeout, get_vectordb, the VECTORDB_TIMEOUT/config load) and becomes thin delegations; it keeps request schema validation, request-scoped authz, and the byte-identical HTTPException guards (409 duplicate, the require_workspace_in_partition 404, the unknown/missing-file 404s, the not-removed 404). require_workspace_in_partition now resolves the workspace through the service. Constructor takes collection (vector store name, from settings.vectordb) beyond the plan's three, to reproduce the legacy delete_file behaviour without reaching into Ray/config. di adds the workspace_service lazy property + get_workspace_service provider. 11 new WorkspaceService unit tests (delegations + the orphan-cleanup matrix incl. per-file failure collection). Full services/di/auth sweep: 333 passed. Ruff + layer guard + 8H clean.
Extract the retrieval path (was indexer.asearch Ray call + the legacy _expand_with_related_chunks shim) into RetrievalService, wrapping the clean core.retrieval pipeline (strategy + optional reranker + related/ ancestor expansion + RRF fusion). Exposes search() for routers/search.py and retrieve()/retrieve_multi()/retrieve_per_query()/fuse() for QueryService (8C.2). Searcher backing (logged, REFACTORING_DECISION_LOG Phase 8 entry 6): the core retriever talks to the RetrievalSearcher port; the only impl is the Ray-backed MilvusRayShim. Per the dev-workflow doc Ray cleanup is Phase 9 and orchestrators may call Ray behind a port during the shim, so the searcher is injected — the orchestrator file has no Ray remote-call and no Ray import (8H satisfied). Ctor therefore deviates from the plan's (vector_store, embedder_factory, ...) signature: with the shim searcher those are unused; it takes the built searcher/reranker/llm + config. A clean VectorStoreSearcher replaces the shim in Phase 9. routers/search.py drops all Ray plumbing and becomes thin: request-scoped authz, partition resolution, the byte-identical workspace-not-found 404 (now via WorkspaceService.get_workspace), url_for links, Chunk→response shaping. di adds the retrieval_service lazy property (builds the LLM + reranker from settings, MilvusRayShim from the Ray namespace) and the get_retrieval_service provider. 8 RetrievalService unit tests (search param/normalisation/expansion + retrieve/retrieve_multi/retrieve_per_query/fuse via the real core pipeline with a faked searcher). Full services/di/auth/core-retrieval sweep: 377 passed. Ruff + layer guard + 8H clean.
Rebuild the RAG core (RagPipeline + map_reduce) into QueryService: query generation, retrieve_multi/per-query + RRF (via RetrievalService), concurrent web search, map-reduce relevancy/summarisation, context + system-prompt assembly, and streaming. LangChain removed (logged, decision log Phase 8 entry 7): query generation (SearchQueries) and map-reduce (SummarizedChunk) now use the injected core LLM with a JSON-instructed prompt + response_format=json_object + json.loads into the Pydantic model, keeping the legacy fallbacks (retry → raw user query; relevancy=False on parse failure). No LangChain symbol is imported into the orchestrator; Chunk→Document conversion happens via Chunk.to_langchain() at the boundary so format_context / the source helpers are reused verbatim. Streaming + citations live in QueryService (entry 8): chat_stream drives the proven components.utils.stream_with_source_filtering (delicate 100-char [Sources: N] buffer reused as-is); chat/complete return the finalized OpenAI dict with citation-filtered extra. routers/openai.py drops the module-level ragpipe singleton and all RAG/citation logic — it keeps only model→partition resolution, token-limit validation, the /models listing (now via PartitionService), the request-bound __prepare_sources callable (keeps request.url_for in transport), and StreamingResponse/JSONResponse wrapping with the SSE error envelope. QueryService ctor also takes workspace_service (workspace validation without the Ray actor). Legacy components/pipeline.py + map_reduce.py + retriever.py are now dead code (flagged for Phase 12 cleanup). di adds the query_service lazy property (builds the core LLM + WebSearchFactory service) and the get_query_service provider. 11 QueryService unit tests (query-gen simple/chatbot/fallback, chat direct + with-partition citation filtering, complete, stream → [DONE], map-reduce keep/drop, helpers). Full services/di/auth/core-retrieval sweep: 388 passed. Ruff + layer guard + 8H clean.
…f routers/users.py
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 Walkthrough<review_stack_artifact> </review_stack_artifact> ✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
openrag/routers/indexer.py (1)
144-180:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClean up staged uploads when post-save steps fail.
Line 144 and Line 269 persist files before all fail-fast checks/dispatch complete. Any later 4xx/5xx path (e.g., workspace validation failure or dispatch error) can leave orphaned files on disk.
💡 Suggested fix (centralized cleanup on failure)
async def add_file(...): - original_filename = file.filename - file.filename = sanitize_filename(file.filename) - try: - file_path = await save_file_to_disk(file, Path(DATA_DIR), with_random_prefix=True) + file_path = None + original_filename = file.filename + file.filename = sanitize_filename(file.filename) + try: + # validate workspace_ids first (unchanged logic) ... + file_path = await save_file_to_disk(file, Path(DATA_DIR), with_random_prefix=True) + task_id = await service.add_file( + file_path=str(file_path), + file_id=file_id, + partition=partition, + metadata=metadata, + sanitized_filename=file.filename, + original_filename=original_filename, + user=user, + workspace_ids=parsed_workspace_ids, + ) except Exception as e: + if file_path is not None and file_path.exists(): + file_path.unlink() logger.exception("Failed to save file to disk.", error=str(e)) raise HTTPException( status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e), ) async def put_file(...): - file_path = await save_file_to_disk(file, Path(DATA_DIR), with_random_prefix=True) - - task_id = await service.add_file(...) + file_path = None + try: + file_path = await save_file_to_disk(file, Path(DATA_DIR), with_random_prefix=True) + task_id = await service.add_file(...) + except Exception as e: + if file_path is not None and file_path.exists(): + file_path.unlink() + raiseAlso applies to: 269-280
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/routers/indexer.py` around lines 144 - 180, The code saves uploads to disk via save_file_to_disk (yielding file_path) before running follow-up checks (workspace validation via service.get_workspace and dispatch via service.add_file), which can leave orphaned files on any subsequent 4xx/5xx exit; update the handler to centrally clean up the staged file on failure: after calling save_file_to_disk track file_path and wrap the remaining logic (parsing workspace_ids, workspace lookup, and service.add_file) in a try/except/finally so that on any exception you remove the saved file (using the Path represented by file_path) before re-raising/returning the HTTPException; apply the same pattern to the other upload branch that saves files (the second save_file_to_disk usage referenced) so both paths remove their staged files on error.
🧹 Nitpick comments (9)
openrag/routers/openai.py (1)
50-53: 💤 Low valueDeprecated
on_eventusage flagged by CI.The
@router.on_event("startup")decorator is deprecated in recent FastAPI versions. The pipeline warns about this. Consider migrating to FastAPI's lifespan context manager pattern.♻️ Suggested migration to lifespan handler
+from contextlib import asynccontextmanager + +@asynccontextmanager +async def lifespan(app): + global _max_model_tokens + _max_model_tokens = await _fetch_max_model_tokens() + yield + -@router.on_event("startup") -async def _cache_max_model_tokens(): - global _max_model_tokens - _max_model_tokens = await _fetch_max_model_tokens()Then attach the lifespan to the router or app at mount time.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/routers/openai.py` around lines 50 - 53, The decorator `@router.on_event`("startup") is deprecated; replace this startup hook by creating an async lifespan context manager that calls _fetch_max_model_tokens and assigns the result to the module-level _max_model_tokens (i.e., replicate what _cache_max_model_tokens() did), and then attach that lifespan to the router/app when mounting this router. Specifically, remove the decorated _cache_max_model_tokens function, implement an async contextmanager (lifespan) that awaits _fetch_max_model_tokens() and sets _max_model_tokens before yielding, and ensure the router is created/attached with that lifespan so the token cache runs at startup.openrag/services/orchestrators/user_service.py (2)
109-111: 💤 Low valueClarify:
default_quota=0does not apply as a creation default.When
default_file_quotais 0 (a valid limit per guidelines), the condition> 0prevents it from being applied to new users. This means users created without an explicit quota will havefile_quota=Noneand inherit the global default at runtime. If the intent is thatdefault_quota=0should also be applied at creation time, change to>= 0. Otherwise, this is acceptable but worth a brief inline comment explaining the distinction.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/user_service.py` around lines 109 - 111, The creation logic skips applying a default when self._default_file_quota == 0 because it checks "> 0", so change the condition to apply zero as a valid default: update the block that reads "file_quota = fields.get('file_quota'); if self._default_file_quota > 0 and file_quota is None:" to use "if self._default_file_quota >= 0 and file_quota is None:" (or alternatively check "if self._default_file_quota is not None and file_quota is None:" if None is used to signal no default), and add a brief inline comment next to self._default_file_quota explaining that 0 is an intentional valid quota to be applied at creation.
193-211: 💤 Low value
update_userreturnsUsermodel butcreate_userreturns dict — inconsistent shapes.
create_userreturns the dict fromcreate_legacy_user, whileupdate_usermanually constructs a dict from theUsermodel. The router expects dicts for JSON serialization, soupdate_useris correctly converting. However, this inconsistency may cause confusion. Consider adding a comment or extracting a_user_to_dicthelper for consistency.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/user_service.py` around lines 193 - 211, The update_user function returns a dict constructed from the User model while create_user returns whatever create_legacy_user returns, creating inconsistent response shapes; extract a single helper (e.g., _user_to_dict) that accepts a User instance and returns the canonical dict with fields id, display_name, external_user_id, email, is_admin, created_at (isoformat or None), file_quota, file_count, then replace the manual dict construction in update_user with a call to _user_to_dict and update create_user to call the same helper (or add a short comment linking create_legacy_user to the helper if that return must remain special) so both code paths produce the same JSON shape.openrag/services/orchestrators/auth_service.py (1)
92-100: 💤 Low valueMisleading function name
_utcnowreturns local time.The function name suggests UTC, but
datetime.now()returns local time. While the docstring explains the rationale (matching tz-naive DB columns), the name creates confusion for maintainers. Consider renaming to_local_now()or_naive_now()to match the actual behavior.♻️ Suggested rename for clarity
-def _utcnow() -> datetime: - """Naive local ``now`` — matches the DB columns. +def _naive_now() -> datetime: + """Naive local ``now`` — matches the DB columns.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/auth_service.py` around lines 92 - 100, The function named `_utcnow` actually returns local naive time (datetime.now()) which is misleading; rename the function to a clearer identifier such as `_naive_now` or `_local_now`, update its docstring to reflect the new name, and update every usage/call site (all references to `_utcnow`) to the new name (e.g., `_naive_now`) to keep behavior unchanged; run/adjust any tests or imports that reference `_utcnow`.openrag/services/orchestrators/test_auth_service.py (2)
376-385: ⚡ Quick winUse specific exception type for quota test.
Line 382 uses
pytest.raises(Exception)which is too broad.♻️ Suggested fix
# Specific limit exceeded (3 indexed + 2 pending >= 5). - with pytest.raises(Exception): + with pytest.raises(OpenRAGError): AuthService.validate_file_quota({"file_count": 3, "file_quota": 5}, pending_task_count=2, default_quota=10)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/test_auth_service.py` around lines 376 - 385, The test uses a broad pytest.raises(Exception) in test_validate_file_quota; replace that with the exact exception type that AuthService.validate_file_quota actually raises (e.g., QuotaExceededError or ValueError) by importing that exception and using it in the with pytest.raises(...) assertion so the test asserts the specific failure mode from AuthService.validate_file_quota.
349-363: ⚡ Quick winUse specific exception types in
pytest.raisesfor clearer test intent.Lines 351, 360, and 382 use
pytest.raises(Exception)orOIDCFlowError.__bases__[0], which are broad or fragile. Using the specificAuthErrororOpenRAGErrorwould make test failures more informative and prevent false passes if a different exception is raised.♻️ Suggested fix
+from core.utils.exceptions import AuthError, OpenRAGError + def test_require_admin(): assert AuthService.require_admin({"is_admin": True}) == {"is_admin": True} - with pytest.raises(OIDCFlowError.__bases__[0]): # OpenRAGError subclass (AuthError) + with pytest.raises(AuthError): AuthService.require_admin({"is_admin": False}) def test_check_partition_access_role_hierarchy(): parts = [{"partition": "p1", "role": "viewer"}] assert AuthService.check_partition_access( user={"is_admin": False}, partition="p1", user_partitions=parts, required_role="viewer" ) - with pytest.raises(Exception): + with pytest.raises(AuthError): AuthService.check_partition_access( user={"is_admin": False}, partition="p1", user_partitions=parts, required_role="owner" )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/test_auth_service.py` around lines 349 - 363, Replace the fragile/overbroad exception assertions in tests by asserting the specific error type: in test_require_admin replace pytest.raises(OIDCFlowError.__bases__[0]) with pytest.raises(AuthError) (import AuthError if needed), and in test_check_partition_access_role_hierarchy replace pytest.raises(Exception) with pytest.raises(AuthError) (or pytest.raises(OpenRAGError) if the implementation raises the more general type); update imports to bring AuthError/OpenRAGError into scope and run tests to confirm.openrag/services/orchestrators/test_partition_service.py (2)
83-88: ⚡ Quick win
FakeVectorStoredropscollection, so collection wiring is untested.The fake accepts
collectionbut doesn’t record or assert it. A wrong collection in service calls would still pass these tests.Suggested tightening
class FakeVectorStore: def __init__(self, ids=None, rows=None): self._ids = ids or [] self._rows = rows or [] self.deleted_ids: list[str] = [] + self.query_collections: list[str] = [] + self.delete_collections: list[str] = [] async def query_ids_by_filter(self, collection, filters): + self.query_collections.append(collection) return list(self._ids) async def delete(self, ids, collection="default") -> int: + self.delete_collections.append(collection) self.deleted_ids.extend(ids) return len(ids)async def test_delete_partition_drops_vectors_then_rows(): @@ await _svc(prepo=prepo, vstore=vstore).delete_partition("p1") + assert vstore.query_collections == ["vdb"] + assert vstore.delete_collections == ["vdb"]Also applies to: 90-91, 149-152, 158-161
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/test_partition_service.py` around lines 83 - 88, FakeVectorStore methods query_ids_by_filter and delete (and the other fake methods at the noted ranges) accept a collection parameter but ignore it, so collection wiring isn't tested; update FakeVectorStore (class and its methods like query_ids_by_filter, delete and the other fake handlers around the noted ranges) to record the collection argument (e.g., append to a self.seen_collections list or store per-call entries in self.calls) and, where appropriate, include the collection in returned data or expose it via a property so tests can assert the collection used for each call; ensure each method signature still accepts the collection default and that tests are updated to assert on the recorded collection(s).
148-154: ⚡ Quick winOrder claim in test name is not actually asserted.
Line 148 says “drops_vectors_then_rows”, but the test only checks both happened, not sequence. This can miss orchestration regressions.
Suggested tightening
`@pytest.mark.asyncio` async def test_delete_partition_drops_vectors_then_rows(): + events: list[str] = [] + + class OrderedPartitionRepo(FakePartitionRepo): + async def delete_partition(self, name: str) -> bool: + events.append("rows") + return await super().delete_partition(name) + + class OrderedVectorStore(FakeVectorStore): + async def delete(self, ids, collection="default") -> int: + events.append("vectors") + return await super().delete(ids, collection) + - prepo = FakePartitionRepo(existing={"p1"}) - vstore = FakeVectorStore(ids=["c1", "c2"]) + prepo = OrderedPartitionRepo(existing={"p1"}) + vstore = OrderedVectorStore(ids=["c1", "c2"]) await _svc(prepo=prepo, vstore=vstore).delete_partition("p1") assert vstore.deleted_ids == ["c1", "c2"] assert prepo.deleted == ["p1"] + assert events == ["vectors", "rows"]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/test_partition_service.py` around lines 148 - 154, The test test_delete_partition_drops_vectors_then_rows claims an order but only asserts both deletions happened; update the fakes and assertions to verify sequence: have FakeVectorStore (used as vstore) and FakePartitionRepo (prepo) record deletion events with an order marker (e.g., incrementing sequence id or timestamp) when their delete methods are called, then assert that the vector-store delete events for ids ["c1","c2"] occurred before the partition delete event recorded by prepo; keep test name and assert that vstore deletion sequence indices are all less than prepo.deleted sequence index to enforce "vectors then rows" behavior.openrag/services/orchestrators/test_workspace_service.py (1)
70-75: ⚡ Quick winVector-store collection propagation isn’t verified in workspace delete tests.
The fake currently ignores
collection; this weakens confidence thatWorkspaceServiceuses the configured collection consistently.Suggested tightening
class FakeVectorStore: def __init__(self, ids_by_file=None): self._ids_by_file = ids_by_file or {} self.deleted: list[list[str]] = [] + self.query_collections: list[str] = [] + self.delete_collections: list[str] = [] async def query_ids_by_filter(self, collection, filters): + self.query_collections.append(collection) return list(self._ids_by_file.get(filters.get("file_id"), [])) async def delete(self, ids, collection="default") -> int: + self.delete_collections.append(collection) self.deleted.append(list(ids)) return len(ids)async def test_delete_workspace_cleans_orphans_vectors_and_rows(): @@ out = await _svc(wrepo=wrepo, drepo=drepo, vstore=vstore).delete_workspace("p", "w1") + assert vstore.query_collections == ["vdb", "vdb"] + assert vstore.delete_collections == ["vdb"]Also applies to: 135-137, 149-150
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/services/orchestrators/test_workspace_service.py` around lines 70 - 75, The fake in test_workspace_service currently ignores the collection parameter which weakens validation; update the fake methods (query_ids_by_filter and delete) to consider the collection argument (e.g., keying _ids_by_file by (collection, file_id) or storing last_collection used), then update the tests to assert the expected collection is passed through — specifically modify the fake's query_ids_by_filter(collection, filters) to use collection when looking up ids and modify delete(ids, collection="default") to record the collection (e.g., append (collection, ids) or increment per-collection counters) and add assertions in the WorkspaceService delete tests to verify the configured collection was used.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/core/config/auth.py`:
- Around line 25-28: Remove the runtime auto-provisioning flag and any code
paths that create users when an OIDC `sub` is unknown: delete the
auto_provision_login configuration (auto_provision_login) from
openrag/core/config/auth.py and any references to it, and update the OIDC login/
callback logic (where external_user_id or user provisioning is handled) to
strictly match users by users.external_user_id == sub with no email fallback;
when no matching user exists, return a 403 error path instead of creating a new
user.
In `@openrag/di/container.py`:
- Around line 257-423: The properties (user_service, partition_service,
workspace_service, retrieval_service, query_service, conversion_service — and
any other getters that dereference self._settings) access self._settings without
a null guard; add a consistent guard that raises the container's intended
runtime error when settings are missing. Implement a small helper (e.g.
_ensure_settings_present or _require_settings) that checks if self._settings is
None and raises the container-specific RuntimeError, then call that helper at
the start of each affected property (user_service, partition_service,
workspace_service, retrieval_service, query_service, conversion_service) before
any use of self._settings to ensure failures follow the container's error
contract.
In `@openrag/main.py`:
- Around line 265-271: Replace the broad except in the ServiceContainer wiring
so only expected infrastructure/connectivity errors are handled as graceful
degradation: catch specific exceptions (e.g., database/VectorDB connection
errors such as psycopg2.OperationalError, sqlalchemy.exc.OperationalError,
ConnectionError) and log those with logger.warning while keeping
app.state.container = None; for any other Exception (import errors, syntax
errors, misconfiguration) log full traceback at error/exception level and
re-raise or fail startup. Update the try/except around the from di.container
import ServiceContainer and the ServiceContainer(config) instantiation
accordingly and ensure di/providers.py behavior remains unchanged for
request-time fallback.
In `@openrag/routers/extract.py`:
- Line 65: Guard the metadata access for chunk["metadata"]["partition"] to avoid
KeyError by checking for the presence of "partition" (e.g., via "partition" in
chunk.get("metadata", {}) or metadata.get("partition") is not None) before
assigning chunk_partition; if missing, follow the same not-found/corrupt-chunk
response path currently used elsewhere in this handler (return the 404 flow) so
absent partition metadata produces a 404 instead of a 500.
In `@openrag/services/orchestrators/auth_service.py`:
- Around line 368-374: The code currently indexes cls.ROLE_HIERARCHY[user_role]
which can raise KeyError if membership.get("role") returns an unexpected value;
update the check in the method that uses membership and user_role to first
validate that user_role is truthy and present in cls.ROLE_HIERARCHY (e.g., if
not user_role or user_role not in cls.ROLE_HIERARCHY: raise AuthError with a
clear message about unknown or invalid role for partition and appropriate
status_code), then proceed to compare ROLE_HIERARCHY[user_role] <
ROLE_HIERARCHY[required_role] and return True on success.
In `@openrag/services/orchestrators/job_service.py`:
- Around line 39-40: Replace direct Ray actor .remote() awaits with the
centralized timeout helper: import and use call_ray_actor_with_timeout from
services.workers.ray_utils and invoke it with the TaskStateManager actor method
(e.g., replace await self._tsm.get_all_states.remote() with await
call_ray_actor_with_timeout(self._tsm.get_all_states, ...) ), doing the same for
the other actor calls in this file (the uses of self._tsm.*.remote() around the
other occurrences). Ensure you pass the same args/kwargs and an appropriate
timeout value so the TaskStateManager methods (get_all_states, and the other
actor methods used in this module) are called through
call_ray_actor_with_timeout to preserve cancellation/timeouts.
In `@openrag/services/orchestrators/test_user_service.py`:
- Around line 235-251: The test fails because the fake existing User is created
without a created_at so the service returns None; update the test_setup where
repo._users[2] is assigned (in test_update_user_returns_legacy_dict_shape) to
include a valid created_at (e.g., a datetime value) so the service will produce
an ISO-formatted string in out["created_at"]; locate the FakeUserRepo/User
instantiation and add a created_at value (import datetime in the test file if
needed) rather than changing service logic.
In `@openrag/services/storage/indexer_ray_shim.py`:
- Around line 47-53: The _call() implementation imports
call_ray_actor_with_timeout from services.workers.ray_utils — change that import
to components.ray_utils and use the centralized call_ray_actor_with_timeout to
ensure consistent timeout/cancellation semantics; update the import statement in
the file containing _call (reference symbol: call_ray_actor_with_timeout and
function name _call) so the call remains identical but sources from
components.ray_utils, and run tests to confirm no other references to
services.workers.ray_utils remain.
---
Outside diff comments:
In `@openrag/routers/indexer.py`:
- Around line 144-180: The code saves uploads to disk via save_file_to_disk
(yielding file_path) before running follow-up checks (workspace validation via
service.get_workspace and dispatch via service.add_file), which can leave
orphaned files on any subsequent 4xx/5xx exit; update the handler to centrally
clean up the staged file on failure: after calling save_file_to_disk track
file_path and wrap the remaining logic (parsing workspace_ids, workspace lookup,
and service.add_file) in a try/except/finally so that on any exception you
remove the saved file (using the Path represented by file_path) before
re-raising/returning the HTTPException; apply the same pattern to the other
upload branch that saves files (the second save_file_to_disk usage referenced)
so both paths remove their staged files on error.
---
Nitpick comments:
In `@openrag/routers/openai.py`:
- Around line 50-53: The decorator `@router.on_event`("startup") is deprecated;
replace this startup hook by creating an async lifespan context manager that
calls _fetch_max_model_tokens and assigns the result to the module-level
_max_model_tokens (i.e., replicate what _cache_max_model_tokens() did), and then
attach that lifespan to the router/app when mounting this router. Specifically,
remove the decorated _cache_max_model_tokens function, implement an async
contextmanager (lifespan) that awaits _fetch_max_model_tokens() and sets
_max_model_tokens before yielding, and ensure the router is created/attached
with that lifespan so the token cache runs at startup.
In `@openrag/services/orchestrators/auth_service.py`:
- Around line 92-100: The function named `_utcnow` actually returns local naive
time (datetime.now()) which is misleading; rename the function to a clearer
identifier such as `_naive_now` or `_local_now`, update its docstring to reflect
the new name, and update every usage/call site (all references to `_utcnow`) to
the new name (e.g., `_naive_now`) to keep behavior unchanged; run/adjust any
tests or imports that reference `_utcnow`.
In `@openrag/services/orchestrators/test_auth_service.py`:
- Around line 376-385: The test uses a broad pytest.raises(Exception) in
test_validate_file_quota; replace that with the exact exception type that
AuthService.validate_file_quota actually raises (e.g., QuotaExceededError or
ValueError) by importing that exception and using it in the with
pytest.raises(...) assertion so the test asserts the specific failure mode from
AuthService.validate_file_quota.
- Around line 349-363: Replace the fragile/overbroad exception assertions in
tests by asserting the specific error type: in test_require_admin replace
pytest.raises(OIDCFlowError.__bases__[0]) with pytest.raises(AuthError) (import
AuthError if needed), and in test_check_partition_access_role_hierarchy replace
pytest.raises(Exception) with pytest.raises(AuthError) (or
pytest.raises(OpenRAGError) if the implementation raises the more general type);
update imports to bring AuthError/OpenRAGError into scope and run tests to
confirm.
In `@openrag/services/orchestrators/test_partition_service.py`:
- Around line 83-88: FakeVectorStore methods query_ids_by_filter and delete (and
the other fake methods at the noted ranges) accept a collection parameter but
ignore it, so collection wiring isn't tested; update FakeVectorStore (class and
its methods like query_ids_by_filter, delete and the other fake handlers around
the noted ranges) to record the collection argument (e.g., append to a
self.seen_collections list or store per-call entries in self.calls) and, where
appropriate, include the collection in returned data or expose it via a property
so tests can assert the collection used for each call; ensure each method
signature still accepts the collection default and that tests are updated to
assert on the recorded collection(s).
- Around line 148-154: The test test_delete_partition_drops_vectors_then_rows
claims an order but only asserts both deletions happened; update the fakes and
assertions to verify sequence: have FakeVectorStore (used as vstore) and
FakePartitionRepo (prepo) record deletion events with an order marker (e.g.,
incrementing sequence id or timestamp) when their delete methods are called,
then assert that the vector-store delete events for ids ["c1","c2"] occurred
before the partition delete event recorded by prepo; keep test name and assert
that vstore deletion sequence indices are all less than prepo.deleted sequence
index to enforce "vectors then rows" behavior.
In `@openrag/services/orchestrators/test_workspace_service.py`:
- Around line 70-75: The fake in test_workspace_service currently ignores the
collection parameter which weakens validation; update the fake methods
(query_ids_by_filter and delete) to consider the collection argument (e.g.,
keying _ids_by_file by (collection, file_id) or storing last_collection used),
then update the tests to assert the expected collection is passed through —
specifically modify the fake's query_ids_by_filter(collection, filters) to use
collection when looking up ids and modify delete(ids, collection="default") to
record the collection (e.g., append (collection, ids) or increment
per-collection counters) and add assertions in the WorkspaceService delete tests
to verify the configured collection was used.
In `@openrag/services/orchestrators/user_service.py`:
- Around line 109-111: The creation logic skips applying a default when
self._default_file_quota == 0 because it checks "> 0", so change the condition
to apply zero as a valid default: update the block that reads "file_quota =
fields.get('file_quota'); if self._default_file_quota > 0 and file_quota is
None:" to use "if self._default_file_quota >= 0 and file_quota is None:" (or
alternatively check "if self._default_file_quota is not None and file_quota is
None:" if None is used to signal no default), and add a brief inline comment
next to self._default_file_quota explaining that 0 is an intentional valid quota
to be applied at creation.
- Around line 193-211: The update_user function returns a dict constructed from
the User model while create_user returns whatever create_legacy_user returns,
creating inconsistent response shapes; extract a single helper (e.g.,
_user_to_dict) that accepts a User instance and returns the canonical dict with
fields id, display_name, external_user_id, email, is_admin, created_at
(isoformat or None), file_quota, file_count, then replace the manual dict
construction in update_user with a call to _user_to_dict and update create_user
to call the same helper (or add a short comment linking create_legacy_user to
the helper if that return must remain special) so both code paths produce the
same JSON shape.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 770384a3-acf1-450d-9840-929b2c30b66f
📒 Files selected for processing (38)
REFACTORING_DECISION_LOG.mdopenrag/core/config/auth.pyopenrag/core/indexing/dispatcher.pyopenrag/core/indexing/serializer.pyopenrag/di/container.pyopenrag/di/providers.pyopenrag/di/test_container.pyopenrag/main.pyopenrag/routers/auth.pyopenrag/routers/extract.pyopenrag/routers/indexer.pyopenrag/routers/openai.pyopenrag/routers/partition.pyopenrag/routers/queue.pyopenrag/routers/search.pyopenrag/routers/tools.pyopenrag/routers/users.pyopenrag/routers/workspaces.pyopenrag/services/orchestrators/auth_service.pyopenrag/services/orchestrators/conversion_service.pyopenrag/services/orchestrators/indexing_service.pyopenrag/services/orchestrators/job_service.pyopenrag/services/orchestrators/partition_service.pyopenrag/services/orchestrators/query_service.pyopenrag/services/orchestrators/retrieval_service.pyopenrag/services/orchestrators/test_auth_service.pyopenrag/services/orchestrators/test_conversion_service.pyopenrag/services/orchestrators/test_indexing_service.pyopenrag/services/orchestrators/test_job_service.pyopenrag/services/orchestrators/test_partition_service.pyopenrag/services/orchestrators/test_query_service.pyopenrag/services/orchestrators/test_retrieval_service.pyopenrag/services/orchestrators/test_user_service.pyopenrag/services/orchestrators/test_workspace_service.pyopenrag/services/orchestrators/user_service.pyopenrag/services/orchestrators/workspace_service.pyopenrag/services/storage/indexer_ray_shim.pyopenrag/services/storage/serializer_ray_shim.py
…s token count - main.py: open the container's asyncpg pool on startup (and close on shutdown). The thinned routers use the container's own PostgresStore, which was never initialized, so catalog-backed routes 500'd. Corrects the decision-log section 1 deferral that broke the live app and the API test job. - routers/auth.py: run the AUTH_MODE gate as a dependency before get_auth_service so token mode returns 400 instead of 503. - components/utils.py: fall back to a tiktoken encoder when ChatOpenAI cannot be constructed (no api_key in keyless / CI envs). - routers/test_auth_router.py: replace the stale pre-8A.1 fat-router test with a thin transport test; OIDC logic is covered by services/orchestrators/test_auth_service.py. - decision log: record the four fixes.
Chunk.from_langchain lifts file_id/partition/page/_id out of the free-form metadata into typed Chunk fields. The thinned search router returned only Chunk.metadata, so the API contract lost metadata.file_id (5 test_search filtering tests saw file_id == None). Rebuild the legacy metadata shape via Chunk.to_langchain().metadata, matching the pre-Phase-8 router that returned the raw Document metadata.
- di/container.py: add _require_settings() guard so the no-settings ServiceContainer() path raises the documented RuntimeError instead of AttributeError, consistent with catalog_store/vector_store. - main.py: log the boot guards at exception level (full traceback) so an unexpected wiring/init failure is loud; keep best-effort degradation. - routers/extract.py: return 404 when chunk metadata has no partition instead of KeyError -> 500. - auth_service.py: unknown membership role -> 403 instead of KeyError. - job_service.py: route the TaskStateManager .remote() calls through call_ray_actor_with_timeout for timeout/cancellation handling. Skipped (stale or out of scope): OIDC auto_provision_login removal (an intentional, documented, pre-Phase-8 feature); test_user_service created_at (User.created_at defaults to datetime, test is green); indexer_ray_shim ray_utils path (services.workers.ray_utils is the canonical location, components.ray_utils is the deprecated re-export).
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
openrag/routers/test_auth_router.py (1)
37-45: ⚡ Quick winRestore
sys.modules["utils.dependencies"]after this module’s tests.The current import-time override is global and can leak into other test modules. Add a module-scoped cleanup fixture to restore the previous module object.
Suggested fix
+_PREV_UTILS_DEPENDENCIES = sys.modules.get("utils.dependencies") + def _install_dependencies_stub() -> None: stub = types.ModuleType("utils.dependencies") stub.get_vectordb = lambda: None stub.get_task_state_manager = lambda: None stub.get_serializer = lambda: None @@ _install_dependencies_stub() + + +@pytest.fixture(scope="module", autouse=True) +def _restore_dependencies_module(): + yield + if _PREV_UTILS_DEPENDENCIES is None: + sys.modules.pop("utils.dependencies", None) + else: + sys.modules["utils.dependencies"] = _PREV_UTILS_DEPENDENCIES🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/routers/test_auth_router.py` around lines 37 - 45, The test helper _install_dependencies_stub currently overwrites sys.modules["utils.dependencies"] at import time and can leak into other tests; change this by adding a pytest fixture (module-scoped) that saves orig = sys.modules.get("utils.dependencies"), installs the stub (using the same stub creation logic in _install_dependencies_stub), yields to run tests, and on teardown restores the original: if orig is None remove sys.modules["utils.dependencies"], otherwise set sys.modules["utils.dependencies"] = orig; reference the fixture from tests that need the stub or call the fixture to ensure proper cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/di/container.py`:
- Line 93: The call currently sets auto_provision_login using
os.getenv("OIDC_AUTO_PROVISION_LOGIN"...), which allows runtime enabling of OIDC
auto-provisioning; change this to force False by replacing the env-based
expression with literal False (i.e., set auto_provision_login=False) so the OIDC
flow never auto-provisions users; update the argument named auto_provision_login
in container.py accordingly and remove the os.getenv(...) expression.
In `@openrag/main.py`:
- Around line 293-297: When container.initialize() raises, the app must mark the
container unavailable so requests don't hit a partially initialized
ServiceContainer; in the except block where you catch Exception around await
container.initialize(), set app.state.container = None (in addition to
logger.exception) so the app uses the degraded path, referencing the existing
container variable and app.state.container in the ServiceContainer
initialization flow.
---
Nitpick comments:
In `@openrag/routers/test_auth_router.py`:
- Around line 37-45: The test helper _install_dependencies_stub currently
overwrites sys.modules["utils.dependencies"] at import time and can leak into
other tests; change this by adding a pytest fixture (module-scoped) that saves
orig = sys.modules.get("utils.dependencies"), installs the stub (using the same
stub creation logic in _install_dependencies_stub), yields to run tests, and on
teardown restores the original: if orig is None remove
sys.modules["utils.dependencies"], otherwise set
sys.modules["utils.dependencies"] = orig; reference the fixture from tests that
need the stub or call the fixture to ensure proper cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 23dae878-2312-4a7b-b3f3-07b4732caa89
📒 Files selected for processing (10)
REFACTORING_DECISION_LOG.mdopenrag/components/utils.pyopenrag/di/container.pyopenrag/main.pyopenrag/routers/auth.pyopenrag/routers/extract.pyopenrag/routers/search.pyopenrag/routers/test_auth_router.pyopenrag/services/orchestrators/auth_service.pyopenrag/services/orchestrators/job_service.py
🚧 Files skipped from review as they are similar to previous changes (4)
- REFACTORING_DECISION_LOG.md
- openrag/routers/extract.py
- openrag/routers/auth.py
- openrag/services/orchestrators/auth_service.py
… 503 If container.initialize() (asyncpg pool open) fails, the container object still exists but its repos are unusable, so requests would 500. Set app.state.container = None in that path so di/providers.py serves the intended degraded 503 instead. Skipped CodeRabbit's container.py:93 (force auto_provision_login=False): OIDC_AUTO_PROVISION_LOGIN is an intentional, documented (CLAUDE.md) product feature predating Phase 8 (commits ae17f31, 170db40); hard-coding False would regress shipped behavior. Same rationale as the earlier core/config/auth.py finding.
Builds the real AuthService/UserService over repo-port adapters on the shared _StubVectorDB state and injects them via dependency_overrides instead of patching the removed router OIDCClient singleton. Stub get_oidc_session_by_token is now hash-based to mirror production.
Scope
Extracts a service layer between the routers and the components, thinning each router down to request/response handling and pushing orchestration into dedicated services.
Commits (10):
Summary by CodeRabbit
New Features
Refactor
Bug Fixes