refactor(12A): move logger + monitoring to core/, repoint imports #434
Phase 12A. Relocate the cross-cutting observability utilities out of the old utils/ tree into the hexagonal core layer:
- utils/logger.py -> core/utils/logging.py (structlog-convention name)
- utils/monitoring.py -> core/observability/monitoring.py
- utils/test_logger.py -> core/utils/test_logging.py
logging.py now reads config via `from core.config import load_config`; the core.config package exposes the cached load_config/get_settings/Settings public API (previously only the config/ shim did).
Repoint every import site to the new paths:
- from utils.logger -> from core.utils.logging
- from utils.monitoring -> from core.observability.monitoring
- from utils.exceptions.base -> from core.utils.exceptions
- from utils.exceptions.embeddings -> from core.utils.exceptions
- from utils.external_resource_errors -> from core.utils.external_errors
The exceptions and external_errors modules already live canonically in core.utils; this only switches the remaining consumers off the shims.
Delete the duplicate utils/test_external_resource_errors.py (kept in core/utils/test_external_errors.py). Update the auth-router test stub to patch core.utils.logging instead of the moved utils.logger.
No remaining `from utils.` imports outside the (now shim-only) utils/ dir.
📝 Walkthrough
This PR centralizes logging/config/prompts into core.*, adds FastAPI error handlers that map OpenRAG errors to HTTP statuses and include optional request_id, introduces core utilities (source-citation filtering, token counting, temporal-field parsing), runtime inference semaphores, async file-upload saving, and migrates many legacy component re-exports and tests to the services/core packages.
Changes
Core module reorganization and centralized infrastructure
Sequence Diagram(s)
(omitted: change is refactor-heavy, no new multi-component runtime flow requiring a sequence diagram)
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 3
🧹 Nitpick comments (2)
openrag/core/config/__init__.py (2)
30-30: 💤 Low value
Consider explicit None checks instead of truthiness for clearer semantics.
The condition `if overrides or config_path:` treats the empty string "" as falsy and falls through to the cached singleton. If an empty string is a valid (though unusual) config path, this could be unexpected. An explicit `is not None` check makes the intent clearer.
🔍 Alternative: explicit None checks
- if overrides or config_path:
+ if overrides is not None or config_path is not None:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/__init__.py` at line 30, Change the truthy check "if overrides or config_path:" to explicit None checks so empty strings or empty dicts don't incorrectly skip the cached singleton; specifically replace that condition with "if overrides is not None or config_path is not None:" (referencing the existing overrides and config_path variables/condition in __init__.py) to make the intent explicit.
22-22: ⚡ Quick win
Add type hints to load_config parameters for consistency.
The get_settings() function has complete type hints, but load_config() parameters lack type annotations. Adding hints improves IDE support and maintains consistency with the rest of the module.
📝 Proposed type hint additions
-def load_config(config_path=None, overrides=None) -> Settings:
+def load_config(config_path: str | None = None, overrides: dict | None = None) -> Settings:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openrag/core/config/__init__.py` at line 22, Update the load_config signature to include explicit type hints: annotate config_path as Optional[Union[str, Path]] = None and overrides as Optional[Mapping[str, Any]] = None, returning Settings (matching get_settings style). Add necessary imports (Optional, Union, Mapping, Any from typing and Path from pathlib) if not already present, and ensure the function definition for load_config(config_path, overrides) is updated accordingly to maintain consistency with get_settings and improve IDE/type checking.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/app_front.py`:
- Line 12: Import mask_email from core.utils.logging alongside get_logger and
use it to pseudonymize the username variable before any authentication-failure
logging; locate the auth failure block that currently logs username (the
statement around the auth handler that logs on failed login) and replace the raw
username in the processLogger/get_logger call with mask_email(username) so email
addresses are not written to logs.
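A minimal sketch of the masking step suggested above. The real `mask_email` lives in `core.utils.logging` and its exact output format is not shown in this review, so `mask_email_sketch` and `log_auth_failure` below are hypothetical illustrations that use the standard `logging` module rather than the project's logger.

```python
import logging

logger = logging.getLogger("auth")


def mask_email_sketch(value: str) -> str:
    """Hypothetical masking rule: keep the first character and the domain."""
    local, sep, domain = value.partition("@")
    if not sep or not local:
        # Not an email-shaped value: hide it entirely rather than guess.
        return "***"
    return f"{local[0]}***@{domain}"


def log_auth_failure(username: str) -> str:
    # Pseudonymize before the value reaches any log sink.
    masked = mask_email_sketch(username)
    logger.warning("Authentication failed for %s", masked)
    return masked
```

The key point is ordering: the raw username is masked before the logging call, so no handler or formatter ever sees the original address.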
In `@openrag/components/indexer/loaders/pdf_loaders/dotsocr.py`:
- Line 1: The import currently pulls a pre-configured logger instance; replace
it with the project pattern that calls the factory: import get_logger from
core.utils.logging and then create a logger by assigning logger = get_logger()
in this module (update the import and the logger variable in
openrag/components/indexer/loaders/pdf_loaders/dotsocr.py so the module uses
get_logger() instead of importing logger directly).
In `@openrag/components/indexer/loaders/pdf_loaders/openai.py`:
- Line 10: Replace the direct import of the logger instance with the get_logger
factory: change the import line that currently reads "from core.utils.logging
import logger" to import get_logger, then call get_logger() to assign to a
module-level logger variable (e.g., logger = get_logger()) so this module
follows the same structured Loguru pattern used by other loaders; update any
references to the existing logger variable to use this newly created logger if
needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 56ce3b25-b2ad-442d-a7fa-1ca9d4d720d9
📒 Files selected for processing (80)
- CLAUDE.md
- docs/content/docs/documentation/milvus_migration.mdx
- openrag/api/dependencies/auth.py
- openrag/api/dependencies/llm.py
- openrag/api/error_handlers.py
- openrag/api/main.py
- openrag/api/mcp/server.py
- openrag/api/middleware/auth.py
- openrag/api/middleware/instrumentation.py
- openrag/api/routers/admin/cluster.py
- openrag/api/routers/admin/indexing.py
- openrag/api/routers/admin/monitoring.py
- openrag/api/routers/admin/partitions.py
- openrag/api/routers/admin/tools.py
- openrag/api/routers/admin/users.py
- openrag/api/routers/admin/workspaces.py
- openrag/api/routers/auth/oidc.py
- openrag/api/routers/user/chat.py
- openrag/api/routers/user/extract.py
- openrag/api/routers/user/search.py
- openrag/app_front.py
- openrag/components/indexer/chunker/chunker.py
- openrag/components/indexer/embeddings/openai.py
- openrag/components/indexer/loaders/__init__.py
- openrag/components/indexer/loaders/audio/local_whisper.py
- openrag/components/indexer/loaders/audio/openai.py
- openrag/components/indexer/loaders/base.py
- openrag/components/indexer/loaders/doc.py
- openrag/components/indexer/loaders/docx.py
- openrag/components/indexer/loaders/image.py
- openrag/components/indexer/loaders/pdf_loaders/docling.py
- openrag/components/indexer/loaders/pdf_loaders/docling2.py
- openrag/components/indexer/loaders/pdf_loaders/dotsocr.py
- openrag/components/indexer/loaders/pdf_loaders/marker.py
- openrag/components/indexer/loaders/pdf_loaders/openai.py
- openrag/components/indexer/loaders/pdf_loaders/pymupdf.py
- openrag/components/indexer/loaders/pptx_loader.py
- openrag/components/indexer/loaders/txt_loader.py
- openrag/components/llm.py
- openrag/components/reranker/infinity.py
- openrag/components/reranker/openai.py
- openrag/components/utils.py
- openrag/components/websearch/content_fetcher.py
- openrag/components/websearch/providers/staan.py
- openrag/components/websearch/service.py
- openrag/core/config/__init__.py
- openrag/core/observability/monitoring.py
- openrag/core/utils/logging.py
- openrag/core/utils/test_logging.py
- openrag/di/container.py
- openrag/scripts/backup.py
- openrag/scripts/restore.py
- openrag/services/auth/refresh.py
- openrag/services/inference/_circuit_breaker.py
- openrag/services/inference/_retry.py
- openrag/services/inference/healthcheck.py
- openrag/services/inference/ollama_client.py
- openrag/services/inference/reranker_clients.py
- openrag/services/inference/vllm_client.py
- openrag/services/orchestrators/auth_service.py
- openrag/services/orchestrators/conversion_service.py
- openrag/services/orchestrators/indexing_service.py
- openrag/services/orchestrators/mcp_service.py
- openrag/services/orchestrators/partition_service.py
- openrag/services/orchestrators/query_service.py
- openrag/services/orchestrators/retrieval_service.py
- openrag/services/orchestrators/user_service.py
- openrag/services/orchestrators/workspace_service.py
- openrag/services/persistence/connection.py
- openrag/services/persistence/migrations/milvus/1.add_created_at_temporal_fields.py
- openrag/services/persistence/migrations/milvus/migrate.py
- openrag/services/workers/batch_ingest.py
- openrag/services/workers/bootstrap.py
- openrag/services/workers/parsers/doc_serializer.py
- openrag/services/workers/parsers/docling_workers.py
- openrag/services/workers/parsers/marker_workers.py
- openrag/services/workers/parsers/whisper_workers.py
- openrag/services/workers/ray_utils.py
- openrag/test_auth_router.py
- openrag/utils/test_external_resource_errors.py
💤 Files with no reviewable changes (1)
- openrag/utils/test_external_resource_errors.py
…refactor/phase-12-person-b-cleanup
# Conflicts:
# openrag/services/websearch/content_fetcher.py
# openrag/services/websearch/providers/staan.py
# openrag/services/websearch/service.py
Phase 12B. The canonical auth adapters (OIDC client, session tokens, state cookie, deps) live in services/auth/; components/auth/ is now only re-export shims. Repoint the remaining consumers off the shims:
- api/middleware/auth.py: components.auth.refresh -> services.auth.refresh
- api/routers/auth/oidc.py: components.auth -> services.auth.state_cookie
- di/container.py: components.auth -> services.auth
- services/orchestrators/auth_service.py: components.auth -> services.auth
- services/orchestrators/test_auth_service.py: components.auth -> services.auth
Also refresh stale module references in docstrings that pointed at the old components.auth paths (auth_service header, oidc_session_repo notes).
No remaining `from components.auth` imports outside components/ itself.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/api/dependencies/files.py`:
- Around line 47-48: Normalize and sanitize the incoming UploadFile.filename
before joining with dest_dir to prevent path traversal: in the code that
computes filename (where make_unique_filename(file.filename) is called and
file_path is computed), strip any path components and leading slashes (e.g., use
the filename's basename), reject or normalize absolute paths, then apply
make_unique_filename to that sanitized basename and join with dest_dir to form
file_path; ensure the final path is contained under dest_dir (e.g., by resolving
and asserting the parent is dest_dir) before writing.
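The sanitization steps described above could look roughly like the following sketch. `safe_upload_path` is a hypothetical helper name; the project's real `make_unique_filename` is not shown in the review and is left out here, so it would need to be applied to the sanitized basename before the join.

```python
from pathlib import Path, PureWindowsPath


def safe_upload_path(dest_dir: Path, raw_filename: str) -> Path:
    """Return a path for raw_filename that is guaranteed to sit directly in dest_dir."""
    # PureWindowsPath treats both "/" and "\\" as separators, so taking .name
    # drops directory components, drive letters and leading slashes.
    name = PureWindowsPath(raw_filename).name
    if name in {"", ".", ".."}:
        raise ValueError("invalid upload filename")

    root = dest_dir.resolve()
    candidate = (root / name).resolve()
    # Containment check: the resolved target must live directly under dest_dir.
    if candidate.parent != root:
        raise ValueError("upload path escapes destination directory")
    return candidate
```

For example, a client-supplied name such as `../../etc/passwd` is reduced to `passwd` inside the destination directory, while a bare `..` is rejected outright. Resolving both paths before comparing also catches the case where an existing entry in the directory is a symlink pointing elsewhere.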
In `@openrag/core/utils/source_filtering.py`:
- Around line 36-39: The debug log currently emits raw LLM output via
tail=repr(tail), which may contain PII; update the block that checks "if not
citations and not saw_none:" to redact/sanitize the preview before logging
(e.g., call a new or existing helper like sanitize_preview(text) or
mask_email(text) to replace email addresses and other PII, then truncate to a
fixed length) and pass that sanitized preview to logger.debug instead of
repr(tail); reference the variables "tail", "citations", "saw_none" and the
logger call in openrag/core/utils/source_filtering.py when making this change.
- Line 18: The regex _SOURCES_NUMS_RE only matches capitalized "Source(s)" so
lower/upper case tags are missed; update its declaration to be case-insensitive
by compiling it with the IGNORECASE flag (e.g., re.compile(r"...",
re.IGNORECASE)) or by adding an inline (?i) modifier to the pattern, keeping the
existing capture group and structure unchanged so numeric parsing still works
for variants like [sources: 1,3] or [SOURCES: 2].
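Both source_filtering suggestions can be sketched together. The pattern below is a hypothetical stand-in, since the real `_SOURCES_NUMS_RE` is not shown in the review, and `sanitize_preview` is an illustrative helper that only redacts email-shaped strings; other kinds of PII would need their own rules.

```python
import re

# Hypothetical pattern standing in for the real _SOURCES_NUMS_RE.
# re.IGNORECASE lets [sources: 1,3], [Source: 4] and [SOURCES: 2] all match.
SOURCES_NUMS_RE = re.compile(r"\[sources?:\s*([\d,\s]+)\]", re.IGNORECASE)

# Simple email matcher used only for log redaction (an assumption, not project code).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def parse_source_numbers(text: str) -> list[int]:
    """Collect every cited source number, regardless of tag casing."""
    numbers: list[int] = []
    for match in SOURCES_NUMS_RE.finditer(text):
        numbers.extend(int(n) for n in re.findall(r"\d+", match.group(1)))
    return numbers


def sanitize_preview(text: str, max_len: int = 80) -> str:
    """Redact email addresses, then truncate, before the text is logged."""
    redacted = EMAIL_RE.sub("<email>", text)
    return redacted[:max_len]
```

In the debug branch, the logger would then receive `sanitize_preview(tail)` instead of `repr(tail)`. Redaction runs before truncation so that an address cut in half by the length limit cannot slip through unmasked.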
In `@openrag/services/inference/runtime.py`:
- Around line 17-20: Guard detect_language against empty input and failed
detection by validating and normalizing the incoming text (e.g., strip and
return None for blank strings), wrapping the call to _lang_detector.detect(text,
k=1) in a try/except, and verifying outputs is a non-empty list before indexing;
if detection fails or outputs[0] has no "lang" key, return None (or a designated
fallback value) so callers like QueryService.generate_query can fall back
instead of propagating an exception.
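A minimal sketch of the guarded detection described above. The detector is injected as a plain callable because the library behind `_lang_detector` is not named in the review; the only assumption taken from the comment is the call shape `detect(text, k=1)` returning a list of dicts with a `"lang"` key. The function name `safe_detect_language` is hypothetical.

```python
from typing import Callable, Optional, Sequence


def safe_detect_language(
    text: Optional[str],
    detect: Callable[..., Sequence[dict]],
) -> Optional[str]:
    """Return a language code, or None when detection is impossible or fails."""
    # Blank or missing input: nothing to detect.
    if not text or not text.strip():
        return None
    try:
        outputs = detect(text.strip(), k=1)
    except Exception:
        # Broad catch for the sketch only; real code should narrow this to the
        # exceptions the detector is known to raise.
        return None
    # Empty result list or malformed first entry: treat as "unknown".
    if not outputs:
        return None
    first = outputs[0]
    return first.get("lang") if isinstance(first, dict) else None
```

A caller such as `QueryService.generate_query` can then treat `None` as "fall back to the default language" instead of handling an exception.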
In `@openrag/services/workers/indexer_pool.py`:
- Around line 104-106: The current check only verifies the presence of a "chunk"
attribute on the created chunker but not that it's callable; update the
validation in the indexer pool after create_chunker(cfg) to ensure
callable(getattr(chunker, "chunk", None)) and raise a TypeError if it's not
callable, with the same explanatory message referencing the chunk(document,
partition) method so misconfigured chunkers (e.g., chunk = None) are rejected
early.
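The stricter validation amounts to a one-line change, sketched here with a hypothetical `validate_chunker` helper; the real check sits inline after `create_chunker(cfg)` and its error message may differ.

```python
def validate_chunker(chunker: object) -> object:
    """Reject chunkers whose `chunk` attribute is missing or not callable."""
    # getattr(..., None) covers the missing-attribute case, and callable()
    # covers misconfigurations such as `chunk = None`.
    if not callable(getattr(chunker, "chunk", None)):
        raise TypeError(
            f"{type(chunker).__name__} must define a callable "
            "chunk(document, partition) method"
        )
    return chunker
```

A `hasattr` check alone would accept a class that sets `chunk = None`; the `callable` check fails fast at pool construction instead of at the first indexing call.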
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 09d1d8bd-a991-4f90-b669-1146a6220e12
📒 Files selected for processing (25)
- openrag/api/dependencies/files.py
- openrag/api/dependencies/test_files.py
- openrag/api/routers/admin/indexing.py
- openrag/api/routers/admin/tools.py
- openrag/api/routers/user/chat.py
- openrag/core/chunking/factory.py
- openrag/core/config/retrieval.py
- openrag/core/utils/filename.py
- openrag/core/utils/source_filtering.py
- openrag/core/utils/test_source_filtering.py
- openrag/core/utils/text.py
- openrag/di/container.py
- openrag/services/inference/parsers/openai_audio.py
- openrag/services/inference/runtime.py
- openrag/services/orchestrators/indexing_service.py
- openrag/services/orchestrators/query_service.py
- openrag/services/websearch/__init__.py
- openrag/services/websearch/base.py
- openrag/services/websearch/content_fetcher.py
- openrag/services/websearch/providers/__init__.py
- openrag/services/websearch/providers/staan.py
- openrag/services/websearch/service.py
- openrag/services/websearch/test_content_fetcher.py
- openrag/services/workers/indexer_pool.py
- openrag/services/workers/test_indexer_pool.py
✅ Files skipped from review due to trivial changes (6)
- openrag/services/websearch/content_fetcher.py
- openrag/services/websearch/providers/staan.py
- openrag/services/websearch/test_content_fetcher.py
- openrag/services/orchestrators/indexing_service.py
- openrag/services/inference/parsers/openai_audio.py
- openrag/services/websearch/service.py
🚧 Files skipped from review as they are similar to previous changes (3)
- openrag/di/container.py
- openrag/api/routers/admin/tools.py
- openrag/api/routers/user/chat.py
…ts.prompts
Phase 12C. The orchestrators imported HYDE/MULTI_QUERY/SYS/CONTEXTUALIZER/SPOKEN_STYLE prompts from the components.prompts shim. Those are not Python string constants; they are disk templates the shim eager-loads at import via load_config(). Distributing them as eager module-level constants into core/prompts builders would add import-time disk I/O and a config dependency to the whole core.prompts package (its __init__ imports every builder), so instead each service loads its templates from the injected Settings using the existing pure core.prompts.load_template_by_key(prompts_dir, mapping, key), which is exactly the call shape template_loader documents for its callers.
- retrieval_service.py: multiQuery / hyde branches load "multi_query" / "hyde" from `config` (already in __init__) instead of importing the shim inline.
- query_service.py: load "query_contextualizer" / "spoken_style_answer" / "sys_prompt" once in __init__ from the injected config; store on the instance. Per-instance config binding replaces the old global eager load.
- test_query_service.py: extend the fake config with the real paths/prompts so QueryService can resolve templates.
No remaining `components.prompts` imports outside components/ itself.
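The per-instance loading described in this commit can be illustrated with a small sketch. Only the call shape `load_template_by_key(prompts_dir, mapping, key)` and the three template keys come from the commit message; the function body, the `QueryServiceSketch` class and its constructor arguments are assumptions made for illustration.

```python
from pathlib import Path
from typing import Mapping, Union


def load_template_by_key_sketch(
    prompts_dir: Union[str, Path], mapping: Mapping[str, str], key: str
) -> str:
    # Hypothetical body: resolve key -> file name -> template text on disk.
    return (Path(prompts_dir) / mapping[key]).read_text(encoding="utf-8")


class QueryServiceSketch:
    """Loads its templates once per instance from injected configuration."""

    def __init__(self, prompts_dir: Union[str, Path], mapping: Mapping[str, str]):
        # Disk I/O happens here, at construction time, not at module import.
        self.contextualizer = load_template_by_key_sketch(
            prompts_dir, mapping, "query_contextualizer"
        )
        self.spoken_style = load_template_by_key_sketch(
            prompts_dir, mapping, "spoken_style_answer"
        )
        self.sys_prompt = load_template_by_key_sketch(
            prompts_dir, mapping, "sys_prompt"
        )
```

Because nothing is read at import time, importing the module stays free of disk and config side effects, and two service instances built with different prompt directories hold different templates.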
Phase 12F. Two parts.
1. Bootstrap re-points (safe — old modules are re-export shims):
- components.indexer.loaders.serializer.DocSerializer
-> services.workers.parsers.doc_serializer
- components.indexer.loaders.pdf_loaders.docling2.DoclingPool
-> services.workers.parsers.docling_workers
- components.indexer.loaders.pdf_loaders.marker.MarkerPool
-> services.workers.parsers.marker_workers
2. Loader-registry relocation. The plan assumed the loaders were already
migrated to core/indexing/parsers/ and the old ones were dead. They are
not: core/indexing/parsers/registry.py is an empty Registry, and the new
doc_serializer / doc_serializer_bridge still drive the legacy loaders via
get_loader_classes (which walks the loaders package and discovers ~20
BaseLoader adapters). So the legacy loader subsystem is the live runtime
path. To make components/ deletable (12H) it is moved wholesale:
components/indexer/loaders/ -> services/workers/parsers/legacy_loaders/
- get_loader_classes root_pkg updated to the new package path.
- The two get_loader_classes consumers (doc_serializer, doc_serializer_bridge)
and internal absolute imports / tests repointed.
- The orphaned serializer.py re-export shim (no remaining consumers after the
bootstrap re-point) is deleted rather than carried into services/.
- Internal `from components.utils` / `from components.prompts` imports inside
the loaders are left as-is; they remain valid and are 12C/12E/12H's concern.
Runtime discovery verified: get_loader_classes resolves all 19 extension
mappings from the new location.
Also fix a latent test-isolation bug this relocation surfaced:
services/inference/parsers/test_openai_audio.py installed a fake non-package
`pydub` into sys.modules at import time and never restored it. It was masked
only by collection order (the real-pydub loader test used to sort before it
under components/). It now tries the real pydub first and only falls back to
the stub when the import genuinely fails — its stated Python-3.13 purpose.
No remaining `components.indexer.loaders` references anywhere.
The 12B import switch repointed two api files at services.auth directly, which the layer guard forbids (api -> di, core only; never services). Resolve it properly rather than bridging through di:
- StateCookieSerializer / StateCookiePayload are pure (dataclasses + itsdangerous, no I/O), so move them to core/auth/state_cookie.py. api/routers/auth/oidc.py now imports from core (api -> core, allowed); services/auth re-exports them so existing services consumers are unaffected; the components.auth shim repoints to core.
- refresh_session_if_needed is infrastructure and stays in services, but the middleware reached it via a direct module import. Expose it as AuthService.refresh_session_if_needed (a thin seam over the helper, passing self) so the middleware calls it on the AuthService it already obtains from di, consistent with every other session op it performs.
Update the middleware test: the mock auth-service gains a default refresh delegating to the real helper, and the two refresh-outcome tests override that method instead of patching the module name.
`scripts/check_layer_imports.py` is now clean (0 violations).
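The "thin seam" idea can be sketched as follows. Everything except the names `AuthService` and `refresh_session_if_needed` is hypothetical: the real helper's parameters, the session type and the refresh rule are not shown in this commit, so the expiry logic below is purely illustrative.

```python
class Session:
    """Hypothetical session record with an absolute expiry timestamp."""

    def __init__(self, expires_at: float):
        self.expires_at = expires_at


def refresh_session_if_needed(auth_service, session, now, leeway=60.0):
    """Hypothetical infrastructure helper that stays in the services layer."""
    if session.expires_at - now > leeway:
        return session  # still fresh, nothing to do
    return auth_service.issue_session(now)


class AuthService:
    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl

    def issue_session(self, now: float) -> Session:
        return Session(now + self.ttl)

    def refresh_session_if_needed(self, session: Session, now: float) -> Session:
        # Thin seam: delegate to the module-level helper, passing self, so
        # callers in the api layer never import the services module directly.
        return refresh_session_if_needed(self, session, now)
```

The middleware then only touches the `AuthService` object it receives from di, and tests can override the method on a mock instead of patching a module path.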
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openrag/services/inference/parsers/test_openai_audio.py`:
- Around line 24-27: Replace the broad except Exception in the pydub shim with a
narrow import-related catch so real runtime errors still propagate: when
importing pydub fails in the test setup (the block that creates fake_pydub and
assigns fake_pydub.AudioSegment = MagicMock() and sys.modules["pydub"] =
fake_pydub), catch only ImportError or ModuleNotFoundError (e.g., except
(ImportError, ModuleNotFoundError)) and preserve the existing shim behavior
inside that except block.
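A minimal sketch of the narrowed shim. The assignments to `fake_pydub.AudioSegment` and `sys.modules["pydub"]` come from the review comment; building the stub with `types.ModuleType` is an assumption, since the test file's actual construction is not shown.

```python
import sys
import types
from unittest.mock import MagicMock

try:
    # Prefer the real package whenever it imports cleanly.
    import pydub  # noqa: F401
except (ImportError, ModuleNotFoundError):
    # Only a genuine import failure installs the stub; any other error
    # raised while importing pydub still propagates.
    fake_pydub = types.ModuleType("pydub")
    fake_pydub.AudioSegment = MagicMock()
    sys.modules["pydub"] = fake_pydub
```

`ModuleNotFoundError` is a subclass of `ImportError`, so catching `ImportError` alone would behave the same; listing both simply mirrors the wording of the review suggestion.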
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e1131938-9210-45c8-aa83-829df6336df7
📒 Files selected for processing (45)
- openrag/api/middleware/auth.py
- openrag/api/routers/auth/oidc.py
- openrag/components/auth/state_cookie.py
- openrag/components/auth/test_middleware.py
- openrag/components/indexer/loaders/serializer.py
- openrag/core/auth/__init__.py
- openrag/core/auth/state_cookie.py
- openrag/di/container.py
- openrag/services/auth/__init__.py
- openrag/services/inference/parsers/test_openai_audio.py
- openrag/services/orchestrators/auth_service.py
- openrag/services/orchestrators/query_service.py
- openrag/services/orchestrators/retrieval_service.py
- openrag/services/orchestrators/test_auth_service.py
- openrag/services/orchestrators/test_query_service.py
- openrag/services/persistence/oidc_session_repo.py
- openrag/services/workers/bootstrap.py
- openrag/services/workers/parsers/doc_serializer.py
- openrag/services/workers/parsers/doc_serializer_bridge.py
- openrag/services/workers/parsers/legacy_loaders/CustomDocLoader.py
- openrag/services/workers/parsers/legacy_loaders/CustomHTMLLoader.py
- openrag/services/workers/parsers/legacy_loaders/__init__.py
- openrag/services/workers/parsers/legacy_loaders/audio/__init__.py
- openrag/services/workers/parsers/legacy_loaders/audio/local_whisper.py
- openrag/services/workers/parsers/legacy_loaders/audio/openai.py
- openrag/services/workers/parsers/legacy_loaders/audio/test_openai.py
- openrag/services/workers/parsers/legacy_loaders/base.py
- openrag/services/workers/parsers/legacy_loaders/doc.py
- openrag/services/workers/parsers/legacy_loaders/docx.py
- openrag/services/workers/parsers/legacy_loaders/eml_loader.py
- openrag/services/workers/parsers/legacy_loaders/image.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/__init__.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/docling.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/docling2.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/dotsocr.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/marker.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/openai.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/pymupdf.py
- openrag/services/workers/parsers/legacy_loaders/pptx_loader.py
- openrag/services/workers/parsers/legacy_loaders/test_base_loader.py
- openrag/services/workers/parsers/legacy_loaders/test_customdocloader.py
- openrag/services/workers/parsers/legacy_loaders/test_doc_loader.py
- openrag/services/workers/parsers/legacy_loaders/test_docx_loader.py
- openrag/services/workers/parsers/legacy_loaders/test_eml_recursion.py
- openrag/services/workers/parsers/legacy_loaders/txt_loader.py
💤 Files with no reviewable changes (13)
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/dotsocr.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/openai.py
- openrag/services/workers/parsers/legacy_loaders/base.py
- openrag/services/workers/parsers/legacy_loaders/audio/openai.py
- openrag/services/workers/parsers/legacy_loaders/image.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/pymupdf.py
- openrag/services/workers/parsers/legacy_loaders/docx.py
- openrag/services/workers/parsers/legacy_loaders/doc.py
- openrag/components/indexer/loaders/serializer.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/marker.py
- openrag/services/workers/parsers/legacy_loaders/pptx_loader.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/docling2.py
- openrag/services/workers/parsers/legacy_loaders/pdf_loaders/docling.py
✅ Files skipped from review due to trivial changes (4)
- openrag/components/auth/state_cookie.py
- openrag/services/auth/__init__.py
- openrag/core/auth/__init__.py
- openrag/services/persistence/oidc_session_repo.py
🚧 Files skipped from review as they are similar to previous changes (3)
- openrag/api/routers/auth/oidc.py
- openrag/services/workers/parsers/doc_serializer.py
- openrag/services/orchestrators/query_service.py
Phase 12A: Migrate logger + monitoring to core/
First step of Phase 12 (internal cleanup). Relocates the cross-cutting observability utilities out of the legacy utils/ tree into the hexagonal core/ layer and repoints every consumer. This is the critical-path task that unblocks 12B–12G (it touches files those tasks also edit, so it lands first to avoid merge conflicts).
Module moves (git-tracked as renames)
- `utils/logger.py` → `core/utils/logging.py` (renamed to the structlog-wrapper convention)
- `utils/monitoring.py` → `core/observability/monitoring.py`
- `utils/test_logger.py` → `core/utils/test_logging.py`
`logging.py` now reads config via `from core.config import load_config`. To preserve the cached-singleton behavior the old shim provided, `core/config/__init__.py` (previously empty) now exposes the canonical `load_config` / `get_settings` / `Settings` public API.
Import re-points (61 files, every layer)
- `from utils.logger` → `from core.utils.logging` (all forms: `get_logger`, bare `logger`, `mask_email`)
- `from utils.monitoring` → `from core.observability.monitoring`
- `from utils.exceptions.base` / `from utils.exceptions.embeddings` → `from core.utils.exceptions`
- `from utils.external_resource_errors` → `from core.utils.external_errors`
The exceptions and `external_errors` modules already lived canonically in `core/utils/` (with shims in `utils/`); this only switches the remaining consumers off the shims. The `utils/` directory now holds re-export shims only; 12H deletes it.
Notes
- Deleted the duplicate `utils/test_external_resource_errors.py` (kept as `core/utils/test_external_errors.py`).
- Updated the `test_auth_router.py` `sys.modules` stub to patch `core.utils.logging`.
- Replaced `from core.utils.exceptions import *` (widened from 3 to ~38 names by the sweep, since `core/utils/exceptions.py` has no `__all__`) with an explicit 3-name import in the embeddings shim.
- Re-sorted the affected import blocks (`utils.*` sorted late, `core.*` sorts early) so `ruff check` stays clean.
- Updated the `from utils.logger` examples in `CLAUDE.md` and the Milvus-migration doc.
Out of scope (deferred per the phase plan): the bare-vs-`openrag.` import-style unification (Phase 14) and deletion of the remaining `utils/`, `config/`, `components/` shims (12H).
Verification
- `grep "from utils."` outside the shim dir → zero results
- `ruff check` / `ruff format --check` on all changed files → clean
Summary by CodeRabbit
New Features
Bug Fixes
Refactoring