Conversation

albertyosef (Collaborator) commented Sep 24, 2025

Stabilizes web_answering.py by removing syntax-breaking try/except blocks, consolidating service initialization, fixing citation assembly, and replacing leaked exception strings with generic 500 responses. Includes small timing/logging cleanups and an optional CI-safe tweak for the Tavily tests.


Features

  • Consistent timings & telemetry

    • Single start_time and per-phase timestamps with a uniform timings = {search|fetch|rank|generate|total} (see the sketch after this list).
  • Safer HTTP responses

    • Internal errors now return a generic 500 detail; full diagnostics stay in logs.
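
A minimal sketch of the timing pattern described above, assuming millisecond integers and illustrative variable names (the exact names in web_answering.py may differ):

import time

start_time = time.time()
timings = {}

search_start = time.time()
# ... run the search phase ...
timings["search"] = int((time.time() - search_start) * 1000)

fetch_start = time.time()
# ... run the fetch phase ...
timings["fetch"] = int((time.time() - fetch_start) * 1000)

# rank and generate are timed the same way, then:
timings["total"] = int((time.time() - start_time) * 1000)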

Fixes

  • Import/parsing errors

    • Removed duplicated try: and mismatched except blocks that caused SyntaxError/indentation failures on import.
  • Broken response assembly

    • Dedented citation building so success responses are constructed when used_citation_ids exist.
    • 422 now triggers only when no citations are produced, as intended.
  • Error leakage

    • Replaced str(e) in responses with generic messages; details are logged via logger.error (see the sketch after this list).
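
A hedged sketch of the pattern these fixes describe; the helper names here (run_phase, guard_citations) are illustrative, not the module's real functions:

from fastapi import HTTPException, status

def run_phase(call, phase_name, logger):
    """Run one pipeline phase; log the real error, return a generic 500 detail."""
    try:
        return call()
    except Exception as e:
        logger.error("%s failed: %s", phase_name, e)  # full diagnostics stay in logs
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=f"{phase_name} failed.",  # generic message instead of str(e)
        )

def guard_citations(citations):
    """Return 422 only when no citations were produced."""
    if not citations:
        raise HTTPException(
            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
            detail="No citations could be produced for this query.",
        )
    return citations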

Refactors

  • Initialization

    • Collapsed service setup (Search/Fetch/Rank/Synthesis) into one guarded block; on failure: log + generic 500 (see the sketch after this list).
  • Cleanup

    • Removed duplicate timing/logging lines and unused imports.
    • Positioned start_time = time.time() immediately after docstring for clarity.
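
A minimal sketch of the consolidated, guarded initialization; it takes zero-argument factories instead of naming the concrete Search/Fetch/Rank/Synthesis classes, since those class names are not spelled out here:

from fastapi import HTTPException, status

def init_services(factories, logger):
    """Instantiate all pipeline services in one guarded block."""
    try:
        return {name: make() for name, make in factories.items()}
    except Exception as e:
        logger.error("Failed to initialize services: %s", e)
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Service initialization error.",
        )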

Tests (related, optional to include)

  • tests/test_tavily_search.py is now CI-safe (see the sketch after this list):

    • Skips when TAVILY_API_KEY is absent.
    • Uses pytest-asyncio marker.
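
A sketch of the CI-safe guard, assuming the test module skips itself when the key is missing (exact test names in tests/test_tavily_search.py may differ):

import os
import pytest

pytestmark = pytest.mark.skipif(
    not os.getenv("TAVILY_API_KEY"),
    reason="TAVILY_API_KEY is not set; skipping live Tavily tests",
)

@pytest.mark.asyncio
async def test_tavily_search_returns_results():
    ...  # call the Tavily-backed search service and assert on the results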

How to QA

  • Import smoke test

    • python -c "import sys, pathlib; sys.path.insert(0, str(pathlib.Path('core').resolve())); import importlib; importlib.import_module('api.endpoints.web_answering'); print('IMPORT_OK')"
  • Runtime

    • Hit /websearch/answer with a valid payload (or mocked services); verify timings, citations, and generic 500s (see the example after this list).
  • Lint/Typecheck (optional)

    • Run ruff/flake8/mypy as per repo standards.
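
A hedged example of the runtime check using httpx; the local URL, port, and payload fields are assumptions based on the PR text, and the endpoint's verify_internal_api_key dependency may additionally require an auth header:

import httpx

payload = {
    "query": "example question",
    "top_k_results": 5,
    "timeout_seconds": 20,
}
resp = httpx.post("http://localhost:8000/websearch/answer", json=payload, timeout=30)
print(resp.status_code)
body = resp.json()
print(body.get("timings"), body.get("citations"))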

Risks & Mitigations

  • Behavioral changes on 500s: Clients expecting raw exception text will now see a generic message. Mitigated by preserving detailed logs and maintaining phase-specific status codes (timeouts/4xx) unchanged.

Summary by CodeRabbit

  • New Features

    • Web search answering now runs as a multi-phase pipeline (search, fetch, rank, synthesize) with phase-level timeouts, timings breakdown, and stable citation ordering in responses.
    • Logging can be output as structured JSON (pretty or compact) and written to rotating log files; per-request logging context is supported.
  • Refactor

    • Timing calculation moved to cover the full end-to-end flow for more accurate metrics.
  • Notes

    • Public request/response models and logging APIs were extended; clients may see new fields (timings, citations).

coderabbitai bot commented Sep 24, 2025

Walkthrough

Replaces module-level logging with a project-wide JSON logging_config (context-aware, rotating file support, adapter-based loggers) and rewrites the web_answering endpoint into a stepwise multi-phase pipeline (search → fetch → rank → generate) with per-phase timing, timeouts, granular logging, and new request/response models and helpers.

Changes

Cohort / File(s): Summary

  • Logging infrastructure (services/logging_config.py): Introduces JsonFormatter (ISO-8601 UTC times, host/pid, safe JSON serialization, pretty mode), setup_logging(), add_rotating_file_handler(), context-aware logging via a contextvar, RequestLoggerAdapter, get_logger(), and the helpers set_log_context/get_log_context/clear_log_context. A usage sketch follows this table.
  • Web answering endpoint & API models (api/endpoints/web_answering.py): Replaces ad-hoc logging with get_logger plus per-request set/clear context; changes the flow to a multi-phase pipeline (search, fetch, rank, generate) with phase-specific timing and timeouts; expands error handling; adds public request/response dataclasses/models (WebSearchAnswerRequest, Citation, Timings, Meta, WebSearchAnswerResponse) and a verify_internal_api_key dependency.
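
A hedged usage sketch of this logging API; the function names come from the summary above, but argument shapes (for example, whether set_log_context accepts keyword arguments) are assumptions:

from services.logging_config import (
    setup_logging,
    get_logger,
    set_log_context,
    clear_log_context,
    add_rotating_file_handler,
)

setup_logging()                              # JSON logging to stdout
add_rotating_file_handler("logs/app.jsonl")  # optional rotating file output
logger = get_logger(__name__)

set_log_context(request_id="req-123")        # per-request fields via the contextvar
logger.info("handling request")              # emitted with the context attached
clear_log_context()                          # avoid context bleed between requests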

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as web_answering endpoint
    participant Logger
    participant SearchSvc as Search
    participant FetchSvc as Fetch
    participant RankSvc as Rank
    participant GenSvc as Generate

    Client->>API: POST /web-answer (WebSearchAnswerRequest)
    API->>Logger: set_log_context(request_id,...)
    API->>SearchSvc: search(query)  %% phase: search
    alt search timed out or empty
        SearchSvc-->>API: timeout/empty
        API->>Logger: log phase timeout/empty
        API-->>Client: response (no results)
    else search results
        SearchSvc-->>API: snippets/ids
        API->>FetchSvc: fetch(snippets/ids)  %% phase: fetch
        alt fetch includes snippets
            FetchSvc-->>API: documents
        else fetch failed/timeout
            FetchSvc-->>API: timeout/error
            API->>Logger: log phase timeout/error
        end
        API->>RankSvc: rank(evidence)  %% phase: rank
        RankSvc-->>API: ranked_evidence
        API->>GenSvc: synthesize(ranked_evidence)  %% phase: generate
        GenSvc-->>API: answer, citations
        API->>Logger: clear_log_context()
        API-->>Client: WebSearchAnswerResponse (answer, citations, timings)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • fehranbit

Poem

"I hopped through logs and phases bright,
I timed each search from morn to night.
From sniff to fetch, through rank and pen,
I stitched the web answers back again.
— A happy rabbit, logging light 🐇✨"

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 36.36%, which is insufficient; the required threshold is 80.00%. Resolution: you can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title concisely and accurately summarizes the primary change set: stabilizing control flow in web_answering, removing stray try/except blocks, and hardening error handling. It matches the PR objectives and file-level summaries and is specific enough for a reviewer to understand the main intent.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (8)
api/endpoints/web_answering.py (8)

153-166: Unify search time budget in a single variable

Avoid recomputing the 40% time budget twice and risking drift. Compute once and reuse for both the inner call and wait_for.

Apply this diff:

-    search_start = time.time()
+    search_start = time.time()
+    search_budget_s = min(request.timeout_seconds * 0.4, 10.0)
     try:
         search_results = await asyncio.wait_for(
             search_service.search(
                 query=request.query,
                 k=request.top_k_results,
                 region=request.region,
                 language=request.language,
-                timeout_seconds=min(request.timeout_seconds * 0.4, 10)  # Allocate 40% of timeout
+                timeout_seconds=search_budget_s  # Allocate 40% of timeout (capped at 10s)
             ),
-            timeout=request.timeout_seconds * 0.4
+            timeout=search_budget_s
         )

182-191: Align fetch timeouts and consider removing double timeout

Use a single computed budget to avoid mismatches. If fetch_search_results already enforces its own timeout, consider dropping outer wait_for to prevent double cancellations.

Apply this diff:

-    fetch_start = time.time()
+    fetch_start = time.time()
+    fetch_budget_s = min(request.timeout_seconds * 0.3, 8.0)
     try:
         fetched_docs = await asyncio.wait_for(
             fetch_service.fetch_search_results(
                 search_results=search_results,
-                timeout_seconds=min(request.timeout_seconds * 0.3, 8),  # Allocate 30% of timeout
+                timeout_seconds=fetch_budget_s,  # Allocate 30% of timeout (capped at 8s)
                 preserve_snippets=request.include_snippets
             ),
-            timeout=request.timeout_seconds * 0.3
+            timeout=fetch_budget_s
         )

206-215: Unify ranking time budget and add generic exception handling

  • Compute the 15% budget once and reuse.
  • Also catch unexpected exceptions to log and return a generic 500 instead of leaking errors.

Apply this diff:

-    rank_start = time.time()
+    rank_start = time.time()
+    rank_budget_s = request.timeout_seconds * 0.15
     try:
         ranked_evidence = await asyncio.wait_for(
             ranking_service.rank_documents(
                 query=request.query,
                 docs=fetched_docs,
                 max_context_chars=request.max_context_chars
             ),
-            timeout=request.timeout_seconds * 0.15  # Allocate 15% of timeout
+            timeout=rank_budget_s  # Allocate 15% of timeout
         )
     except asyncio.TimeoutError:
         logger.warning(f"Ranking timed out after {request.timeout_seconds * 0.15}s", extra=logger_ctx)
         raise HTTPException(
             status_code=status.HTTP_504_GATEWAY_TIMEOUT,
             detail="Ranking phase timed out."
         )
+    except Exception as e:
+        logger.error("Ranking failed: %s", e, extra=logger_ctx)
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail="Ranking phase failed."
+        )

232-242: Unify synthesis time budget and add generic exception handling

  • Compute the 15% budget once and reuse.
  • Catch unexpected exceptions and return a generic 500 while logging.

Apply this diff:

-    generate_start = time.time()
+    generate_start = time.time()
+    synth_budget_s = request.timeout_seconds * 0.15
     try:
         synthesis_result = await asyncio.wait_for(
             synthesis_service.generate_answer(
                 query=request.query,
                 evidence_list=ranked_evidence,
                 answer_tokens=request.answer_tokens,
                 style_profile_id=request.style_profile_id
             ),
-            timeout=request.timeout_seconds * 0.15  # Allocate remaining time
+            timeout=synth_budget_s  # Allocate remaining time
         )
     except asyncio.TimeoutError:
         logger.warning(f"Synthesis timed out after {request.timeout_seconds * 0.15}s", extra=logger_ctx)
         raise HTTPException(
             status_code=status.HTTP_504_GATEWAY_TIMEOUT,
             detail="Synthesis phase timed out."
         )
+    except Exception as e:
+        logger.error("Synthesis failed: %s", e, extra=logger_ctx)
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail="Synthesis phase failed."
+        )

156-171: Add generic exception handling to all phases

To fully meet the goal of not leaking exception details and to ensure consistent logging, add broad exception handling (in addition to TimeoutError) for search and fetch too.

Apply these diffs:

     try:
         search_results = await asyncio.wait_for(
             search_service.search(
                 query=request.query,
                 k=request.top_k_results,
                 region=request.region,
                 language=request.language,
                 timeout_seconds=search_budget_s  # Allocate 40% of timeout (capped at 10s)
             ),
             timeout=search_budget_s
         )
     except asyncio.TimeoutError:
         logger.warning(f"Search timed out after {request.timeout_seconds * 0.4}s", extra=logger_ctx)
         raise HTTPException(
             status_code=status.HTTP_504_GATEWAY_TIMEOUT,
             detail="Search phase timed out."
         )
+    except Exception as e:
+        logger.error("Search failed: %s", e, extra=logger_ctx)
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail="Search phase failed."
+        )
     try:
         fetched_docs = await asyncio.wait_for(
             fetch_service.fetch_search_results(
                 search_results=search_results,
                 timeout_seconds=fetch_budget_s,  # Allocate 30% of timeout (capped at 8s)
                 preserve_snippets=request.include_snippets
             ),
             timeout=fetch_budget_s
         )
     except asyncio.TimeoutError:
         logger.warning(f"Fetch timed out after {request.timeout_seconds * 0.3}s", extra=logger_ctx)
         # Continue with whatever we got
         fetched_docs = []
+    except Exception as e:
+        logger.error("Fetch failed: %s", e, extra=logger_ctx)
+        # Continue with empty docs; ranking will handle and return 422 if needed
+        fetched_docs = []

Also applies to: 183-203, 208-221, 234-248


262-275: Harden citation snippet building against None/non-str

Guard against None or non-string passage to avoid TypeError during slicing.

Apply this diff:

-    for evidence in ranked_evidence:
+    for evidence in ranked_evidence:
         if evidence.id in synthesis_result.used_citation_ids:
+            passage = evidence.passage or ""
+            snippet = (passage[:200] + "...") if len(passage) > 200 else passage
             citations.append(Citation(
                 id=evidence.id,
                 url=evidence.url,
                 title=evidence.title,
                 site_name=evidence.site_name,
                 published_at=evidence.published_at,
-                snippet=evidence.passage[:200] + "..." if len(evidence.passage) > 200 else evidence.passage,
+                snippet=snippet,
                 score=evidence.score
             ))

126-131: Prefer parameterized logging over f-strings

Use logger’s lazy formatting to avoid building strings when the log level is disabled and for consistency.

Example:

-logger.info(
-    f"Web search answer request: query='{request.query}', "
-    f"top_k={request.top_k_results}, timeout={request.timeout_seconds}s",
-    extra=logger_ctx
-)
+logger.info(
+    "Web search answer request: query='%s', top_k=%d, timeout=%ss",
+    request.query, request.top_k_results, request.timeout_seconds,
+    extra=logger_ctx
+)

Also applies to: 167-167, 179-179, 193-193, 203-203, 217-217, 229-229, 244-244, 294-297


79-85: Consider time.perf_counter for higher-resolution timings

perf_counter is monotonic and better for measuring elapsed durations.

Example:

-start_time = time.time()
+start_time = time.perf_counter()
...
-timings["search"] = int((time.time() - search_start) * 1000)
+timings["search"] = int((time.perf_counter() - search_start) * 1000)
...
-timings["total"] = int((time.time() - start_time) * 1000)
+timings["total"] = int((time.perf_counter() - start_time) * 1000)

Also applies to: 251-253

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 483f3b1 and 8d63cfa.

📒 Files selected for processing (1)
  • api/endpoints/web_answering.py (1 hunks)
🔇 Additional comments (2)
api/endpoints/web_answering.py (2)

120-120: LGTM: start_time placement

Starting the timer immediately after the docstring is fine and keeps service initialization within the total timing.


47-67: Pydantic v2 dependency verified — no changes needed
requirements.txt pins pydantic>=2.6.0,<3.0.0 and api/endpoints/web_answering.py correctly imports field_validator (Pydantic v2 API).

paytoniga and others added 2 commits September 24, 2025 08:03
* Enhance structured logging with improved JSON formatting and exception handling

* Refactor logging setup to prevent duplicate stdout handlers and improve handler management
…xt-aware logging, and update JSON formatter for better structured output.
coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (11)
api/endpoints/web_answering.py (7)

155-167: Log service init failures with stack traces and preserve cause; satisfy Ruff TRY400/B904.

Use logger.exception() and chain the HTTPException.

Apply:

-        except Exception as e:
-            logger.error("Failed to initialize services: %s", e)
-            raise HTTPException(
+        except Exception as err:  # noqa: BLE001
+            logger.exception("Failed to initialize services")
+            raise HTTPException(
                 status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
                 detail="Service initialization error.",
-            )
+            ) from err

168-187: Align inner/outer timeouts; fix exception chaining per Ruff B904.

Compute a single search budget and use it for both the service call and wait_for. Chain the HTTPException with from None to avoid misleading tracebacks.

-        search_start = time.time()
+        search_start = time.time()
+        budget_search = min(request.timeout_seconds * 0.4, 10)
         try:
             search_results = await asyncio.wait_for(
                 search_service.search(
                     query=request.query,
                     k=request.top_k_results,
                     region=request.region,
                     language=request.language,
-                    timeout_seconds=min(request.timeout_seconds * 0.4, 10),  # Allocate 40% of timeout
+                    timeout_seconds=budget_search,  # Allocate 40% of timeout
                 ),
-                timeout=request.timeout_seconds * 0.4,
+                timeout=budget_search,
             )
         except asyncio.TimeoutError:
             logger.warning(f"Search timed out after {request.timeout_seconds * 0.4}s")
-            raise HTTPException(
+            raise HTTPException(
                 status_code=status.HTTP_504_GATEWAY_TIMEOUT,
                 detail="Search phase timed out.",
-            )
+            ) from None

197-207: Reuse a single fetch budget for consistency.

Keeps the actual operation and wait bound aligned.

-        fetch_start = time.time()
+        fetch_start = time.time()
+        budget_fetch = min(request.timeout_seconds * 0.3, 8)
         try:
             fetched_docs = await asyncio.wait_for(
                 fetch_service.fetch_search_results(
                     search_results=search_results,
-                    timeout_seconds=min(request.timeout_seconds * 0.3, 8),  # Allocate 30% of timeout
+                    timeout_seconds=budget_fetch,  # Allocate 30% of timeout
                     preserve_snippets=request.include_snippets,
                 ),
-                timeout=request.timeout_seconds * 0.3,
+                timeout=budget_fetch,
             )

207-218: Graceful fallback when fetch times out and snippets were requested.

Currently, a fetch timeout yields [], leading to 422 later. If snippets were requested, consider falling back to ranking over search snippets to avoid unnecessary failures.

Would you like me to draft a fallback that maps search_results into minimal doc objects (id/url/title/passage=snippet) when fetched_docs is empty and include_snippets is True?
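
A rough sketch of that fallback (a fragment meant to slot in after the fetch phase; the attribute names on the search result objects are assumptions):

if not fetched_docs and request.include_snippets:
    fetched_docs = [
        {
            "id": r.id,
            "url": r.url,
            "title": r.title,
            "passage": r.snippet,  # rank over the search snippet instead of the page body
        }
        for r in search_results
    ]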


231-236: Chain HTTPException to satisfy Ruff B904.

-            raise HTTPException(
+            raise HTTPException(
                 status_code=status.HTTP_504_GATEWAY_TIMEOUT,
                 detail="Ranking phase timed out.",
-            )
+            ) from None

259-263: Chain HTTPException to satisfy Ruff B904.

-            raise HTTPException(
+            raise HTTPException(
                 status_code=status.HTTP_504_GATEWAY_TIMEOUT,
                 detail="Synthesis phase timed out.",
-            )
+            ) from None

270-276: 422 for no citations is correct; consider including a hint in logs about budgets.

Good guard. Consider logging remaining overall budget for debugging flakiness.

services/logging_config.py (4)

26-26: Mutable default for ContextVar; silence Ruff B039 or switch to an immutable default.

Either mark the line or use an immutable mapping to avoid the warning.

Option A (minimal):

-_log_context: contextvars.ContextVar[Dict[str, Any]] = contextvars.ContextVar("_log_context", default={})
+_log_context: contextvars.ContextVar[Dict[str, Any]] = contextvars.ContextVar("_log_context", default={})  # noqa: B039

Option B (immutable default, preferred):

+from types import MappingProxyType
-_log_context: contextvars.ContextVar[Dict[str, Any]] = contextvars.ContextVar("_log_context", default={})
+_log_context = contextvars.ContextVar("_log_context", default=MappingProxyType({}))

If you choose option B, adjust calls like _log_context.get({}) to _log_context.get() and keep using dict(...) where mutation is needed.


66-69: Don’t catch broad Exception around json.dumps; catch TypeError.

Keeps noise out of logs and satisfies linters.

-            except Exception:
+            except TypeError:
                 base[k] = str(v)

90-99: Narrow exception when parsing LOG_LEVEL env; satisfies Ruff BLE001.

Only ValueError is expected from int().

-            try:
-                level = int(env_level)
-            except Exception:
-                level = getattr(logging, env_level.upper(), logging.INFO)
+            try:
+                level = int(env_level)
+            except ValueError:
+                level = getattr(logging, env_level.upper(), logging.INFO)

128-137: Make add_rotating_file_handler idempotent; add UTF‑8 encoding.

Prevents duplicate file handlers on reload and ensures encoding.

-def add_rotating_file_handler(path: str, max_bytes: int = 10_000_000, backup_count: int = 3, *, level: Optional[int] = None,
-                              pretty: bool = False) -> RotatingFileHandler:
+def add_rotating_file_handler(path: str, max_bytes: int = 10_000_000, backup_count: int = 3, *, level: Optional[int] = None,
+                              pretty: bool = False) -> RotatingFileHandler:
     """Add a RotatingFileHandler to the root logger and return it."""
-    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
+    import os as _os
+    abs_path = _os.path.abspath(path)
+    root = logging.getLogger()
+    for h in root.handlers:
+        if isinstance(h, RotatingFileHandler) and getattr(h, "baseFilename", None) == abs_path:
+            return h
+    handler = RotatingFileHandler(abs_path, maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8")
     handler.setFormatter(JsonFormatter(pretty=pretty))
     if level is None:
         level = logging.getLogger().level
     handler.setLevel(level)
-    logging.getLogger().addHandler(handler)
+    root.addHandler(handler)
     return handler
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d63cfa and 375ed17.

📒 Files selected for processing (2)
  • api/endpoints/web_answering.py (2 hunks)
  • services/logging_config.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
api/endpoints/web_answering.py (1)
services/logging_config.py (5)
  • setup_logging (76-125)
  • get_logger (171-174)
  • set_log_context (157-160)
  • clear_log_context (167-168)
  • add_rotating_file_handler (128-137)
🪛 Ruff (0.13.1)
services/logging_config.py

26-26: Do not use mutable data structures for ContextVar defaults

(B039)


38-38: Unused method argument: datefmt

(ARG002)


68-68: Do not catch blind exception: Exception

(BLE001)


95-95: Do not catch blind exception: Exception

(BLE001)

api/endpoints/web_answering.py

161-161: Do not catch blind exception: Exception

(BLE001)


162-162: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


163-166: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


183-186: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


233-236: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


260-263: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (3)
api/endpoints/web_answering.py (2)

315-317: Nice success log with totals and citations.

LGTM.


320-322: Good hygiene clearing per-request log context.

Prevents context bleed between requests.

services/logging_config.py (1)

171-175: get_logger adapter looks good.

API is clear and composes well with contextvars.

Comment on lines +29 to +39
setup_logging()
LOG_FILE = os.getenv("LOG_FILE")
if LOG_FILE:
    # Add a rotating file handler if LOG_FILE is set in the environment
    try:
        add_rotating_file_handler(LOG_FILE)
    except Exception:
        # Avoid raising on startup if file handler can't be added
        logging.getLogger(__name__).exception("Failed to add rotating file handler")

logger = get_logger(__name__)

⚠️ Potential issue

Avoid configuring logging in an importable module; dedup rotating file handler.

Module import side effects can add duplicate handlers (especially under reload) and override app-level logging. Move setup to app startup, or at minimum guard file handler addition.

Apply this diff to make the endpoint side‑effect free (preferred):

-setup_logging()
-LOG_FILE = os.getenv("LOG_FILE")
-if LOG_FILE:
-    # Add a rotating file handler if LOG_FILE is set in the environment
-    try:
-        add_rotating_file_handler(LOG_FILE)
-    except Exception:
-        # Avoid raising on startup if file handler can't be added
-        logging.getLogger(__name__).exception("Failed to add rotating file handler")
-
 logger = get_logger(__name__)

If you must keep file logging here, dedup handlers by path:

-LOG_FILE = os.getenv("LOG_FILE")
-if LOG_FILE:
-    # Add a rotating file handler if LOG_FILE is set in the environment
-    try:
-        add_rotating_file_handler(LOG_FILE)
-    except Exception:
-        # Avoid raising on startup if file handler can't be added
-        logging.getLogger(__name__).exception("Failed to add rotating file handler")
+LOG_FILE = os.getenv("LOG_FILE")
+if LOG_FILE:
+    try:
+        root = logging.getLogger()
+        # prevent duplicates on reload
+        if not any(
+            getattr(h, "baseFilename", None) == os.path.abspath(LOG_FILE)
+            for h in root.handlers
+            if h.__class__.__name__ == "RotatingFileHandler"
+        ):
+            add_rotating_file_handler(LOG_FILE)
+    except Exception:
+        logging.getLogger(__name__).exception("Failed to add rotating file handler")

Add this import outside the changed range:

from logging.handlers import RotatingFileHandler  # needed only if the dedup check uses isinstance
🤖 Prompt for AI Agents
In api/endpoints/web_answering.py around lines 29 to 39, the module configures
logging at import time which causes side effects and can add duplicate rotating
file handlers; move setup_logging() and any add_rotating_file_handler(LOG_FILE)
calls into the application startup/init routine so the endpoint module remains
side‑effect free. If you cannot move it, add an import for RotatingFileHandler
and guard handler addition by scanning logging.getLogger().handlers for an
existing RotatingFileHandler with the same baseFilename (or same handler class
and filename) and only create/add a new RotatingFileHandler when none exists,
and keep the try/except but log the exception including context.

@fehranbit self-requested a review September 25, 2025 05:55
@fehranbit merged commit 77ca247 into main Sep 25, 2025
1 check passed