
Conversation

alvabba (Collaborator) commented Sep 20, 2025

Short summary: Adds an internal, citation-first web answering endpoint that searches, fetches, ranks, and synthesizes Markdown answers with inline footnotes and a References block. Handles partial failures and returns timings/meta for observability.

New Features

  • Endpoint (POST /internal/websearch/answer, X-Internal-API-Key):

    • Request: query, top_k_results=8, max_context_chars=20000, region="auto", language="en", style_profile_id?, answer_tokens=800, include_snippets=true, timeout_seconds=25.
    • Response: { answer_markdown, citations[], used_sources_count, timings_ms, meta }.
  • Router: Included under /internal with tag websearch.

  • Web Fetch (httpx.AsyncClient, semaphore MAX_FETCH_CONCURRENCY):

    • Fetch only returned URLs; timeout-aware; graceful partial failure.
    • Text extraction via trafilatura with HTML→text fallback.
  • Ranking: Passage split (~800–1200 chars + small overlap), cosine similarity via EmbeddingService, trimmed to max_context_chars (see the ranking sketch after this list).

  • Synthesis: Evidence-first prompt, inline [^i] markers, References block, (citation needed) for unsupported claims, optional style capsule, return only actually-cited sources.
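
A minimal sketch of the ranking step above, assuming an embed_texts(texts) helper that returns one vector per input string (as the repo's EmbeddingService appears to expose); the function and parameter names here are illustrative, not the actual implementation:

import numpy as np

def rank_passages(query: str, passages: list[str], max_context_chars: int, embed_texts) -> list[str]:
    # Embed the query and all candidate passages
    query_vec = np.asarray(embed_texts([query])[0])
    passage_vecs = np.asarray(embed_texts(passages))
    # Cosine similarity with zero-norm guards to avoid NaNs
    q_norm = np.linalg.norm(query_vec) or 1.0
    p_norms = np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    p_norms[p_norms == 0] = 1.0
    scores = (passage_vecs / p_norms) @ (query_vec / q_norm)
    # Greedily keep the highest-scoring passages under the character budget
    selected, used = [], 0
    for i in np.argsort(scores)[::-1]:
        if used + len(passages[i]) > max_context_chars:
            break
        selected.append(passages[i])
        used += len(passages[i])
    return selected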

Improvements

  • Resilience & Observability: Best-effort answers under timeouts/throttling; timings_ms + request_id in logs (see the timing sketch after this list).
  • Consistency: Normalized search → fetch → rank → synth pipeline; markers align with citations[].
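
A minimal sketch of one way to collect those per-phase timings (names are illustrative, not the endpoint's actual code):

import time
from contextlib import contextmanager

timings_ms: dict[str, int] = {}

@contextmanager
def timed(phase: str):
    # Record a phase's wall-clock duration in milliseconds
    start = time.perf_counter()
    try:
        yield
    finally:
        timings_ms[phase] = int((time.perf_counter() - start) * 1000)

# Inside the async handler the body of the with-block can still await:
# with timed("search"):
#     results = await search_service.search(query)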

Security & Config

  • Guarded by X-Internal-API-Key; internal-only under /internal (see the key-check sketch after this list).
  • Works with dummy and tavily; per-request timeout_seconds; controlled concurrency via env.
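
A minimal sketch of the header guard, assuming the expected key is read from an INTERNAL_API_KEY environment variable (the variable name is an assumption):

import os
from fastapi import Header, HTTPException, status

async def verify_internal_api_key(x_internal_api_key: str = Header(default="")) -> None:
    # FastAPI maps this parameter to the X-Internal-API-Key header
    expected = os.getenv("INTERNAL_API_KEY")  # assumed config source
    if not expected or x_internal_api_key != expected:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or missing internal API key",
        )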

Summary by CodeRabbit

  • New Features
    • New web search answering API that returns concise Markdown answers with source citations. It supports per-phase timing metadata, region/language selection, style profiles, optional snippets, and configurable answer length and timeout, with improved relevance ranking and synthesis and clearer timeout/error handling.
  • Tests
    • Added an integration test validating real web search results via an external provider.

coderabbitai bot commented Sep 20, 2025

Walkthrough

Adds a Perplexity-style web-answering FastAPI endpoint with internal API-key auth, plus new RankingService and SynthesisService modules, a pipeline with per-phase timeouts (search → fetch → rank → synthesize), request/response models, an integration test for Tavily, and related dataclasses/models.
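
A minimal sketch of the per-phase budgeting pattern implied by the sequence diagram below (40/30/15/15 split); in the real pipeline some phases fall back instead of raising, so this shows only the failure-path shape:

import asyncio
from fastapi import HTTPException, status

PHASE_BUDGETS = {"search": 0.40, "fetch": 0.30, "rank": 0.15, "synthesize": 0.15}

async def run_phase(name: str, coro, total_timeout: float):
    # Give each phase its fixed share of the request's overall timeout
    try:
        return await asyncio.wait_for(coro, timeout=total_timeout * PHASE_BUDGETS[name])
    except asyncio.TimeoutError:
        raise HTTPException(
            status_code=status.HTTP_504_GATEWAY_TIMEOUT,
            detail=f"{name.capitalize()} phase timed out.",
        ) from None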

Changes

Cohort / File(s) Change Summary
Web search answering endpoint
api/endpoints/web_answering.py
New FastAPI router with POST /websearch/answer, Pydantic request/response models, verify_internal_api_key dependency, request logging and id, four-phase pipeline (search → fetch → rank → synthesize) with per-phase timeouts/fallbacks, timings collection, citation assembly, and HTTP error handling.
Ranking service
services/ranking_service.py
New RankingService and RankedEvidence dataclass: splits fetched docs into passages, embeds query & passages, computes cosine similarities, ranks passages, selects/truncates passages under max_context_chars, and logs performance.
Synthesis service
services/synthesis_service.py
New SynthesisService and SynthesisResult dataclass: builds system/user prompts with an evidence block and optional style profiles, calls OpenAI chat API to generate Markdown answers, extracts citation IDs, records token usage and generation time, and returns structured fallback results on error (a sketch of the evidence block follows this table).
Integration test (Tavily)
tests/test_tavily_search.py
New async integration test test_tavily_search_returns_results using WebSearchService(provider_name="tavily") to perform a live search and assert non-empty results (skips if TAVILY_API_KEY not set).
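
For the synthesis service summarized above, a minimal sketch of the numbered evidence block that lets the model emit [^i] markers; the RankedEvidence fields shown are a trimmed stand-in for the repo's dataclass, and the format is illustrative:

from dataclasses import dataclass

@dataclass
class RankedEvidence:
    id: int
    url: str
    title: str
    passage: str

def build_evidence_block(evidence: list[RankedEvidence]) -> str:
    # Number each source so the model can cite it as [^i]
    blocks = [f"[{e.id}] {e.title} ({e.url})\n{e.passage}" for e in evidence]
    return "\n\n".join(blocks)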

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant API as FastAPI /websearch/answer
  participant Search as WebSearchService
  participant Fetch as WebFetchService
  participant Rank as RankingService
  participant Synth as SynthesisService

  Client->>API: POST /websearch/answer (query, params, X-Internal-API-Key)
  API->>API: verify_internal_api_key
  rect rgb(235,245,255)
    note over API: Phase 1 — Search (<=40% timeout)
    API->>Search: search(query, top_k_results, region, language, timeout)
    Search-->>API: results or timeout/empty
  end
  alt no results
    API-->>Client: 422 Unprocessable Entity
  else results
    rect rgb(235,245,255)
      note over API: Phase 2 — Fetch (<=30% timeout)
      API->>Fetch: fetch(results, timeout, include_snippets)
      Fetch-->>API: fetched docs or empty
    end
    alt no docs (and snippets not allowed)
      API-->>Client: 422 Unprocessable Entity
    else docs
      rect rgb(235,245,255)
        note over API: Phase 3 — Rank (<=15% timeout)
        API->>Rank: rank_documents(query, docs, max_context_chars)
        Rank-->>API: ranked evidence or empty
      end
      alt no evidence
        API-->>Client: 422 Unprocessable Entity
      else evidence
        rect rgb(235,245,255)
          note over API: Phase 4 — Synthesis (remaining timeout)
          API->>Synth: generate_answer(query, evidence, answer_tokens, style_profile_id)
          Synth-->>API: answer_markdown, used_citation_ids, token usage
        end
        API->>API: assemble citations, timings_ms, meta
        API-->>Client: 200 WebSearchAnswerResponse
      end
    end
  end
  par Any phase timeout
    API-->>Client: 504 Gateway Timeout
  end
  par Unexpected error
    API-->>Client: 500 Internal Server Error
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • fehranbit

Poem

I hop through links with whiskers keen,
I gather snippets, stitch each scene.
I number sources, bind them true,
Turn scattered pages into view.
A rabbit's hop brings answers new. 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Docstring Coverage ⚠️ Warning: docstring coverage is 0.00%, below the required 80.00% threshold. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
Title Check ✅ Passed: the PR title "Web Answering API: Cited Answers via Fetch→Rank→Synthesize" succinctly captures the primary change, an internal web-answering API that returns citation-aware answers via a fetch→rank→synthesize pipeline. It is concise, specific, and developer-focused, so a teammate scanning history can understand the main purpose without extraneous detail.

coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (11)
services/synthesis_service.py (2)

133-148: Harden citation extraction to body-only and filter to valid IDs.

Regex currently counts markers in the References block and any hallucinated IDs. Limit to the body (before “References”) and intersect with evidence IDs.

Apply this diff:

-def _extract_citation_ids(self, text: str) -> Set[int]:
+def _extract_citation_ids(self, text: str, valid_ids: Optional[Set[int]] = None) -> Set[int]:
@@
-        # Find all [^i] citations
-        citation_markers = re.findall(r'\[\^(\d+)\]', text)
-        
-        # Convert to integers and return as a set
-        return {int(id) for id in citation_markers if id.isdigit()}
+        # Consider citations only in the answer body (before "References")
+        body = text.split("\nReferences", 1)[0]
+        ids = {int(m) for m in re.findall(r'\[\^(\d+)\]', body)}
+        if valid_ids is not None:
+            ids &= set(valid_ids)
+        return ids

And pass the valid IDs when extracting:

-            used_citation_ids = self._extract_citation_ids(answer)
+            used_citation_ids = self._extract_citation_ids(
+                answer, valid_ids={e.id for e in evidence_list}
+            )

Also applies to: 195-195


50-56: Prompt looks good. Consider moving model name and tone defaults to settings.

Keeps config centralized and testable.
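
A minimal sketch of that centralization using pydantic-settings (setting names and the env prefix are illustrative assumptions):

from pydantic_settings import BaseSettings, SettingsConfigDict

class SynthesisSettings(BaseSettings):
    # Overridable via env, e.g. SYNTHESIS_OPENAI_MODEL=gpt-4o-mini
    model_config = SettingsConfigDict(env_prefix="SYNTHESIS_")
    openai_model: str = "gpt-4o"
    temperature: float = 0.2
    default_answer_tokens: int = 800

settings = SynthesisSettings()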

services/ranking_service.py (2)

129-140: Avoid relying on exceptions for control flow; handle empty embedding responses.

embed_texts([query])[0] will IndexError if API returns []. Check lengths before indexing.

Apply this diff:

-            # Embed query
-            query_embedding = embed_texts([query])[0]
-            
-            # Embed passages
-            passage_embeddings = embed_texts(passages)
+            # Embed query
+            q_embeds = embed_texts([query])
+            if not q_embeds:
+                raise RuntimeError("Empty query embedding")
+            query_embedding = q_embeds[0]
+
+            # Embed passages
+            passage_embeddings = embed_texts(passages)
+            if not passage_embeddings:
+                raise RuntimeError("Empty passage embeddings")

141-144: Prefer logger.exception and avoid f-strings in logs.

Improves stack traces and avoids needless string interpolation.

Apply this diff:

-            logger.error(f"Error during embedding or similarity computation: {str(e)}")
+            logger.exception("Error during embedding or similarity computation")
api/endpoints/web_answering.py (5)

111-112: Unused dependency variable; use _ to signal intent.

Keeps API key check while silencing “unused variable”.

Apply this diff:

-    api_key: str = Depends(verify_internal_api_key)
+    _: None = Depends(verify_internal_api_key)

21-25: Import FetchedDoc for snippet-only fallback.

Needed to build docs when fetch times out.

Apply this diff:

-from services.web_fetch_service import WebFetchService
+from services.web_fetch_service import WebFetchService, FetchedDoc

195-199: Timeout fallback: preserve snippets instead of dropping all docs.

If fetch times out, degrade gracefully by constructing snippet-backed docs when include_snippets=True.

Apply this diff:

-        except asyncio.TimeoutError:
-            logger.warning(f"Fetch timed out after {request.timeout_seconds * 0.3}s", extra=logger_ctx)
-            # Continue with whatever we got
-            fetched_docs = []
+        except asyncio.TimeoutError:
+            logger.warning(
+                "Fetch timed out after %.2fs", request.timeout_seconds * 0.3, extra=logger_ctx
+            )
+            if request.include_snippets:
+                fetched_docs = [
+                    FetchedDoc(
+                        url=r.url,
+                        title=r.title,
+                        site_name=r.site_name,
+                        text=(r.snippet or ""),
+                        published_at=getattr(r, "published_at", None),
+                        fetch_ms=0,
+                    )
+                    for r in search_results
+                    if getattr(r, "snippet", None)
+                ]
+            else:
+                fetched_docs = []

121-125: Use parameterized logging and avoid f-strings.

Saves formatting cost and standardizes logs.

Apply this diff:

-    logger.info(
-        f"Web search answer request: query='{request.query}', "
-        f"top_k={request.top_k_results}, timeout={request.timeout_seconds}s",
-        extra=logger_ctx
-    )
+    logger.info(
+        "Web search answer request: query=%r, top_k=%d, timeout=%ds",
+        request.query, request.top_k_results, request.timeout_seconds, extra=logger_ctx
+    )
@@
-        logger.info(f"Search completed with {len(search_results)} results", extra=logger_ctx)
+        logger.info("Search completed with %d results", len(search_results), extra=logger_ctx)
@@
-        logger.info(f"Fetch completed with {len(fetched_docs)} documents", extra=logger_ctx)
+        logger.info("Fetch completed with %d documents", len(fetched_docs), extra=logger_ctx)
@@
-        logger.info(f"Ranking completed with {len(ranked_evidence)} evidence passages", extra=logger_ctx)
+        logger.info("Ranking completed with %d evidence passages", len(ranked_evidence), extra=logger_ctx)
@@
-            logger.warning(f"Synthesis timed out after {request.timeout_seconds * 0.15}s", extra=logger_ctx)
+            logger.warning("Synthesis timed out after %.2fs", request.timeout_seconds * 0.15, extra=logger_ctx)
@@
-    logger.info(
-            f"Answer generated successfully in {timings['total']}ms with {len(citations)} citations",
-            extra=logger_ctx
-        )
+    logger.info(
+        "Answer generated successfully in %dms with %d citations",
+        timings["total"], len(citations), extra=logger_ctx
+    )

Also applies to: 181-182, 208-209, 237-238, 251-256, 303-306


121-125: Minor: avoid logging full queries (possible PII).

Consider masking or sampling in prod logs.
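
One possible masking helper, shown as a sketch (hashing plus a short prefix is an assumption; truncation or sampling would also work):

import hashlib

def mask_query(query: str, keep: int = 16) -> str:
    # Log only a short prefix plus a stable fingerprint of the full query
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()[:8]
    prefix = query[:keep].replace("\n", " ")
    return f"{prefix}... (sha256:{digest}, len={len(query)})"

# logger.info("Web search answer request: query=%s", mask_query(request.query), extra=logger_ctx)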

tests/test_tavily_search.py (2)

9-24: Make this a stable, skippable pytest async test.

As written, this will be flaky and may fail under pytest (async test without marker). Gate on API key and remove “today” from query.

Apply this diff:

+import pytest
@@
-async def test_tavily_search():
+@pytest.mark.asyncio
+async def test_tavily_search():
+    if not os.getenv("TAVILY_API_KEY"):
+        pytest.skip("TAVILY_API_KEY not set; skipping Tavily integration test")
@@
-        query = "WHat is the result of the asia cup match between ban vs sri today?"
+        query = "Asia Cup cricket Bangladesh vs Sri Lanka result site:icc-cricket.com"

39-44: Drop broad try/except and prints in tests.

Let pytest report failures; reserve prints for debugging.

-    except Exception as e:
-        print(f"Error: {str(e)}")
-        import traceback
-        traceback.print_exc()
-        return False
+    except Exception:
+        # Let pytest surface the traceback naturally
+        raise
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e01739f and 2309b37.

📒 Files selected for processing (4)
  • api/endpoints/web_answering.py (1 hunks)
  • services/ranking_service.py (1 hunks)
  • services/synthesis_service.py (1 hunks)
  • tests/test_tavily_search.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
services/ranking_service.py (3)
services/chunking_service.py (1)
  • chunk_text (9-39)
services/embedding_service.py (1)
  • embed_texts (19-42)
services/web_fetch_service.py (1)
  • FetchedDoc (30-37)
api/endpoints/web_answering.py (4)
services/web_search_service.py (1)
  • WebSearchService (197-298)
services/web_fetch_service.py (2)
  • WebFetchService (39-274)
  • fetch_search_results (235-274)
services/ranking_service.py (2)
  • RankingService (31-190)
  • rank_documents (95-190)
services/synthesis_service.py (2)
  • SynthesisService (29-224)
  • generate_answer (149-224)
services/synthesis_service.py (1)
services/ranking_service.py (1)
  • RankedEvidence (21-29)
tests/test_tavily_search.py (1)
services/web_search_service.py (1)
  • WebSearchService (197-298)
🪛 Ruff (0.13.1)
services/ranking_service.py

140-140: Do not catch blind exception: Exception

(BLE001)


141-141: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


141-141: Use explicit conversion flag

Replace with conversion flag

(RUF010)

api/endpoints/web_answering.py

142-142: Expected an indented block after try statement

(invalid-syntax)

services/synthesis_service.py

45-45: Avoid specifying long messages outside the exception class

(TRY003)


210-210: Do not catch blind exception: Exception

(BLE001)


211-211: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


211-211: Use explicit conversion flag

Replace with conversion flag

(RUF010)


216-216: Use explicit conversion flag

Replace with conversion flag

(RUF010)

tests/test_tavily_search.py

38-38: Consider moving this statement to an else block

(TRY300)


39-39: Do not catch blind exception: Exception

(BLE001)


40-40: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🔇 Additional comments (2)
services/ranking_service.py (1)

183-189: LGTM: sequential ID reassignment for citations.

Ensures [^i] markers align with final selection order.

api/endpoints/web_answering.py (1)

107-107: Endpoint path prefix "/internal" — verify router inclusion.

Spec expects POST /internal/websearch/answer; file api/endpoints/web_answering.py defines @router.post("/websearch/answer", ...). Search returned no matches for app.include_router(router, prefix='/internal'); confirm the application mounts this router with prefix='/internal' (e.g. app.include_router(router, prefix='/internal')) or update the route to include '/internal'.
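
A minimal sketch of the expected mount in the app entrypoint (module path and variable names are assumptions):

from fastapi import FastAPI
from api.endpoints.web_answering import router as web_answering_router

app = FastAPI()
# Without this prefix the route would be exposed at /websearch/answer
app.include_router(web_answering_router, prefix="/internal", tags=["websearch"])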

fehranbit changed the base branch from enhancement/tavily-api-auth to main on September 21, 2025, 16:29
…d ranking and synthesis services for enhanced query responses
alvabba force-pushed the feature/web-answering-endpoint branch from a714bcc to 7d1bfb9 on September 22, 2025, 12:34
coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (6)
tests/test_tavily_search.py (1)

1-5: Drop unused imports.

sys and Path are unused.

-import sys
-import asyncio
-import os
-from pathlib import Path
+import os
+import asyncio
services/ranking_service.py (2)

139-144: Use logger.exception and keep logs structured; avoid f-strings.

Improves observability and aligns with lint hints.

-        except Exception as e:
-            logger.error(f"Error during embedding or similarity computation: {str(e)}")
+        except Exception:
+            logger.exception("Error during embedding or similarity computation")
             # Fallback: assign decreasing scores based on original order
             similarity_scores = [1.0 - (i / len(passages)) for i in range(len(passages))]

135-140: Defensive check for embedding length mismatches.

If the embedding service returns fewer vectors than passages (partial failure), fall back instead of raising later.

             # Embed passages
             passage_embeddings = embed_texts(passages)
             
-            # Compute similarity scores
+            # Defensive: ensure shapes match
+            if len(passage_embeddings) != len(passages):
+                raise RuntimeError("Embedding count mismatch")
+            # Compute similarity scores
             similarity_scores = self._compute_similarity(query_embedding, passage_embeddings)
api/endpoints/web_answering.py (2)

161-164: Pass integer timeouts to providers and wait_for.

Avoid float seconds where ints are expected.

-                    timeout_seconds=min(request.timeout_seconds * 0.4, 10)  # Allocate 40% of timeout
+                    timeout_seconds=int(min(request.timeout_seconds * 0.4, 10))  # Allocate 40% of timeout
                 ),
-                timeout=request.timeout_seconds * 0.4
+                timeout=float(request.timeout_seconds) * 0.4
@@
-                    timeout_seconds=min(request.timeout_seconds * 0.3, 8),  # Allocate 30% of timeout
+                    timeout_seconds=int(min(request.timeout_seconds * 0.3, 8)),  # Allocate 30% of timeout
                 ),
-                timeout=request.timeout_seconds * 0.3
+                timeout=float(request.timeout_seconds) * 0.3
@@
-                timeout=request.timeout_seconds * 0.15  # Allocate 15% of timeout
+                timeout=float(request.timeout_seconds) * 0.15  # Allocate 15% of timeout
@@
-                timeout=request.timeout_seconds * 0.15  # Allocate remaining time
+                timeout=float(request.timeout_seconds) * 0.15  # Allocate remaining time

Also applies to: 189-193, 219-221, 249-250


33-35: Verify internal route prefix.

PR says the endpoint is internal under /internal. Confirm a parent router prefixes “/internal”; if not, add prefix or inline it on this route.

Possible change:

-router = APIRouter(tags=["websearch"])
+router = APIRouter(prefix="/internal", tags=["websearch"])
@@
-@router.post("/websearch/answer", response_model=WebSearchAnswerResponse, status_code=200)
+@router.post("/websearch/answer", response_model=WebSearchAnswerResponse, status_code=200)

Or keep router as-is and change path to "/internal/websearch/answer".

Also applies to: 107-112

services/synthesis_service.py (1)

190-201: Defensively access token usage and use structured logs.

Avoid attribute errors and f-strings in logs.

-            prompt_tokens = response.usage.prompt_tokens
-            completion_tokens = response.usage.completion_tokens
+            prompt_tokens = getattr(response.usage, "prompt_tokens", 0) or 0
+            completion_tokens = getattr(response.usage, "completion_tokens", 0) or 0
@@
-            logger.info(f"Generated answer in {generation_ms}ms, {completion_tokens} tokens, {len(used_citation_ids)} citations")
+            logger.info(
+                "Generated answer in %dms, %d tokens, %d citations",
+                generation_ms, completion_tokens, len(used_citation_ids)
+            )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2309b37 and 7d1bfb9.

📒 Files selected for processing (4)
  • api/endpoints/web_answering.py (1 hunks)
  • services/ranking_service.py (1 hunks)
  • services/synthesis_service.py (1 hunks)
  • tests/test_tavily_search.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
services/ranking_service.py (3)
services/chunking_service.py (1)
  • chunk_text (9-39)
services/embedding_service.py (1)
  • embed_texts (19-42)
services/web_fetch_service.py (1)
  • FetchedDoc (31-38)
api/endpoints/web_answering.py (4)
services/web_search_service.py (1)
  • WebSearchService (202-303)
services/web_fetch_service.py (2)
  • WebFetchService (40-301)
  • fetch_search_results (262-301)
services/ranking_service.py (2)
  • RankingService (31-190)
  • rank_documents (95-190)
services/synthesis_service.py (2)
  • SynthesisService (29-224)
  • generate_answer (149-224)
services/synthesis_service.py (1)
services/ranking_service.py (1)
  • RankedEvidence (21-29)
tests/test_tavily_search.py (1)
services/web_search_service.py (1)
  • WebSearchService (202-303)
🪛 Ruff (0.13.1)
services/ranking_service.py

140-140: Do not catch blind exception: Exception

(BLE001)


141-141: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


141-141: Use explicit conversion flag

Replace with conversion flag

(RUF010)

api/endpoints/web_answering.py

142-142: Expected an indented block after try statement

(invalid-syntax)

services/synthesis_service.py

45-45: Avoid specifying long messages outside the exception class

(TRY003)


210-210: Do not catch blind exception: Exception

(BLE001)


211-211: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


211-211: Use explicit conversion flag

Replace with conversion flag

(RUF010)


216-216: Use explicit conversion flag

Replace with conversion flag

(RUF010)

tests/test_tavily_search.py

38-38: Consider moving this statement to an else block

(TRY300)


39-39: Do not catch blind exception: Exception

(BLE001)


40-40: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🔇 Additional comments (5)
services/ranking_service.py (1)

86-93: Guard cosine normalization against zero-norm vectors.

Zero norms yield NaNs and break ranking; add safe normalization.

-        # Normalize vectors
-        query_vec = query_vec / np.linalg.norm(query_vec)
-        passage_vecs = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
+        # Normalize vectors with zero-norm guards
+        q_norm = np.linalg.norm(query_vec)
+        if q_norm == 0:
+            return [0.0] * len(passage_embeddings)
+        query_vec = query_vec / q_norm
+        p_norms = np.linalg.norm(passage_vecs, axis=1, keepdims=True)
+        p_norms[p_norms == 0] = 1.0
+        passage_vecs = passage_vecs / p_norms
api/endpoints/web_answering.py (2)

18-18: Migrate validators to Pydantic v2.

Replace v1 @validator with @field_validator and update import.

-from pydantic import BaseModel, Field, validator
+from pydantic import BaseModel, field_validator
@@
-    # Validators
-    @validator("top_k_results")
-    def validate_top_k_results(cls, v):
+    # Validators
+    @field_validator("top_k_results")
+    def validate_top_k_results(cls, v: int) -> int:
         return max(3, min(15, v))  # Clamp between 3 and 15
@@
-    @validator("max_context_chars")
-    def validate_max_context_chars(cls, v):
+    @field_validator("max_context_chars")
+    def validate_max_context_chars(cls, v: int) -> int:
         return max(1000, min(50000, v))  # Clamp between 1000 and 50000
@@
-    @validator("answer_tokens")
-    def validate_answer_tokens(cls, v):
+    @field_validator("answer_tokens")
+    def validate_answer_tokens(cls, v: int) -> int:
         return max(100, min(2000, v))  # Clamp between 100 and 2000
@@
-    @validator("timeout_seconds")
-    def validate_timeout_seconds(cls, v):
+    @field_validator("timeout_seconds")
+    def validate_timeout_seconds(cls, v: int) -> int:
         return max(5, min(60, v))  # Clamp between 5 and 60 seconds

Also applies to: 47-63


139-154: Fix syntax error: stray/duplicated try; endpoint won’t import.

Remove the extra try and keep a single init block.

-    try:
-    # Initialize services
-    # Initialize services
-    try:
+    # Initialize services
+    try:
         search_service = WebSearchService()
         fetch_service = WebFetchService()
         ranking_service = RankingService()
         synthesis_service = SynthesisService()
     except Exception as e:
-        logger.error(f"Failed to initialize services: {str(e)}", extra=logger_ctx)
+        logger.error("Failed to initialize services: %s", e, extra=logger_ctx)
         raise HTTPException(
             status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=f"Service initialization error: {str(e)}"
-        )        # 1. Search phase
+            detail="Service initialization error."
+        )
+    # 1. Search phase
services/synthesis_service.py (2)

15-16: Make the client async to avoid blocking the event loop.

Use AsyncOpenAI and await the call so endpoint timeouts work.

-from openai import OpenAI
+from openai import AsyncOpenAI
@@
     def __init__(self, api_key: Optional[str] = None, model: str = "gpt-4o"):
@@
-        self.model = model
-        self.client = OpenAI(api_key=self.api_key)
+        self.model = model
+        self.client = AsyncOpenAI(api_key=self.api_key, timeout=30.0)
@@
-            response = self.client.chat.completions.create(
+            response = await self.client.chat.completions.create(
                 model=self.model,
                 messages=[
                     {"role": "system", "content": system_prompt},
                     {"role": "user", "content": user_prompt}
                 ],
                 temperature=0.2,
                 max_tokens=answer_tokens,
             )

Also applies to: 35-49, 174-185


210-224: Don’t leak exception details; log stack and return a generic message.

Aligns with security/observability practices.

-        except Exception as e:
-            logger.error(f"Error generating answer: {str(e)}")
+        except Exception:
+            logger.exception("Error generating answer")
@@
-            fallback_answer = (
-                f"I apologize, but I encountered an error while generating an answer to your question: \"{query}\"\n\n"
-                f"Error: {str(e)}\n\n"
-                f"Please try again with a different query or contact support if the problem persists."
-            )
+            fallback_answer = (
+                "I encountered an internal error while generating the answer. "
+                "Please try again or contact support if the problem persists."
+            )

fehranbit self-requested a review on September 22, 2025, 13:59
…r handling and performance; update synthesis service for better logging and timeout management; enhance test for Tavily search service.
coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (4)
services/ranking_service.py (1)

146-150: Prefer logger.exception and avoid f-string in logs; keep broad fallback.

Catching Exception is acceptable here given the downstream fallback, but use logger.exception for stack traces and structured formatting.

Apply this diff:

-        except Exception as e:
-            logger.error(f"Error during embedding or similarity computation: {str(e)}")
+        except Exception:
+            logger.exception("Error during embedding or similarity computation")
             # Fallback: assign decreasing scores based on original order
             similarity_scores = [1.0 - (i / len(passages)) for i in range(len(passages))]
api/endpoints/web_answering.py (2)

111-116: Don’t bind unused dependency parameter.

The dependency is only for auth; bind to underscore to avoid “unused” and clarify intent.

Apply this diff:

-async def web_search_answer(
-    request: WebSearchAnswerRequest,
-    req: Request,
-    api_key: str = Depends(verify_internal_api_key)
-):
+async def web_search_answer(
+    request: WebSearchAnswerRequest,
+    req: Request,
+    _: None = Depends(verify_internal_api_key),
+):

126-130: Prefer structured logging over f-strings.

Reduces formatting overhead when the log level is higher.

Apply this diff:

-    logger.info(
-        f"Web search answer request: query='{request.query}', "
-        f"top_k={request.top_k_results}, timeout={request.timeout_seconds}s",
-        extra=logger_ctx
-    )
+    logger.info(
+        "Web search answer request: query='%s', top_k=%d, timeout=%ss",
+        request.query, request.top_k_results, request.timeout_seconds,
+        extra=logger_ctx,
+    )
services/synthesis_service.py (1)

15-18: Remove unused OpenAI sync client import.

Cleanup to avoid confusion; you’re using AsyncOpenAI.

Apply this diff:

-from openai import OpenAI
-from services.ranking_service import RankedEvidence
-from openai import AsyncOpenAI
+from services.ranking_service import RankedEvidence
+from openai import AsyncOpenAI
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d1bfb9 and 4241143.

📒 Files selected for processing (4)
  • api/endpoints/web_answering.py (1 hunks)
  • services/ranking_service.py (1 hunks)
  • services/synthesis_service.py (1 hunks)
  • tests/test_tavily_search.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_tavily_search.py
🧰 Additional context used
🧬 Code graph analysis (3)
api/endpoints/web_answering.py (4)
services/web_search_service.py (1)
  • WebSearchService (202-303)
services/web_fetch_service.py (2)
  • WebFetchService (40-301)
  • fetch_search_results (262-301)
services/ranking_service.py (2)
  • RankingService (31-196)
  • rank_documents (101-196)
services/synthesis_service.py (2)
  • SynthesisService (30-227)
  • generate_answer (150-227)
services/ranking_service.py (3)
services/chunking_service.py (1)
  • chunk_text (9-39)
services/embedding_service.py (1)
  • embed_texts (19-42)
services/web_fetch_service.py (1)
  • FetchedDoc (31-38)
services/synthesis_service.py (1)
services/ranking_service.py (1)
  • RankedEvidence (21-29)
🪛 Ruff (0.13.1)
api/endpoints/web_answering.py

301-301: Expected a statement

(invalid-syntax)


301-302: Expected an expression

(invalid-syntax)


303-303: Unexpected indentation

(invalid-syntax)


304-304: Expected a statement

(invalid-syntax)


304-304: Expected a statement

(invalid-syntax)


304-305: Expected an expression

(invalid-syntax)


306-306: Unexpected indentation

(invalid-syntax)


310-310: Expected a statement

(invalid-syntax)

services/ranking_service.py

146-146: Do not catch blind exception: Exception

(BLE001)


147-147: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


147-147: Use explicit conversion flag

Replace with conversion flag

(RUF010)

services/synthesis_service.py

46-46: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (4)
services/ranking_service.py (1)

86-95: Good fix: cosine similarity now guards zero-norms.

Prevents NaNs and stabilizes scores. Looks correct.

services/synthesis_service.py (2)

44-51: LGTM: async client initialization with timeout.

Appropriate default and aligns with endpoint’s asyncio.wait_for.


176-186: Await the AsyncOpenAI call; current code returns a coroutine and will break.

This is a runtime blocker.

Apply this diff:

-            response = self.client.chat.completions.create(
+            response = await self.client.chat.completions.create(
                 model=self.model,
                 messages=[
                     {"role": "system", "content": system_prompt},
                     {"role": "user", "content": user_prompt}
                 ],
                 temperature=0.2,
                 max_tokens=answer_tokens,
             )
api/endpoints/web_answering.py (1)

111-112: Confirmed — router is mounted under /internal.
web_answering router is included in api/main.py via app.include_router(web_answering_router, prefix="/internal", tags=["websearch"]).

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)
api/endpoints/web_answering.py (6)

140-151: Consider more specific exception handling and structured logging.

The broad Exception catch could mask specific initialization failures. Also, the error handling could follow chain patterns for better error context.

 # Initialize services
 try:
     search_service = WebSearchService()
     fetch_service = WebFetchService()
     ranking_service = RankingService()
     synthesis_service = SynthesisService()
-except Exception as e:
-    logger.error("Failed to initialize services: %s", e, extra=logger_ctx)
+except (ImportError, ValueError, ConnectionError) as e:
+    logger.exception("Failed to initialize services", extra=logger_ctx)
+    raise HTTPException(
+        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+        detail="Service initialization error."
+    ) from e
+except Exception as e:
+    logger.exception("Unexpected error during service initialization", extra=logger_ctx)
     raise HTTPException(
         status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
         detail="Service initialization error."
-    )
+    ) from e

166-171: Use exception chaining for better error context.

Following Python best practices for exception handling in nested contexts.

 except asyncio.TimeoutError:
     logger.warning(f"Search timed out after {request.timeout_seconds * 0.4}s", extra=logger_ctx)
     raise HTTPException(
         status_code=status.HTTP_504_GATEWAY_TIMEOUT,
         detail="Search phase timed out."
-    )
+    ) from None

216-221: Use exception chaining for better error context.

Following Python best practices for exception handling in nested contexts.

 except asyncio.TimeoutError:
     logger.warning(f"Ranking timed out after {request.timeout_seconds * 0.15}s", extra=logger_ctx)
     raise HTTPException(
         status_code=status.HTTP_504_GATEWAY_TIMEOUT,
         detail="Ranking phase timed out."
-    )
+    ) from None

241-241: Incorrect timeout allocation comment: synthesis gets less than 15%.

The comment says "Allocate remaining time" but the code hardcodes another fixed 15% of the original budget. After the 40% + 30% + 15% allocations, at most 15% of the nominal budget remains, and the earlier phases consume real wall-clock time first, so a fixed 0.15 multiplier can overstate what is actually left.

-            timeout=request.timeout_seconds * 0.15  # Allocate remaining time
+            timeout=max(request.timeout_seconds * 0.15, 2)  # Allocate remaining time (~15%)

244-248: Use exception chaining for better error context.

Following Python best practices for exception handling in nested contexts.

 except asyncio.TimeoutError:
     logger.warning(f"Synthesis timed out after {request.timeout_seconds * 0.15}s", extra=logger_ctx)
     raise HTTPException(
         status_code=status.HTTP_504_GATEWAY_TIMEOUT,
         detail="Synthesis phase timed out."
-    )
+    ) from None

272-272: Potential issue with snippet truncation for citation.

The citation snippet truncation uses a hardcoded 200-character limit without considering word boundaries, potentially creating awkward cuts mid-word.

-                snippet=evidence.passage[:200] + "..." if len(evidence.passage) > 200 else evidence.passage,
+                snippet=(
+                    evidence.passage[:197] + "..." 
+                    if len(evidence.passage) > 200 
+                    else evidence.passage
+                ),

Or for better word boundary handling:

def smart_truncate(text: str, max_length: int = 200) -> str:
    if len(text) <= max_length:
        return text
    # Find the last space before max_length
    last_space = text.rfind(' ', 0, max_length - 3)
    if last_space > max_length * 0.8:  # Only truncate at word boundary if it's reasonably close
        return text[:last_space] + "..."
    return text[:max_length - 3] + "..."
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4241143 and fad8a54.

📒 Files selected for processing (1)
  • api/endpoints/web_answering.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
api/endpoints/web_answering.py (4)
services/web_search_service.py (1)
  • WebSearchService (202-303)
services/web_fetch_service.py (2)
  • WebFetchService (40-301)
  • fetch_search_results (262-301)
services/ranking_service.py (2)
  • RankingService (31-196)
  • rank_documents (101-196)
services/synthesis_service.py (2)
  • SynthesisService (30-227)
  • generate_answer (150-227)
🪛 Ruff (0.13.1)
api/endpoints/web_answering.py

115-115: Unused function argument: api_key

(ARG001)


146-146: Do not catch blind exception: Exception

(BLE001)


147-147: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


148-151: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


168-171: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


218-221: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


245-248: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (2)
api/endpoints/web_answering.py (2)

47-66: Already fixed: Pydantic v2 @field_validator correctly implemented.

The code correctly uses @field_validator decorators with the proper syntax for Pydantic v2. The validator functions follow the correct pattern with @classmethod methods that return the validated value.


115-115: The api_key parameter serves a functional dependency injection purpose.

The api_key parameter from Depends(verify_internal_api_key) is necessary for FastAPI's dependency injection system to enforce authentication. While the parameter isn't used within the function body, removing it would disable the authentication check.

alvabba (Collaborator, Author) commented Sep 22, 2025

@fehranbit please review the update

fehranbit merged commit 483f3b1 into main on Sep 23, 2025
1 check passed