Add DummySearchProvider Fallback and Settings Improvements #27
Conversation
- Added web search configuration variables to .env.example
- Updated .gitignore to include a new line for clarity
- Refactored import statement in indexing_router.py for consistency
- Included web answering router in main.py for improved routing structure
- Updated requirements.txt to include trafilatura and ensure httpx uses HTTP/2
…ngs for search configuration
Caution: Review failed. The pull request is closed.

Walkthrough

Adds a provider-agnostic asynchronous web search service (Tavily and Dummy providers) with optional in-memory caching, exposes web-answering endpoints under /internal, and extends settings.

Changes
Sequence Diagram(s)

sequenceDiagram
autonumber
actor Client
participant API as FastAPI (/internal/websearch)
participant Svc as WebSearchService
participant Prov as SearchProvider (Tavily / Dummy)
participant Ext as External Search API
Client->>API: GET /internal/search?query=...
API->>Svc: search(query, k, language, region, timeout, use_cache)
alt Cache hit
Svc-->>API: cached List<SearchResult>
else Cache miss
Svc->>Prov: search(query, k, region, language, timeout)
alt Tavily selected
Prov->>Ext: HTTP request (with API key)
Ext-->>Prov: JSON response
Prov-->>Svc: parsed List<SearchResult>
else Dummy selected / fallback
Prov-->>Svc: synthesized List<SearchResult>
end
Svc-->>API: results (caches response)
end
API-->>Client: JSON response
note over Prov: Tavily provider applies retries with exponential backoff on transient errors (e.g., 429)
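
For orientation, the diagrammed flow reduces to roughly the following Python sketch. The class and method names mirror the diagram, but the bodies, parameters, and defaults here are assumptions for illustration, not the PR's actual code:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SearchResult:
    title: str
    url: str
    snippet: str

class WebSearchService:
    """Illustrative cache-then-provider flow matching the sequence diagram."""

    def __init__(self, provider, cache: Optional[dict] = None):
        self.provider = provider                      # Tavily or Dummy provider instance
        self._cache = cache if cache is not None else {}

    async def search(self, query: str, k: int = 8, *, use_cache: bool = True) -> List[SearchResult]:
        key = (query, k)
        if use_cache and key in self._cache:            # cache hit: return stored results
            return self._cache[key]
        results = await self.provider.search(query, k)  # cache miss: delegate to the provider
        self._cache[key] = results                      # cache the response before returning
        return results
```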
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
Actionable comments posted: 3
🧹 Nitpick comments (12)
.env.example (1)
9-13: Default to dummy provider in example and add trailing newline

To match settings' default/fallback behavior and avoid noisy fallback logs when no key is set, prefer WEB_SEARCH_ENGINE=dummy in the example and keep the API-specific line commented until a key is available. Also add a trailing newline. Apply this diff:

-# Web search configuration
-WEB_SEARCH_ENGINE=tavily
-TAVILY_API_KEY=your_tavily_api_key_here
-MAX_FETCH_CONCURRENCY=4
-DEFAULT_TOP_K_RESULTS=8
+# Web search configuration
+# Set to "dummy" for local/dev without keys; switch to "tavily" when you add TAVILY_API_KEY.
+WEB_SEARCH_ENGINE=dummy
+# TAVILY_API_KEY=your_tavily_api_key_here
+MAX_FETCH_CONCURRENCY=4
+DEFAULT_TOP_K_RESULTS=8
+

Also applies to: 7-7
api/indexing_router.py (2)
37-39: Size check uses characters, not bytes

len(src["text"]) counts chars. Enforce MB against bytes to match max_upload_mb. Apply this diff:

-    if "text" in src and len(src["text"]) > settings.max_upload_mb * 1024 * 1024:
+    if "text" in src and len(src["text"].encode("utf-8")) > settings.max_upload_mb * 1024 * 1024:

50-53: 415 here may mislead

A chunking failure for plain text isn't a media-type problem. Consider 422 (unprocessable) or 500 with a more specific detail.

-    except Exception:
-        raise HTTPException(status_code=415, detail="UNSUPPORTED_FILE_TYPE")
+    except Exception:
+        raise HTTPException(status_code=422, detail="CHUNKING_FAILED")

config/settings.py (2)
8-9: Remove unused imports

Field isn't used.

-from pydantic import SecretStr, ValidationError, Field
+from pydantic import SecretStr, ValidationError

30-34: Let BaseSettings load env; avoid direct os.environ in defaults

Declare defaults and let Pydantic map env (e.g., WEB_SEARCH_ENGINE). The current os.environ.get(...) duplicates config loading and complicates precedence.

-    web_search_engine: str = os.environ.get("WEB_SEARCH_ENGINE", "dummy")  # Default to dummy provider if not specified
+    web_search_engine: str = "dummy"  # overridden by WEB_SEARCH_ENGINE if set

Optional: validate concurrency bounds:

    # e.g., max_fetch_concurrency: int = Field(default=4, ge=1, le=64)
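
As a hedged illustration of the suggested approach, a minimal pydantic-settings sketch follows; the field names match those discussed in this review, while the env-file wiring and the bounds on the integer fields are assumptions:

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # BaseSettings maps WEB_SEARCH_ENGINE / TAVILY_API_KEY / etc. from the environment
    # (or a .env file) automatically, so no os.environ.get(...) is needed in defaults.
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    web_search_engine: str = "dummy"              # overridden by WEB_SEARCH_ENGINE if set
    tavily_api_key: str | None = None             # overridden by TAVILY_API_KEY if set
    max_fetch_concurrency: int = Field(default=4, ge=1, le=64)
    default_top_k_results: int = Field(default=8, ge=1, le=20)

settings = Settings()
```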
services/web_search_service.py (7)
125-127: Align k clamp with Tavily's documented range

Docs show max_results range 0..20 (default 5). Clamp to 1..20 (or 0..20 if you want to allow zero).

-    # Clamp k between 3 and 15
-    k = max(3, min(15, k))
+    # Clamp k to Tavily's documented range
+    k = max(1, min(20, k))

Reference: Tavily max_results parameter. (docs.tavily.com)
170-176: Use logger.exception for failures

Keep stack traces on failures; helps ops and debugging.

-    logger.error(f"Tavily search failed after {max_retries} attempts: rate limited")
+    logger.exception("Tavily search failed after %s attempts: rate limited", max_retries)
@@
-    logger.error(f"Tavily search failed with status {e.response.status_code}: {e.response.text}")
+    logger.exception("Tavily search failed with status %s: %s", e.response.status_code, e.response.text)
@@
-    logger.error(f"Tavily request failed: {str(e)}")
+    logger.exception("Tavily request failed: %s", e)
@@
-    logger.error(f"Failed to initialize {self.provider_name} search provider: {str(e)}")
+    logger.exception("Failed to initialize %s search provider: %s", self.provider_name, e)
@@
-    logger.error(f"Search failed: {str(e)}")
+    logger.exception("Search failed: %s", e)

Also applies to: 164-168, 237-240, 297-297
52-65: Mark unused parameters in Dummy provider to satisfy linters

Prefix with underscores or use them to mirror real provider behavior.

-    async def search(self, query: str, k: int, region: str, language: str, timeout_seconds: int) -> List[SearchResult]:
+    async def search(self, query: str, k: int, region: str, language: str, timeout_seconds: int) -> List[SearchResult]:
@@
-        for i in range(min(k, 5)):  # Cap at 5 results
+        for i in range(min(k, 20)):  # Cap to a sane upper bound

Or:

-    async def search(self, query: str, k: int, region: str, language: str, timeout_seconds: int) -> List[SearchResult]:
+    async def search(self, query: str, k: int, _region: str, _language: str, _timeout_seconds: int) -> List[SearchResult]:
214-233: Fallback to dummy for unknown provider strings (match PR intent)

Currently an unknown provider raises ValueError; better to warn and fall back to dummy, same as the init-failure path.

-        if self.provider_name not in self._providers:
-            raise ValueError(f"Unsupported search provider: {self.provider_name}")
+        if self.provider_name not in self._providers:
+            logger.warning("Unsupported search provider '%s'. Falling back to DummySearchProvider.", self.provider_name)
+            self.provider_name = "dummy"
+            self.provider = DummySearchProvider()
+            return
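
Putting both fallback paths together, a self-contained sketch might look like this; the stub provider classes and the constructor signature are invented for illustration and are not the PR's implementation:

```python
import logging

logger = logging.getLogger(__name__)

class DummySearchProvider:
    """Stand-in provider; always constructible, returns synthesized results elsewhere."""

class TavilySearchProvider:
    """Stand-in for the real provider; assume it raises when no API key is configured."""
    def __init__(self, api_key: str | None = None):
        if not api_key:
            raise ValueError("TAVILY_API_KEY is not set")

class WebSearchServiceSketch:
    """Illustrative provider selection with a dummy fallback on bad config or init errors."""

    def __init__(self, provider_name: str, api_key: str | None = None):
        self._providers = {"tavily": lambda: TavilySearchProvider(api_key), "dummy": DummySearchProvider}
        self.provider_name = provider_name.lower()
        if self.provider_name not in self._providers:
            # Unknown provider string: warn and degrade rather than raising
            logger.warning("Unsupported search provider '%s'; falling back to dummy.", self.provider_name)
            self.provider_name = "dummy"
        try:
            self.provider = self._providers[self.provider_name]()
        except Exception:
            # e.g. missing TAVILY_API_KEY: stay responsive with synthesized results
            logger.exception("Failed to initialize %s provider; using dummy.", self.provider_name)
            self.provider_name = "dummy"
            self.provider = DummySearchProvider()
```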
242-245: Type Optional for k and keep signature tidy

Satisfy Ruff RUF013 and clarify the optional arg.

-    async def search(self, query: str, k: int = None, region: str = "auto",
+    async def search(self, query: str, k: Optional[int] = None, region: str = "auto",
@@
-        k = k or settings.default_top_k_results
+        k = k or settings.default_top_k_results

(Import Optional already present.)

Also applies to: 263-264
14-16: Prune unused imports

Protocol isn't used.

-from typing import Dict, List, Optional, Protocol, Type, ClassVar, Mapping
+from typing import Dict, List, Optional, Type, ClassVar, Mapping
209-213: Cache is process-local and not concurrency-safe across workers

Fine for dev; in prod behind multiple workers/nodes it won't share or lock. Document this, or consider a TTL cache with locks or an external cache if this becomes hot-path.
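
For reference, a minimal sketch of the TTL-with-lock alternative the comment mentions; the class and its parameters are hypothetical and not part of this PR:

```python
import asyncio
import time
from typing import Any, Hashable, Optional

class TTLCache:
    """Tiny async-safe TTL cache: single process only, guarded by an asyncio.Lock."""

    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._data: dict[Hashable, tuple[float, Any]] = {}
        self._lock = asyncio.Lock()

    async def get(self, key: Hashable) -> Optional[Any]:
        async with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return None
            expires_at, value = entry
            if time.monotonic() > expires_at:   # expired: drop the entry and report a miss
                del self._data[key]
                return None
            return value

    async def set(self, key: Hashable, value: Any) -> None:
        async with self._lock:
            self._data[key] = (time.monotonic() + self._ttl, value)
```

Note that an asyncio.Lock only serializes coroutines within one event loop; sharing results across Uvicorn workers or nodes would still require an external cache such as Redis.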
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.env.example (1 hunks)
.gitignore (1 hunks)
api/indexing_router.py (1 hunks)
api/main.py (1 hunks)
config/settings.py (2 hunks)
requirements.txt (2 hunks)
services/web_search_service.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
api/indexing_router.py (1)
services/vector_db_service.py (1)
  VectorDBClient (10-56)
🪛 dotenv-linter (3.3.0)
.env.example
[warning] 7-7: [UnorderedKey] The CORS_ALLOW_ORIGINS key should go before the POSTGRES_URI key
[warning] 11-11: [UnorderedKey] The TAVILY_API_KEY key should go before the WEB_SEARCH_ENGINE key
[warning] 12-12: [UnorderedKey] The MAX_FETCH_CONCURRENCY key should go before the TAVILY_API_KEY key
[warning] 13-13: [EndingBlankLine] No blank line at the end of the file
[warning] 13-13: [UnorderedKey] The DEFAULT_TOP_K_RESULTS key should go before the MAX_FETCH_CONCURRENCY key
🪛 Ruff (0.13.1)
services/web_search_service.py
55-55: Unused method argument: region (ARG002)
55-55: Unused method argument: language (ARG002)
55-55: Unused method argument: timeout_seconds (ARG002)
103-103: Avoid specifying long messages outside the exception class (TRY003)
108-108: Unused method argument: region (ARG002)
123-123: Avoid specifying long messages outside the exception class (TRY003)
160-160: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
164-164: Use logging.exception instead of logging.error (TRY400)
167-167: Use logging.exception instead of logging.error (TRY400)
170-170: Use logging.exception instead of logging.error (TRY400)
170-170: Use explicit conversion flag (RUF010)
172-172: Standard pseudo-random generators are not suitable for cryptographic purposes (S311)
231-231: Avoid specifying long messages outside the exception class (TRY003)
236-236: Do not catch blind exception: Exception (BLE001)
237-237: Use logging.exception instead of logging.error (TRY400)
237-237: Use explicit conversion flag (RUF010)
242-242: PEP 484 prohibits implicit Optional; convert to Optional[T] (RUF013)
295-295: Consider moving this statement to an else block (TRY300)
297-297: Use logging.exception instead of logging.error (TRY400)
297-297: Use explicit conversion flag (RUF010)
🔇 Additional comments (4)
.gitignore (1)
4-4: Whitespace tweak is fine

The blank line after .env improves readability between sections.

requirements.txt (2)

7-7: Good switch to httpx HTTP/2 extra

Enabling httpx[http2] is appropriate given external API calls.

16-18: Remove duplicate pydantic and verify heavy deps (numpy/trafilatura)

Verification script returned no matches; confirm whether requirements.txt (lines 16–18) contains a duplicate pydantic entry and whether numpy/trafilatura are required on the request path (cold-start/size impact). If a duplicate pydantic is present, apply this diff to dedupe:

-pydantic>=2.6.0,<3.0.0
 trafilatura>=1.6.0,<2.0.0
 numpy>=1.26.0,<2.0.0
api/main.py (1)
49-49: Resolve missing web_answering router and duplicate "/internal" mounts
- api/endpoints/web_answering.py is missing; main.py imports it (line ~49) and mounts it with prefix="/internal" (line ~52). Add/restore that module or remove/fix the import and include_router call.
- api/indexing_router.py and api/endpoints/internal.py both declare APIRouter(prefix="/internal") and are both included in main.py — this results in two routers mounted at /internal; consolidate or change prefixes as appropriate.
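
One possible consolidation, sketched with stand-in routers; the prefixes and the placeholder endpoint are assumptions, and the real modules named above would supply their own routers:

```python
from fastapi import APIRouter, FastAPI

# Hypothetical stand-ins for api/indexing_router.py, api/endpoints/internal.py,
# and the (currently missing) api/endpoints/web_answering.py.
indexing_router = APIRouter(prefix="/internal/index", tags=["indexing"])
internal_router = APIRouter(prefix="/internal", tags=["internal"])
web_answering_router = APIRouter(prefix="/internal/websearch", tags=["web-answering"])

@web_answering_router.get("/search")
async def search(query: str) -> dict:
    # placeholder handler; the real endpoint would call WebSearchService
    return {"query": query, "results": []}

app = FastAPI()

# Distinct prefixes avoid mounting two routers at the bare "/internal" path;
# alternatively, keep one "/internal" router and include the others under it.
app.include_router(indexing_router)
app.include_router(internal_router)
app.include_router(web_answering_router)
```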
…ion and metadata consistency
…e response handling and data extraction
@paytoniga fixed the reviews
Actionable comments posted: 2
🧹 Nitpick comments (4)
services/vector_db_service.py (4)
51-54: Type the public API (and add batch_size kw-only)

Tighten the contract and make batching configurable. Apply this diff within the shown range:

-    def upsert(self, namespace, ids, vectors, metadata=None):
+    def upsert(
+        self,
+        namespace: str,
+        ids: list[str],
+        vectors: list[list[float]],
+        metadata: "list[dict] | None" = None,
+        *,
+        batch_size: int = 100,
+    ):

Additionally (supporting change outside this range), update typing imports:

-from typing import List
+from typing import List
+# Python 3.10+: using built-in generics in signature above; no extra imports needed.
70-70: Batch upserts to avoid payload limits and improve resilience

Large single calls can exceed request limits; batching is safer.

-        self.index.upsert(vectors=items, namespace=namespace)
+        for start in range(0, len(items), batch_size):
+            self.index.upsert(vectors=items[start:start + batch_size], namespace=namespace)
37-49: Unify upsert paths; avoid tuple-style vectors with the v3 Pinecone client

Use the new upsert to keep one codepath and consistent payload shape.

 def upsert_vectors(self, vectors: List[List[float]], ids: List[str]):
-    # Input validation
-    if vectors is None or ids is None:
-        raise ValueError("vectors and ids must not be None.")
-    if not isinstance(vectors, list) or not isinstance(ids, list):
-        raise TypeError("vectors and ids must be lists.")
-    if len(vectors) != len(ids):
-        raise ValueError("vectors and ids must have the same length.")
-    if self.dimension is not None and any(len(v) != self.dimension for v in vectors):
-        raise ValueError("vector dimensionality mismatch with index.")
-    if not self.index:
-        raise RuntimeError("Index is not initialized. Call create_index first.")
-    self.index.upsert(vectors=[(id, vec) for id, vec in zip(ids, vectors)])
+    # Delegate to unified implementation; default namespace ("") is Pinecone's default.
+    self.upsert(namespace="", ids=ids, vectors=vectors)
51-70: Optional: return the Pinecone response for observability/testing

Returning the last UpsertResponse (or a list when batched) helps with metrics and tests.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
services/vector_db_service.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
services/vector_db_service.py (2)
tests/test_vector_db_service.py (1)
  upsert (10-12)
api/indexing_router.py (1)
  index (27-83)
🪛 Ruff (0.13.1)
services/vector_db_service.py
58-58: Avoid specifying long messages outside the exception class (TRY003)
60-60: Avoid specifying long messages outside the exception class (TRY003)
🔇 Additional comments (1)
services/vector_db_service.py (1)
51-61: Verify module path and upsert call-site compatibility.
- Import path OK — repo contains config/settings.py (/home/jailuser/git/config/settings.py) and multiple files import from config.settings (services/vector_db_service.py, services/web_search_service.py, services/embedding_service.py, api/indexing_router.py, tests/test_embedding_batch.py). No references to core.config.settings found.
- Call-site check incomplete — the ripgrep scan returned "No files were searched", so I cannot confirm callers use the new upsert(namespace, ids, vectors, metadata=...) signature. Re-run a full repo search (example: rg -nP '\bupsert\s*\(' -S || true) or allow an AST-based scan to enumerate upsert call sites and flag mismatches.
        if self.index is None:
            self.create_index()
        if not (len(ids) == len(vectors)):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against missing embedding dimension before auto-creating the index.
Without a dimension, create_index will fail at runtime.
- if self.index is None:
- self.create_index()
+ if self.index is None:
+ if self.dimension is None:
+ raise RuntimeError(
+ "embedding dimension is not configured; set settings.embedding_dimension or pass dimension."
+ )
+ self.create_index()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
        if self.index is None:
            if self.dimension is None:
                raise RuntimeError(
                    "embedding dimension is not configured; set settings.embedding_dimension or pass dimension."
                )
            self.create_index()
        if not (len(ids) == len(vectors)):
🤖 Prompt for AI Agents
In services/vector_db_service.py around lines 55-57, the code auto-calls
create_index() without ensuring an embedding dimension exists which will cause a
runtime failure; before calling create_index(), validate that self.dimension is
set or infer it from the provided vectors (e.g., if vectors is non-empty set
self.dimension = len(vectors[0]) after validating all vectors share that
length), and if neither self.dimension nor inferable from vectors, raise a clear
ValueError asking the caller to provide an embedding dimension; adjust
create_index to accept/use the dimension if it currently does not.
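
To make the inference step concrete, here is a small sketch of the kind of helper the prompt describes; the function name and its placement are hypothetical, not taken from the PR:

```python
from typing import Optional, Sequence

def resolve_embedding_dimension(
    configured: Optional[int],
    vectors: Sequence[Sequence[float]],
) -> int:
    """Return the embedding dimension, preferring the configured value and
    otherwise inferring it from the first vector, then checking consistency."""
    if configured is not None:
        dimension = configured
    elif vectors:
        dimension = len(vectors[0])
    else:
        raise ValueError(
            "embedding dimension is not configured and cannot be inferred from an empty batch; "
            "provide a dimension before creating the index"
        )
    mismatched = [i for i, v in enumerate(vectors) if len(v) != dimension]
    if mismatched:
        raise ValueError(f"vectors at positions {mismatched} do not match dimension {dimension}")
    return dimension
```

A caller could then assign self.dimension = resolve_embedding_dimension(self.dimension, vectors) before invoking create_index().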
        if not (len(ids) == len(vectors)):
            raise ValueError("ids and vectors must have the same length")
        if metadata is not None and len(metadata) != len(ids):
            raise ValueError("metadata length must match ids/vectors length")
        items = []
        for i, (id_, vector) in enumerate(zip(ids, vectors)):
            item = {
                "id": id_,
                "values": vector
            }
            if metadata is not None:
                item["metadata"] = metadata[i]
            items.append(item)
        self.index.upsert(vectors=items, namespace=namespace)
Add dimension checks, vector normalization, and metadata type validation.
Parity with upsert_vectors and fewer Pinecone-side errors.
- if not (len(ids) == len(vectors)):
+ if len(ids) != len(vectors):
raise ValueError("ids and vectors must have the same length")
- if metadata is not None and len(metadata) != len(ids):
- raise ValueError("metadata length must match ids/vectors length")
- items = []
- for i, (id_, vector) in enumerate(zip(ids, vectors)):
- item = {
- "id": id_,
- "values": vector
- }
- if metadata is not None:
- item["metadata"] = metadata[i]
- items.append(item)
- self.index.upsert(vectors=items, namespace=namespace)
+ if metadata is not None and len(metadata) != len(ids):
+ raise ValueError("metadata length must match ids/vectors length")
+ items: list[dict] = []
+ for i, (id_, vector) in enumerate(zip(ids, vectors)):
+ # Dimension validation + normalization (supports numpy arrays)
+ if self.dimension is not None and len(vector) != self.dimension:
+ raise ValueError(
+ f"vector[{i}] dimensionality {len(vector)} != expected {self.dimension}"
+ )
+ values = vector.tolist() if hasattr(vector, "tolist") else list(vector)
+ if not all(isinstance(x, (int, float)) for x in values):
+ raise TypeError(f"vector[{i}] must be a sequence of numbers")
+ item = {"id": id_, "values": values}
+ if metadata is not None:
+ md = metadata[i]
+ if not isinstance(md, dict):
+ raise TypeError("each metadata entry must be a dict")
+ item["metadata"] = md
+ items.append(item)
+ self.index.upsert(vectors=items, namespace=namespace)
🧰 Tools
🪛 Ruff (0.13.1)
58-58: Avoid specifying long messages outside the exception class (TRY003)
60-60: Avoid specifying long messages outside the exception class (TRY003)
🤖 Prompt for AI Agents
In services/vector_db_service.py around lines 57 to 70, add parity with
upsert_vectors by validating each vector's dimension, normalizing vectors, and
validating metadata types before calling upsert: check that each vector is a
sequence of numeric values and its length equals the configured embedding
dimension (e.g., self.dimension) and raise ValueError if not; if normalization
is required, convert each vector to a unit vector (divide by its L2 norm,
guarding against zero norm); ensure metadata (when provided) is the expected
type (e.g., dict or list) and raise TypeError for invalid entries; perform these
checks/normalization inside the loop that builds items so you send only
validated, normalized vectors to self.index.upsert(vectors=items,
namespace=namespace).
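
Since the prompt calls for L2 normalization with a zero-norm guard, a brief illustrative helper follows; it is not part of the PR, and whether normalization is appropriate at all depends on the index's similarity metric:

```python
import math
from typing import Sequence

def l2_normalize(vector: Sequence[float]) -> list[float]:
    """Return the unit-length version of `vector`, guarding against a zero norm."""
    values = [float(x) for x in vector]
    norm = math.sqrt(sum(x * x for x in values))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in values]

# Example: l2_normalize([3.0, 4.0]) -> [0.6, 0.8]
```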
This PR introduces a safe, no-key fallback for web search and loosens config requirements to improve local/dev workflows.
What Changed
- core/config/settings.py: the web search engine setting now defaults to dummy when unspecified.
- core/services/web_search_service.py: adds a DummySearchProvider that returns mock results when no real provider is configured/available; WebSearchService now auto-falls back to the dummy provider to keep the system responsive.

Why It Matters
Notes / Usage
Verification
Summary by CodeRabbit
New Features
API
Configuration
Performance/Reliability
Dependencies