25 changes: 25 additions & 0 deletions CLAUDE.md
@@ -241,6 +241,31 @@ def test_live_search(): # Missing @pytest.mark.integration
- **Type hints**: Required for all function signatures (enforced by mypy)
- **Docstrings**: Google-style docstrings for public functions

### Architecture Standards

**Maximum Nesting Depth**: 3 levels or less
- Functions with deeper nesting must be refactored
- Extract helper methods to reduce complexity
- Each helper method should represent a single concept
- Example violation: 6-level nesting in content_scraper.py (fixed by extracting 10 helper methods)
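As a rough illustration of the nesting rule (this is not the actual `content_scraper.py` code, which is not shown in this diff), a loop with stacked conditionals can be flattened by moving each decision into its own helper. The function names and dictionary fields below are hypothetical.

```python
from typing import Dict, List


def _is_usable(page: Dict[str, str]) -> bool:
    """Single concept: decide whether a page is worth keeping."""
    return page.get("status") == "ok" and bool(page.get("body"))


def _clean_body(page: Dict[str, str]) -> str:
    """Single concept: collapse runs of whitespace in the page body."""
    return " ".join(page["body"].split())


def collect_bodies(pages: List[Dict[str, str]]) -> List[str]:
    """Top-level flow: the deepest statement sits two levels inside the function."""
    bodies: List[str] = []
    for page in pages:
        if not _is_usable(page):
            continue  # guard clause instead of wrapping the rest in an if-block
        bodies.append(_clean_body(page))
    return bodies
```

The guard clause plus two helpers keeps the caller readable without changing what it computes.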

**One Class Per File**:
- Each file should contain exactly one class definition
- Applies to both production code and test infrastructure
- Example: Split `mock_ddgs.py` (3 classes) into 3 separate files

**Methods for Concepts**:
- Extract helper methods for each logical concept or operation
- Helper methods should have clear, single responsibilities
- Prefer multiple small methods over one large method
- Example: `_fetch_content()`, `_convert_to_markdown()`, `_truncate_if_needed()`
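A minimal sketch of how the three helper names from the example might compose into one public method. Only the helper names come from the guideline; the class name, the `max_chars` parameter, and every method body are assumptions made for illustration (the fetch step is a stand-in that performs no network access).

```python
class PageSummarizer:
    """Hypothetical class showing one small method per concept."""

    def __init__(self, max_chars: int = 40) -> None:
        self.max_chars = max_chars

    def summarize(self, html: str) -> str:
        # The public method reads as a list of concepts, one call each.
        content = self._fetch_content(html)
        markdown = self._convert_to_markdown(content)
        return self._truncate_if_needed(markdown)

    def _fetch_content(self, html: str) -> str:
        # Stand-in for a real fetch: just strips surrounding whitespace.
        return html.strip()

    def _convert_to_markdown(self, content: str) -> str:
        # Toy conversion that only understands <h1> tags.
        return content.replace("<h1>", "# ").replace("</h1>", "")

    def _truncate_if_needed(self, markdown: str) -> str:
        if len(markdown) <= self.max_chars:
            return markdown
        return markdown[: self.max_chars] + "..."
```

Each helper can then be tested on its own, which is harder when the same logic lives inside one long method.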

**No Raw Mocks**:
- Never use `MagicMock()` with property assignment (e.g., `mock.status_code = 200`)
- Create typed mock classes instead (e.g., `MockHttpResponse`, `MockSerperResponse`)
- Use factory pattern for creating pre-configured test doubles
- Use builder pattern with fluent API for test data (e.g., `a_search_result().with_title("X").build()`)
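The rules above could look roughly like the sketch below. `MockHttpResponse` and `a_search_result().with_title("X").build()` are named in the guideline, but their fields and implementations are not shown in this diff, so everything here (attributes, defaults, the `ok_response` factory, the dictionary returned by `build()`) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MockHttpResponse:
    """Typed test double: attributes are declared up front, not assigned ad hoc."""

    status_code: int = 200
    payload: Dict[str, Any] = field(default_factory=dict)

    def json(self) -> Dict[str, Any]:
        return self.payload


def ok_response(payload: Dict[str, Any]) -> MockHttpResponse:
    """Hypothetical factory for a pre-configured successful response."""
    return MockHttpResponse(status_code=200, payload=payload)


class SearchResultBuilder:
    """Hypothetical fluent builder for search-result test data."""

    def __init__(self) -> None:
        self._title = "Untitled"
        self._link = "https://example.com"
        self._snippet = ""

    def with_title(self, title: str) -> "SearchResultBuilder":
        self._title = title
        return self  # returning self is what makes the API chainable

    def with_link(self, link: str) -> "SearchResultBuilder":
        self._link = link
        return self

    def build(self) -> Dict[str, str]:
        return {"title": self._title, "link": self._link, "snippet": self._snippet}


def a_search_result() -> SearchResultBuilder:
    return SearchResultBuilder()
```

Compared with `MagicMock()`, a misspelled attribute on `MockHttpResponse` fails immediately and is visible to mypy.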

### Single Source of Truth

**Tool Versions** are defined in `pyproject.toml` under `[project.optional-dependencies.dev]`:
Expand Down
163 changes: 115 additions & 48 deletions docker/api_tools.py
Original file line number Diff line number Diff line change
Expand Up @@ -251,79 +251,146 @@ async def get_server_info_tool() -> dict:
return response.model_dump()


async def _fetch_with_serper(query: str, api_key: str):
    """Fetch search results using Serper API.

    Args:
        query: Search query
        api_key: Serper API key

    Returns:
        Tuple of (results, search_source)
    """
    import asyncio

    from mcp_server import fetch_search_results

    logger.info("Using Serper API for search")
    results = await asyncio.get_event_loop().run_in_executor(
        None, fetch_search_results, query, api_key
    )
    return results, "Serper API"
Comment on lines +254 to +272

🛠️ Refactor suggestion | 🟠 Major

Add return types and favour asyncio.to_thread.

New helpers must have explicit return annotations per our typing rules, and we can stop calling asyncio.get_event_loop() inside a coroutine (the asyncio docs prefer asyncio.get_running_loop() there) by switching to asyncio.to_thread(...), available from Python 3.9. Consider:

-from typing import TYPE_CHECKING, Any, Dict, List, Optional
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple
...
-if TYPE_CHECKING:
-    pass
+if TYPE_CHECKING:
+    from docker.models.api_search_result import APISearchResult
...
-async def _fetch_with_serper(query: str, api_key: str):
+async def _fetch_with_serper(
+    query: str, api_key: str
+) -> Tuple[List["APISearchResult"], str]:
...
-    results = await asyncio.get_event_loop().run_in_executor(
-        None, fetch_search_results, query, api_key
-    )
+    results = await asyncio.to_thread(fetch_search_results, query, api_key)

Apply the same pattern to the DuckDuckGo helper so static typing and asyncio usage stay consistent.

📝 Committable suggestion

Suggested change
-async def _fetch_with_serper(query: str, api_key: str):
-    """Fetch search results using Serper API.
-
-    Args:
-        query: Search query
-        api_key: Serper API key
-
-    Returns:
-        Tuple of (results, search_source)
-    """
-    import asyncio
-
-    from mcp_server import fetch_search_results
-
-    logger.info("Using Serper API for search")
-    results = await asyncio.get_event_loop().run_in_executor(
-        None, fetch_search_results, query, api_key
-    )
-    return results, "Serper API"
+from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple
+
+if TYPE_CHECKING:
+    from docker.models.api_search_result import APISearchResult
+
+
+async def _fetch_with_serper(
+    query: str, api_key: str
+) -> Tuple[List["APISearchResult"], str]:
+    """Fetch search results using Serper API.
+
+    Args:
+        query: Search query
+        api_key: Serper API key
+
+    Returns:
+        Tuple of (results, search_source)
+    """
+    import asyncio
+
+    from mcp_server import fetch_search_results
+
+    logger.info("Using Serper API for search")
+    results = await asyncio.to_thread(fetch_search_results, query, api_key)
+    return results, "Serper API"
🤖 Prompt for AI Agents
In docker/api_tools.py around lines 254 to 272, the _fetch_with_serper helper is
missing an explicit return type annotation and calls
asyncio.get_event_loop().run_in_executor from inside a coroutine; update its signature to include the
correct async return type (e.g., -> Tuple[List[...], str] or the precise types
used by fetch_search_results) and replace the run_in_executor call with
asyncio.to_thread(fetch_search_results, query, api_key) to run the blocking
function in a thread. Also apply the same explicit return annotation and switch
to asyncio.to_thread for the DuckDuckGo helper so both helpers are properly
typed and use modern asyncio patterns.
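A small, self-contained sketch of the pattern this comment asks for: a typed async helper that hands a blocking call to `asyncio.to_thread` (Python 3.9+). `blocking_search` is a hypothetical stand-in for the synchronous `fetch_search_results`, and plain strings replace `APISearchResult` so the example runs on its own.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_search(query: str, api_key: str) -> List[str]:
    """Hypothetical stand-in for a synchronous HTTP client call."""
    time.sleep(0.01)  # simulate blocking I/O
    return [f"{query}:{api_key}"]


async def fetch_with_source(query: str, api_key: str) -> Tuple[List[str], str]:
    # to_thread runs the blocking function in a worker thread
    # and awaits its result without blocking the event loop.
    results = await asyncio.to_thread(blocking_search, query, api_key)
    return results, "Serper API"


results, source = asyncio.run(fetch_with_source("cats", "demo-key"))
```

Positional arguments are passed straight through, so the call site reads much like the `run_in_executor` version minus the explicit loop lookup and the `None` executor argument.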



async def _fetch_with_duckduckgo(query: str, has_api_key: bool):
    """Fetch search results using DuckDuckGo.

    Args:
        query: Search query
        has_api_key: Whether Serper API key was configured

    Returns:
        Tuple of (results, search_source)
    """
    import asyncio

    from mcp_server import fetch_duckduckgo_search_results

    if not has_api_key:
        logger.info("No Serper API key configured, using DuckDuckGo fallback")
    else:
        logger.warning("No results from Serper API, trying DuckDuckGo fallback")

    results = await asyncio.get_event_loop().run_in_executor(
        None, fetch_duckduckgo_search_results, query
    )
    return results, "DuckDuckGo (free fallback)"

Comment on lines +275 to +298

🛠️ Refactor suggestion | 🟠 Major

Ensure the DuckDuckGo helper is fully typed.

Please mirror the Serper changes here: add a precise return type (e.g., Tuple[Optional[List["APISearchResult"]], str]) and swap asyncio.get_event_loop().run_in_executor for asyncio.to_thread(...) to satisfy our typing policy and stop calling asyncio.get_event_loop() inside a coroutine.


def _format_no_results_error(query: str, search_source: str) -> dict:
    """Format error response for no results.

    Args:
        query: Search query
        search_source: Source that was used

    Returns:
        Error dictionary
    """
    return {
        "error": "No search results found from any source.",
        "query": query,
        "search_source": search_source,
    }


def _format_search_error(error: str, query: str, search_source: str) -> dict:
    """Format error response for search failure.

    Args:
        error: Error message
        query: Search query
        search_source: Source that was attempted

    Returns:
        Error dictionary
    """
    return {
        "error": f"Search failed: {error}",
        "query": query,
        "search_source": search_source,
    }


async def _process_and_format_results(results, query: str, search_source: str):
    """Process search results and format response.

    Args:
        results: Raw search results
        query: Search query
        search_source: Source of results

    Returns:
        Formatted results dictionary
    """
    import asyncio

    from mcp_server import process_search_results

    processed_results = await asyncio.get_event_loop().run_in_executor(
        None, process_search_results, results
    )

    return {
        "query": query,
        "search_source": search_source,
        "results": [result.model_dump() for result in processed_results],
    }
Comment on lines +300 to +358

🛠️ Refactor suggestion | 🟠 Major

Type the formatting utilities.

Our guidelines require explicit return types. _format_no_results_error and _format_search_error are currently annotated with a bare dict, and _process_and_format_results has no annotation for results or for its return value. All three should return Dict[str, Any] (or a refined mapping), and results should be properly typed (e.g., List["APISearchResult"]). Please add those annotations and, for _process_and_format_results, consider asyncio.to_thread for consistency with the fetch helpers.

🤖 Prompt for AI Agents
In docker/api_tools.py around lines 300 to 358, the three helpers lack explicit
typing: add return annotations of Dict[str, Any] for _format_no_results_error
and _format_search_error and for _process_and_format_results change the
signature to accept results: List["APISearchResult"] (or a more specific
Sequence type) and return Dict[str, Any]; inside _process_and_format_results
replace asyncio.get_event_loop().run_in_executor(...) with
asyncio.to_thread(process_search_results, results) for consistency with fetch
helpers and keep importing typing (Dict, Any, List) and forward-reference
APISearchResult in quotes or via TYPE_CHECKING import.
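One way to read the "refined mapping" option is a `TypedDict` per response shape. The sketch below is an assumption about how that could look, not code from this PR; the class names are hypothetical, and the function is a public-named copy of `_format_no_results_error` with a narrower return type.

```python
from typing import Any, Dict, List, TypedDict


class SearchErrorResponse(TypedDict):
    """Hypothetical shape of the error dictionaries built by the helpers."""

    error: str
    query: str
    search_source: str


class SearchSuccessResponse(TypedDict):
    """Hypothetical shape of a successful response."""

    query: str
    search_source: str
    results: List[Dict[str, Any]]


def format_no_results_error(query: str, search_source: str) -> SearchErrorResponse:
    # Same keys and message as the helper in the diff; only the annotation differs.
    return {
        "error": "No search results found from any source.",
        "query": query,
        "search_source": search_source,
    }
```

At runtime these are ordinary dictionaries, so callers that expect `Dict[str, Any]` keep working while mypy can check the keys.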



 def create_webcat_functions() -> Dict[str, Any]:
     """Create a dictionary of WebCat functions for the tools to use."""

     # Import the existing functions from the main server
-    from mcp_server import (
-        SERPER_API_KEY,
-        fetch_duckduckgo_search_results,
-        fetch_search_results,
-        process_search_results,
-    )
+    from mcp_server import SERPER_API_KEY

     async def search_function(query: str) -> Dict[str, Any]:
         """Wrapper for the search functionality."""
-        import asyncio
-
         results = []
         search_source = "Unknown"

         try:
             # Try Serper API first if key is available
             if SERPER_API_KEY:
-                logger.info("Using Serper API for search")
-                search_source = "Serper API"
-                # Run the synchronous function in a thread pool
-                results = await asyncio.get_event_loop().run_in_executor(
-                    None, fetch_search_results, query, SERPER_API_KEY
-                )
+                results, search_source = await _fetch_with_serper(query, SERPER_API_KEY)

             # Fall back to DuckDuckGo if no API key or no results from Serper
             if not results:
-                if not SERPER_API_KEY:
-                    logger.info(
-                        "No Serper API key configured, using DuckDuckGo fallback"
-                    )
-                else:
-                    logger.warning(
-                        "No results from Serper API, trying DuckDuckGo fallback"
-                    )
-
-                search_source = "DuckDuckGo (free fallback)"
-                # Run the synchronous function in a thread pool
-                results = await asyncio.get_event_loop().run_in_executor(
-                    None, fetch_duckduckgo_search_results, query
+                results, search_source = await _fetch_with_duckduckgo(
+                    query, bool(SERPER_API_KEY)
                 )

             # Check if we got any results
             if not results:
                 logger.warning(f"No search results found for query: {query}")
-                return {
-                    "error": "No search results found from any source.",
-                    "query": query,
-                    "search_source": search_source,
-                }
-
-            # Process the results in thread pool (since it involves web scraping)
-            processed_results = await asyncio.get_event_loop().run_in_executor(
-                None, process_search_results, results
-            )
+                return _format_no_results_error(query, search_source)

-            # Return formatted results
-            return {
-                "query": query,
-                "search_source": search_source,
-                "results": [result.model_dump() for result in processed_results],
-            }
+            # Process and format results
+            return await _process_and_format_results(results, query, search_source)

         except Exception as e:
             logger.error(f"Error in search function: {str(e)}")
-            return {
-                "error": f"Search failed: {str(e)}",
-                "query": query,
-                "search_source": search_source,
-            }
+            return _format_search_error(str(e), query, search_source)

     async def health_check_function() -> Dict[str, Any]:
         """Wrapper for the health check functionality."""
30 changes: 17 additions & 13 deletions docker/clients/duckduckgo_client.py
@@ -22,6 +22,22 @@
)


def _convert_ddg_result(result: dict) -> APISearchResult:
    """Convert DuckDuckGo result format to APISearchResult.

    Args:
        result: Raw result dictionary from DuckDuckGo

    Returns:
        APISearchResult object
    """
    return APISearchResult(
        title=result.get("title", "Untitled"),
        link=result.get("href", ""),
        snippet=result.get("body", ""),
    )


 def fetch_duckduckgo_search_results(
     query: str, max_results: int = 3
 ) -> List[APISearchResult]:
@@ -43,20 +59,8 @@ def fetch_duckduckgo_search_results(
         logger.info(f"Using DuckDuckGo fallback search for: {query}")

         with DDGS() as ddgs:
-            # Get search results from DuckDuckGo
-            results = []
             search_results = ddgs.text(query, max_results=max_results)
-
-            for result in search_results:
-                # Convert DuckDuckGo result format to APISearchResult
-                results.append(
-                    APISearchResult(
-                        title=result.get("title", "Untitled"),
-                        link=result.get("href", ""),
-                        snippet=result.get("body", ""),
-                    )
-                )
-
+            results = [_convert_ddg_result(result) for result in search_results]
             logger.info(f"DuckDuckGo returned {len(results)} results")
             return results

29 changes: 20 additions & 9 deletions docker/clients/serper_client.py
@@ -16,6 +16,25 @@
logger = logging.getLogger(__name__)


def _convert_organic_results(organic_results: list) -> List[APISearchResult]:
    """Convert organic search results to APISearchResult objects.

    Args:
        organic_results: List of organic result dictionaries from Serper API

    Returns:
        List of APISearchResult objects
    """
    return [
        APISearchResult(
            title=result.get("title", "Untitled"),
            link=result.get("link", ""),
            snippet=result.get("snippet", ""),
        )
        for result in organic_results
    ]


 def fetch_search_results(query: str, api_key: str) -> List[APISearchResult]:
     """
     Fetches search results from the Serper API.
@@ -38,15 +57,7 @@ def fetch_search_results(query: str, api_key: str) -> List[APISearchResult]:

         # Process and return the search results
         if "organic" in data:
-            # Convert to APISearchResult objects
-            return [
-                APISearchResult(
-                    title=result.get("title", "Untitled"),
-                    link=result.get("link", ""),
-                    snippet=result.get("snippet", ""),
-                )
-                for result in data["organic"]
-            ]
+            return _convert_organic_results(data["organic"])
         return []
     except Exception as e:
         logger.error(f"Error fetching search results: {str(e)}")