Feature Proposal: Enhanced Web Search (#2) #161

Merged
laynepenney merged 4 commits into main from feat/enhanced-web-search
Jan 25, 2026

@laynepenney
Collaborator

Feature Proposal: Enhanced Web Search (#2)

This PR adds a comprehensive proposal for enhancing Codi's web search capabilities.

Overview

The proposal documents current limitations and provides a phased implementation plan for:

  • Multiple search engine support with fallbacks
  • Search templates for common use cases (docs, pricing, errors)
  • Enhanced caching and result quality scoring
  • Structured data extraction from search results

Key Features Proposed

  1. Multi-Engine Support: DuckDuckGo, Google, Bing, Brave with fallback logic
  2. Query Optimization: Domain-specific search templates
  3. Caching System: Persistent storage for reduced API calls
  4. Quality Scoring: Result relevance ranking and spam detection
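
The fallback logic in item 1 could look roughly like the sketch below. The `SearchEngine`, `SearchResult`, and `searchWithFallback` names are hypothetical and are not taken from the proposal; the sketch assumes each engine exposes an async `search` method and that an empty result list counts as a miss.

```typescript
// Hypothetical types for illustration; the proposal's real interfaces may differ.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

interface SearchEngine {
  name: string;
  search(query: string, maxResults: number): Promise<SearchResult[]>;
}

interface SearchOutcome {
  engine: string;
  results: SearchResult[];
}

// Try each engine in array order; the first one that returns results wins.
async function searchWithFallback(
  engines: SearchEngine[],
  query: string,
  maxResults: number,
): Promise<SearchOutcome> {
  const failures: string[] = [];
  for (const engine of engines) {
    try {
      const results = await engine.search(query, maxResults);
      if (results.length > 0) {
        return { engine: engine.name, results: results.slice(0, maxResults) };
      }
      failures.push(`${engine.name}: no results`);
    } catch (err) {
      // A thrown error (rate limit, network failure) moves on to the next engine.
      const message = err instanceof Error ? err.message : String(err);
      failures.push(`${engine.name}: ${message}`);
    }
  }
  throw new Error(`All search engines failed (${failures.join("; ")})`);
}
```

Collecting the per-engine failure reasons keeps the final error useful when every engine is down or rate limited.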

Implementation Plan

  • Phase 1 (1 week): Multi-engine foundation
  • Phase 2 (1 week): Enhanced features and caching
  • Phase 3 (1 week): Advanced capabilities

Reference

See evolution/#2-enhanced-web-search.md for full details.

- Fix SerpAPI error (paid service, not free)
- Add Brave Search API as primary (reliable JSON API)
- Add LRU cache with max 1000 entries and size limits
- Add template-aware TTL (pricing 7d, errors 12h, docs 24h, general 1h)
- Add errors template to config example
- Add date_range limitations note (Google/Bing only)
- Add performance testing and memory usage targets
- Add risks: E3 HTML fragility and rate limiting
- Include brave-api.org in references
- Add revision history section
- Fix architecture diagram indentation
- Update engine priority order in config
@laynepenney
Collaborator Author

Changes Made (v1.1)

All issues from the deep review have been addressed:

Fixed Issues

  1. ✅ Fixed SerpAPI Free Tier Error

    • Changed from "SerpAPI free tier" to actual free options
    • Brave Search API (generous free tier) now primary
    • Google Custom Search API (100 queries/day free) as fallback
    • Bing Search API (1000 queries/month free) as backup
  2. ✅ LRU Cache with Size Limits

    • Added max 1000 entries limit
    • LRU (Least Recently Used) eviction policy
    • Memory usage target: <50MB for cache layer
  3. ✅ Template-Aware TTL

    • Different TTL per template type:
      • pricing: 7 days (pricing changes infrequently)
      • errors: 12 hours (fixes found faster)
      • docs: 24 hours (rarely changes)
      • general: 1 hour (default)
  4. ✅ Added Errors Template

    • Added errors template to config example
    • Sites: stackoverflow.com, github.com
    • Modifiers: error, fix, solution
  5. ✅ Date Range Limitations

    • Added note that date_range only works on Google and Bing
    • E3 Lite and Brave use different filtering methods
  6. ✅ Performance Testing

    • Added cache hit rate target: 40%+ for repeated queries
    • Added memory usage target: <50MB for cache layer
    • Added API response time target: <2s per search
  7. ✅ Added Risk: Rate Limiting

    • Promoted rate limiting from Medium to High impact
    • Mitigation: circuit breaker, with Brave as the primary engine (generous limits)
  8. ✅ Architecture Diagram Fix

    • Fixed indentation in Result Processor block
  9. ✅ Updated Status

    • Changed from DRAFT to UNDER REVIEW
  10. ✅ Added Revision History

    • Version 1.0: Initial proposal
    • Version 1.1: All fixes applied
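
Items 2 and 3 (LRU eviction and template-aware TTL) can be combined in one small structure. The following is a minimal sketch, not the proposal's implementation: the `SearchCache` class and its method names are hypothetical, and it covers only the in-memory layer. The TTL values and the 1000-entry limit come from the list above.

```typescript
type Template = "pricing" | "errors" | "docs" | "general";

const HOUR_MS = 60 * 60 * 1000;

// TTL per template, as listed in item 3.
const TTL_MS: Record<Template, number> = {
  pricing: 7 * 24 * HOUR_MS,
  errors: 12 * HOUR_MS,
  docs: 24 * HOUR_MS,
  general: 1 * HOUR_MS,
};

// Hypothetical cache class: LRU eviction plus per-template expiry.
class SearchCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;

  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, template: Template = "general", now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + TTL_MS[template] });
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

The `now` parameters exist only to make expiry easy to test; callers would normally omit them.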

Summary of Technical Changes

| Category | Before | After |
|---|---|---|
| Primary Engine | E3 Lite (fragile HTML) | Brave JSON API (reliable) |
| Cache | Basic TTL | LRU with 1000 entry limit |
| TTL | Fixed 1 hour | Template-aware (1h-7d) |
| Templates | docs, pricing | docs, pricing, errors |
| Fallback | Unspecified | Google (100/day), Bing (1000/mo) |
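
The circuit breaker named as a rate-limiting mitigation in item 7 is not specified further in this thread. One common shape for it is sketched below; the `CircuitBreaker` class, its threshold, and its cooldown are all assumptions made for illustration.

```typescript
// Hypothetical circuit breaker: after `threshold` consecutive failures the engine
// is skipped until `cooldownMs` has passed, then a single trial request is allowed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold = 3, cooldownMs = 60_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // True when the engine may be called: breaker closed, or cooldown elapsed.
  canRequest(now: number = Date.now()): boolean {
    if (this.openedAt === null) return true;
    return now - this.openedAt >= this.cooldownMs;
  }

  // A success closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure at or past the threshold opens (or re-opens) the breaker.
  recordFailure(now: number = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.threshold) {
      this.openedAt = now;
    }
  }
}
```

A fallback loop would check `canRequest()` before calling an engine and skip straight to the next one while the breaker is open, which avoids burning quota on an engine that is already rate limited.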

Ready for community feedback and approval! 🎯

- Fixed typo: 'freequeries' -> 'free queries' spacing
- Corrected internal entity references (E3 -> E1)
- Unified engine configuration (removed redundant engineOrder)
- Made response limits consistent (default 15 both places)
- Added 'general' template to config example
- Resolved Open Question #3 (marked as RESOLVED)
- Clarified cache storage (file + in-memory LRU)
- Clarified extract_content limits (first 5KB)
- Updated revision history to v1.2
@laynepenney
Collaborator Author

Version 1.2 - Final Fixes Applied (✅ READY FOR APPROVAL)

All 8 remaining issues from the deep review have been fixed:

Fixes Summary

  1. ✅ Fixed "freequeries" typo

    • "1,000 freequeries/month" → "1,000 free queries/month"
  2. ✅ Corrected internal entity references

    • Changed "E3" to "E1" (DuckDuckGo) throughout document
    • Updated risk table references
    • Updated references section
  3. ✅ Unified engine configuration

    • Removed redundant engineOrder field
    • Now only engines array (order determines priority)
  4. ✅ Made response limits consistent

    • Default changed from 5 → 15
    • Consistent between config and interface
  5. ✅ Added general template to config

    • Added empty general template with sites: []
    • Config now includes all 4 templates
  6. ✅ Resolved Open Question #3

    • Marked as RESOLVED with reference to v1.1 changes
    • Brave is now primary engine
  7. ✅ Clarified cache storage

    • Split into "Persistent file storage" + "In-memory LRU cache"
    • More explicit about storage mechanism
  8. ✅ Clarified extract_content limits

    • Changed "limited" to specific "fetch first 5KB"
    • Clearer about what the feature does
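
To make fixes 3 to 5 concrete, here is a sketch of what a unified config might look like and how a template could shape a query. The field names (`engines`, `maxResults`, `templates`) and the `buildQuery` helper are assumptions, not the proposal's actual schema. The engine order follows the priority given later in this thread, and the `errors` and `general` template contents are the ones stated in the comments above; `docs` and `pricing` are omitted because their contents are not given here.

```typescript
interface SearchTemplate {
  sites: string[];
  modifiers: string[];
}

// Hypothetical config shape for illustration only.
interface ExampleSearchConfig {
  engines: string[]; // array order determines priority; no separate engineOrder field
  maxResults: number;
  templates: {
    errors: SearchTemplate;
    general: SearchTemplate;
  };
}

const exampleConfig: ExampleSearchConfig = {
  engines: ["brave", "google", "bing", "duckduckgo"],
  maxResults: 15,
  templates: {
    errors: {
      sites: ["stackoverflow.com", "github.com"],
      modifiers: ["error", "fix", "solution"],
    },
    general: { sites: [], modifiers: [] },
  },
};

// Assumed behaviour: append the modifiers, then restrict to the template's
// sites with the widely supported `site:` operator.
function buildQuery(query: string, template: SearchTemplate): string {
  const parts = [query, ...template.modifiers];
  if (template.sites.length > 0) {
    const siteFilter = template.sites.map((site) => `site:${site}`).join(" OR ");
    parts.push(`(${siteFilter})`);
  }
  return parts.join(" ");
}
```

With an empty `general` template the query passes through unchanged, which is one reason an explicit empty template is convenient: callers never need a special case for "no template".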

Final Proposal Status

| Metric | Value |
|---|---|
| Status | READY FOR APPROVAL |
| Version | 1.2 |
| Total Fixes Applied | 16 (v1.1: 8, v1.2: 8) |
| Lines Changed | +177/-95 = +82 net |
| Open Issues | None |

This proposal is production-ready and addresses all identified issues!

🎯

laynepenney merged commit a30d2f2 into main on Jan 25, 2026
3 checks passed
laynepenney added a commit that referenced this pull request Jan 26, 2026
* docs: add feature proposal for #2 enhanced web search

* docs: fix web search proposal issues (#161)

- Fix SerpAPI error (paid service, not free)
- Add Brave Search API as primary (reliable JSON API)
- Add LRU cache with max 1000 entries and size limits
- Add template-aware TTL (pricing 7d, errors 12h, docs 24h, general 1h)
- Add errors template to config example
- Add date_range limitations note (Google/Bing only)
- Add performance testing and memory usage targets
- Add risks: E3 HTML fragility and rate limiting
- Include brave-api.org in references
- Add revision history section
- Fix architecture diagram indentation
- Update engine priority order in config

* docs: apply v1.2 fixes to web search proposal (#161)

- Fixed typo: 'freequeries' -> 'free queries' spacing
- Corrected internal entity references (E3 -> E1)
- Unified engine configuration (removed redundant engineOrder)
- Made response limits consistent (default 15 both places)
- Added 'general' template to config example
- Resolved Open Question #3 (marked as RESOLVED)
- Clarified cache storage (file + in-memory LRU)
- Clarified extract_content limits (first 5KB)
- Updated revision history to v1.2

* docs: fix final version inconsistency (v1.1 -> v1.2)

* feat: implement enhanced web search multi-engine support

Phase 1: Multi-engine foundation with caching and fallback

**Features implemented:**
- Multi-engine architecture (Brave, Google, Bing, DuckDuckGo)
- LRU cache with max 1000 entries and TTL support
- Enhanced configuration system with web search settings
- Plugin-based engine registry with automatic fallback
- Updated tool registration with backward compatibility

**Engine priority:** Brave (recommended) > Google > Bing > DuckDuckGo (fallback)
**Configuration:** Support for API keys via .env file
**Documentation:** Comprehensive API key guide and setup instructions

**Ready for testing:** Users can configure multiple search engines
  for improved reliability and result quality

Wingman: Codi <codi@layne.pro>

* fix: duckduckgo engine source handling and test assertion

* feat: add timeout configuration for search engines

- Add configurable timeout via AbortSignal.timeout()
- Default timeout is undefined (no limit) for flexibility
- Can be configured via WebSearchConfig.timeout (milliseconds)
- Applied to all 4 engines: Brave, Google, Bing, DuckDuckGo
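
The timeout behaviour described in this commit could be wired up roughly as follows. This is a sketch under assumptions: `TimeoutConfig` stands in for the `timeout` field of `WebSearchConfig`, and `timeoutSignal` and `fetchJson` are hypothetical helpers. `AbortSignal.timeout()` and the `signal` option of `fetch` are standard APIs.

```typescript
// Stand-in for the timeout field the commit describes on WebSearchConfig;
// the real interface has other fields that are not shown in this thread.
interface TimeoutConfig {
  timeout?: number; // milliseconds; undefined means no limit
}

// Returns a signal that aborts after config.timeout ms, or undefined for no limit.
function timeoutSignal(config: TimeoutConfig): AbortSignal | undefined {
  return config.timeout !== undefined ? AbortSignal.timeout(config.timeout) : undefined;
}

// Hypothetical request helper shared by all engines; the URL comes from the caller.
async function fetchJson(url: string, config: TimeoutConfig = {}): Promise<unknown> {
  const response = await fetch(url, { signal: timeoutSignal(config) });
  if (!response.ok) {
    throw new Error(`Search request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

When the timeout elapses, the pending `fetch` rejects with a `TimeoutError`, so a fallback loop can treat a slow engine like any other failure and move on to the next one.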