# Web Search Agents: Credit Card Rewards Comparison

This notebook demonstrates **web search agent development** for AI-powered financial research by building an agent that:

1. **Searches for current credit card rewards** from reputable financial sources
2. **Filters and validates** information from trusted publishers
3. **Extracts structured data** from unstructured web content using LLMs
4. **Synthesizes comparisons** with professional formatting

## Key Concepts Demonstrated

- **Real-time Web Search API Integration**: Using Tavily for fresh financial data
- **Source Credibility Filtering**: Trust-based source selection
- **LLM-Powered Data Extraction**: Converting unstructured text to structured records
- **Financial Data Synthesis**: Creating user-ready comparison summaries
- **Error Handling & Fallbacks**: Graceful degradation when APIs are unavailable

## Scenario
A financial research agent that helps users find the best credit card rewards programs by searching the web, filtering credible sources, extracting reward rates, and presenting a clear comparison.

**Note**: This demo uses **Tavily's free web search API** (1,000 searches/month free tier) with fallback to realistic mock data for educational purposes.

In [1]:
# Import required libraries
import os
import requests
import json
from datetime import datetime
from dataclasses import dataclass
from typing import List, Optional, Dict, Any
from openai import OpenAI
import re

# Initialize OpenAI client with Vocareum endpoint
client = OpenAI(
    base_url="https://openai.vocareum.com/v1",
    api_key=os.getenv("OPENAI_API_KEY")
)

print("🔧 Environment Setup:")
print(f"   ✅ OpenAI API Key: {'✓ Configured' if os.getenv('OPENAI_API_KEY') else '❌ Missing'}")
print(f"   🔍 Tavily API Key: {'✓ Configured' if os.getenv('TAVILY_API_KEY') else '⚠️  Missing (will use fallback data)'}")

🔧 Environment Setup:
   ✅ OpenAI API Key: ✓ Configured
   🔍 Tavily API Key: ✓ Configured


## Define Data Models

We'll use dataclasses to structure our web search results and credit card reward information.

In [2]:
@dataclass
class SearchResult:
    """Represents a search result from web search API"""
    title: str
    url: str
    snippet: str
    published_date: Optional[str] = None
    domain: Optional[str] = None

@dataclass
class CreditCardRecord:
    """Represents a credit card rewards record"""
    card_name: str
    issuer: str
    rewards_rate: str  # e.g., "2% cash back" or "3x points on travel"
    annual_fee: str
    bonus_offer: Optional[str] = None
    best_for: Optional[str] = None
    source_url: str = ""
    source_title: str = ""
    
@dataclass
class CardComparison:
    """Final summary of top credit card rewards"""
    intro: str
    top_cards: List[CreditCardRecord]
    takeaway: str
    sources: List[Dict[str, str]]
    disclaimer: str

print("✅ Data models defined!")

✅ Data models defined!


## Build the Web Search Agent

The agent orchestrates multiple tools:
1. **Web Search Tool**: Fetches results from Tavily API
2. **Source Filter Tool**: Validates credibility of sources
3. **Extraction Tool**: Uses LLM to parse credit card data
4. **Synthesis Tool**: Creates user-ready comparison

In [3]:
class CreditCardSearchAgent:
    """Agent for searching and analyzing credit card rewards from web sources"""
    
    def __init__(self):
        # Define trusted financial sources
        self.reputable_domains = {
            'nerdwallet.com',
            'bankrate.com',
            'creditkarma.com',
            'thepointsguy.com',
            'forbes.com',
            'money.com',
            'cnbc.com',
            'wsj.com',
            'consumerreports.org',
            'creditcards.com'
        }
        
    def search_web(self, query: str, num_results: int = 10) -> List[SearchResult]:
        """
        Search the web using Tavily API for real-time credit card information
        
        Args:
            query: Search query string
            num_results: Number of results to return
            
        Returns:
            List of SearchResult objects
        """
        tavily_api_key = os.getenv("TAVILY_API_KEY")
        
        if not tavily_api_key:
            print("⚠️ TAVILY_API_KEY not found. Using fallback mock results.")
            return self._get_fallback_results()[:num_results]
        
        try:
            # Tavily API endpoint
            url = "https://api.tavily.com/search"
            
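            # Add a recency hint unless the query already mentions the current year
            current_year = str(datetime.now().year)
            fresh_query = query if current_year in query else f"{query} {current_year}"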
            
            payload = {
                "api_key": tavily_api_key,
                "query": fresh_query,
                "search_depth": "basic",
                "include_answer": False,
                "include_images": False,
                "include_raw_content": False,
                "max_results": num_results,
                "include_domains": list(self.reputable_domains)
            }
            
            response = requests.post(url, json=payload, timeout=10)
            response.raise_for_status()
            
            data = response.json()
            results = []
            
            for item in data.get('results', []):
                domain = self._extract_domain(item.get('url', ''))
                
                result = SearchResult(
                    title=item.get('title', ''),
                    url=item.get('url', ''),
                    snippet=item.get('content', ''),
                    published_date=item.get('published_date'),
                    domain=domain
                )
                results.append(result)
            
            print(f"✅ Found {len(results)} results from Tavily API")
            return results
            
        except requests.exceptions.RequestException as e:
            print(f"🔌 Network error with Tavily API: {e}")
            return self._get_fallback_results()[:num_results]
        except Exception as e:
            print(f"❌ Error with Tavily API: {e}")
            return self._get_fallback_results()[:num_results]
    
    def _get_fallback_results(self) -> List[SearchResult]:
        """Fallback mock results when API is unavailable"""
        return [
            SearchResult(
                title="Best Cash Back Credit Cards of 2025 - NerdWallet",
                url="https://www.nerdwallet.com/best/credit-cards/cash-back",
                snippet="Citi Double Cash Card offers 2% cash back (1% when you buy, 1% when you pay). Chase Freedom Unlimited provides 1.5% cash back on all purchases plus 5% on travel through Chase. No annual fee for both.",
                published_date="2025-10-10",
                domain="nerdwallet.com"
            ),
            SearchResult(
                title="Top Rewards Credit Cards - Bankrate",
                url="https://www.bankrate.com/finance/credit-cards/best-rewards-credit-cards/",
                snippet="Capital One Venture Rewards earns 2x miles on every purchase with $95 annual fee. American Express Gold Card offers 4x points at restaurants and supermarkets, $250 annual fee.",
                published_date="2025-10-12",
                domain="bankrate.com"
            ),
            SearchResult(
                title="Best Travel Credit Cards 2025 - The Points Guy",
                url="https://thepointsguy.com/guide/best-travel-credit-cards/",
                snippet="Chase Sapphire Preferred earns 2x points on travel and dining, $95 annual fee with 60,000 point bonus. Chase Sapphire Reserve offers 3x points on travel and dining, $550 annual fee.",
                published_date="2025-10-11",
                domain="thepointsguy.com"
            ),
            SearchResult(
                title="Best Credit Cards for Dining - Forbes Advisor",
                url="https://www.forbes.com/advisor/credit-cards/best/dining/",
                snippet="Capital One Savor Cash Rewards earns 4% cash back on dining and entertainment, 3% on groceries. $95 annual fee. Discover it Cash Back offers 5% rotating categories.",
                published_date="2025-10-09",
                domain="forbes.com"
            ),
            SearchResult(
                title="Top Credit Card Rewards Programs - CNBC",
                url="https://www.cnbc.com/select/best-rewards-credit-cards/",
                snippet="Bank of America Customized Cash Rewards offers 3% cash back in a category of your choice, $0 annual fee. Wells Fargo Active Cash provides flat 2% cash back on all purchases.",
                published_date="2025-10-08",
                domain="cnbc.com"
            )
        ]
    
    def filter_reputable_sources(self, results: List[SearchResult]) -> List[SearchResult]:
        """
        Filter search results to only include reputable financial sources
        
        Args:
            results: List of search results
            
        Returns:
            Filtered list of reputable search results
        """
        filtered = []
        
        for result in results:
            domain = result.domain or self._extract_domain(result.url)
            
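            # Accept exact matches or true subdomains only; a plain substring
            # test would also accept look-alikes such as "notnerdwallet.com"
            if any(domain == rep_domain or domain.endswith('.' + rep_domain)
                   for rep_domain in self.reputable_domains):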
                result.domain = domain
                filtered.append(result)
                
        return filtered
    
    def _extract_domain(self, url: str) -> str:
        """Extract domain from URL"""
        match = re.search(r'https?://(?:www\.)?([^/]+)', url)
        return match.group(1) if match else ""
    
    def extract_card_data(self, results: List[SearchResult]) -> List[CreditCardRecord]:
        """
        Extract credit card reward data from search result snippets using LLM
        
        Args:
            results: List of filtered search results
            
        Returns:
            List of extracted credit card records
        """
        extraction_prompt = """You are a financial data extraction expert. Extract credit card rewards information from the provided search snippets.

For each credit card mentioned, extract:
- Card name (full card name)
- Issuer (e.g., Chase, American Express, Capital One)
- Rewards rate (e.g., "2% cash back", "3x points on travel")
- Annual fee (e.g., "$95", "No annual fee", "$0")
- Bonus offer (if mentioned, otherwise null)
- Best for (e.g., "travel", "dining", "cash back", if clear from context)

Return a JSON array of objects with fields: card_name, issuer, rewards_rate, annual_fee, bonus_offer, best_for.
Only include legitimate credit cards with clear reward information.

Search snippets:
"""
        
        # Nothing to extract from: skip the LLM call entirely
        if not results:
            return []
        
        # Combine all snippets for extraction
        snippets_text = "\n\n".join([
            f"Source: {result.title} ({result.domain})\n{result.snippet}"
            for result in results
        ])
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {"role": "system", "content": extraction_prompt},
                    {"role": "user", "content": snippets_text}
                ],
                temperature=0.1,
                max_tokens=1500
            )
            
            content = response.choices[0].message.content.strip()
            
            # Clean up response
            if "```json" in content:
                content = content.split("```json")[1].split("```")[0].strip()
            elif "```" in content:
                content = content.split("```")[1].strip()
            
            extracted_data = json.loads(content)
            
            # Convert to CreditCardRecord objects
            records = []
            for item in extracted_data:
                # Find matching source for attribution
                source_result = next(
                    (r for r in results if item['card_name'].lower() in r.snippet.lower()),
                    results[0]  # fallback to first result
                )
                
                record = CreditCardRecord(
                    card_name=item['card_name'],
                    issuer=item['issuer'],
                    rewards_rate=item['rewards_rate'],
                    annual_fee=item['annual_fee'],
                    bonus_offer=item.get('bonus_offer'),
                    best_for=item.get('best_for'),
                    source_url=source_result.url,
                    source_title=source_result.title
                )
                records.append(record)
                
            return records
            
        except Exception as e:
            print(f"⚠️ Error extracting data: {e}")
            return []
    
    def deduplicate_cards(self, records: List[CreditCardRecord]) -> List[CreditCardRecord]:
        """
        Deduplicate credit card records by card name
        
        Args:
            records: List of credit card records
            
        Returns:
            Deduplicated list
        """
        seen_cards = {}
        
        for record in records:
            card_key = record.card_name.lower().strip()
            
            if card_key not in seen_cards:
                seen_cards[card_key] = record
            else:
                # If duplicate, prefer the one with more complete info
                existing = seen_cards[card_key]
                if record.bonus_offer and not existing.bonus_offer:
                    seen_cards[card_key] = record
        
        return list(seen_cards.values())
    
    def synthesize_comparison(self, records: List[CreditCardRecord], user_query: str = "best rewards cards") -> CardComparison:
        """
        Use LLM to synthesize final credit card comparison
        
        Args:
            records: List of deduplicated credit card records
            user_query: User's original search query
            
        Returns:
            Complete card comparison
        """
        # Take top 5 records
        top_5 = records[:5]
        
        # Prepare data for JSON serialization
        data_for_prompt = [
            {
                'card': r.card_name,
                'issuer': r.issuer,
                'rewards': r.rewards_rate,
                'fee': r.annual_fee,
                'bonus': r.bonus_offer,
                'best_for': r.best_for
            }
            for r in top_5
        ]
        
        synthesis_prompt = f"""You are a financial advisor creating a concise comparison of the best credit card rewards programs available today.

Today's date: {datetime.now().strftime('%B %d, %Y')}
User interest: {user_query}

Create a professional, user-ready comparison with:
1. A brief intro (2-3 sentences max) stating today's date and the focus
2. A 1-2 sentence takeaway about the card options and key considerations

Available data:
{json.dumps(data_for_prompt, indent=2)}

Format your response as JSON with fields: intro, takeaway
"""
        
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {"role": "system", "content": synthesis_prompt},
                    {"role": "user", "content": "Generate the comparison now."}
                ],
                temperature=0.3,
                max_tokens=500
            )
            
            content = response.choices[0].message.content.strip()
            
            # Clean up response
            if "```json" in content:
                content = content.split("```json")[1].split("```")[0].strip()
            elif "```" in content:
                content = content.split("```")[1].strip()
            
            summary_data = json.loads(content)
            
            # Compile sources
            sources = []
            seen_sources = set()
            
            for record in top_5:
                if record.source_url not in seen_sources:
                    sources.append({
                        'title': record.source_title,
                        'url': record.source_url,
                        'publisher': self._extract_domain(record.source_url),
                        'as_of': 'Recent'
                    })
                    seen_sources.add(record.source_url)
            
            return CardComparison(
                intro=summary_data['intro'],
                top_cards=top_5,
                takeaway=summary_data['takeaway'],
                sources=sources,
                disclaimer="Card offers and rewards rates change frequently. Always verify current terms on the issuer's website before applying."
            )
            
        except Exception as e:
            print(f"⚠️ Error in synthesis: {e}")
            # Fallback summary
            return CardComparison(
                intro=f"Top credit card rewards programs as of {datetime.now().strftime('%B %d, %Y')}",
                top_cards=top_5,
                takeaway="Current rewards cards offer 1.5-4% back on purchases, with some featuring category bonuses and welcome offers.",
                sources=[],
                disclaimer="Card offers and rewards rates change frequently. Always verify current terms on the issuer's website before applying."
            )
    
    def find_best_rewards_cards(self, user_query: str = "best credit card rewards programs") -> CardComparison:
        """
        Main method to find and compare top rewards credit cards
        
        Args:
            user_query: User's search query
            
        Returns:
            Complete card comparison
        """
        print("🔍 Searching for current credit card rewards...")
        
        # Step 1: Search web (search_web() appends its own recency hint,
        # so the year is not repeated here)
        search_queries = [
            user_query,
            "best rewards credit cards today",
            "top cash back credit cards site:nerdwallet.com OR site:bankrate.com"
        ]
        
        all_results = []
        for query in search_queries:
            results = self.search_web(query, num_results=5)
            all_results.extend(results)
        
        print(f"📊 Found {len(all_results)} search results")
        
        # Step 2: Filter to reputable sources
        filtered_results = self.filter_reputable_sources(all_results)
        print(f"✅ Filtered to {len(filtered_results)} reputable sources")
        
        # Step 3: Extract card data
        extracted_records = self.extract_card_data(filtered_results)
        print(f"💳 Extracted {len(extracted_records)} credit card records")
        
        # Step 4: Deduplicate
        unique_records = self.deduplicate_cards(extracted_records)
        print(f"🔄 Deduplicated to {len(unique_records)} unique cards")
        
        # Step 5: Synthesize comparison
        comparison = self.synthesize_comparison(unique_records, user_query)
        print("📝 Generated final comparison")
        
        return comparison

print("✅ CreditCardSearchAgent defined!")

✅ CreditCardSearchAgent defined!


## Demo: Interactive Credit Card Rewards Search

Let's test the agent with a typical user query to see the full workflow in action.

In [4]:
# Initialize the agent
agent = CreditCardSearchAgent()

print("🤖 Credit Card Search Agent initialized")
print(f"🏛️  Monitoring {len(agent.reputable_domains)} reputable financial sources")
print("🎯 Ready to find top rewards cards!")

🤖 Credit Card Search Agent initialized
🏛️  Monitoring 10 reputable financial sources
🎯 Ready to find top rewards cards!


In [5]:
# Test the agent with a typical user query
user_query = "What are the best credit card rewards programs with no annual fee?"

print(f"💬 User Query: {user_query}")
print("=" * 60)

# Run the search and analysis
comparison = agent.find_best_rewards_cards(user_query)

print("\n" + "=" * 60)
print("✅ Search complete!")

💬 User Query: What are the best credit card rewards programs with no annual fee?
🔍 Searching for current credit card rewards...
✅ Found 5 results from Tavily API
✅ Found 5 results from Tavily API
✅ Found 5 results from Tavily API
📊 Found 15 search results
✅ Filtered to 15 reputable sources
💳 Extracted 6 credit card records
🔄 Deduplicated to 6 unique cards
📝 Generated final comparison

✅ Search complete!


## Format and Display Results

Present the comparison in a user-friendly format.

In [6]:
def format_card_comparison(comparison: CardComparison) -> str:
    """
    Format the card comparison for user-friendly display
    
    Args:
        comparison: Card comparison object
        
    Returns:
        Formatted string for display
    """
    output = []
    
    # Header and intro
    output.append("# 💳 Top Credit Card Rewards Programs")
    output.append("")
    output.append(comparison.intro)
    output.append("")
    
    # Top cards
    output.append("## 🏆 Top Picks")
    output.append("")
    
    for i, card in enumerate(comparison.top_cards, 1):
        output.append(f"### {i}. {card.card_name}")
        output.append(f"**Issuer**: {card.issuer}")
        output.append(f"**Rewards**: {card.rewards_rate}")
        output.append(f"**Annual Fee**: {card.annual_fee}")
        
        if card.bonus_offer:
            output.append(f"**Bonus Offer**: {card.bonus_offer}")
        
        if card.best_for:
            output.append(f"**Best For**: {card.best_for.title()}")
        
        output.append("")
    
    # Takeaway
    output.append("## 💡 Key Takeaway")
    output.append("")
    output.append(comparison.takeaway)
    output.append("")
    
    # Sources
    if comparison.sources:
        output.append("## 📚 Sources")
        output.append("")
        
        for source in comparison.sources:
            publisher = source['publisher'].replace('.com', '').title()
            output.append(f"- [{publisher}]({source['url']})")
        
        output.append("")
    
    # Disclaimer
    output.append("## ⚠️ Important Note")
    output.append("")
    output.append(comparison.disclaimer)
    
    return "\n".join(output)

# Display the formatted comparison
formatted_output = format_card_comparison(comparison)
print(formatted_output)

# 💳 Top Credit Card Rewards Programs

As of October 2023, here’s a comparison of some of the best credit card rewards programs available with no annual fee, focusing on their unique benefits and rewards structures.

## 🏆 Top Picks

### 1. Bilt Mastercard®
**Issuer**: Bilt
**Rewards**: up to 100,000 points in a calendar year
**Annual Fee**: No annual fee
**Best For**: Rent Payments

### 2. Capital One Savor Cash Rewards Credit Card
**Issuer**: Capital One
**Rewards**: 1% - 8% cash back
**Annual Fee**: No annual fee
**Bonus Offer**: $200 cash back + $100 Capital One Travel credit
**Best For**: Dining & Entertainment

### 3. Chase Freedom Unlimited®
**Issuer**: Chase
**Rewards**: cash back
**Annual Fee**: No annual fee
**Best For**: Cash Back

### 4. Chase Sapphire Reserve®
**Issuer**: Chase
**Rewards**: travel credits
**Annual Fee**: None
**Best For**: Travel

### 5. American Express® Gold Card
**Issuer**: American Express
**Rewards**: up to 100,000 Membership Rewards Points


## Individual Tool Testing

Let's test each tool individually to understand how they work.

In [7]:
print("🔧 INDIVIDUAL TOOL TESTING")
print("=" * 40)
print()

# Test 1: Web Search Tool
print("1️⃣ **Testing search_web() tool:**")
test_query = "best cash back credit cards"
search_results = agent.search_web(test_query, num_results=3)

for i, result in enumerate(search_results, 1):
    print(f"   Result {i}:")
    print(f"   Title: {result.title[:60]}...")
    print(f"   Domain: {result.domain}")
    print(f"   Snippet: {result.snippet[:100]}...")
    print()

# Test 2: Source Filter Tool
print("2️⃣ **Testing filter_reputable_sources() tool:**")
filtered = agent.filter_reputable_sources(search_results)
print(f"   Original results: {len(search_results)}")
print(f"   After filtering: {len(filtered)}")
print(f"   Filtered domains: {[r.domain for r in filtered]}")
print()

# Test 3: Data Extraction Tool
print("3️⃣ **Testing extract_card_data() tool:**")
extracted = agent.extract_card_data(filtered)
print(f"   Extracted {len(extracted)} credit card records:")

for card in extracted[:3]:  # Show first 3
    print(f"   - {card.card_name}: {card.rewards_rate}, {card.annual_fee}")

🔧 INDIVIDUAL TOOL TESTING

1️⃣ **Testing search_web() tool:**
✅ Found 3 results from Tavily API
   Result 1:
   Title: Canada's Best Cards for Bad Credit for October 2025...
   Domain: nerdwallet.com
   Snippet: Credit Cards FIND THE BEST CREDIT CARDS * Best Credit Cards in Canada * Best Cash Back Credit Cards ...

   Result 2:
   Title: Best Cash-Back Credit Cards in Canada for October 2025...
   Domain: nerdwallet.com
   Snippet: * Best Cash Back Credit Cards # Best Cash-Back Credit Cards in Canada for October 2025 * How cash-ba...

   Result 3:
   Title: Best No-Fee Credit Cards in Canada for October 2025...
   Domain: nerdwallet.com
   Snippet: * Best No Fee Credit Cards # Best No-Fee Credit Cards in Canada for October 2025 Whether you’re tryi...

2️⃣ **Testing filter_reputable_sources() tool:**
   Original results: 3
   After filtering: 3
   Filtered domains: ['nerdwallet.com', 'nerdwallet.com', 'nerdwallet.com']

3️⃣ **Testing extract_card_data() tool:**
   Ext

## Error Handling and Edge Cases

Let's test how the agent handles various error conditions.

In [8]:
print("⚠️ ERROR HANDLING & EDGE CASES")
print("=" * 40)
print()

# Test 1: Fallback when API is unavailable
print("1️⃣ **Testing API fallback mechanism:**")
# Temporarily remove the API key to exercise the fallback path
original_key = os.environ.pop("TAVILY_API_KEY", None)

try:
    fallback_results = agent.search_web("test query", num_results=3)
    print(f"   Fallback returned {len(fallback_results)} results")
    print(f"   Sample: {fallback_results[0].title[:50]}...")
finally:
    # Restore the key only if one was set; otherwise leave the variable unset
    if original_key is not None:
        os.environ["TAVILY_API_KEY"] = original_key
print()

# Test 2: Empty results handling
print("2️⃣ **Testing empty results handling:**")
empty_results = agent.filter_reputable_sources([])
print(f"   Filtering empty list: {len(empty_results)} results")
print()

# Test 3: Deduplication
print("3️⃣ **Testing deduplication:**")
duplicate_records = [
    CreditCardRecord(
        card_name="Chase Freedom Unlimited",
        issuer="Chase",
        rewards_rate="1.5% cash back",
        annual_fee="$0",
        source_url="test.com",
        source_title="Test"
    ),
    CreditCardRecord(
        card_name="Chase Freedom Unlimited",
        issuer="Chase",
        rewards_rate="1.5% cash back",
        annual_fee="$0",
        bonus_offer="$200 bonus",
        source_url="test2.com",
        source_title="Test 2"
    )
]

deduplicated = agent.deduplicate_cards(duplicate_records)
print(f"   Original: {len(duplicate_records)} records")
print(f"   After deduplication: {len(deduplicated)} records")
print(f"   Kept record with bonus: {deduplicated[0].bonus_offer is not None}")

⚠️ ERROR HANDLING & EDGE CASES

1️⃣ **Testing API fallback mechanism:**
⚠️ TAVILY_API_KEY not found. Using fallback mock results.
   Fallback returned 3 results
   Sample: Best Cash Back Credit Cards of 2025 - NerdWallet...

2️⃣ **Testing empty results handling:**
   Filtering empty list: 0 results

3️⃣ **Testing deduplication:**
   Original: 2 records
   After deduplication: 1 records
   Kept record with bonus: True
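
The fallback test above changes `TAVILY_API_KEY` by hand and then restores it. A small context manager is one way to make that pattern harder to get wrong, since the restore runs even if the test raises. This is only a sketch: `temp_env` is a hypothetical helper, not part of the notebook or of any library.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional

_MISSING = object()  # sentinel: distinguishes "unset" from "set to empty string"

@contextmanager
def temp_env(name: str, value: Optional[str]) -> Iterator[None]:
    """Temporarily set (or, with value=None, unset) an environment variable."""
    previous = os.environ.get(name, _MISSING)
    try:
        if value is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = value
        yield
    finally:
        # Put the variable back exactly as it was, including "not set at all"
        if previous is _MISSING:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

With this helper the fallback check could be written as `with temp_env("TAVILY_API_KEY", None): agent.search_web("test query")`.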


## Key Learning Points

### 1. **Web Search API Integration**
- **Real-time Search**: Tavily API provides fresh, relevant results
- **Domain Filtering**: Focus search on reputable financial sources
- **Error Handling**: Graceful fallback when API is unavailable
- **Request Management**: Proper timeout and error handling
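
The fallback behaviour listed above can be factored into a small, network-free helper. The sketch below is illustrative only: `with_fallback` is a hypothetical name, and treating an empty primary result as a reason to fall back is an assumption (the notebook's `search_web()` returns an empty list unchanged).

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

def with_fallback(
    primary: Callable[[], List[T]],
    fallback: Callable[[], List[T]],
    limit: int,
) -> List[T]:
    """Return up to `limit` items from `primary`, else from `fallback`."""
    try:
        items = primary()
    except Exception as exc:  # broad on purpose: any failure triggers plan B
        print(f"Primary source failed ({exc!r}); using fallback data.")
        items = []
    if not items:
        # Assumption: an empty result is treated like a failure
        items = fallback()
    return items[:limit]
```

Passing the live search and the mock data as callables keeps the helper testable without an API key.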

### 2. **Source Credibility Filtering**
- **Trust-based Selection**: Pre-defined list of reputable domains
- **Domain Extraction**: Parse URLs to identify publishers
- **Quality Control**: Only accept results from trusted sources
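
Domain checks are easy to get subtly wrong: a substring test such as `"nerdwallet.com" in domain` also accepts look-alike hosts. A minimal sketch of a stricter check follows, assuming hostnames are parsed with the standard library's `urllib.parse`. `TRUSTED`, `hostname_of`, and `is_trusted` are illustrative names, and the set is a two-entry subset of the notebook's list.

```python
from urllib.parse import urlparse

# Illustrative subset of the notebook's reputable_domains
TRUSTED = {"nerdwallet.com", "bankrate.com"}

def hostname_of(url: str) -> str:
    """Lower-cased hostname without port; empty string if the URL has none."""
    return urlparse(url).hostname or ""

def is_trusted(url: str, trusted=TRUSTED) -> bool:
    """True only for a trusted domain itself or one of its subdomains."""
    host = hostname_of(url)
    return any(host == d or host.endswith("." + d) for d in trusted)
```

Under this rule `https://www.nerdwallet.com/...` and `https://ca.bankrate.com/...` pass, while `https://notnerdwallet.com/` and `https://nerdwallet.com.evil.example/` are rejected.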

### 3. **LLM-Powered Data Extraction**
- **Unstructured to Structured**: Convert web snippets to structured data
- **Context-aware Parsing**: LLM understands financial terminology
- **JSON Output**: Consistent, machine-readable format
- **Error Recovery**: Handles parsing failures gracefully
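
Error recovery around LLM output mostly comes down to tolerant parsing: the model may wrap JSON in a Markdown code fence, return an object instead of an array, or return something that is not JSON at all. The sketch below shows one way to handle those cases. `parse_json_array` is a hypothetical helper, and unwrapping the first list found inside an object is an assumption about how a model might respond.

```python
import json
import re
from typing import Any, List

FENCE = "`" * 3  # Markdown code-fence marker, built indirectly for readability
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

def parse_json_array(text: str) -> List[Any]:
    """Best-effort extraction of a JSON array from an LLM reply."""
    match = _FENCED.search(text)
    payload = match.group(1) if match else text
    try:
        data = json.loads(payload.strip())
    except json.JSONDecodeError:
        return []  # caller decides how to report the failure
    if isinstance(data, dict):
        # Assumption: models sometimes wrap the array, e.g. {"cards": [...]}
        lists = [v for v in data.values() if isinstance(v, list)]
        data = lists[0] if lists else [data]
    return data if isinstance(data, list) else []
```

Returning an empty list rather than raising mirrors the notebook's `extract_card_data()`, which also yields `[]` on failure.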

### 4. **Data Processing Pipeline**
- **Deduplication**: Remove duplicate cards by name
- **Quality Ranking**: Prefer records with complete information
- **Source Attribution**: Track where each data point came from
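
The notebook's `deduplicate_cards()` compares only `bonus_offer` when choosing between duplicates. One possible extension, sketched below, scores each record by how many fields are filled and normalizes names so that "Chase Freedom Unlimited®" and "chase freedom unlimited" collide. `Card` is a trimmed stand-in for `CreditCardRecord`, and `normalize_name`, `completeness`, and `dedupe_by_completeness` are hypothetical helpers.

```python
import re
from dataclasses import dataclass, fields
from typing import Dict, List, Optional

@dataclass
class Card:
    """Trimmed stand-in for the notebook's CreditCardRecord."""
    card_name: str
    rewards_rate: str = ""
    annual_fee: str = ""
    bonus_offer: Optional[str] = None
    best_for: Optional[str] = None

def normalize_name(name: str) -> str:
    """Lower-case, drop trademark symbols, collapse runs of whitespace."""
    cleaned = re.sub(r"[®™]", "", name)
    return " ".join(cleaned.lower().split())

def completeness(card: Card) -> int:
    """Number of non-empty fields on the record."""
    return sum(1 for f in fields(card) if getattr(card, f.name))

def dedupe_by_completeness(cards: List[Card]) -> List[Card]:
    """Keep, per normalized name, the record with the most filled fields."""
    best: Dict[str, Card] = {}
    for card in cards:
        key = normalize_name(card.card_name)
        if key not in best or completeness(card) > completeness(best[key]):
            best[key] = card
    return list(best.values())
```

Ties keep the first record seen, which preserves the original search-result order.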

### 5. **Professional Output Synthesis**
- **LLM Summarization**: Natural language summary generation
- **User-ready Format**: Clear, actionable comparison
- **Proper Attribution**: List all sources used
- **Disclaimers**: Important caveats for financial decisions

### 6. **Production Patterns**
- **Fallback Mechanisms**: Always have a plan B
- **Rate Limiting**: Respect API usage limits
- **Caching Potential**: Could cache results to reduce API calls
- **Audit Trails**: Track sources for transparency
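
The caching idea above can be sketched as a small time-to-live cache keyed by query string. Repeated queries inside the TTL window are served from memory, which also helps stay within a monthly search quota. `TTLCache` is a hypothetical class, and the injectable `clock` parameter is an assumption made so the behaviour can be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `fetch` only if stale."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch()
        self._store[key] = (now, value)
        return value
```

A call such as `cache.get_or_fetch(query, lambda: agent.search_web(query))` would then reach the search API at most once per query per TTL window.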

### Applications to Other Domains
This pattern extends to:
- **Investment research** (stock analysis, market trends)
- **Product comparisons** (electronics, services)
- **Real estate search** (property listings, market data)
- **Travel planning** (flights, hotels, destinations)
- **Legal research** (case law, regulations)
- **Academic research** (papers, citations)