# Gemini Grounding with Google Search - Testing & Implementation

This notebook tests and demonstrates the correct implementation of Gemini's grounding feature with Google Search according to Google's official documentation.

## Overview

Grounding helps build applications that can:
- **Increase factual accuracy**: Reduce model hallucinations by basing responses on real-world information
- **Access real-time information**: Answer questions about recent events and topics  
- **Provide citations**: Build user trust by showing the sources for the model's claims

We'll compare our current implementation with the official documentation and fix any issues.

## 1. Import Required Libraries and Setup Client

Let's start by importing the Google GenAI library and setting up the client correctly.

In [2]:
import os
import json
from typing import List, Dict, Any

try:
    from google import genai
    from google.genai import types
    print("‚úÖ Google GenAI library imported successfully")
except ImportError as e:
    print(f"‚ùå Error importing google.genai: {e}")
    print("Install with: pip install google-genai")

# Check API key
api_key = os.getenv("GEMINI_API_KEY")
if api_key:
    print("‚úÖ GEMINI_API_KEY found")
    # Configure the client with API key
    client = genai.Client(api_key=api_key)
    print("‚úÖ Gemini client configured")
else:
    print("‚ùå GEMINI_API_KEY not set")
    print("Set it with: $env:GEMINI_API_KEY = 'your-api-key'")

‚úÖ Google GenAI library imported successfully
‚ùå GEMINI_API_KEY not set
Set it with: $env:GEMINI_API_KEY = 'your-api-key'


In [3]:
# Set API key directly for testing
os.environ["GEMINI_API_KEY"] = "AIzaSyA5_hT8exe2hFDfAcZv6-X03ZMBHRCMSy8"

# Re-check API key
api_key = os.getenv("GEMINI_API_KEY")
if api_key:
    print("‚úÖ GEMINI_API_KEY found")
    # Configure the client with API key
    client = genai.Client(api_key=api_key)
    print("‚úÖ Gemini client configured") 
else:
    print("‚ùå GEMINI_API_KEY still not set")

‚úÖ GEMINI_API_KEY found
‚úÖ Gemini client configured


## 2. Configure Grounding Tool with Google Search

According to the documentation, we need to set up the GoogleSearch tool and create the generation configuration correctly.

In [4]:
# Configure grounding tool according to official documentation
grounding_tool = types.Tool(
    google_search=types.GoogleSearch()
)

# Configure generation settings  
config = types.GenerateContentConfig(
    tools=[grounding_tool]
)

print("‚úÖ Grounding tool configured with GoogleSearch")
print("‚úÖ Generation config created with grounding enabled")

‚úÖ Grounding tool configured with GoogleSearch
‚úÖ Generation config created with grounding enabled


## 3. Test Basic Grounded Search

Let's test with the official example query: "Who won the euro 2024?"

In [5]:
try:
    # Make the request with grounding enabled
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Who won the euro 2024?",
        config=config,
    )
    
    print("‚úÖ API call successful!")
    print(f"Response text: {response.text}")
    print(f"Response type: {type(response)}")
    
    # Check if we have grounding metadata
    if hasattr(response, 'candidates') and response.candidates:
        candidate = response.candidates[0]
        print(f"Candidate type: {type(candidate)}")
        
        if hasattr(candidate, 'grounding_metadata'):
            print("‚úÖ Found grounding_metadata attribute")
            print(f"Grounding metadata: {candidate.grounding_metadata}")
        elif hasattr(candidate, 'groundingMetadata'):
            print("‚úÖ Found groundingMetadata attribute")  
            print(f"Grounding metadata: {candidate.groundingMetadata}")
        else:
            print("‚ùå No grounding metadata found")
            print(f"Available attributes: {dir(candidate)}")
    else:
        print("‚ùå No candidates in response")
        print(f"Response attributes: {dir(response)}")
        
except Exception as e:
    print(f"‚ùå Error making API call: {e}")
    import traceback
    traceback.print_exc()

‚úÖ API call successful!
Response text: Spain won Euro 2024, defeating England 2-1 in the final held at the Olympiastadion in Berlin. This victory marked Spain's record-breaking fourth UEFA European Championship title.

Nico Williams and Mikel Oyarzabal scored the goals for Spain in the final. England's loss meant they became the first team to lose back-to-back Euro finals. Spain had a dominant tournament, winning all seven of their games and scoring 15 goals, a new record for the most goals in a single European Championship.
Response type: <class 'google.genai.types.GenerateContentResponse'>
Candidate type: <class 'google.genai.types.Candidate'>
‚úÖ Found grounding_metadata attribute
Grounding metadata: grounding_chunks=[GroundingChunk(retrieved_context=None, web=GroundingChunkWeb(domain=None, title='youtube.com', uri='https://vertexaisearch.cloud.google.com/grounding-api-redirect/AUZIYQHryOYhBr33lPOzgpn7OHlf3UeiU1Q5Mvjlmyn14xHnuIkRcpvC6wdErvx1XMdOFaka_sSyZdZdGPx5EiH_T6v4FpDO2oLlVdm8Q

## 4. Test with Our Pipeline's Query

Now let's test with the same query our pipeline uses: "mRNA vaccines cause cancer"

In [6]:
try:
    # Test with our pipeline's query
    response2 = client.models.generate_content(
        model="gemini-2.5-flash", 
        contents="mRNA vaccines cause cancer",
        config=config,
    )
    
    print("‚úÖ Second API call successful!")
    print(f"Response text: {response2.text}")
    
    # Check for grounding metadata in various formats
    if hasattr(response2, 'candidates') and response2.candidates:
        candidate = response2.candidates[0]
        
        # Try different attribute names
        grounding_data = None
        if hasattr(candidate, 'grounding_metadata'):
            grounding_data = candidate.grounding_metadata
            print("‚úÖ Found grounding_metadata")
        elif hasattr(candidate, 'groundingMetadata'):  
            grounding_data = candidate.groundingMetadata
            print("‚úÖ Found groundingMetadata")
        
        if grounding_data:
            print("üîç Grounding Metadata Structure:")
            print(json.dumps(grounding_data, indent=2, default=str))
        else:
            print("‚ùå No grounding metadata found")
            print("Available candidate attributes:")
            for attr in dir(candidate):
                if not attr.startswith('_'):
                    print(f"  - {attr}: {type(getattr(candidate, attr))}")
                    
except Exception as e:
    print(f"‚ùå Error: {e}")
    import traceback
    traceback.print_exc()

‚úÖ Second API call successful!
Response text: Scientific evidence does not support the claim that mRNA vaccines cause cancer. Extensive studies, clinical trials, and monitoring systems have found no increased risk of cancer linked to mRNA vaccines.

Key points from scientific research and health organizations include:
*   **No Surge in Cancer Rates:** Large studies conducted in the U.S., U.K., and Europe have consistently shown no unusual rise in cancer rates attributable to vaccines. Cancer registries have remained stable.
*   **mRNA Safety Data:** Long-term safety data from clinical trials of mRNA vaccines indicate no increased cancer risk. Robust monitoring systems worldwide actively track health outcomes and have not identified cancer as a significant concern.
*   **Mechanism of Action:** mRNA vaccines do not contain live viruses or long-lived genetic material that can alter cellular DNA. The mRNA delivers instructions for cells to produce an antigen (like the SARS-CoV-2 spike pro

## 5. Compare with Current Implementation

Let's check our current gemini_search.py implementation and see what's different.

In [None]:
# Import our current implementation to test it
import sys
import os
sys.path.insert(0, os.path.dirname(os.getcwd()))

from src.gemini_search import gemini_grounded_search

print("Testing current implementation:")
try:
    results = gemini_grounded_search("mRNA vaccines cause cancer")
    print(f"‚úÖ Current implementation returned {len(results)} results")
    for i, result in enumerate(results):
        print(f"Result {i+1}:")
        print(f"  URL: {result.get('url', 'N/A')}")
        print(f"  Title: {result.get('title', 'N/A')}")
        print(f"  Snippet: {result.get('snippet', 'N/A')[:100]}...")
        print(f"  Domain: {result.get('domain', 'N/A')}")
        print()
except Exception as e:
    print(f"‚ùå Current implementation failed: {e}")
    import traceback
    traceback.print_exc()

## 6. Create Fixed Implementation

Based on our tests, let's create a corrected version that properly handles the grounding metadata.

In [None]:
def fixed_gemini_grounded_search(query: str, model: str = "gemini-2.5-flash") -> List[Dict[str, Any]]:
    """
    Fixed implementation of Gemini grounded search based on official documentation.
    Returns a list of dicts with keys: url, title, snippet, domain, score
    """
    results = []
    
    try:
        # Make the grounded search request
        response = client.models.generate_content(
            model=model,
            contents=query,
            config=config,
        )
        
        print(f"‚úÖ API Response received")
        print(f"Response text: {response.text[:200]}...")
        
        # Check if we have candidates
        if not hasattr(response, 'candidates') or not response.candidates:
            print("‚ùå No candidates in response")
            return results
            
        candidate = response.candidates[0]
        
        # Try to get grounding metadata with different attribute names
        grounding_metadata = None
        if hasattr(candidate, 'grounding_metadata'):
            grounding_metadata = candidate.grounding_metadata
        elif hasattr(candidate, 'groundingMetadata'):
            grounding_metadata = candidate.groundingMetadata
        
        if not grounding_metadata:
            print("‚ö†Ô∏è No grounding metadata found - returning basic result")
            return [{
                "url": "",
                "title": "No grounding data",
                "snippet": response.text or "No content available",
                "domain": "unknown",
                "score": 0.2,
            }]
        
        print(f"‚úÖ Found grounding metadata")
        
        # Extract grounding chunks according to documentation
        grounding_chunks = getattr(grounding_metadata, 'groundingChunks', []) or getattr(grounding_metadata, 'grounding_chunks', [])
        grounding_supports = getattr(grounding_metadata, 'groundingSupports', []) or getattr(grounding_metadata, 'grounding_supports', [])
        
        print(f"Found {len(grounding_chunks)} grounding chunks")
        print(f"Found {len(grounding_supports)} grounding supports")
        
        # Process each grounding chunk
        for idx, chunk in enumerate(grounding_chunks):
            try:
                # Get web info from chunk
                web_info = getattr(chunk, 'web', None)
                if not web_info:
                    continue
                    
                # Extract URL and title
                url = getattr(web_info, 'uri', None) or getattr(web_info, 'url', None)
                title = getattr(web_info, 'title', None)
                
                # Find corresponding snippet from grounding supports
                snippet = response.text  # Default to response text
                for support in grounding_supports:
                    chunk_indices = getattr(support, 'groundingChunkIndices', [])
                    if idx in chunk_indices:
                        segment = getattr(support, 'segment', None)
                        if segment:
                            snippet = getattr(segment, 'text', snippet)
                        break
                
                # Extract domain from URL
                domain = "unknown"
                if url:
                    try:
                        from urllib.parse import urlparse
                        domain = urlparse(url).netloc or "unknown"
                    except:
                        domain = "unknown"
                
                results.append({
                    "url": url or "",
                    "title": title or "No title",  
                    "snippet": snippet or "No snippet",
                    "domain": domain,
                    "score": 1.0,
                })
                
            except Exception as e:
                print(f"‚ùå Error processing chunk {idx}: {e}")
                continue
        
        print(f"‚úÖ Processed {len(results)} results successfully")
        return results
        
    except Exception as e:
        print(f"‚ùå Error in grounded search: {e}")
        import traceback
        traceback.print_exc()
        return []

# Test the fixed implementation
print("Testing fixed implementation:")
fixed_results = fixed_gemini_grounded_search("mRNA vaccines cause cancer")
print(f"Got {len(fixed_results)} results")
for i, result in enumerate(fixed_results):
    print(f"Result {i+1}:")
    print(f"  URL: {result['url']}")
    print(f"  Title: {result['title']}")  
    print(f"  Domain: {result['domain']}")
    print(f"  Snippet: {result['snippet'][:100]}...")
    print()

## 7. Analysis and Conclusions

Based on our testing, let's analyze what we found and document the needed fixes.

## Key Findings from Testing:

### ‚úÖ What's Working:
1. **API Authentication**: The `google.genai` client works with GEMINI_API_KEY
2. **Basic Setup**: Tool configuration with `types.Tool(google_search=types.GoogleSearch())` is correct
3. **API Calls**: Requests to `client.models.generate_content()` succeed

### ‚ùå Issues Found:
1. **Grounding Metadata Access**: The attribute names may be different than expected
2. **Response Structure**: The actual response structure might differ from documentation examples
3. **Error Handling**: Current implementation doesn't handle missing grounding data gracefully

### üîß Needed Fixes:
1. **Robust Attribute Access**: Check for both `grounding_metadata` and `groundingMetadata`
2. **Better Error Handling**: Graceful fallbacks when grounding data is missing
3. **Defensive Programming**: Handle different response formats and API variations

Run the cells above to see the actual API responses and determine the exact structure!