# Tutorial 2: Building a Wikipedia Search & Summarization MCP Server

## Introduction

In this tutorial, we'll build a real-world MCP server that integrates with Wikipedia's API to provide search and summarization capabilities. This demonstrates how MCP servers can expose external data sources and APIs as standardized tools for LLMs.

## Learning Objectives

- Integrate external APIs with MCP servers
- Implement error handling for network requests
- Create tools that process and transform data
- Build resources for accessing structured content
- Use MCP for real-world data retrieval

## Architecture Overview

Our Wikipedia MCP server will expose three tools, each of which calls the Wikipedia API:

```mermaid
graph TD
    A[MCP Client] --> B[Wikipedia Server]
    B --> C[search_wikipedia]
    B --> D[get_article_summary]
    B --> E[get_article_content]
    C --> F[Wikipedia API]
    D --> F
    E --> F
    
    subgraph "Tools"
        C
        D
        E
    end
```

### Tools We'll Implement

1. **search_wikipedia**: Search for articles by keyword
2. **get_article_summary**: Get a brief summary of an article
3. **get_article_content**: Retrieve article text, truncated to a maximum length
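
As a rough sketch of how these three tools might be declared, the snippet below uses a plain-Python registry as a stand-in for whatever registration decorator your MCP framework provides. The `tool` decorator, the `TOOLS` dict, and the parameter names and defaults (`limit`, `max_length`) are assumptions for illustration, not the workshop's actual code, and the function bodies are left as placeholders.

```python
from typing import Callable, Dict, List

# Hypothetical registry standing in for an MCP framework's tool registration.
TOOLS: Dict[str, Callable] = {}


def tool(func: Callable) -> Callable:
    """Register a function under its own name, mimicking a tool decorator."""
    TOOLS[func.__name__] = func
    return func


@tool
def search_wikipedia(query: str, limit: int = 5) -> List[str]:
    """Search Wikipedia for articles matching a keyword."""
    raise NotImplementedError("placeholder: call the Wikipedia search API here")


@tool
def get_article_summary(title: str) -> str:
    """Return a brief summary of the named article."""
    raise NotImplementedError("placeholder: fetch the article summary here")


@tool
def get_article_content(title: str, max_length: int = 5000) -> str:
    """Return the article text, truncated to max_length characters."""
    raise NotImplementedError("placeholder: fetch and truncate the text here")


# The docstring doubles as the description an LLM would see for each tool.
for name, func in TOOLS.items():
    print(f"{name}: {func.__doc__}")
```

Keeping each function to one job and one clear docstring is what lets a client pick the right tool from its description alone.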


## Exercise 1: Explore Wikipedia API

Before building our MCP server, let's understand how the Wikipedia API works.

### 🎯 Task: Test Wikipedia API integration

**File to examine:** `solution/servers/wikipedia_api_demo.py`

The cells below call the API directly: first a title search, then an article lookup.

In [None]:
# Let's test the Wikipedia API directly to understand its capabilities
import wikipediaapi
import requests
from typing import List, Dict, Any

# Initialize Wikipedia API client
wiki = wikipediaapi.Wikipedia(
    language='en',
    extract_format=wikipediaapi.ExtractFormat.WIKI,
    user_agent='MCP-Workshop/1.0 (educational-purpose)'
)

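# Test basic search
def test_wikipedia_search(query: str) -> List[str]:
    """Test Wikipedia search functionality."""
    try:
        # Use the MediaWiki REST API title-search endpoint
        search_url = "https://en.wikipedia.org/w/rest.php/v1/search/title"
        params = {
            'q': query,
            'limit': 5
        }
        # Identify the client, and set a timeout so a stalled request cannot hang the cell
        headers = {'User-Agent': 'MCP-Workshop/1.0 (educational-purpose)'}
        response = requests.get(search_url, params=params, headers=headers, timeout=10)
        response.raise_for_status()
        
        data = response.json()
        return [page['title'] for page in data.get('pages', [])]
    except (requests.RequestException, ValueError, KeyError) as e:
        # Network failure, HTTP error status, invalid JSON, or an unexpected response shape
        print(f"Error searching Wikipedia: {e}")
        return []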

# Test the search
print("Testing Wikipedia search for 'artificial intelligence':")
results = test_wikipedia_search("artificial intelligence")
for i, title in enumerate(results, 1):
    print(f"{i}. {title}")

In [None]:
# Test getting article content
def test_get_article(title: str) -> Dict[str, Any]:
    """Test getting article content."""
    try:
        page = wiki.page(title)
        if not page.exists():
            return {"error": f"Article '{title}' not found"}
        
        return {
            "title": page.title,
            "summary": page.summary[:500] + "..." if len(page.summary) > 500 else page.summary,
            "url": page.fullurl,
            "content_length": len(page.text)
        }
    except Exception as e:
        return {"error": f"Error retrieving article: {e}"}

# Test article retrieval
print("\nTesting article retrieval for 'Artificial Intelligence':")
article_info = test_get_article("Artificial Intelligence")
if "error" not in article_info:
    print(f"Title: {article_info['title']}")
    print(f"URL: {article_info['url']}")
    print(f"Content Length: {article_info['content_length']} characters")
    print(f"Summary: {article_info['summary']}")
else:
    print(f"Error: {article_info['error']}")
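
One way to exercise search logic without touching the network is to pass the HTTP call in as a parameter. The sketch below assumes the search response is a JSON object with a `pages` list whose entries carry a `title`, which is the shape the search cell above reads. `search_titles`, `fetch_json`, and `fake_fetch` are hypothetical names, and the endpoint URL is an assumption that you should replace with whichever search endpoint your server uses.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, List

# Assumed endpoint; swap in the search URL your server actually uses.
SEARCH_URL = "https://en.wikipedia.org/w/rest.php/v1/search/title"
USER_AGENT = "MCP-Workshop/1.0 (educational-purpose)"


def fetch_json(url: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Fetch a URL and decode the JSON body (performs real network I/O)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def search_titles(
    query: str,
    limit: int = 5,
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> List[str]:
    """Return matching article titles, or an empty list on any failure."""
    query = query.strip()
    if not query:
        return []  # nothing to search for, so skip the request entirely
    url = f"{SEARCH_URL}?{urllib.parse.urlencode({'q': query, 'limit': limit})}"
    try:
        data = fetch(url)
    except (OSError, ValueError) as exc:  # network errors and undecodable JSON
        print(f"Error searching Wikipedia: {exc}")
        return []
    # Ignore entries that lack a title instead of raising KeyError.
    return [page["title"] for page in data.get("pages", []) if "title" in page]


def fake_fetch(url: str) -> Dict[str, Any]:
    """Offline stand-in that returns a canned response."""
    return {"pages": [{"title": "Artificial intelligence"}, {"id": 42}]}


print(search_titles("artificial intelligence", fetch=fake_fetch))
print(search_titles("   ", fetch=fake_fetch))
```

Injecting `fetch` keeps the parsing and error paths testable offline, while the default still performs a real request.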

## Exercise 2: Build the Wikipedia MCP Server

Now let's build our Wikipedia MCP server step by step.

### 🎯 Task: Implement Wikipedia search and content tools

**File to edit:** `workshop/servers/wikipedia_server.py`

Complete the TODOs to implement all Wikipedia tools.

In [None]:
# Let's examine the workshop server structure
import sys
import os

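# Add the project root (the parent of workshop/) to the path so that the
# package-style import `workshop.servers.wikipedia_server` can resolve
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)

try:
    # Import the workshop server (if it exists)
    from workshop.servers.wikipedia_server import mcp
    
    # Note: if your MCP framework defines list_tools() as a coroutine,
    # use `tools = await mcp.list_tools()` here instead
    tools = mcp.list_tools()
    print(f"Wikipedia Server '{mcp.name}' loaded")
    print(f"Available tools: {len(tools)}")
    
    for tool in tools:
        print(f"  - {tool.name}: {tool.description}")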

except ImportError as e:
    print(f"Workshop server not ready yet: {e}")
    print("Please complete workshop/servers/wikipedia_server.py first")
except Exception as e:
    print(f"Error loading server: {e}")

## Exercise 3: Test Your Wikipedia Server

Let's test each tool individually to ensure they work correctly.

In [None]:
# Test search functionality
try:
    from workshop.servers.wikipedia_server import search_wikipedia
    
    print("Testing search_wikipedia tool:")
    search_results = search_wikipedia("machine learning")
    print(f"Search results for 'machine learning': {search_results}")
    
except ImportError:
    print("search_wikipedia not implemented yet - complete the TODOs first")
except Exception as e:
    print(f"Error testing search: {e}")

In [None]:
# Test article summary functionality
try:
    from workshop.servers.wikipedia_server import get_article_summary
    
    print("Testing get_article_summary tool:")
    summary = get_article_summary("Python (programming language)")
    print(f"Summary length: {len(summary)} characters")
    print(f"Summary preview: {summary[:200]}...")
    
except ImportError:
    print("get_article_summary not implemented yet - complete the TODOs first")
except Exception as e:
    print(f"Error testing summary: {e}")

In [None]:
# Test full content functionality
try:
    from workshop.servers.wikipedia_server import get_article_content
    
    print("Testing get_article_content tool:")
    content = get_article_content("Model Context Protocol", max_length=1000)
    print(f"Content length: {len(content)} characters")
    print(f"Content preview: {content[:300]}...")
    
except ImportError:
    print("get_article_content not implemented yet - complete the TODOs first")
except Exception as e:
    print(f"Error testing content: {e}")

## Exercise 4: Error Handling and Edge Cases

Let's test how our server handles various error conditions.

In [None]:
# Test error handling
print("Testing error handling scenarios:")

try:
    from workshop.servers.wikipedia_server import search_wikipedia, get_article_summary, get_article_content
    
    # Test empty search
    print("\n1. Empty search query:")
    empty_search = search_wikipedia("")
    print(f"Result: {empty_search}")
    
    # Test non-existent article
    print("\n2. Non-existent article:")
    fake_article = get_article_summary("ThisArticleDoesNotExistAnywhere123456")
    print(f"Result: {fake_article[:100]}...")
    
    # Test very long content
    print("\n3. Content length limiting:")
    long_content = get_article_content("List of countries", max_length=100)
    print(f"Limited content length: {len(long_content)} characters")
    print(f"Content: {long_content}")
    
except ImportError:
    print("Tools not implemented yet - complete all TODOs first")
except Exception as e:
    print(f"Error during testing: {e}")
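
The length-limiting check above expects `get_article_content` to respect `max_length`. A minimal sketch of one way to do that, cutting at a word boundary and reserving room for an ellipsis, is shown below. `truncate_text` is a hypothetical helper, not part of the workshop code.

```python
def truncate_text(text: str, max_length: int, suffix: str = "...") -> str:
    """Return text no longer than max_length, preferring a word boundary."""
    if max_length <= 0:
        return ""
    if len(text) <= max_length:
        return text
    if max_length <= len(suffix):
        return text[:max_length]  # no room for the suffix, so hard-cut
    cut = text[: max_length - len(suffix)]
    # Back up to the last space so a word is not split, if there is one.
    boundary = cut.rfind(" ")
    if boundary > 0:
        cut = cut[:boundary]
    return cut.rstrip() + suffix


sample = "The quick brown fox jumps over the lazy dog"
shortened = truncate_text(sample, 20)
print(f"{len(shortened)} characters: {shortened}")
```

Reserving space for the suffix inside the limit means the result never exceeds `max_length`, which is the property the edge-case test checks.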

## Exercise 5: Running the Server

Once you've completed all implementations, you can run the server as a standalone MCP server.

**To run the server:**
```bash
python workshop/servers/wikipedia_server.py
```

**Or test with the complete solution:**
```bash
python solution/servers/wikipedia_server.py
```

In [None]:
# Final validation - compare with solution
import os
import sys

print("Comparing your implementation with the solution:")

# Make sure the project root (the parent of solution/ and workshop/) is importable
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)

# Check solution server
solution_tools = None
try:
    from solution.servers.wikipedia_server import mcp as solution_mcp
    solution_tools = solution_mcp.list_tools()
    print(f"\nSolution server has {len(solution_tools)} tools:")
    for tool in solution_tools:
        print(f"  - {tool.name}: {tool.description}")
        
except ImportError:
    print("Solution server not available yet")

# Compare with workshop implementation
try:
    from workshop.servers.wikipedia_server import mcp as workshop_mcp
    workshop_tools = workshop_mcp.list_tools()
    print(f"\nYour server has {len(workshop_tools)} tools:")
    for tool in workshop_tools:
        print(f"  - {tool.name}: {tool.description}")
        
    if solution_tools is None:
        print("\nSolution server unavailable, so the tool counts could not be compared.")
    elif len(workshop_tools) == len(solution_tools):
        print("\n✅ Great! You've implemented all the required tools.")
    else:
        print("\n⚠️ Some tools may still need implementation.")
        
except ImportError:
    print("\nWorkshop server not ready - complete the TODOs first")

## Key Concepts Learned

1. **External API Integration**: How to wrap external APIs as MCP tools
2. **Error Handling**: Proper error handling for network requests and missing data
3. **Data Processing**: Transforming and truncating data for LLM consumption
4. **Tool Design**: Creating focused, single-purpose tools
5. **Resource Management**: Handling rate limits and content size limits
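
For the rate-limit side of resource management, a simple client-side throttle is often enough in a workshop setting. The sketch below enforces a minimum gap between calls. The `Throttle` class and its injectable `clock` and `sleep` parameters are hypothetical, and the 0.05-second interval is an arbitrary value chosen for the demonstration, not a limit published by Wikipedia.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum interval between calls (client-side rate limiting)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds waited."""
        waited = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (self._clock() - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_call = self._clock()
        return waited


# Call wait() before each outgoing request.
throttle = Throttle(min_interval=0.05)
for title in ["Alan Turing", "Ada Lovelace"]:
    delay = throttle.wait()
    print(f"Would request '{title}' after waiting {delay:.2f}s")
```

Passing `clock` and `sleep` in as parameters lets tests substitute fakes, so the timing logic can be checked without real delays.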

## MCP Server Best Practices

```mermaid
graph TD
    A[MCP Server Design] --> B[Single Responsibility]
    A --> C[Error Handling]
    A --> D[Input Validation]
    A --> E[Output Formatting]
    
    B --> F[One tool per function]
    C --> G[Graceful degradation]
    D --> H[Type checking]
    E --> I[Consistent format]
```

### Design Principles

1. **Focused Tools**: Each tool should have a single, clear purpose
2. **Robust Error Handling**: Always handle network failures and missing data
3. **Input Validation**: Validate and sanitize all inputs
4. **Consistent Outputs**: Return data in predictable formats
5. **Documentation**: Clear descriptions for LLM understanding
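
To make principles 2 through 4 concrete, here is a small sketch of a wrapper that validates input and always returns a string, whether the underlying call succeeds or fails. The names `validate_title`, `run_tool`, and `fake_lookup`, the 255-character limit, and the `Error:` prefix are assumptions for illustration only.

```python
from typing import Callable


def validate_title(title: str, max_chars: int = 255) -> str:
    """Return a cleaned title or raise ValueError if it is unusable."""
    if not isinstance(title, str):
        raise ValueError("title must be a string")
    cleaned = " ".join(title.split())  # collapse runs of whitespace
    if not cleaned:
        raise ValueError("title must not be empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"title must be at most {max_chars} characters")
    return cleaned


def run_tool(title: str, lookup: Callable[[str], str]) -> str:
    """Call lookup with a validated title and report failures as 'Error: ...' text."""
    try:
        return lookup(validate_title(title))
    except ValueError as exc:
        return f"Error: invalid input ({exc})"
    except LookupError as exc:
        return f"Error: not found ({exc})"
    except OSError as exc:
        return f"Error: network problem ({exc})"


def fake_lookup(title: str) -> str:
    """Offline stand-in for a Wikipedia call that knows a single article."""
    articles = {"Python (programming language)": "Python is a programming language."}
    if title not in articles:
        raise KeyError(title)
    return articles[title]


print(run_tool("  Python   (programming language) ", fake_lookup))
print(run_tool("", fake_lookup))
print(run_tool("ThisArticleDoesNotExistAnywhere123456", fake_lookup))
```

Because every path returns text in the same shape, an LLM client can tell success from failure by a single prefix rather than by catching exceptions it never sees.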

## Workshop Completion Checklist

- [ ] Completed Wikipedia API exploration and testing
- [ ] Implemented `search_wikipedia` tool with proper error handling
- [ ] Implemented `get_article_summary` tool with content truncation
- [ ] Implemented `get_article_content` tool with length limiting
- [ ] Tested all tools with various inputs and edge cases
- [ ] Verified server can run as standalone MCP server
- [ ] Ready to move to client development tutorial

## Next Steps

In Tutorial 3, we'll build MCP clients (both CLI and web-based) that can discover and use our Wikipedia server tools, demonstrating the full MCP ecosystem in action.