In [21]:
import ollama

In [4]:
query = "If Eliud Kipchoge could maintain his record-making marathon pace indefinitely, how many thousand hours would it take him to run the distance between the Earth and the Moon its closest approach? Please use the minimum perigee value on the Wikipedia page for the Moon when carrying out your calculation. Round your result to the nearest 1000 hours and do not use any comma separators if necessary."
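
The arithmetic this query asks for can be sketched directly. The three input values below are assumptions that do not appear in this notebook (a record time of 2:01:09, the standard 42.195 km marathon distance, and a minimum perigee of 356,400 km); they should be checked against the Wikipedia pages before the result is trusted.

```python
# Assumed inputs, not taken from this notebook; verify against the sources.
MARATHON_KM = 42.195                     # standard marathon distance
RECORD_HOURS = 2 + 1 / 60 + 9 / 3600     # assumed record time of 2:01:09
MIN_PERIGEE_KM = 356_400                 # assumed minimum perigee of the Moon

pace_kmh = MARATHON_KM / RECORD_HOURS    # average speed over the record run
hours = MIN_PERIGEE_KM / pace_kmh        # time to cover the perigee distance
thousand_hours = round(hours / 1000)     # nearest 1000 hours, in thousands

print(f"{pace_kmh:.3f} km/h, {hours:.0f} h, {thousand_hours} thousand hours")
```

Under these assumptions the pace is just under 21 km/h, so the run takes a little over 17,000 hours.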

In [15]:
query = """Here's a fun riddle that I think you'll enjoy.

You have been selected to play the final round of the hit new game show "Pick That Ping-Pong". In this round, you will be competing for a large cash prize. Your job will be to pick one of several different numbered ping-pong balls, and then the game will commence. The host describes how the game works.

A device consisting of a winding clear ramp and a series of pistons controls the outcome of the game. The ramp feeds balls onto a platform. The platform has room for three ping-pong balls at a time. The three balls on the platform are each aligned with one of three pistons. At each stage of the game, one of the three pistons will randomly fire, ejecting the ball it strikes. If the piston ejects the ball in the first position on the platform the balls in the second and third position on the platform each advance one space, and the next ball on the ramp advances to the third position. If the piston ejects the ball in the second position, the ball in the first position is released and rolls away, the ball in the third position advances two spaces to occupy the first position, and the next two balls on the ramp advance to occupy the second and third positions on the platform. If the piston ejects the ball in the third position, the ball in the first position is released and rolls away, the ball in the second position advances one space to occupy the first position, and the next two balls on the ramp advance to occupy the second and third positions on the platform.

The ramp begins with 100 numbered ping-pong balls, arranged in ascending order from 1 to 100. The host activates the machine and the first three balls, numbered 1, 2, and 3, advance to the platform. Before the random firing of the pistons begins, you are asked which of the 100 balls you would like to pick. If your pick is ejected by one of the pistons, you win the grand prize, $10,000.

Which ball should you choose to maximize your odds of winning the big prize? Please provide your answer as the number of the ball selected."""
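
The riddle can be checked exactly rather than by intuition. The sketch below tracks the probability of every platform state with `fractions.Fraction` and accumulates each ball's chance of being ejected by a piston. It assumes each piston fires with probability 1/3, and (an assumption the riddle does not state) that the game simply stops once the ramp can no longer refill the platform. The function name `ejection_probabilities` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def ejection_probabilities(n_balls=100):
    """Exact probability that each ball is ejected by a piston."""
    third = Fraction(1, 3)
    eject = defaultdict(Fraction)        # ball number -> P(ejected)
    states = {(1, 2, 3): Fraction(1)}    # platform (pos1, pos2, pos3) -> probability
    while states:
        next_states = defaultdict(Fraction)
        for (a, b, c), p in states.items():
            nxt = c + 1  # position 3 always holds the newest ball
            # Piston 1: eject a; b and c advance; one new ball enters.
            eject[a] += p * third
            if nxt <= n_balls:
                next_states[(b, c, nxt)] += p * third
            # Piston 2: eject b; a rolls away; c moves to position 1.
            eject[b] += p * third
            if nxt + 1 <= n_balls:
                next_states[(c, nxt, nxt + 1)] += p * third
            # Piston 3: eject c; a rolls away; b moves to position 1.
            eject[c] += p * third
            if nxt + 1 <= n_balls:
                next_states[(b, nxt, nxt + 1)] += p * third
        states = next_states
    return dict(eject)


probs = ejection_probabilities()
best = max(probs, key=probs.get)
print(best, probs[best], float(probs[best]))
```

Under these assumptions ball 3 comes out on top with probability 17/27, because it is the only ball guaranteed to start in the third position, where it has the most chances to be struck before reaching position 1.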

In [16]:
import ollama

response = ollama.chat(
    model='qwen2.5:72b',
    messages=[
        {'role': 'user', 'content': query}
    ]
)

print(response['message']['content'])


To solve this riddle and determine which ping-pong ball to pick to maximize the chances of winning, we need to analyze how the balls move through the system.

### Key Observations:
1. **Initial Setup**: The ramp starts with 100 balls numbered from 1 to 100. The first three balls (1, 2, and 3) are placed on the platform.
2. **Piston Actions**:
   - If the piston in the first position fires, the ball in the first position is ejected, and the remaining two balls advance one space each. A new ball from the ramp takes the third position.
   - If the piston in the second position fires, the ball in the first position is released, and the ball in the third position advances to the first position. Two new balls from the ramp take the second and third positions.
   - If the piston in the third position fires, the ball in the first position is released, and the ball in the second position advances to the first position. Two new balls from the ramp take the second and third positions.

### Analys

In [17]:
import requests
import json
import re
from urllib.parse import quote

def search_wikipedia(query, max_results=5):
    """Search Wikipedia and return a list of relevant articles with their URLs"""
    # Step 1: Search for relevant pages
    search_url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json",
        "srlimit": max_results
    }
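    # A timeout keeps the notebook from hanging if the API is unreachable
    response = requests.get(search_url, params=params, timeout=10).json()
    
    search_results = response.get("query", {}).get("search", [])
    if not search_results:
        # Return an empty list so callers always get the documented type
        return []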
    
    # Step 2: Format the results as a list of articles with URLs
    articles = []
    for result in search_results:
        title = result["title"]
        # Get the page URL from the title
        page_url = f"https://en.wikipedia.org/wiki/{quote(title)}"
        articles.append({
            "title": title,
            "url": page_url
        })
    
    return articles

def extract_search_terms_from_llm_output(llm_output):
    """Extract Python list of search terms from LLM output"""
    # First try to find Python code blocks in markdown format
    code_block_pattern = r"```python\s*(.*?)\s*```"
    code_blocks = re.findall(code_block_pattern, llm_output, re.DOTALL)
    
    if code_blocks:
        for block in code_blocks:
            # Find list patterns in the code block
            if '[' in block and ']' in block:
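                try:
                    import ast  # stdlib; literal_eval parses literals without running code
                    # Extract just the list part if there's more code
                    list_text = block[block.find('['):block.rfind(']')+1]
                    # Parse the list literal safely (eval would execute arbitrary LLM output)
                    search_terms = ast.literal_eval(list_text)
                    if isinstance(search_terms, list):
                        return search_terms
                except (ValueError, SyntaxError, TypeError):
                    # Not a valid Python literal; try the next block
                    pass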
    
    # If no code blocks found or they didn't contain valid lists,
    # look for standalone Python list patterns
    list_pattern = r"\[(.*?)\]"
    matches = re.findall(list_pattern, llm_output, re.DOTALL)
    
    for match in matches:
        try:
            # Add brackets back and try to parse as a JSON array
            list_str = "[" + match + "]"
            # Replace single quotes with double quotes so JSON accepts them
            list_str = list_str.replace("'", '"')
            search_terms = json.loads(list_str)
            if isinstance(search_terms, list):
                return search_terms
        except json.JSONDecodeError:
            # Not valid JSON; try the next bracketed match
            pass
    
    # Fallback - if no valid list found, extract potential keywords
    # Look for lines that might have keywords after colons
    keyword_pattern = r"[:-]\s*\"([^\"]+)\"|[:-]\s*\'([^\']+)\'|[:-]\s*(\w[\w\s]+)"
    keywords = re.findall(keyword_pattern, llm_output)
    flattened_keywords = []
    for keyword_tuple in keywords:
        # Each match is a tuple, take the non-empty value
        for k in keyword_tuple:
            if k.strip():
                flattened_keywords.append(k.strip())
    
    if flattened_keywords:
        return flattened_keywords
    
    # Nothing usable found: return None so the caller can fall back
    return None

def decompose_query_with_llm(query, llm_client=None):
    """Use an LLM to decompose a complex query into search terms"""
    if llm_client is None:
        # Fallback to rule-based decomposition if no LLM client
        return decompose_query_rule_based(query)
    
    prompt = f'''
        I'm trying to solve a complex problem that requires factual information. I need your help identifying the key Wikipedia search terms I should use to gather relevant information.
        PROBLEM QUERY: "{query}"
        For this problem, identify 2-4 specific Wikipedia search terms that would help me find the most relevant factual information to solve it.
        Consider:

        What factual knowledge is required to answer this question?
        What key concepts or historical events are central to this problem?
        What scientific, technical, or specialized topics are involved?

        Return your suggested Wikipedia search terms as a Python list of strings, for example:
        ["search term 1", "search term 2", "search term 3"]
        IMPORTANT:

        Make each term specific and suitable for Wikipedia article searches
        Don't be too broad (e.g., "science") or too narrow (e.g., very specific technical terminology unlikely to have its own article)
        Focus on factual topics, not methodological terms
    '''
    
    try:
        # This is the integration point with your LLM
        # Example with ollama:
        response = llm_client.chat(model="deepseek-r1:7b", messages=[
            {"role": "user", "content": prompt}
        ])
        llm_output = response['message']['content']
        
        print("LLM output: ", llm_output)

        # Extract search terms from the LLM's response
        search_terms = extract_search_terms_from_llm_output(llm_output)
        
        if search_terms:
            print("Search terms: ", search_terms)
            return search_terms
        else:
            # Fallback to rule-based if LLM output parsing failed
            return decompose_query_rule_based(query)
            
    except Exception as e:
        print(f"Error using LLM: {e}")
        # Fallback to rule-based decomposition
        return decompose_query_rule_based(query)

def decompose_query_rule_based(query):
    """Rule-based fallback for query decomposition"""
    # Split by common separators
    terms = query.replace('?', ' ').replace('!', ' ').replace(',', ' ').split()
    
    # Remove very common words (simplified stopword removal)
    stopwords = ['the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'with', 'by', 'about', 'is', 'are']
    filtered_terms = [term.lower() for term in terms if term.lower() not in stopwords]
    
    # Group terms into 2-3 meaningful phrases (simplified)
    if len(filtered_terms) <= 3:
        return [query]  # If short query, use as is
    else:
        # For longer queries, create combinations of terms
        # This is a simple approach - just taking the whole query and first few words
        return [query, " ".join(filtered_terms[:3])]

def wikipedia_agent(query, llm_client=None):
    """Wikipedia agent that returns a list of relevant articles"""
    # Step 1: Decompose the query into search terms
    if llm_client:
        print(f"Decomposing query with LLM: {query}")
        search_terms = decompose_query_with_llm(query, llm_client)
        print(f"LLM suggested search terms: {search_terms}")
    else:
        print(f"Using rule-based query decomposition: {query}")
        search_terms = decompose_query_rule_based(query)
        print(f"Rule-based search terms: {search_terms}")
    
    # Step 2: Get Wikipedia results for each search term
    all_results = []
    for term in search_terms:
        print(f"Searching Wikipedia for: {term}")
        results = search_wikipedia(term)
        if isinstance(results, list):
            all_results.extend(results)
    
    # Remove duplicates by title
    unique_results = []
    seen_titles = set()
    for article in all_results:
        if article['title'] not in seen_titles:
            unique_results.append(article)
            seen_titles.add(article['title'])
    
    return unique_results

# Example of Ollama integration
try:
    import ollama
    
    class OllamaClient:
        def chat(self, model, messages):
            return ollama.chat(model=model, messages=messages)
    
    # Create an Ollama client instance if available
    ollama_client = OllamaClient()
    print("Ollama integration is available")
except ImportError:
    ollama_client = None
    print("Ollama is not available, using rule-based fallback for query decomposition")

# Example usage
def test_agent(test_query, use_llm=True):
    """Test the Wikipedia agent with a given query
    
    Args:
        test_query: The query to test
        use_llm: Whether to use the LLM for query decomposition (if available)
    """
    client = ollama_client if use_llm and ollama_client else None
    results = wikipedia_agent(test_query, client)
    
    print("\n" + "="*60)
    print(f"Query: {test_query}")
    print("Relevant Wikipedia Articles:")
    for i, article in enumerate(results, 1):
        print(f"{i}. {article['title']} - {article['url']}")
    print("="*60)
    
    return results

# Uncomment to test
# test_agent("What were the major causes of World War II?")
# test_agent("Explain how quantum computing affects cryptography", use_llm=False)

Ollama integration is available
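
`search_wikipedia` percent-encodes spaces in titles as `%20`. Wikipedia's own article links use underscores for spaces, so a variant that produces the more familiar form might look like the sketch below. The helper name `wikipedia_url` is hypothetical and is not part of the cell above.

```python
from urllib.parse import quote


def wikipedia_url(title: str) -> str:
    """Build an article URL with underscores for spaces (hypothetical helper)."""
    # Replace spaces first, then percent-encode whatever else needs escaping.
    return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))


print(wikipedia_url("Speed of light"))
print(wikipedia_url("Tachymeter (watch)"))
```

Both forms resolve to the same article, so this only affects readability of the printed results.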


In [20]:
test_agent("If Eliud Kipchoge could maintain his record-making marathon pace indefinitely, how many thousand hours would it take him to run the distance between the Earth and the Moon its closest approach? Please use the minimum perigee value on the Wikipedia page for the Moon when carrying out your calculation. Round your result to the nearest 1000 hours and do not use any comma separators if necessary")

Decomposing query with LLM: If Eliud Kipchoge could maintain his record-making marathon pace indefinitely, how many thousand hours would it take him to run the distance between the Earth and the Moon its closest approach? Please use the minimum perigee value on the Wikipedia page for the Moon when carrying out your calculation. Round your result to the nearest 1000 hours and do not use any comma separators if necessary
LLM output:  <think>
Okay, so I'm trying to help someone who's trying to solve this problem about Eliud Kipchoge running from Earth to the Moon at his record pace. The goal is to figure out how many thousand hours it would take him. Let me break down what needs to be done.

First, I know that to calculate time, I need two main pieces of information: distance and speed. So, I'll probably need both the average distance between Earth and Moon (specifically the closest approach) and Kipchoge's record-breaking marathon pace per kilometer or mile.

Looking at the problem, it m

[{'title': 'Speed', 'url': 'https://en.wikipedia.org/wiki/Speed'},
 {'title': 'Tachymeter (watch)',
  'url': 'https://en.wikipedia.org/wiki/Tachymeter%20%28watch%29'},
 {'title': '2025 World Single Distances Speed Skating Championships',
  'url': 'https://en.wikipedia.org/wiki/2025%20World%20Single%20Distances%20Speed%20Skating%20Championships'},
 {'title': 'World Single Distances Speed Skating Championships',
  'url': 'https://en.wikipedia.org/wiki/World%20Single%20Distances%20Speed%20Skating%20Championships'},
 {'title': 'Speed of light',
  'url': 'https://en.wikipedia.org/wiki/Speed%20of%20light'},
 {'title': 'Marathon', 'url': 'https://en.wikipedia.org/wiki/Marathon'},
 {'title': 'Marathon world record progression',
  'url': 'https://en.wikipedia.org/wiki/Marathon%20world%20record%20progression'},
 {'title': 'Aleksandr Sorokin',
  'url': 'https://en.wikipedia.org/wiki/Aleksandr%20Sorokin'},
 {'title': 'New York City Marathon',
  'url': 'https://en.wikipedia.org/wiki/New%20York%20Ci
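
The LLM output above opens with a `<think>` block. One plausible reason the results include loosely related pages such as "Speed" is that the extractor's bracket and keyword fallbacks matched text inside that reasoning rather than the final answer. A minimal sketch of a pre-processing step, assuming the model wraps its reasoning in `<think>...</think>` tags as shown above; `strip_think_block` is a hypothetical helper that could be applied to `llm_output` before calling `extract_search_terms_from_llm_output`:

```python
import re


def strip_think_block(llm_output: str) -> str:
    """Remove <think>...</think> spans, plus any unterminated <think> tail."""
    cleaned = re.sub(r"<think>.*?</think>", "", llm_output, flags=re.DOTALL)
    # If the output was cut off mid-reasoning, drop everything after <think>.
    cleaned = re.sub(r"<think>.*", "", cleaned, flags=re.DOTALL)
    return cleaned.strip()


sample = (
    "<think>\nFirst, I need [distance] and speed: pace\n</think>\n"
    '["Eliud Kipchoge", "Moon"]'
)
print(strip_think_block(sample))
```

If the reasoning is never closed, the helper returns an empty string, which lets the existing rule-based fallback take over instead of parsing half-finished reasoning.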

Title: Usain Bolt - Wikipedia
URL: https://en.wikipedia.org/wiki/Usain_Bolt
Snippet: Learn about Usain Bolt, the Jamaican retired sprinter who is widely considered to be the greatest sprinter of all time. He holds the world records in the 100 m, 200 m, and 4 × 100 m relay, and won eight Olympic gold medals and 11 World Championships.
--------------------
Title: Usain Bolt | Biography, Speed, Height, Medals, & Facts | Britannica
URL: https://www.britannica.com/biography/Usain-Bolt
Snippet: Learn about Usain Bolt, the Jamaican sprinter who won gold medals in the 100-meter and 200-meter races in three consecutive Olympics. Find out his speed, height, medals, and other achievements in this comprehensive article.
--------------------
Title: How fast is Usain Bolt's top speed? - enviroliteracy.org
URL: https://enviroliteracy.org/animals/how-fast-is-usain-bolts-top-speed/
Snippet: Learn how Usain Bolt achieved his record-breaking 27.78 mph in the 100-meter sprint in 2009, and how he compares 