# Ollama Deep Researcher (DuckDuckGo Version)

This notebook implements a local research agent with LangChain and Ollama, but **uses DuckDuckGo** instead of Tavily for free web search (no API key needed).

## Prerequisites
1. Python 3.10+.
2. [Ollama](https://github.com/ollama/ollama) installed and running locally with a model (e.g., `ollama pull deepseek-r1:1.5b`, the model used in the code below).
3. `langchain`, `langchain-community`, `duckduckgo-search`, and `langchain-ollama` installed.

## Installation
```bash
pip install langchain langchain-community duckduckgo-search langchain-ollama
```

## Explanation
1. **Local LLM** (Ollama) for generating queries, summarizing info, and refining research.
2. **DuckDuckGoSearchAPIWrapper** for web search; it can return either a single string or structured results.

## Research Flow
1. Prompt the LLM to generate an initial search query from the user's topic.
2. Retrieve search results via DuckDuckGo.
3. Summarize findings.
4. Reflect and refine the query.
5. Output final summary with sources.
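
The five steps above can be sketched as one function. This is a simplified outline, not the notebook's actual cells: `research` is a hypothetical name, the prompts are shortened placeholders, and `llm` and `search` are passed in as plain callables so the flow can be tried with stubs before wiring in Ollama and DuckDuckGo.

```python
from typing import Callable


def research(
    topic: str,
    llm: Callable[[str], str],
    search: Callable[[str], list[dict]],
    max_loops: int = 2,
) -> tuple[str, list[str]]:
    """Run query -> search -> summarize -> reflect, returning (summary, sources)."""

    def summarize(results: list[dict], previous: str = "") -> str:
        snippets = "\n".join(r.get("snippet", "") for r in results)
        return llm(
            f"Summarize for topic '{topic}'.\n"
            f"Previous summary: {previous}\nNew text:\n{snippets}"
        )

    # Steps 1-3: initial query, search, first summary
    query = llm(f"Write one web search query for: {topic}")
    results = search(query)
    sources = [r["link"] for r in results if r.get("link")]
    summary = summarize(results)

    # Step 4: reflect and refine until the LLM answers NONE or results run out
    for _ in range(max_loops):
        query = llm(f"Given this summary, reply with one follow-up query or NONE:\n{summary}")
        if query.strip().upper() == "NONE":
            break
        results = search(query)
        if not results:
            break
        sources += [r["link"] for r in results if r.get("link")]
        summary = summarize(results, previous=summary)

    # Step 5: final summary plus de-duplicated sources in first-seen order
    return summary, list(dict.fromkeys(sources))
```

Injecting the two callables keeps the control flow independent of any particular model or search backend.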


In [44]:
pip install langchain langchain-community langchain-openai




## 1. Initialize LLM and DuckDuckGo

In [None]:
import re  # used later to clean up the LLM's query output

from langchain_community.llms import Ollama  # Ollama LLM wrapper from the LangChain community package
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper


# Initialize the LangChain Ollama LLM with DeepSeek
deepseek_model = Ollama(model="deepseek-r1:1.5b")  

# Define the LLM function (keeping the same interface for compatibility)
def llm(prompt: str) -> str:
    """
    Uses LangChain + DeepSeek (via Ollama) to generate completions.
    This replaces the OpenRouter-based llm().
    """
    # Pass the user prompt directly into the LLM call
    response = deepseek_model.invoke(prompt)

    # Ollama via LangChain returns a plain text response directly
    return response.strip()

# Example usage: read the research topic (user_prompt is reused by the later cells)
if __name__ == "__main__":
    user_prompt = input("Enter your question or research topic: ")
    response = llm(user_prompt)
    print("\nLLM Response:", response)



LLM Response: <think>

</think>

Cats are large, typically domesticated animals belonging to the Canis lupus familiari (Canow) genus. They are commonly found as pets and have various breeds that vary in size, color, and behavior. Cats are intelligent, friendly, and often well-mannered.

### Key Characteristics of Cats:
1. **Size**: Cats typically measure between 2 to 6 feet (0.5 to 1.5 meters) in height.
   - Some breeders recommend smaller cats for children under 3 years old.

2. **Color**: The color of a cat can vary, but many common breeds are either black and white or have distinct patterns like fur or whisker colors.

3. **Behavior**: Cats are generally calm, affectionate, and social. They like company and often prefer to stand still when nearby. Cats may roll over on their back when they sleep.

4. **Milk Production**: Some cats produce a lot of milk from their lips (bodily milk), which is used as a treat or milk drink.

5. **Habitat**: Cats thrive in both indoor and outdoor env
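
The response above begins with an empty `<think>` block, which DeepSeek-R1 models emit before the answer. A minimal sketch of removing such blocks before the text is shown or passed to a later prompt follows; `strip_reasoning` is a hypothetical helper and is not part of the `llm()` function defined above.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL | re.IGNORECASE)


def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> blocks and any surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()
```

One possible use is to wrap the model call, for example `strip_reasoning(deepseek_model.invoke(prompt))`, so that summaries do not carry the model's reasoning text.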

### If Ollama with DeepSeek is not set up on your local machine, the commented-out cell below offers OpenRouter with a free-to-use API as an alternative

In [None]:
'''
OPENROUTER_API_KEY = "use your own key"

from openai import OpenAI

# 1) Create the OpenRouter-based OpenAI client (global or in your agent initialization)
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=OPENROUTER_API_KEY  # reuse the key defined above
)

# 2) Define a function for your agent to call to get LLM completions
def llm(prompt: str) -> str:
    """
    Calls a free model on OpenRouter via the OpenAI Python client interface.
    Returns the generated text response.
    """
    # Example model that may have free usage
    # Check your OpenRouter dashboard to confirm.
    model_name = "google/gemini-2.0-pro-exp-02-05:free"

    # Create a chat completion request
    completion = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "user", "content": prompt}
        ],
        # Optionally supply extra body fields; left empty here
        extra_body={},
        # Some optional generation parameters
        max_tokens=200
    )

    # Return the text content of the first choice
    return completion.choices[0].message.content

# 3) Example usage in your agent with user input
if __name__ == "__main__":
    user_prompt = input("Enter your question or research topic: ")
    response = llm(user_prompt)
    print("\nLLM Response:", response)
'''



## 2. Perform the Web Search with DuckDuckGo (Structured Results)

In [None]:
def extract_query(llm_output: str) -> str:
    """
    Cleans noisy LLM output (like DeepSeek adding `<think>` blocks or explanations) into a valid search query.
    """
    # Drop complete <think>...</think> blocks, including the reasoning text inside them
    llm_output = re.sub(r"<think>.*?</think>", "", llm_output, flags=re.DOTALL)
    # Strip any remaining tags such as a stray <think> or </think>
    llm_output = re.sub(r"<.*?>", "", llm_output).strip()

    # Trim backticks from each line; pure code-fence lines become empty and are dropped
    lines = [line.strip().strip("`").strip() for line in llm_output.split("\n")]
    lines = [line for line in lines if line]

    # Find the first line that looks like a query (short and not an intro such as "Here is the search query:")
    for line in lines:
        if len(line.split()) <= 12 and not line.lower().startswith(("think", "okay", "here", "search query")):
            return line  # First clean short line is the query

    # Fallback: return the whole thing if nothing cleaner was found
    return llm_output.strip()


# Strict prompt to prevent excessive output
initial_query = llm(
    f"""You are a research assistant. Your ONLY task is to generate a web search query to find specific information about this topic.
    
    Topic: {user_prompt}
    
    **IMPORTANT**:
    - Output **only** the search query itself.
    - **Do NOT** explain, analyze, or include extra text.
    - Format your output as follows:
    
    ```
    [SEARCH_QUERY]
    ```
    
    Replace `[SEARCH_QUERY]` with your generated query.
    """
)

# Cleanup the query using the new filter
initial_query = extract_query(initial_query)

print(f"\nGenerated search query: {initial_query}")

# Step 2 - Perform search
search_tool = DuckDuckGoSearchAPIWrapper()
search_results = search_tool.results(initial_query, max_results=5)

print(f"\nGot {len(search_results)} results from DuckDuckGo.")

for i, r in enumerate(search_results, start=1):
    print(f"\nResult {i}: {r['title']}")
    print("URL:", r.get('link', 'No link'))
    print("Snippet:", r.get('snippet', '')[:200], "...")




Generated search query: How do cats grow?

Got 5 results from DuckDuckGo.

Result 1: When Do Cats Stop Growing? 6 Cat Life Stages Explained
URL: https://cats.com/when-do-cats-stop-growing
Snippet: How Cats Grow. Kittens grow rapidly: typically from around 50 grams (2 ounces) at birth to 1kg (2.2 lbs) by three months of age to over 2kg (4.4 lbs) by six months of age to 4kg (8.8 lbs) by one year  ...

Result 2: When Do Cats Stop Growing & Reach Their Full Size? Vet-Verified Facts ...
URL: https://www.catster.com/cat-health-care/when-do-cats-stop-growing/
Snippet: When Do Cats Stop Growing? According to the American Animal Hospital Association, 12 months old is the age when most kittens are fully grown 1.At this age, your kitty then enters the next life ... ...

Result 3: Life Cycle of a Cat - Stages of Development in Domestic Cats - AnimalWised
URL: https://www.animalwised.com/life-cycle-of-a-cat-stages-of-development-4114.html
Snippet: During this stage, kittens grow and develop contin

## 3. Summarize the Results

We'll combine the snippets from each result and feed them to the LLM.
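
One way to build that combined text is sketched below. `build_context` and its `max_chars` cap are illustrative assumptions rather than the notebook's code; the sketch additionally skips repeated URLs and truncates long snippets so the prompt stays small for a local model.

```python
def build_context(results: list[dict], max_chars: int = 500) -> str:
    """Join snippets into 'Source: <url>' blocks, skipping empty snippets and repeated URLs."""
    seen: set[str] = set()
    blocks = []
    for r in results:
        url = r.get("link")
        snippet = (r.get("snippet") or "").strip()
        if not snippet or (url and url in seen):
            continue
        if url:
            seen.add(url)
        # Label each block with its source so the summary can be traced back
        blocks.append(f"Source: {url or 'unknown source'}\n{snippet[:max_chars]}")
    return "\n\n".join(blocks)
```

Keeping the `Source:` line next to each snippet lets the LLM, and the reader, see which page a claim came from.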

In [35]:
# Step 3 - Combine all snippets into a single block of text for summarization
content_to_summarize = ""

for r in search_results:
    url = r.get('link', 'unknown source')
    snippet = r.get('snippet', '')
    if snippet:
        content_to_summarize += f"Source: {url}\n{snippet}\n\n"

# Ask the LLM to summarize this content
summary_prompt = (
    "You are a research assistant. Read the following information from multiple web search results, "
    "and write a **concise, factual summary** of the topic. Do not describe how you arrived at the summary; output only the summary. "
    "Only include information actually found in the text; do NOT make up or guess. "
    "Keep the summary factual and well-organized.\n\n"
    f"{content_to_summarize}\n"
    "Summary:"
)

# Use your llm() function to generate the summary
initial_summary = llm(summary_prompt).strip()

# Show the initial summary to the user
print("\nInitial Summary:\n")
print(initial_summary)



Initial Summary:

Kittens experience rapid growth and development, especially during their first year. They can grow from around 50 grams (2 ounces) at birth to over 4kg (8.8 lbs) by one year. Growth plates at a cat's joints cause bones to lengthen, and bone growth stops when these plates close. While most cats are fully grown by 12 months old, male cats may continue growing until around 18 months, particularly if they are spayed or neutered. Female cats usually stop growing by 10-12 months. Although a cat may look like an adult at the age of 6 months old, it still has more mental maturing to do. Cats in this phase of development are more active.


## 4. Reflect and Refine (Optional)
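To refine further, the loop below asks the LLM for a follow-up search query that targets gaps in the current summary, searches DuckDuckGo again, and folds the new snippets into the summary. It stops when the LLM answers `NONE`, when a search returns no results, or after `max_loops` iterations.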
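
Small local models do not always return the stop word cleanly; they may add a period, backticks, or different casing. A tolerant check can be sketched as follows, where `is_stop_signal` is a hypothetical helper and not something the loop below uses.

```python
import re


def is_stop_signal(reply: str) -> bool:
    """True if the reply is the NONE sentinel, ignoring case, quotes, backticks, and punctuation."""
    # Remove whitespace and common decoration before comparing
    cleaned = re.sub(r"[`'\"*.!\s]", "", reply)
    return cleaned.upper() == "NONE"
```

A reply such as `none.` then ends the loop, while a real query that merely starts with "none" is still searched.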

In [None]:
# Set number of refinement loops (can increase if desired)
max_loops = 5

# Start with the initial summary from Step 3
current_summary = initial_summary

# Collect all sources (initial + any new ones from refinements)
all_sources = [r.get('link') for r in search_results]

# Refinement loop
for loop in range(max_loops):
    reflect_prompt = (
        f"You are a research assistant working on the following topic:\n"
        f"Topic: {user_prompt}\n\n"
        "Here is the current research summary you created:\n"
        f"{current_summary}\n\n"
        "Review this summary carefully and think about any important missing information that is relevant to the topic.\n"
        "If there are gaps, generate exactly **one concise search query** that can retrieve additional useful information.\n"
        "IMPORTANT: Respond **only with the search query itself**, with no explanation, no analysis, and no extra text.\n"
        "If no further information is needed, respond with exactly: NONE (in uppercase).\n\n"
        "Search Query:"
    )

    # Ask the LLM to generate the next search query
    new_query = llm(reflect_prompt).strip()

    # Clean the query before using it
    new_query = extract_query(new_query)

    if new_query.upper() == "NONE":
        print("\nNo further questions identified. Refinement complete.")
        break

    print(f"\nRefinement query: {new_query}")

    # Perform DuckDuckGo search using the cleaned refined query
    new_search_results = search_tool.results(new_query, max_results=5)

    if not new_search_results:
        print("No useful results found for refinement query. Stopping refinement.")
        break

    # Add new sources to the list
    for r in new_search_results:
        all_sources.append(r.get('link'))

    # Gather snippets from the new search
    new_content = ""
    for r in new_search_results:
        snippet = r.get('snippet', '')
        if snippet:
            new_content += snippet + "\n\n"

    # Update the summary with new findings
    update_prompt = (
        f"Here is the **current research summary** so far:\n\n{current_summary}\n\n"
        "Here is **new information** found from a follow-up search query:\n"
        f"Query: {new_query}\n\n"
        f"{new_content}\n\n"
        "Please update and improve the summary, integrating this new information into the existing summary where appropriate. Do not comment on how this expands on the previous summary, since the user won't see that. "
        "Keep the summary factual, concise, and only based on information provided in the snippets."
    )

    current_summary = llm(update_prompt).strip()

    print("\n[Updated Summary after refinement]\n")
    print(current_summary, "\n")

# Final output
final_summary = current_summary
print("\n=== Final Summary ===\n")
print(final_summary)



Refinement query: cat growth rate breed differences

[Updated Summary after refinement]

Kittens grow rapidly, with most completing the majority of their development around 12 months old. However, breed significantly influences growth duration. Smaller breeds like the Singapura may stop growing by 9 months, while larger breeds such as the Maine Coon, Norwegian Forest Cat, and Siberian can continue growing for up to 2-4 years. Genetics, nutrition, and neuter status also play a role in a cat's growth rate and final size. After reaching full size, different cat breeds age at similar rates. 


Refinement query: cat growth factors "sex differences"

[Updated Summary after refinement]

Kittens grow rapidly, with most completing the majority of their development around 12 months old. However, breed significantly influences growth duration. Smaller breeds like the Singapura may stop growing by 9 months, while larger breeds such as the Maine Coon, Norwegian Forest Cat, and Siberian can continu

## 5. Final Output with Sources

In [34]:
# De-duplicate sources while keeping first-seen order (a set would shuffle them)
unique_sources = list(dict.fromkeys(src for src in all_sources if src))
sources_markdown = "\n".join(f"- {url}" for url in unique_sources)

print("\n**Final Research Summary**\n")
print(final_summary)
print("\n**Sources:**\n" + sources_markdown)


**Final Research Summary**

Kittens grow rapidly, with most completing the majority of their development around 12-18 months old. However, breed significantly influences growth duration. Smaller breeds like the Singapura may stop growing by 9 months, while larger breeds such as the Maine Coon, Norwegian Forest Cat, and Siberian can continue growing for up to 2-4 years.

Genetics, nutrition, neuter status, sex, and birth weight also play roles in a cat's growth rate and final size. Newborn kittens weigh an average of 100 grams (typically 75-115 grams), with larger breeds being heavier at birth. Low birth weight kittens have a higher risk of not thriving but, with proper care, can grow faster than their littermates. Kittens typically double their birth weight within the first month, and tracking their growth, every 1-4 weeks, can ensure this. For example, a 113 gram kitten should gain 11-

**Sources:**
- https://www.kinship.com/cat-health/kitten-weight-chart
- https://www.nature.com/art