# ReAct Agent with Wikipedia, ArXiv & PubMed

This notebook builds a research-focused ReAct agent using **3 free LangChain tools** that require no API key (the Groq LLM itself does need a `GROQ_API_KEY`):

| Tool | Best for |
|---|---|
| **Wikipedia** | General knowledge, history, concepts |
| **ArXiv** | CS, AI, math, physics academic papers |
| **PubMed** | Biology, medicine, health research |

The agent uses the **ReAct** (Reasoning + Acting) framework: it reasons step by step and picks a tool for each question based on the tool descriptions.

## 1. Install Dependencies
Run this once. If the packages are already installed, you can skip it.

In [1]:
!pip install langchain langchain-community langchain-groq wikipedia arxiv pymupdf python-dotenv
!pip install xmltodict



## 2. Imports
Bring in everything we need: the LLM, agent utilities, and the 3 tools.

In [2]:
import os
from dotenv import load_dotenv

from langchain_groq import ChatGroq
from langchain.agents import initialize_agent, AgentType

# Tool classes
from langchain_community.tools import WikipediaQueryRun, ArxivQueryRun
from langchain_community.utilities import WikipediaAPIWrapper, ArxivAPIWrapper
from langchain_community.tools.pubmed.tool import PubmedQueryRun

# Load GROQ_API_KEY from your .env file
load_dotenv() 

print("Imports successful!")

Imports successful!


## 3. LLM Setup
We use a **LLaMA 3.1 8B** model served by **Groq** (`llama-3.1-8b-instant`) as the agent's brain.

> `temperature=0` is important here: it helps the LLM follow the strict ReAct format (`Thought → Action → Observation → Final Answer`) reliably. Higher values make formatting errors more likely.
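
To see why the format matters, here is a minimal sketch of the kind of parsing an agent has to do on each LLM reply. `parse_react_step` and `ACTION_RE` are hypothetical names made up for this illustration; this is not LangChain's actual output parser, only a simplified stand-in.

```python
import re

# Hypothetical pattern, not LangChain's parser: tool name on the "Action:" line,
# tool input on the "Action Input:" line that follows it.
ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", re.DOTALL)


def parse_react_step(text: str):
    """Return ("final", answer), ("action", tool, tool_input), or ("error", text)."""
    if "Final Answer:" in text:
        answer = text.split("Final Answer:", 1)[1].strip()
        return ("final", answer)
    match = ACTION_RE.search(text)
    if match:
        tool = match.group(1).strip()
        tool_input = match.group(2).strip().strip('"')
        return ("action", tool, tool_input)
    # Free-form text with no recognizable step is a formatting error
    return ("error", text)


well_formed = 'Thought: I need a paper.\nAction: arxiv\nAction Input: "theory of relativity"'
malformed = "I think I should search arxiv for the theory of relativity."

print(parse_react_step(well_formed))
print(parse_react_step(malformed)[0])
```

A reply that drifts from the expected layout, as in `malformed`, gives the parser nothing to act on, which is the failure mode a low temperature is meant to reduce.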

In [4]:
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0,       # Keep at 0 for reliable ReAct formatting
    max_tokens=256,
)

print("LLM ready!")

LLM ready!


## 4. Tools Setup

We configure each tool individually before combining them:

- `top_k_results`: how many results to fetch per query
- `doc_content_chars_max`: max characters returned per query; the combined text of the results is cut off at this length, which keeps context short and focused
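
The effect of these two settings can be sketched in plain Python. `format_results` is a hypothetical toy function, not the wrappers' real code, and it assumes the character cap is applied to the joined text of all results; check your installed `langchain-community` version if the exact behavior matters.

```python
def format_results(docs, top_k_results=2, doc_content_chars_max=1000):
    """Toy stand-in for an API wrapper: keep the first k documents, then cap the length."""
    selected = docs[:top_k_results]            # only the top-k hits survive
    combined = "\n\n".join(selected)           # results are joined into one string
    return combined[:doc_content_chars_max]    # assumption: the cap applies to the joined text


docs = ["A" * 700, "B" * 700, "C" * 700]
out = format_results(docs)

print(len(out))      # 1000
print("C" in out)    # False: the third document was never included
```

Under that assumption, a long first result can crowd out the second one, which is one reason tool observations may end mid-sentence.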

In [5]:
# Tool 1: Wikipedia
# Great for general knowledge, concepts, historical facts
wikipedia_tool = WikipediaQueryRun(
    api_wrapper=WikipediaAPIWrapper(
        top_k_results=2,
        doc_content_chars_max=1000
    )
)

# Tool 2: ArXiv
# Great for AI, machine learning, physics, mathematics research papers
arxiv_tool = ArxivQueryRun(
    api_wrapper=ArxivAPIWrapper(
        top_k_results=2,
        doc_content_chars_max=1000
    )
)

# Tool 3: PubMed
# Great for medical, biological and health science research
pubmed_tool = PubmedQueryRun()

# Combine all tools into a list
tools = [wikipedia_tool, arxiv_tool, pubmed_tool]

print(f"{len(tools)} tools ready: {[t.name for t in tools]}")

3 tools ready: ['wikipedia', 'arxiv', 'pub_med']


## 5. Agent Setup

Here we wire the LLM and tools together into a ReAct agent.

Key parameters:
- `ZERO_SHOT_REACT_DESCRIPTION`: the agent reads each tool's description to decide which one to use, with no prior examples needed
- `verbose=True`: prints the agent's internal reasoning so you can see *why* it picks each tool
- `handle_parsing_errors=True`: if the LLM produces a malformed response, the agent retries instead of crashing
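- `max_iterations=4`: safety limit to prevent infinite loops
- `early_stopping_method="generate"`: when the limit is reached, the agent makes one final LLM call to produce an answer from what it has gathered so far
- `agent_kwargs={"prefix": ...}`: replaces the default opening of the ReAct prompt with our own instructions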
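
To make the iteration limit and the parsing-error retry concrete, here is a toy loop in plain Python. It is a sketch, not how `AgentExecutor` is implemented: `run_agent` is a hypothetical function, the LLM is replaced by a scripted list of replies, the tool is a stub, and the stop message is invented for this example.

```python
def run_agent(scripted_replies, tools, max_iterations=4):
    """Toy ReAct loop. `scripted_replies` stands in for successive LLM outputs."""
    scratchpad = []  # observations a real agent would feed back to the model
    replies = iter(scripted_replies)
    for _ in range(max_iterations):
        reply = next(replies, "")
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip(), scratchpad
        # Read "Key: value" lines such as "Action: wikipedia"
        fields = dict(line.split(":", 1) for line in reply.splitlines() if ":" in line)
        tool_name = fields.get("Action", "").strip()
        tool_input = fields.get("Action Input", "").strip()
        if tool_name not in tools:
            # Rough analogue of handle_parsing_errors=True: note the problem and retry
            scratchpad.append("Invalid or missing action, try again")
            continue
        scratchpad.append(tools[tool_name](tool_input))
    # Rough analogue of hitting max_iterations
    return "Stopped: iteration limit reached.", scratchpad


tools = {"wikipedia": lambda query: f"stub result for {query}"}

replies = [
    "I will just search Wikipedia.",  # malformed: no Action line
    "Thought: look it up\nAction: wikipedia\nAction Input: relativity",
    "Final Answer: Einstein developed it.",
]
answer, pad = run_agent(replies, tools)
print(answer)
print(pad)

# A model that never concludes is cut off after max_iterations steps
looping = ["Thought: again\nAction: wikipedia\nAction Input: x"] * 10
stuck_answer, stuck_pad = run_agent(looping, tools, max_iterations=4)
print(stuck_answer, len(stuck_pad))
```

The first run recovers from one malformed reply and still reaches an answer; the second shows the limit ending a loop that would otherwise keep calling the same tool.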

In [6]:
# Custom instructions that replace the default ReAct prompt prefix
system_prompt = """You are a helpful research assistant.
You have access to tools to search for information.
IMPORTANT RULES:
- Use each tool AT MOST ONCE per question
- After getting an observation, ALWAYS write your Final Answer immediately
- Never repeat the same search twice
- Summarize what you found even if the results are imperfect"""

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=4,
    early_stopping_method="generate",
    agent_kwargs={"prefix": system_prompt},
)



## 6. Query 1: General Knowledge (Wikipedia)
A factual, historical question, so the agent should reach for **Wikipedia**. In the recorded run below, it picked `arxiv` instead.

In [7]:
result1 = agent.invoke("What is the theory of relativity and who developed it?")

print("\n" + "="*60)
print("FINAL ANSWER:")
print(result1["output"])



> Entering new AgentExecutor chain...
Thought: The question seems to be about a fundamental concept in physics, so I should use a reliable source for physics-related information.
Action: arxiv
Action Input: "theory of relativity"
Observation: Published: 1997-08-21
Title: Variational Approach to Quantum Field Theory: Gaussian Approximation and the Perturbative Expansion around It
Authors: Jae Hyung Yee
Summary: The functional Schrodinger picture formulation of quantum field theory and the variational Gaussian approximation method based on the formulation are briefly reviewed. After presenting recent attempts to improve the variational approximation, we introduce a new systematic method based on the background field method, which enables one to compute the order-by-order correction terms to the Gaussian approximation of the effective action.

Published: 2014-03-28
Title: The Confrontation between General Relativity and Experiment
Authors: Clifford 

## 7. Query 2: AI Research (ArXiv)
A question about recent academic research, so the agent should reach for **ArXiv**.

In [8]:
result2 = agent.invoke("What are recent advances in large language models?")

print("\n" + "="*60)
print("FINAL ANSWER:")
print(result2["output"])



> Entering new AgentExecutor chain...
Thought: To find recent advances in large language models, I should look for information from a reliable source that covers the latest developments in the field of artificial intelligence and natural language processing.

Action: arxiv
Action Input: "recent advances in large language models"
Observation: Published: 2026-02-01
Title: Enhancing Human-Like Responses in Large Language Models
Authors: Ethem Yağız Çalık, Talha Rüzgar Akkuş
Summary: This paper explores the advancements in making large language models (LLMs) more human-like. We focus on techniques that enhance natural language understanding, conversational coherence, and emotional intelligence in AI systems. The study evaluates various approaches, including fine-tuning with diverse datasets, incorporating psychological principles, and designing models that better mimic human reasoning patterns. Our findings demonstrate that these enhancements not onl

## 8. Query 3: Medical Research (PubMed)
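A biomedical question, which calls for **PubMed**. This cell bypasses the agent: it calls the PubMed tool directly, then asks a larger model (`llama-3.3-70b-versatile`) to summarize the raw results.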

In [11]:
# Bypass the agent entirely: call PubMed directly
from langchain_community.tools.pubmed.tool import PubmedQueryRun
from langchain_groq import ChatGroq

# Step 1: Get the raw PubMed results
pubmed_tool = PubmedQueryRun()
raw_results = pubmed_tool.run("sleep deprivation memory")

# Step 2: Ask the LLM to summarize what it found
# (this rebinds `llm` to a larger model; the agent from section 5 keeps its original LLM)
llm = ChatGroq(model="llama-3.3-70b-versatile", temperature=0, max_tokens=256)

summary = llm.invoke(
    f"Summarize the following research findings in 3-4 clear sentences:\n\n{raw_results}"
)

print("FINAL ANSWER:")
print(summary.content)

FINAL ANSWER:
Here is a summary of the research findings in 3-4 clear sentences:

A recent study found that brief sleep deprivation after learning alters the synaptic structures of interneurons in the lateral and medial entorhinal cortex, which are key inputs to the hippocampus. Specifically, sleep deprivation reduced the density and size of dendritic spines in these interneurons, but the effects differed between the lateral and medial entorhinal cortex. The study suggests that sleep loss disrupts the balance between excitatory and inhibitory signals in the entorhinal cortex, which may contribute to impaired hippocampus-dependent memory processing. Overall, the findings highlight the importance of sleep for maintaining normal brain function and cognitive processes, particularly after learning.
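
When calling a tool directly like this, it can help to check the raw results before spending an LLM call on them. The sketch below is one possible guard, not part of the notebook's original code: `build_summary_prompt` and its `max_chars` parameter are hypothetical, and the 3000-character default is an arbitrary illustration.

```python
def build_summary_prompt(raw_results: str, max_chars: int = 3000):
    """Return a summarization prompt, or None when there is nothing to summarize."""
    text = raw_results.strip()
    if not text:
        return None  # skip the LLM call entirely on empty results
    clipped = text[:max_chars]  # hypothetical cap to keep the prompt short
    return (
        "Summarize the following research findings in 3-4 clear sentences:\n\n"
        + clipped
    )


print(build_summary_prompt("   "))
print(build_summary_prompt("Sleep loss alters dendritic spines.", max_chars=10))
```

The returned string can be passed to `llm.invoke(...)` in place of the inline f-string; a `None` result signals that the search should be retried or reported as empty.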


## 9. Trying My Own Query
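Ask anything. Note that this helper does not use the agent: it always queries Wikipedia, then lets the LLM answer from that reference or, if the reference doesn't help, from its own knowledge.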

In [17]:
def research(question: str):
    """Look the question up on Wikipedia, then have the LLM answer from that reference."""
    raw = wikipedia_tool.run(question)
    summary = llm.invoke(
        f"Answer this question: '{question}'\n\nReference:\n{raw}\n\n"
        f"If the reference doesn't help, use your own knowledge."
    )
    print(summary.content)

research("What are the top 5 best passports in the world in 2026?")

The provided reference does not contain information about the top 5 best passports in the world in 2026. However, based on my knowledge, the ranking of the best passports can vary depending on the source and criteria used. 

As of my knowledge cutoff, the top 5 best passports in the world are typically determined by factors such as visa-free travel, travel freedom, and the number of countries that can be visited without a visa. 

Here are the top 5 best passports in the world in 2026, based on my general knowledge:

1. Japanese passport: Offers visa-free travel to over 193 countries.
2. Singaporean passport: Allows visa-free travel to over 192 countries.
3. South Korean passport: Provides visa-free travel to over 191 countries.
4. German passport: Offers visa-free travel to over 190 countries.
5. Italian passport: Allows visa-free travel to over 189 countries.

Please note that these rankings can change over time and may vary depending on the source and criteria used. It's always best 