<a href="https://colab.research.google.com/github/vectara/example-notebooks/blob/main/notebooks/mcp/langchain-agent-with-vectara-mcp.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Self-Correcting Research Agent with Vectara MCP

This notebook demonstrates a **self-correcting research agent** that automatically detects and fixes its own hallucinations using Vectara's MCP tools:

- **HHEM** (Hughes Hallucination Evaluation Model) via `eval_factual_consistency` - evaluates factual accuracy
- **VHC** (Vectara Hallucination Corrector) via `correct_hallucinations` - fixes hallucinated content

The agent combines multiple tools (Wikipedia, Calculator) with HHEM/VHC to:
1. Research topics using Wikipedia
2. Draft responses based on sources
3. **Automatically verify** its responses using HHEM
4. **Auto-correct** if the factual consistency score is low
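
The four steps above reduce to a small piece of control flow: score the draft against its sources, and call the corrector only when the score falls below a threshold. The sketch below is a hypothetical illustration of that logic, not the notebook's implementation (in the notebook the agent itself decides when to call the tools). `verify_and_correct`, `toy_score`, and `toy_correct` are made-up names, and the toy functions are simple stand-ins for HHEM and VHC so the block runs without any API calls. The default threshold of 0.5 is the value used in the agent's system prompt.

```python
from typing import Callable


def verify_and_correct(
    draft: str,
    sources: list[str],
    score_fn: Callable[[str, list[str]], float],
    correct_fn: Callable[[str, list[str]], str],
    threshold: float = 0.5,
) -> dict:
    """Score a draft against its sources; correct it only if the score is low."""
    score = score_fn(draft, sources)
    if score >= threshold:
        return {"text": draft, "score": score, "corrected": False}
    return {"text": correct_fn(draft, sources), "score": score, "corrected": True}


# Toy stand-ins (assumptions for illustration, NOT the real HHEM/VHC models).
def toy_score(text: str, sources: list[str]) -> float:
    """Fraction of words in `text` that also appear in the sources."""
    source_words = set(" ".join(sources).lower().split())
    words = text.lower().split()
    return sum(w in source_words for w in words) / len(words) if words else 0.0


def toy_correct(text: str, sources: list[str]) -> str:
    """Fall back to the first source verbatim."""
    return sources[0]


sources = ["Tokyo is the capital of Japan."]
good = verify_and_correct("Tokyo is the capital of Japan.", sources, toy_score, toy_correct)
bad = verify_and_correct("Kyoto hosts every Olympic final.", sources, toy_score, toy_correct)
print(good["corrected"], bad["corrected"])
```

Passing the scorer and corrector in as functions keeps the branching logic testable on its own, independent of how the real tools are invoked.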

## Prerequisites

1. Install packages: `pip install vectara-mcp langchain-mcp-adapters langgraph langchain-openai langchain-community wikipedia python-dotenv`
2. Get a Vectara API key from [console.vectara.com](https://console.vectara.com)
3. Get an OpenAI API key

## Installation

In [1]:
#!pip install --quiet langchain-mcp-adapters langgraph langchain-openai langchain-community vectara-mcp wikipedia python-dotenv

## Environment Setup

In [2]:
import os
from dotenv import load_dotenv

load_dotenv()

# Set your API keys here or via environment variables
os.environ["VECTARA_API_KEY"] = os.getenv("VECTARA_API_KEY", "<YOUR_VECTARA_API_KEY>")
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY", "<YOUR_OPENAI_API_KEY>")

if os.getenv("VECTARA_API_KEY", "").startswith("<"):
    raise EnvironmentError("Please set VECTARA_API_KEY")
if os.getenv("OPENAI_API_KEY", "").startswith("<"):
    raise EnvironmentError("Please set OPENAI_API_KEY")

print("Environment configured successfully")

Environment configured successfully


## Part 1: Connect to Vectara MCP Server

Connect to the Vectara MCP server to access HHEM and VHC tools for detecting and correcting hallucinations.

In [3]:
from langchain_openai import ChatOpenAI
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
import sys

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Configure the Vectara MCP server connection
mcp_client = MultiServerMCPClient(
    {
        "vectara": {
            "command": sys.executable,
            "args": ["-m", "vectara_mcp", "--transport", "stdio"],
            "transport": "stdio",
            "env": {
                "VECTARA_API_KEY": os.environ["VECTARA_API_KEY"]
            }
        }
    }
)

In [4]:
# Load tools from the MCP server
tools = await mcp_client.get_tools()

print("Available MCP tools:")
for tool in tools:
    print(f"  - {tool.name}: {tool.description}\n")

Available MCP tools:
  - setup_vectara_api_key: 
    Configure and validate the Vectara API key for the session.

    Args:
        api_key: str, The Vectara API key to configure - required.

    Returns:
        str: Success message with masked API key or error message.
    

  - clear_vectara_api_key: 
    Clear the stored Vectara API key from server memory.

    Returns:
        str: Confirmation message.
    

  - ask_vectara: 
    Run a RAG query using Vectara, returning search results with generated response.

    Args:
        query: str, The user query to run - required.
        corpus_keys: list[str], List of Vectara corpus keys to use. Required.
        n_sentences_before: int, Sentences before answer for context. Default 2.
        n_sentences_after: int, Sentences after answer for context. Default 2.
        lexical_interpolation: float, Lexical interpolation amount. Default 0.005.
        max_used_search_results: int, Max search results to use. Default 10.
        generati

## Part 2: Add Research Tools

Beyond the Vectara MCP tools, we add Wikipedia search and a calculator to create a practical research agent.

In [5]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_core.tools import tool

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression. Example: calculate('2 + 2 * 3')"""
    try:
        # Character whitelist: only digits, arithmetic operators, parentheses, and spaces reach eval()
        allowed = set('0123456789+-*/.() ')
        if all(c in allowed for c in expression):
            return str(eval(expression))
        return "Error: Only numeric expressions allowed"
    except Exception as e:
        return f"Error: {e}"

# Setup Wikipedia tool
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(top_k_results=2))

# Combine all tools: Vectara MCP + Wikipedia + Calculator
all_tools = tools + [wikipedia, calculate]
print(f"Agent has {len(all_tools)} tools: {[t.name for t in all_tools]}")

Agent has 8 tools: ['setup_vectara_api_key', 'clear_vectara_api_key', 'ask_vectara', 'search_vectara', 'correct_hallucinations', 'eval_factual_consistency', 'wikipedia', 'calculate']
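
The `calculate` tool relies on a character whitelist before calling `eval`. That blocks names and function calls, but two `*` characters still form the `**` operator, so a very large exponentiation can slip through. As one possible alternative, the sketch below walks the parsed syntax tree and evaluates only the node types it explicitly supports. `safe_calculate` is a hypothetical name, and the set of supported operators (`+`, `-`, `*`, `/`, unary signs) is an assumption chosen to match the tool's docstring example.

```python
import ast
import operator

# Only these operators are evaluated; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, / and parentheses over numeric literals without eval()."""

    def _eval(node: ast.AST):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    try:
        return str(_eval(ast.parse(expression.strip(), mode="eval")))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("2 + 2 * 3"))
print(safe_calculate("2 ** 10"))
```

To use it as an agent tool, the function could be wrapped with the same `@tool` decorator applied to `calculate`.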


## Part 3: Create Self-Correcting Research Agent

Create an agent with a system prompt that instructs it to:
1. Research using Wikipedia
2. Draft a response
3. **Automatically verify** using HHEM
4. **Auto-correct** with VHC if the score is low
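
The system prompt names four tools, while the combined tool list from Part 2 contains eight. One optional refinement, sketched here as an assumption rather than something the notebook does, is to pass the agent only the tools the workflow mentions. `select_tools` and `RESEARCH_TOOL_NAMES` are hypothetical names, and the demo uses stand-in objects with a `.name` attribute instead of real tool instances so it runs on its own.

```python
from types import SimpleNamespace

# Tool names the research workflow refers to (taken from the system prompt).
RESEARCH_TOOL_NAMES = {
    "wikipedia",
    "calculate",
    "eval_factual_consistency",
    "correct_hallucinations",
}


def select_tools(tools, wanted_names):
    """Return the tools whose .name is in wanted_names, preserving order."""
    selected = [t for t in tools if t.name in wanted_names]
    missing = set(wanted_names) - {t.name for t in selected}
    if missing:
        raise ValueError(f"Tools not found: {sorted(missing)}")
    return selected


# Stand-ins for the eight tools printed in Part 2 (not real tool objects).
demo_tools = [
    SimpleNamespace(name=n)
    for n in [
        "setup_vectara_api_key", "clear_vectara_api_key", "ask_vectara",
        "search_vectara", "correct_hallucinations", "eval_factual_consistency",
        "wikipedia", "calculate",
    ]
]
research_tools = select_tools(demo_tools, RESEARCH_TOOL_NAMES)
print([t.name for t in research_tools])
# → ['correct_hallucinations', 'eval_factual_consistency', 'wikipedia', 'calculate']
```

Raising on a missing name surfaces a renamed or unavailable MCP tool at setup time rather than in the middle of an agent run.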

In [6]:
SYSTEM_PROMPT = """You are a research assistant with access to these tools:
- wikipedia: Search Wikipedia for factual information
- calculate: Perform mathematical calculations  
- eval_factual_consistency: Check if text is factually consistent with sources (HHEM score 0-1)
- correct_hallucinations: Fix hallucinated content in text (VHC)

Follow this for EVERY factual response:

1. RESEARCH: Use wikipedia to gather source information on the topic
2. DRAFT: Write your response based ONLY on the sources you found
3. VERIFY: Use eval_factual_consistency to check your draft against the Wikipedia sources
4. CORRECT: If HHEM score < 0.5, use correct_hallucinations to fix your response
5. RETURN: Provide the final verified response (after correction, if correct_hallucinations was used).

Always be transparent: report your HHEM score and mention if corrections were made.
"""

# Create the self-correcting agent with all tools
agent = create_react_agent(
    model=llm,
    tools=all_tools,
    prompt=SYSTEM_PROMPT
)

print("Self-correcting agent created with system prompt")

Self-correcting agent created with system prompt


/var/folders/y0/qd7p5ft96k7br3ztvfwfbsqc0000gn/T/ipykernel_18469/1168192973.py:19: LangGraphDeprecatedSinceV10: create_react_agent has been moved to `langchain.agents`. Please update your import to `from langchain.agents import create_agent`. Deprecated in LangGraph V1.0 to be removed in V2.0.
  agent = create_react_agent(


## Part 4: Research Examples with Auto-Verification

The agent researches topics, drafts responses, and automatically verifies/corrects them using HHEM and VHC.
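
The example cells print every message in the result, so raw Wikipedia tool output appears alongside the agent's answer. The sketch below shows one way to separate the two: list which tools were called, and pull out only the final answer. It assumes the messages follow the usual LangChain shape (a `.type` of `"human"`, `"ai"`, or `"tool"`, a `.content` string, and a `.tool_calls` list of dicts on AI messages). `tool_call_names` and `final_answer` are hypothetical helpers, and the demo messages are invented stand-ins, not real agent output.

```python
from types import SimpleNamespace


def tool_call_names(messages):
    """List the names of the tools the agent called, in call order."""
    names = []
    for msg in messages:
        for call in getattr(msg, "tool_calls", None) or []:
            names.append(call["name"])
    return names


def final_answer(messages):
    """Return the text of the last AI message with non-empty string content."""
    for msg in reversed(messages):
        content = getattr(msg, "content", None)
        if getattr(msg, "type", None) == "ai" and isinstance(content, str) and content.strip():
            return content
    return None


# Invented stand-in messages that mimic the assumed shape.
demo_messages = [
    SimpleNamespace(type="human", content="Question"),
    SimpleNamespace(type="ai", content="", tool_calls=[{"name": "wikipedia", "args": {}}]),
    SimpleNamespace(type="tool", content="Page: ..."),
    SimpleNamespace(type="ai", content="", tool_calls=[{"name": "eval_factual_consistency", "args": {}}]),
    SimpleNamespace(type="tool", content="score output"),
    SimpleNamespace(type="ai", content="Final verified answer.", tool_calls=[]),
]
print(tool_call_names(demo_messages))
print(final_answer(demo_messages))
```

With a real run, the same helpers would be called on `result1["messages"]`. Checking the tool-call list is a quick way to confirm that the verification step actually ran.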

In [7]:
# Example 1: Research a historical topic
query1 = """What were the main causes of the 2008 financial crisis? 
Research this using Wikipedia, then verify your response for factual accuracy."""

print("=" * 60)
print("EXAMPLE 1: 2008 Financial Crisis")
print("=" * 60)

result1 = await agent.ainvoke({"messages": [("user", query1)]})

# Print the agent's response
for msg in result1["messages"]:
    if hasattr(msg, "content") and msg.content and not msg.content.startswith("What were"):
        print(msg.content)

EXAMPLE 1: 2008 Financial Crisis
Page: 2008 financial crisis
Summary: The 2008 financial crisis, also known as the global financial crisis (GFC) or the Panic of 2008, was a major worldwide financial crisis centered in the United States. The causes included excessive speculation on property values by both homeowners and financial institutions, leading to the 2000s United States housing bubble. This was exacerbated by predatory lending for subprime mortgages and by deficiencies in regulation. Cash out refinancings had fueled an increase in consumption that could no longer be sustained when home prices declined. The first phase of the crisis was the subprime mortgage crisis, which began in early 2007, as mortgage-backed securities (MBS) tied to U.S. real estate, and a vast web of derivatives linked to those MBS, collapsed in value. A liquidity crisis spread to global institutions by mid-2007 and climaxed with the bankruptcy of Lehman Brothers in September 2008, which triggered a stock mar

In [8]:
# Example 2: Research + Calculation
query2 = """What is the population density of Tokyo? 
Look up Tokyo's population and area on Wikipedia, calculate the density, and verify your answer."""

print("\n" + "=" * 60)
print("EXAMPLE 2: Tokyo Population Density (Research + Calculation)")
print("=" * 60)

result2 = await agent.ainvoke({"messages": [("user", query2)]})

for msg in result2["messages"]:
    if hasattr(msg, "content") and msg.content and not msg.content.startswith("What is the"):
        print(msg.content)


EXAMPLE 2: Tokyo Population Density (Research + Calculation)
Page: Demographics of Tokyo
Summary: The demography of Tokyo is analysed by the Tokyo Metropolitan Government and data is produced for each of the Special wards of Tokyo, the Western Tokyo and the Tokyo Islands, and for all of Tokyo prefecture as a whole. Statistical information is produced about the size and geographical breakdown of the population, the number of people entering and leaving country and the number of people in each demographic subgroup. As of 2025, the total population of Tokyo is 14,195,730 and had the largest population (11.5 percent of the total population).



Page: Greater Tokyo Area
Summary: The Greater Tokyo Area is the most populous metropolitan area in the world, consisting of the Kantō region of Japan (including Tokyo Metropolis and the prefectures of Chiba, Gunma, Ibaraki, Kanagawa, Saitama, and Tochigi) as well as the prefecture of Yamanashi of the neighboring Chūbu region. In Japanese, it is ref

In [9]:
# Example 3: Scientific topic
query3 = """Explain how CRISPR gene editing works.
Research this on Wikipedia and verify your explanation is factually accurate."""

print("\n" + "=" * 60)
print("EXAMPLE 3: CRISPR Gene Editing")
print("=" * 60)

result3 = await agent.ainvoke({"messages": [("user", query3)]})

for msg in result3["messages"]:
    if hasattr(msg, "content") and msg.content and not msg.content.startswith("Explain"):
        print(msg.content)


EXAMPLE 3: CRISPR Gene Editing
Page: CRISPR gene editing
Summary: CRISPR gene editing (; pronounced like "crisper"; an abbreviation for "clustered regularly interspaced short palindromic repeats") is a genetic engineering technique in molecular biology by which the genomes of living organisms may be modified. It is based on a simplified version of the bacterial CRISPR-Cas9 antiviral defense system. By delivering the Cas9 nuclease complexed with a synthetic guide RNA (gRNA) into a cell, the cell's genome can be cut at a desired location, allowing existing genes to be removed or new ones added in vivo.
The technique is considered highly significant in biotechnology and medicine as it enables in vivo genome editing and is considered exceptionally precise, cost-effective, and efficient. It can be used in the creation of new medicines, agricultural products, and genetically modified organisms, or as a means of controlling pathogens and pests. It also offers potential in the treatment of in

## Cleanup

In [10]:
# Close the MCP client connection (cleanup subprocess)
try:
    await mcp_client.close()
    print("MCP client closed.")
except AttributeError:
    # Some versions may not have explicit close
    print("MCP client session completed.")

MCP client session completed.
