<a href="https://colab.research.google.com/github/vectara/example-notebooks/blob/main/notebooks/mcp/langchain-agent-with-vectara-mcp.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Self-Correcting Research Agent with Vectara MCP

This notebook demonstrates a **self-correcting research agent** that automatically detects and fixes its own hallucinations using Vectara's MCP tools:

- **HHEM** (Hughes Hallucination Evaluation Model) via `eval_factual_consistency` - evaluates factual accuracy
- **VHC** (Vectara Hallucination Corrector) via `correct_hallucinations` - fixes hallucinated content

The agent combines multiple tools (Wikipedia, Calculator) with HHEM/VHC to:
1. Research topics using Wikipedia
2. Draft responses based on sources
3. **Automatically verify** its responses using HHEM
4. **Auto-correct** if the factual consistency score is low
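
The verify-then-correct decision in steps 3 and 4 can be sketched as plain control flow. This is a minimal illustration, not the notebook's implementation: in the notebook the LLM agent decides when to call the tools, whereas here `score_fn` and `correct_fn` are hypothetical stand-ins for `eval_factual_consistency` and `correct_hallucinations`, and the 0.5 cutoff mirrors the threshold used in the system prompt later on.

```python
from typing import Callable

HHEM_THRESHOLD = 0.5  # same cutoff the notebook's system prompt uses


def verify_and_correct(
    draft: str,
    sources: list[str],
    score_fn: Callable[[str, list[str]], float],
    correct_fn: Callable[[str, list[str]], str],
    threshold: float = HHEM_THRESHOLD,
) -> tuple[str, float, bool]:
    """Return (final_text, score, was_corrected) for a single draft."""
    score = score_fn(draft, sources)
    if score < threshold:
        # Low factual consistency: hand the draft to the corrector
        return correct_fn(draft, sources), score, True
    return draft, score, False


# Hypothetical stand-ins so the sketch runs without any API access
def fake_score(draft: str, sources: list[str]) -> float:
    return 0.9 if any(draft in s for s in sources) else 0.2


def fake_correct(draft: str, sources: list[str]) -> str:
    return sources[0]


sources = ["Tokyo is the capital of Japan."]
print(verify_and_correct("Tokyo is the capital of Japan.", sources, fake_score, fake_correct))
print(verify_and_correct("Tokyo is the capital of China.", sources, fake_score, fake_correct))
```

Passing the scorer and corrector in as callables keeps the branching logic testable without a live MCP server.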

## Prerequisites

1. Install packages: `pip install vectara-mcp langchain-mcp-adapters langgraph langchain-openai langchain-community wikipedia python-dotenv`
2. Get a Vectara API key from [console.vectara.com](https://console.vectara.com)
3. Get an OpenAI API key

## Installation

In [1]:
#!pip install --quiet langchain-mcp-adapters langgraph langchain-openai langchain-community vectara-mcp wikipedia python-dotenv

## Environment Setup

In [2]:
import os
from dotenv import load_dotenv

load_dotenv()

# Set your API keys here or via environment variables
os.environ["VECTARA_API_KEY"] = os.getenv("VECTARA_API_KEY", "<YOUR_VECTARA_API_KEY>")
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY", "<YOUR_OPENAI_API_KEY>")

if os.getenv("VECTARA_API_KEY", "").startswith("<"):
    raise EnvironmentError("Please set VECTARA_API_KEY")
if os.getenv("OPENAI_API_KEY", "").startswith("<"):
    raise EnvironmentError("Please set OPENAI_API_KEY")

print("Environment configured successfully")

Environment configured successfully


## Part 1: Connect to Vectara MCP Server

Connect to the Vectara MCP server to access HHEM and VHC tools for detecting and correcting hallucinations.

In [3]:
from langchain_openai import ChatOpenAI
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
import sys

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Configure the Vectara MCP server connection
mcp_client = MultiServerMCPClient(
    {
        "vectara": {
            "command": sys.executable,
            "args": ["-m", "vectara_mcp", "--transport", "stdio"],
            "transport": "stdio",
            "env": {
                "VECTARA_API_KEY": os.environ["VECTARA_API_KEY"]
            }
        }
    }
)

In [4]:
# Load tools from the MCP server
tools = await mcp_client.get_tools()

print("Available MCP tools:")
for tool in tools:
    print(f"  - {tool.name}: {tool.description}\n")

Available MCP tools:
  - setup_vectara_api_key: 
    Configure and validate the Vectara API key for the session.

    Args:
        api_key: str, The Vectara API key to configure - required.

    Returns:
        str: Success message with masked API key or error message.
    

  - clear_vectara_api_key: 
    Clear the stored Vectara API key from server memory.

    Returns:
        str: Confirmation message.
    

  - ask_vectara: 
    Run a RAG query using Vectara, returning search results with generated response.

    Args:
        query: str, The user query to run - required.
        corpus_keys: list[str], List of Vectara corpus keys to use. Required.
        n_sentences_before: int, Sentences before answer for context. Default 2.
        n_sentences_after: int, Sentences after answer for context. Default 2.
        lexical_interpolation: float, Lexical interpolation amount. Default 0.005.
        max_used_search_results: int, Max search results to use. Default 10.
        generati
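
The server exposes key-management and RAG tools alongside the two hallucination tools. If you want the agent to see only a subset, one option is to filter the loaded list by name. The sketch below is an assumption about how you might do that: `select_tools` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the real tool objects, of which only the `.name` attribute is used.

```python
from types import SimpleNamespace

# Tool names as reported by the MCP server in this notebook
HALLUCINATION_TOOLS = {"eval_factual_consistency", "correct_hallucinations"}


def select_tools(tools, wanted):
    """Return the tools whose names are in `wanted`, failing loudly if any are missing."""
    by_name = {t.name: t for t in tools}
    missing = sorted(wanted - by_name.keys())
    if missing:
        raise KeyError(f"MCP server did not expose: {missing}")
    return [by_name[name] for name in sorted(wanted)]


# Stand-in tool objects (hypothetical) so the sketch runs offline
fake_tools = [
    SimpleNamespace(name="setup_vectara_api_key"),
    SimpleNamespace(name="ask_vectara"),
    SimpleNamespace(name="correct_hallucinations"),
    SimpleNamespace(name="eval_factual_consistency"),
]

selected = select_tools(fake_tools, HALLUCINATION_TOOLS)
print([t.name for t in selected])
```

Raising on a missing name surfaces server or version mismatches at setup time rather than in the middle of an agent run.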

## Part 2: Add Research Tools

Beyond the Vectara MCP tools, we add Wikipedia search and a calculator to create a practical research agent.

In [5]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain_core.tools import tool

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression. Example: calculate('2 + 2 * 3')"""
    try:
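        # Character whitelist: digits, arithmetic operators, parentheses, dots, spaces.
        # Note that '**' still passes this filter, so very large exponents remain possible.
        allowed = set('0123456789+-*/.() ')
        if all(c in allowed for c in expression):
            # Evaluate with builtins disabled as an extra guard
            return str(eval(expression, {"__builtins__": {}}, {}))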
        return "Error: Only numeric expressions allowed"
    except Exception as e:
        return f"Error: {e}"

# Setup Wikipedia tool
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(top_k_results=2))

# Combine all tools: Vectara MCP + Wikipedia + Calculator
all_tools = tools + [wikipedia, calculate]
print(f"Agent has {len(all_tools)} tools: {[t.name for t in all_tools]}")

Agent has 8 tools: ['setup_vectara_api_key', 'clear_vectara_api_key', 'ask_vectara', 'search_vectara', 'correct_hallucinations', 'eval_factual_consistency', 'wikipedia', 'calculate']
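
A character whitelist in front of `eval` still admits expressions such as `9**9**9`, which can be expensive to compute. One alternative, sketched here as an assumption rather than as the notebook's approach, is to parse the expression with Python's `ast` module and evaluate only the node types you explicitly allow. `safe_calculate` is a hypothetical name; it could be wrapped with `@tool` in the same way as `calculate`.

```python
import ast
import operator

# Only these operators are permitted; exponentiation is deliberately excluded
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, / arithmetic without calling eval()."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("2 + 2 * 3"))
print(safe_calculate("2 ** 8"))
```

Because every node type must be listed explicitly, names, function calls, and attribute access are rejected by default instead of relying on a character filter.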


## Part 3: Create Self-Correcting Research Agent

Create an agent with a system prompt that instructs it to:
1. Research using Wikipedia
2. Draft a response
3. **Automatically verify** using HHEM
4. **Auto-correct** with VHC if the score is low

In [6]:
SYSTEM_PROMPT = """You are a research assistant.
Answer user questions using the following workflow:

1. RESEARCH: Use wikipedia to gather source information on the topic
2. DRAFT: Write your response based ONLY on the sources you found
3. VERIFY: Use eval_factual_consistency to check your draft against the Wikipedia sources
4. CORRECT: If HHEM score < 0.5, use correct_hallucinations to fix your response
5. RETURN: Provide the final verified response, using the corrected text if correct_hallucinations was used.
"""

# Create the self-correcting agent with all tools
agent = create_react_agent(
    model=llm,
    tools=all_tools,
    prompt=SYSTEM_PROMPT
)

print("Self-correcting agent created with system prompt")

Self-correcting agent created with system prompt


/var/folders/y0/qd7p5ft96k7br3ztvfwfbsqc0000gn/T/ipykernel_65921/1824071622.py:12: LangGraphDeprecatedSinceV10: create_react_agent has been moved to `langchain.agents`. Please update your import to `from langchain.agents import create_agent`. Deprecated in LangGraph V1.0 to be removed in V2.0.
  agent = create_react_agent(


## Part 4: Research Examples with Auto-Verification

The agent researches topics, drafts responses, and automatically verifies/corrects them using HHEM and VHC.

In [7]:
# Example 1: Research a historical topic
query1 = """What were the main causes of the 2008 financial crisis?"""
result1 = await agent.ainvoke({"messages": [("user", query1)]})
print(result1["messages"][-1].content)

The 2008 financial crisis, also known as the global financial crisis, was primarily caused by several interrelated factors:

1. **Housing Bubble and Speculation**: There was excessive speculation on property values by both homeowners and financial institutions, leading to a housing bubble in the United States during the 2000s.

2. **Subprime Mortgages and Predatory Lending**: Financial institutions engaged in predatory lending practices, offering high-risk subprime mortgages to low-income homebuyers, often without adequate regulatory oversight.

3. **Regulatory Failures**: Deficiencies in financial regulation allowed risky financial practices to proliferate. The repeal of parts of the Glass–Steagall Act in 1999 enabled financial institutions to mix low-risk operations with higher-risk activities, such as investment banking.

4. **Mortgage-Backed Securities and Derivatives**: The crisis was exacerbated by the collapse in value of mortgage-backed securities (MBS) and a complex web of d
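
The final message alone does not show whether the agent actually ran the verification step. One way to check is to scan the returned message list for tool results. The sketch below assumes that tool-result messages carry `type == "tool"` and a `name` attribute holding the tool's name, as LangChain's `ToolMessage` does; `tools_used` and `was_verified` are hypothetical helpers, and the `SimpleNamespace` messages are stand-ins so the example runs offline.

```python
from types import SimpleNamespace


def tools_used(messages):
    """Return tool names in the order their results appear in the message list."""
    return [m.name for m in messages if getattr(m, "type", None) == "tool"]


def was_verified(messages):
    """True if the run included at least one HHEM evaluation."""
    return "eval_factual_consistency" in tools_used(messages)


# Stand-in message objects (hypothetical) mimicking an agent run
fake_messages = [
    SimpleNamespace(type="human", name=None),
    SimpleNamespace(type="ai", name=None),
    SimpleNamespace(type="tool", name="wikipedia"),
    SimpleNamespace(type="ai", name=None),
    SimpleNamespace(type="tool", name="eval_factual_consistency"),
    SimpleNamespace(type="ai", name=None),
]

print(tools_used(fake_messages))
print(was_verified(fake_messages))
```

With a real run you would pass `result1["messages"]` instead of `fake_messages`.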

In [8]:
# Example 2: Research + Calculation
query2 = """What is the population density of Tokyo?"""
result2 = await agent.ainvoke({"messages": [("user", query2)]})
print(result2["messages"][-1].content)

The specific population density of Tokyo in 2023 is not directly available from the current Wikipedia sources. However, Tokyo is known to be one of the most densely populated cities in the world. The city proper has a population of over 14 million, and the Greater Tokyo Area, which includes Tokyo and parts of six neighboring prefectures, is the most populous metropolitan area in the world with 41 million residents as of 2024. For precise and up-to-date figures, consulting official statistics from Tokyo's metropolitan government or other authoritative demographic sources would be necessary.


In [9]:
# Example 3: Scientific topic
query3 = """Explain how CRISPR gene editing works."""

result3 = await agent.ainvoke({"messages": [("user", query3)]})
print(result3["messages"][-1].content)

CRISPR gene editing is a powerful genetic engineering technique that allows for precise modifications to the genomes of living organisms. It is based on a simplified version of the bacterial CRISPR-Cas9 antiviral defense system. The process involves delivering the Cas9 nuclease, complexed with a synthetic guide RNA (gRNA), into a cell. This complex can cut the cell's genome at a specific location, enabling the removal of existing genes or the addition of new ones.

The CRISPR-Cas9 system works like genetic scissors, opening both strands of the targeted DNA sequence. This allows for modifications through two main methods:

1. **Knock-in Mutations**: This method uses homology-directed repair (HDR) to introduce targeted DNA changes. HDR employs similar DNA sequences to repair the break by incorporating exogenous DNA as a repair template.

2. **Knock-out Mutations**: These result from the repair of the double-stranded break via non-homologous end joining (NHEJ) or polymerase theta-mediated

## Cleanup

In [10]:
# Close the MCP client connection (cleanup subprocess)
try:
    await mcp_client.close()
    print("MCP client closed.")
except AttributeError:
    # Some versions may not have explicit close
    print("MCP client session completed.")

MCP client session completed.
