In [1]:
!pip install --quiet --upgrade langchain langchain-neo4j langchain-openai langchain-mcp-adapters mcp-neo4j-cypher

The **LangChain framework for Python** is a toolkit for building applications powered by large language models. It provides composable chains and agents, a vast integration ecosystem, memory and retrieval systems, and production essentials like callbacks, tracing, and evaluation tools.

In this notebook, we'll build a company research agent that queries a Neo4j graph database.

In [2]:
import json

from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain.tools import tool
from langchain.agents import create_agent
from langchain_neo4j import Neo4jGraph, Neo4jVector
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

LangChain integrates with most major LLM providers, including OpenAI, Anthropic, Google, Cohere, Mistral, AWS Bedrock, and Azure. This makes it easy to swap models or run comparisons without rewriting your application logic.

In this example, we'll use OpenAI as our LLM provider, specifically GPT-5.1.

In [3]:
import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key")

OpenAI API key··········


In [4]:
model = ChatOpenAI(model="gpt-5.1")

For this example, we'll use the companies database from the Neo4j demo server, which contains organizations, people, investors, and news articles.

In [5]:
os.environ["NEO4J_URI"] = "neo4j+s://demo.neo4jlabs.com"
os.environ["NEO4J_USERNAME"] = "companies"
os.environ["NEO4J_PASSWORD"] = "companies"
os.environ["NEO4J_DATABASE"] = "companies"

## MCP Neo4j Cypher

We'll start by using the `mcp-neo4j-cypher` MCP server to extend the agent with Neo4j tools. This server lets the agent read the graph schema and execute Cypher queries, so it can fetch and analyze data directly from the database.

In [6]:
cypher_mcp_config = {
    "neo4j-database": {
        "transport": "stdio",
        "command": "uvx",
        "args": ["mcp-neo4j-cypher"],
        "env": {
            "NEO4J_URI": os.environ["NEO4J_URI"],
            "NEO4J_USERNAME": os.environ["NEO4J_USERNAME"],
            "NEO4J_PASSWORD": os.environ["NEO4J_PASSWORD"],
            "NEO4J_DATABASE": os.environ["NEO4J_DATABASE"],
        },
    }
}
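
The configuration above reads each connection setting with `os.environ[...]`, which raises a bare `KeyError` if a variable is missing. As a minimal sketch, the same dictionary could be built by a small helper that fails early with a clearer message. The names `REQUIRED_VARS` and `build_stdio_config` are hypothetical and not part of any library.

```python
import os

# Hypothetical helper names; only the dictionary layout comes from the notebook.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_DATABASE")


def build_stdio_config(env=None):
    """Build the stdio MCP config, failing early if a variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "neo4j-database": {
            "transport": "stdio",
            "command": "uvx",
            "args": ["mcp-neo4j-cypher"],
            # Pass only the Neo4j settings on to the server process.
            "env": {name: env[name] for name in REQUIRED_VARS},
        }
    }
```

With the environment variables set as shown earlier, `build_stdio_config()` should return the same structure as the literal dictionary.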

**Google Colab users only:** Run the following cell to start the MCP server with HTTP transport. This workaround is necessary because Google Colab doesn't support the default stdio transport method.

In [7]:
# Google Colab Setup: MCP Server for Neo4j
#
# This cell launches the Neo4j MCP (Model Context Protocol) server as a background process.
# MCP provides a standardized way for LLMs to interact with external tools and databases.
#
# The server exposes Cypher query capabilities over HTTP, allowing our LangChain agent
# to read schema information and execute queries against the Neo4j database.

import threading
import subprocess
import time

def run_server():
    subprocess.run([
        "mcp-neo4j-cypher",
        "--server-port", "8000",
        "--db-url", os.environ["NEO4J_URI"],
        "--username", os.environ["NEO4J_USERNAME"],
        "--password", os.environ["NEO4J_PASSWORD"],
        "--database", os.environ["NEO4J_DATABASE"],
        "--transport", "http"
    ])

server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()
time.sleep(5)

cypher_mcp_config = {
    "neo4j-database": {
        "url": "http://localhost:8000/mcp",
        "transport": "streamable_http",
    }
}
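
The cell above waits a fixed five seconds for the server to start. One possible alternative, sketched here with only the standard library, is to poll the port until it accepts a TCP connection. The helper name `wait_for_port` and its default timeout are assumptions for illustration, and an open port only shows that something is listening, not that the MCP endpoint is fully ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); wait briefly and retry.
            time.sleep(interval)
    return False
```

In the Colab cell, `wait_for_port("localhost", 8000)` could stand in for `time.sleep(5)`.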

With the MCP server running, we initialize a client to connect to it and retrieve the available tools. These tools will allow our agent to query the Neo4j database.

In [9]:
client = MultiServerMCPClient(cypher_mcp_config)
mcp_tools = await client.get_tools()

We define a system prompt that instructs the agent on its role and capabilities. The `create_agent` function constructs a **ReAct-style** agent that follows a reasoning loop: it observes the current state, decides which tool to use (if any), executes the tool, and incorporates the result into its next step. This architecture allows the agent to chain multiple tool calls together to answer complex questions.

In [10]:
system_prompt = """
You are a helpful assistant with access to a Neo4j graph database containing company data. Use the available tools to query the database and answer questions.
"""

agent = create_agent(model, mcp_tools, system_prompt=system_prompt)
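
To make the reasoning loop described above concrete, here is a minimal sketch in plain Python. It is not how `create_agent` is implemented: `fake_model`, `count_people`, and the message dictionaries are hypothetical stand-ins, and the tool returns a hard-coded value instead of querying Neo4j.

```python
from typing import Callable


def count_people(_: str) -> str:
    """Stand-in tool; a real tool would run a Cypher query."""
    return "8064"


TOOLS: dict[str, Callable[[str], str]] = {"count_people": count_people}


def fake_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for the LLM: request a tool once, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"role": "assistant", "tool_call": {"name": "count_people", "args": ""}}
    return {"role": "assistant", "content": f"There are {tool_results[-1]['content']} people."}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Loop: ask the model, run any requested tool, feed the result back."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            # No tool requested, so the reply is the final answer.
            return reply["content"]
        result = TOOLS[call["name"]](call["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("Agent did not finish within max_steps")


print(run_agent("How many people are in the database?"))
```

The `max_steps` cap is a design choice in this sketch: it keeps a model that never stops calling tools from looping forever.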

Let's test it!

In [11]:
prompt = "How many people are in the database?"

async for event in agent.astream(
    {"messages": [{"role": "user", "content": prompt}]},
    stream_mode="values",
):
    event["messages"][-1].pretty_print()


How many people are in the database?
Tool Calls:
  read_neo4j_cypher (call_phJqd8gRMWViZnthEfBSS5FL)
 Call ID: call_phJqd8gRMWViZnthEfBSS5FL
  Args:
    query: MATCH (p:Person) RETURN count(p) AS peopleCount
Name: read_neo4j_cypher

[{'type': 'text', 'text': '[{"peopleCount": 8064}]', 'id': 'lc_2598b242-6244-40aa-bff7-3d59537557f4'}]

There are 8,064 people in the database.
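
The tool message in the output above carries its rows as a JSON string inside a text content block. If you want to work with those rows in Python, a sketch like the following could decode them. It assumes every text block holds a JSON array, as in this output, and `parse_rows` is a hypothetical helper name.

```python
import json

# Content blocks shaped like the tool message printed above (id field omitted).
tool_output = [{"type": "text", "text": '[{"peopleCount": 8064}]'}]


def parse_rows(content_blocks):
    """Decode the JSON rows carried by each text content block."""
    rows = []
    for block in content_blocks:
        if block.get("type") == "text":
            rows.extend(json.loads(block["text"]))
    return rows


rows = parse_rows(tool_output)
print(rows[0]["peopleCount"])  # prints 8064
```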


## Custom tools

Beyond using existing MCP servers, you can also implement your own custom tools and add them directly to the agent. This allows you to create specialized functionality tailored to your specific use case. Custom tools can be implemented using the `@tool` decorator, which turns any function into a tool the agent can invoke.

Here, we use `Neo4jGraph` from the `langchain-neo4j` package, a direct integration in the LangChain ecosystem, to connect to our database and build a tool that queries investment relationships. Writing the tool ourselves gives us more control over the query logic.


In [12]:
neo4j_graph = Neo4jGraph()

@tool
async def get_investments(company: str) -> str:
    """Returns the investments of a company by name. Returns a list of investment ids, names, and types."""
    try:
        # Pass the company name as a query parameter rather than interpolating it into the Cypher string.
        results = neo4j_graph.query("""
            MATCH (o:Organization)-[:HAS_INVESTOR]->(i)
            WHERE o.name = $company
            RETURN i.id as id, i.name as name, head(labels(i)) as type
        """, {"company": company})
        return json.dumps(results, indent=2)
    except Exception as e:
        # Chain the original exception so the underlying cause stays visible.
        raise RuntimeError(f"Error fetching investments: {e}") from e
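
To give a rough idea of what a decorator needs from a function to present it as a tool, the sketch below collects the name, docstring, and argument types with `inspect`. This is an illustration only, not LangChain's actual `@tool` implementation, and `describe_tool` and `get_investments_stub` are hypothetical names.

```python
import inspect


def describe_tool(func):
    """Collect the metadata a model would need to call `func` (illustrative only)."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "args": {
            # Use the annotation's class name when available, e.g. "str".
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def get_investments_stub(company: str) -> str:
    """Returns the investments of a company by name."""
    return "[]"


spec = describe_tool(get_investments_stub)
print(spec)
```

The sketch also shows why the docstring and type hints matter: they are the only description of the function that the model gets to see.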

The `langchain-neo4j` package also provides `Neo4jVector`, a vector store integration that enables semantic search over your graph data. Here, we connect to an existing vector index and create a tool that uses OpenAI embeddings to search for relevant news chunks.

In [13]:
vector_store = Neo4jVector.from_existing_index(
    OpenAIEmbeddings(),
    index_name="news",
    node_label="Chunk",
    retrieval_query="""
    MATCH (node)<-[:HAS_CHUNK]-(a:Article)
    RETURN node.text AS text, score, {date: a.date} AS metadata
    """
)

@tool
def retrieve_news(query: str) -> str:
    """Search for relevant news articles. Returns up to 5 articles with their source metadata and content."""
    retrieved_docs = vector_store.similarity_search(query, k=5)
    serialized = "\n\n".join(
        (f"Source: {doc.metadata}\nContent: {doc.page_content}")
        for doc in retrieved_docs
    )
    return serialized
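
Conceptually, a similarity search ranks stored chunks by how close their embedding vectors are to the query's embedding. The sketch below shows this with cosine similarity on made-up three-dimensional vectors. The texts and numbers are toy values, not data from the demo database, and the metric actually used depends on how the Neo4j vector index was configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings, invented for illustration; real ones have far more dimensions.
chunks = [
    ("Acme raises Series B", [0.9, 0.1, 0.0]),
    ("Weather report", [0.0, 0.2, 0.9]),
    ("Acme acquires startup", [0.8, 0.3, 0.1]),
]
results = top_k([1.0, 0.0, 0.0], chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```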

We combine the MCP tools with our custom tools into a single list and create a new agent with access to all of them.

In [14]:
custom_tools = mcp_tools + [get_investments, retrieve_news]
# If desired, specify custom instructions
prompt = (
    "You are a helpful assistant with access to a Neo4j graph database containing company data. Use the available tools to query the database and answer questions."
)
custom_agent = create_agent(model, custom_tools, system_prompt=prompt)

Let's test it!

In [15]:
prompt = "Which companies did Google invest in?"

async for event in custom_agent.astream(
    {"messages": [{"role": "user", "content": prompt}]},
    stream_mode="values",
):
    event["messages"][-1].pretty_print()


Which companies did Google invest in?
Tool Calls:
  get_investments (call_OjjAOjy1pivFZYLDKkOM616B)
 Call ID: call_OjjAOjy1pivFZYLDKkOM616B
  Args:
    company: Google
Name: get_investments

[
  {
    "id": "ELsv5bECSOiWG_Uhf_txI2w",
    "name": "Ionic Security",
    "type": "Organization"
  },
  {
    "id": "EUkm62r-bMOidNtPjTkdVvg",
    "name": "Avere Systems",
    "type": "Organization"
  },
  {
    "id": "EX-RLztfkOFqTLoM6xIVnlg",
    "name": "FlexiDAO",
    "type": "Organization"
  },
  {
    "id": "EtqXbQ9LaMGq8om4dhYY0Fw",
    "name": "Cloudflare",
    "type": "Organization"
  },
  {
    "id": "EWIvDLNCSMCCBYUyz0oFPVQ",
    "name": "Trifacta",
    "type": "Organization"
  }
]

According to the database, Google has invested in the following companies:

1. Ionic Security  
2. Avere Systems  
3. FlexiDAO  
4. Cloudflare  
5. Trifacta  

These are the investments currently recorded in the graph; it may not be an exhaustive list of all real-world Google investments.


## ToDo

* [ ] Memory