# Getting Started with the Agent Development Kit (ADK) with Neo4j

This notebook shows how to use the Vertex AI Agent Framework to set up a multi-agent system that integrates Neo4j.

The Vertex AI ADK is a Python-based SDK that lets developers build multi-agent applications with custom logic, tools, and integrations.

In this notebook you will learn how to do the following:

*  Install the Agent Development Kit
*  Define functions as tools
*  Define a database agent and an investor relationship agent
*  Create an "Investment Research" Multi-Agent


## Learn more about Neo4j

* https://neo4j.com/genai
* https://neo4j.com/developer
* https://graphrag.com
* https://neo4j.com/labs/genai-ecosystem
* [Neo4j and MCP Toolbox](https://neo4j.com/blog/developer/ai-agents-gen-ai-toolbox/)

## Learn more about the ADK

* [Launch Blog](https://cloud.google.com/blog/products/ai-machine-learning/build-and-manage-multi-system-agents-with-vertex-ai)
* [Developer Blog](https://developers.googleblog.com/en/agent-development-kit-easy-to-build-multi-agent-applications/)
* https://pypi.org/project/google-adk/
* https://google.github.io/adk-docs/
* https://github.com/google/adk-samples/
* [LLM Agents](https://google.github.io/adk-docs/agents/llm-agents/#putting-it-together-example)
* [Deploy to Agent Engine](https://google.github.io/adk-docs/deploy/agent-engine/)

## Company News Knowledge Graph

Our example dataset is a small subset (250k entities) of [Diffbot's](https://diffbot.com/) global Knowledge Graph (50bn entities).

It contains organizations, people in their leadership, locations, and industries.

It also includes articles mentioning these companies, which are chunked and indexed as vector embeddings.

![](https://camo.githubusercontent.com/2ebaff2ceb74cd8af8f7ef8a5a4791ef92ea8f631c0fc41d828eb8ae9c7bf553/68747470733a2f2f692e696d6775722e636f6d2f6c574a5a5345652e706e67)

You can access and query a read-only version of that graph with the following credentials:

* URL: https://demo.neo4jlabs.com:7473
* Username: companies
* Password: companies
* Database: companies

# Authenticate
You need this to download the SDK from GCS.

In [192]:
from google.colab import auth
auth.authenticate_user()

# Install the Agent Development Kit

In [193]:
!pip3 install --upgrade --quiet google-adk neo4j-rust-ext

In [194]:
import os
import random
import sys

# Developer API or Vertex AI

The ADK supports two APIs:

* Google Gemini API (AI Studio)
* Vertex AI Gemini API

In [195]:
# Only run this block for the Gemini Developer API (AI Studio). Use your own API key.
import os

GOOGLE_API_KEY = "FILL YOUR API KEY" #@param {type:"string"}

os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "0"
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

In [196]:
# Only run this block for the Vertex AI API. Use your own project / location.
import os

GOOGLE_CLOUD_PROJECT = "vertex-ai-neo4j-extension" #@param {type:"string"}

os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "1"
os.environ["GOOGLE_CLOUD_PROJECT"] = GOOGLE_CLOUD_PROJECT
os.environ["GOOGLE_CLOUD_LOCATION"] = "us-central1"
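The two configuration cells above are alternatives, so only one of them should run. A minimal sketch of folding them into a single helper, assuming the same environment variable names as above; the function name `configure_genai`, its parameters, and the placeholder project id are hypothetical and not part of the ADK:

```python
import os


def configure_genai(use_vertex: bool, api_key: str = "", project: str = "",
                    location: str = "us-central1") -> None:
    """Set the environment variables for exactly one of the two APIs.

    Hypothetical helper: only the variable names come from the cells above.
    """
    if use_vertex:
        if not project:
            raise ValueError("project is required for the Vertex AI Gemini API")
        os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "1"
        os.environ["GOOGLE_CLOUD_PROJECT"] = project
        os.environ["GOOGLE_CLOUD_LOCATION"] = location
        # Design choice of this sketch: drop a stale developer key so the setups do not mix.
        os.environ.pop("GOOGLE_API_KEY", None)
    else:
        if not api_key:
            raise ValueError("api_key is required for the Gemini Developer API")
        os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "0"
        os.environ["GOOGLE_API_KEY"] = api_key


# Example with a placeholder project id; replace it with your own.
configure_genai(use_vertex=True, project="my-gcp-project")
```

Failing early on a missing key or project makes a misconfigured notebook stop at this cell rather than at the first model call.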

# Neo4j Database Connectivity

We define a small class `neo4jDatabase` that manages connecting to Neo4j and running read queries.

In [197]:
import logging

logger = logging.getLogger('agent_neo4j_cypher')
logger.info("Initializing Database for tools")

In [198]:
from neo4j import GraphDatabase
from typing import Any
import re

class neo4jDatabase:
    def __init__(self,  neo4j_uri: str, neo4j_username: str, neo4j_password: str):
        """Initialize connection to the Neo4j database"""
        logger.debug(f"Initializing database connection to {neo4j_uri}")
        d = GraphDatabase.driver(neo4j_uri, auth=(neo4j_username, neo4j_password))
        d.verify_connectivity()
        self.driver = d

    def is_write_query(self, query: str) -> bool:
        """Return True if the query contains a Cypher write keyword as a whole word"""
        return re.search(r"\b(MERGE|CREATE|SET|DELETE|REMOVE|ADD)\b", query, re.IGNORECASE) is not None

    def _execute_query(self, query: str, params: dict[str, Any] | None = None) -> list[dict[str, Any]]:
        """Execute a Cypher query and return results as a list of dictionaries"""
        logger.debug(f"Executing query: {query}")
        try:
            if self.is_write_query(query):
                logger.error(f"Write query not supported {query}")
                raise ValueError("Write queries are not supported in this agent")
                # logger.debug(f"Write query affected {counters}")
                # result = self.driver.execute_query(query, params)
                # counters = vars(result.summary.counters)
                # return [counters]
            else:
                result = self.driver.execute_query(query, params)
                results = [dict(r) for r in result.records]
                logger.debug(f"Read query returned {len(results)} rows")
                return results
        except Exception as e:
            logger.error(f"Database error executing query: {e}\n{query}")
            raise

In [199]:
db = neo4jDatabase("neo4j+s://demo.neo4jlabs.com","companies","companies")

In [200]:
# Testing database connection
db._execute_query("RETURN 1")

[{'1': 1}]
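The `is_write_query` check in the class above is a keyword match, not a Cypher parser. The standalone sketch below copies the same regular expression out of the class to show what it accepts and rejects, including one false positive: a read query that only mentions a keyword inside a string literal is refused as well. The example queries are made up for illustration.

```python
import re

# Same pattern as neo4jDatabase.is_write_query, copied out for illustration.
WRITE_KEYWORDS = re.compile(r"\b(MERGE|CREATE|SET|DELETE|REMOVE|ADD)\b", re.IGNORECASE)


def is_write_query(query: str) -> bool:
    """Return True if the query contains a write keyword as a whole word."""
    return WRITE_KEYWORDS.search(query) is not None


examples = [
    "MATCH (p:Person) RETURN count(p)",             # plain read
    "match (n) set n.name = 'x'",                   # write, lower case
    "MATCH (o:Organization) RETURN o.name AS dataset",  # 'dataset' is not the word SET
    "MATCH (o) WHERE o.name = 'Create' RETURN o",   # false positive: keyword in a literal
]
for q in examples:
    print(is_write_query(q), q)
```

The word boundaries (`\b`) keep identifiers such as `dataset` from matching, while the case-insensitive flag catches lower-case statements.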

## Defining Functions for our Database Agent

* get_schema - Retrieve Database Schema for the LLM
* execute_read_query - Execute Cypher Read Query

In [201]:
def get_schema() -> list[dict[str,Any]]:
  """Get the schema of the database, returns node-types(labels) with their types and attributes and relationships between node-labels
  Args: None
  Returns:
    list[dict[str,Any]]: A list of dictionaries representing the schema of the database
    For example
    ```
    [{'label': 'Person','attributes': {'summary': 'STRING','id': 'STRING unique indexed', 'name': 'STRING indexed'},
      'relationships': {'HAS_PARENT': 'Person', 'HAS_CHILD': 'Person'}}]
    ```
  """
  try:
      results = db._execute_query(
              """
call apoc.meta.data() yield label, property, type, other, unique, index, elementType
where elementType = 'node' and not label starts with '_'
with label,
collect(case when type <> 'RELATIONSHIP' then [property, type + case when unique then " unique" else "" end + case when index then " indexed" else "" end] end) as attributes,
collect(case when type = 'RELATIONSHIP' then [property, head(other)] end) as relationships
RETURN label, apoc.map.fromPairs(attributes) as attributes, apoc.map.fromPairs(relationships) as relationships
              """
          )
      return results
  except Exception as e:
      return [{"error":str(e)}]

In [202]:
get_schema()



[{'label': 'Person',
  'attributes': {'summary': 'STRING',
   'id': 'STRING unique indexed',
   'name': 'STRING indexed'},
  'relationships': {'HAS_PARENT': 'Person', 'HAS_CHILD': 'Person'}},
 {'label': 'Organization',
  'attributes': {'summary': 'STRING',
   'isDissolved': 'BOOLEAN',
   'id': 'STRING unique indexed',
   'diffbotId': 'STRING',
   'nbrEmployees': 'INTEGER',
   'name': 'STRING indexed',
   'motto': 'STRING',
   'isPublic': 'BOOLEAN',
   'revenue': 'FLOAT'},
  'relationships': {'HAS_COMPETITOR': 'Organization',
   'HAS_BOARD_MEMBER': 'Person',
   'HAS_CEO': 'Person',
   'HAS_SUBSIDIARY': 'Organization',
   'HAS_INVESTOR': 'Organization',
   'HAS_CATEGORY': 'IndustryCategory',
   'HAS_SUPPLIER': 'Organization',
   'IN_CITY': 'City'}},
 {'label': 'IndustryCategory',
  'attributes': {'id': 'STRING unique indexed', 'name': 'STRING'},
  'relationships': {}},
 {'label': 'City',
  'attributes': {'summary': 'STRING', 'id': 'STRING', 'name': 'STRING'},
  'relationships': {'IN_COUN

In [203]:
def execute_read_query(query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
    """
    Execute a Neo4j Cypher query and return results as a list of dictionaries
    Args:
        query (str): The Cypher query to execute
        params (dict[str, Any], optional): The parameters to pass to the query or None.
    Raises:
        Exception: If there is an error executing the query
    Returns:
        list[dict[str, Any]]: A list of dictionaries representing the query results
    """
    try:
        if params is None:
            params = {}
        results = db._execute_query(query, params)
        return results
    except Exception as e:
        return [{"error":str(e)}]

In [204]:
execute_read_query("RETURN 1", None)

[{'1': 1}]
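Both tool functions catch exceptions and return them as `[{"error": ...}]` instead of raising, so the error text reaches the model as a tool result and it can try to repair its query. A minimal sketch of that pattern with a stand-in for the database call; `fake_execute` and `safe_tool_call` are hypothetical names used only here:

```python
from typing import Any, Callable, Optional


def fake_execute(query: str, params: dict[str, Any]) -> list[dict[str, Any]]:
    """Stand-in for db._execute_query: fails on anything but 'RETURN 1'."""
    if query.strip().upper() != "RETURN 1":
        raise ValueError(f"Invalid input: {query!r}")
    return [{"1": 1}]


def safe_tool_call(run: Callable[..., list[dict[str, Any]]], query: str,
                   params: Optional[dict[str, Any]] = None) -> list[dict[str, Any]]:
    """Run a query and convert any exception into a single error row."""
    try:
        return run(query, params or {})
    except Exception as e:
        # Returned, not raised: the caller sees the message as data.
        return [{"error": str(e)}]


print(safe_tool_call(fake_execute, "RETURN 1"))
print(safe_tool_call(fake_execute, "RETRUN 1"))
```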

## Defining Functions for our Investment Research Agent

* get_schema - Retrieve Database Schema for the LLM
* get_investors - Returns the investors in the company with this name or id

In [221]:
def get_investors(company: str) -> list[dict[str, Any]]:
    """
    Returns the investors in the company with this name or id.
    Args:
        company (str): name of the company to find investors in
    Returns:
        list[dict[str, Any]]: A list of investor ids, names (and their types Organization or Person)
    """
    try:
        results = db._execute_query("""
        MATCH p=(o:Organization)<-[r:HAS_INVESTOR]-(i)
        WHERE o.name=$company OR o.id=$company
        RETURN i.id as id, i.name as name, head(labels(i)) as type
        """, {"company":company})
        return results
    except Exception as e:
        return [{"error":str(e)}]

In [223]:
get_investors("Cloudflare")

[{'id': 'EIBehdsQ4ME-yr8WK8tiC1w',
  'name': 'Atlas Financial Group',
  'type': 'Organization'},
 {'id': 'E-qHiBRsMNb2wOsqdSx2CfQ',
  'name': 'Pelion Venture Partners',
  'type': 'Organization'},
 {'id': 'E7obn7yRnMP6HjTQ4xOL6Dg', 'name': 'Venrock', 'type': 'Organization'},
 {'id': 'EfCgjTBhgNdyHkynILGypYg',
  'name': 'New Enterprise Associates (NEA)',
  'type': 'Organization'},
 {'id': 'EptIVnF6gNuOmOmAdBku7Uw',
  'name': 'National Science Foundation (NSF)',
  'type': 'Organization'},
 {'id': 'EdE6JQUuvMf6rTFv2qMJZqQ',
  'name': 'Union Square Ventures',
  'type': 'Organization'},
 {'id': 'E0aVDBaUAMSWgjywIrE4IwA',
  'name': 'Greenspring Associates',
  'type': 'Organization'},
 {'id': 'EThF8yM26NOmTZYsND2DqRQ',
  'name': 'Microsoft Accelerator',
  'type': 'Organization'},
 {'id': 'EUFq-3WlpNsq0pvfUYWXOEA', 'name': 'Google', 'type': 'Organization'},
 {'id': 'EmuHMD_KVM1OUKHkQE21s5A',
  'name': 'Qualcomm Ventures',
  'type': 'Organization'},
 {'id': 'E1XokcZs2NqGwLR47G92lFQ', 'name': 'Ba

# Define our Agents

An AI agent reasons, plans, and takes actions.

The agent takes actions via access to **tools**, deciding how and when to invoke a tool. The agent also manages orchestration, creating a plan for answering a user query and adapting to responses that aren't quite correct.

Agent **tools** can be Python functions, Vertex AI Extensions, or MCP servers.
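When a plain Python function is passed as a tool, its name, type hints, and docstring are what describe it to the model, which is why the functions above carry detailed docstrings. The sketch below uses only the standard library `inspect` module to show that information; it illustrates what a framework can read from a function and is not the ADK's own implementation. `describe_tool` and `get_ceo` are hypothetical helpers.

```python
import inspect
from typing import Any


def get_ceo(company: str) -> list[dict[str, Any]]:
    """Returns the CEO of the company with this name.

    Args:
        company (str): name of the company
    """
    return []  # placeholder body, no database access in this sketch


def describe_tool(func) -> dict[str, Any]:
    """Collect the name, parameter annotations, and docstring summary of a function."""
    sig = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "parameters": {
            n: getattr(p.annotation, "__name__", str(p.annotation))
            for n, p in sig.parameters.items()
        },
        "description": doc.splitlines()[0] if doc else "",
    }


print(describe_tool(get_ceo))
```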

## Creating the Database Agent

An agent definition consists of:

* display name
* instructions - detailed, making it clear exactly how the agent should behave and which tools to use when
* tools



In [207]:
from google.adk.agents import Agent

In [208]:
MODEL="gemini-2.5-pro-exp-03-25"

In [226]:
database_agent = Agent(
    model=MODEL,
    name='graph_database_agent',
    instruction="""
      You are a Neo4j graph database and Cypher query expert who must use the database schema together with the user question to repeatedly generate valid Cypher statements
      to execute on the database and answer the user's questions in a friendly manner in natural language.
      If in doubt the database schema is always prioritized when it comes to nodes-types (labels) or relationship-types or property names, never take the user's input at face value.
      If the user requests it, also render tables, charts or other artifacts with the query results.
      Always validate the correct node-labels at the end of a relationship based on the schema.

      If a query fails or doesn't return data, use the error response 3 times to try to fix the generated query and re-run it, don't return the error to the user.
      If you cannot fix the query, explain the issue to the user and apologize.
      *You are prohibited* from using directional arrows (like -> or <-) in the graph patterns, always use undirected patterns like `(:Label)-[:TYPE]-(:Label)`.
      You get negative points for using directional arrows in patterns.

      Fetch the graph database schema first and keep it in session memory to access later for query generation.
      Keep results of previous executions in session memory and access if needed, for instance ids or other attributes of nodes to find them again
      removing the need to ask the user. This also allows for generating shorter, more focused and less error-prone queries
      for drill-downs, sequences and loops.
      If possible resolve names to primary keys or ids and use those for looking up entities.
      The schema always indicates *outgoing* relationship-types from an entity to another entity, the graph patterns read like english language.
      `company has supplier` would be the pattern `(o:Organization)-[:HAS_SUPPLIER]-(s:Organization)`

      To get the schema of a database use the `get_schema` tool without parameters. Store the response of the schema tool in session context
      to access later for query generation.

      To answer a user question generate one or more Cypher statements based on the database schema and the parts of the user question.
      If necessary resolve categorical attributes (like names, countries, industries, publications) first by retrieving them for a set of entities to translate from the user's request.
      Use the `execute_read_query` tool repeatedly with the Cypher statements, you MUST generate statements that use named query parameters with `$parameter` style names
      and MUST pass them as a second dictionary parameter to the tool, even if empty.
      Parameter data can come from the users requests, prior query results or additional lookup queries.
      After the data for the question has been sufficiently retrieved, pass the data and control back to the parent agent.
    """,
    tools=[
        get_schema, execute_read_query
    ]
)

In [227]:
investment_research_agent = Agent(
    model=MODEL,
    name='investment_research_agent',
    instruction="""
    You are an agent that has access to a knowledge graph of companies (organizations), people involved with them, articles about companies,
    and industry categories and technologies.
    You have a set of tools to access different aspects of the investment database.
    You will be tasked by other agents to fetch certain information from that knowledge graph.
    If you do so, try to always return not just the factual attribute data but also
    ids of companies, articles, people to allow the other tools to investigate them more.
    """,
    tools=[
        get_schema, get_investors
    ]
)

In [228]:
root_agent = Agent(
    model=MODEL,
    name='investment_agent',
    global_instruction = "",
    instruction="""
    You are an agent that has access to a knowledge graph of companies (organizations), people involved with them, articles about companies,
    and industry categories and technologies.
    You have a set of agents to retrieve information from that knowledge graph.
    If the user requests it, also render tables, charts or other artifacts with the research results.
    """,

    sub_agents=[database_agent, investment_research_agent]
)

# Let's try it

You can [run the ADK locally](https://google.github.io/adk-docs/get-started/local-testing/#expected-output) as a web application with `adk web` or as a FastAPI server with `adk api_server`.

You can also deploy the agent to [Cloud Run](https://google.github.io/adk-docs/deploy/cloud-run/) or to [Agent Engine](https://google.github.io/adk-docs/deploy/agent-engine/).

In [229]:
import warnings
warnings.filterwarnings('ignore', category=UserWarning)
warnings.filterwarnings('ignore', category=Warning)

APP_NAME = 'Neo4j Investment Researcher'
USER_ID = 'Michael Hunger'

from google.adk.runners import InMemoryRunner
from google.genai.types import Part, UserContent


runner = InMemoryRunner(app_name=APP_NAME, agent=root_agent)

session = runner.session_service.create_session( app_name=runner.app_name, user_id=USER_ID)

async def run_prompt(new_message: str):
  content = UserContent(parts=[Part(text=new_message)])
# print (content)
  result = None
  async for event in runner.run_async(user_id=session.user_id, session_id=session.id, new_message=content):
#    print(event.content.model_dump(exclude_none=True))
#    print(event.content.parts)
    for part in event.content.parts:
      print(part)
      if part.text:
#        print(part.text)
        result = part.text
  return result
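`run_prompt` consumes an asynchronous stream of events and keeps the text of the last part that carries text as the answer. The sketch below runs the same loop against a stand-in stream built from `SimpleNamespace` objects, so it works without the ADK; `fake_events` and `last_text` are hypothetical, and the guard for events without content is an added precaution rather than something the cell above does.

```python
import asyncio
from types import SimpleNamespace


async def fake_events():
    """Stand-in for runner.run_async: yields simplified event-like objects."""
    # A tool-call style event: it has a part, but the part carries no text.
    yield SimpleNamespace(content=SimpleNamespace(parts=[SimpleNamespace(text=None)]))
    # An event without any content.
    yield SimpleNamespace(content=None)
    # The final answer.
    yield SimpleNamespace(
        content=SimpleNamespace(parts=[SimpleNamespace(text="There are 8,064 people.")])
    )


async def last_text(events):
    """Return the text of the last part that carries text, or None."""
    result = None
    async for event in events:
        if not event.content or not event.content.parts:
            continue  # skip events that carry no parts
        for part in event.content.parts:
            if part.text:
                result = part.text
    return result


answer = asyncio.run(last_text(fake_events()))
print(answer)
```

Inside a notebook, where an event loop is already running, call `await last_text(fake_events())` instead of `asyncio.run(...)`.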

In [230]:
await run_prompt('How many people are in the database?')




video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=FunctionCall(id='adk-3280e34a-0a28-485b-93d8-a4acff64180c', args={'agent_name': 'graph_database_agent'}, name='transfer_to_agent') function_response=None inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=FunctionResponse(id='adk-3280e34a-0a28-485b-93d8-a4acff64180c', name='transfer_to_agent', response={}) inline_data=None text=None




video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=None inline_data=None text='Okay, I can help with that. First, I need to check the database structure.'
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=FunctionCall(id='adk-bea7abcd-5baf-4a62-b1f7-9af5a3b82d63', args={}, name='get_schema') function_response=None inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=FunctionResponse(id='adk-bea7abcd-5baf-4a62-b1f7-9af5a3b82d63', name='get_schema', response={'result': [{'label': 'Person', 'attributes': {'summary': 'STRING', 'id': 'STRING unique indexed', 'name': 'STRING indexed'}, 'relationships': {'HAS_PARENT': 'Person', 'HAS_CHILD': 'Person'}}, {'label': 'Organization', 'attributes': {'summary': 'STRING', 'isDissolved': 'BOOLEAN', 'id': 'ST



video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=None inline_data=None text="Okay, I can tell you how many people are in the database. I'll run a query to count them."
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=FunctionCall(id='adk-daf0740c-a5d9-40de-aa69-03624e1b5ab0', args={'params': {}, 'query': 'MATCH (p:Person) RETURN count(p) AS total_people'}, name='execute_read_query') function_response=None inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=FunctionResponse(id='adk-daf0740c-a5d9-40de-aa69-03624e1b5ab0', name='execute_read_query', response={'result': [{'total_people': 8064}]}) inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None functi

'There are 8,064 people in the database.'

In [231]:
await run_prompt('What are the main competitors of YouTube?')



video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=None inline_data=None text="Okay, I can help you find the competitors of YouTube. I'll query the database for organizations that have a competitor relationship with YouTube."
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=FunctionCall(id='adk-e4bacf9a-d293-4147-9afc-9e8722f1a3ea', args={'query': 'MATCH (o:Organization {name: $name})-[:HAS_COMPETITOR]-(competitor:Organization) RETURN competitor.name AS competitor_name', 'params': {'name': 'YouTube'}}, name='execute_read_query') function_response=None inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=FunctionResponse(id='adk-e4bacf9a-d293-4147-9afc-9e8722f1a3ea', name='execute_read_query', response={'result': [{'competitor_name': 'Mixer'

'Based on the database, the main competitors of YouTube are:\n\n*   Mixer\n*   BYTEDANCE\n*   BuzzFeed\n*   TikTok\n*   OpenAI\n*   Dailymotion\n*   Twitter\n*   Fox Broadcasting\n*   Violin Systems\n*   Oxygen'

In [232]:
await run_prompt('Who has invested in BYTEDANCE and where else have they invested?')



video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=None inline_data=None text="Okay, I can find that information for you. I'll query the database to identify the investors in BYTEDANCE and then find the other companies they have invested in."
video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=FunctionCall(id='adk-5d1d53e2-4ba0-4576-b699-4ed0802df3e9', args={'query': '\nMATCH (target:Organization {name: $targetCompany})\nMATCH (investor:Person)-[:HAS_INVESTOR]-(target)\nMATCH (investor)-[:HAS_INVESTOR]-(investment:Organization)\nWHERE investment <> target\nRETURN investor.name AS investorName, collect(DISTINCT investment.name) AS otherInvestments\n', 'params': {'targetCompany': 'BYTEDANCE'}}, name='execute_read_query') function_response=None inline_data=None text=None
video_metadata=None thought=None code_execution_result=None executable_code=None 

'Based on the data, Rong Yue is listed as an investor in BYTEDANCE. Rong Yue has also invested in Inspur.'

In [233]:
runner.session_service.list_sessions(app_name=APP_NAME, user_id=USER_ID)

ListSessionsResponse(sessions=[Session(id='b8715e94-61a9-4e0d-905d-57e056ce9de9', app_name='Neo4j Investment Researcher', user_id='Michael Hunger', state={}, events=[], last_update_time=1744231468.846185)])

In [234]:
for session in runner.session_service.list_sessions(app_name=APP_NAME, user_id=USER_ID).sessions:
  print(session.model_dump())

{'id': 'b8715e94-61a9-4e0d-905d-57e056ce9de9', 'app_name': 'Neo4j Investment Researcher', 'user_id': 'Michael Hunger', 'state': {}, 'events': [], 'last_update_time': 1744231468.846185}


In [235]:
result = await run_prompt("Summarize the results of the previous research questions")
result

video_metadata=None thought=None code_execution_result=None executable_code=None file_data=None function_call=None function_response=None inline_data=None text="Okay, here's a summary of our findings so far:\n\n1.  We found there are a total of **8,064 people** recorded in the database.\n2.  We identified the main competitors of **YouTube** as: Mixer, BYTEDANCE, BuzzFeed, TikTok, OpenAI, Dailymotion, Twitter, Fox Broadcasting, Violin Systems, and Oxygen.\n3.  We looked into investors of **BYTEDANCE** and found that **Rong Yue** is listed as an investor. Besides BYTEDANCE, Rong Yue has also invested in **Inspur**."


"Okay, here's a summary of our findings so far:\n\n1.  We found there are a total of **8,064 people** recorded in the database.\n2.  We identified the main competitors of **YouTube** as: Mixer, BYTEDANCE, BuzzFeed, TikTok, OpenAI, Dailymotion, Twitter, Fox Broadcasting, Violin Systems, and Oxygen.\n3.  We looked into investors of **BYTEDANCE** and found that **Rong Yue** is listed as an investor. Besides BYTEDANCE, Rong Yue has also invested in **Inspur**."

In [236]:
from IPython.display import Markdown, display


display(Markdown(result))

Okay, here's a summary of our findings so far:

1.  We found there are a total of **8,064 people** recorded in the database.
2.  We identified the main competitors of **YouTube** as: Mixer, BYTEDANCE, BuzzFeed, TikTok, OpenAI, Dailymotion, Twitter, Fox Broadcasting, Violin Systems, and Oxygen.
3.  We looked into investors of **BYTEDANCE** and found that **Rong Yue** is listed as an investor. Besides BYTEDANCE, Rong Yue has also invested in **Inspur**.

In [237]:
for session in runner.session_service.list_sessions(app_name=APP_NAME, user_id=USER_ID).sessions:
  print(f"Deleting session {session}")
  runner.session_service.delete_session(app_name=APP_NAME, user_id=USER_ID, session_id=session.id)

Deleting session id='b8715e94-61a9-4e0d-905d-57e056ce9de9' app_name='Neo4j Investment Researcher' user_id='Michael Hunger' state={} events=[] last_update_time=1744231473.597785
