# Module 3 - GraphRAG and Agents

This module has the following objectives:
- Experiment with queries for an agent
- Define the tooling
- Create an agent with the available tools
- Build a chatbot on top of the agent
- Try Text2Cypher (if time allows)

In [1]:
!pip install graphdatascience neo4j python-dotenv openai langchain langchain-openai langgraph gradio pandas pydantic



Import our usual suspects (and some more...)

In [2]:
import os
import pandas as pd
from dotenv import load_dotenv
from graphdatascience import GraphDataScience
from neo4j import Query, GraphDatabase, RoutingControl, Result
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.messages import HumanMessage
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langgraph.prebuilt import create_react_agent
from openai import OpenAI
from typing import List, Optional
from pydantic import BaseModel, Field, validator
import functools
from langchain_core.tools import tool
import gradio as gr
import time

## Setup

Load the environment variables

In [3]:
env_file = 'ws.env'

In [4]:
if os.path.exists(env_file):
    load_dotenv(env_file, override=True)

    # Neo4j
    HOST = os.getenv('NEO4J_URI')
    USERNAME = os.getenv('NEO4J_USERNAME')
    PASSWORD = os.getenv('NEO4J_PASSWORD')
    DATABASE = os.getenv('NEO4J_DATABASE')

    # AI
    OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
    os.environ['OPENAI_API_KEY']=OPENAI_API_KEY
    LLM = os.getenv('LLM')
    EMBEDDINGS_MODEL = os.getenv('EMBEDDINGS_MODEL')
else:
    print(f"File {env_file} not found.")
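
If `ws.env` is missing or incomplete, the cells below fail later with a less obvious error (an undefined name or a `None` value). A minimal sketch of an early check, using only the standard library; `missing_settings` and `REQUIRED_SETTINGS` are hypothetical names, not part of the workshop code:

```python
import os

# Variable names read by the cell above
REQUIRED_SETTINGS = [
    "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_DATABASE",
    "OPENAI_API_KEY", "LLM", "EMBEDDINGS_MODEL",
]

def missing_settings(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

# Example with an explicit dict instead of the real environment (made-up values)
example_env = {"NEO4J_URI": "neo4j://localhost:7687", "LLM": ""}
print(missing_settings(["NEO4J_URI", "LLM", "EMBEDDINGS_MODEL"], example_env))
```

Calling `missing_settings(REQUIRED_SETTINGS)` right after `load_dotenv` would list anything that still needs to be added to the env file.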

Connect to the Neo4j database

In [5]:
driver = GraphDatabase.driver(
    HOST,
    auth=(USERNAME, PASSWORD)
)

Test the connection

In [6]:
driver.execute_query(
    """
    MATCH (n) RETURN COUNT(n) as Count
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df()
)

Unnamed: 0,Count
0,158


Check that our constraints and indexes are in place

In [7]:
schema_result_df  = driver.execute_query(
    'show indexes',
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df()
)

In [8]:
schema_result_df.head(100)

Unnamed: 0,id,name,state,populationPercent,type,entityType,labelsOrTypes,properties,indexProvider,owningConstraint,lastRead,readCount
0,5,constraint_63bf11a1,ONLINE,100.0,RANGE,NODE,[Skill],[name],range-1.0,constraint_63bf11a1,2025-03-12T17:24:57.534000000+00:00,582
1,3,constraint_d3bfd313,ONLINE,100.0,RANGE,NODE,[Person],[email],range-1.0,constraint_d3bfd313,2025-03-12T16:17:28.945000000+00:00,306
2,0,index_343aff4e,ONLINE,100.0,LOOKUP,NODE,,,token-lookup-1.0,,2025-03-12T17:25:02.191000000+00:00,16701
3,1,index_f7700477,ONLINE,100.0,LOOKUP,RELATIONSHIP,,,token-lookup-1.0,,2025-03-12T17:13:59.380000000+00:00,75
4,4,skill-embeddings,ONLINE,100.0,VECTOR,NODE,[Skill],[embedding],vector-2.0,,2025-03-12T17:24:51.486000000+00:00,103


## Agent Thinking

Let's say we want to build an agent with multiple tools. Let's try to provide the following functionality:

1. Retrieve the skills of a person.
   - Input: Person
   - Output: Skills
   - Example: *What skills does Kristof Neys have?*
2. Retrieve similar skills to other skills.
   - Input: Skills
   - Output: Skills
   - Example: *What skills are similar to PowerBI and Data Visualization?*
3. Retrieve persons similar to a specified person.
   - Input: Person
   - Output: Person
   - Example: *Which persons have similar skills as Kristof Neys?*
4. Retrieve persons based on a set of skills.
   - Input: Skills
   - Output: Person
   - Example: *Which persons have Python and AWS experience?*

In [9]:
embeddings = OpenAIEmbeddings(model=EMBEDDINGS_MODEL)

## 1 - Retrieve Skills of Person

Find the connected skills given a person name.

In [10]:
person_name = "Lucy Turner"

In [11]:
person_skills_df = driver.execute_query(
    """
    MATCH (p:Person{name: $person_name})-[:KNOWS]->(s:Skill)
    RETURN p.name as name, COLLECT(s.name) as skills
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    person_name = person_name
)

In [12]:
person_skills_df

Unnamed: 0,name,skills
0,Lucy Turner,"[Security, Express.js, Big Data, Scala, Docker]"


## 2 - Retrieve similar skills

Retrieve skills based on a list of skills

In [13]:
skills = ['Contineous Delivery', 'Cloud Native', 'Security']
skills_vectors = embeddings.embed_documents(skills)

In [14]:
search_persons_with_skills_df = driver.execute_query(
    """
        UNWIND $skills_vectors AS v
        CALL db.index.vector.queryNodes('skill-embeddings', 3, TOFLOATLIST(v)) YIELD node, score
        WHERE score > 0.89
        OPTIONAL MATCH (node)-[:SIMILAR_SEMANTIC]-(s:Skill)
        WITH COLLECT(node) AS nodes, COLLECT(DISTINCT s) AS skills
        WITH nodes + skills AS all_skills
        UNWIND all_skills AS skill
        RETURN DISTINCT skill.name as skill_name
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    skills_vectors = skills_vectors
)

In [15]:
search_persons_with_skills_df

Unnamed: 0,skill_name
0,CI/CD
1,Cloud Architecture
2,Azure
3,AWS
4,Security
5,DevOps
6,Jenkins


## 3 - Person Similarity

### Strategy 3.1 - Communities

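People in the same Leiden community (the `leiden_community` property on `Person`) tend to share skills, so we can use community membership to find similar people.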

In [16]:
person_name_1 = "John Garcia"

In [17]:
person_similarity_community_df = driver.execute_query(
    """
    MATCH (p1:Person {name: $person_name_1})-[:KNOWS]->(s:Skill)
    WITH p1, COLLECT(s.name) as s1
    MATCH (p2:Person {leiden_community: p1.leiden_community})-[:KNOWS]->(s2:Skill)
    RETURN p1.name AS person_1, s1 AS skills_1, p2.name AS person_2, COLLECT(s2.name) AS skills_2
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    person_name_1 = person_name_1
)

In [18]:
person_similarity_community_df

Unnamed: 0,person_1,skills_1,person_2,skills_2
0,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Thomas Nelson,"[Security, Pandas, Go]"
1,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Lucy Turner,"[Security, Express.js, Big Data, Scala, Docker]"
2,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Sophie Jackson,"[Security, Pandas, Linux, Angular]"
3,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Mia Nelson,"[Security, WordPress, Big Data, Swift, AWS]"
4,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",David Lopez,"[Security, WordPress, PHP]"
5,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Thomas Brown,"[Security, R, Java, Docker]"
6,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Isabella Allen,"[Security, Scala, Cloud Architecture]"
7,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Olivia Johnson,"[Security, Angular, CI/CD]"
8,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Amelia Davis,"[Security, PyTorch, Java, HTML5, Docker]"
9,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Emily Phillips,"[Security, Vue.js, PHP, Kubernetes, Data Visua..."


### Strategy 3.2 - Similar Skillsets

We can use the SIMILAR_SKILLSET relationship to find similar persons

In [19]:
person_name_1 = "John Garcia"

In [20]:
person_similar_skillset_df = driver.execute_query(
    """
    MATCH (p1:Person {name: $person_name_1})-[:KNOWS]->(s:Skill)
    WITH p1, COLLECT(s.name) as s1
    MATCH (p1)-[r:SIMILAR_SKILLSET]-(p2:Person)-[:KNOWS]->(s2:Skill)
    WHERE r.overlap > 1
    RETURN p1.name AS person_1, s1 AS skills_1, p2.name AS person_2, COLLECT(DISTINCT s2.name) AS skills_2
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    person_name_1 = person_name_1
)

In [21]:
person_similar_skillset_df

Unnamed: 0,person_1,skills_1,person_2,skills_2
0,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Mia Nelson,"[Security, WordPress, Big Data, Swift, AWS]"
1,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Matthew Miller,"[TensorFlow, Ruby, AWS, ReactJS]"
2,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Matthew Mitchell,"[R, HTML5, Blockchain, Cloud Architecture, Ruby]"
3,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",John Taylor,"[Pandas, Scrum, CSS3, Ruby, AWS]"
4,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Amelia Davis,"[Security, PyTorch, Java, HTML5, Docker]"
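
To make the `r.overlap > 1` filter concrete, here is a small sketch that recomputes the overlap in plain Python for three people from the outputs above. It assumes that `overlap` on `SIMILAR_SKILLSET` is the number of skills two persons share; the relationship itself was created earlier and is not defined in this notebook.

```python
def skill_overlap(skills_a, skills_b):
    """Number of skills two persons have in common (assumed meaning of r.overlap)."""
    return len(set(skills_a) & set(skills_b))

john = ["Security", "PyTorch", "HTML5", "Ruby", "AWS"]
candidates = {
    "Mia Nelson": ["Security", "WordPress", "Big Data", "Swift", "AWS"],
    "Amelia Davis": ["Security", "PyTorch", "Java", "HTML5", "Docker"],
    "Thomas Nelson": ["Security", "Pandas", "Go"],
}

# Keep only candidates sharing more than one skill, like WHERE r.overlap > 1
overlaps = {name: skill_overlap(john, skills) for name, skills in candidates.items()}
similar = {name: count for name, count in overlaps.items() if count > 1}
print(similar)
```

Thomas Nelson shares only `Security` with John Garcia, which is consistent with him appearing in the community result but not in the similar-skillset result.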


### Strategy 3.3 - Similar Skillsets and Semantic Meaning

Combine semantic skill similarity (`SIMILAR_SEMANTIC`) with skillset overlap (`SIMILAR_SKILLSET`) to find people with similar skills

In [22]:
person_name_1 = "John Garcia"

In [23]:
person_similarity_df = driver.execute_query(
    """
    MATCH (p1:Person {name: $person_name_1})-[:KNOWS]->(s:Skill)
    WITH p1, COLLECT(s.name) as skills_1
    CALL (p1){
      MATCH (p1)-[:KNOWS]->(s1:Skill)-[r:SIMILAR_SEMANTIC]-(s2:Skill)<-[:KNOWS]-(p2:Person)
      RETURN p1 as person_1, p2 as person_2, SUM(r.score) AS score
      UNION 
      MATCH (p1)-[r:SIMILAR_SKILLSET]->(p2:Person)
      RETURN p1 as person_1, p2 AS person_2, SUM(r.overlap) AS score
    }
    WITH person_1.name as person_1, skills_1, person_2, SUM(score) as score
    WHERE score >= 1
    MATCH (person_2)-[:KNOWS]->(s:Skill)
    RETURN person_1, skills_1,  person_2.name as person_2, COLLECT(s.name) as skills_2, score
    ORDER BY score DESC LIMIT 5
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    person_name_1 = person_name_1
)

In [24]:
person_similarity_df

Unnamed: 0,person_1,skills_1,person_2,skills_2,score
0,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Matthew Miller,"[TensorFlow, Ruby, AWS, ReactJS]",2.93338
1,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Matthew Mitchell,"[R, HTML5, Blockchain, Cloud Architecture, Ruby]",2.93161
2,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",John Johnson,"[WordPress, TensorFlow, AWS, Project Managemen...",2.863373
3,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",Mia Nelson,"[Security, WordPress, Big Data, Swift, AWS]",2.0
4,John Garcia,"[Security, PyTorch, HTML5, Ruby, AWS]",John Taylor,"[Pandas, Scrum, CSS3, Ruby, AWS]",2.0
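
The query above adds up two signals per candidate: the summed `SIMILAR_SEMANTIC` scores and the `SIMILAR_SKILLSET` overlap. A rough sketch of that combination step in plain Python follows. The names and numbers are made up for illustration, and the sketch does not model the row de-duplication that `UNION` performs in Cypher.

```python
from collections import defaultdict

def combine_scores(semantic_rows, overlap_rows, min_score=1.0, limit=5):
    """Sum per-person scores from both branches, filter, and rank descending."""
    totals = defaultdict(float)
    for person, score in semantic_rows + overlap_rows:
        totals[person] += score
    kept = [(person, score) for person, score in totals.items() if score >= min_score]
    return sorted(kept, key=lambda item: item[1], reverse=True)[:limit]

# Hypothetical per-branch results: (person, score)
semantic_rows = [("Person A", 0.5), ("Person B", 0.75)]
overlap_rows = [("Person A", 2.0), ("Person C", 2.0)]

print(combine_scores(semantic_rows, overlap_rows))
```

Person B is dropped because a semantic score alone below 1 does not pass the `score >= 1` filter, which mirrors why only candidates with real overlap or strong semantic links show up in the result.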


## 4 - Recommendation of Person given skills

### Vector Index Search

In [25]:
skills = ['AWS', 'Security']

In [26]:
skills_vectors = embeddings.embed_documents(skills)

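For each search vector `v`, we query the vector index for its 3 approximate nearest `Skill` nodes and keep those with a similarity score above 0.85. The first query shows the matches per input skill. The second query collects the matched skills together with their `SIMILAR_SEMANTIC` neighbours and ranks persons by how many of these skills they know.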

In [27]:
nn_df = driver.execute_query(
    """UNWIND $skills_vectors AS v
    CALL db.index.vector.queryNodes('skill-embeddings', 3, TOFLOATLIST(v)) YIELD node, score
    WHERE score > 0.85
    WITH v as embedding, COALESCE(COLLECT(node.name), []) AS top
    RETURN *
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    skills_vectors = skills_vectors
)
nn_df['skills'] = skills
cols = list(nn_df.columns)[-1:] + list(nn_df.columns)[:-1]
nn_df = nn_df[cols]

In [28]:
nn_df

Unnamed: 0,skills,embedding,top
0,AWS,"[-0.004132895264774561, -0.017077714204788208,...","[AWS, Azure, Cloud Architecture]"
1,Security,"[0.01840578392148018, -0.011083670891821384, -...","[Security, Rust, Linux]"
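
The per-skill lookup above can be pictured as a nearest-neighbour search followed by a score filter. The sketch below does this by brute force with raw cosine similarity on toy 2-dimensional vectors. This is an illustration only: the Neo4j index searches approximately, and its reported score may be on a different scale than raw cosine similarity, so the threshold here is not directly comparable. `cosine` and `top_k` are hypothetical helpers.

```python
import math

def cosine(a, b):
    """Raw cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, vectors, k=3, min_score=0.85):
    """Take the k most similar entries first, then drop those at or below min_score."""
    scored = sorted(((cosine(query, vec), name) for name, vec in vectors.items()), reverse=True)
    return [(name, score) for score, name in scored[:k] if score > min_score]

# Toy vectors (made up); real skill embeddings have many more dimensions
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [3.0, 4.0], "d": [4.0, 3.0]}
print(top_k([1.0, 0.0], toy_vectors, k=3, min_score=0.5))
```

Note the order of operations: the filter is applied after the top 3 are taken, as in the Cypher query, so a skill can end up with fewer than 3 matches.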


In [29]:
find_persons_given_skills_df = driver.execute_query(
    """
    UNWIND $skills_vectors AS v
    CALL db.index.vector.queryNodes('skill-embeddings', 3, TOFLOATLIST(v)) YIELD node, score
    WHERE score > 0.85
    OPTIONAL MATCH (node)-[:SIMILAR_SEMANTIC]-(s:Skill)
    WITH COLLECT(node) AS nodes, COLLECT(DISTINCT s) AS skills
    WITH nodes + skills AS all_skills
    UNWIND all_skills AS skill
    MATCH (p:Person)-[:KNOWS]->(skill)
    RETURN p.name AS person, COUNT(DISTINCT(skill)) AS skill_count, COLLECT(DISTINCT(skill.name)) as similar_skills
    ORDER BY skill_count DESC LIMIT 10
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    skills_vectors = skills_vectors
)

In [30]:
find_persons_given_skills_df

Unnamed: 0,person,skill_count,similar_skills
0,Natalie Turner,2,"[Rust, Linux]"
1,Mia Nelson,2,"[AWS, Security]"
2,Brian Jackson,2,"[Cloud Architecture, Rust]"
3,Sophie Jackson,2,"[Security, Linux]"
4,John Garcia,2,"[AWS, Security]"
5,Isabella Allen,2,"[Cloud Architecture, Security]"
6,John Taylor,1,[AWS]
7,Brian Thompson,1,[AWS]
8,Matthew Miller,1,[AWS]
9,Natalie Miller,1,[Azure]


## Agents with GraphRAG

### Let's create a Retrieval agent

In [31]:
class Skill(BaseModel):
    """
    Represents a professional skill or knowledge of a person.
    """
    name: str = Field(..., description="Shortened name of the skill")

### Tool 1

In [32]:
def retrieve_skills_of_person(person_name: str) -> pd.DataFrame:
    """Retrieve the skills of a person. The person is identified by their name."""
    return driver.execute_query(
        """
        MATCH (p:Person{name: $person_name})-[:KNOWS]->(s:Skill)
        RETURN p.name as name, COLLECT(s.name) as skills
        """,
        database_=DATABASE,
        routing_=RoutingControl.READ,
        result_transformer_= lambda r: r.to_df(),
        person_name = person_name
    )

In [33]:
retrieve_skills_of_person('Mia Nelson') 

Unnamed: 0,name,skills
0,Mia Nelson,"[Security, WordPress, Big Data, Swift, AWS]"


### Tool 2

In [34]:
def find_similar_skills(skills: List[Skill]) -> pd.DataFrame:
    """Find skills similar to the specified list of skills. Skills are specified by a list of their names."""
    skills = [s.name for s in skills]
    skills_vectors = embeddings.embed_documents(skills)
    return driver.execute_query(
    """
        UNWIND $skills_vectors AS v
        CALL db.index.vector.queryNodes('skill-embeddings', 3, TOFLOATLIST(v)) YIELD node, score
        WHERE score > 0.89
        OPTIONAL MATCH (node)-[:SIMILAR_SEMANTIC]-(s:Skill)
        WITH COLLECT(node) AS nodes, COLLECT(DISTINCT s) AS skills
        WITH nodes + skills AS all_skills
        UNWIND all_skills AS skill
        RETURN DISTINCT skill.name as skill_name
    """,
    database_=DATABASE,
    routing_=RoutingControl.READ,
    result_transformer_= lambda r: r.to_df(),
    skills_vectors = skills_vectors
)

In [35]:
find_similar_skills([Skill(name='Python')])

Unnamed: 0,skill_name
0,Python
1,Pandas
2,Django
3,Ruby
4,Java
5,PHP
6,C++
7,Flask


### Tool 3

In [36]:
def person_similarity(person_name: str) -> pd.DataFrame:
    """Find a similar person to the one specified based on their skill similarity. Persons are provided with their name"""
    
    return driver.execute_query(
        """
        MATCH (p1:Person {name: $person_name})-[:KNOWS]->(s:Skill)
        WITH p1, COLLECT(s.name) as skills_1
        CALL (p1){
          MATCH (p1)-[:KNOWS]->(s1:Skill)-[r:SIMILAR_SEMANTIC]-(s2:Skill)<-[:KNOWS]-(p2:Person)
          RETURN p1 as person_1, p2 as person_2, SUM(r.score) AS score
          UNION 
          MATCH (p1)-[r:SIMILAR_SKILLSET]->(p2:Person)
          RETURN p1 as person_1, p2 AS person_2, SUM(r.overlap) AS score
        }
        WITH person_1.name as person_1, skills_1, person_2, SUM(score) as score
        WHERE score >= 1
        MATCH (person_2)-[:KNOWS]->(s:Skill)
        RETURN person_1, skills_1,  person_2.name as person_2, COLLECT(s.name) as skills_2, score
        ORDER BY score DESC LIMIT 5
        """,
        database_=DATABASE,
        routing_=RoutingControl.READ,
        result_transformer_= lambda r: r.to_df(),
        person_name = person_name
    )

In [37]:
person_similarity("Christopher Jackson")

Unnamed: 0,person_1,skills_1,person_2,skills_2,score
0,Christopher Jackson,"[Linux, System Design, Spark, Django, Python]",Joseph Mitchell,"[System Design, Spark, Vue.js, Ruby]",2.929993
1,Christopher Jackson,"[Linux, System Design, Spark, Django, Python]",John Walker,"[API Design, Django, Python]",2.92543
2,Christopher Jackson,"[Linux, System Design, Spark, Django, Python]",Natalie Thompson,"[System Design, Angular, Spark, TypeScript, Je...",2.0
3,Christopher Jackson,"[Linux, System Design, Spark, Django, Python]",Joseph Lopez,"[Linux, System Design, ReactJS]",2.0
4,Christopher Jackson,"[Linux, System Design, Spark, Django, Python]",Peter Perez,"[System Design, Rust, Big Data, Django]",2.0


### Tool 4

In [38]:
def find_person_based_on_skills(skills: List[Skill]) -> pd.DataFrame:
    """
    Find persons based on skills they have. Skills are specified by their names.
    Semantically similar skills are also matched and count towards the score.
    """
    skills = [s.name for s in skills]
    skills_vectors = embeddings.embed_documents(skills)
    return driver.execute_query(
        """
        UNWIND $skills_vectors AS v
        CALL db.index.vector.queryNodes('skill-embeddings', 3, TOFLOATLIST(v)) YIELD node, score
        WHERE score > 0.89
        OPTIONAL MATCH (node)-[:SIMILAR_SEMANTIC]-(s:Skill)
        WITH COLLECT(node) AS nodes, COLLECT(DISTINCT s) AS skills
        WITH nodes + skills AS all_skills
        UNWIND all_skills AS skill
        MATCH (p:Person)-[:KNOWS]->(skill)
        RETURN p.name AS person, COUNT(DISTINCT(skill)) AS score, COLLECT(DISTINCT(skill.name)) as similar_skills
        ORDER BY score DESC LIMIT 10
        """,
        database_=DATABASE,
        routing_=RoutingControl.READ,
        result_transformer_= lambda r: r.to_df(),
        skills_vectors = skills_vectors
)

In [39]:
find_person_based_on_skills([Skill(name='Security'), Skill(name='Pandas')])

Unnamed: 0,person,score,similar_skills
0,Sophie Jackson,2,"[Security, Pandas]"
1,Thomas Nelson,2,"[Security, Pandas]"
2,Thomas Brown,1,[Security]
3,David Lopez,1,[Security]
4,Olivia Johnson,1,[Security]
5,Isabella Allen,1,[Security]
6,Mia Nelson,1,[Security]
7,Amelia Davis,1,[Security]
8,Emily Phillips,1,[Security]
9,Lucy Turner,1,[Security]


## Setting up the Agent

In [40]:
llm = ChatOpenAI(model_name=LLM, temperature=0)

In [41]:
response = llm.invoke([HumanMessage(content="hi!")])
response.content

'Hello! How can I assist you today?'

In [42]:
tools = [
    retrieve_skills_of_person, 
    find_similar_skills,
    person_similarity,
    find_person_based_on_skills,
]

llm_with_tools = llm.bind_tools(tools)

In [43]:
response = llm_with_tools.invoke([HumanMessage(content="What skills does Kristof Neys have?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: 
ToolCalls: [{'name': 'retrieve_skills_of_person', 'args': {'person_name': 'Kristof Neys'}, 'id': 'call_FxSPNqvYIStPlOyCPxvm7YbB', 'type': 'tool_call'}]


In [44]:
response = llm_with_tools.invoke([HumanMessage(content="What skills are similar to PowerBI and Data Visualization?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: 
ToolCalls: [{'name': 'find_similar_skills', 'args': {'skills': [{'name': 'PowerBI'}, {'name': 'Data Visualization'}]}, 'id': 'call_kuLxl8o90ob1uV5scTor04mG', 'type': 'tool_call'}]


In [45]:
response = llm_with_tools.invoke([HumanMessage(content="Which persons have similar skills as Kristof Neys?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: 
ToolCalls: [{'name': 'person_similarity', 'args': {'person_name': 'Kristof Neys'}, 'id': 'call_Q4nHC4m3dkOTHwyd33RrPtW1', 'type': 'tool_call'}]


In [46]:
response = llm_with_tools.invoke([HumanMessage(content="Which persons have Python and AWS experience?")])

print(f"ContentString: {response.content}")
print(f"ToolCalls: {response.tool_calls}")

ContentString: 
ToolCalls: [{'name': 'find_person_based_on_skills', 'args': {'skills': [{'name': 'Python'}, {'name': 'AWS'}]}, 'id': 'call_ur3qewzWYWeYBB3AJRsaBFh0', 'type': 'tool_call'}]


We can see that there is now no text content, but there is a tool call: the model wants us to call one of our tools (here `find_person_based_on_skills`). Nothing has been executed yet; the model is only telling us which tool to call and with which arguments. To actually run the tools, we need to create our agent.
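
To see what "actually calling the tool" means, here is a minimal sketch of dispatching one tool call by hand. It uses the same dictionary shape as the `ToolCalls` output above, but a stand-in function instead of the real Neo4j-backed tool. `run_tool_call`, `TOOL_REGISTRY`, and `fake_retrieve_skills_of_person` are hypothetical names; the agent created in the next section takes care of this loop for us.

```python
def fake_retrieve_skills_of_person(person_name):
    """Stand-in for the real tool; returns hard-coded data instead of querying Neo4j."""
    fake_graph = {"Kristof Neys": ["Cypher", "Machine Learning"]}
    return fake_graph.get(person_name, [])

# Map the tool name the model uses to the Python callable that implements it
TOOL_REGISTRY = {"retrieve_skills_of_person": fake_retrieve_skills_of_person}

def run_tool_call(tool_call, registry):
    """Look up the requested tool by name and call it with the model's arguments."""
    func = registry[tool_call["name"]]
    return func(**tool_call["args"])

# Same structure as an entry of response.tool_calls (the id is made up)
tool_call = {
    "name": "retrieve_skills_of_person",
    "args": {"person_name": "Kristof Neys"},
    "id": "call_example",
    "type": "tool_call",
}
print(run_tool_call(tool_call, TOOL_REGISTRY))
```

In a real loop, the tool result would be sent back to the model as a tool message so that it can write the final answer.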

## Running Agents with LangGraph

In [47]:
agent_executor = create_react_agent(llm, tools)

In [48]:
response = agent_executor.invoke({"messages": [HumanMessage(content="hi!")]})

In [49]:
response["messages"]

[HumanMessage(content='hi!', additional_kwargs={}, response_metadata={}, id='a95683de-bf6e-4fa0-a02c-0341291d5dab'),
 AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 227, 'total_tokens': 238, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_eb9dce56a8', 'finish_reason': 'stop', 'logprobs': None}, id='run-2cce8225-f110-42a6-8407-2d9a2fd1253a-0', usage_metadata={'input_tokens': 227, 'output_tokens': 11, 'total_tokens': 238, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]

#### Run some examples! 

In [50]:
def ask_to_agent(question):
    for step in agent_executor.stream(
        {"messages": [HumanMessage(content=question)]},
        stream_mode="values",
    ):
        step["messages"][-1].pretty_print()

In [51]:
question = "What skills does Kristof Neys have?"

In [52]:
ask_to_agent(question)


What skills does Kristof Neys have?
Tool Calls:
  retrieve_skills_of_person (call_Bp9fFW5bJNCwOYfyW8avjVlt)
 Call ID: call_Bp9fFW5bJNCwOYfyW8avjVlt
  Args:
    person_name: Kristof Neys
Name: retrieve_skills_of_person

           name                      skills
0  Kristof Neys  [Cypher, Machine Learning]

Kristof Neys has skills in Cypher and Machine Learning.


In [53]:
question = "What skills are similar to PowerBI and Data Visualization?"

In [54]:
ask_to_agent(question)


What skills are similar to PowerBI and Data Visualization?
Tool Calls:
  find_similar_skills (call_5824RCHYUb1VTvNwSTyKtL14)
 Call ID: call_5824RCHYUb1VTvNwSTyKtL14
  Args:
    skills: [{'name': 'PowerBI'}, {'name': 'Data Visualization'}]
Name: find_similar_skills

           skill_name
0            Power BI
1             Tableau
2  Data Visualization
3       Data Analysis
4            Big Data

Skills similar to PowerBI and Data Visualization include:

1. Power BI
2. Tableau
3. Data Analysis
4. Big Data


In [55]:
question = "Which persons have similar skills as Daniel Hill?"

In [56]:
ask_to_agent(question)


Which persons have similar skills as Daniel Hill?
Tool Calls:
  person_similarity (call_WgKApcLECjs7NCBgrHGUWTGC)
 Call ID: call_WgKApcLECjs7NCBgrHGUWTGC
  Args:
    person_name: Daniel Hill
Name: person_similarity

      person_1                                           skills_1  \
0  Daniel Hill  [Pandas, System Design, Git, Cypher, Spring Boot]   
1  Daniel Hill  [Pandas, System Design, Git, Cypher, Spring Boot]   
2  Daniel Hill  [Pandas, System Design, Git, Cypher, Spring Boot]   
3  Daniel Hill  [Pandas, System Design, Git, Cypher, Spring Boot]   
4  Daniel Hill  [Pandas, System Design, Git, Cypher, Spring Boot]   

            person_2                                      skills_2     score  
0  Victoria Thompson                     [SQL, API Design, Cypher]  1.925430  
1        John Walker                  [API Design, Django, Python]  1.848923  
2      Thomas Nelson                        [Security, Pandas, Go]  1.000000  
3        Lucy Taylor  [System Design, Tableau, Flask

In [57]:
question = "Which persons have Python and AWS experience?"

In [58]:
ask_to_agent(question)


Which persons have Python and AWS experience?
Tool Calls:
  find_person_based_on_skills (call_8FvUvDPibBhxlKAEerAU6Hk5)
 Call ID: call_8FvUvDPibBhxlKAEerAU6Hk5
  Args:
    skills: [{'name': 'Python'}, {'name': 'AWS'}]
Name: find_person_based_on_skills

                person  score                similar_skills
0          John Taylor      3           [Pandas, AWS, Ruby]
1       Matthew Miller      2                   [AWS, Ruby]
2         John Johnson      2                 [Python, AWS]
3          John Walker      2              [Python, Django]
4           Ryan Young      2  [Python, Cloud Architecture]
5        Sophia Walker      2                 [Django, C++]
6        Charles Jones      2                 [Pandas, AWS]
7  Christopher Jackson      2              [Python, Django]
8          Brian Allen      2                [Pandas, Ruby]
9          John Garcia      2                   [AWS, Ruby]

The following persons have experience with Python and AWS:

1. **John Taylor** - Skil

## Chatbot

Now create a chatbot with the agent providing the responses

In [59]:
def user(user_message, history):
    if history is None:
        history = []
    history.append({"role": "user", "content": user_message})
    return "", history

def get_answer(history):
    steps = []
    full_prompt = "\n".join([f"{msg['role'].capitalize()}: {msg['content']}" for msg in history])
    
    for step in agent_executor.stream(
            {"messages": [HumanMessage(content=full_prompt)]},
            stream_mode="values",
    ):
        step["messages"][-1].pretty_print()
        steps.append(step["messages"][-1].content)
    
    return steps[-1]

def bot(history):
    bot_message = get_answer(history)
    history.append({"role": "assistant", "content": ""})

    for character in bot_message:
        history[-1]["content"] += character
        time.sleep(0.01)
        yield history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(
        label="Chatbot on a Graph",
        avatar_images=[
            "https://png.pngtree.com/png-vector/20220525/ourmid/pngtree-concept-of-facial-animal-avatar-chatbot-dog-chat-machine-illustration-vector-png-image_46652864.jpg",
            "https://d-cb.jc-cdn.com/sites/crackberry.com/files/styles/larger/public/article_images/2023/08/openai-logo.jpg"
        ],
        type="messages", 
    )
    msg = gr.Textbox(label="Message")
    clear = gr.Button("Clear")

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, [chatbot], chatbot
    )

    clear.click(lambda: [], None, chatbot, queue=False)

demo.queue()
demo.launch(share=False)


* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.




To use the chatbot in light mode, append `/?__theme=light` to the URL (for example `http://127.0.0.1:7860/?__theme=light`).
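
The `get_answer` function above does not pass the chat history to the agent as separate messages. It flattens the history into one prompt string. A small stand-alone sketch of that step, so you can see what the agent receives; `flatten_history` is a hypothetical name for the expression used inside `get_answer`:

```python
def flatten_history(history):
    """Join role-tagged messages into a single prompt, one line per message."""
    return "\n".join(f"{msg['role'].capitalize()}: {msg['content']}" for msg in history)

history = [
    {"role": "user", "content": "What skills does Kristof Neys have?"},
    {"role": "assistant", "content": "Kristof Neys has skills in Cypher and Machine Learning."},
    {"role": "user", "content": "Who has similar skills?"},
]
print(flatten_history(history))
```

This keeps the example simple, at the cost of the agent seeing the whole conversation as a single human message.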

## Text2Cypher

If time allows we can still experiment with the Text2Cypher functionality. 

In [60]:
text2cypher_prompt =  PromptTemplate.from_template(
    """
    Task: Generate a Cypher statement for querying a Neo4j graph database from a user input. 
    - Do not include triple backticks ``` or ```cypher or any additional text except the generated Cypher statement in your response.
    - Do not use any properties or relationships not included in the schema.
    
    Schema:
    {schema}
    
    #User Input
    {question}
    
    Cypher query:
    """
)

In [61]:
annotated_schema = """
    Nodes:
      Person:
        description: "A person in our talent pool."
        properties:
          name:
            type: "string"
            description: "The full name of the person. Serves as a unique identifier."
          email:
            type: "string"
            description: "The email address of the person."
          leiden_community:
            type: "integer"
            description: "The talent community for the person.  People in the same talent segment share similar skills."
      Skill:
        description: "A professional skill."
        properties:
          name:
            type: "string"
            description: "The unique name of the skill."
    Relationships:
        KNOWS:
            description: "A person knowing a skill."
            query_pattern: "(:Person)-[:KNOWS]->(:Skill)"
    """

In [62]:
text2cypher_llm = ChatOpenAI(model=LLM, temperature=0)

In [63]:
@tool
def perform_aggregation_query(question: str) -> pd.DataFrame:
    """
    Perform an aggregation query on the Neo4j graph database and return the results.
    """
    prompt = text2cypher_prompt.invoke({'schema': annotated_schema, 'question': question})
    query = text2cypher_llm.invoke(prompt).content
    print(f"executing Cypher query:\n{query}")
    return driver.execute_query(
        query,
        database_=DATABASE,
        routing_=RoutingControl.READ,
        result_transformer_= lambda r: r.to_df()
    )    
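
The prompt asks the model not to wrap the query in triple backticks, but models do not always comply. As a defensive step, the generated text could be cleaned before it is executed. A minimal sketch, assuming the only wrapping to handle is a leading fence (optionally tagged, such as `cypher`) and a trailing fence; `strip_code_fences` is a hypothetical helper, not part of the workshop code:

```python
import re

FENCE = "`" * 3  # built this way so the example itself contains no literal fence

def strip_code_fences(text):
    """Remove a leading fence line (with optional language tag) and a trailing fence."""
    cleaned = text.strip()
    cleaned = re.sub(r"^" + FENCE + r"[a-zA-Z]*\s*\n", "", cleaned)
    cleaned = re.sub(r"\n" + FENCE + r"\s*$", "", cleaned)
    return cleaned.strip()

raw = FENCE + "cypher\nMATCH (p:Person) RETURN count(p) AS people\n" + FENCE
print(strip_code_fences(raw))
```

Inside `perform_aggregation_query`, this would be applied to `query` before calling `driver.execute_query`. It only tidies formatting and does not validate the Cypher itself.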

In [64]:
perform_aggregation_query.invoke({'question': 'describe communities by skills'})


executing Cypher query:
MATCH (p:Person)-[:KNOWS]->(s:Skill)
RETURN s.name AS Skill, collect(DISTINCT p.leiden_community) AS Communities


Unnamed: 0,Skill,Communities
0,API Design,[84]
1,AWS,"[88, 34, 76, 41, 79]"
2,Agile,[76]
3,Angular,"[79, 98, 76, 88, 34, 41]"
4,Azure,"[76, 42, 79, 84]"
5,Big Data,"[88, 49, 34]"
6,Blockchain,"[98, 49]"
7,C++,"[76, 79]"
8,CI/CD,"[49, 34, 42, 88, 98, 41]"
9,CSS3,"[76, 34, 79, 41]"


In [65]:
perform_aggregation_query.invoke({'question': 'how many people share skills with Isabella Allen, and what are the skills'})

executing Cypher query:
MATCH (p1:Person {name: "Isabella Allen"})-[:KNOWS]->(s:Skill)<-[:KNOWS]-(p2:Person)
RETURN COUNT(DISTINCT p2) AS numberOfPeople, COLLECT(DISTINCT s.name) AS skillsSharedWithIsabellaAllen


Unnamed: 0,numberOfPeople,skillsSharedWithIsabellaAllen
0,29,"[Security, Scala, Cloud Architecture]"


In [66]:
perform_aggregation_query.invoke({'question': 'Can you list me a 5 random person name from the database?'})

executing Cypher query:
MATCH (p:Person) 
RETURN p.name 
ORDER BY rand() 
LIMIT 5


Unnamed: 0,p.name
0,Sophia Walker
1,Andrew King
2,Ava White
3,David Lopez
4,Olivia Johnson
