**Table of contents**<a id='toc0_'></a>    
- [LangGraph YoMap Demo using Neo4j](#toc1_)    
- [GraphQACypherChain](#toc2_)    
- [Advanced implementation with LangGraph](#toc3_)    
  - [Tools](#toc3_1_)    
    - [fetch_profile_information_from_neo4j](#toc3_1_1_)    
    - [get_service_provider_from_neo4j](#toc3_1_2_)    
  - [Question Validator](#toc3_2_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

# <a id='toc1_'></a>[LangGraph YoMap Demo using Neo4j](#toc0_)

In [149]:
import os

from neo4j import GraphDatabase

from langchain_neo4j import Neo4jGraph
from langchain_neo4j import GraphCypherQAChain
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings

from langgraph.graph.message import add_messages
from langgraph.graph import END, StateGraph, START
from langchain_core.tools import tool

import pandas as pd


In [48]:
NEO4J_URI = "neo4j+s://yomap-neo-dev.fly.dev:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "5GTAmrydyizxgRcY"

In [160]:
# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

In [49]:
os.environ["NEO4J_URI"] = NEO4J_URI
os.environ["NEO4J_USERNAME"] = NEO4J_USERNAME
os.environ["NEO4J_PASSWORD"] = NEO4J_PASSWORD

In [52]:
graph = Neo4jGraph()

In [53]:
graph.refresh_schema()
print(graph.schema)

Node properties:
Profile {address: STRING, about: STRING, profile_embedding: LIST, userId: STRING, location: POINT, tags: LIST, service: STRING, displayName: STRING, gender: STRING, photo: STRING, profileId: STRING, age: INTEGER}
Tag {name: STRING}
Relationship properties:

The relationships:
(:Profile)-[:HAS_TAG]->(:Tag)
(:Profile)-[:CONNECTS]->(:Profile)


In [54]:
enhanced_graph = Neo4jGraph(enhanced_schema=True)
print(enhanced_graph.schema)



Node properties:
- **Profile**
  - `address`: STRING Example: "Panamá, , Panamá, Panama"
  - `about`: STRING Example: "I'm a user-focused professional with expertise in "
  - `userId`: STRING Example: "DGnrQapGUTPhGJ6Noq2GuuDCzM82"
  - `location`: POINT 
  - `tags`: LIST Min Size: 1, Max Size: 6
  - `service`: STRING Available options: ['user', 'misc', 'home', 'repair', 'health', 'food', 'pets', 'spa', 'transport', 'education']
  - `displayName`: STRING Example: "Arizza"
  - `gender`: STRING Available options: ['', 'female', 'male', 'other', 'lgbtiplus']
  - `photo`: STRING Example: ""
  - `profileId`: STRING Example: "32z0evf2wJBkKNMaZYql"
  - `age`: INTEGER Min: 15, Max: 1011977
- **Tag**
  - `name`: STRING Example: "pets"
Relationship properties:

The relationships:
(:Profile)-[:HAS_TAG]->(:Tag)
(:Profile)-[:CONNECTS]->(:Profile)


# <a id='toc2_'></a>[GraphQACypherChain](#toc0_)

In [56]:
llm = ChatOpenAI(model="gpt-4o", temperature=0)
chain = GraphCypherQAChain.from_llm(
    graph=enhanced_graph, llm=llm, verbose=True, allow_dangerous_requests=True
)

In [58]:
response = chain.invoke({"query": "Can you searc by profiles with tag spa"})
response



[1m> Entering new GraphCypherQAChain chain...[0m
Generated Cypher:
[32;1m[1;3mcypher
MATCH (p:Profile)-[:HAS_TAG]->(t:Tag {name: "spa"})
RETURN p
[0m
Full Context:
[32;1m[1;3m[{'p': {'address': '132, Calle C, San Miguelito, Provincia de Panamá, Panama', 'gender': 'female', 'service': 'spa', 'displayName': 'Emulator 66980917', 'profileId': 'xGilUU4gHHdww4v3UWC7', 'about': "I'm a skilled and experienced professional specializing in spa treatments and manicures/pedicures (manos y pies).  My expertise lies in providing relaxing and rejuvenating experiences that leave clients feeling pampered and refreshed. I'm passionate about creating a tranquil atmosphere and delivering exceptional results.\n", 'photo': '', 'location': POINT(-79.5009471103549 9.03795683773441), 'userId': 'ogey2VejTNQ4q3HnIeQ5p5veipo1', 'age': 32, 'tags': ['spa', 'manos_y_pies'], 'profile_embedding': [-0.015384565107524395, 0.013041668571531773, 0.015411650761961937, -0.025203058496117592, 0.0013475037412717938, 

{'query': 'Can you searc by profiles with tag spa',
 'result': 'Emulator 66980917, Angela, Readet Ferche, and YM Soporte stage have profiles with the tag spa.'}

# <a id='toc3_'></a>[Advanced implementation with LangGraph](#toc0_)

In [61]:
from operator import add
from typing import Annotated, List

from typing_extensions import TypedDict


class InputState(TypedDict):
    question: str


class OverallState(TypedDict):
    question: str
    next_action: str
    cypher_statement: str
    cypher_errors: List[str]
    database_records: List[dict]
    steps: Annotated[List[str], add]


class OutputState(TypedDict):
    answer: str
    steps: List[str]
    cypher_statement: str


# LangGraph Utils
class State(TypedDict):
    messages: Annotated[list, add_messages]



## <a id='toc3_1_'></a>[Tools](#toc0_)

### <a id='toc3_1_1_'></a>[fetch_profile_information_from_neo4j](#toc0_)

In [161]:
@tool
def fetch_profile_information_from_neo4j(userId):
    """Fetch the current user info based on the displayName of the user.
    Use this tools at the begining to know the user information and
    interact with him/her using the real name.
    """
    
    profile_query = """
        MATCH (p:Profile {userId: $userId})
        RETURN 
            p.displayName AS displayName, 
            p.age AS age, 
            p.gender AS gender, 
            p.about AS about, 
            p.tags AS tags, 
            p.address AS address, 
            p.profileId AS profileId, 
            p.location AS location
        LIMIT 1;
    """

    with driver.session() as session:
        result = session.run(
            profile_query,
            userId=userId
        )

        # Print results
        return result.single()
    
    driver.close()

In [166]:
result = fetch_profile_information_from_neo4j("xG1lCeVE7EgLwXVEfPciGGWaDLn1")

In [167]:
dict(result)

{'displayName': 'Noel 2',
 'age': 44,
 'gender': 'male',
 'about': "I'm a passionate food professional with expertise in [mention a specific area within food, e.g., recipe development, culinary management, food writing]. My skills encompass [mention 1-2 relevant skills, e.g., menu creation, ingredient sourcing, food styling], enabling me to deliver high-quality results across diverse food-related projects.\n",
 'tags': ['food'],
 'address': '10506, Calle 74 Este, Panamá, Provincia de Panamá, Panamá',
 'profileId': 'oH6kfEwBhELwINiidDwJ',
 'location': POINT(-79.5046233 8.9907664)}

### <a id='toc3_1_2_'></a>[get_service_provider_from_neo4j](#toc0_)

In [203]:
@tool
def get_service_provider_from_neo4j(userId):
    """Get all the service providers based on the tag
    or the name of the category. Use this tool each time the user
    request some information about a particular service.

    Example: if the user ask about spa, then you use this tool with tag = spa
    """

    embeddings = OpenAIEmbeddings()
    
    # Convert question to embedding using OpenAI
    question_embedding = embeddings.embed_query("cerrajeria automotriz")

    tag = "cerrajeria"

    query = """
        MATCH (user:Profile {userId: $userId})
        MATCH (provider:Profile)-[:HAS_TAG]->(:Tag {name: $tag})
        WHERE point.distance(provider.location, user.location) / 1000 < 30

        OPTIONAL MATCH path = shortestPath((user)-[:CONNECTS*1..5]-(provider))

        WITH
            gds.similarity.cosine(user.profile_embedding, $question_embedding) AS similarity,
            provider,
            path,
            point.distance(provider.location, user.location) / 1000 AS distance_km,
            CASE 
                WHEN path IS NOT NULL THEN length(path) 
                ELSE -1
            END AS path_length

        RETURN 
            provider.displayName AS provider_name,
            distance_km,
            similarity,
            path_length,
            path

        ORDER BY
            similarity DESC,
            distance_km ASC, 
            path_length ASC
        LIMIT 5;
    """

    with driver.session() as session:
        result = session.run(
            query,
            userId=userId,
            tag=tag,
            question_embedding=question_embedding
        )

        # Print results
        for record in result:
            print(f"Provider: {record['provider_name']} | Similarity: {record['similarity']:.3f} | Distance: {record['distance_km']:.2f} km | Path: {record['path_length']:.0f}")
    
    driver.close()


In [209]:
result =  get_service_provider_from_neo4j("MHVYdqsQwLPG6VxmNc3a3pWOqw82")

  with driver.session() as session:


Provider: HomeSecurePTY | Similarity: 0.720 | Distance: 2.33 km | Path: 1


In [197]:
result

## <a id='toc3_2_'></a>[Question Validator](#toc0_)

In [213]:
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel('BAAI/bge-m3',  
                       use_fp16=True) # Setting use_fp16 to True speeds up computation with a slight performance degradation

sentences_1 = ["Quiero dar servicios de plomeria en la ciudad de panama"]
sentences_2 = ["Home/Plumber", 
               "Hogar/Plomero",
               "Reparaciones/Hidraulicas"]

embeddings_1 = model.encode(sentences_1, 
                            batch_size=12, 
                            max_length=8192, # If you don't need such a long length, you can set a smaller value to speed up the encoding process.
                            )['dense_vecs']
embeddings_2 = model.encode(sentences_2)['dense_vecs']
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

Fetching 30 files:   0%|          | 0/30 [00:00<?, ?it/s]

You're using a XLMRobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


[[0.4949 0.4292 0.4514]]
