# Professional Services Retreat | GenAI Workshop

# LLM Chat Notebook

This notebook walks through building a simple LLM chat function.

## Imports

In [1]:
import json
import os
import sys

# Make the project root importable from this notebook's directory
sys.path.append("../../")

from dotenv import load_dotenv
from langchain.chains import GraphCypherQAChain
from langchain.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI

from src.ps_genai_agents.agents.graph.text2cypher.types.response import Response as Text2CypherResponse
from src.ps_genai_agents.prompts import create_final_summary_prompt, create_graphqa_chain_cypher_prompt

# load_dotenv() returns True if at least one environment variable was set
if load_dotenv():
    print(".env variables loaded!")
else:
    print("Unable to load .env variables.")

.env variables loaded!


## Graph Connection

The LangChain `Neo4jGraph` class will be used to connect to our Aura instance. It will be used to gather the graph schema and read from the database.

In [2]:
graph = Neo4jGraph(
    url=os.environ.get("IQS_NEO4J_URI"),
    username=os.environ.get("IQS_NEO4J_USERNAME"),
    password=os.environ.get("IQS_NEO4J_PASSWORD"),
    refresh_schema=True,
)
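
Because `os.environ.get` returns `None` for an unset variable, a missing credential would only surface later as a connection error. The following is a minimal sketch of a check that could run before connecting, assuming the three variable names used above. `missing_env_vars` is a hypothetical helper, not part of `ps-genai-agents` or LangChain.

```python
import os
from typing import List, Mapping, Sequence

# Variable names taken from the Neo4jGraph call above
REQUIRED_VARS = ("IQS_NEO4J_URI", "IQS_NEO4J_USERNAME", "IQS_NEO4J_PASSWORD")


def missing_env_vars(names: Sequence[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in `env` (hypothetical helper)."""
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary.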

## Prompt Creation

The `ps-genai-agents` project provides helper functions for creating Text2Cypher prompts. Since we are using LangChain's implementation of Text2Cypher, we only need to provide the path to our YAML file of query examples.

In [3]:
cypher_prompt = create_graphqa_chain_cypher_prompt(examples_yaml_path="../../data/iqs/queries/queries.yml")
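
To give a feel for what a few-shot Text2Cypher prompt contains, here is a rough sketch that renders question/Cypher pairs into prompt text. It does not show how `create_graphqa_chain_cypher_prompt` works internally. The layout of `queries.yml` is not shown in this notebook, so the `question` and `cypher` keys, the example entry, and the function names `format_examples` and `build_prompt` are all assumptions made for illustration.

```python
from typing import Dict, List

# Illustrative stand-in for parsed example data; the real file layout may differ.
EXAMPLES: List[Dict[str, str]] = [
    {
        "question": "How many verbatims are there for the Honda Civic?",
        "cypher": 'MATCH (v:Verbatim {make: "Honda", model: "Civic"}) RETURN count(v) AS total',
    },
]


def format_examples(examples: List[Dict[str, str]]) -> str:
    """Render each question/Cypher pair as a short text block."""
    blocks = [f"Question: {ex['question']}\nCypher: {ex['cypher']}" for ex in examples]
    return "\n\n".join(blocks)


def build_prompt(schema: str, examples: List[Dict[str, str]], question: str) -> str:
    """Assemble schema, examples and the user question into one prompt string."""
    return (
        "Generate a Cypher statement for the question, using only the schema.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Examples:\n{format_examples(examples)}\n\n"
        f"Question: {question}\nCypher:"
    )


print(build_prompt("(:Vehicle)-[:HAS_VERBATIM]->(:Verbatim)", EXAMPLES, "How many vehicles are there?"))
```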

## LLM Connection

We will use OpenAI LLMs for this workshop. You can try Text2Cypher with any LLM, but more recent LLMs will likely perform much better. Feel free to test older models such as `gpt-3.5` and compare results.

In [4]:
llm = ChatOpenAI(model="gpt-4o")

## Text2Cypher

We will use LangChain's [`GraphCypherQAChain`](https://python.langchain.com/v0.1/docs/integrations/graphs/neo4j_cypher/) to handle our Text2Cypher workflow. This Chain class automatically retrieves the current graph schema behind the scenes, and it can also check relationship directions in the generated Cypher when `validate_cypher=True` is passed to `from_llm`.

A [chain](https://python.langchain.com/v0.1/docs/modules/chains/) refers to a sequence of calls. In this case, the calls are graph schema retrieval, Cypher generation, and querying Neo4j.
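
The idea of a chain as a sequence of calls can be sketched in plain Python. This is not LangChain's implementation: `run_text2cypher` and the three stub callables are hypothetical, and the stubs return placeholder values rather than real database results. The returned dictionary mirrors the keys that appear in the chain output later in this notebook.

```python
from typing import Any, Callable, Dict, List


def run_text2cypher(
    question: str,
    get_schema: Callable[[], str],
    generate_cypher: Callable[[str, str], str],
    run_query: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Run the three steps in order and collect their outputs."""
    schema = get_schema()                       # 1. graph schema retrieval
    cypher = generate_cypher(question, schema)  # 2. Cypher generation (an LLM call in practice)
    rows = run_query(cypher)                    # 3. querying Neo4j
    return {
        "query": question,
        "result": rows,
        "intermediate_steps": [{"query": cypher}],
    }


# Stubs standing in for the graph and the LLM; the row below is a placeholder.
out = run_text2cypher(
    "How many customers are there?",
    get_schema=lambda: "(:Customer)-[:SUBMITTED]->(:Verbatim)",
    generate_cypher=lambda q, s: "MATCH (c:Customer) RETURN count(c) AS total",
    run_query=lambda c: [{"total": 0}],
)
print(out["intermediate_steps"][0]["query"])
```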

In [5]:
chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    cypher_prompt=cypher_prompt,
    verbose=True,
    return_direct=True,
    return_intermediate_steps=True,
)

We can see how the schema is formatted with the `GraphCypherQAChain` property `graph_schema`.

In [6]:
print(chain.graph_schema)

Node properties are the following:
Customer {id: STRING, ageBucket: STRING, gender: STRING},Category {id: STRING},Problem {id: STRING, problem: STRING},Question {id: INTEGER, question: STRING},Vehicle {id: STRING, totalProblems: INTEGER},Verbatim {id: STRING, verbatim: STRING, verbatimText: STRING, ageBucket: STRING, severity: FLOAT, gender: STRING, make: STRING, model: STRING, minAge: INTEGER, maxAge: INTEGER, adaEmbedding: LIST, titanEmbedding: LIST}
Relationship properties are the following:

The relationships are the following:
(:Customer)-[:SUBMITTED]->(:Verbatim),(:Problem)-[:HAS_CATEGORY]->(:Category),(:Question)-[:HAS_PROBLEM]->(:Problem),(:Vehicle)-[:HAS_CATEGORY]->(:Category),(:Vehicle)-[:HAS_VERBATIM]->(:Verbatim),(:Verbatim)-[:HAS_CATEGORY]->(:Category),(:Verbatim)-[:HAS_PROBLEM]->(:Problem),(:Verbatim)-[:HAS_QUESTION]->(:Question)
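
The relationship line in this schema string is compact but hard to scan. As a small aid, the sketch below splits it into `(source, type, target)` triples with a regular expression. `parse_relationships` is a hypothetical helper that assumes the exact `(:A)-[:REL]->(:B)` formatting printed above.

```python
import re
from typing import List, Tuple

# Matches patterns of the form (:Source)-[:TYPE]->(:Target)
REL_PATTERN = re.compile(r"\(:(\w+)\)-\[:(\w+)\]->\(:(\w+)\)")


def parse_relationships(line: str) -> List[Tuple[str, str, str]]:
    """Extract (source label, relationship type, target label) triples."""
    return REL_PATTERN.findall(line)


rels = parse_relationships(
    "(:Customer)-[:SUBMITTED]->(:Verbatim),(:Problem)-[:HAS_CATEGORY]->(:Category)"
)
for source, rel_type, target in rels:
    print(f"{source} -{rel_type}-> {target}")
```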


## Chat Function

Here we define a simple chat function to make our lives easier.

This function will:
* Generate Cypher from the user question
* Query the Neo4j database
* Summarize the query results
* Return a Response object containing call information and results

In [7]:
def chat(question: str) -> Text2CypherResponse:
    # Generate Cypher and retrieve the results from Neo4j
    r = chain.invoke({"query": question})
    print(r)

    # Summarize the results with the LLM
    summary_prompt = create_final_summary_prompt(
        tool_execution_result=json.dumps(r["result"]), question=r["query"]
    )
    summary = llm.invoke(summary_prompt)

    return Text2CypherResponse(
        question=question,
        answer=summary.content,
        cypher=[r["intermediate_steps"][0]["query"]],
        cypher_result=[r["result"]],
    )
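
The real `Text2CypherResponse` class lives in `ps-genai-agents` and is not shown here. As a rough picture of the kind of object `chat` returns, the sketch below uses a hypothetical stand-in, `SketchResponse`, with the same four fields passed above and a `render` method that loosely follows the sections printed by `response.display()` later in this notebook. The actual class may be implemented quite differently.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class SketchResponse:
    """Hypothetical stand-in holding the same fields that `chat` populates."""

    question: str
    answer: str
    cypher: List[str] = field(default_factory=list)
    cypher_result: List[Any] = field(default_factory=list)

    def render(self) -> str:
        """Build a plain-text report with one section per field."""
        lines = [
            "Question:", self.question, "",
            "Cypher:", *self.cypher, "",
            "Cypher Result:", str(self.cypher_result), "",
            "Final Response:", self.answer,
        ]
        return "\n".join(lines)


demo = SketchResponse(
    question="example question",
    answer="example answer",
    cypher=["MATCH (n) RETURN n"],
    cypher_result=[[{"n": 1}]],
)
print(demo.render())
```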

## Questions

### Please summarize the verbatims for 2023 RDX for question 010 Trunk/TG Touch-Free Sensor DTU and create categories for the problems. As an output, I want the summary, corresponding categories and their verbatims

In [8]:
response = chat("Please summarize the verbatims for 2023 RDX for question 010 Trunk/TG Touch-Free Sensor DTU and create categories for the problems. As an output, I want the summary, corresponding categories and their verbatims")

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (q:Question {id: 10})<-[:HAS_QUESTION]-(v:Verbatim {model: "RDX"})
WITH v
RETURN v.verbatim AS verbatim, v.make AS make, v.model AS model, v.ageBucket AS ageBucket, v.severity AS severity, v.gender AS gender, v.verbatimText AS verbatimText

> Finished chain.
{'query': 'Please summarize the verbatims for 2023 RDX for question 010 Trunk/TG Touch-Free Sensor DTU and create categories for the problems. As an output, I want the summary, corresponding categories and their verbatims', 'result': [{'verbatim': 'when I move my foot under sensor sometimes the tailgate opens and sometimes it does not. It seems to be hit or miss.', 'make': 'Acura', 'model': 'RDX', 'ageBucket': '65-69', 'severity': 2.0, 'gender': 'Male', 'verbatimText': "acura rdx exterior ext10: trunk/hatch/tailgate - touch-free sensor doesn't work consistently/dtu #010 trunk/tg touch-free sensor dtu when i move my foot under sensor sometimes the tailgate opens and sometimes it doe

In [9]:
response.display()


Question:
Please summarize the verbatims for 2023 RDX for question 010 Trunk/TG Touch-Free Sensor DTU and create categories for the problems. As an output, I want the summary, corresponding categories and their verbatims

Cypher:

MATCH (q:Question {id: 10})<-[:HAS_QUESTION]-(v:Verbatim {model: "RDX"})
WITH v
RETURN v.verbatim AS verbatim, v.make AS make, v.model AS model, v.ageBucket AS ageBucket, v.severity AS severity, v.gender AS gender, v.verbatimText AS verbatimText




Cypher Result:
[[{'verbatim': 'when I move my foot under sensor sometimes the tailgate opens and sometimes it does not. It seems to be hit or miss.', 'make': 'Acura', 'model': 'RDX', 'ageBucket': '65-69', 'severity': 2.0, 'gender': 'Male', 'verbatimText': "acura rdx exterior ext10: trunk/hatch/tailgate - touch-free sensor doesn't work consistently/dtu #010 trunk/tg touch-free sensor dtu when i move my foot under sensor sometimes the tailgate opens and sometimes it does not. it seems to be hit or miss."}, {'verbat

### What are the top 5 problems about seats for each age buckets for men over the age of 53?

In [10]:
response = chat("What are the top 5 problems about seats for each age buckets for men over the age of 53?")

> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (p:Problem {id: "SEAT23"})<-[:HAS_PROBLEM]-(v:Verbatim)
WHERE v.minAge > 53 AND v.gender = "Male" AND v.verbatimText CONTAINS 'seat'
WITH v.ageBucket AS ageBucket, p.problem AS problem, COLLECT(v.verbatim) AS responses
WITH ageBucket, problem, SIZE(responses) AS total, responses
WITH * ORDER BY ageBucket, total DESC
WITH ageBucket, COLLECT(problem) AS problems, COLLECT(total) AS totals, COLLECT(responses) AS responsesList
RETURN ageBucket, problems[..5] AS problem, totals[..5] AS total, responsesList[..5] AS responses
LIMIT 5

> Finished chain.
{'query': 'What are the top 5 problems about seats for each age buckets for men over the age of 53?', 'result': [{'ageBucket': '55-59', 'problem': ['SEAT23: Seat materials scuff/soil easily'], 'total': [2], 'responses': [['The seats pick up dirt easily, and are somewhat difficult to keep clean.', 'The alcantara front seating material in 

In [11]:
response.display()


Question:
What are the top 5 problems about seats for each age buckets for men over the age of 53?

Cypher:
cypher
MATCH (p:Problem {id: "SEAT23"})<-[:HAS_PROBLEM]-(v:Verbatim)
WHERE v.minAge > 53 AND v.gender = "Male" AND v.verbatimText CONTAINS 'seat'
WITH v.ageBucket AS ageBucket, p.problem AS problem, COLLECT(v.verbatim) AS responses
WITH ageBucket, problem, SIZE(responses) AS total, responses
WITH * ORDER BY ageBucket, total DESC
WITH ageBucket, COLLECT(problem) AS problems, COLLECT(total) AS totals, COLLECT(responses) AS responsesList
RETURN ageBucket, problems[..5] AS problem, totals[..5] AS total, responsesList[..5] AS responses
LIMIT 5




Cypher Result:
[[{'ageBucket': '55-59', 'problem': ['SEAT23: Seat materials scuff/soil easily'], 'total': [2], 'responses': [['The seats pick up dirt easily, and are somewhat difficult to keep clean.', 'The alcantara front seating material in the beige is easily soiled.']]}, {'ageBucket': '60-64', 'problem': ['SEAT23: Seat materials scuff/s

### What are the total responses under seat23 for honda civic, what is the male to female proportion for these responses and what is the problem for seat23?

In [12]:
response = chat("What are the total responses under seat23 for honda civic, what is the male to female proportion for these responses and what is the problem for seat23?")

> Entering new GraphCypherQAChain chain...
Generated Cypher:
cypher
MATCH (p:Problem {id: "SEAT23"})<-[:HAS_PROBLEM]-(v:Verbatim {make: "Honda", model: "Civic"})
WITH p.problem AS problem, COUNT(v) AS totalResponses,
     SUM(CASE WHEN v.gender = "Male" THEN 1 ELSE 0 END) AS males,
     SUM(CASE WHEN v.gender = "Female" THEN 1 ELSE 0 END) AS females
RETURN totalResponses, males, females, toFloat(males) / (CASE WHEN females = 0 THEN 1 ELSE females END) AS maleToFemaleRatio, problem

> Finished chain.
{'query': 'What are the total responses under seat23 for honda civic, what is the male to female proportion for these responses and what is the problem for seat23?', 'result': [{'totalResponses': 10, 'males': 7, 'females': 3, 'maleToFemaleRatio': 2.3333333333333335, 'problem': 'SEAT23: Seat materials scuff/soil easily'}], 'intermediate_steps': [{'query': 'cypher\nMATCH (p:Problem {id: "SEAT23"})<-[:HAS_PROBLEM]-(v:Verbatim {make: "Honda", model: "Civic"}

In [13]:
response.display()


Question:
What are the total responses under seat23 for honda civic, what is the male to female proportion for these responses and what is the problem for seat23?

Cypher:
cypher
MATCH (p:Problem {id: "SEAT23"})<-[:HAS_PROBLEM]-(v:Verbatim {make: "Honda", model: "Civic"})
WITH p.problem AS problem, COUNT(v) AS totalResponses, 
     SUM(CASE WHEN v.gender = "Male" THEN 1 ELSE 0 END) AS males,
     SUM(CASE WHEN v.gender = "Female" THEN 1 ELSE 0 END) AS females
RETURN totalResponses, males, females, toFloat(males) / (CASE WHEN females = 0 THEN 1 ELSE females END) AS maleToFemaleRatio, problem




Cypher Result:
[[{'totalResponses': 10, 'males': 7, 'females': 3, 'maleToFemaleRatio': 2.3333333333333335, 'problem': 'SEAT23: Seat materials scuff/soil easily'}]]
            
Final Response:
Total responses: 10  
Male to Female Ratio: 2.33  
Problem: SEAT23: Seat materials scuff/soil easily
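
The figures in this result can be checked by hand. The question asked for a proportion, while the generated Cypher returned a male-to-female ratio, so the short sketch below computes both from the returned counts (7 males, 3 females). `gender_breakdown` is a hypothetical helper. Its zero guard mirrors the `CASE WHEN females = 0 THEN 1 ELSE females END` expression in the query, and the guard on the total is an added assumption for empty results.

```python
from typing import Dict


def gender_breakdown(males: int, females: int) -> Dict[str, float]:
    """Compute the male-to-female ratio and each gender's share of the total."""
    total = males + females
    return {
        "total": total,
        # Same guard as the Cypher query: divide by 1 when there are no females
        "male_to_female_ratio": males / (females if females else 1),
        # Assumption: report 0.0 shares when there are no responses at all
        "male_share": males / total if total else 0.0,
        "female_share": females / total if total else 0.0,
    }


stats = gender_breakdown(males=7, females=3)
print(f"Ratio: {stats['male_to_female_ratio']:.2f}")  # Ratio: 2.33
print(f"Male share: {stats['male_share']:.0%}")       # Male share: 70%
```

A ratio of 2.33 and a male share of 70% describe the same counts, so either form is consistent with the 10 total responses.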
        
