# Putting it all together with Neo4j

In this section we put everything from the previous sections into practice by building an LLM agent that answers user questions about a hospital system. The agent draws on two data sources: a vectorstore holding user reviews of the hospitals, and a Neo4j graph database containing information about the hospitals, visits, doctors, payments, etc. The agent routes each user query to the appropriate data source and then answers the question. Let's begin!

In [15]:
%load_ext dotenv
%dotenv secrets/secrets.env

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv


In [7]:
import os
from langchain_community.document_loaders import CSVLoader
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

os.environ["LANGCHAIN_PROJECT"] = "hospital-system"

In [8]:
# Load each CSV row as a document; the review text is stored as the `source` metadata
loader = CSVLoader(file_path="data/reviews.csv", source_column="review")
documents = loader.load()

# Split text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
text_chunks = text_splitter.split_documents(documents)

# Embed the chunks and persist the index to disk
vectorstore = Chroma.from_documents(
    documents=text_chunks,
    embedding=OpenAIEmbeddings(),
    persist_directory="data/vectorstore",
)

retriever = vectorstore.as_retriever()

In [9]:
retriever.invoke("Patient safety")

[Document(page_content="review_id: 678\nvisit_id: 9840\nreview: The hospital's commitment to patient safety was evident in their strict adherence to hygiene protocols. I felt secure throughout my stay, and the staff's dedication to cleanliness did not go unnoticed.\nphysician_name: Kristopher Wiley Jr.\nhospital_name: Jones, Brown and Murray\npatient_name: Jason Shepard", metadata={'row': 197, 'source': "The hospital's commitment to patient safety was evident in their strict adherence to hygiene protocols. I felt secure throughout my stay, and the staff's dedication to cleanliness did not go unnoticed."}),
 Document(page_content="review_id: 678\nvisit_id: 9840\nreview: The hospital's commitment to patient safety was evident in their strict adherence to hygiene protocols. I felt secure throughout my stay, and the staff's dedication to cleanliness did not go unnoticed.\nphysician_name: Kristopher Wiley Jr.\nhospital_name: Jones, Brown and Murray\npatient_name: Jason Shepard", metadata={'
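
The two results shown above are identical. One plausible cause, which is an assumption rather than something the output proves, is that the indexing cell was run more than once against the same `persist_directory`, so the same reviews were stored repeatedly. A minimal sketch of one workaround, filtering duplicates after retrieval, follows. `Doc` is a hypothetical stand-in for LangChain's `Document` so the snippet runs on its own, and `dedupe_documents` is an illustrative helper, not a library function.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def dedupe_documents(docs):
    """Return docs with repeated page_content removed, keeping first-seen order."""
    seen = set()
    unique = []
    for doc in docs:
        if doc.page_content not in seen:
            seen.add(doc.page_content)
            unique.append(doc)
    return unique


results = [Doc("review A"), Doc("review A"), Doc("review B")]
unique_results = dedupe_documents(results)
print([d.page_content for d in unique_results])
```

In the notebook this could be applied as `dedupe_documents(retriever.invoke("Patient safety"))`. Rebuilding the vectorstore from an empty directory would address the suspected cause rather than the symptom.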

In [12]:
question = "How is the patient safety in the hospital?"


chat_prompt = hub.pull('rlm/rag-prompt')

llm = ChatOpenAI(model="gpt-4", temperature=0)

review_chain = chat_prompt | llm | StrOutputParser()

docs = retriever.invoke(question)
review_chain.invoke({'question': question, 'context': docs})

'The patient safety in the hospital is reportedly high. The hospital shows a strong commitment to patient safety, evident in their strict adherence to hygiene protocols. Patients have expressed feeling secure throughout their stay and have confidence in the cleanliness of the facilities.'
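
The chain above receives the retrieved `Document` objects directly as `context`, so the prompt is filled with their string representation, metadata included. A common alternative is to pass only the review text. The sketch below shows one way to do that; `format_docs` is a helper name chosen here for illustration, and the sample objects are hypothetical stand-ins for retrieved documents.

```python
from types import SimpleNamespace


def format_docs(docs):
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


# Hypothetical stand-ins for retrieved Document objects.
sample_docs = [
    SimpleNamespace(page_content="review: Staff were attentive."),
    SimpleNamespace(page_content="review: The ward was very clean."),
]
context_text = format_docs(sample_docs)
print(context_text)
```

With this helper, the call would become `review_chain.invoke({'question': question, 'context': format_docs(docs)})`.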

In [14]:
from langchain.pydantic_v1 import BaseModel, Field
from langchain.prompts import ChatPromptTemplate
from typing import Literal


class QueryRouter(BaseModel):
    
    """Routes the question to either the vectorstore or the graph database"""
    
    datasource: Literal['vectorstore', 'graph', 'fallback'] = Field(
        ...,
        description=(
            "The datasource to use for answering the user question. "
            "If the user question can be answered using the reviews about the hospital, "
            "the datasource should be set to 'vectorstore'. "
            "If the question should be answered using the information from a database "
            "containing information about hospitals that a company manages, "
            "the datasource should be set to 'graph'. "
            "If the question can be answered using LLM's internal knowledge, "
            "the datasource should be set to 'fallback'."
        ),
    )
    

query_llm = llm.with_structured_output(QueryRouter)

query_router_prompt = ChatPromptTemplate.from_template(
    """You are an expert at routing a user question to a vectorstore or to a graph database containing information from a hospital system. The vectorstore contains documents related to the user reviews of a hospital.
Use the vectorstore for questions that can be answered using people's opinions on the hospital. Otherwise, use graph to answer questions using the graph database containing information from a company database that manages several hospitals. If the question can be answered using the LLM's internal knowledge, use fallback.\n\n
Question: {question}"""
)

query_routing_chain = (query_router_prompt | query_llm)

print(f"{question}: {query_routing_chain.invoke({'question': question})}")
system_q = "What are the specialities of all the doctors in all the hospitals?"
print(f"{system_q}: {query_routing_chain.invoke({'question': system_q})}")
fallback_q = "What is the capital of France?"
print(f"{fallback_q}: {query_routing_chain.invoke({'question': fallback_q})}")
    

How is the patient safety in the hospital?: datasource='vectorstore'
What are the specialities of all the doctors in all the hospitals?: datasource='graph'
What is the capital of France?: datasource='fallback'
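
Once the router returns a `datasource`, the agent still has to call the matching component. The sketch below shows one possible dispatch step. The three handler functions are hypothetical placeholders: in the notebook they would wrap the review chain, a Neo4j query chain, and a plain LLM call respectively.

```python
from typing import Callable, Dict


def answer_from_reviews(question: str) -> str:
    # Placeholder: would retrieve reviews and run the review chain.
    return f"[vectorstore] {question}"


def answer_from_graph(question: str) -> str:
    # Placeholder: would query the Neo4j graph database.
    return f"[graph] {question}"


def answer_from_llm(question: str) -> str:
    # Placeholder: would ask the LLM directly.
    return f"[fallback] {question}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "vectorstore": answer_from_reviews,
    "graph": answer_from_graph,
    "fallback": answer_from_llm,
}


def dispatch(datasource: str, question: str) -> str:
    """Call the handler registered for the datasource chosen by the router."""
    try:
        handler = HANDLERS[datasource]
    except KeyError:
        raise ValueError(f"Unknown datasource: {datasource!r}") from None
    return handler(question)


print(dispatch("graph", "Which hospitals are in Texas?"))
```

With the router above, the first argument would be `query_routing_chain.invoke({'question': question}).datasource`.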


In [21]:
from neo4j import GraphDatabase

NODES = ["Hospital", "Payer", "Physician", "Patient", "Visit", "Review"]

def _set_uniqueness_constraints(tx, node):
    query = f"""CREATE CONSTRAINT IF NOT EXISTS FOR (n:{node})
        REQUIRE n.id IS UNIQUE;"""
    _ = tx.run(query, {})
    

driver = GraphDatabase.driver(
    os.getenv('NEO4J_URI'),
    auth=(os.getenv('NEO4J_USERNAME'), os.getenv('NEO4J_PASSWORD'))
)
with driver.session(database="neo4j") as session:
    for node in NODES:
        session.execute_write(_set_uniqueness_constraints, node)
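
The constraint query above writes the node label directly into the Cypher text with an f-string. That is safe here because `NODES` is a fixed list, but if labels ever came from elsewhere it would be worth validating them first. A minimal sketch, assuming labels should be plain identifiers; `build_constraint_query` is a hypothetical helper, not part of the Neo4j driver.

```python
import re

# Assumption: a valid label is a simple identifier (letters, digits, underscore).
_LABEL_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_constraint_query(label: str) -> str:
    """Build the uniqueness-constraint query after validating the label."""
    if not _LABEL_PATTERN.fullmatch(label):
        raise ValueError(f"Invalid node label: {label!r}")
    return (
        f"CREATE CONSTRAINT IF NOT EXISTS FOR (n:{label}) "
        "REQUIRE n.id IS UNIQUE"
    )


print(build_constraint_query("Hospital"))
```

The returned string could be passed to `tx.run` in place of the inline f-string.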
    


In [24]:
with driver.session(database="neo4j") as session:
    query = """
    LOAD CSV WITH HEADERS
    FROM 'https://drive.google.com/file/d/1Etm7rjCAg7jX-ATfC7PtRI4zebOq0Cfu/view?usp=sharing' AS hospitals
    MERGE (h:Hospital {id: toInteger(hospitals.hospital_id),
                       name: hospitals.hospital_name,
                       state_name: hospitals.hospital_state});
    """
    _ = session.run(query, {})

DatabaseError: {code: Neo.DatabaseError.Statement.ExecutionFailed} {message: At https://drive.google.com/file/d/1Etm7rjCAg7jX-ATfC7PtRI4zebOq0Cfu/view?usp=sharing @ position 1978 -  there's a field starting with a quote and whereas it ends that quote there seems to be characters in that field after that ending quote. That isn't supported. This is what I read: 'nQyAE":'}
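
The error above is most likely caused by the URL: a Google Drive `/file/d/<id>/view` link serves the HTML preview page rather than the raw CSV, so `LOAD CSV` ends up parsing page markup. One commonly used workaround is the `uc?export=download&id=<id>` form of the link. The sketch below converts a share link to that form. `drive_direct_download_url` is a hypothetical helper, and the direct link is assumed to work only for publicly shared files small enough to skip Drive's download confirmation page.

```python
import re


def drive_direct_download_url(share_url: str) -> str:
    """Convert a Drive '/file/d/<id>/view' link to a direct-download link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"No Drive file id found in: {share_url!r}")
    return f"https://drive.google.com/uc?export=download&id={match.group(1)}"


share_url = (
    "https://drive.google.com/file/d/"
    "1Etm7rjCAg7jX-ATfC7PtRI4zebOq0Cfu/view?usp=sharing"
)
csv_url = drive_direct_download_url(share_url)
print(csv_url)
```

The resulting `csv_url` would replace the share link in the `FROM` clause of the `LOAD CSV` query. Hosting the CSV somewhere that serves raw files is another option.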