# Neo4j integration

LangChain is a popular framework for building applications powered by large language models (LLMs).

Integrating Neo4j with LangChain allows developers to build advanced AI applications that can reason over graph data, generate Cypher queries, and provide context-aware answers.

Neo4j supports LangChain with dedicated modules and tools for working with graph databases, making it easier to build applications that leverage both LLMs and graph data.

## Capabilities

The LangChain and Neo4j integration supports:


- **Contextual Retrieval** - Retrieve relevant subgraphs or nodes from Neo4j to provide context for LLM-powered answers.

- **Querying Graph Data with Natural Language** - Use LLMs to translate user questions into Cypher queries, enabling natural language access to graph data.

- **Automated Reasoning** - Combine the reasoning abilities of LLMs with the structured relationships in Neo4j for more accurate and insightful responses.

- **Conversational AI** - Build chatbots and assistants that can answer questions about complex, connected data stored in Neo4j.

- **Knowledge Graph Construction** - Automatically construct knowledge graphs from unstructured data using LLMs, and store them in Neo4j for further analysis.


# Simple LangChain Agent



Throughout this course, you will be adapting a simple LangChain agent to interact with Neo4j.

You will update the agent to query a Neo4j graph database, retrieve information using RAG and GraphRAG, and dynamically generate Cypher queries based on user input.

In this lesson, you will review the agent's code to understand how it works.

The application is a simple [LangGraph](https://www.langchain.com/langgraph) agent that has 2 steps:

1. Retrieve information

2. Generate an answer based on the retrieved information

The code has 4 main sections:

1. Create an LLM and Prompt

2. Define the application state

3. Create the application workflow

4. Invoke the application
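
The two-step flow and the shared state can be sketched in plain Python before reading the real code. This is an illustration only, not LangGraph itself: `run_pipeline` is a hypothetical helper that mimics how each step returns a partial update that is merged into the state, and the `generate` step here builds a string instead of calling an LLM.

```python
from typing import Callable, List, TypedDict


class State(TypedDict, total=False):
    question: str
    context: List[dict]
    answer: str


def retrieve(state: State) -> dict:
    # Step 1: return only the keys this step updates.
    return {"context": [{"location": "London", "weather": "Cloudy, sunny skies later"}]}


def generate(state: State) -> dict:
    # Step 2: a real agent would call an LLM here; this stand-in just
    # reports how many context items were available.
    return {"answer": f"{len(state['context'])} context item(s) for: {state['question']}"}


def run_pipeline(steps: List[Callable[[State], dict]], initial: State) -> State:
    # Hypothetical helper: run the steps in order, merging each partial
    # update into a copy of the state.
    state: State = dict(initial)
    for step in steps:
        state.update(step(state))
    return state


final_state = run_pipeline([retrieve, generate], {"question": "What is the weather?"})
print(final_state["answer"])
```

Because each step returns only the keys it changes, a step can be swapped out (for example, replacing the hard-coded context with a database query) without touching the others.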


```python
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import init_chat_model
from langgraph.graph import START, StateGraph
from langchain_core.prompts import PromptTemplate
from typing_extensions import List, TypedDict

# Initialize the LLM
model = init_chat_model("gpt-4o", model_provider="openai")

# Create a prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}

Answer:"""

prompt = PromptTemplate.from_template(template)

# Define state for application
class State(TypedDict):
    question: str
    context: List[dict]
    answer: str

# Define functions for each step in the application

# Retrieve context
def retrieve(state: State):
    context = [
        {"location": "London", "weather": "Cloudy, sunny skies later"},
        {"location": "San Francisco", "weather": "Sunny skies, raining overnight."},
    ]
    return {"context": context}

# Generate the answer based on the question and context
def generate(state: State):
    messages = prompt.invoke({"question": state["question"], "context": state["context"]})
    response = model.invoke(messages)
    return {"answer": response.content}

# Define application steps
workflow = StateGraph(State).add_sequence([retrieve, generate])
workflow.add_edge(START, "retrieve")
app = workflow.compile()

# Run the application
question = "What is the weather in San Francisco?"
response = app.invoke({"question": question})
print("Answer:", response["answer"])
```

Answer: Sunny skies, raining overnight.
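
To see roughly what the model receives in the `generate` step, the template can be filled by hand. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`, on the assumption that the list of dictionaries is inserted as its string representation; the exact text LangChain produces may differ.

```python
# Same template text as the agent above.
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}

Answer:"""

context = [
    {"location": "London", "weather": "Cloudy, sunny skies later"},
    {"location": "San Francisco", "weather": "Sunny skies, raining overnight."},
]

# str.format stands in for PromptTemplate here (an assumption for illustration).
filled = template.format(
    context=context,
    question="What is the weather in San Francisco?",
)
print(filled)
```

Printing the filled prompt like this is a quick way to check that the context actually contains the information needed to answer the question.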


# Neo4jGraph


You can query Neo4j from a LangChain application using the [Neo4jGraph](https://python.langchain.com/api_reference/neo4j/graphs/langchain_neo4j.graphs.neo4j_graph.Neo4jGraph.html) class. The `Neo4jGraph` class acts as the connection to the database when using other LangChain components, such as retrievers and agents.

In this lesson, you will modify the simple LangChain agent so that it can answer questions about the graph database schema.

The database contains information about movies, actors, and user ratings.


## Query Neo4j

To query Neo4j, you need to:

1. Create a Neo4jGraph instance and connect to a database

2. Run a Cypher statement to get data from the database

```python
import os
from dotenv import load_dotenv
load_dotenv()
```

True

```python
from langchain_neo4j import Neo4jGraph

# Create Neo4jGraph instance
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)
```
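
`os.getenv` returns `None` when a variable is not set, which can surface later as a confusing connection error. A minimal sketch of checking the three settings up front follows; `load_neo4j_settings` is a hypothetical helper, not part of `langchain_neo4j`, and it takes the environment as a parameter so it can be tried without a real `.env` file.

```python
import os
from typing import Mapping, Optional


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    # Hypothetical helper: collect the connection settings and fail early
    # with a clear message if any of them is missing or empty.
    env = os.environ if env is None else env
    names = {
        "url": "NEO4J_URI",
        "username": "NEO4J_USERNAME",
        "password": "NEO4J_PASSWORD",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


# Example values only; real values would come from the environment.
settings = load_neo4j_settings({
    "NEO4J_URI": "neo4j://localhost:7687",
    "NEO4J_USERNAME": "neo4j",
    "NEO4J_PASSWORD": "example-password",
})
print(sorted(settings))
```

The returned dictionary uses the same keyword names as the `Neo4jGraph` call above, so it could be passed as `Neo4jGraph(**settings)`.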

```python
# Run a query and print the result
result = graph.query("""
MATCH (m:Movie {title: "Toy Story"})<-[a:ACTED_IN]-(p:Person)
RETURN p.name AS actor, a.role AS role
""")

print(result)
```

[{'actor': 'Jim Varney', 'role': 'Slinky Dog (voice)'}, {'actor': 'Tim Allen', 'role': 'Buzz Lightyear (voice)'}, {'actor': 'Tom Hanks', 'role': 'Woody (voice)'}, {'actor': 'Don Rickles', 'role': 'Mr. Potato Head (voice)'}]
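
Hard-coding the title in the Cypher string works for a demo, but values that come from user input are usually passed as query parameters. The sketch below assumes `Neo4jGraph.query` accepts a `params` dictionary alongside the query text; `FakeGraph` is a hypothetical stand-in that returns a couple of the rows shown above so the snippet runs without a database.

```python
from typing import Any, Dict, List, Optional

ACTORS_QUERY = """
MATCH (m:Movie {title: $title})<-[a:ACTED_IN]-(p:Person)
RETURN p.name AS actor, a.role AS role
"""


class FakeGraph:
    # Hypothetical stand-in for a connected Neo4jGraph; it ignores the Cypher
    # text and returns canned rows for one known title.
    def query(self, query: str, params: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
        rows = {
            "Toy Story": [
                {"actor": "Tom Hanks", "role": "Woody (voice)"},
                {"actor": "Tim Allen", "role": "Buzz Lightyear (voice)"},
            ]
        }
        return rows.get((params or {}).get("title"), [])


def describe_cast(graph: Any, title: str) -> List[str]:
    # The title travels as the $title parameter, not as part of the query string.
    result = graph.query(ACTORS_QUERY, params={"title": title})
    return [f"{row['actor']} as {row['role']}" for row in result]


for line in describe_cast(FakeGraph(), "Toy Story"):
    print(line)
```

With a real connection, `describe_cast(graph, "Toy Story")` would take the place of the `FakeGraph()` call, provided the `params` assumption holds for your installed version.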


## Schema

You are going to modify the agent to retrieve the database schema and add it to the context.

You can view the database schema with the following Cypher query:

```cypher
CALL db.schema.visualization()
```

```python
import os
from dotenv import load_dotenv
load_dotenv()

from langchain.chat_models import init_chat_model
from langgraph.graph import START, StateGraph
from langchain_core.prompts import PromptTemplate
from typing_extensions import List, TypedDict
from langchain_neo4j import Neo4jGraph

# Connect to Neo4j
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD"),
)

# Initialize the LLM
model = init_chat_model("gpt-4o", model_provider="openai")

# Create a prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}

Answer:"""

prompt = PromptTemplate.from_template(template)

# Define state for application
class State(TypedDict):
    question: str
    context: List[dict]
    answer: str

# Define functions for each step in the application

# Retrieve context
def retrieve(state: State):
    context = graph.query("CALL db.schema.visualization()")
    return {"context": context}

# Generate the answer based on the question and context
def generate(state: State):
    messages = prompt.invoke({"question": state["question"], "context": state["context"]})
    response = model.invoke(messages)
    return {"answer": response.content}

# Define application steps
workflow = StateGraph(State).add_sequence([retrieve, generate])
workflow.add_edge(START, "retrieve")
app = workflow.compile()

# Run the application
question = "How is the graph structured?"
response = app.invoke({"question": question})
print("Answer:", response["answer"])
```

Answer: The graph is structured with nodes and relationships. Here is a summary of the structure:

Nodes:
1. `Movie`: Contains an index `plotEmbedding` and a uniqueness constraint on `movieId`.
2. `User`: Has a uniqueness constraint on `userId`.
3. `Actor`: No specific indexes or constraints.
4. `Director`: No specific indexes or constraints.
5. `Genre`: Has a uniqueness constraint on `name`.
6. `Person`: Has a uniqueness constraint on `tmdbId`.

Relationships:
1. `ACTED_IN`: Connects `Person` or `Actor` or `Director` nodes to `Movie` nodes.
2. `RATED`: Connects `User` nodes to `Movie` nodes.
3. `DIRECTED`: Connects `Actor` or `Director` or `Person` nodes to `Movie` nodes.
4. `IN_GENRE`: Connects `Movie` nodes to `Genre` nodes.

Each relationship describes an interaction or association between the different types of nodes, such as acting in or directing a movie, rating a movie, or categorizing a movie by genre.
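
Passing the raw result of `db.schema.visualization()` to the prompt works, but it can be verbose. One option is to condense it first. The sketch below assumes each record holds a `nodes` list of dictionaries with a `name` key and a `relationships` list of `(start, type, end)` tuples. That shape is an assumption about how the result is serialised, and the sample data is made up to match some of the labels in the answer above, so check it against your own output before relying on it. `summarize_schema` is a hypothetical helper.

```python
from typing import Any, Dict, List


def summarize_schema(records: List[Dict[str, Any]]) -> str:
    # Hypothetical helper: turn schema records into a few short lines.
    labels = set()
    patterns = set()
    for record in records:
        for node in record.get("nodes", []):
            labels.add(node["name"])
        for start, rel_type, end in record.get("relationships", []):
            patterns.add(f"(:{start['name']})-[:{rel_type}]->(:{end['name']})")
    lines = ["Node labels: " + ", ".join(sorted(labels))]
    lines.append("Relationships:")
    lines.extend(sorted(patterns))
    return "\n".join(lines)


# Illustrative sample in the assumed shape, not real query output.
sample = [{
    "nodes": [{"name": "Movie"}, {"name": "User"}, {"name": "Genre"}],
    "relationships": [
        ({"name": "User"}, "RATED", {"name": "Movie"}),
        ({"name": "Movie"}, "IN_GENRE", {"name": "Genre"}),
    ],
}]
print(summarize_schema(sample))
```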
