# Graph DB Neo4j chain
This notebook shows how to use Neo4j, a widely used knowledge graph database, together with an LLM.

## Python Package
Install the main packages used in the examples. Uncomment the commands below to install them directly from a Jupyter notebook.

In [1]:
# !pip install langchain -U
# !pip install neo4j
# !pip install python-dotenv

## Installing Neo4j
Run the Neo4j database server in Docker.
The `docker run` command below pulls the Neo4j image and starts the server.
```
docker run \
--name neo4j \
-p 7474:7474 -p 7687:7687 \
-d \
-e NEO4J_AUTH=neo4j/pleaseletmein \
-e NEO4J_PLUGINS=\[\"apoc\"\]  \
-e NEO4J_apoc_export_file_enabled=true \
-e NEO4J_apoc_import_file_enabled=true \
-e NEO4J_apoc_import_file_use__neo4j__config=true \
neo4j:latest
```
After running the command, the Neo4j server starts, and you can connect to it through Neo4j Browser at http://localhost:7474/.
Note that the default Neo4j Browser account is neo4j/neo4j, but because the command sets NEO4J_AUTH=neo4j/pleaseletmein, you log in as neo4j/pleaseletmein.
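
The credentials set in the Docker command can also be read from the environment instead of being hard-coded in the notebook, which fits the `python-dotenv` package installed earlier. A minimal sketch, assuming hypothetical variable names `NEO4J_URI`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` (they are not part of the original notebook) and falling back to the values from the Docker command:

```python
import os


def neo4j_settings(env=None):
    """Collect Neo4j connection settings, defaulting to the Docker demo values.

    The environment variable names are illustrative assumptions.
    """
    env = os.environ if env is None else env
    return {
        "url": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "pleaseletmein"),
    }


# With an empty environment, the defaults from the Docker command are used.
settings = neo4j_settings({})
print(settings["url"], settings["username"])
```

The resulting dictionary uses the same keyword names (`url`, `username`, `password`) that the notebook passes to `Neo4jGraph`, so it could be unpacked into that call.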

## LangChain with Neo4j
- Connect to Neo4j using the integration provided by the LangChain package
- Add new graph data with an example query and store it on the Neo4j server
- Test whether the Neo4j agent connected through LangChain consults Neo4j during the conversation

In [2]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import GraphCypherQAChain
from langchain.graphs import Neo4jGraph
from dotenv import load_dotenv
# load the environment variables from the .env file in the root directory of the project
load_dotenv(verbose=True, override=True)
# delete the load_dotenv function from the namespace
del load_dotenv



In [3]:
graph = Neo4jGraph(
    url="bolt://localhost:7687", 
    username="neo4j", 
    password="pleaseletmein"
)

In [4]:
# graph.query(
# """
# MERGE (m:Movie {name:"Top Gun"})
# WITH m
# UNWIND ["Tom Cruise", "Val Kilmer", "Anthony Edwards", "Meg Ryan"] AS actor
# MERGE (a:Actor {name:actor})
# MERGE (a)-[:ACTED_IN]->(m)
# """
# )
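
With the seed query commented out, nothing is written to the graph. If you do load it, one option is to pass the movie and actor names as query parameters rather than embedding them in the Cypher text. The sketch below only builds the query string and the parameter dictionary; `build_movie_seed` is a hypothetical helper, and handing the pair to `graph.query(query, params)` assumes that your LangChain version's `Neo4jGraph.query` accepts a parameter dictionary as its second argument.

```python
def build_movie_seed(movie, actors):
    """Return a parameterized Cypher query and its parameters.

    Mirrors the commented-out seed query: one Movie node, one Actor node
    per name, and an ACTED_IN relationship from each actor to the movie.
    """
    query = (
        "MERGE (m:Movie {name: $movie}) "
        "WITH m "
        "UNWIND $actors AS actor "
        "MERGE (a:Actor {name: actor}) "
        "MERGE (a)-[:ACTED_IN]->(m)"
    )
    params = {"movie": movie, "actors": list(actors)}
    return query, params


query, params = build_movie_seed(
    "Top Gun", ["Tom Cruise", "Val Kilmer", "Anthony Edwards", "Meg Ryan"]
)
print(query)
print(params)
```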

In [5]:
graph.refresh_schema()

In [6]:
print(graph.get_schema)


        Node properties are the following:
        []
        Relationship properties are the following:
        []
        The relationships are the following:
        []
        


In [7]:
# Query the graph
chain = GraphCypherQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)

In [8]:
chain.run("Who played in Top Gun?")



> Entering new  chain...
Generated Cypher:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: 'Top Gun'})
RETURN p.name
Full Context:
[]

> Finished chain.


'Tom Cruise played the lead role in the movie Top Gun.'
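
In the run above, `Full Context` is an empty list, so the final answer was not drawn from Neo4j. A small guard can make that visible. The sketch below assumes the chain was built with `return_intermediate_steps=True` and that the result exposes a list of dictionaries under `intermediate_steps`, one of which has a `context` key; check that shape against your LangChain version. `answer_is_grounded` is a hypothetical helper.

```python
def answer_is_grounded(intermediate_steps):
    """Return True if any step carries a non-empty 'context' value."""
    for step in intermediate_steps:
        if isinstance(step, dict) and step.get("context"):
            return True
    return False


# Shape assumed for illustration, filled with the values from the run above.
empty_run = [
    {"query": "MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: 'Top Gun'}) RETURN p.name"},
    {"context": []},
]
print(answer_is_grounded(empty_run))  # False
```

When the check returns `False`, the answer can be treated as unverified rather than as a result of the graph lookup.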

In [26]:
chain.run("Clean the graph database.")



> Entering new  chain...
Generated Cypher:
MATCH (n)
DETACH DELETE n
Full Context:
[]

> Finished chain.


"I'm sorry, but without any specific information about the graph database or the cleaning process, I'm unable to provide a helpful answer. Could you please provide more details or clarify your question?"

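The chain turned a plain-English request into `MATCH (n) DETACH DELETE n`. If that is not the behavior you want, one option is to inspect generated Cypher before it reaches the database. The sketch below is a keyword check only, not a Cypher parser, so a keyword inside a string literal would also trigger it. `is_write_query` is a hypothetical helper, not part of LangChain.

```python
import re

# Clauses that modify the graph. Deliberately conservative and incomplete.
_WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b", re.IGNORECASE
)


def is_write_query(cypher):
    """Return True if the Cypher text contains a write-style keyword."""
    return bool(_WRITE_KEYWORDS.search(cypher))


print(is_write_query("MATCH (n)\nDETACH DELETE n"))
print(is_write_query("MATCH (n) RETURN DISTINCT labels(n) AS NodeLabels"))
```
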
## Synthesize Graph Data
- Theme: Food Ingredients and Recipes

In [27]:
import json
import ast
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

In [28]:
synth_prompt = "Imagine a random dish. Describe the ingredients used, the cooking process, the taste of each ingredient, and the overall flavor of the dish."
synth_llm = OpenAI(temperature=0.7, verbose=True)
synth_result = synth_llm.predict(synth_prompt)
print(synth_result)



This dish is a savory Mediterranean pasta. It starts with a base of cooked spaghetti, tossed in a light sauce of garlic, extra virgin olive oil, white wine, parsley, and a pinch of red pepper flakes. The pasta is then topped with diced tomatoes, artichoke hearts, and kalamata olives. 

The garlic and red pepper flakes give the dish a hint of spicy heat, while the extra virgin olive oil adds a rich, buttery flavor. The tomatoes, artichoke hearts, and olives provide a contrasting salty and tangy bite.

The cooking process is simple yet flavorful. First, the pasta is boiled until al dente, then drained and tossed with the sauce. The diced tomatoes, artichoke hearts, and olives are added on top and lightly cooked with the sauce. Finally, the dish is served hot.

The overall flavor of the dish is a flavorful mix of savory and salty. The garlic, red pepper flakes, and olives give the dish a slightly spicy kick, while the tomatoes and artichoke hearts provide a tangy contrast. The extra vir

In [29]:
preprocess_instruction = f"List all the relational information that can be gleaned from the given text: {synth_result}"
preprocess_llm = OpenAI(temperature=0.7, verbose=True)
preprocess_result = preprocess_llm.predict(preprocess_instruction)
print(preprocess_result)



Relational information: 
- Pasta is cooked with garlic, extra virgin olive oil, white wine, parsley, and red pepper flakes
- Pasta is topped with diced tomatoes, artichoke hearts, and kalamata olives
- Garlic and red pepper flakes give the dish a hint of spicy heat
- Extra virgin olive oil adds a rich, buttery flavor
- Tomatoes, artichoke hearts, and olives provide a contrasting salty and tangy bite
- Overall flavor is a mix of savory and salty, with a slightly spicy kick and a tangy contrast


In [30]:
knowledge_graph_parsing_instruction = f"Divide and list all the information that can be gleaned from the given data into units suitable for storage in a graph database: {preprocess_result}"
kg_llm = OpenAI(temperature=0.7, verbose=True)
kg_result = kg_llm.predict(knowledge_graph_parsing_instruction)
print(kg_result)



Node properties: 
- Ingredient: garlic, extra virgin olive oil, white wine, parsley, red pepper flakes, diced tomatoes, artichoke hearts, kalamata olives
- Flavor: savory, salty, spicy, tangy


In [31]:
kg_meta_node_labels = chain.run("Summarize the node labels")
kg_meta_edge_labels = chain.run("Summarize the relationship labels")



> Entering new  chain...
Generated Cypher:
MATCH (n)
RETURN DISTINCT labels(n) AS NodeLabels
Full Context:
[]

> Finished chain.


> Entering new  chain...
Generated Cypher:
MATCH ()-[r]->()
RETURN DISTINCT type(r)
Full Context:
[]

> Finished chain.


In [32]:
print(kg_meta_node_labels)
print(kg_meta_edge_labels)

I'm sorry, but without any provided information, I don't have the necessary details to summarize the node labels.
I'm sorry, but without any provided information, I don't have the necessary details to summarize the relationship labels.
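
Asking the chain to summarize labels goes through the LLM and, on an empty graph, returns an apology instead of data. The same metadata can be read with Neo4j's built-in `db.labels()` and `db.relationshipTypes()` procedures. The sketch below takes the query function as an argument so it can be tried without a server; with a live connection you would pass `graph.query`, assuming it returns a list of dictionaries. `fetch_labels` and `fake_query` are hypothetical names used for illustration.

```python
def fetch_labels(run_query):
    """Return (node_labels, relationship_types) as sorted lists of strings."""
    node_rows = run_query("CALL db.labels() YIELD label RETURN label")
    rel_rows = run_query(
        "CALL db.relationshipTypes() YIELD relationshipType "
        "RETURN relationshipType"
    )
    node_labels = sorted(row["label"] for row in node_rows)
    rel_types = sorted(row["relationshipType"] for row in rel_rows)
    return node_labels, rel_types


def fake_query(cypher):
    """Stand-in for graph.query so the sketch runs without Neo4j."""
    if "db.labels" in cypher:
        return [{"label": "Ingredient"}, {"label": "Flavor"}]
    return [{"relationshipType": "COOKED_WITH"}]


print(fetch_labels(fake_query))
```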


In [33]:
graph_data_update_prompt = f"""
Node labels: {kg_meta_node_labels}
Relationship labels: {kg_meta_edge_labels}
Instruction: Save the contents into the database
Contents: {kg_result}
"""
# chain.run(f"Save the contents into the database:\n{kg_result}")
chain.run(graph_data_update_prompt)



> Entering new  chain...
Generated Cypher:
CREATE (i1:Ingredient {name: 'garlic'}),
       (i2:Ingredient {name: 'extra virgin olive oil'}),
       (i3:Ingredient {name: 'white wine'}),
       (i4:Ingredient {name: 'parsley'}),
       (i5:Ingredient {name: 'red pepper flakes'}),
       (i6:Ingredient {name: 'diced tomatoes'}),
       (i7:Ingredient {name: 'artichoke hearts'}),
       (i8:Ingredient {name: 'kalamata olives'}),
       (f1:Flavor {name: 'savory'}),
       (f2:Flavor {name: 'salty'}),
       (f3:Flavor {name: 'spicy'}),
       (f4:Flavor {name: 'tangy'})
Full Context:
[]

> Finished chain.


"I'm sorry, but without any provided information, I don't have the necessary details to summarize the contents."

In [34]:
# test
graph_data_update_prompt = f"""
Node labels: {kg_meta_node_labels}
Relationship labels: {kg_meta_edge_labels}
Instruction: Save the contents into the database
Contents: {preprocess_result}
"""
chain.run(graph_data_update_prompt)



> Entering new  chain...
Generated Cypher:
CREATE (pasta:Ingredient {name: 'Pasta'})
CREATE (garlic:Ingredient {name: 'Garlic'})
CREATE (oliveOil:Ingredient {name: 'Extra Virgin Olive Oil'})
CREATE (wine:Ingredient {name: 'White Wine'})
CREATE (parsley:Ingredient {name: 'Parsley'})
CREATE (redPepperFlakes:Ingredient {name: 'Red Pepper Flakes'})
CREATE (tomatoes:Ingredient {name: 'Tomatoes'})
CREATE (artichokeHearts:Ingredient {name: 'Artichoke Hearts'})
CREATE (olives:Ingredient {name: 'Kalamata Olives'})

CREATE (pasta)-[:COOKED_WITH]->(garlic)
CREATE (pasta)-[:COOKED_WITH]->(oliveOil)
CREATE (pasta)-[:COOKED_WITH]->(wine)
CREATE (pasta)-[:COOKED_WITH]->(parsley)
CREATE (pasta)-[:COOKED_WITH]->(redPepperFlakes)
CREATE (pasta)-[:TOPPED_WITH]->(tomatoes)
CREATE (pasta)-[:TOPPED_WITH]->(artichokeHearts)
CREATE (pasta)-[:TOPPED_WITH]->(olives)

CREATE (garlic)-[:GIVES_HINT_OF]->(redPepperFlakes)
CREATE (oliveOil)-[:ADDS_FLAVOR]->(pasta)
CREATE (tomatoes)-[:PROVIDES_

'Based on the provided information, the contents that need to be saved into the database are the details about how pasta is cooked and topped. The pasta is cooked with garlic, extra virgin olive oil, white wine, parsley, and red pepper flakes. It is then topped with diced tomatoes, artichoke hearts, and kalamata olives. The garlic and red pepper flakes give the dish a hint of spicy heat, while the extra virgin olive oil adds a rich, buttery flavor. The tomatoes, artichoke hearts, and olives provide a contrasting salty and tangy bite. Overall, the flavor of the dish is a mix of savory and salty, with a slightly spicy kick and a tangy contrast.'
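
Both generated statements use `CREATE`, so running them repeatedly can add the same ingredient more than once. One alternative is to write nodes with `MERGE` from data that has already been parsed. The sketch below builds one parameterized statement per label. Labels cannot be passed as Cypher parameters, so they are checked against a conservative pattern before being placed in the query text. `build_merge_statements` is a hypothetical helper, and the input dictionary shape is an assumption.

```python
import re

# Letters, digits, and underscores only; must not start with a digit.
_LABEL_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_merge_statements(nodes_by_label):
    """Build (query, params) pairs that MERGE one node per name and label."""
    statements = []
    for label, names in nodes_by_label.items():
        if not _LABEL_PATTERN.fullmatch(label):
            raise ValueError(f"Unsafe node label: {label!r}")
        query = f"UNWIND $names AS name MERGE (n:{label} {{name: name}})"
        statements.append((query, {"names": list(names)}))
    return statements


for query, params in build_merge_statements(
    {"Ingredient": ["garlic", "parsley"], "Flavor": ["savory", "tangy"]}
):
    print(query, params)
```

Each pair could then be sent to the database one at a time, for example through `graph.query(query, params)` if your LangChain version accepts a parameter dictionary there.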

## Iteration

In [35]:
num_iters = 20
for _ in range(num_iters):
    synth_prompt = "Imagine a random dish. Describe the ingredients used, the cooking process, the taste of each ingredient, and the overall flavor of the dish."
    synth_llm = OpenAI(temperature=0.7, verbose=True)
    synth_result = synth_llm.predict(synth_prompt)
    print(synth_result)

    preprocess_instruction = f"List all the relational information that can be gleaned from the given text: {synth_result}"
    preprocess_llm = OpenAI(temperature=0.7, verbose=True)
    preprocess_result = preprocess_llm.predict(preprocess_instruction)
    print(preprocess_result)

    knowledge_graph_parsing_instruction = f"Divide and list all the information that can be gleaned from the given data into units suitable for storage in a graph database: {preprocess_result}"
    kg_llm = OpenAI(temperature=0.7, verbose=True)
    kg_result = kg_llm.predict(knowledge_graph_parsing_instruction)
    print(kg_result)

    kg_meta_node_labels = chain.run("Summarize the node labels")
    kg_meta_edge_labels = chain.run("Summarize the relationship labels")

    try:
        graph_data_update_prompt = f"""
Node labels: {kg_meta_node_labels}
Relationship labels: {kg_meta_edge_labels}
Instruction: Save the contents into the database
Contents: {kg_result}
"""
        # chain.run(f"Save the contents into the database:\n{kg_result}")
        chain.run(graph_data_update_prompt)
    except Exception as e:
        print(f"Failed to update graph: {e}")

    try:
        # test
        graph_data_update_prompt = f"""
Node labels: {kg_meta_node_labels}
Relationship labels: {kg_meta_edge_labels}
Instruction: Save the contents into the database
Contents: {preprocess_result}
"""
        chain.run(graph_data_update_prompt)
    except Exception as e:
        print(f"Failed to update graph: {e}")




This dish is a hearty seafood stew. It includes fresh mussels, clams, shrimp, and squid, all cooked in a rich tomato-based broth. The mussels and clams provide a briny flavor, while the shrimp and squid add a hint of sweetness and a meaty texture. The tomato broth is infused with herbs such as oregano, basil, and thyme, adding a savory depth of flavor. Onions, garlic, and red peppers are sautéed in olive oil and added for a touch of sweetness and a slight kick of heat. 

The cooking process begins by sautéing the onions, garlic, and red peppers in olive oil until softened. The tomato-based broth is then added to the pan, along with the herbs and spices. The seafood is added, and the stew is cooked until all of the ingredients are cooked through. 

The overall flavor of the dish is a balance between the sweet and savory. The briny flavor of the seafood is offset by the sweetness of the tomatoes and onions, and the herbs and spices provide a depth of flavor. It is a hearty and comforti