# Using a SurrealDB Knowledge Graph and Gemini to build a Graph-RAG

This Jupyter Notebook illustrates the implementation and utilization of a graph-based Retrieval-Augmented Generation (RAG) system using a knowledge graph stored in SurrealDB. It focuses on leveraging the knowledge graph created by the graph_extractor.ipynb notebook for question answering tasks.

### Key Functionality
#### Data Loading:
Loads the raw text data that will be used as context in the prompt.

#### Leveraging the Existing Knowledge Graph:

Creates a SurrealQL function that queries the knowledge graph. A vector search over the entity descriptions finds the entities most relevant to the question, and the function returns those entities together with their relationships to other entities in the graph. This data is also fed into the prompt as context.

        
#### Response Generation without a knowledge graph:
Using a template prompt from the RAG_prompts.py file, the LLM is asked the question with a simple context window containing the text data loaded earlier.

    

#### Response Generation with knowledge graph:
Using a template prompt from the RAG_prompts.py file, the LLM is asked the question with a context window containing the text data loaded earlier plus the graph data retrieved from SurrealDB.


##### Notes:
This notebook uses the following libraries:

* surrealdb to interact with SurrealDB
* google.generativeai to interact with Gemini
* networkx to visualize the graph

The notebook is loosely based on the larger microsoft/graphrag repository, which provides a more comprehensive, modular graph-based RAG system. This notebook focuses specifically on querying the existing knowledge graph to answer questions.
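
To make the difference between the two response modes concrete, here is a minimal sketch of what the two templates could look like. The real templates live in RAG_prompts.py and are not reproduced in this notebook, so the wording and the names `EXAMPLE_STD_PROMPT` and `EXAMPLE_GRAPH_PROMPT` are assumptions. Only the placeholder names (`question`, `documents`, `knowledge_graph`) are taken from the calls made later in the notebook.

```python
# Hypothetical stand-ins for the templates in RAG_prompts.py (wording is assumed).
EXAMPLE_STD_PROMPT = (
    "Answer the question using only the documents below.\n"
    "<documents>\n{documents}\n</documents>\n"
    "Question: {question}\n"
)

EXAMPLE_GRAPH_PROMPT = (
    "Answer the question using the documents and the knowledge graph below.\n"
    "<documents>\n{documents}\n</documents>\n"
    "<knowledge_graph>\n{knowledge_graph}\n</knowledge_graph>\n"
    "Question: {question}\n"
)

# The standard prompt only receives the question and the source text.
std_prompt = EXAMPLE_STD_PROMPT.format(
    question="Who is the main character?",
    documents="(text of the source document)",
)

# The graph prompt receives the same inputs plus the rows retrieved from the graph.
graph_prompt = EXAMPLE_GRAPH_PROMPT.format(
    question="Who is the main character?",
    documents="(text of the source document)",
    knowledge_graph="(rows returned by the SurrealDB query)",
)
print(graph_prompt)
```

The only structural difference between the two modes is the extra `knowledge_graph` section, which keeps the comparison between the answers fair.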

In [1]:
import sys
import os
import json
import time
import google.generativeai as genai
import ipynb_path
from surrealdb import Surreal

from IPython.display import display, Markdown, JSON


#get this notebook's path for access to the other files needed
dir_path = os.path.dirname(os.path.realpath(ipynb_path.get(__name__)))
sys.path.append(dir_path) #add the current directory for adding py imports
from RAG_prompts import STD_RAG_PROMT, GRAPH_RAG_PROMT

In [2]:

# this folder
nb_folder = dir_path
input_file = nb_folder + "/Operation Dulce v2 1 1.md"

    
ip = "0.0.0.0:8080"
url = "ws://{0}/rpc".format(ip)

u = "root"
p = "root"
n = "graph_rag"
d = "graph_rag"
db_folder = nb_folder + "/db"

surrealdb_start = "surreal start --log none --allow-net --user {u} --pass {p} --bind {ip} \"rocksdb://{db_folder}\"".format(
    u=u,
    p=p,
    ip=ip,
    db_folder=db_folder)

#run this command if your surreal instance isn't running yet 
#copy and paste from below into a terminal
print(surrealdb_start)

#ensure the embedding service is running as well
"""
cd embedding_api
env FLASK_APP=embedding_api.py python -m flask --app embedding_api --debug run 
"""     

surreal start --log none --allow-net --user root --pass root --bind 0.0.0.0:8080 "rocksdb:///Users/sandro/Library/CloudStorage/GoogleDrive-apireno@gmail.com/My Drive/code/graph_rag/db"


'\ncd embedding_api\nenv FLASK_APP=embedding_api.py python -m flask --app embedding_api --debug run \n'
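
The printed command above wraps the database path in double quotes because the folder name can contain spaces. As a sketch of an alternative, the command line can be assembled from a list of arguments and quoted with the standard library's `shlex.join`. The helper name `build_surreal_start_command` and the example folder are made up for illustration; the flags are the same ones used above.

```python
import shlex

def build_surreal_start_command(user, password, bind, db_folder):
    """Return a shell-safe `surreal start` command line (same flags as above)."""
    args = [
        "surreal", "start",
        "--log", "none",
        "--allow-net",
        "--user", user,
        "--pass", password,
        "--bind", bind,
        f"rocksdb://{db_folder}",
    ]
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(args)

# Example with a made-up folder whose name contains a space.
command = build_surreal_start_command(
    "root", "root", "0.0.0.0:8080", "/tmp/my notebooks/db"
)
print(command)
```

Splitting the result with `shlex.split` gives back the original argument list, which is a quick way to confirm that the path survives as a single argument.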

In [15]:

# get the text from the source file for use in our prompt
with open(input_file, "r") as f:
    input_text = f.read()


entity_types = ["PERSON", "PLACE"]
relation_type = "RELATED_TO"

retrieve_function_surql = """
DEFINE FUNCTION OVERWRITE fn::get_related_entities($text: string) {{
    
    LET $emb = fn::get_embeddings($text);
    
    RETURN SELECT id,description, ->{1}.{{out,strength,description}} FROM {0} 
        WHERE  embedding <|5,COSINE|> $emb;
}};
""".format(entity_types[0],relation_type)


async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    outcome = await db.query(retrieve_function_surql)




DEFINE FUNCTION OVERWRITE fn::get_related_entities($text: string) {
    
    LET $emb = fn::get_embeddings($text);
    
    RETURN SELECT id,description, ->RELATED_TO.{out,strength,description} FROM PERSON 
        WHERE  embedding <|5,COSINE|> $emb;
};
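
The `<|5,COSINE|>` operator in the function above asks SurrealDB for the 5 records whose `embedding` is nearest to `$emb` under the cosine metric. The following pure-Python sketch shows the same idea on toy data. It is an illustration of nearest-neighbour ranking by cosine similarity, not SurrealDB's implementation, and the record ids and two-dimensional vectors are made up.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_by_cosine(query_embedding, records, k=5):
    """Return the k records most similar to the query embedding."""
    ranked = sorted(
        records,
        key=lambda record: cosine_similarity(query_embedding, record["embedding"]),
        reverse=True,
    )
    return ranked[:k]

# Toy records; real embeddings have hundreds of dimensions.
people = [
    {"id": "PERSON:a", "embedding": [1.0, 0.0]},
    {"id": "PERSON:b", "embedding": [0.0, 1.0]},
    {"id": "PERSON:c", "embedding": [0.9, 0.1]},
]
nearest = top_k_by_cosine([1.0, 0.0], people, k=2)
print([person["id"] for person in nearest])  # ['PERSON:a', 'PERSON:c']
```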



In [16]:

#Let's test out the vector/graph query
#we should see a collection of people and their relationships
#where the person's description is related to the question

async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    get_the_graph_context_surql = "fn::get_related_entities('{0}')"
    outcome = await db.query(get_the_graph_context_surql.format("Who is the main character?"))

JSON(outcome)
    

<IPython.core.display.JSON object>
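
The cell above splices the question into the query text with `str.format`, so a question that contains a single quote (for example "Alex's") would end the SurrealQL string literal early. A possible alternative is to pass the question as a bound variable. This sketch assumes that the SDK's `query` method accepts a dictionary of variables as its second argument and that the server accepts `RETURN fn::get_related_entities($question);`. The helper name `build_graph_context_query` is hypothetical.

```python
def build_graph_context_query(question):
    """Return a SurrealQL statement plus the variables to bind to it."""
    # The question never becomes part of the query text, so quotes are harmless.
    sql = "RETURN fn::get_related_entities($question);"
    return sql, {"question": question}

sql, variables = build_graph_context_query("Who is Alex's closest ally?")
print(sql)
print(variables)

# Inside an open connection this would be used as (assumed SDK signature):
#     outcome = await db.query(sql, variables)
```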

In [17]:
#this function will call the LLM with a prompt that includes the text of the original document
def get_std_LLM_response(the_model,the_question,the_input_text):

    messages = [
            {"role": "user", "parts": [{"text": STD_RAG_PROMT.format(
                question = the_question,
                documents = the_input_text
            )}]}
        ]

    
    response = the_model.generate_content(messages)
    return response
    
#this function will call the LLM with a prompt that includes the text of the original document
#plus the relevant entities and relationships from the knowledge graph based on a vector search
async def get_graph_RAG_response(the_db,the_model,the_question,the_input_text):
    
    get_the_graph_context_surql = "fn::get_related_entities('{0}')"
    
    outcome = await the_db.query(get_the_graph_context_surql.format(the_question))

    messages = [
            {"role": "user", "parts": [{"text": GRAPH_RAG_PROMT.format(
                question = the_question,
                documents = the_input_text,
                knowledge_graph = outcome
            )}]}
        ]

    
    response = the_model.generate_content(messages)
    return response
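
The graph function above places the raw query result into the prompt. One option is to flatten the rows into short text lines first, which tends to be easier to read and uses fewer tokens. The sketch below assumes each row is a dictionary with `id`, `description`, and a list of edges under a key called `relationships`. That key name and the sample values are assumptions, so inspect the real `outcome` before adapting this.

```python
def format_graph_context(rows):
    """Turn query rows into one line per entity and one line per relationship."""
    lines = []
    for row in rows:
        lines.append(f"{row['id']}: {row['description']}")
        # "relationships" is an assumed key name for the ->RELATED_TO edges.
        for edge in row.get("relationships", []):
            lines.append(
                f"  -> {edge['out']} (strength {edge['strength']}): {edge['description']}"
            )
    return "\n".join(lines)

# Made-up rows in an assumed shape, for illustration only.
example_rows = [
    {
        "id": "PERSON:alex_mercer",
        "description": "Agent and main character",
        "relationships": [
            {
                "out": "PERSON:jordan_hayes",
                "strength": 8,
                "description": "close working relationship and mutual respect",
            },
        ],
    },
]
context = format_graph_context(example_rows)
print(context)
```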
    
       

In [18]:

# Set your API key (I've stored mine in an env variable)
genai.configure(api_key=os.getenv('GOOGLE_GENAI_API_KEY'))

#flash is faster and good enough
#model_name = "gemini-1.0-pro"
model_name = "gemini-1.5-flash"

model = genai.GenerativeModel(model_name)
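
`os.getenv` returns `None` when the variable is missing, and the problem may only surface later as an authentication error. A small guard, sketched here with a hypothetical helper called `require_env`, fails early with a clearer message. The `environ` parameter exists only so the helper can be tried without touching the real environment.

```python
import os

def require_env(name, environ=os.environ):
    """Return the value of an environment variable or raise a clear error."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Demonstration with a stand-in environment instead of the real one.
fake_environ = {"GOOGLE_GENAI_API_KEY": "example-key"}
print(require_env("GOOGLE_GENAI_API_KEY", fake_environ))
```

In the notebook the call would look like `genai.configure(api_key=require_env('GOOGLE_GENAI_API_KEY'))`.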



In [19]:
#Let's test a question...

q = "Who is the main character and what motivates them?"

response = get_std_LLM_response(model,q,input_text)

print ("\n\n\n\n ------ STD RAG -------------------------\n\n\n\n")



display(Markdown(response.text))

print ("\n\n\n\n ------ GRAPH RAG -------------------------\n\n\n\n")

async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    response = await get_graph_RAG_response(db,model,q,input_text)
    display(Markdown(response.text))








 ------ STD RAG -------------------------






Based on the provided text, Agent Alex Mercer is the main character.  His motivation is complex and evolves throughout the narrative. Initially, he seems driven by a sense of duty and compliance to the mission's protocols. However, as the operation progresses and the team encounters increasingly inexplicable phenomena, Mercer's inherent curiosity and desire to understand the unknown supersede his initial obedience. He demonstrates a growing commitment to uncovering the truth, even if it means questioning authority and established protocols.  His actions show a drive to both understand the alien intelligence and ensure humanity's safety and preparedness in this unprecedented encounter.  He becomes a leader, balancing the need for caution with a desire for discovery and fostering trust and collaboration amongst the team.






 ------ GRAPH RAG -------------------------






Based on the provided text and knowledge graph, Alex Mercer is the main character.  His motivations are multifaceted:

* **Duty and Compliance:** Initially, Alex shows a strong sense of duty and compliance to his superior, Taylor Cruz.  This is presented as a conflict with his own instincts.

* **Investigative Instincts:**  He possesses a strong investigative drive, a desire to "leave no stone unturned," and a belief in the importance of intuition and trust, which frequently clashes with Taylor Cruz's authoritarian approach.

* **Uncovering Truth:** Alex is deeply motivated by a desire to understand the truth behind Operation: Dulce and the alien communications. This is shown in his interactions with Jordan Hayes, where they share a mutual understanding of the deeper mysteries at play.

* **Protecting Humanity:** As the story progresses, Alex's motivations evolve to include a concern for the safety and future of humanity, demonstrated by his increasingly assertive leadership and emphasis on a cautious, understanding approach to the alien contact.  He takes on a more assertive and less compliant role as the narrative unfolds.

In short, Alex Mercer is driven by a complex interplay of duty, innate curiosity, a thirst for truth, and a growing sense of responsibility towards humanity's future in the face of extraterrestrial contact.
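
The cells in this notebook repeat the same pattern: ask the question the standard way, then ask it again with graph context. A small helper can wrap that pattern. The sketch below uses stub callables in place of the real LLM calls so that it runs offline. The names `compare_answers`, `fake_std`, and `fake_graph` are hypothetical.

```python
import asyncio

async def compare_answers(question, ask_std, ask_graph):
    """Ask the same question both ways and return the two answers."""
    std_answer = ask_std(question)            # plain call, like get_std_LLM_response
    graph_answer = await ask_graph(question)  # awaitable, like get_graph_RAG_response
    return {"STD RAG": std_answer, "GRAPH RAG": graph_answer}

# Stub callables stand in for the real LLM calls.
def fake_std(question):
    return f"std answer to: {question}"

async def fake_graph(question):
    return f"graph answer to: {question}"

# In a notebook cell, use `await compare_answers(...)` instead of asyncio.run.
answers = asyncio.run(
    compare_answers("Who is the main character?", fake_std, fake_graph)
)
for label, text in answers.items():
    print(f"------ {label} ------\n{text}\n")
```

With the real functions, `ask_std` and `ask_graph` could be lambdas that close over `model`, `input_text`, and an open `db` connection.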


In [20]:

#Let's test a question that asks about the main character's relationships
q = "What places and people does Alex Mercer hold dearest?"

response = get_std_LLM_response(model,q,input_text)

print ("\n\n\n\n ------ STD RAG -------------------------\n\n\n\n")


display(Markdown(response.text))

print ("\n\n\n\n ------ GRAPH RAG -------------------------\n\n\n\n")

async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    response = await get_graph_RAG_response(db,model,q,input_text)
    display(Markdown(response.text))







 ------ STD RAG -------------------------






Based on the provided text, Alex Mercer's strongest attachments are to his team members within the Paranormal Military Squad, particularly Dr. Jordan Hayes and Sam Rivera.  The documents repeatedly highlight his trust in and respect for Jordan's analytical skills, viewing their shared understanding as crucial for survival ("_their analytical mind sees the patterns I do. And that's worth more than protocol. That's the connection we need to survive this._").  He also shows mentorship and belief in Sam's abilities, emphasizing the importance of Sam's intuition alongside technology ("_Keen eyes will keep us ahead—or at least not blindsided_").  While he interacts with Taylor Cruz, their relationship is more one of professional obligation and sometimes conflict rather than close personal connection.  No other personal relationships or places are mentioned in the provided text.






 ------ GRAPH RAG -------------------------






Based on the provided text and knowledge graph, Alex Mercer's closest relationships are with Jordan Hayes and Sam Rivera.  The knowledge graph highlights a "close working relationship and mutual respect" with Jordan Hayes, built on shared analytical skills and intuition, and a mentorship role with Sam Rivera, valuing Sam's expertise and intuition.

Regarding places, the documents and knowledge graph strongly indicate Dulce Base as a significant location for Alex Mercer, where he spends considerable time participating in Operation: Dulce. Within Dulce Base, he interacts with his team in several locations including the briefing room, command center, mainframe room, and sub-level.  New Mexico is also mentioned as the location of Dulce Base, further associating it with Mercer.


In [21]:



q = """Based on Alex Mercer's strongest relationships and the places most important to the plot choose a
    scene that is the most emotionally charged and important to make the cover of the book. 
    What chapter is this scene in?
    What would that cover look like?
"""

response = get_std_LLM_response(model,q,input_text)

print ("\n\n\n\n ------ STD RAG -------------------------\n\n\n\n")

display(Markdown(response.text))

print ("\n\n\n\n ------ GRAPH RAG -------------------------\n\n\n\n")

async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    response = await get_graph_RAG_response(db,model,q,input_text)
    display(Markdown(response.text))






 ------ STD RAG -------------------------






Based on the provided text, Alex Mercer's strongest relationships are with Jordan Hayes and Sam Rivera.  The location most important to the plot is the Dulce Base command center.  Therefore, the most emotionally charged and important scene for the book cover would likely be one depicting the team in the command center during their first successful communication with the alien intelligence.  This occurs toward the end of the novel, and the exact chapter is difficult to pinpoint with certainty as the descriptions of the command center and the communication breakthrough are repeated across several chapters. However, the culmination of this effort could be found in **Chapter 15**, which contains the most detailed description of this momentous event.  Note that Chapters 9, 11, and 14 also contain descriptions of the event.


**Book Cover Description:**

The cover would feature a stylized image of the Dulce Base command center at night. The focus would be on the four agents—Alex Mercer, Jordan Hayes, Taylor Cruz, and Sam Rivera—silhouetted against a large bank of monitors displaying the alien code.  The monitors would show a blend of alien symbols and perhaps a few human-readable words, hinting at the successful communication. The overall mood should be one of awe, tension, and hope.

Specific elements:

* **Dominant colors:** Dark blues, purples, and greens to represent the underground base and the vastness of space.  A touch of brighter, warmer color (like a soft orange or yellow) could subtly emanate from the monitors, representing the alien signal.

* **Lighting:**  The lighting would be dramatic, focusing primarily on the monitors and the agents’ faces, leaving the background darker and mysterious.

* **Style:** The style could be realistic, but with a touch of stylized realism or even a slightly cinematic feel.  The image should be sharp and detailed enough to draw the reader in, but not so detailed as to overwhelm the overall mood.

* **Title:** The title, "Operation: Dulce," should be prominently displayed, possibly using a font that conveys both technological advancement and a sense of mystery.

* **Subtitle (optional):**  A subtitle such as "First Contact" or "A Symphony of Stars" could enhance the cover's appeal.


The overall effect should be a visually striking image that encapsulates the tension, mystery, and hope inherent in the story. It would grab the reader's attention and entice them to discover the secrets hidden within Dulce Base.






 ------ GRAPH RAG -------------------------






Based on the provided knowledge graph and documents, the most emotionally charged and important scene for the book cover centers around Alex Mercer's strongest relationships (Jordan Hayes and Sam Rivera) and the most significant location (Dulce Base's command center).  The scene where the team makes first contact and receives the alien response in the command center is the most emotionally resonant and pivotal moment in the narrative.  This occurs towards the end of the book, across several chapters but the most impactful scene depicting the initial contact occurs at the climax of the novel, and the subsequent chapters detail the ongoing response.

**Chapter:**  While the build-up happens across several chapters (primarily Chapters 9-15), the most impactful *initial* contact scene, best suited for the cover art, is arguably found within the later chapters, specifically the climax of Chapter 12 and the beginning of Chapter 11. These chapters describe the moment when the alien signal is fully decoded, leading to an initial response from the alien intelligence and the commencement of true interspecies communication.  The emotional intensity builds across these chapters.


**Cover Description:**

The cover art should depict the Dulce Base command center, a setting characterized by the documents as sterile, high-tech, and filled with a palpable sense of tension. The focus should be on Alex Mercer, positioned centrally, and partially obscured by a bright, ethereal glow emanating from the main monitors.  This light should represent the alien signal, and its dynamic and shifting colors should suggest the mysterious nature of the communication.  

Around Alex Mercer, slightly less prominently but still visible, should be Jordan Hayes and Sam Rivera, their faces showing expressions of awe, apprehension, and perhaps even fear mixed with excitement. The expressions should subtly reflect the gravity of the moment: awe at the contact, apprehension about the unknown, and excitement about the potential for discovery. Taylor Cruz could be partially visible in the background, their expression perhaps more reserved, conveying their characteristic cautiousness and concern for the mission’s security, subtly contrasting the emotional responses of the others.

The overall color palette should be a mixture of cool blues and greens (representing the technology and the sterile environment of the command center) with splashes of bright, almost otherworldly, light, symbolizing the alien contact. The title, "Operation: Dulce," should be prominently displayed, perhaps overlaid on the ethereal light, with the author’s name below. The style could be a blend of science fiction and thriller aesthetics, conveying both the high-stakes nature of the mission and the wonder of first contact.  The overall mood should be one of awe, tension, and anticipation.


In [25]:


q = """List the places that Alex visits and the frequency he visits them plus who he visited them with in the order of importance to Alex. 
output the response in markdown language
"""

response = get_std_LLM_response(model,q,input_text)

print ("\n\n\n\n ------ STD RAG -------------------------\n\n\n\n")

display(Markdown(response.text))

print ("\n\n\n\n ------ GRAPH RAG -------------------------\n\n\n\n")

async with Surreal(url) as db:
    await db.signin({"user": u, "pass": p}) 
    await db.use(n, d)
    response = await get_graph_RAG_response(db,model,q,input_text)
    display(Markdown(response.text))





 ------ STD RAG -------------------------






Based on the provided text, here's a list of places Alex visits, the frequency, and who he's with, ordered by apparent importance to Alex:

| Location                     | Frequency       | People Visited With                                      | Importance Rationale                                                                                                |
|------------------------------|-----------------|---------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|
| Dulce Base Command Center     | Multiple Times  | Jordan Hayes, Taylor Cruz, Sam Rivera                     | Core location of the mission's central activity; most interactions happen here with the whole team.                   |
| Dulce Base Corridors/Hubs    | Multiple Times  | Jordan Hayes, Taylor Cruz, Sam Rivera                     | Essential for navigating the base and progressing the mission; scenes often involve the whole team.                  |
| Briefing Room                 | Twice            | Jordan Hayes, Taylor Cruz, Sam Rivera, (Sam Rivera alone once) | Initial mission briefing and later debriefing, highlighting mission parameters and strategic overview.             |
| Cramped Underground Meeting Room | Once             | Jordan Hayes, Taylor Cruz, Sam Rivera                     |  Immediately before entering Dulce Base, setting the tone of anticipation and shared responsibility.                |
| Deserted Corridor | Once | Taylor Cruz | Illustrates Cruz's ambition and control.  |
| Elevator Shaft | Once | Jordan Hayes, Taylor Cruz, Sam Rivera | Descent into Dulce; symbolic of the journey into the unknown. |
| Sam Rivera's Office | Once | Sam Rivera | A private conversation with Sam, highlighting trust and intuition. |
| Dr. Jordan Hayes' Lab | Once | Jordan Hayes | A discussion about the larger implications of the mission with Jordan. |


**Rationale for Importance Ordering:**

The order reflects the narrative emphasis and the amount of time spent at each location. The Dulce Base Command Center and its corridors are central to the story's plot and the majority of the team's interactions with Alex. The briefing room, while important for establishing the mission, is visited less frequently.  The other locations serve specific narrative purposes but are not central to the overall mission's execution in the same way.






 ------ GRAPH RAG -------------------------






Based on the provided text and knowledge graph, here's a markdown table summarizing Alex Mercer's location visits, ordered by the strength of his relationship with the location (as indicated by the knowledge graph's strength scores), including the people he visited those places with:


| Place                     | Strength | Frequency | People Visited With                     | Notes                                                                          |
|--------------------------|----------|-----------|-----------------------------------------|---------------------------------------------------------------------------------|
| Dulce Base                | 9        | Multiple   | Taylor Cruz, Jordan Hayes, Sam Rivera     | Central location for Operation: Dulce; mentioned in almost every chapter        |
| Mainframe Room            | 8        | Multiple   | Taylor Cruz, Sam Rivera, Jordan Hayes     | Significant activity related to alien communications; multiple mentions           |
| Sub-Level                 | 8        | Multiple   | Taylor Cruz, Jordan Hayes, Sam Rivera     | Descent into the base's deeper levels; mentioned in multiple chapters             |
| Briefing Room             | 7        | Once       | Taylor Cruz, Jordan Hayes, Sam Rivera     | Initial briefing for Operation: Dulce; mentioned in Chapter 1 and Chapter 3     |
| Command Center            | 9        | Multiple   | Taylor Cruz, Jordan Hayes, Sam Rivera     | Frequent presence for monitoring and operation control                         |
| Server Room (implied)     | -        | Once       | Sam Rivera                               | Inferred from Chapter 2 interaction with Sam.                                  |
| New Mexico (Dulce Location)| 9        | Multiple   | Taylor Cruz, Jordan Hayes, Sam Rivera     | Overarching location of Dulce Base; mentioned repeatedly                         |
| Lab (implied)            | -        | Once       | Jordan Hayes                               | Inferred from Chapter 3 interaction with Jordan in a lab setting.              |
| Deserted Corridor (implied) | -        | Once       |  None mentioned                            | Inferred from Chapter 1, a fleeting moment before the briefing.                |
| Underground Meeting Room (implied) | -        | Once       | Taylor Cruz, Jordan Hayes, Sam Rivera     | Chapter 2, before the descent.                                                    |
| Elevator Shaft            | -        | Once       | Taylor Cruz, Jordan Hayes, Sam Rivera     | Descent into the sub-level; Chapter 2                                         |


**Note:**  The frequency and specific individuals present are inferred from the narrative, as the documents don't explicitly list every single visit. Strength scores are directly from the knowledge graph.  Places without explicit strength scores are based on context from the narrative. The order is based on the average strength scores.  If two places have the same average strength score, the order is based on the frequency of mentions.


In [7]:
imagen = genai.ImageGenerationModel("imagen-3.0-generate-001")

result = imagen.generate_images(
    prompt="Fuzzy bunnies in my kitchen",
    number_of_images=4,
    safety_filter_level="block_only_high",
    person_generation="allow_adult",
    aspect_ratio="3:4",
    negative_prompt="Outside",
)

for image in result.images:
  print(image)

# The output should look similar to this:
# <vertexai.preview.vision_models.GeneratedImage object at 0x78f3396ef370>
# <vertexai.preview.vision_models.GeneratedImage object at 0x78f3396ef700>
# <vertexai.preview.vision_models.GeneratedImage object at 0x78f33953c2b0>
# <vertexai.preview.vision_models.GeneratedImage object at 0x78f33953c280>

for image in result.images:
  # Open and display the image using your local operating system.
  image._pil_image.show()

NotFound: 404 models/imagen-3.0-generate-001 is not found for API version v1beta, or is not supported for predict. Call ListModels to see the list of available models and their supported methods.