# Using SPARQL and the Knowledge Graph for RAG
Now that the Knowledge Graph has been created, we will use it for local and global search.

![](../images/RAG_with_KnowledgeGraph.png)

In [2]:
import pandas as pd
import os
import urllib.parse
import ast
from io import StringIO
from SPARQLWrapper import SPARQLWrapper, CSV, SELECT, POST, POSTDIRECTLY
import numpy as np
from typing import Dict, Any, List
from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from elasticsearch import Elasticsearch
import json
from tqdm import tqdm

In [3]:
# Adjust pandas display settings
pd.set_option(
    "display.max_colwidth", None
)  # Set to None to display the full column width
pd.set_option("display.max_columns", None)

In [11]:
# endpoint for GraphDB
endpoint = "http://localhost:7200/repositories/msft-graphrag-300"

In [45]:
es_username = 'elastic'
es_password = ''  # put your password here

In [12]:
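def sparql_query(query: str, post: bool = False) -> pd.DataFrame:
    """Run a SPARQL query and return the CSV results as a DataFrame (POST if `post` is set)"""
    sparql_conn.setMethod(POST if post else "GET")  # POST suits long queries
    sparql_conn.setQuery(query)
    sparql_conn.setReturnFormat(CSV)
    results = sparql_conn.query().convert()
    return pd.read_csv(StringIO(results.decode('utf-8')), sep=",")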

In [13]:
sparql_conn = SPARQLWrapper(endpoint)

## Setup the question
The question we will ask is:
```
What is the relationship between Bob Cratchit and Belinda Cratchit?
```

In [14]:
question_text = "What is the relationship between Bob Cratchit and Belinda Cratchit?"

## Convert the question text
The first step is to convert the question into an embedding vector. To do this we will use our local LM Studio instance and call its OpenAI-compatible embeddings endpoint.

In [16]:
def get_embedding(text: str, client: Any, model: str="CompendiumLabs/bge-large-en-v1.5-gguf"):
    """Convert the text into an embedding vector using the model provided

    :param text: text to be converted to an embedding vector
    :param client: OpenAI client
    :param model: name of the model to use for encoding
    """
    text = text.replace("\n", " ")
    return client.embeddings.create(input = [text], model=model).data[0].embedding

In [17]:
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
embedding_vector = get_embedding(question_text, client=client)

Set the limits for our searches.

In [18]:
top_chunks = 3
top_communities = 3
top_outside_relationships = 10
top_inside_relationships = 10
top_entities = 10

## Find nearest Entities
Now we need to use our Elasticsearch index to do a k-nearest neighbour search for our **embedding_vector** to find the 10 nearest `Entity` instances.

In [19]:
es = Elasticsearch("http://localhost:9200", 
                   basic_auth=(es_username, es_password), 
                   verify_certs=False)

In [21]:
query = {
    "field" : "description_embedding" ,
    "query_vector" : embedding_vector,
    "k" : top_entities,
    "num_candidates" : 100 ,
}
index_name = "entity_graph_index" 
res = es.search(index=index_name, knn=query, source=["id"])
search_results = res["hits"]["hits"]
# convert our results into a list of Entities
# This list will be ordered by match score descending (i.e. the more likely matches will be at the beginning)
entity_list = [x['_id'] for x in search_results]
entity_list

['dde131ab575d44dbb55289a6972be18f',
 '68105770b523412388424d984e711917',
 '40e4ef7dbc98473ba311bd837859a62a',
 '254770028d7a4fa9877da4ba0ad5ad21',
 'da1684437ab04f23adac28ff70bd8429',
 'd91a266f766b4737a06b0fda588ba40b',
 'bc0e3f075a4c4ebbb7c7b152b65a5625',
 '23becf8c6fca4f47a53ec4883d4bf63f',
 '496f17c2f74244c681db1b23c7a39c0c',
 '3d0dcbc8971b415ea18065edc4d8c8ef']
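
Conceptually, a k-nearest-neighbour search ranks stored vectors by their similarity to the query vector. The dependency-free sketch below does this by brute force with cosine similarity on toy 3-dimensional vectors. The vectors, ids, and the `cosine_similarity` and `top_k` helpers are made up for illustration. Elasticsearch uses an approximate index rather than a full scan, and the similarity metric it applies depends on how the index field was mapped.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], vectors: Dict[str, List[float]], k: int) -> List[str]:
    """Return the ids of the k vectors most similar to the query, best first."""
    ranked = sorted(
        vectors,
        key=lambda key: cosine_similarity(query, vectors[key]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: hypothetical ids and embeddings
toy_vectors = {
    "entity_a": [1.0, 0.0, 0.0],
    "entity_b": [0.9, 0.1, 0.0],
    "entity_c": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.0, 0.1], toy_vectors, k=2)
print(nearest)
```

As with the Elasticsearch results, the returned ids are ordered from best match to worst.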

In [22]:
def get_entities(nodes: List[str]) -> str:
    """Get a SPARQL query that will fetch details of the Entities that are in the list
    
    :param nodes: list of Entity ids
    :returns: a SPARQL query string
    """

    query = """
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>
    
    SELECT ?id ?description
    WHERE
    {
        ?entity_uri a gr:Entity;
        gr:id ?id;
        gr:description ?entity_desc .
        BIND(REPLACE(?entity_desc, "\\r\\n", " ", "i") AS ?description)
    """
    first = True
    for node in nodes:
        if first:
            query += " FILTER( "
        else:
            query += " || "
        query += f' ?id = "{node}" '
        first = False
    query += """ )
    }
    """
    return query
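
The chain of `||` comparisons could also be written with SPARQL's `IN` operator, which keeps the generated query shorter. The sketch below is one possible way to build such a clause. `build_id_filter` is a hypothetical helper, not part of the notebook, and it assumes the ids contain no quote characters (true for the hex ids used here).

```python
from typing import List


def build_id_filter(ids: List[str], var: str = "?id") -> str:
    """Return a SPARQL FILTER restricting `var` to the given ids (hypothetical helper)."""
    if not ids:
        # An empty list should match nothing; FILTER(false) keeps the query valid
        return "FILTER(false)"
    quoted = ", ".join(f'"{i}"' for i in ids)
    return f"FILTER({var} IN ({quoted}))"


print(build_id_filter(["dde131ab", "68105770"]))
```

Handling the empty list explicitly avoids emitting a dangling `)` when no entities are found.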

In [23]:
# Get details about the Entities that were found
entities_df =sparql_query(get_entities(entity_list))
entities_df

Unnamed: 0,id,description
0,dde131ab575d44dbb55289a6972be18f,"Here is a comprehensive summary of the data provided: ""Belinda Cratchit is one of Bob Cratchit's daughters, known for her bravery as evidenced by her wearing ribbons. She assists her mother, Mrs. Cratchit, with household tasks such as managing the cloth."""
1,68105770b523412388424d984e711917,Belinda and Martha Cratchit are daughters of Mrs. Cratchit.
2,40e4ef7dbc98473ba311bd837859a62a,"Miss Belinda is a member of Mrs. Cratchit's family, assisting with food preparation and serving as a character responsible for changing the plates during meals. She also helps with other tasks such as sweetening the apple sauce. As a servant in the Cratchit household, Miss Belinda plays an important role in supporting her family during dinner."
3,254770028d7a4fa9877da4ba0ad5ad21,"Tim Cratchit, also known as Tiny Tim, is a cripple and the youngest son of Bob Cratchit."
4,da1684437ab04f23adac28ff70bd8429,
5,d91a266f766b4737a06b0fda588ba40b,"Here is a comprehensive summary of Bob Cratchit: Bob Cratchit is a character in Charles Dickens' novel ""A Christmas Carol."" He is Peter's father, reflecting on the importance of family relationships and remembering Tiny Tim. As a clerk, he works for Ebenezer Scrooge, but his salary is raised by Scrooge, which helps him support his struggling family. Bob Cratchit has daughters and a wife, and he praises his wife's cooking skills, particularly her pudding. He is also the father of Master Peter Cratchit and Tiny Tim, whom he loves and carries on his shoulder. In addition to being a devoted father and husband, Bob Cratchit is also an employee who arrives late to work, prompting Scrooge's desire for revenge. However, after visiting Mr. Scrooge's nephew, Bob becomes emotional and reconciled to his situation. He is optimistic about his son Peter's future, mentioning a potential job opportunity. On Christmas Eve, Bob Cratchit goes down a slide on Cornhill in honor of the holiday. He also serves dinner to his loved ones and expresses gratitude for their presence. When he visits Scrooge's house, he apologizes for being late and explains that he was making merry the previous day. Overall, Bob Cratchit is a kind and loving father who values family relationships and is grateful for the blessings in his life."
6,bc0e3f075a4c4ebbb7c7b152b65a5625,"Here is a comprehensive summary of the data provided: PETER CRATCHIT is a character in a narrative, who is the subject of conversation about his future and relationships. He is also known to be the son of BOB CRATCHIT. Note: I have included both entities mentioned in the original description, as they are related to PETER CRATCHIT."
7,23becf8c6fca4f47a53ec4883d4bf63f,The Cratchits are a family who are working together to help Bob.
8,496f17c2f74244c681db1b23c7a39c0c,"Master Peter Cratchit is Bob and Mrs. Cratchit's son who wears a distinctive shirt-collar and takes pride in his formal attire. He also assists with the family meal, showcasing his responsible nature as the heir to the Cratchit family."
9,3d0dcbc8971b415ea18065edc4d8c8ef,"Here is a comprehensive summary of Mrs. Cratchit: Mrs. Cratchit, the wife of Bob Cratchit, is a mother figure who interacts with her family members, showing concern for their well-being. She is portrayed as a proud and skilled cook who has made a successful pudding for her family to enjoy on Christmas Day. As a devoted mother, she takes pride in serving her family a sufficient and delicious dinner, including making the gravy for Christmas dinner. Mrs. Cratchit is also brave in ribbons and expresses admiration for Bob's kindness. She is nervous about serving the pudding to her family but ultimately succeeds in providing them with a wonderful meal. Additionally, she is concerned about her family's well-being, particularly Tiny Tim's behavior on Christmas Day, which she discusses with her husband. Mrs. Cratchit also wishes to give Mr. Scrooge a piece of her mind for his miserly ways, showing that she values kindness and generosity. Overall, Mrs. Cratchit is a loving and hardworking mother who plays an important role in the Cratchit family's life."


In [7]:
def convert_df_to_text(df: pd.DataFrame, key_name: str, colname: str) -> str:
    """Convert a DataFrame column to JSON-like text suitable for LLM context
    
    :param df: input DataFrame
    :param key_name: name of the key to use
    :param colname: name of the column in the DataFrame to use
    :returns: string suitable for LLM context
    """
    # Skip missing values (NaN), which cannot be concatenated with strings
    values = [str(v) for v in df[colname].dropna()]
    # Quote each value and put one value per line
    return "{\"" + key_name + "\": [\n" + ",\n".join("\"" + v + "\"" for v in values) + "]}"
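
An alternative to building the string by hand is to let the standard `json` module do the quoting, which also escapes any double quotes inside the descriptions. This is a minimal sketch rather than the notebook's method. `df_to_json_context` and the sample DataFrame are hypothetical.

```python
import json

import pandas as pd


def df_to_json_context(df: pd.DataFrame, key_name: str, colname: str) -> str:
    """Serialise one DataFrame column as a JSON object with a single list value."""
    # dropna() removes missing descriptions before serialising
    values = [str(v) for v in df[colname].dropna()]
    return json.dumps({key_name: values})


# Hypothetical sample with an embedded quote and a missing value
demo_df = pd.DataFrame({"description": ['He said "hi"', None, "plain"]})
demo_text = df_to_json_context(demo_df, "Entities", "description")
print(demo_text)
```

Because the result is valid JSON, it can be parsed back with `json.loads` to check what was sent to the model.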

In [33]:
entity_text = convert_df_to_text(entities_df, 'Entities', 'description')

## Get The Top 3 Chunks
Get the `Chunk` records that are connected to these `Entity` records, rank them by how many of the Entities each one contains, and take the top 3.

In [24]:
def get_text_mapping(nodes: List[str], limit_chunks: int = 3) -> str:
    """Get a SPARQL query that fetches the top Chunks that are connected to Entity records
    
    :param nodes: list of Entity ids
    :param limit_chunks: how many chunks to return
    :returns: a SPARQL query string
    """
    query = """
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>
    
    SELECT 
    ?chunkText 
    (COUNT(?entity_uri) AS ?freq)
    WHERE {
        ?chunk_uri gr:has_entity ?entity_uri;
        gr:text ?chunk_text .
    """
    first = True
    for node in nodes:
        if not first:
            query += " UNION "
        query += f"""
        {{
            ?entity_uri a gr:Entity;
            gr:id "{node}" .
        }} 
        """
        first = False
    query += """
        BIND(REPLACE(?chunk_text, "\\r\\n", " ") as ?chunkText)
    }
    GROUP BY ?chunk_uri ?chunkText
    ORDER BY DESC(?freq)
    """
    query += f" LIMIT {limit_chunks} "
    return query

In [26]:
# Let's find the Chunks that are most likely to contain the information we're looking for
text_mapping_df = sparql_query(get_text_mapping(entity_list, limit_chunks=top_chunks))
text_mapping_df

Unnamed: 0,chunkText,freq
0,"roof quite as gracefully and like a supernatural creature as it was possible he could have done in any lofty hall. And perhaps it was the pleasure the good Spirit had in showing off this power of his, or else it was his own kind, generous, hearty nature, and his sympathy with all poor men, that led him straight to Scrooge's clerk's; for there he went, and took Scrooge with him, holding to his robe; and on the threshold of the door the Spirit smiled, and stopped to bless Bob Cratchit's dwelling with the sprinklings of his torch. Think of that! Bob had but fifteen 'Bob' a week himself; he pocketed on Saturdays but fifteen copies of his Christian name; and yet the Ghost of Christmas Present blessed his four-roomed house! Then up rose Mrs. Cratchit, Cratchit's wife, dressed out but poorly in a twice-turned gown, but brave in ribbons, which are cheap, and make a goodly show for sixpence; and she laid the cloth, assisted by Belinda Cratchit, second of her daughters, also brave in ribbons; while Master Peter Cratchit plunged a fork into the saucepan of potatoes, and getting the corners of his monstrous shirt-collar (Bob's private property, conferred upon his son and heir in honour of the day,) into his",5
1,"debtors. Mrs. Cratchit, wife of Bob Cratchit. Belinda and Martha Cratchit, daughters of the preceding. Mrs. Dilber, a laundress. Fan, the sister of Scrooge. Mrs. Fezziwig, the worthy partner of Mr. Fezziwig. CONTENTS STAVE ONE--MARLEY'S GHOST 3 STAVE TWO--THE FIRST OF THE THREE SPIRITS 37 STAVE THREE--THE SECOND OF THE THREE SPIRITS 69 STAVE FOUR--THE LAST OF THE SPIRITS 111 STAVE FIVE--THE END OF IT 137 LIST OF ILLUSTRATIONS _IN COLOUR_ ""How now?"" said Scrooge, caustic and cold as ever. ""What do you want with me?"" _Frontispiece_ Bob Cratchit went down a slide on Cornhill, at the end of a lane of boys, twenty times, in honour of its being Christmas Eve 16 Nobody under the bed; nobody in the closet; nobody in his dressing-gown, which was hanging up in a suspicious attitude against the wall 20 The air was filled with phantoms, wandering hither",3
2,"ice-turned gown, but brave in ribbons, which are cheap, and make a goodly show for sixpence; and she laid the cloth, assisted by Belinda Cratchit, second of her daughters, also brave in ribbons; while Master Peter Cratchit plunged a fork into the saucepan of potatoes, and getting the corners of his monstrous shirt-collar (Bob's private property, conferred upon his son and heir in honour of the day,) into his mouth, rejoiced to find himself so gallantly attired, and yearned to show his linen in the fashionable Parks. And now two smaller Cratchits, boy and girl, came tearing in, screaming that outside the baker's they had smelt the goose, and known it for their own; and basking in luxurious thoughts of sage and onion, these young Cratchits danced about the table, and exalted Master Peter Cratchit to the skies, while he (not proud, although his collars nearly choked him) blew the fire, until the slow potatoes, bubbling up, knocked loudly at the saucepan-lid to be let out and peeled. 'What has ever got your precious father, then?' said Mrs. Cratchit. 'And your brother, Tiny Tim? And Martha warn't as late last Christmas Day by half an hour!' 'Here's Martha, mother!' said a girl,",3


In [34]:
chunk_text = convert_df_to_text(text_mapping_df, 'Chunks', 'chunkText')
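
The ranking performed by the `GROUP BY` and `COUNT` in the query can be pictured in plain Python. The sketch below counts, for each chunk, how many of the selected entities it mentions and keeps the highest counts. The chunk ids, entity names, and the `rank_chunks` helper are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def rank_chunks(
    chunk_entities: Dict[str, Set[str]], selected: Set[str], limit: int
) -> List[Tuple[str, int]]:
    """Return (chunk_id, match_count) pairs, highest count first."""
    freq: Counter = Counter()
    for chunk_id, entities in chunk_entities.items():
        matches = len(entities & selected)
        if matches:
            # Chunks with no selected entity are left out, as in the SPARQL join
            freq[chunk_id] = matches
    return freq.most_common(limit)


# Hypothetical chunk-to-entity links
toy_links = {
    "chunk_1": {"bob", "belinda", "peter"},
    "chunk_2": {"bob"},
    "chunk_3": {"scrooge"},
    "chunk_4": {"belinda", "bob"},
}
ranked = rank_chunks(toy_links, {"bob", "belinda", "peter"}, limit=2)
print(ranked)
```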

## Get the Top 3 Communities
Get the top 3 `Community` records that are related to these `Entity` records.

In [31]:
def get_report_mapping(nodes: List[str], limit_communities: int = 3) -> str:
    """Get a SPARQL query that fetches the top Communities the Entities belong to
    
    :param nodes: list of Entity ids
    :param limit_communities: how many communities
    :returns: a SPARQL query string
    """
    query = """
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>

    SELECT ?community_uri ?rank ?weight ?summary
    WHERE
    {
        ?community_uri a gr:Community;
          gr:rank ?rank;
          gr:weight ?weight;
          gr:summary ?community_summary .
        BIND(REPLACE(?community_summary, "\\r\\n", " ", "i") AS ?summary)
        ?entity_uri gr:in_community ?community_uri .
    """
    first = True
    for node in nodes:
        if not first:
            query += " UNION "
        query += f"""
        {{
            ?entity_uri a gr:Entity;
            gr:id "{node}" .
        }} 
        """
        first = False
    query += """
    }
    GROUP BY ?rank ?weight ?community_uri ?summary
    ORDER BY DESC(?rank) DESC(?weight)
    """
    query += f" LIMIT {limit_communities} "
    return query

In [32]:
# Get the top communities that these Entities are part of
report_mapping_df = sparql_query(get_report_mapping(entity_list, limit_communities=top_communities))
report_mapping_df

Unnamed: 0,community_uri,rank,weight,summary
0,http://ormynet.com/ns/data#Community_6,8.0,223,"This community revolves around Ebenezer Scrooge, a complex character with relationships to various entities. The community's structure includes Scrooge's interactions with his nephew Fred, the Ghost of Christmas Past, and other supernatural entities."
1,http://ormynet.com/ns/data#Community_22,8.0,223,"This community revolves around Ebenezer Scrooge, a complex character with a multifaceted personality. He has relationships with various entities, including his nephew, clerk, and the Ghost of Christmas Past, Present, and Yet to Come. The community's dynamics are influenced by Scrooge's interactions with these entities, which shape his perspective on kindness, generosity, and redemption."
2,http://ormynet.com/ns/data#Community_2,6.0,160,"The community revolves around the Cratchit family, with relationships between Bob Cratchit, his wife Mrs. Cratchit, and their children, including Tiny Tim. The family is connected to Mr. Scrooge through employment and personal interactions."


In [35]:
reports_text = convert_df_to_text(report_mapping_df, 'Reports', 'summary')

## Get The Outside & Inside Relationships
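Get the outside and inside relationships for the `Entity` records. In the queries below, a relationship counts as inside when its target is one of the selected Entities, and as outside otherwise.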

In [27]:
def get_outside_relationships(nodes: List[str], limit_outside_relationships: int = 10) -> str:
    """Get a SPARQL query to fetch the outside relationships
    
    :param nodes: list of Entity ids
    :param limit_outside_relationships: how many relationships to return
    :returns: a SPARQL query string
    """
    query = """
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>
    
    SELECT 
    ?description
    ?entity_from_id ?entity_to_id
    ?rank ?weight
    WHERE {
        ?related_to_uri a gr:related_to;
            gr:id ?id;
            gr:rank ?rank;
            gr:description ?desc;
            gr:weight ?weight .
        BIND(REPLACE(?desc, "\\r\\n", "") as ?description)
        ?entity_from_uri ?related_to_uri ?entity_to_uri .
        ?entity_from_uri gr:id ?entity_from_id .
        ?entity_to_uri gr:id ?entity_to_id .
    """
    first = True
    for node in nodes:
        if first:
            query += " FILTER( "
        else:
            query += " && "
        query += f"""
    ?entity_to_id != "{node}" """
        first = False
    query += """
               )
    }
    ORDER BY DESC(?rank) DESC(?weight)
    """
    query += f" LIMIT {limit_outside_relationships} "
    return query

In [28]:
# Get the top relationships whose target is not one of these Entities
outside_relationships_df = sparql_query(get_outside_relationships(entity_list, limit_outside_relationships=top_outside_relationships))
outside_relationships_df

Unnamed: 0,description,entity_from_id,entity_to_id,rank,weight
0,"Here is a comprehensive summary of the data provided:Bob Cratchit and Scrooge have a complex relationship that evolves throughout their interactions. Initially, Scrooge employs Bob as his clerk, indicating a professional relationship between them. However, this relationship also reveals Scrooge's miserly nature, as he scolds Bob for coming late to work and seeks revenge on him for being tardy.Despite these negative interactions, there is evidence of a positive relationship between the two. Bob defends Scrooge as the Founder of the Feast, suggesting that Scrooge has made an effort to be involved in his community and show kindness towards others. Furthermore, Scrooge raises Bob's salary and offers to assist his struggling family, indicating a change in his behavior towards Christmas.This change in behavior suggests that Scrooge is capable of growth and development, particularly when it comes to showing compassion and generosity during the holiday season. Overall, the relationship between Bob Cratchit and Scrooge is multifaceted, reflecting both their professional obligations as employer and employee, as well as their personal interactions and evolving dynamics.Note: I have resolved the contradictions in the descriptions by highlighting the complexities of their relationship and emphasizing the positive changes that Scrooge undergoes throughout the story.",d91a266f766b4737a06b0fda588ba40b,e2bf260115514fb3b252fd879fb3e7be,125,6.0
1,"Mrs. Cratchit expresses her negative opinion about Scrooge, indicating a strained relationship between them.",3d0dcbc8971b415ea18065edc4d8c8ef,e2bf260115514fb3b252fd879fb3e7be,125,1.0
2,"Here is a comprehensive summary of the data provided:**SCROOGE**Scrooge has a strained relationship with his nephew, which he often displays through dismissiveness towards his nephew's attempts at friendship. However, despite this, Scrooge is occasionally surprised by his nephew's laughter, which brings him joy. This suggests that beneath his gruff exterior, Scrooge may have a softer side that he only reveals in moments of genuine connection with his nephew.**SCROOGE'S NEPHEW**Scrooge's Nephew has a comical and familiar relationship with Scrooge, often describing him in a lighthearted way that implies a sense of criticism or playful teasing. Despite this, the nephew is determined to bring joy to his uncle's life, as evidenced by his efforts to attend Christmas with Scrooge and maintain a good temper.Overall, the relationship between Scrooge and his Nephew appears to be complex and multifaceted, with moments of tension and conflict alongside instances of genuine affection and connection.",e2bf260115514fb3b252fd879fb3e7be,48c0c4d72da74ff5bb926fa0c856d1a7,124,5.0
3,"Scrooge interacts with The Spirit, who serves as a guide and mentor to him, guiding him through a series of visions and transformations that help him recall his past memories and emotions, while also showing him visions of his past, present, and future. The Spirit appears to Scrooge in various settings, including the streets of London, a baker's doorway, and his clerk's house, attempting to persuade him to walk with it and teach him precepts that leave blessings. Through these interactions, The Spirit helps Scrooge see the error of his ways and change his behavior by guiding him through visions of his past and future, ultimately representing his own mortality and the unknown future.",e2bf260115514fb3b252fd879fb3e7be,de61b2670999433f807a6a1dc2b81e43,122,14.0
4,"Here is a comprehensive summary of the data provided:""Scrooge, who initially appears to be a miserly character, surprisingly becomes a second father to Tiny Tim, showing kindness and generosity towards him. He takes an interest in the child's well-being and hopes that he will be spared from illness, demonstrating a softer side to his personality.""",e2bf260115514fb3b252fd879fb3e7be,4517768fc4e24bd2a790be0e08a7856e,122,2.0
5,"Ebenezer Scrooge and Jacob Marley were business partners for an unknown number of years, with Scrooge serving as Marley's executor and administrator after his death. However, their partnership ended when Marley passed away seven years prior to the events of the story. Scrooge is haunted by Marley's ghost, which has a profound impact on him, and he also sees Marley's face in the knocker, indicating a strong connection between them. The appearance of Marley's ghost is a supernatural or paranormal event that serves as a catalyst for Scrooge's transformation.",e2bf260115514fb3b252fd879fb3e7be,32ee140946e5461f9275db664dc541a5,120,5.0
6,"The comprehensive summary of the data provided is as follows:Scrooge is affected by The Woman's laughter and finds himself shuddering from head to foot. Additionally, it appears that The Woman has a lack of respect for Scrooge, as evidenced by her reaction to the idea of profiting from his death.",e2bf260115514fb3b252fd879fb3e7be,adf4ee3fbe9b4d0381044838c4f889c8,120,2.0
7,"Here is the comprehensive summary of the data provided:**Scrooge**Scrooge and The Ghost are accompanying each other to the mansion. During their interaction, Scrooge asks questions about The Ghost's presence and purpose. Through this conversation, The Ghost guides Scrooge through his past and present life, influencing his perspective on his own life and behavior. The Ghost also warns Scrooge of impending events and tells him to remember their conversation.As they interact, The Ghost leads Scrooge to reflect on his past and his relationships with others. This reflection is influenced by the visions that The Ghost shows Scrooge, teaching him a lesson about kindness and compassion. Through these visions, The Ghost highlights the consequences of Scrooge's actions and encourages him to change.However, their interaction also involves The Ghost rebuking Scrooge for his wicked behavior, leading to his penitence and grief. Additionally, The Ghost tortures Scrooge by forcing him to observe uncomfortable scenes, causing Scrooge pain and discomfort.Despite this, Scrooge is ultimately guided by The Ghost as they walk through familiar places from Scrooge's past. This experience has a profound impact on Scrooge, leading him to reevaluate his life and behavior.**The Ghost**The Ghost is a supernatural entity that accompanies Scrooge to the mansion. It interacts directly with Scrooge, asking questions about its presence and purpose. Through this conversation, The Ghost guides Scrooge through his past and present life, influencing his perspective on his own life and behavior.As they interact, The Ghost warns Scrooge of impending events and tells him to remember their conversation. 
The Ghost also tries to persuade Scrooge to stay longer, but ultimately allows him to leave after a short time.Overall, the interaction between Scrooge and The Ghost is a transformative experience for both entities, leading to a deeper understanding of themselves and each other.",e2bf260115514fb3b252fd879fb3e7be,bb9e01bc171d4326a29afda59ece8d17,119,12.0
8,"The entity ""SCROOGE"" has a complex relationship with the holiday ""CHRISTMAS"". Initially, Scrooge views Christmas as something to be avoided, seeing it as a time for financial struggles and debt. However, through his experiences, he undergoes a change in attitude, deciding to honour Christmas in his heart and making a commitment to do so throughout the year. This shift indicates a transformation in his perspective and behaviour, moving away from his miserly personality and focus on business towards a more festive and generous spirit.",e2bf260115514fb3b252fd879fb3e7be,958beecdb5bb4060948415ffd75d2b03,119,5.0
9,"Fezzigw and Scrooge had a significant connection in the past. Fezzigw organized a Christmas celebration with Scrooge and Dick, showcasing his kindness and generosity. This event left a lasting impact on Scrooge, who remembered it fondly and became aware of the Ghost's presence during this scene. Furthermore, Scrooge acknowledged the positive influence of Fezziwig's kindness on his employees, highlighting its importance in their lives. Overall, Scrooge had a positive relationship with Fezziwig, who was a kind employer and friend to him in the past.",e2bf260115514fb3b252fd879fb3e7be,d64ed762ea924caa95c8d06f072a9a96,119,4.0


In [29]:
def get_inside_relationships(nodes: List[str], limit_inside_relationships: int = 10) -> str:
    """Get a SPARQL query to fetch the inside relationships
    
    :param nodes: list of Entity ids
    :param limit_inside_relationships: how many relationships to return
    :returns: a SPARQL query string
    """
    query = """
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>
    SELECT 
    ?description
    ?entity_from_id ?entity_to_id
    ?rank ?weight
    WHERE {
        ?related_to_uri a gr:related_to;
            gr:id ?id;
            gr:rank ?rank;
            gr:description ?desc;
            gr:weight ?weight .
        BIND(REPLACE(?desc, "\\r\\n", "") as ?description)
        ?entity_from_uri ?related_to_uri ?entity_to_uri .
        ?entity_from_uri gr:id ?entity_from_id .
        ?entity_to_uri gr:id ?entity_to_id .
    """
    first = True
    for node in nodes:
        if first:
            query += " FILTER( "
        else:
            query += " || "
        query += f"""
        ?entity_to_id = "{node}" """
        first = False
    query += """
               )
    }
    ORDER BY DESC(?rank) DESC(?weight)
    """
    query += f" LIMIT {limit_inside_relationships} "
    return query

In [30]:
# Get the top relationships whose target is one of these Entities
inside_relationships_df = sparql_query(get_inside_relationships(entity_list, limit_inside_relationships=top_inside_relationships))
inside_relationships_df

Unnamed: 0,description,entity_from_id,entity_to_id,rank,weight
0,"Bob Cratchit and Mrs. Cratchit are a married couple who form a loving family unit, with Mrs. Cratchit being his supportive wife.",d91a266f766b4737a06b0fda588ba40b,3d0dcbc8971b415ea18065edc4d8c8ef,22,2.0
1,Mrs. Cratchit is part of the family unit that is working together to help Bob.,3d0dcbc8971b415ea18065edc4d8c8ef,23becf8c6fca4f47a53ec4883d4bf63f,13,1.0
2,Belinda Cratchit is a daughter of Bob Cratchit.,d91a266f766b4737a06b0fda588ba40b,dde131ab575d44dbb55289a6972be18f,12,1.0
3,Mrs. Cratchit and Miss Belinda collaborate on food preparation for the family gathering.,3d0dcbc8971b415ea18065edc4d8c8ef,40e4ef7dbc98473ba311bd837859a62a,12,1.0
4,"Peter Cratchit receives affection and concern from his mother, Mrs. Cratchit.",bc0e3f075a4c4ebbb7c7b152b65a5625,3d0dcbc8971b415ea18065edc4d8c8ef,12,1.0
5,The Spirit blesses Bob Cratchit's four-roomed house with the sprinklings of its torch.,de61b2670999433f807a6a1dc2b81e43,da1684437ab04f23adac28ff70bd8429,9,1.0
6,"Bob's family is helping him, showing their love and support for each other.",4bc7440b8f4b4e4cae65a5c49defa923,23becf8c6fca4f47a53ec4883d4bf63f,6,1.0
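
The two queries differ only in their `FILTER`: both test the target id of each relationship against the entity list. The sketch below mirrors that split on a toy edge list. The edges and the `split_by_target` helper are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (entity_from_id, entity_to_id)


def split_by_target(edges: List[Edge], selected: Set[str]) -> Tuple[List[Edge], List[Edge]]:
    """Partition edges by whether their target is in the selected set."""
    inside = [e for e in edges if e[1] in selected]
    outside = [e for e in edges if e[1] not in selected]
    return inside, outside


# Hypothetical relationships between characters
toy_edges = [("bob", "belinda"), ("bob", "scrooge"), ("scrooge", "marley")]
inside, outside = split_by_target(toy_edges, {"bob", "belinda"})
print(inside)
print(outside)
```

Note that an edge such as `("scrooge", "marley")` lands in the outside group even though neither endpoint was selected, because only the target is tested. This is consistent with the outside results above, several of which link Scrooge to characters that are not in the entity list.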


In [36]:
# Combine the inside and outside relationship descriptions into a single context block
relationships_df = pd.concat([inside_relationships_df, outside_relationships_df], ignore_index=True)
relationships_text = convert_df_to_text(relationships_df, 'Relationships', 'description')

## Create LangChain Response
Having gathered the relevant data for our identified entities, we now combine it into a context and pass it to an LLM through LangChain.

In [44]:
llm = ChatOpenAI(
    model="lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    api_key="lm-server",
    base_url="http://localhost:1234/v1"
)

## Prompt with no context
Prompt the LLM with no context from the Knowledge Graph.

In [38]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that answers questions about a book.",
        ),
        ("human", "{input}"),
    ]
)
chain = prompt | llm | StrOutputParser()
chain.invoke(
    {
        "input": question_text,
    }
)

'I think there may be some confusion here!\n\nIn Charles Dickens\' classic novel "A Christmas Carol", Bob Cratchit\'s wife is actually named Emily, not Belinda.\n\nBob Cratchit is a kind and hardworking clerk who works for Ebenezer Scrooge. He is the father of six children: Peter, Belle (not Belinda), Tiny Tim, and three other unnamed children.'

## Prompt with context from Knowledge Graph
Create a context using the Knowledge Graph and feed that to the LLM.

In [39]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that answers questions about a book.",
        ),
        ("human", "{context} {input}"),
    ]
)
chain = prompt | llm | StrOutputParser()
chain.invoke(
    {
        "context": entity_text + "," + chunk_text + "," + relationships_text +"," + reports_text,
        "input": question_text,
    }
)

"According to the data provided, Belinda Cratchit is a daughter of Bob Cratchit. They are related as parent and child. Additionally, it is mentioned that Mrs. Cratchit (Bob's wife) assists Belinda with household tasks such as managing the cloth, indicating a close family relationship between them."

## Global Query
We'll set up a global query with two different prompt templates.

In [40]:
MAP_SYSTEM_PROMPT = """
---Role---

You are a helpful assistant responding to questions about data in the tables provided.


---Goal---

Generate a response consisting of a list of key points that responds to the user's question, summarizing all relevant information in the input data tables.

You should use the data provided in the data tables below as the primary context for generating the response.
If you don't know the answer or if the input data tables do not contain sufficient information to provide an answer, just say so. Do not make anything up.

Each key point in the response should have the following element:
- Description: A comprehensive description of the point.
- Importance Score: An integer score between 0-100 that indicates how important the point is in answering the user's question. An 'I don't know' type of response should have a score of 0.

The response should be JSON formatted as follows:
{{
    "points": [
        {{"description": "Description of point 1 [Data: Reports (report ids)]", "score": score_value}},
        {{"description": "Description of point 2 [Data: Reports (report ids)]", "score": score_value}}
    ]
}}

The response shall preserve the original meaning and use of modal verbs such as "shall", "may" or "will".

Points supported by data should list the relevant reports as references as follows:
"This is an example sentence supported by data references [Data: Reports (report ids)]"

**Do not list more than 5 record ids in a single reference**. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.

For example:
"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Reports (2, 7, 64, 46, 34, +more)]. He is also CEO of company X [Data: Reports (1, 3)]"

where 1, 2, 3, 7, 34, 46, and 64 represent the id (not the index) of the relevant data report in the provided tables.

Do not include information where the supporting evidence for it is not provided.


---Data tables---

{context_data}

---Goal---

Generate a response consisting of a list of key points that responds to the user's question, summarizing all relevant information in the input data tables.

You should use the data provided in the data tables below as the primary context for generating the response.
If you don't know the answer or if the input data tables do not contain sufficient information to provide an answer, just say so. Do not make anything up.

Each key point in the response should have the following element:
- Description: A comprehensive description of the point.
- Importance Score: An integer score between 0-100 that indicates how important the point is in answering the user's question. An 'I don't know' type of response should have a score of 0.

The response shall preserve the original meaning and use of modal verbs such as "shall", "may" or "will".

Points supported by data should list the relevant reports as references as follows:
"This is an example sentence supported by data references [Data: Reports (report ids)]"

**Do not list more than 5 record ids in a single reference**. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.

For example:
"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Reports (2, 7, 64, 46, 34, +more)]. He is also CEO of company X [Data: Reports (1, 3)]"

where 1, 2, 3, 7, 34, 46, and 64 represent the id (not the index) of the relevant data report in the provided tables.

Do not include information where the supporting evidence for it is not provided.

The response should be JSON formatted as follows:
{{
    "points": [
        {{"description": "Description of point 1 [Data: Reports (report ids)]", "score": score_value}},
        {{"description": "Description of point 2 [Data: Reports (report ids)]", "score": score_value}}
    ]
}}
"""
map_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            MAP_SYSTEM_PROMPT,
        ),
        (
            "human",
            "{question}",
        ),
    ]
)
map_chain = map_prompt | llm | StrOutputParser()
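
Note the doubled braces around the JSON example in `MAP_SYSTEM_PROMPT`. To our understanding, `ChatPromptTemplate` uses f-string style templates by default, so single braces mark variables such as `{context_data}` and literal braces have to be written as `{{` and `}}`. The small sketch below shows the same rule using plain `str.format`, without LangChain; the shortened template string is made up for the example.

```python
import json

# Doubled braces are literal; the single-brace name is a placeholder
template = '{{"points": [{{"description": "example", "score": 0}}]}}\n{context_data}'

rendered = template.format(context_data="report text goes here")
json_line, data_line = rendered.split("\n")

# After formatting, the first line is ordinary JSON with single braces
parsed = json.loads(json_line)
print(parsed)
print(data_line)
```

If the braces in the prompt were left single, the template would treat the JSON example as a variable reference and formatting would fail.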

In [41]:
REDUCE_SYSTEM_PROMPT = """
---Role---

You are a helpful assistant responding to questions about a dataset by synthesizing perspectives from multiple analysts.


---Goal---

Generate a response of the target length and format that responds to the user's question, summarize all the reports from multiple analysts who focused on different parts of the dataset.

Note that the analysts' reports provided below are ranked in the **descending order of importance**.

If you don't know the answer or if the provided reports do not contain sufficient information to provide an answer, just say so. Do not make anything up.

The final response should remove all irrelevant information from the analysts' reports and merge the cleaned information into a comprehensive answer that provides explanations of all the key points and implications appropriate for the response length and format.

Add sections and commentary to the response as appropriate for the length and format. Style the response in markdown.

The response shall preserve the original meaning and use of modal verbs such as "shall", "may" or "will".

The response should also preserve all the data references previously included in the analysts' reports, but do not mention the roles of multiple analysts in the analysis process.

**Do not list more than 5 record ids in a single reference**. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.

For example:

"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Reports (2, 7, 34, 46, 64, +more)]. He is also CEO of company X [Data: Reports (1, 3)]"

where 1, 2, 3, 7, 34, 46, and 64 represent the id (not the index) of the relevant data record.

Do not include information where the supporting evidence for it is not provided.


---Target response length and format---

{response_type}


---Analyst Reports---

{report_data}


---Goal---

Generate a response of the target length and format that responds to the user's question, summarize all the reports from multiple analysts who focused on different parts of the dataset.

Note that the analysts' reports provided below are ranked in the **descending order of importance**.

If you don't know the answer or if the provided reports do not contain sufficient information to provide an answer, just say so. Do not make anything up.

The final response should remove all irrelevant information from the analysts' reports and merge the cleaned information into a comprehensive answer that provides explanations of all the key points and implications appropriate for the response length and format.

The response shall preserve the original meaning and use of modal verbs such as "shall", "may" or "will".

The response should also preserve all the data references previously included in the analysts' reports, but do not mention the roles of multiple analysts in the analysis process.

**Do not list more than 5 record ids in a single reference**. Instead, list the top 5 most relevant record ids and add "+more" to indicate that there are more.

For example:

"Person X is the owner of Company Y and subject to many allegations of wrongdoing [Data: Reports (2, 7, 34, 46, 64, +more)]. He is also CEO of company X [Data: Reports (1, 3)]"

where 1, 2, 3, 7, 34, 46, and 64 represent the id (not the index) of the relevant data record.

Do not include information where the supporting evidence for it is not provided.


---Target response length and format---

{response_type}

Add sections and commentary to the response as appropriate for the length and format. Style the response in markdown.
"""

reduce_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            REDUCE_SYSTEM_PROMPT,
        ),
        (
            "human",
            "{question}",
        ),
    ]
)
reduce_chain = reduce_prompt | llm | StrOutputParser()

In [42]:
def global_retriever(query: str, level: int, response_type: str = "multiple paragraphs") -> str:
    """Answer a question over the whole graph with a map-reduce pass

    :param query: the question as a string
    :param level: the Community level to read reports from
    :param response_type: target length and format of the final response
    :returns: final response as a string
    """
    # Fetch the full report text of every community at the requested level
    community_query = f"""
    PREFIX gr: <http://ormynet.com/ns/msft-graphrag#>

    SELECT ?full_content
    WHERE {{
        ?community_uri a gr:Community ;
            gr:level ?level ;
            gr:full_content ?full_content .
        FILTER(?level = {level})
    }}
    """
    community_data = sparql_query(community_query)
    # Map step: ask the LLM for scored key points from each community report
    intermediate_results = []
    for full_content in tqdm(community_data["full_content"], desc="Processing communities"):
        intermediate_response = map_chain.invoke(
            {"question": query, "context_data": full_content}
        )
        intermediate_results.append(intermediate_response)
    final_response = reduce_chain.invoke(
        {
            "report_data": intermediate_results,
            "question": query,
            "response_type": response_type,
        }
    )
    return final_response
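
The reduce prompt tells the model that the analyst reports are ranked in descending order of importance, but `global_retriever` passes the raw map outputs in the order the communities were returned. A minimal sketch of one way to close that gap is shown here: parse each map output as JSON, drop points with a score of 0, and sort the rest by score. The helper name `rank_points` and the sample outputs are assumptions for illustration. It also assumes the model returns bare JSON; output wrapped in markdown fences or extra prose is simply skipped.

```python
import json
from typing import Any, Dict, List


def rank_points(map_outputs: List[str]) -> List[Dict[str, Any]]:
    """Collect the scored points from all map outputs, highest score first."""
    points = []
    for raw in map_outputs:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not valid JSON, so nothing usable in this output
        if not isinstance(parsed, dict):
            continue
        candidates = parsed.get("points")
        if not isinstance(candidates, list):
            continue
        for point in candidates:
            if not isinstance(point, dict):
                continue
            try:
                score = int(point.get("score", 0))
            except (TypeError, ValueError):
                continue  # score missing or not a number
            # A score of 0 marks an "I don't know" style point in the map prompt
            if score > 0 and point.get("description"):
                points.append({"description": point["description"], "score": score})
    return sorted(points, key=lambda p: p["score"], reverse=True)


# Hypothetical map outputs, including one that is not JSON
sample_outputs = [
    '{"points": [{"description": "A", "score": 40}, {"description": "B", "score": 0}]}',
    "Sorry, I could not find anything relevant.",
    '{"points": [{"description": "C", "score": 90}]}',
]
ranked = rank_points(sample_outputs)
print(ranked)
```

Inside `global_retriever`, `rank_points(intermediate_results)` could be passed as `"report_data"` in place of the raw list, so the reduce step only sees the useful points, in the order the prompt describes.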

In [43]:
print(global_retriever("What is the story about?", 2))

Processing communities: 100%|██████████| 8/8 [07:17<00:00, 54.70s/it]


The story revolves around Ebenezer Scrooge, a complex character with a multifaceted personality, and his transformative experience after being visited by the Ghosts of Christmas Past, Present, and Yet to Come.

**Key Points:**

* The story explores Scrooge's relationships with various entities, including his nephew, clerk, and the Ghosts, which shape his perspective on kindness, generosity, and redemption.
* The story highlights Scrooge's transformation from a miserly person to someone who values Christmas and its spirit.
* The story touches on the spiritual significance of Christmas and its impact on individuals, as represented by the Spirit of Tiny Tim.

**Data References:**

* Entities (37), Relationships (103)
* Reports (74, 73)
* Entities (62), Relationships (73)
* Entities (321), Relationships (221)

The story may have a positive influence on Ebenezer Scrooge, potentially encouraging him to reevaluate his negative attitude towards Christmas.
