# GraphRAG Implementation with LlamaIndex - Experiment 1

[GraphRAG - LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/cookbooks/GraphRAG_v2/)

# Setup API Key and LLM

In [1]:
from config import Config
from llama_index.llms.openai import OpenAI
import os

os.environ["OPENAI_API_KEY"] = Config.OPENAI_API_KEY
llm = OpenAI(model="gpt-4")

# Loading Data

In [8]:
from loader import load_epubs_from_dir

documents = load_epubs_from_dir("./data")



In [9]:
print(f"Loaded {len(documents)} documents")

Loaded 8 documents


In [10]:
from llama_index.core import Document

# Text-only copies of the loaded documents (metadata is dropped).
# Note: the splitter below still reads from `documents`, not `new_documents`.
new_documents = [Document(text=document.text) for document in documents]


# Create nodes/chunks from the text

In [11]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(
    chunk_size=1024,
    chunk_overlap=20,
)
nodes = splitter.get_nodes_from_documents(documents)

In [12]:
print(f"Total number of nodes: {len(nodes)}")

Total number of nodes: 497
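To make the two splitter parameters concrete, here is a rough sketch of overlapping windows over a token list. It is not how `SentenceSplitter` is implemented (the real splitter also respects sentence boundaries and counts tokens with a tokenizer). `window_chunks` is a hypothetical helper, and whitespace tokenization is an assumption made purely for illustration.

```python
from typing import List, Sequence


def window_chunks(tokens: Sequence, chunk_size: int, chunk_overlap: int) -> List[list]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


# Illustrative call with tiny numbers instead of 1024 / 20.
words = "one two three four five six seven eight nine ten".split()
for chunk in window_chunks(words, chunk_size=4, chunk_overlap=1):
    print(chunk)
```

Each window repeats the last `chunk_overlap` tokens of the previous one, so a small overlap such as 20 mostly guards against cutting context at chunk borders.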


# Build `PropertyGraphIndex` using `GraphRAGExtractor` and `GraphRAGStore`

In [13]:
KG_TRIPLET_EXTRACT_TMPL = """
-Goal-
Given a text document, identify all entities and their entity types from the text and all relationships among the identified entities.
Given the text, extract up to {max_knowledge_triplets} entity-relation triplets.

-Steps-
1. Identify all entities. For each identified entity, extract the following information:
- entity_name: Name of the entity, capitalized
- entity_type: Type of the entity
- entity_description: Comprehensive description of the entity's attributes and activities
Format each entity as ("entity"$$$$"<entity_name>"$$$$"<entity_type>"$$$$"<entity_description>")

2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
For each pair of related entities, extract the following information:
- source_entity: name of the source entity, as identified in step 1
- target_entity: name of the target entity, as identified in step 1
- relation: relationship between source_entity and target_entity
- relationship_description: explanation as to why you think the source entity and the target entity are related to each other

Format each relationship as ("relationship"$$$$"<source_entity>"$$$$"<target_entity>"$$$$"<relation>"$$$$"<relationship_description>")

3. When finished, output.

-Real Data-
######################
text: {text}
######################
output:"""

In [14]:
import re
from graph_rag_extractor import GraphRAGExtractor
from typing import Any

entity_pattern = r'\("entity"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\)'
relationship_pattern = r'\("relationship"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\)'


def parse_fn(response_str: str) -> Any:
    entities = re.findall(entity_pattern, response_str)
    relationships = re.findall(relationship_pattern, response_str)
    return entities, relationships


kg_extractor = GraphRAGExtractor(
    llm=llm,
    extract_prompt=KG_TRIPLET_EXTRACT_TMPL,
    max_paths_per_chunk=2,
    parse_fn=parse_fn,
)
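As a quick sanity check of the two regular expressions, the sketch below runs them against a hand-written response string. The sample text is hypothetical (it was not produced by the LLM in this experiment) and simply follows the `$$$$`-delimited format the prompt asks for.

```python
import re

# Same patterns as in the cell above.
entity_pattern = r'\("entity"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\)'
relationship_pattern = r'\("relationship"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\$\$\$\$"(.+?)"\)'

# Hypothetical LLM output, written by hand for illustration only.
sample_response = '''
("entity"$$$$"ALI"$$$$"Person"$$$$"A significant figure in Islam")
("entity"$$$$"PROPHET MUHAMMAD"$$$$"Person"$$$$"The Prophet of Islam")
("relationship"$$$$"ALI"$$$$"PROPHET MUHAMMAD"$$$$"Devotion"$$$$"Ali showed devotion to Prophet Muhammad")
'''

# Each match is a tuple of the captured groups, in prompt order.
entities = re.findall(entity_pattern, sample_response)
relationships = re.findall(relationship_pattern, sample_response)

for name, entity_type, description in entities:
    print(f"entity: {name} ({entity_type}): {description}")
for source, target, relation, description in relationships:
    print(f"relationship: {source} -[{relation}]-> {target}: {description}")
```

Because `.` does not match newlines by default, a description that spans several lines would not be captured; passing `re.DOTALL` is one possible adjustment if that shows up in real responses.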

# Docker and Neo4j Setup

```bash
docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_apoc_import_file_use__neo4j__config=true \
    -e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
    neo4j:latest
```
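Before constructing the graph store, it can help to confirm that the Bolt port published by the container is reachable. The helper below is a hypothetical convenience (`bolt_port_open` is not part of LlamaIndex or the Neo4j driver) that uses only the standard library. It checks TCP reachability only, not credentials or database readiness.

```python
import socket


def bolt_port_open(host: str = "localhost", port: int = 7687, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if bolt_port_open():
        print("Bolt port 7687 is reachable")
    else:
        print("Bolt port 7687 is not reachable; is the container running?")
```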

In [15]:
from graph_rag_store import GraphRAGStore

# Note: used to be `Neo4jPGStore`
graph_store = GraphRAGStore(
    username="neo4j", password="admin123", url="bolt://localhost:7687"
)



In [16]:
from llama_index.core import PropertyGraphIndex

index = PropertyGraphIndex(
    nodes=nodes,
    kg_extractors=[kg_extractor],
    property_graph_store=graph_store,
    show_progress=True,
)

Extracting paths from text: 100%|██████████| 497/497 [40:56<00:00,  4.94s/it] 
Generating embeddings: 100%|██████████| 5/5 [00:05<00:00,  1.16s/it]
Generating embeddings: 100%|██████████| 29/29 [00:12<00:00,  2.29it/s]


In [17]:
index.property_graph_store.get_triplets()[10]

[EntityNode(label='Person', embedding=None, properties={'id': 'Ali', 'author': 'Yousuf N. Lalljee - XKP', 'title': 'Ali The Magnificent ', 'entity_description': 'Ali is a significant figure in Islam, known for his wisdom and devotion. He is considered the best human being to have ever lived after Prophet Muhammad. He is not just an Imam for Shia Muslims, but for all of mankind.', 'embeddings': 'openaiembeddings', 'description': 'This Book is a Must read to every Muslim and non-Muslim to learn more about a man who not only lived his life to better Islam, but to better humanity also. He was not an Imam just for Shia Muslims, but for all of mankind. He is undoubtedly the best human being to have ever lived after Prophet Muhammad (saw) with such an elevated status that the angels themselves know of his status.\n-\nISLAMICMOBILITY.COM', 'language': 'en', 'type': 'epub', 'triplet_source_id': '6789770d-a069-4624-a758-ed87c4d13ce7'}, name='Ali'),
 Relation(label='Devotion', source_id='Ali', ta

In [18]:
index.property_graph_store.get_triplets()[10][0].properties

{'id': 'Ali',
 'author': 'Yousuf N. Lalljee - XKP',
 'title': 'Ali The Magnificent ',
 'entity_description': 'Ali is a significant figure in Islam, known for his wisdom and devotion. He is considered the best human being to have ever lived after Prophet Muhammad. He is not just an Imam for Shia Muslims, but for all of mankind.',
 'embeddings': 'openaiembeddings',
 'description': 'This Book is a Must read to every Muslim and non-Muslim to learn more about a man who not only lived his life to better Islam, but to better humanity also. He was not an Imam just for Shia Muslims, but for all of mankind. He is undoubtedly the best human being to have ever lived after Prophet Muhammad (saw) with such an elevated status that the angels themselves know of his status.\n-\nISLAMICMOBILITY.COM',
 'language': 'en',
 'type': 'epub',
 'triplet_source_id': '6789770d-a069-4624-a758-ed87c4d13ce7'}

In [19]:
index.property_graph_store.get_triplets()[10][1].properties

{'author': 'Yousuf N. Lalljee - XKP',
 'title': 'Ali The Magnificent ',
 'embeddings': 'openaiembeddings',
 'description': 'This Book is a Must read to every Muslim and non-Muslim to learn more about a man who not only lived his life to better Islam, but to better humanity also. He was not an Imam just for Shia Muslims, but for all of mankind. He is undoubtedly the best human being to have ever lived after Prophet Muhammad (saw) with such an elevated status that the angels themselves know of his status.\n-\nISLAMICMOBILITY.COM',
 'language': 'en',
 'type': 'epub',
 'relationship_description': "Ali showed immense devotion to Prophet Muhammad, risking his life for the Prophet's safety and following his directions in returning properties and building a mosque in Medina.",
 'triplet_source_id': '19fdbefb-7482-43e8-bfc3-7b5dc435583d'}

# Build Communities

In [21]:
index.property_graph_store.build_communities()
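The internals of `build_communities()` live in the local `graph_rag_store` module and are not shown here. The linked cookbook clusters the graph with a hierarchical Leiden algorithm and then summarizes each community with the LLM; whether this module does the same is an assumption. The sketch below only illustrates the general idea of grouping triplets by community, and it swaps in plain connected components as a much simpler stand-in for Leiden. All function names and the sample triplets are hypothetical.

```python
from collections import defaultdict

# Hypothetical (source, relation, target, description) tuples.
triplets = [
    ("Ali", "Devotion", "Prophet Muhammad", "Ali showed devotion to Prophet Muhammad."),
    ("Ali", "Married", "Fatima", "Ali married Fatima."),
    ("Entity X", "Related to", "Entity Y", "Placeholder relationship."),
]


def connected_components(edges):
    """Map each node to a community id (stand-in for Leiden clustering)."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)

    community_of = {}
    next_id = 0
    for start in adjacency:
        if start in community_of:
            continue
        community_of[start] = next_id
        stack = [start]
        while stack:  # depth-first walk over one component
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in community_of:
                    community_of[neighbor] = next_id
                    stack.append(neighbor)
        next_id += 1
    return community_of


def collect_community_info(triplets, community_of):
    """Group relationship details by the community of the source entity."""
    info = defaultdict(list)
    for source, relation, target, description in triplets:
        info[community_of[source]].append(
            f"{source} -> {target} -> {relation} -> {description}"
        )
    return dict(info)


community_of = connected_components((s, t) for s, _, t, _ in triplets)
community_info = collect_community_info(triplets, community_of)
for community_id, details in community_info.items():
    print(community_id, details)
```

In a GraphRAG-style pipeline, the text collected per community would then be passed to the LLM to produce one summary per community.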

# Create QueryEngine

In [23]:
from graph_query_engine import GraphRAGQueryEngine

query_engine = GraphRAGQueryEngine(
    graph_store=index.property_graph_store,
    llm=llm,
    index=index,
    similarity_top_k=5,
)

# Querying

In [25]:
from IPython.display import Markdown, display

response = query_engine.query(
    "Who is Ali?"
)
display(Markdown(f"{response.response}"))

Ali, also known as Ali Ibn Abi Talib, is a significant figure in Islamic history, known for his close bond and trust with Prophet Muhammad. He was entrusted as the Viceregent during the expedition of Tabuk and married Prophet Muhammad's daughter, Fatima. Ali was a central figure in various relationships, conflicts, alliances, and mentorships, interacting with key historical figures like Hasan, Husain, and Ayesha. He served as a Caliph, a military leader, and protector of Zimmis, and was involved in significant events such as the Battle of the Camel 'Jamal' and the aftermath of Osman's murder. Despite facing opposition and conspiracy, Ali demonstrated devotion to God through acts of charity, prayer, and his scholarly contributions, such as compiling the Quran. His actions and beliefs have earned him admiration and respect from both Muslims and non-Muslims alike.