# GraphRAG Implementation with LlamaIndex - Experiment 2

[GraphRAG - LlamaIndex](https://medium.aiplanet.com/implement-rag-with-knowledge-graph-and-llama-index-6a3370e93cdd)

# Installation

In [None]:
%pip install llama-index llama-index-graph-stores-neo4j graspologic numpy==1.24.4 scipy==1.12.0 future python-dotenv setuptools ebooklib pyvis

# Setup API Key, LLM, Embed Model

In [2]:
from config import Config
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

llm = OpenAI(model="gpt-4", api_key=Config.OPENAI_API_KEY)
embed_model = OpenAIEmbedding(model="text-embedding-ada-002", api_key=Config.OPENAI_API_KEY)

# Loading Data

In [3]:
from pathlib import Path
from ebooklib import epub

def extract_epub_metadata(book_path: str) -> dict:
    """Read Dublin Core metadata from an EPUB file (passed to SimpleDirectoryReader as file_metadata)."""
    path = Path(book_path)
    if not path.exists():
        raise FileNotFoundError(f"EPUB file not found at path: {path}")
    book = epub.read_epub(str(path))

    def first(field: str, default: str = "") -> str:
        # get_metadata returns a list of (value, attributes) tuples; take the first value if present
        values = book.get_metadata("DC", field)
        return values[0][0] if values else default

    return {
        # removesuffix (Python 3.9+) drops a literal ".epub";
        # rstrip(".epub") would strip any trailing '.', 'e', 'p', 'u', 'b' characters instead
        "title": first("title", "N/A").removesuffix(".epub"),
        "author": first("creator"),
        "language": first("language"),
        "description": first("description"),
        "type": "epub",
        "embeddings": "openaiembeddings",
    }
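
The metadata function trims a trailing `.epub` from the title. One caveat worth illustrating: `str.rstrip` removes a set of trailing characters rather than a literal suffix, so it can eat real letters from a title. The sketch below contrasts the two behaviours; `strip_epub_suffix` is a hypothetical helper name, the sample title is made up, and `str.removesuffix` assumes Python 3.9 or newer.

```python
def strip_epub_suffix(title: str) -> str:
    """Remove a literal trailing '.epub' if present (hypothetical helper, Python 3.9+)."""
    return title.removesuffix(".epub")


# rstrip treats its argument as a set of characters, not as a suffix
sample = "Backup.epub"  # made-up title for illustration
print(sample.rstrip(".epub"))      # Back
print(strip_epub_suffix(sample))   # Backup
```

Titles that do not end in `.epub` pass through `strip_epub_suffix` unchanged.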

In [4]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(input_dir="./data", file_metadata=extract_epub_metadata).load_data()

In [5]:
print(f"Loaded {len(documents)} documents")

Loaded 2 documents


# Construct the Knowledge Graph Index

In [6]:
from llama_index.core import KnowledgeGraphIndex, Settings, StorageContext
from llama_index.core.graph_stores import SimpleGraphStore

# Global settings for the LLM and the chunk size
Settings.llm = llm
Settings.chunk_size = 512

# Set up the storage context with an in-memory graph store
graph_store = SimpleGraphStore()
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Construct the knowledge graph index
index = KnowledgeGraphIndex.from_documents(
    documents=documents,
    max_triplets_per_chunk=3,  # upper bound on triplets extracted from each chunk
    storage_context=storage_context,
    embed_model=embed_model,
    include_embeddings=True,  # store triplet embeddings so embedding-based retrieval can be used
)
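
To make `max_triplets_per_chunk=3` more concrete: the index extracts (subject, relation, object) triplets from each chunk and keeps at most three of them. The sketch below is a toy, pure-Python illustration of that cap and of a subject-keyed adjacency map. It is not LlamaIndex's implementation; the triplets are made up for illustration, and `cap_triplets` and `build_adjacency` are hypothetical names.

```python
from collections import defaultdict

Triplet = tuple[str, str, str]  # (subject, relation, object)


def cap_triplets(triplets: list[Triplet], max_per_chunk: int = 3) -> list[Triplet]:
    """Keep at most max_per_chunk triplets for one chunk."""
    return triplets[:max_per_chunk]


def build_adjacency(triplets: list[Triplet]) -> dict[str, list[tuple[str, str]]]:
    """Group (relation, object) pairs under their subject."""
    adjacency = defaultdict(list)
    for subject, relation, obj in triplets:
        adjacency[subject].append((relation, obj))
    return dict(adjacency)


# Illustrative triplets for a single chunk (not real extraction output)
chunk_triplets = [
    ("Hazrat Ali", "married", "Fatima"),
    ("Fatima", "daughter of", "Prophet Muhammad"),
    ("Hazrat Ali", "served as", "Caliph"),
    ("Hazrat Ali", "advised", "ruling Caliph"),  # dropped by the cap of 3
]

kept = cap_triplets(chunk_triplets)
graph = build_adjacency(kept)
print(graph)
```

A lower cap gives a sparser graph and fewer LLM-extracted facts per chunk; a higher one gives a denser graph at the cost of noisier triplets.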

# Querying

In [7]:
query = "Who is Hazrat Ali?"
query_engine = index.as_query_engine(
    include_text=True,
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=5,
)

# Wrap the question in an instruction that tells the model not to guess
message_template = f"""<|system|>Please check whether the following pieces of context mention the keywords provided in the question. If they do not, or if you don't know the answer, just say that you don't know and stop there. Please do not try to make up an answer.</s>
<|user|>
Question: {query}
Helpful Answer:
</s>"""

response = query_engine.query(message_template)

# Keep only the text after an "<|assistant|>" marker, if the model echoes one
print(response.response.split("<|assistant|>")[-1].strip())

Hazrat Ali is a significant figure in Islam who is known for his contributions to the betterment of Islam and humanity. He was not only an Imam for Shia Muslims, but for all of mankind. He is considered the best human being to have ever lived after Prophet Muhammad. He was a Warrior-Saint of Islam and spent his life fighting holy wars and promoting knowledge. He was also known for his role as a Caliph and Ruler, promising safety, security, and religious freedom to non-Muslims. He was recognized for his sound judgments and advice based on the Holy Quran. Despite facing challenges, he continued to assist the ruling Caliph and worked towards eradicating abuse and corruption from public service. He lived a humble life and treated the treasures of the Commonwealth as the property of the nation. He was also known for his love for his family, particularly his wife Fatima, the daughter of the Holy Prophet.


In [31]:
query = "Tell me about the bravery of Hazrat Ali. Give one event from his life that shows his bravery."
query_engine = index.as_query_engine(
    include_text=True,
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=5,
)

# Wrap the question in an instruction that tells the model not to guess
message_template = f"""<|system|>Please check whether the following pieces of context mention the keywords provided in the question. If they do not, or if you don't know the answer, just say that you don't know and stop there. Please do not try to make up an answer.</s>
<|user|>
Question: {query}
Helpful Answer:
</s>"""

response = query_engine.query(message_template)

# Keep only the text after an "<|assistant|>" marker, if the model echoes one
print(response.response.split("<|assistant|>")[-1].strip())

One event that showcases Hazrat Ali's bravery is when he risked his life for Prophet Muhammad during the Prophet's flight to Medina. The Prophet deputed Ali to lie in his bed, knowing that his enemies wanted to kill him. Thus, it was Ali who faced the danger in place of his Master.
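
Both query cells repeat the same prompt wrapping and post-processing. A minimal sketch of factoring that into helpers follows. `build_prompt` and `ask` are hypothetical names, and `ask` assumes only that the engine exposes `.query()` returning an object with a `.response` string, as in the cells above.

```python
PROMPT_TEMPLATE = (
    "<|system|>Please check whether the following pieces of context mention the "
    "keywords provided in the question. If they do not, or if you don't know the "
    "answer, just say that you don't know and stop there. Please do not try to "
    "make up an answer.</s>\n"
    "<|user|>\n"
    "Question: {query}\n"
    "Helpful Answer:\n"
    "</s>"
)


def build_prompt(query: str) -> str:
    """Insert the user's question into the shared prompt template."""
    return PROMPT_TEMPLATE.format(query=query)


def ask(query_engine, query: str) -> str:
    """Run one query and return the answer text without any '<|assistant|>' prefix."""
    response = query_engine.query(build_prompt(query))
    return response.response.split("<|assistant|>")[-1].strip()


# Usage with the engine built above (commented out because it needs the index):
# query_engine = index.as_query_engine(include_text=True, response_mode="tree_summarize",
#                                      embedding_mode="hybrid", similarity_top_k=5)
# print(ask(query_engine, "Who is Hazrat Ali?"))
```

Building the query engine once and reusing it across questions also avoids re-creating it in every cell.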


# Visualization

In [32]:
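from pathlib import Path
from pyvis.network import Network

# Export the knowledge graph to NetworkX and render it as a standalone HTML page
g = index.get_networkx_graph()
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)
html = net.generate_html()

# Make sure the output directory exists before writing the file
output_path = Path("./output/example.html")
output_path.parent.mkdir(parents=True, exist_ok=True)
output_path.write_text(html, encoding="utf-8")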


In [29]:
storage_context.persist()
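
Called without arguments, `persist()` writes to `./storage`. A later session can then reload the index instead of rebuilding it. The sketch below assumes the default directory and that only one index was persisted there; `load_persisted_index` is a hypothetical helper name, while `StorageContext.from_defaults` and `load_index_from_storage` come from `llama_index.core`.

```python
def load_persisted_index(persist_dir: str = "./storage"):
    """Reload a previously persisted index (sketch; assumes a single index in persist_dir)."""
    # Imported lazily so the helper can be defined before llama-index is installed
    from llama_index.core import StorageContext, load_index_from_storage

    storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
    return load_index_from_storage(storage_context)


# Usage in a fresh session (commented out because it needs the persisted files):
# index = load_persisted_index()
# query_engine = index.as_query_engine(include_text=True, response_mode="tree_summarize")
```

The LLM and embedding model are not stored with the index, so they most likely need to be configured again (for example via `Settings`) before querying the reloaded index.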