# 🧠 Knowledge Graph RAG with LlamaIndex + Gemini (No Neo4j)

A lightweight, Neo4j-free implementation of **Knowledge Graph Retrieval-Augmented Generation (KG-RAG)** using:
- 🧾 LlamaIndex's `KnowledgeGraphIndex` + `SimpleGraphStore`
- 🔍 Google Gemini 2.5 + Gemini Embeddings (`text-embedding-004`)
- 📄 PDF document loader and triplet extractor
- 🕸 PyVis network visualization for the knowledge graph [VIEW](/KG_RAG/KG_graph.html)
- 📬 Hybrid querying with graph context and custom prompting

---

## 📘 What is KG-RAG?

**KG-RAG** extends traditional RAG by extracting structured **triplets (subject-predicate-object)** from the source documents and building a **knowledge graph** over them. Retrieval can then follow explicit relationships between entities, which tends to make LLM answers more **context-aware, grounded, and explainable**.

Unlike vector-only RAG, KG-RAG gives the LLM a **domain model** of the data, which supports relationship-aware reasoning and helps reduce hallucination.
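
To make the idea concrete, here is a minimal sketch in plain Python of how triplets form a graph that can be walked to collect context. The triplets and the `neighbors` helper are illustrative assumptions; they are not output from this project and not part of LlamaIndex.

```python
from collections import defaultdict

# Hypothetical triplets of the form (subject, predicate, object).
triplets = [
    ("Amazon S3", "is a", "storage service"),
    ("Amazon S3", "stores objects in", "buckets"),
    ("buckets", "belong to", "AWS account"),
]

# Adjacency map: subject -> list of (predicate, object) edges.
graph = defaultdict(list)
for subj, pred, obj in triplets:
    graph[subj].append((pred, obj))


def neighbors(entity, depth=1):
    """Collect facts reachable from `entity` within `depth` hops."""
    facts, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for pred, obj in graph.get(node, []):
                facts.append((node, pred, obj))
                next_frontier.append(obj)
        frontier = next_frontier
    return facts


print(neighbors("Amazon S3", depth=2))
```

A larger `depth` pulls in facts about related entities, which is roughly what lets a graph-based retriever surface relationships that a single vector lookup might miss.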

---

## 🧰 Tech Stack

| Component              | Description                                 |
|------------------------|---------------------------------------------|
| **LlamaIndex**         | Document loading, chunking, graph indexing  |
| **Gemini 2.5**         | Triplet extraction and query answering      |
| **text-embedding-004** | Embeddings used for hybrid retrieval        |
| **SimpleGraphStore**   | In-memory triple store (no Neo4j)           |
| **PyVis**              | Graph visualization                         |

---

## 🚀 How It Works

1. 📄 Load PDF document using `SimpleDirectoryReader`
2. 🧠 Split into chunks using `SentenceSplitter`
3. 🔍 Extract triplets (subject-predicate-object) via `KnowledgeGraphIndex`
4. 🗃️ Store triplets in an in-memory `SimpleGraphStore`
5. 🌐 Visualize the graph with PyVis [VIEW](/KG_RAG/KG_graph.html)
6. 💬 Run a hybrid query with Gemini and display the answer
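
As a rough illustration of what "hybrid" means in step 6, the sketch below combines a keyword lookup over triplet subjects with an embedding-similarity ranking and returns the union. It is a simplified stand-in under stated assumptions: the triplets, the 2-D vectors, and `hybrid_retrieve` are hypothetical and do not reproduce LlamaIndex's retriever.

```python
import math

# Hypothetical triplets with toy 2-D embeddings (real embeddings have far more dimensions).
triplets = {
    ("Amazon S3", "is a", "storage service"): [0.9, 0.1],
    ("Amazon SQS", "is a", "message queuing service"): [0.1, 0.9],
    ("Amazon Glacier", "is used for", "data archiving"): [0.8, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def hybrid_retrieve(keywords, query_vec, top_k=1):
    """Union of keyword matches on the subject and the top_k embedding matches."""
    by_keyword = [t for t in triplets if t[0].lower() in keywords]
    by_embedding = sorted(
        triplets, key=lambda t: cosine(triplets[t], query_vec), reverse=True
    )[:top_k]
    # Preserve order and drop duplicates.
    return list(dict.fromkeys(by_keyword + by_embedding))


print(hybrid_retrieve({"amazon sqs"}, query_vec=[1.0, 0.0]))
```

The keyword path finds the entity named in the question, while the embedding path can add a semantically close triplet even when no keyword matches.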

---

In [1]:
# Install the packages imported below (versions unpinned).
%pip install -U llama-index llama-index-readers-file
%pip install -U llama-index-llms-langchain llama-index-embeddings-google-genai
%pip install -U langchain-google-genai
%pip install -U pyvis
# %pip install langchain_community
# %pip install -qU langchain-groq

In [4]:
import getpass
import os
if "GOOGLE_API_KEY" not in os.environ:
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter your Google AI API key: ")

Enter your Google AI API key: ··········


In [5]:
from langchain_google_genai import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [None]:
# %pip install llama-index-graph-stores-neo4j  # optional: SimpleGraphStore needs no Neo4j package

In [17]:
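from llama_index.core import Settings, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.google_genai import GoogleGenAIEmbedding

# Load every supported file (here, the PDF) from the local "data" directory.
documents = SimpleDirectoryReader("data").load_data()

# Split into chunks of about 1000 tokens with a 200-token overlap.
# `nodes` is kept for inspection; the index below is built from `documents`.
splitter = SentenceSplitter(chunk_size=1000, chunk_overlap=200)
nodes = splitter.get_nodes_from_documents(documents)

embed_model = GoogleGenAIEmbedding(
    model_name="text-embedding-004",
    embed_batch_size=100,
    api_key=os.environ["GOOGLE_API_KEY"],  # reuse the key entered earlier
)

# Register the embedding model and the LLM as global defaults.
Settings.embed_model = embed_model
Settings.llm = llm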
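
The effect of `chunk_size` and `chunk_overlap` can be pictured with a small sliding-window sketch. This is a simplification, assuming whitespace-separated words as tokens; `SentenceSplitter` itself counts tokens and tries to respect sentence boundaries, and `chunk_with_overlap` is a hypothetical helper rather than a LlamaIndex function.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Return windows of `chunk_size` tokens; consecutive windows share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


words = "a b c d e f g h i j".split()
for chunk in chunk_with_overlap(words, chunk_size=4, overlap=2):
    print(chunk)
```

The overlap means a fact that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some repeated text.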

In [None]:
from llama_index.core import KnowledgeGraphIndex, StorageContext
from llama_index.core.graph_stores import SimpleGraphStore
# from llama_index.graph_stores.neo4j import Neo4jGraphStore  # not needed for the in-memory store

In [46]:
graph_store = SimpleGraphStore()
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Construct the knowledge graph index
index = KnowledgeGraphIndex.from_documents(
    documents=documents,
    max_triplets_per_chunk=3,
    storage_context=storage_context,
    embed_model=embed_model,
    include_embeddings=True,
)
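
Triplet extraction relies on the LLM returning text that can be parsed into (subject, predicate, object) tuples. The sketch below shows one way such parsing could look, with a cap similar in spirit to `max_triplets_per_chunk`. It assumes one parenthesized triplet per line, which is an assumption about the output format rather than a documented guarantee; `parse_triplets` and the sample text are hypothetical.

```python
import re

# Matches lines such as "(Amazon S3, is a, storage service)".
TRIPLET_RE = re.compile(r"^\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)$")


def parse_triplets(llm_output, max_triplets=3):
    """Parse up to `max_triplets` (subject, predicate, object) tuples from text."""
    triplets = []
    for line in llm_output.splitlines():
        if len(triplets) >= max_triplets:
            break
        match = TRIPLET_RE.match(line.strip())
        if match:
            triplets.append(match.groups())
    return triplets


sample = """(Amazon S3, is a, storage service)
not a triplet
(Amazon S3, stores objects in, buckets)"""
print(parse_triplets(sample))
```

Lines that do not match the pattern are skipped rather than raising, so a partially malformed response still yields the well-formed triplets.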

In [None]:
from pyvis.network import Network
from IPython.display import HTML

# Export the index as a NetworkX graph and render it with PyVis.
g = index.get_networkx_graph()
net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(g)
net.save_graph("KG_graph.html")

# Display the saved file inline.
HTML(filename="KG_graph.html")

In [79]:
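# Uncomment one query at a time and rerun this cell to rebuild message_template.
# query = "What is Amazon S3? What is it used for?"
# query = "what is AWS elastic beanstalk ?"
# query = "what is amazon dynamoDB ? What is amazon SQS ?"
# query = "what is amazon route 53 ? what is amazon elastic transcoder ? what is amazon glacier ?"
query = "what is flipkart grid hackathon ?"

query_engine = index.as_query_engine(
    include_text=True,              # include source chunk text along with matched triplets
    response_mode="tree_summarize",
    embedding_mode="hybrid",        # combine keyword lookup with embedding similarity
    similarity_top_k=5,
)

# f-string: {query} is interpolated here, while {{context}} stays as the literal text "{context}".
message_template = f"""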
You are a factual assistant that answers questions using structured knowledge (like entity relationships).
Answer in markdown format
If the context below does NOT include any relevant information for answering the user's question, just respond with:
"I don't know based on the given information."

DO NOT attempt to guess or fabricate an answer.
DO NOT use prior knowledge beyond the given context.
</s>
<|user|>
Question: {query}

Context:
{{context}}

Helpful Answer:
</s>"""

response = query_engine.query(message_template)
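
One detail of the cell above is worth checking. Because `message_template` is an f-string, `{query}` is interpolated immediately, while `{{context}}` becomes the literal text `{context}` and is sent to the engine as part of the question. The snippet below demonstrates that behavior with a shortened template; the "(no context retrieved)" string is only a placeholder.

```python
query = "what is flipkart grid hackathon ?"

# In an f-string, single braces interpolate and doubled braces produce literal braces.
template = f"Question: {query}\n\nContext:\n{{context}}\n"
print(template)

# Filling the leftover placeholder takes an explicit second step.
filled = template.format(context="(no context retrieved)")
print(filled)
```

In the notebook no second step is applied, so the retrieved context reaches the LLM through the query engine's own prompts rather than through this placeholder.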

In [61]:
print(response)  # response from the run with the Amazon S3 query

Amazon S3 is a storage service for the Internet, designed to facilitate web-scale computing for developers. It offers a straightforward web services interface for storing and retrieving any amount of data from anywhere on the web, at any time. This service provides developers with access to the same highly scalable, reliable, secure, fast, and inexpensive infrastructure that Amazon utilizes for its global network of websites. Objects stored in Amazon S3 are contained within Amazon S3 buckets.


In [70]:
response_2 = query_engine.query(message_template)  # message_template rebuilt with the Elastic Beanstalk query

In [71]:
print(response_2)

AWS Elastic Beanstalk is a service that simplifies the deployment and scaling of web applications and services. It supports various programming languages, including Java, .NET, PHP, Node.js, Python, and Ruby. The service automates deployment aspects such as capacity provisioning, load balancing, auto-scaling, and application health monitoring, while allowing users to maintain full control over the underlying AWS resources.


In [None]:
response_3 = query_engine.query(message_template)  # message_template rebuilt with the DynamoDB / SQS query

In [74]:
print(response_3)

Amazon DynamoDB is a fast, fully managed NoSQL database service designed to simplify and make cost-effective the storage and retrieval of any amount of data and handling of any level of request traffic. It stores all data items on Solid State Drives (SSDs) and replicates them across three Availability Zones for high availability and durability. This service allows users to offload the administrative burden of operating and scaling a highly available distributed database cluster, paying only for what they use. It automatically distributes data and traffic over a sufficient number of servers to manage specified request capacity and stored data, while maintaining consistent, fast performance.

Amazon Simple Queue Service (Amazon SQS) is a fast, reliable, scalable, and fully managed message queuing service. It facilitates the decoupling of cloud application components in a simple and cost-effective manner. SQS enables the transmission of any volume of data at any throughput level without m

In [78]:
response_4 = query_engine.query(message_template)  # message_template rebuilt with the Route 53 / Elastic Transcoder / Glacier query
print(response_4)

Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. It is designed to provide a reliable and cost-effective way to route end users to Internet applications by translating human-readable names into numeric IP addresses.

Amazon Elastic Transcoder is a media transcoding service in the cloud. It is designed to be a highly scalable, easy-to-use, and cost-effective way for developers and businesses to convert media files from their source format into versions suitable for playback on various devices like smartphones, tablets, and PCs.

Amazon Glacier is an extremely low-cost storage service that offers secure and durable storage for data archiving and backup. It is optimized for data that is infrequently accessed and for which retrieval times of several hours are acceptable.


In [80]:
response_5 = query_engine.query(message_template)  # message_template rebuilt with the Flipkart query
print(response_5)

I don't know based on the given information.
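
The answers above were produced by changing `query` and rerunning the cells one at a time. A loop is one possible alternative; the sketch below is a minimal version that also flags answers matching the fallback sentence from the prompt. `run_queries` and `fake_ask` are hypothetical names: `fake_ask` stands in for `query_engine.query` so the example runs without an API key.

```python
FALLBACK = "I don't know based on the given information."


def run_queries(ask, queries):
    """Call `ask` once per query and flag answers that hit the fallback sentence."""
    results = {}
    for q in queries:
        answer = str(ask(q)).strip()
        results[q] = {"answer": answer, "answered": FALLBACK not in answer}
    return results


# Stand-in for `query_engine.query`; swap in the real engine when running the notebook.
def fake_ask(q):
    return "Amazon S3 is a storage service." if "S3" in q else FALLBACK


queries = ["What is Amazon S3?", "what is flipkart grid hackathon ?"]
for q, r in run_queries(fake_ask, queries).items():
    print(r["answered"], "|", q)
```

Keeping each answer keyed by its question avoids the mismatch that can arise when printed outputs come from earlier runs of a reused cell.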
