<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/query_engine/knowledge_graph_rag_query_engine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Knowledge Graph RAG Query Engine


## Graph RAG

Graph RAG is a knowledge-graph-enabled RAG approach that retrieves information from a Knowledge Graph for a given task. Typically, it builds context from the SubGraph of the entities related to the task.

## GraphStore backed RAG vs VectorStore RAG

In [this tutorial](https://gpt-index.readthedocs.io/en/latest/examples/index_structs/knowledge_graph/KnowledgeGraphIndex_vs_VectorStoreIndex_vs_CustomIndex_combined.html#id1), we compared how Graph RAG helps in some use cases. It shows that a Knowledge Graph, as a distinct format of information, can mitigate several issues caused by the nature of the "split and embedding" RAG approach.

## Why Knowledge Graph RAG Query Engine

In LlamaIndex, there are two scenarios where we can apply Graph RAG:

- Building a Knowledge Graph from documents with LlamaIndex, using an LLM or even [local models](https://colab.research.google.com/drive/1G6pcR0pXvSkdMQlAK_P-IrYgo-_staxd?usp=sharing). For this, use `KnowledgeGraphIndex`.
- Leveraging an existing Knowledge Graph. For this, use `KnowledgeGraphRAGQueryEngine`.

> Note: a third KG-related query engine in LlamaIndex is `NL2GraphQuery` or `Text2Cypher`. Whether or not the KG already exists, it can be done with `KnowledgeGraphQueryEngine`.

Before we start the `Knowledge Graph RAG QueryEngine` demo, let's first set up LlamaIndex.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

```python
%pip install llama-index-graph-stores-nebula
%pip install llama-index-llms-vertex
%pip install llama-index-embeddings-vertex
%pip install llama-index-readers-wikipedia
%pip install llama-index
```


### Vertex

In [None]:
# For Vertex

import os


import logging
import google.auth
import google.auth.transport.requests
from google.cloud import aiplatform
from vertexai.generative_models import HarmBlockThreshold, HarmCategory, SafetySetting
import sys
from dotenv import load_dotenv
import vertexai

load_dotenv()  # loads variables from the .env file for use below
PROJECT_ID = os.getenv("PROJECT_ID")
LOCATION = os.getenv("LOCATION")
NEBULA_SERVER_ADDRESS = "127.0.0.1"

credentials = google.auth.default(quota_project_id=PROJECT_ID)[0]
request = google.auth.transport.requests.Request()
credentials.refresh(request)

safety_config = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_NONE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HARASSMENT,
        threshold=HarmBlockThreshold.BLOCK_NONE,
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
        threshold=HarmBlockThreshold.BLOCK_NONE,
    ),
]

logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output

vertexai.init(project=PROJECT_ID, location=LOCATION)

# define LLM
from llama_index.llms.vertex import Vertex
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.core import Settings

Settings.llm = Vertex(
    temperature=0,
    model="gemini-1.5-flash",
    credentials=credentials,
    safety_settings=safety_config,
)
Settings.embed_model = VertexTextEmbedding(credentials=credentials)
Settings.chunk_size = 256

## Prepare for NebulaGraph

We take [NebulaGraphStore](https://gpt-index.readthedocs.io/en/stable/examples/index_structs/knowledge_graph/NebulaGraphKGIndexDemo.html) as an example in this demo. Before performing Graph RAG on an existing KG, let's make sure we have a running NebulaGraph with a defined data schema.

This step installs the NebulaGraph clients and prepares the context that defines a [NebulaGraph Graph Space](https://docs.nebula-graph.io/3.6.0/1.introduction/2.data-model/).

##### Create a NebulaGraph (version 3.5.0 or newer) cluster with:
###### Option 0, for machines with Docker: `curl -fsSL nebula-up.siwei.io/install.sh | bash`
###### Option 1, for Desktop: NebulaGraph Docker Extension https://hub.docker.com/extensions/weygu/nebulagraph-dd-ext

###### If the graph space does not exist yet, create it with the following commands from NebulaGraph's console:

```bash
CREATE SPACE llamaindex(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
:sleep 10;
USE llamaindex;
CREATE TAG entity(name string);
CREATE EDGE relationship(relationship string);
:sleep 10;
CREATE TAG INDEX entity_index ON entity(name(256));
```
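If you would rather generate these statements from Python (for example, to change the space name or the VID length), a minimal sketch follows. The helper `build_schema_statements` and its parameters are hypothetical, not part of NebulaGraph or LlamaIndex. It only builds the statement strings shown above and does not connect to a server. The console-only `:sleep 10` pauses are left out, so you still need to wait for the schema to propagate between steps.

```python
def build_schema_statements(space_name="llamaindex", vid_length=256, name_index_length=256):
    """Return the nGQL schema statements from this section as strings.

    Hypothetical helper for illustration; it does not talk to NebulaGraph.
    """
    # Guard against accidental injection through the space name.
    if not space_name.isidentifier():
        raise ValueError(f"unsafe space name: {space_name!r}")
    return [
        f"CREATE SPACE {space_name}(vid_type=FIXED_STRING({vid_length}), "
        "partition_num=1, replica_factor=1);",
        f"USE {space_name};",
        "CREATE TAG entity(name string);",
        "CREATE EDGE relationship(relationship string);",
        f"CREATE TAG INDEX entity_index ON entity(name({name_index_length}));",
    ]


for statement in build_schema_statements():
    print(statement)
```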

In [2]:
# %pip install ipython-ngql nebula3-python

os.environ["NEBULA_USER"] = "root"
os.environ["NEBULA_PASSWORD"] = "nebula"  # default is "nebula"
os.environ["NEBULA_ADDRESS"] = (
    "127.0.0.1:9669"  # assumed we have NebulaGraph installed locally
)

space_name = "llamaindex"
edge_types, rel_prop_names = ["relationship"], [
    "relationship"
]  # defaults; can be omitted when creating from an empty KG
tags = ["entity"]  # default; can be omitted when creating from an empty KG

Then we can instantiate a `NebulaGraphStore` and use it as the `graph_store` of a `StorageContext`.

In [3]:
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore


graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

Here, we assume we have the same Knowledge Graph as in [this tutorial](https://gpt-index.readthedocs.io/en/latest/examples/query_engine/knowledge_graph_query_engine.html#optional-build-the-knowledge-graph-with-llamaindex).

Let's follow that tutorial:

## One-time operation to load the DB

With the help of LlamaIndex and the LLM defined above, we can build a Knowledge Graph from the given documents.

If we already have a Knowledge Graph on NebulaGraphStore, this step can be skipped.

Load data from Wikipedia for "Guardians of the Galaxy Vol. 2":

In [22]:
from llama_index.readers.wikipedia import WikipediaReader

loader = WikipediaReader()

documents = loader.load_data(
    pages=["Guardians of the Galaxy Vol. 2"], auto_suggest=False
)

## Generate a KnowledgeGraphIndex with NebulaGraph as graph_store

Next, we create a `KnowledgeGraphIndex` to enable graph-based RAG. As a side benefit, we also get a Knowledge Graph up and running for other purposes!

In [None]:
from llama_index.core import KnowledgeGraphIndex

kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=10,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
    include_embeddings=True,
)

Parsing nodes:   0%|          | 0/1 [00:00<?, ?it/s]

Processing nodes:   0%|          | 0/227 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/10 [00:00<?, ?it/s]


## Perform Graph RAG Query

Finally, let's demo how to do Graph RAG against an existing Knowledge Graph.

All we need to do is use `RetrieverQueryEngine` and configure its retriever to be `KnowledgeGraphRAGRetriever`.

The `KnowledgeGraphRAGRetriever` performs the following steps:

- Search for Entities related to the question/task
- Get the SubGraph of those Entities (default depth: 2) from the KG
- Build Context based on the SubGraph
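
To make the three steps concrete, here is a toy, in-memory sketch. It is not how `KnowledgeGraphRAGRetriever` is implemented: the real retriever queries the graph store and relies on the configured LLM, whereas this sketch assumes a hard-coded list of triples and a naive substring match. All function names and the sample triples are hypothetical.

```python
from collections import deque

# Toy triples in (subject, relation, object) form; illustrative data only.
TRIPLES = [
    ("Groot", "Is member of", "Guardians"),
    ("Vin diesel", "Plays", "Groot"),
    ("Guardians", "Rescued", "Starlord"),
    ("Zoe saldaña", "Plays", "Gamora"),
]


def find_entities(question, triples):
    """Step 1: naive keyword match of known entity names against the question."""
    names = {s for s, _, _ in triples} | {o for _, _, o in triples}
    lowered = question.lower()
    return {name for name in names if name.lower() in lowered}


def get_subgraph(entities, triples, depth=2):
    """Step 2: breadth-first search collecting triples within `depth` hops."""
    seen = set(entities)
    frontier = deque((entity, 0) for entity in entities)
    selected = []
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for triple in triples:
            subject, _, obj = triple
            if node not in (subject, obj):
                continue
            if triple not in selected:
                selected.append(triple)
            neighbor = obj if node == subject else subject
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return selected


def build_context(subgraph):
    """Step 3: flatten the subgraph into text that can be passed to an LLM."""
    return "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in subgraph)


question = "Tell me about Groot's relationships?"
entities = find_entities(question, TRIPLES)
print(build_context(get_subgraph(entities, TRIPLES, depth=2)))
```

With `depth=2` the sketch also pulls in facts about Groot's neighbors (here, the Guardians), which is the reason a subgraph can provide richer context than a single-hop lookup.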

Note that the search for related Entities can be either keyword-extraction based or embedding based. This is controlled by the `retriever_mode` argument of `KnowledgeGraphRAGRetriever`; the supported options are:
- "keyword"
- "embedding" (not yet implemented)
- "keyword_embedding" (not yet implemented)

Here is an example of how to use `RetrieverQueryEngine` and `KnowledgeGraphRAGRetriever`:

In [4]:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import KnowledgeGraphRAGRetriever

graph_rag_retriever = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    verbose=True,
)

query_engine = RetrieverQueryEngine.from_args(
    graph_rag_retriever,
)


Then we can query it:

In [5]:
import nest_asyncio

nest_asyncio.apply()

In [6]:
from IPython.display import display, Markdown

response = await query_engine.aquery(
    "Tell me about Groot's relationships?",
)
display(Markdown(f"<b>{response}</b>"))

Falling back to grpc since no async rest credentials were detected.


<b>Groot is a character who has many relationships. He is played by Vin Diesel, who also plays Baby Groot. Groot stars in a film and has a sequel. He helps Kraglin, who plays a role in saving Knowhere's citizens. 
</b>

In [7]:
%load_ext ngql
%ngql --address $NEBULA_SERVER_ADDRESS --port 9669 --user root --password <password>

[OK] Connection Pool Created
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)


| | Name |
|---|---|
| 0 | llamaindex |


In [8]:
%ngql SHOW EDGES


INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)


| | Name |
|---|---|
| 0 | `Relation__` |
| 1 | `__meta__node_label__` |
| 2 | `__meta__rel_label__` |
| 3 | `relationship` |


## Include nl2graphquery as Context in Graph RAG

(Sub)Graph RAG and nl2graphquery are different in nature. Neither is better than the other; each fits certain types of questions better. To understand how they differ, see [this demo](https://www.siwei.io/en/demos/graph-rag/) comparing the two.

<video width="938" height="800"
       src="https://github.com/siwei-io/talks/assets/1651790/05d01e53-d819-4f43-9bf1-75549f7f2be9"  
       controls>
</video>

In real-world cases, we may not always know which approach works better. One way to make the most of the KG in RAG is therefore to fetch both retrieval results as context and let the LLM and prompt generate the answer from all of them.

So, optionally, we can synthesize the answer from two pieces of context retrieved from the KG:
- Graph RAG, the default retrieval method, which extracts the subgraph related to the key entities in the question.
- NL2GraphQuery, which generates a Knowledge Graph query based on the question and the schema of the Knowledge Graph. It is switched off by default.
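
As a rough illustration of synthesizing from both pieces of context, the sketch below merges two lists of retrieved facts, drops duplicates, and builds a single prompt. The function `combine_contexts`, the prompt wording, and the sample facts are all hypothetical; LlamaIndex's own response synthesizer and prompts are different.

```python
def combine_contexts(question, subgraph_facts, query_facts):
    """Merge facts from both retrieval paths into one prompt (illustrative only)."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    merged = list(dict.fromkeys([*subgraph_facts, *query_facts]))
    facts = "\n".join(f"- {fact}" for fact in merged)
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}"
    )


# Context as it might come back from SubGraph retrieval (made-up sample).
subgraph_facts = [
    "Vin diesel -[Plays]-> Groot",
    "Groot -[Is member of]-> Guardians",
]
# Context as it might come back from a generated graph query (made-up sample).
query_facts = [
    "Groot -[Is member of]-> Guardians",
    "Groot -[Plants]-> Bomb",
]

prompt = combine_contexts(
    "Tell me about Groot's relationships?", subgraph_facts, query_facts
)
print(prompt)
```

Deduplicating before prompting keeps the context short when both retrieval paths return the same edge, which is likely when the question centers on a single entity.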

We can set `with_nl2graphquery=True` to enable it:

In [9]:
graph_rag_retriever_with_nl2graphquery = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    verbose=True,
    with_nl2graphquery=True,
)

query_engine_with_nl2graphquery = RetrieverQueryEngine.from_args(
    graph_rag_retriever_with_nl2graphquery,
)


In [10]:
response = query_engine_with_nl2graphquery.query(
    "What do you know about Groot's relationships?",
)
display(Markdown(f"<b>{response}</b>"))

template
  Input should be a valid dictionary or instance of BasePromptTemplate [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.9/v/model_type


<b>Groot has a variety of relationships. He is played by Vin Diesel, who also returns as Groot. Groot is also known as Baby Groot. He stars in a sequel that was directed by James Gunn. The sequel was set shortly after the first film and was Gunn's next project. Groot also helps Kraglin, who is played by Sean Gunn. Kraglin helps Rocket and takes on a mission. 
</b>

Let's inspect `response.metadata` to learn more about the retrieval of Graph RAG with nl2graphquery.

- **text2Cypher**: it generates a Cypher query for the answer and uses the result as context.

```cypher
Graph Store Query: MATCH (e:`entity`)-[r:`relationship`]->(e2:`entity`)
WHERE e.`entity`.`name` == 'Peter Quill'
RETURN e2.`entity`.`name`
```
- **SubGraph RAG**: it gets the SubGraph of 'Peter Quill' to build the context.

- Finally, it combines the two context nodes to synthesize the answer.

In [17]:
import pprint

pp = pprint.PrettyPrinter()
pp.pprint(response.metadata)

{'aa3d458e-6bd2-4029-a851-55224b65816b': {'kg_rel_map': {'Groot{name: Groot}': ['Groot{name: '
                                                                                'Groot} '
                                                                                '<-[relationship:{relationship: '
                                                                                'Would '
                                                                                'return '
                                                                                'as}]- '
                                                                                'Vin '
                                                                                'diesel{name: '
                                                                                'Vin '
                                                                                'diesel} '
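
The path strings in `kg_rel_map` can be turned back into triples with a small parser. The sketch below is an assumption-heavy example: the pattern is inferred only from the truncated output above, the forward-arrow form `-[...]->` is a guess, and the format may differ between versions. The sample string is the visible prefix of the first path, and `first_hop` is a hypothetical helper that reads only the first hop.

```python
import re

# One hop, as inferred from the output above (format is an assumption):
#   "<left>{name: ...} <-[relationship:{relationship: <rel>}]- <right>{name: ...}"
HOP = re.compile(
    r"(?P<left>[^{}]+)\{name: [^{}]*\}\s*"
    r"(?P<larrow><?)-\[relationship:\{relationship: (?P<rel>[^{}]*)\}\]-(?P<rarrow>>?)\s*"
    r"(?P<right>[^{}]+)\{name: [^{}]*\}"
)


def first_hop(path):
    """Return (subject, relation, object) for the first hop, or None if no match."""
    match = HOP.match(path)
    if match is None:
        return None
    left, right = match["left"].strip(), match["right"].strip()
    # "<-" means the edge points from the right-hand entity to the left-hand one.
    if match["larrow"] == "<":
        return (right, match["rel"], left)
    return (left, match["rel"], right)


sample = (
    "Groot{name: Groot} <-[relationship:{relationship: Would return as}]- "
    "Vin diesel{name: Vin diesel}"
)
print(first_hop(sample))
```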
                                                                        

In [None]:
# Query some random Relationships with Cypher
%ngql USE llamaindex;
%ngql MATCH (p)-[e]->(q) RETURN e LIMIT 20

INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)
INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)


| | e |
|---|---|
| 0 | `("Zuckerman")-[:relationship@6890329262785610005{relationship: "Called"}]->("Teaser crowd-pleaser")` |
| 1 | `("Zoe saldaña")-[:relationship@-7995950907875852384{relationship: "Reprises role in"}]->("Vol. 3")` |
| 2 | `("Zoe saldaña")-[:relationship@-437761614642448752{relationship: "Plays"}]->("Gamora")` |
| 3 | `("Yondu udonta")-[:relationship@-8340197319996953382{relationship: "Died"}]->("End of vol. 2")` |
| 4 | `("Yondu udonta")-[:relationship@1211371676249638883{relationship: "Is"}]->("Blue-skinned buccaneer")` |
| 5 | `("Yondu udonta")-[:relationship@1211371676249638883{relationship: "Is"}]->("Buccaneer")` |
| 6 | `("Yondu udonta")-[:relationship@1211371676249638883{relationship: "Is"}]->("Fatherly figure")` |
| 7 | `("Yondu udonta")-[:relationship@1211371676249638883{relationship: "Is"}]->("Fatherly figure quill")` |
| 8 | `("Yondu udonta")-[:relationship@3587906030433501795{relationship: "Exiled from"}]->("Ravager community")` |
| 9 | `("Yondu udonta")-[:relationship@8813752868920503576{relationship: "Has"}]->("Larger head fin")` |


In [61]:
%ngql MATCH (p:entity)-[r]->(m:entity) WHERE p.entity.name == 'Groot' RETURN p.entity.name, r, m;

INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)


| | p.entity.name | r | m |
|---|---|---|---|
| 0 | Groot | `("Groot")-[:relationship@-9079449635137805301{relationship: "Was"}]->("Rocket's protector")` | `("Rocket's protector" :entity{name: "Rocket's protector"})` |
| 1 | Groot | `("Groot")-[:relationship@-7060404804743788649{relationship: "Arrive"}]->("Planet")` | `("Planet" :entity{name: "Planet"})` |
| 2 | Groot | `("Groot")-[:relationship@-7046138375535257733{relationship: "Character"}]->("Popular")` | `("Popular" :entity{name: "Popular"})` |
| 3 | Groot | `("Groot")-[:relationship@-6734330852527633311{relationship: "Is member of"}]->("Guardians")` | `("Guardians" :entity{name: "Guardians"})` |
| 4 | Groot | `("Groot")-[:relationship@-6302077391416622777{relationship: "Rescued from"}]->("Arête's destruction")` | `("Arête's destruction" :entity{name: "Arête's destruction"})` |
| 5 | Groot | `("Groot")-[:relationship@-5758365118766559710{relationship: "Plants"}]->("Bomb")` | `("Bomb" :entity{name: "Bomb"})` |
| 6 | Groot | `("Groot")-[:relationship@-5758365118766559710{relationship: "Plants"}]->("Ego's brain")` | `("Ego's brain" :entity{name: "Ego's brain"})` |
| 7 | Groot | `("Groot")-[:relationship@-5733813237520639549{relationship: "Starring as"}]->("Main characters")` | `("Main characters" :entity{name: "Main characters"})` |
| 8 | Groot | `("Groot")-[:relationship@-5399595019689382089{relationship: "Remains"}]->("Ship")` | `("Ship" :entity{name: "Ship"})` |
| 9 | Groot | `("Groot")-[:relationship@-5133317990461404533{relationship: "Take on"}]->("Mission")` | `("Mission" :entity{name: "Mission"})` |
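
If you want to post-process these results in Python, the edge strings in the `r` column can be parsed into plain triples. The sketch below assumes the edge text has exactly the shape shown in the output above, with single double-quote characters around names. `parse_edge` is a hypothetical helper, not part of `ipython-ngql` or `nebula3-python`.

```python
import re

# Shape assumed from the output above:
#   ("src")-[:edge_type@rank{relationship: "label"}]->("dst")
EDGE = re.compile(
    r'\("(?P<src>[^"]*)"\)-\[:(?P<type>\w+)@(?P<rank>-?\d+)'
    r'\{relationship: "(?P<label>[^"]*)"\}\]->\("(?P<dst>[^"]*)"\)'
)


def parse_edge(text):
    """Return (source, label, destination), or None if the text does not match."""
    match = EDGE.fullmatch(text.strip())
    if match is None:
        return None
    return (match["src"], match["label"], match["dst"])


# Two edges copied from the query result above.
rows = [
    '("Groot")-[:relationship@-6734330852527633311{relationship: "Is member of"}]->("Guardians")',
    '("Groot")-[:relationship@-5758365118766559710{relationship: "Plants"}]->("Bomb")',
]

for row in rows:
    print(parse_edge(row))
```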


In [62]:
%ng_draw

<class 'pyvis.network.Network'> |N|=21 |E|=21

In [65]:
response = await query_engine.aquery(
    "Tell me about Starlord's relationships?",
)
display(Markdown(f"<b>{response}</b>"))

<b>Starlord cares about Yondu, who has a larger head fin, cares about Starlord, hesitates to turn over a quill, is Starlord's father, keeps Starlord, dies in a vacuum, reveals Starlord's origin, collects Starlord, collects offspring, arrives on a planet, is Starlord's father, has a relationship with Starlord, is hired by Ego, is accepted by Stakar Ogord, has a history with Stakar Ogord, confronts Stakar Ogord, is helped by Kraglin, is accepted by Stakar, is reluctant to kill Gunn, is accepted by Stakar Ogord, shoots Nebula, has a grudge against Stakar Ogord. Starlord is rescued by the Guardians, who are a team that includes Mantis, Gamora, Baby Groot, Rocket, and Phyla. Starlord decides to leave the Guardians, who interact with Warlock, are about a film, and are members of the Guardians. 
</b>

In [66]:
%ng_draw

<class 'pyvis.network.Network'> |N|=21 |E|=21

In [None]:
%ng_draw_schema

INFO:nebula3.logger:Get connection to ('127.0.0.1', 9669)


<class 'pyvis.network.Network'> |N|=9 |E|=4

# Cleanup
%ngql CLEAR SPACE $space_name;