In [1]:
# Load environment variables from a local .env file, if present
from dotenv import load_dotenv
load_dotenv()

# Notebooks already run an event loop; nest_asyncio allows nested async calls
import nest_asyncio
nest_asyncio.apply()

from google.oauth2 import service_account
from IPython.display import Markdown, display
from llama_index.core import (
    KnowledgeGraphIndex,
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.graph_stores import SimpleGraphStore
from llama_index.embeddings.vertex import VertexTextEmbedding
from llama_index.llms.vertex import Vertex

In [2]:
# Load the survey PDF as a list of Document objects
documents = SimpleDirectoryReader(
    input_files=["data/Graph_Retrieval-Augmented_Generation_A_Survey.pdf"]
).load_data()

In [3]:
filename = "credentials.json"
credentials: service_account.Credentials = (
    service_account.Credentials.from_service_account_file(filename)
)

In [4]:
llm = Vertex(
    model="gemini-pro",
    project=credentials.project_id,
    credentials=credentials,
)

# Use this LLM and chunk size globally for indexing and querying
Settings.llm = llm
Settings.chunk_size = 512
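
To make the effect of `chunk_size` concrete, here is a minimal sketch of fixed-size chunking. It is an illustration only: `chunk_tokens` is a hypothetical helper, it counts whitespace-separated words rather than model tokens, and it does not reproduce the sentence-aware splitting that LlamaIndex applies internally.

```python
def chunk_tokens(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most `chunk_size` whitespace tokens.

    Hypothetical helper for illustration; not part of LlamaIndex.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Synthetic 1200-word text, split with the same chunk size as the notebook
sample = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_tokens(sample, chunk_size=512)
print([len(c.split()) for c in chunks])
```

Smaller chunks give the triplet extractor more, narrower passages to work on; larger chunks mean fewer LLM calls but more text competing for the same per-chunk triplet budget.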

In [5]:
embed_model = VertexTextEmbedding(
    model_name="textembedding-gecko@003",
    project=credentials.project_id, credentials=credentials
)

Settings.embed_model = embed_model

In [6]:
graph_store = SimpleGraphStore()
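# Start from an empty in-memory context; passing persist_dir here would try to
# load existing files from ./storage, which fails on a first run
storage_context = StorageContext.from_defaults(graph_store=graph_store)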

In [7]:
index = KnowledgeGraphIndex.from_documents(
    documents,
    max_triplets_per_chunk=3,
    storage_context=storage_context,
    include_embeddings=True,
)
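
The index above stores knowledge as (subject, relation, object) triplets, keeping at most `max_triplets_per_chunk` per chunk. The following sketch shows that idea with a toy in-memory store. `ToyTripletStore` and `index_chunk` are hypothetical names, the triplets are hand-written rather than extracted from the PDF, and the real extraction is done by the LLM, which this sketch does not attempt.

```python
from collections import defaultdict


class ToyTripletStore:
    """Hypothetical in-memory store: subject -> list of (relation, object)."""

    def __init__(self) -> None:
        self._rel_map: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add(self, subj: str, rel: str, obj: str) -> None:
        pair = (rel, obj)
        # Skip exact duplicates so repeated chunks do not inflate the graph
        if pair not in self._rel_map[subj]:
            self._rel_map[subj].append(pair)

    def get(self, subj: str) -> list[tuple[str, str]]:
        return list(self._rel_map.get(subj, []))


def index_chunk(store, triplets, max_triplets_per_chunk=3):
    """Store at most `max_triplets_per_chunk` triplets; return how many were kept."""
    kept = triplets[:max_triplets_per_chunk]
    for subj, rel, obj in kept:
        store.add(subj, rel, obj)
    return len(kept)


# Hand-written example triplets for a single chunk (assumed, not real output)
chunk_triplets = [
    ("GraphRAG", "retrieves from", "graph database"),
    ("GraphRAG", "extends", "RAG"),
    ("RAG", "augments", "LLM"),
    ("LLM", "generates", "text"),
]

store = ToyTripletStore()
kept = index_chunk(store, chunk_triplets, max_triplets_per_chunk=3)
print(kept, store.get("GraphRAG"))
```

With a cap of 3, the fourth triplet is dropped, so the number of chunks times `max_triplets_per_chunk` is an upper bound on the size of the graph.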

In [8]:
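# Write the index, document store, and graph store to disk
index.storage_context.persist(persist_dir="./storage")

# Reload from disk so that later cells use the persisted copy
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)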
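
Because building the graph calls the LLM for every chunk, a common pattern is to build once and load from disk afterwards. The sketch below shows that pattern with plain JSON. `build_or_load` and the `toy_index.json` file name are assumptions made for illustration, and in the notebook the build and load steps would be `KnowledgeGraphIndex.from_documents` and `load_index_from_storage` instead.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def build_or_load(persist_dir: Path, build: Callable[[], dict]) -> tuple[dict, bool]:
    """Return (data, was_loaded): load from disk if present, else build and persist."""
    marker = persist_dir / "toy_index.json"  # hypothetical file name
    if marker.exists():
        return json.loads(marker.read_text(encoding="utf-8")), True
    data = build()
    persist_dir.mkdir(parents=True, exist_ok=True)
    marker.write_text(json.dumps(data), encoding="utf-8")
    return data, False


calls = []


def expensive_build() -> dict:
    """Stand-in for the costly LLM-driven index build."""
    calls.append(1)
    return {"triplets": [["GraphRAG", "extends", "RAG"]]}


with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp) / "storage"
    first, loaded_first = build_or_load(storage, expensive_build)
    second, loaded_second = build_or_load(storage, expensive_build)

print(loaded_first, loaded_second, len(calls))
```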

In [9]:
query_engine = index.as_query_engine(
    include_text=True, 
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=10,
)
response = query_engine.query(
    "What is Graph Retrieval-Augmented Generation?",
)
display(Markdown(f"{response}"))

## What is Graph Retrieval-Augmented Generation (GraphRAG)?

Graph Retrieval-Augmented Generation (GraphRAG) is a novel approach that combines the strengths of large language models (LLMs) and graph data to enhance the performance of various downstream tasks. It leverages the powerful natural language processing capabilities of LLMs to generate high-quality responses while utilizing the rich knowledge and relationships embedded in graph data to improve the accuracy and relevance of the generated content.

Here are some key points about GraphRAG:

* **It combines graph foundation models with knowledge retrieval.** This allows GraphRAG to leverage the strengths of both approaches, resulting in improved performance on a variety of tasks.
* **It is a powerful tool for a variety of downstream tasks.** These include question answering, commonsense reasoning, information retrieval, and others.
* **It is a survey of existing methods and techniques for graph retrieval-augmented generation.** This means that it provides an overview of the current state of the art in this field, including different approaches to retrieval, generation enhancement, and downstream tasks.

If you would like to learn more about GraphRAG, I recommend reading the survey paper that I have provided a link to in the context information.
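
The query above asks for a hybrid retrieval mode, which in general means combining keyword matches with embedding similarity. The sketch below illustrates that idea on toy data. It is not LlamaIndex's implementation: `hybrid_retrieve` is a hypothetical function, and the triplets and 2-D vectors are made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_retrieve(query_keywords, query_vec, triplet_vecs, top_k):
    """Union of keyword-matched triplets and the top_k most similar triplets."""
    triplets = list(triplet_vecs)
    wanted = {k.lower() for k in query_keywords}

    # Keyword pass: match on the subject or object entity
    keyword_hits = [
        t for t in triplets if t[0].lower() in wanted or t[2].lower() in wanted
    ]

    # Embedding pass: rank every triplet by similarity to the query vector
    ranked = sorted(
        triplets, key=lambda t: cosine(query_vec, triplet_vecs[t]), reverse=True
    )
    embedding_hits = ranked[:top_k]

    # Merge, keeping order and dropping duplicates
    merged, seen = [], set()
    for t in keyword_hits + embedding_hits:
        if t not in seen:
            seen.add(t)
            merged.append(t)
    return merged


# Made-up triplets with toy 2-D embeddings
triplet_vecs = {
    ("GraphRAG", "extends", "RAG"): (1.0, 0.0),
    ("RAG", "augments", "LLM"): (0.8, 0.6),
    ("fried rice", "contains", "egg"): (0.0, 1.0),
}

results = hybrid_retrieve(["LLM"], (1.0, 0.0), triplet_vecs, top_k=1)
print(results)
```

Here the keyword pass finds the triplet that mentions "LLM", while the embedding pass contributes a triplet that shares no keyword with the query but is closest in vector space.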


In [10]:
query_engine = index.as_query_engine(
    include_text=False, 
    response_mode="tree_summarize",
    embedding_mode="hybrid",
    similarity_top_k=5,
)
response = query_engine.query(
    "How to do Query-Focused Summarization (QFS)?",
)
display(Markdown(f"{response}"))

## Query-Focused Summarization (QFS)

QFS aims to generate summaries that are relevant and responsive to a specific query. Here's a breakdown of the process:

**1. Understanding the Query:**

* Analyze the query to identify key concepts, entities, and relationships.
* Use techniques like named entity recognition and relation extraction to extract relevant information.

**2. Retrieval and Ranking:**

* Retrieve relevant documents or passages from a knowledge base or corpus.
* Rank the retrieved documents based on their relevance to the query. This can involve techniques like keyword matching, semantic similarity, or neural ranking models.

**3. Summarization:**

* Generate a summary of the retrieved documents that is focused on the query.
* This can involve techniques like sentence extraction, abstractive summarization, or a combination of both.
* Ensure the summary is concise, informative, and responsive to the query.

**4. Evaluation:**

* Evaluate the generated summary for its relevance, informativeness, and responsiveness to the query.
* Use metrics like ROUGE, BLEU, or human evaluation to assess the quality of the summary.

**Additional Considerations:**

* **Domain-specific knowledge:** Incorporate domain-specific knowledge to improve the accuracy and relevance of the summary.
* **User preferences:** Consider user preferences for summary length, style, and level of detail.
* **Explainability:** Provide explanations for how the summary was generated and why certain information was included or excluded.

**Resources:**

* **Unified retrieval and reasoning for solving multi-hop question answering over knowledge graph:** This paper discusses a method for multi-hop question answering over knowledge graphs, which can be helpful for retrieving relevant information for QFS.
* **Graph retrieval-augmented generation:** This survey provides an overview of graph-based approaches for text generation, which can be useful for generating summaries that are more structured and informative.
* **Dense passage retrieval:** This paper discusses dense passage retrieval, which can be used to efficiently retrieve relevant passages for QFS.
* **LLMs for summarization:** Large language models (LLMs) can be used for abstractive summarization, which can generate more fluent and informative summaries.

By following these steps and considering the additional factors, you can effectively perform Query-Focused Summarization.
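
The answer above mentions ROUGE as an evaluation metric. As a rough illustration, here is a minimal ROUGE-1 (unigram overlap) sketch. It assumes lowercase whitespace tokenization with no stemming, so its scores will differ from those of standard ROUGE packages, and the two example sentences are invented.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict[str, float]:
    """Unigram-overlap precision, recall, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


reference = "graphrag retrieves graph elements to ground llm answers"
candidate = "graphrag retrieves graph data for llm answers"
scores = rouge1(candidate, reference)
print(scores)
```

Five of the seven candidate words appear in the eight-word reference, so precision is 5/7 and recall is 5/8.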

In [11]:
query_engine = index.as_query_engine(
    include_text=False,
)
response = query_engine.query(
    "How to make fried rice?",
)
display(Markdown(f"{response}"))

I am sorry, but the context provided does not contain information on how to make fried rice. The context focuses on knowledge sequences related to generative approaches and discriminative models. 


In [12]:
from IPython.display import IFrame
from pyvis.network import Network

# Get the networkx graph
g = index.get_networkx_graph()

# Create the Pyvis network
net = Network(notebook=True, cdn_resources="in_line", directed=True)

# Load the networkx graph into the Pyvis network
net.from_nx(g)

# Generate HTML content
html_content = net.generate_html()

# Write the HTML content to a file with UTF-8 encoding
with open("graph-rag.html", "w", encoding="utf-8") as f:
    f.write(html_content)

# Display the generated HTML file in the notebook
IFrame("graph-rag.html", width=900, height=600)
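
A graph extracted from a full paper can be too dense to read in one view. One possible way to thin it, sketched below under stated assumptions, is to keep only the most connected entities before plotting. `top_entities` and `induced_edges` are hypothetical helpers that work on a plain list of triplets, and the edges are hand-written examples rather than output from the index.

```python
from collections import Counter


def top_entities(edges, n):
    """Return the n entities with the highest degree (in plus out)."""
    degree = Counter()
    for subj, _rel, obj in edges:
        degree[subj] += 1
        degree[obj] += 1
    return [name for name, _count in degree.most_common(n)]


def induced_edges(edges, keep):
    """Keep only edges whose subject and object are both in `keep`."""
    keep = set(keep)
    return [e for e in edges if e[0] in keep and e[2] in keep]


# Hand-written example edges (assumed, not extracted from the PDF)
edges = [
    ("GraphRAG", "extends", "RAG"),
    ("GraphRAG", "uses", "knowledge graph"),
    ("RAG", "augments", "LLM"),
    ("GraphRAG", "improves", "LLM"),
]

hubs = top_entities(edges, 3)
small = induced_edges(edges, hubs)
print(hubs, len(small))
```

The reduced edge list could then be loaded into a fresh graph and rendered the same way as the full one.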