# Goal

This project demonstrates how to build an end-to-end GraphRAG pipeline: processing raw text, extracting structured knowledge with LLMs, detecting semantic communities within the knowledge graph, and finally answering questions in an interpretable, context-aware way.
By combining LLM-based extraction, graph algorithms, and semantic summarization, we transform unstructured text (Charles Dickens’ *A Christmas Carol*) into a structured, queryable knowledge base that offers an explainable alternative to traditional RAG systems.

This project highlights how GraphRAG elevates information retrieval by:
- Embedding semantic understanding at the chunk level,
- Structuring knowledge into meaningful communities,
- Summarizing communities with LLMs for human-readable insights,
- Synthesizing multi-perspective answers grounded in structured knowledge.

# 1. Setting Up LLM Access

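This cell instantiates a `gpt-3.5-turbo` language model for the downstream extraction and summarization tasks. If `OPENAI_API_KEY` is not already set in the environment, replace the placeholder with your own key.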

In [1]:
from llama_index.llms.openai import OpenAI
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = "your-openai-api-key"

llm = OpenAI(model="gpt-3.5-turbo")

# 2. GraphRAGExtractor

A transformer component that:
- Sends text chunks to an LLM guided by a structured prompt.
- Extracts entities, types, and relationships.
- Embeds semantic descriptions directly into metadata.
- Outputs the extracted triples in LlamaIndex’s graph format.

This forms the knowledge foundation of the GraphRAG system.


## 2.1. Environment Setup

The code first makes asynchronous operations work in environments like Jupyter notebooks, which already run an event loop, by importing `asyncio` and applying `nest_asyncio`. This lets concurrent tasks, such as parallel extraction from multiple text chunks, run from within the notebook.


## 2.2. Class Overview

The `GraphRAGExtractor` class inherits from `TransformComponent` and is initialized with the following:
- An LLM instance `llm`, responsible for extracting knowledge from text.
- A structured prompt `extract_prompt` that instructs the LLM to output entity-relation triplets.
- A parsing function `parse_fn` that converts raw LLM responses into structured formats.
- Parameters like `max_paths_per_chunk` (limits number of triplets) and `num_workers` (enables parallel processing).

If a user does not explicitly supply a prompt or LLM, the extractor defaults to pre-configured values from LlamaIndex settings.


## 2.3. Extraction Workflow

The main extraction happens in two methods:
- `_aextract()`:
    - Takes a text node `BaseNode` as input.
    - Sends the text to the LLM along with the extraction prompt.
    - Parses the LLM output to obtain entities, entity descriptions, relationships, and relationship descriptions.
    - Converts this information into `EntityNode` and `Relation` objects, which are stored inside the node’s metadata.
- `acall()`:
    - Manages asynchronous execution.
    - Launches multiple `_aextract()` jobs in parallel across different nodes.
    - Uses `run_jobs()` for efficient, scalable batch processing, especially for large datasets.

Finally, the `__call__()` method simply wraps `acall()` into a synchronous call using `asyncio.run()`, making it easy for users to trigger extraction in a single step.
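
The pattern described above (bounded parallel jobs plus a synchronous wrapper) can be sketched with the standard library alone. This is a minimal illustration, not the `run_jobs()` implementation: `run_bounded`, `fake_extract`, and `extract_all` are hypothetical names, and the fake worker stands in for a real LLM call.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_bounded(
    items: List[str],
    worker: Callable[[str], Awaitable[str]],
    num_workers: int = 4,
) -> List[str]:
    """Run `worker` over `items` with at most `num_workers` jobs in flight."""
    semaphore = asyncio.Semaphore(num_workers)

    async def guarded(item: str) -> str:
        # The semaphore caps how many workers run at the same time.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_extract(chunk: str) -> str:
    """Hypothetical stand-in for an asynchronous LLM extraction call."""
    await asyncio.sleep(0.01)
    return chunk.upper()


def extract_all(chunks: List[str], num_workers: int = 4) -> List[str]:
    """Synchronous wrapper, analogous to __call__() wrapping acall()."""
    return asyncio.run(run_bounded(chunks, fake_extract, num_workers))


results = extract_all(["marley was dead", "scrooge knew it"], num_workers=2)
print(results)
```

Outside a notebook this runs as written. Inside a notebook, where an event loop is already running, `asyncio.run()` only works after `nest_asyncio.apply()`, which is why the extractor cell applies it first.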


The `GraphRAGExtractor` transforms unstructured documents into rich, structured graphs, embedding semantic understanding at the node level.
These enriched nodes — carrying entity-relation metadata — serve as the foundation for building property graphs, detecting semantic communities, and enabling interpretable, context-rich retrieval in modern GenAI systems like GraphRAG.

In [2]:
import asyncio
import nest_asyncio

nest_asyncio.apply()

from typing import Any, List, Callable, Optional, Union, Dict
from IPython.display import Markdown, display

from llama_index.core.async_utils import run_jobs
from llama_index.core.indices.property_graph.utils import (
    default_parse_triplets_fn,
)
from llama_index.core.graph_stores.types import (
    EntityNode,
    KG_NODES_KEY,
    KG_RELATIONS_KEY,
    Relation,
)
from llama_index.core.llms.llm import LLM
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.default_prompts import (
    DEFAULT_KG_TRIPLET_EXTRACT_PROMPT,
)
from llama_index.core.schema import TransformComponent, BaseNode
from llama_index.core.bridge.pydantic import BaseModel, Field


class GraphRAGExtractor(TransformComponent):

    llm: LLM
    extract_prompt: PromptTemplate
    parse_fn: Callable
    num_workers: int
    max_paths_per_chunk: int

    def __init__(
        self,
        llm: Optional[LLM] = None,
        extract_prompt: Optional[Union[str, PromptTemplate]] = None,
        parse_fn: Callable = default_parse_triplets_fn,
        max_paths_per_chunk: int = 10,
        num_workers: int = 4,
    ) -> None:
        """Init params."""
        from llama_index.core import Settings

        if isinstance(extract_prompt, str):
            extract_prompt = PromptTemplate(extract_prompt)

        super().__init__(
            llm=llm or Settings.llm,
            extract_prompt=extract_prompt or DEFAULT_KG_TRIPLET_EXTRACT_PROMPT,
            parse_fn=parse_fn,
            num_workers=num_workers,
            max_paths_per_chunk=max_paths_per_chunk,
        )

    @classmethod
    def class_name(cls) -> str:
        return "GraphExtractor"

    def __call__(
        self, nodes: List[BaseNode], show_progress: bool = False, **kwargs: Any
    ) -> List[BaseNode]:
        """Extract triples from nodes."""
        return asyncio.run(
            self.acall(nodes, show_progress=show_progress, **kwargs)
        )

    async def _aextract(self, node: BaseNode) -> BaseNode:
        """Extract triples from a node."""
        assert hasattr(node, "text")

        text = node.get_content(metadata_mode="llm")
        try:
            llm_response = await self.llm.apredict(
                self.extract_prompt,
                text=text,
                max_knowledge_triplets=self.max_paths_per_chunk,
            )
            entities, entities_relationship = self.parse_fn(llm_response)
        except ValueError:
            entities = []
            entities_relationship = []

        existing_nodes = node.metadata.pop(KG_NODES_KEY, [])
        existing_relations = node.metadata.pop(KG_RELATIONS_KEY, [])
        for entity, entity_type, description in entities:
            # Use a fresh copy per entity so descriptions are never shared.
            # The description is not used yet, but will be useful in future work.
            entity_metadata = node.metadata.copy()
            entity_metadata["entity_description"] = description
            entity_node = EntityNode(
                name=entity, label=entity_type, properties=entity_metadata
            )
            existing_nodes.append(entity_node)

        for triple in entities_relationship:
            subj, obj, rel, description = triple
            # Endpoint nodes carry only the chunk metadata; the description
            # belongs to the relation, not to the nodes it connects.
            subj_node = EntityNode(name=subj, properties=node.metadata.copy())
            obj_node = EntityNode(name=obj, properties=node.metadata.copy())
            rel_metadata = node.metadata.copy()
            rel_metadata["relationship_description"] = description
            rel_node = Relation(
                label=rel,
                source_id=subj_node.id,
                target_id=obj_node.id,
                properties=rel_metadata,
            )

            existing_nodes.extend([subj_node, obj_node])
            existing_relations.append(rel_node)

        node.metadata[KG_NODES_KEY] = existing_nodes
        node.metadata[KG_RELATIONS_KEY] = existing_relations
        return node

    async def acall(
        self, nodes: List[BaseNode], show_progress: bool = False, **kwargs: Any
    ) -> List[BaseNode]:
        """Extract triples from nodes async."""
        jobs = []
        for node in nodes:
            jobs.append(self._aextract(node))

        return await run_jobs(
            jobs,
            workers=self.num_workers,
            show_progress=show_progress,
            desc="Extracting paths from text",
        )

# 3. GraphRAGStore

An intelligent graph store that:
 - Converts graphs to NetworkX format.
 - Applies graspologic’s Hierarchical Leiden for community detection.
 - Generates LLM-based summaries of intra-community relationships.
 - Stores and indexes these summaries for fast, structured query access.

`GraphRAGStore` extends `SimplePropertyGraphStore` to build communities within a knowledge graph and summarize them with a large language model (LLM). It is the middle layer of the GraphRAG architecture, where raw extracted knowledge is organized into clusters for efficient, interpretable retrieval.


## 3.1. Class Purpose

The `GraphRAGStore` is responsible for:
- Structuring the property graph into semantic communities.
- Summarizing each community’s internal relationships.
- Providing a retrieval-ready format where each cluster represents a topic or tightly connected concept set.

It introduces two configurable properties:
- `community_summary`: A dictionary that holds generated summaries per community.
- `max_cluster_size`: A parameter controlling the granularity of clusters during community detection.


## 3.2. Generating Community Summaries

The `generate_community_summary()` method sends a curated set of relationships to an LLM (via the `ChatMessage` interface) with explicit instructions:
- It asks the LLM to read relationship patterns of the form `entity1 -> entity2 -> relation -> relationship_description`.
- It instructs the LLM to synthesize a coherent, concise narrative capturing the essence of these relationships.
- The output is cleaned to remove the `assistant:` role prefix before storage.

This ensures that every semantic community has a human-readable, LLM-generated summary explaining its key ideas.
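
The cleanup step can be tried in isolation. The sketch below assumes, as the notebook code does, that the stringified chat response may start with an `assistant:` prefix; `strip_role_prefix` is a hypothetical helper name used only for illustration.

```python
import re


def strip_role_prefix(raw: str) -> str:
    """Remove a leading 'assistant:' role prefix and surrounding whitespace."""
    # The ^ anchor means only a prefix at the very start is removed.
    return re.sub(r"^assistant:\s*", "", raw).strip()


cleaned = strip_role_prefix("assistant: Scrooge and Marley were business partners.")
unchanged = strip_role_prefix("Scrooge and Marley were business partners.")
print(cleaned)
```

Because the pattern is anchored, text that merely mentions the word "assistant:" later in the summary is left untouched.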


## 3.3. Building Communities from the Graph

The `build_communities()` method orchestrates the core clustering logic:
- Converts the internal property graph into a `NetworkX` graph via `_create_nx_graph()`, creating nodes and edges whose attributes carry the relationship labels and descriptions.
- Applies the Hierarchical Leiden algorithm `hierarchical_leiden` to detect densely connected groups of nodes (communities).
- Extracts detailed information for each node based on their community using `_collect_community_info()`. For each node, it collects neighbor relationships only within the same community, ensuring that the community summary remains topically consistent.
- Generates community summaries by calling `_summarize_communities()`. It converts the detailed intra-community relationships into a string and feeds it into the LLM to produce a single semantic summary per cluster.

Thus, from a flat knowledge graph, the system creates high-level topical structures automatically.
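
To make the intra-community filtering concrete, here is a minimal sketch that uses plain Python data instead of NetworkX and graspologic. The edge list is sample data, the cluster ids are assigned by hand as a stand-in for the Hierarchical Leiden output, and `collect_community_info` is a simplified, hypothetical counterpart of `_collect_community_info()`.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (source, target, relation, description); sample data for illustration only.
Edge = Tuple[str, str, str, str]

edges: List[Edge] = [
    ("Scrooge", "Marley", "former partner", "They ran a business together."),
    ("Scrooge", "Bob Cratchit", "employs", "Cratchit is Scrooge's clerk."),
    ("Bob Cratchit", "Tiny Tim", "father of", "Tiny Tim is Cratchit's son."),
]

# Hand-assigned cluster ids, standing in for the community detection result.
node_to_cluster: Dict[str, int] = {
    "Scrooge": 0,
    "Marley": 0,
    "Bob Cratchit": 1,
    "Tiny Tim": 1,
}


def collect_community_info(
    edges: List[Edge], node_to_cluster: Dict[str, int]
) -> Dict[int, List[str]]:
    """Keep only relationships whose endpoints share a community."""
    info: Dict[int, List[str]] = defaultdict(list)
    for source, target, relation, description in edges:
        cluster = node_to_cluster[source]
        if node_to_cluster[target] == cluster:
            info[cluster].append(
                f"{source} -> {target} -> {relation} -> {description}"
            )
    return dict(info)


community_info = collect_community_info(edges, node_to_cluster)
for cluster_id, details in community_info.items():
    print(cluster_id, details)
```

The Scrooge and Bob Cratchit edge crosses two communities, so it is dropped from both. One difference from the notebook code: this sketch records each edge once, whereas iterating over every node's neighbors in an undirected graph visits each intra-community edge from both endpoints.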


## 3.4. Lazy Summary Retrieval

Finally, the `get_community_summaries()` method ensures that summaries are generated only when needed:
- If summaries do not exist yet, it triggers a full `build_communities()` process.
- Otherwise, it quickly retrieves the stored results.

This lazy execution approach optimizes resource usage and avoids redundant computations during repeated querying.
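
The build-on-first-access behaviour can be sketched independently of the graph store. `LazySummaries` and `expensive_build` are hypothetical names, and the build function returns canned data where the real store would run clustering and LLM calls.

```python
from typing import Callable, Dict


class LazySummaries:
    """Build summaries on first access, then serve the cached result."""

    def __init__(self, build: Callable[[], Dict[int, str]]) -> None:
        self._build = build
        self._summaries: Dict[int, str] = {}
        self.build_calls = 0  # counter kept only to make the behaviour visible

    def get(self) -> Dict[int, str]:
        # An empty dict is treated as "not built yet", as in the notebook.
        if not self._summaries:
            self._summaries = self._build()
            self.build_calls += 1
        return self._summaries


def expensive_build() -> Dict[int, str]:
    """Hypothetical stand-in for community detection plus summarization."""
    return {0: "Scrooge and Marley were business partners."}


store = LazySummaries(expensive_build)
first = store.get()
second = store.get()
print(store.build_calls)
```

One consequence of using emptiness as the "not built" signal: if a build legitimately produces no summaries, every later call triggers another build. A separate boolean flag would avoid that, at the cost of one more attribute.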


In the GraphRAG pipeline, `GraphRAGStore` structures the extracted knowledge into thematic clusters and distills each cluster's relationships into a summary, so the final retrieval system works from structured summaries instead of isolated facts.

In [3]:
import re
from llama_index.core.graph_stores import SimplePropertyGraphStore
import networkx as nx
from graspologic.partition import hierarchical_leiden

from llama_index.core.llms import ChatMessage


class GraphRAGStore(SimplePropertyGraphStore):
    community_summary = {}
    max_cluster_size = 5

    def generate_community_summary(self, text):
        """Generate summary for a given text using an LLM."""
        messages = [
            ChatMessage(
                role="system",
                content=(
                    "You are provided with a set of relationships from a knowledge graph, each represented as "
                    "entity1->entity2->relation->relationship_description. Your task is to create a summary of these "
                    "relationships. The summary should include the names of the entities involved and a concise synthesis "
                    "of the relationship descriptions. The goal is to capture the most critical and relevant details that "
                    "highlight the nature and significance of each relationship. Ensure that the summary is coherent and "
                    "integrates the information in a way that emphasizes the key aspects of the relationships."
                ),
            ),
            ChatMessage(role="user", content=text),
        ]
        response = OpenAI().chat(messages)
        clean_response = re.sub(r"^assistant:\s*", "", str(response)).strip()
        return clean_response

    def build_communities(self):
        """Builds communities from the graph and summarizes them."""
        nx_graph = self._create_nx_graph()
        community_hierarchical_clusters = hierarchical_leiden(
            nx_graph, max_cluster_size=self.max_cluster_size
        )
        community_info = self._collect_community_info(
            nx_graph, community_hierarchical_clusters
        )
        self._summarize_communities(community_info)

    def _create_nx_graph(self):
        """Converts internal graph representation to NetworkX graph."""
        nx_graph = nx.Graph()
        for node in self.graph.nodes.values():
            nx_graph.add_node(str(node))
        for relation in self.graph.relations.values():
            nx_graph.add_edge(
                relation.source_id,
                relation.target_id,
                relationship=relation.label,
                description=relation.properties["relationship_description"],
            )
        return nx_graph

    def _collect_community_info(self, nx_graph, clusters):
        """Collect detailed information for each node based on their community."""
        community_mapping = {item.node: item.cluster for item in clusters}
        community_info = {}
        for item in clusters:
            cluster_id = item.cluster
            node = item.node
            if cluster_id not in community_info:
                community_info[cluster_id] = []

            for neighbor in nx_graph.neighbors(node):
                if community_mapping[neighbor] == cluster_id:
                    edge_data = nx_graph.get_edge_data(node, neighbor)
                    if edge_data:
                        detail = f"{node} -> {neighbor} -> {edge_data['relationship']} -> {edge_data['description']}"
                        community_info[cluster_id].append(detail)
        return community_info

    def _summarize_communities(self, community_info):
        """Generate and store summaries for each community."""
        for community_id, details in community_info.items():
            details_text = (
                "\n".join(details) + "."
            )  # Ensure it ends with a period
            self.community_summary[
                community_id
            ] = self.generate_community_summary(details_text)

    def get_community_summaries(self):
        """Returns the community summaries, building them if not already done."""
        if not self.community_summary:
            self.build_communities()
        return self.community_summary

# 4. GraphRAGQueryEngine

`GraphRAGQueryEngine` is a custom query engine at the final stage of the GraphRAG pipeline. It answers from community summaries rather than raw document chunks, turning fragmented information into a single coherent answer to the user's query.


## 4.1. Class Purpose

The `GraphRAGQueryEngine` is designed to:
- Retrieve community-level summaries from the structured graph `GraphRAGStore`.
- Use a Large Language Model (LLM) to interpret each community in the context of the user’s query.
- Aggregate multiple community-level answers into a single, unified response.

In other words, it orchestrates the retrieval, interpretation, and synthesis of knowledge extracted and organized by earlier stages of the pipeline.


## 4.2. Query Processing Workflow

The main method, `custom_query(query_str)`, governs the full query execution flow:
- Step 1: Calls `get_community_summaries()` from the graph store, fetching all community summaries.
- Step 2: For each community, it generates a specific answer to the input query using `generate_answer_from_summary()`.
- Step 3: Consolidates all community-level answers into a final cohesive response via `aggregate_answers()`.

This multi-stage flow ensures that the query is answered holistically, taking into account multiple knowledge clusters rather than depending on a single passage or isolated match.


## 4.3. Answer Generation at the Community Level

The `generate_answer_from_summary()` method:
- Constructs a contextual system prompt embedding the specific community summary alongside the user’s query.
- Sends this structured prompt to the LLM.
- Cleans the LLM’s reply to remove system artifacts (like the `assistant:` prefix).

By grounding the LLM’s reasoning within a narrow topical scope (one community at a time), this method produces focused, contextually relevant partial answers.


## 4.4. Final Answer Aggregation

The `aggregate_answers()` method:
- Takes all intermediate answers generated from different communities.
- Forms a combined prompt asking the LLM to synthesize a final, concise answer from these fragments.
- Again, sanitizes the output for a clean and polished result.

This final aggregation step weaves together different knowledge perspectives into one cohesive narrative — ideal for complex, multi-topic queries.
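
The two stages (one answer per community, then one aggregation call) follow a map-and-reduce shape that can be sketched without any LLM dependency. Here `answer_query` is a hypothetical function, the `llm` argument is any callable from prompt to text, and `echo_llm` is a fake model that only records the prompts it receives.

```python
from typing import Callable, Dict, List

LLMFn = Callable[[str], str]


def answer_query(query: str, summaries: Dict[int, str], llm: LLMFn) -> str:
    """Answer per community (map), then combine the answers (reduce)."""
    partial_answers = [
        llm(
            f"Given the community summary: {summary}, "
            f"how would you answer the following query? Query: {query}"
        )
        for summary in summaries.values()
    ]
    combined = "\n".join(f"- {answer}" for answer in partial_answers)
    return llm(
        "Combine the following intermediate answers into a final, "
        f"concise response.\n{combined}"
    )


calls: List[str] = []


def echo_llm(prompt: str) -> str:
    """Fake LLM for illustration: records the prompt, returns a numbered tag."""
    calls.append(prompt)
    return f"answer#{len(calls)}"


summaries = {0: "Scrooge and Marley were partners.", 1: "Cratchit is a clerk."}
final = answer_query("Who was Marley?", summaries, echo_llm)
print(final)
```

With N communities this makes N + 1 LLM calls per query, so cost and latency grow with the number of communities. Joining the partial answers into a bulleted string, rather than interpolating a Python list directly, is a design choice of this sketch that keeps the aggregation prompt free of list syntax.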


In traditional RAG systems, retrieval often stops at fetching similar passages.
`GraphRAGQueryEngine` goes a step further:
- It interprets structured summaries.
- It reasons across communities.
- It synthesizes answers from multiple sources, each grounded in a community summary.

This design aims for more accurate, coherent, and explainable answers, all of which matter for production-grade GenAI applications.

In [4]:
from llama_index.core.query_engine import CustomQueryEngine
from llama_index.core.llms import LLM


class GraphRAGQueryEngine(CustomQueryEngine):
    graph_store: GraphRAGStore
    llm: LLM

    def custom_query(self, query_str: str) -> str:
        """Process all community summaries to generate answers to a specific query."""
        community_summaries = self.graph_store.get_community_summaries()
        community_answers = [
            self.generate_answer_from_summary(community_summary, query_str)
            for _, community_summary in community_summaries.items()
        ]

        final_answer = self.aggregate_answers(community_answers)
        return final_answer

    def generate_answer_from_summary(self, community_summary, query):
        """Generate an answer from a community summary based on a given query using LLM."""
        prompt = (
            f"Given the community summary: {community_summary}, "
            f"how would you answer the following query? Query: {query}"
        )
        messages = [
            ChatMessage(role="system", content=prompt),
            ChatMessage(
                role="user",
                content="I need an answer based on the above information.",
            ),
        ]
        response = self.llm.chat(messages)
        cleaned_response = re.sub(r"^assistant:\s*", "", str(response)).strip()
        return cleaned_response

    def aggregate_answers(self, community_answers):
        """Aggregate individual community answers into a final, coherent response."""
        # intermediate_text = " ".join(community_answers)
        prompt = "Combine the following intermediate answers into a final, concise response."
        messages = [
            ChatMessage(role="system", content=prompt),
            ChatMessage(
                role="user",
                content=f"Intermediate answers: {community_answers}",
            ),
        ]
        final_response = self.llm.chat(messages)
        cleaned_final_response = re.sub(
            r"^assistant:\s*", "", str(final_response)
        ).strip()
        return cleaned_final_response

# 5. Process the Input Data

Before building graphs or querying information, the raw text must be converted into a format that downstream models and pipelines can handle efficiently. The code below is a short preprocessing pipeline that turns a large text file into chunks ready for further processing.

## 5.1. Loading the Source Document
- *A Christmas Carol* is a novella by Charles Dickens, published in 1843. We have downloaded it as `book.txt`; the file is opened and its content is read into memory.
- This raw text is wrapped into a `Document` object (from `LlamaIndex`’s core APIs), which standardizes it for consistent handling across different pipeline stages.

## 5.2. Initializing the Sentence-Level Chunking Strategy
- A `SentenceSplitter` is initialized to divide the large text into smaller chunks while trying to keep sentences intact.
- Each chunk is up to 1024 tokens long, with an overlap of 20 tokens between adjacent chunks.

## 5.3. Splitting the Document into Nodes
- The `SentenceSplitter` processes the document and produces a list of nodes.
- Each node represents a chunk of text, optimized for feeding into downstream knowledge extraction models or graph builders.

Chunking is a foundational step in any Retrieval-Augmented or Graph-based AI pipeline:
- It balances granularity and context.
- It ensures efficient LLM querying by controlling token size.
- It preserves semantic flow by introducing overlaps.
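
The effect of `chunk_size` and `chunk_overlap` can be seen in a simplified sketch. This is not how `SentenceSplitter` works internally: it counts tokens with a tokenizer and prefers sentence boundaries, whereas the hypothetical `chunk_words` below treats whitespace-separated words as tokens and ignores sentences.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into word windows; adjacent windows share `chunk_overlap` words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start : start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = "Marley was dead to begin with There is no doubt whatever about that"
chunks = chunk_words(sample, chunk_size=6, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last two words of the previous one, which is the role the 20-token overlap plays in the notebook: context at a chunk boundary is visible from both sides.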

In [5]:
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter

# Step 1: Read the source text into memory
with open("./book.txt") as f:
    text = f.read()

# Step 2: Wrap the raw text in a Document
documents = [Document(text=text)]

# Step 3: Initialize a chunking (node parser) strategy: sentence-aware splitting
splitter = SentenceSplitter(
    chunk_size=1024,
    chunk_overlap=20,
)

# Step 4: Split the document into nodes (chunks)
nodes = splitter.get_nodes_from_documents(documents)

len(nodes)

45

# 6. Define a Structured Extraction Prompt

Provides the LLM with explicit instructions on extracting entities, relationships, and descriptions, ensuring structured, parsable JSON output.

In [6]:
KG_TRIPLET_EXTRACT_TMPL = """
-Goal-
Given a text document, identify all entities and their entity types from the text and all relationships among the identified entities.
Given the text, extract up to {max_knowledge_triplets} entity-relation triplets.

-Steps-
1. Identify all entities. For each identified entity, extract the following information:
- entity_name: Name of the entity, capitalized
- entity_type: Type of the entity
- entity_description: Comprehensive description of the entity's attributes and activities

2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
For each pair of related entities, extract the following information:
- source_entity: name of the source entity, as identified in step 1
- target_entity: name of the target entity, as identified in step 1
- relation: relationship between source_entity and target_entity
- relationship_description: explanation as to why you think the source entity and the target entity are related to each other

3. Output Formatting:
- Return the result in valid JSON format with two keys: 'entities' (list of entity objects) and 'relationships' (list of relationship objects).
- Exclude any text outside the JSON structure (e.g., no explanations or comments).
- If no entities or relationships are identified, return empty lists: { "entities": [], "relationships": [] }.

-An Output Example-
{
  "entities": [
    {
      "entity_name": "Albert Einstein",
      "entity_type": "Person",
      "entity_description": "Albert Einstein was a theoretical physicist who developed the theory of relativity and made significant contributions to physics."
    },
    {
      "entity_name": "Theory of Relativity",
      "entity_type": "Scientific Theory",
      "entity_description": "A scientific theory developed by Albert Einstein, describing the laws of physics in relation to observers in different frames of reference."
    },
    {
      "entity_name": "Nobel Prize in Physics",
      "entity_type": "Award",
      "entity_description": "A prestigious international award in the field of physics, awarded annually by the Royal Swedish Academy of Sciences."
    }
  ],
  "relationships": [
    {
      "source_entity": "Albert Einstein",
      "target_entity": "Theory of Relativity",
      "relation": "developed",
      "relationship_description": "Albert Einstein is the developer of the theory of relativity."
    },
    {
      "source_entity": "Albert Einstein",
      "target_entity": "Nobel Prize in Physics",
      "relation": "won",
      "relationship_description": "Albert Einstein won the Nobel Prize in Physics in 1921."
    }
  ]
}

-Real Data-
######################
text: {text}
######################
output:"""

# 7. Entity and Relationship Extraction

Initializes a custom extractor that sends each text chunk to the LLM, extracts a maximum of 2 triples, and parses the output into nodes and edges.
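
The parsing step can be exercised on its own with a canned response. The sketch below is a slightly more tolerant variant of the idea, under the assumption that the model may wrap the JSON in extra text or omit fields; `parse_extraction` is a hypothetical name, and the sample response is made up for illustration.

```python
import json
import re
from typing import List, Tuple

Entity = Tuple[str, str, str]
Relationship = Tuple[str, str, str, str]

ENTITY_KEYS = ("entity_name", "entity_type", "entity_description")
RELATION_KEYS = (
    "source_entity",
    "target_entity",
    "relation",
    "relationship_description",
)


def parse_extraction(response: str) -> Tuple[List[Entity], List[Relationship]]:
    """Pull the JSON object out of an LLM reply and skip incomplete records."""
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        return [], []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return [], []
    entities = [
        tuple(item[key] for key in ENTITY_KEYS)
        for item in data.get("entities", [])
        if isinstance(item, dict) and all(key in item for key in ENTITY_KEYS)
    ]
    relationships = [
        tuple(item[key] for key in RELATION_KEYS)
        for item in data.get("relationships", [])
        if isinstance(item, dict) and all(key in item for key in RELATION_KEYS)
    ]
    return entities, relationships


sample_response = """Sure! Here is the JSON:
{"entities": [
   {"entity_name": "Scrooge", "entity_type": "Person", "entity_description": "A miser."},
   {"entity_name": "Incomplete"}],
 "relationships": [
   {"source_entity": "Scrooge", "target_entity": "Marley",
    "relation": "former partner", "relationship_description": "Business partners."}]}"""

entities, relationships = parse_extraction(sample_response)
print(entities)
print(relationships)
```

Skipping records with missing keys is a design choice of this sketch: a single malformed entity then costs one record rather than raising a `KeyError` for the whole chunk.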

In [7]:
import json


def parse_fn(response_str: str) -> Any:
    json_pattern = r"\{.*\}"
    match = re.search(json_pattern, response_str, re.DOTALL)
    entities = []
    relationships = []
    if not match:
        return entities, relationships
    json_str = match.group(0)
    try:
        data = json.loads(json_str)
        entities = [
            (
                entity["entity_name"],
                entity["entity_type"],
                entity["entity_description"],
            )
            for entity in data.get("entities", [])
        ]
        relationships = [
            (
                relation["source_entity"],
                relation["target_entity"],
                relation["relation"],
                relation["relationship_description"],
            )
            for relation in data.get("relationships", [])
        ]
        return entities, relationships
    except json.JSONDecodeError as e:
        print("Error parsing JSON:", e)
        return entities, relationships


kg_extractor = GraphRAGExtractor(
    llm=llm,
    extract_prompt=KG_TRIPLET_EXTRACT_TMPL,
    max_paths_per_chunk=2,
    parse_fn=parse_fn,
)

# 8. Build the Knowledge Graph

- Uses `GraphRAGExtractor` to build a property graph.
- Stores graph data inside a customized `GraphRAGStore`, which will later detect communities and generate summaries.

Let's also take a look at a few entities, relations, and relation descriptions.

In [8]:
from llama_index.core import PropertyGraphIndex

index = PropertyGraphIndex(
    nodes=nodes,
    property_graph_store=GraphRAGStore(),
    kg_extractors=[kg_extractor],
    show_progress=True,
)

Extracting paths from text: 100%|██████████| 45/45 [00:46<00:00,  1.02s/it]
Generating embeddings: 100%|██████████| 1/1 [00:02<00:00,  2.99s/it]
Generating embeddings: 100%|██████████| 5/5 [00:02<00:00,  1.88it/s]


In [9]:
list(index.property_graph_store.graph.nodes.values())[-1]

EntityNode(label='entity', embedding=None, properties={'triplet_source_id': 'a96a0fc3-9253-4e07-bccb-4f9433fc0e37'}, name='turkey')

In [10]:
list(index.property_graph_store.graph.relations.values())[0]

Relation(label='former partner', source_id='Ebenezer Scrooge', target_id='Marley', properties={'relationship_description': 'Ebenezer Scrooge was a former partner of Marley in business.', 'triplet_source_id': 'e2249b72-2769-4475-a037-625fdec11e3b'})

In [11]:
list(index.property_graph_store.graph.relations.values())[0].properties[
    "relationship_description"
]

'Ebenezer Scrooge was a former partner of Marley in business.'

# 9. Community Detection and Summarization

- Converts the graph into a NetworkX structure.
- Applies Hierarchical Leiden clustering to group related entities.
- Automatically generates LLM-based summaries for each community.

In [12]:
index.property_graph_store.build_communities()



# 10. Query Processing with GraphRAG

Initializes a specialized query engine that uses community summaries to answer questions efficiently.

In [13]:
query_engine = GraphRAGQueryEngine(
    graph_store=index.property_graph_store, llm=llm
)

# 11. Let's Ask a Few Questions

In [15]:
queries = [
    "What is the significance of Christmas Eve in A Christmas Carol?",
    "How does the setting of Victorian London contribute to the story's themes?",
    "Describe the chain of events that leads to Scrooge's transformation.",
    "How does Dickens use the different spirits (Past, Present, and Future) to guide Scrooge?",
    "Why does Dickens choose to divide the story into \"staves\" rather than chapters?"
]

for query in queries:
    response = query_engine.query(query)
    print("##########################################\n")
    print(f"QUESTION : {query}\n")
    print(f"GRAPHRAG ANSWER : {response}\n")
print("##########################################")

##########################################

QUESTION : What is the significance of Christmas Eve in A Christmas Carol?

GRAPHRAG ANSWER : Christmas Eve holds significant importance in "A Christmas Carol" as it serves as the catalyst for the transformation of Ebenezer Scrooge. On this night, Scrooge is visited by the spirits of Marley, Christmas Past, Present, and Yet to Come, leading to his redemption and change of heart. The interactions on Christmas Eve highlight themes of redemption, reflection, and the power of human connection, ultimately shaping the narrative dynamics and character development in the story.

##########################################

QUESTION : How does the setting of Victorian London contribute to the story's themes?

GRAPHRAG ANSWER : The setting of Victorian London in "A Christmas Carol" significantly contributes to the story's themes by emphasizing societal disparities, compassion, generosity, and the transformative power of redemption. Through the contrast 