# Introduction to GraphRAG with Python, Ollama, and NetworkX

---

### 1. **Introduction to Large Language Models (LLMs)**
   - **What are LLMs?**  
     LLMs, like GPT, are models trained on vast amounts of text to generate human-like text. They can understand and generate language based on prompts provided by users.
   - **Context Window in LLMs**  
     The **context window** is the amount of text, measured in tokens, that an LLM can consider at one time while generating a response. Think of it as a short-term memory limit.
   - **Context Window Limit**  
     The window is limited by design, meaning the model can only "remember" or take into account a certain amount of input at once. This limitation impacts how well it can respond to queries, especially when the input is long or complex.
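
A rough sketch of what this limit means in practice. It assumes that whitespace-separated words stand in for tokens (real tokenizers split text into subword units, so actual counts differ), and `fit_to_window` is a hypothetical helper name used only for illustration:

```python
def fit_to_window(text: str, max_tokens: int) -> str:
    """Keep only the most recent `max_tokens` whitespace-separated words.

    Assumption: one word is roughly one token. Real LLM tokenizers differ.
    """
    if max_tokens <= 0:
        return ""
    words = text.split()
    if len(words) <= max_tokens:
        return text
    # Drop the oldest words so that the newest input still fits.
    return " ".join(words[-max_tokens:])


history = "the quick brown fox jumps over the lazy dog"
print(fit_to_window(history, 4))  # prints "over the lazy dog"
```

Anything cut off this way is invisible to the model, which is why long inputs can lead to incomplete answers.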

---

### 2. **Why Retrieval Augmented Generation (RAG) is Required**
   - **The Problem**  
     When users ask LLMs questions, the information needed to answer may be absent from the model's internal knowledge or too large to fit within the limited context window. As a result, the LLM might give incomplete or incorrect answers.
   - **What is RAG?**  
     **Retrieval-Augmented Generation (RAG)** solves this by combining LLMs with external data sources. Instead of relying solely on the model’s internal knowledge, RAG retrieves relevant information from databases or documents before generating a response.
   - **How RAG Works**  
     - **Retrieval**: When a query is made, RAG retrieves relevant chunks of text from external sources.
     - **Augmentation**: These retrieved documents are then fed into the LLM’s context window.
     - **Generation**: The LLM uses both the input and the retrieved documents to create a response.
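
The three steps above can be sketched as follows. This is a minimal illustration, not a production retriever: word overlap stands in for embedding similarity, the documents are invented, and the function names `tokenize`, `retrieve`, and `augment` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercased bag-of-words counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Retrieval: rank documents by how many query words they share."""
    query_words = tokenize(query)
    scored = [(sum((tokenize(doc) & query_words).values()), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only documents that share at least one word with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


def augment(query: str, retrieved: list[str]) -> str:
    """Augmentation: place the retrieved text in the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Toy documents invented for this example.
documents = [
    "NetworkX is a Python library for working with graphs.",
    "Ollama runs large language models on a local machine.",
    "Cherry trees blossom in spring.",
]
query = "Which Python library works with graphs?"
prompt = augment(query, retrieve(query, documents))
print(prompt)
# Generation: the prompt would now be sent to an LLM to produce the answer.
```

Only the first document shares words with the query, so it is the only one placed in the prompt.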

---

### 3. **Shortcomings of RAG**
   - **Challenges with Relevant Information**  
     RAG doesn’t always retrieve the most relevant data, leading to incoherent or irrelevant answers.
   - **Efficiency**  
     Retrieving and processing large documents can be computationally expensive.
   - **Context Switching**  
     When the retrieval process pulls in too many chunks of data, the model might struggle to maintain context, resulting in disjointed or inaccurate responses.

---

### 4. **Solutions: Semantic Chunking, Ranking, and Re-ranking**
   - **Semantic Chunking**  
     Breaks down large documents into meaningful "chunks" based on content. This helps in retrieving smaller, more relevant parts of a document.
   - **Ranking**  
     After retrieval, the system ranks the chunks based on their relevance to the query.
   - **Re-ranking**  
     Applies a second, typically more accurate model to re-score the retrieved chunks so that the most useful information is prioritized.
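
A small sketch of the three ideas together, under loud assumptions: fixed groups of sentences stand in for true semantic chunking, shared-word counts stand in for the first-pass ranking score, and Jaccard similarity stands in for a learned re-ranking model. All function names and the sample text are hypothetical.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased set of words; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk_by_sentences(text: str, sentences_per_chunk: int = 2) -> list[str]:
    """Group consecutive sentences (assumed to approximate topic boundaries)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def rank(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """First pass: cheap score, the number of words shared with the query."""
    query_words = words(query)
    return sorted(chunks, key=lambda c: len(words(c) & query_words), reverse=True)[:top_k]


def rerank(query: str, candidates: list[str]) -> list[str]:
    """Second pass: Jaccard similarity, which penalizes long unfocused chunks."""
    query_words = words(query)

    def jaccard(chunk: str) -> float:
        chunk_words = words(chunk)
        union = chunk_words | query_words
        return len(chunk_words & query_words) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)


text = "Graphs have nodes. Edges connect nodes. Cherries are sour. Goats eat leaves."
chunks = chunk_by_sentences(text)
query = "What connects nodes in graphs?"
best = rerank(query, rank(query, chunks))
print(best[0])  # prints "Graphs have nodes. Edges connect nodes."
```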

---

### 5. **Issues that Still Persist**
   - **Complex Queries**  
     RAG still struggles with highly complex, multi-part questions that require a deep understanding of multiple documents.
   - **Scaling**  
     As the size of external knowledge sources grows, retrieval efficiency and relevance can degrade.

---

### 6. **Introduction to Graph Theory and Graph Databases**
   - **Graph Theory Basics**  
     In graph theory, data is represented as nodes (entities) and edges (relationships between entities). This allows complex relationships to be modeled in a highly structured way.
   - **Graph Databases**  
     Unlike traditional databases, graph databases store data in the form of nodes and edges, making it easier to traverse relationships and retrieve connected information.
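
As a minimal illustration of nodes and edges, here is a tiny graph built with NetworkX, the library used later in this tutorial. The entities and relationships are invented for the example.

```python
import networkx as nx

# Nodes are entities; keyword arguments become node attributes.
graph = nx.Graph()
graph.add_node("Alice", kind="person")
graph.add_node("Bob", kind="person")
graph.add_node("Acme", kind="company")

# Edges are relationships; attributes describe the relationship.
graph.add_edge("Alice", "Bob", relation="knows")
graph.add_edge("Alice", "Acme", relation="works at")

# Traversing a relationship is a neighbor lookup rather than a table join.
print(sorted(graph.neighbors("Alice")))            # prints ['Acme', 'Bob']
print(graph.edges["Alice", "Acme"]["relation"])    # prints works at
```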

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Graph Theory and Graph Databases</title>
    <script type="module">
        // Load the Mermaid library
        import mermaid from 'https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs';
        mermaid.initialize({ startOnLoad: true });
    </script>
    <style>
        /* Optional: Style for better visualization */
        .mermaid {
            background-color: #f9f9f9;
            border: 1px solid #ccc;
            border-radius: 5px;
            padding: 10px;
        }
    </style>
</head>
<body>

<h2>Knowledge Graph: Graph Theory and Graph Databases</h2>
<div class="mermaid">
    graph TD;
        A[Graph Theory] --> B[Graph Databases];
        A --> C[Nodes and Edges];
        B --> D[Complex Relationships];
        C --> E[Entities];
        C --> F[Relationships];
        D --> G[Fast Queries];
        D --> H[Contextual Data];
        E --> I[Depth-First Search];
        E --> J[Breadth-First Search];
</div>

</body>
</html>

---

### 7. **How Graph Databases Work**
   - **Nodes and Edges**  
     Nodes represent entities, while edges represent relationships between these entities. Graph queries allow for fast and intuitive exploration of connections, which can be helpful in retrieving contextual data.
   - **Graph Algorithms**  
     Graph databases often employ algorithms like depth-first search or breadth-first search to efficiently find related data based on a query.
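
The two traversal orders can be sketched in plain Python. The adjacency dictionary below is a toy graph invented for illustration, and the functions are a generic textbook formulation rather than what any particular graph database runs internally.

```python
from collections import deque


def bfs(adjacency: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first search: visit all nodes one hop away before going deeper."""
    visited, seen, queue = [start], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return visited


def dfs(adjacency: dict[str, list[str]], start: str) -> list[str]:
    """Depth-first search: follow one branch to its end before backtracking."""
    visited, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Reverse so that the first-listed neighbor is explored first.
        stack.extend(reversed(adjacency.get(node, [])))
    return visited


adjacency = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
print(bfs(adjacency, "A"))  # prints ['A', 'B', 'C', 'D', 'E']
print(dfs(adjacency, "A"))  # prints ['A', 'B', 'D', 'C', 'E']
```

Breadth-first order suits "what is directly related to this entity" questions, while depth-first order follows a single chain of relationships as far as it goes.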

---

### 8. **What is GraphRAG?**
   - **Initial Concept**  
     GraphRAG combines **graph theory** with **RAG** to improve how information is retrieved and related across datasets. It enhances the retrieval process by mapping the relationships between pieces of data.
   - **How GraphRAG Works**  
     - **Graph-Based Retrieval**: Instead of relying solely on document-level retrieval, GraphRAG uses graph databases to retrieve data based on the relationships between entities. This provides more contextually relevant data.
     - **Traversing the Graph**: Queries traverse the graph to identify not just relevant data but also data that is related through nodes and edges.
     - **Improved Augmentation**: This graph-based approach helps the LLM to understand not just the isolated pieces of information but also how they are related, improving the quality of generated responses.
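
A minimal sketch of graph-based retrieval, assuming a graph in which sentence nodes are linked to the keyword nodes they contain. The toy sentences are invented, and `graph_retrieve` is a hypothetical helper, not part of NetworkX.

```python
import networkx as nx


def graph_retrieve(graph: nx.Graph, keywords: list[str], hops: int = 1) -> set:
    """Return nodes that mention a keyword plus every node within `hops` edges."""
    seeds = {
        node for node in graph.nodes
        if any(keyword.lower() in str(node).lower() for keyword in keywords)
    }
    found = set(seeds)
    for seed in seeds:
        # Maps each node reachable within `hops` edges to its distance from the seed.
        reachable = nx.single_source_shortest_path_length(graph, seed, cutoff=hops)
        found.update(reachable)
    return found


s1 = "Ollama serves the embedding model."
s2 = "The embedding model maps text to vectors."
s3 = "Matplotlib draws the figure."

graph = nx.Graph()
graph.add_edges_from([
    (s1, "ollama"), (s1, "embedding model"),
    (s2, "embedding model"), (s2, "vectors"),
    (s3, "matplotlib"),
])

one_hop = graph_retrieve(graph, ["ollama"], hops=1)
two_hops = graph_retrieve(graph, ["ollama"], hops=2)
print(s2 in one_hop, s2 in two_hops, s3 in two_hops)  # prints False True False
```

The second sentence never mentions the query keyword, yet it is retrieved at two hops because it shares the "embedding model" node with a sentence that does. That is the kind of related context a purely text-matching retriever would miss.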

---

### 9. **Conclusion**
   GraphRAG is an advancement over traditional RAG because it leverages the structured power of graph theory to retrieve and relate data more effectively. While it solves many issues of information retrieval, there are still challenges, such as scaling the graph and efficiently querying vast amounts of interconnected data.

---


## What is in this tutorial package

- code
- data
- images

In [None]:
import os
from typing import Tuple
import pickle
import ollama
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import concurrent.futures
import re
import PyPDF2
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from rake_nltk import Rake

# Ensure you have the necessary NLTK resources
import nltk
nltk.download('punkt')
nltk.download('stopwords')


## This function reads Markdown and PDF files


def read_file(file_path):
    """Read the content of a Markdown or PDF file."""
    if file_path.endswith('.pdf'):
        with open(file_path, 'rb') as file:
            reader = PyPDF2.PdfReader(file)
            text = ''
            for page in reader.pages:
                text += (page.extract_text() or '') + '\n'  # Guard against pages with no extractable text
            return text
    elif file_path.endswith('.md') or file_path.endswith('.markdown'):
        with open(file_path, 'r', encoding='utf-8') as file:
            return file.read()
    else:
        raise ValueError("Unsupported file type. Please provide a Markdown or PDF file.")

# This function was intended for detecting chapters, but it is not used for splitting because chapter detection proved too complex


def detect_table_of_contents(text):
    """Detect the Table of Contents in the document."""
    toc_pattern = re.compile(r'^(Chapter Contents \d+|[0-9]+\. [A-Za-z0-9 .\-]+)(?:\s*-\s*[0-9]+)?$', re.MULTILINE)
    toc_matches = toc_pattern.findall(text)
    return toc_matches


# Here comes the most important function for this project.
# It forms the graph by chunking chapters into pages, pages into paragraphs, paragraphs into sentences, and sentences into words,
# and generating embeddings for each sentence and keyword.
# It then finds the keywords using the RAKE keyword extraction algorithm,
# giving us a knowledge graph.
# This is a crude implementation, so the graph will be dense and the process will take time.
# If you manually give it the chapter names, it will be much faster.

def split_text_into_sections(text):
    """Split text into chapters, pages, paragraphs, sentences, and words."""

    def split_text(text, delimiters):
        """Split text using multiple delimiters."""
        # Create a regex pattern that matches any of the delimiters
        pattern = '|'.join(map(re.escape, delimiters))
        return re.split(pattern, text)

    chapternames = ["Bioprocess Development: An Interdisciplinary Challenge",
                    "Introduction to Engineering Calculations",
                    "Presentation and Analysis of Data",
                    "Material Balances",
                    "Energy Balances",
                    "Unsteady-State Material and Energy Balances",
                    "Fluid Flow and Mixing",
                    "Heat Transfer",
                    "Mass Transfer",
                    "Unit Operations",
                    "Homogeneous Reactions",
                    "Heterogeneous Reactions",
                    "Reactor Engineering",
                    "Appendices",
                    "Appendix A Conversion Factors",
                    "Appendix B Physical and Chemical Property Data",
                    "Appendix C Steam Tables",
                    "Appendix D Mathematical Rules",
                    "Appendix E List of Symbols",
                    "Index",
                    'A Special Tree', 'The School Among the Pines', 
                  'The Wind on Haunted Hill', 'Romi and the Wildfire', 'Tiger My Friend', 
                  'Monkey Trouble', 'Snake Trouble', 'Those Three Bears', 'The Coral Tree', 
                  "The Thief's Story", 'When the Trees Walked', 'Goodbye, Miss Mackenzie', 
                  'Pret in the House', 'The Overcoat', 'The Tunnel', 'Wild Fruit', 
                  'The Night the Roof Blew Off', "A Traveller's Tale", 'And Now We are Twelve']  # List of chapters already given for making it fast
    

    chapters = split_text(text, chapternames)  # Split on the known chapter titles; comment out if not using Biochem.md or rb.md
    # chapters = text.split('Chapter')  # Uncomment for documents other than Biochem.md or rb.md
    graph = nx.Graph()
    stop_words = set(stopwords.words('english'))  # Load English stopwords

    def process_chapter(chapter):
        """Process a single chapter into pages, paragraphs, sentences, and words."""
        pages = chapter.split('\n\n')  # Assuming pages are separated by double newlines
        for page in pages:
            paragraphs = re.split(r'\n+', page)  # Split into paragraphs
            for paragraph in paragraphs:
                sentences = sent_tokenize(paragraph)  # Split into sentences using NLTK
                for sentence in sentences:
                    words = word_tokenize(sentence)  # Split into words using NLTK
                    filtered_words = [word for word in words if word.lower() not in stop_words]  # Remove stopwords
                    
                    # Create nodes in the graph
                    graph.add_node(sentence)
                    sentence_embedding = get_embedding(sentence)
                    graph.nodes[sentence]['embedding'] = sentence_embedding  # Store embedding in the graph
                    
                    for word in filtered_words:
                        graph.add_node(word)
                        graph.add_edge(sentence, word)  # Connect sentence to its words

                    # Extract keywords using RAKE
                    r = Rake()
                    r.extract_keywords_from_text(sentence)
                    keywords = r.get_ranked_phrases()
                    graph.nodes[sentence]['keywords'] = keywords  # Store keywords in the graph
                    for keyword in keywords:
                        graph.add_node(keyword)
                        keyword_embedding = get_embedding(keyword)
                        graph.nodes[keyword]['embedding'] = keyword_embedding  # Store embedding in the graph
                        graph.add_edge(sentence, keyword)  # Connect sentence to its keywords
                        
                graph.add_edge(page, paragraph)  # Connect page to its paragraphs
            graph.add_edge(chapter, page)  # Connect chapter to its pages

    # Use multithreading to process chapters
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(process_chapter, chapter) for chapter in chapters]
        for future in concurrent.futures.as_completed(futures):
            try:
                future.result()  # Wait for the chapter processing to complete
            except Exception as e:
                print(f"Error processing chapter: {e}")

    return graph

# Building the graph takes a long time on big books, so we save the graphs to disk with pickle

def save_graph(graph, filepath):
    """Save the graph to a specified file path using pickle."""
    # Check if the filepath is a directory or a file
    if os.path.isdir(filepath):
        raise ValueError("Please provide a file name along with the directory path.")
    
    # Check if the file path ends with .gpickle
    if not filepath.endswith('.gpickle'):
        raise ValueError("File must have a .gpickle extension.")
    
    # Ensure the directory exists (skip when the path has no directory part)
    directory = os.path.dirname(filepath)
    if directory:
        os.makedirs(directory, exist_ok=True)
    
    # Save the graph using pickle
    with open(filepath, 'wb') as f:
        pickle.dump(graph, f, pickle.HIGHEST_PROTOCOL)
    print(f"Graph saved to {filepath}")

def load_graph(filepath):
    """Load the graph from a specified file path using pickle."""
    # Check if the file exists
    if not os.path.isfile(filepath):
        raise FileNotFoundError(f"No such file: '{filepath}'")
    
    # Check if the file path ends with .gpickle
    if not filepath.endswith('.gpickle'):
        raise ValueError("File must have a .gpickle extension.")
    
    # Load the graph using pickle
    with open(filepath, 'rb') as f:
        graph = pickle.load(f)
    print(f"Graph loaded from {filepath}")
    return graph



# The embedding Function

def get_embedding(text, model="mxbai-embed-large"):
    """Get embedding for a given text using Ollama API."""
    response = ollama.embeddings(model=model, prompt=text)
    return response["embedding"]

# This function computes the cosine similarity between the query embedding and a chunk's embedding

def calculate_cosine_similarity(chunk, query_embedding, embedding):
    """Calculate cosine similarity between a chunk and the query."""
    if np.linalg.norm(query_embedding) == 0 or np.linalg.norm(embedding) == 0:
        return (chunk, 0)  # Handle zero vectors
    cosine_sim = np.dot(query_embedding, embedding) / (np.linalg.norm(query_embedding) * np.linalg.norm(embedding))
    return (chunk, cosine_sim)



# The retrieval portion of GraphRAG

def find_most_relevant_chunks(query, graph):
    """Find the most relevant chunks based on the graph and cosine similarity to the query."""
    # Step 1: Extract keywords from the query using RAKE
    r = Rake()
    r.extract_keywords_from_text(query)
    keywords = r.get_ranked_phrases()

    # Step 2: Find relevant sentences in the graph based on keywords
    relevant_sentences = set()
    for keyword in keywords:
        for node in graph.nodes():
            if keyword.lower() in node.lower():  # Check if keyword is in the node
                relevant_sentences.add(node)  # Add the whole sentence

    # Step 3: Score the relevant sentences by cosine similarity to the query embedding
    similarities = {}
    query_embedding = get_embedding(query)

    for sentence in relevant_sentences:
        if sentence in graph.nodes:
            embedding = graph.nodes[sentence].get('embedding')
            if embedding is not None:
                cosine_sim = calculate_cosine_similarity(sentence, query_embedding, embedding)
                similarities[sentence] = cosine_sim[1]  # Store only the similarity score

    # Sort sentences by similarity
    sorted_sentences = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return sorted_sentences[:20]  # Return top 20 relevant sentences





def answer_query(query, graph):
    """Answer a query using the graph and embeddings."""
    relevant_chunks = find_most_relevant_chunks(query, graph)
    context = " ".join(chunk for chunk, _ in relevant_chunks)  # Combine top chunks for context
    # Change the model to any Ollama LLM you have installed that supports tool use and logical reasoning
    response = ollama.generate(model='mistral-nemo:latest', prompt=f"Context: {context} Question: {query}")
    
    if 'response' in response:
        return response['response']
    else:
        return "No answer generated."
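
The cosine score that ranks the retrieved sentences above can be checked on small vectors. This is a standalone sketch with made-up two-dimensional vectors (real embeddings have hundreds of dimensions), and `cosine_similarity` is a simplified stand-in for `calculate_cosine_similarity` that returns only the score.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    if denominator == 0:
        return 0.0
    return float(np.dot(a, b) / denominator)


same = cosine_similarity([1, 0], [2, 0])        # same direction, different length
orthogonal = cosine_similarity([1, 0], [0, 3])  # unrelated directions
opposite = cosine_similarity([1, 0], [-1, 0])   # opposite directions
print(same, orthogonal, opposite)
```

The score depends only on direction, not on vector length, so it ranges from -1 (opposite) through 0 (unrelated) to 1 (same direction).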



### Example usage

In [None]:



file_path = './data/st5.md'  # Specify the path to your Markdown or PDF file 
text = read_file(file_path)  # Read the entire book

# Detect Table of Contents
toc_entries = detect_table_of_contents(text)
print("Detected Table of Contents:")
for entry in toc_entries:
    print(entry)

# Create a hierarchical graph from the book
graph = split_text_into_sections(text)


# Save the graph
savefile = "./graphs/st5.gpickle"  # or: input("Enter path for saving the knowledge base: ")
save_graph(graph, savefile)

In [None]:
# Load the graph from its saved path
loadfile = "./graphs/st5.gpickle"
graph = load_graph(loadfile)

### Visualize the graph (only use on small docs as it is expensive)

In [None]:
# visualizer is now multi threaded for speed


def visualize_graph(graph):
    """Visualize the graph using Matplotlib with improved layout to reduce overlap."""
    def draw_canvas(figsize: Tuple[int, int]):
        print("fig draw starting")
        plt.figure(figsize=figsize)  # Use the requested figure size for better visibility
        print("fig draw done \n\n")


    def draw_nodes(graph, pos):
        """Draw nodes in the graph."""
        print("node draw starts")
        nx.draw_networkx_nodes(graph, pos, node_size=1200, node_color='lightblue', alpha=0.7)
        print("node draw ends \n\n")

    def draw_edges(graph, pos):
        """Draw edges in the graph."""
        print("edge draw starts")
        nx.draw_networkx_edges(graph, pos, width=1.0, alpha=0.3)
        print("edge draw done \n\n")

    def draw_labels(graph, pos):
        """Draw labels in the graph."""
        print("drawing labels")
        labels = {}
        for node in graph.nodes():
            keywords = graph.nodes[node].get('keywords', [])
            label = ', '.join(keywords[:3])  # Limit to the first 3 keywords for clarity
            labels[node] = label if label else node[:10] + '...'  # Fallback to node name if no keywords
        nx.draw_networkx_labels(graph, pos, labels, font_size=16)  # Draw the keyword labels
        print("labels drawn \n\n")


    draw_canvas(figsize=(90,90))

    # Use ThreadPoolExecutor to handle layout and rescaling concurrently
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit layout calculation
        future_pos = executor.submit(nx.kamada_kawai_layout, graph)
        pos = future_pos.result()  # Get the result of the layout calculation

        # Submit rescaling of the layout
        future_rescale = executor.submit(nx.rescale_layout_dict, pos, scale=2)
        pos = future_rescale.result()  # Get the result of the rescaling

    # Draw nodes, edges, and labels one after another on the main thread;
    # Matplotlib is not thread-safe, so drawing from worker threads is unreliable
    draw_nodes(graph, pos)
    draw_edges(graph, pos)
    draw_labels(graph, pos)
    plt.title("Graph Visualization of Text Chunks")
    plt.axis('off')  # Turn off the axis
    plt.tight_layout()  # Adjust spacing for better layout
    plt.show()

# Visualize the graph
visualize_graph(graph)

### A list of test questions

In [5]:
questions = [
    "Why were there not many fruit trees in the region where Rakesh lived?",
    "What is the significance of the cherry seed in the story?",
    "Why did Rakesh's parents send him to stay with his grandfather in Mussoorie?",
    "What do the oak and deodar trees represent in the setting of the story?",
    "How does the cherry tree serve as a symbol of Rakesh's growth throughout the story?",
    "Why did Grandfather forget about the cherry seed after it was planted?",
    "How does the harsh environment of the Himalayan foothills affect the growth of the cherry tree?",
    "What role does Grandfather play in Rakesh's life beyond being a guardian?",
    "How does the story reflect the theme of nurturing life, both in the tree and in Rakesh?",
    "What is the deeper meaning behind Rakesh wondering, 'Is this what it feels to be God?'",
    "Why was Grandfather's eyesight relevant to the dynamic between him and Rakesh?",
    "What do the monsoon rains and mist symbolize in the development of the cherry tree?",
    "Why does the cherry tree survive despite being damaged by the goat and the woman cutting grass?",
    "What is the importance of the animals, such as the praying mantis and the caterpillar, in the story?",
    "How does the setting of the story, particularly the Himalayan environment, influence the plot?",
    "In what ways does the relationship between Rakesh and the cherry tree evolve over time?",
    "Why do the small cherries at the end of the story taste sour, and what could this symbolize?",
    "What do Grandfather's stories about animals and supernatural beings suggest about his character?",
    "How does the tree's blossoming reflect the overall message of the story?",
    "What does the final scene, where Rakesh and Grandfather sit under the tree, symbolize about life and creation?"
]

questions2 = [
    "Why was the hill called Haunted Hill?",
    "What measures did villagers take to prevent their roofs from being blown away by the wind?",
    "Why was Usha not afraid of returning alone through Haunted Hill?",
    "What is the significance of the story about the ghosts that Usha's grandfather told?",
    "What does Usha notice about the weather before setting off from the bazaar?",
    "How did the atmosphere change as Usha hurried home after her visit to the bazaar?",
    "Why did Usha decide to seek shelter in the ruins?",
    "How did the sudden change in the wind’s direction affect Usha’s journey back home?",
    "What role does the storm play in heightening Usha’s fear in the ruins?",
    "How did Usha react when she encountered the furry object in the ruins?",
    "What evidence does Usha see that makes her believe she has encountered ghosts?",
    "Why did Usha believe she saw two ghostly children in the ruins?",
    "How did Usha feel when she realized she wasn’t alone in the ruins?",
    "Why did Usha, Suresh, and Binya decide to wait until morning to leave the ruins?",
    "What did the sound of the whistling thrush signify to the children?",
    "Why does Usha look back at the ruins before leaving, and what does she see?",
    "What is the significance of Usha’s repeated hearing of the words 'goodbye'?",
    "Why do Usha and her friends doubt whether the voices they heard were real?",
    "How does the setting of the story (Haunted Hill and the ruins) contribute to its eerie mood?",
    "What is the meaning of the final line, 'Or was it just the wind?' in the context of the story?"
]
questions3 = [
    "1. How do you calculate the yield coefficient for a microbial growth process from experimental data?",
    "2. Explain the significance of Monod kinetics in microbial growth and how it compares to Michaelis-Menten kinetics.",
    "3. What factors determine the oxygen transfer rate in a bioreactor, and how can you optimize it?",
    "4. Describe the mechanism of mass transfer in bioreactors and the challenges associated with scale-up.",
    "5. What is the role of shear forces in bioprocessing, and how do they affect microbial cell cultures?",
    "6. How do you design a downstream process for the recovery of intracellular products from a fermentation broth?",
    "7. Discuss the importance of residence time distribution in bioreactors and how it affects bioprocess performance.",
    "8. What are the key parameters that influence heat transfer in large-scale bioreactors?",
    "9. How does the gas-liquid mass transfer coefficient (kLa) vary with operating conditions in an aerated stirred tank reactor?",
    "10. Explain the concept of metabolic flux analysis and its applications in optimizing bioprocesses.",
    "11. How do you determine the rate-limiting step in a multistep biocatalytic reaction?",
    "12. Discuss the role of enzyme kinetics in bioreactor design and how they influence the choice of reactor configuration.",
    "13. What are the methods to increase cell density in a continuous bioprocess, and what challenges are associated with it?",
    "14. How do you model and predict the growth of a mixed microbial population in a bioreactor?",
    "15. What are the strategies to minimize product inhibition in enzyme-catalyzed reactions?",
    "16. Describe the process of chromatographic separation used in protein purification and its challenges in industrial applications.",
    "17. How does fed-batch operation improve productivity in fermentation processes, and what are the limitations?",
    "18. Explain the thermodynamic principles behind the design of bioreactor cooling systems.",
    "19. What are the challenges in scaling up a bioreactor from lab scale to industrial scale?",
    "20. Discuss the impact of medium formulation on microbial metabolism and product yield in a bioprocess."
]
questions4 = [
    "1. What motivated Lily to explore the hidden garden, despite the townsfolk's stories?",
    "2. Why do you think the gate to the hidden garden was left slightly ajar for Lily?",
    "3. What is the significance of the singing flowers in the garden, and when do they sing?",
    "4. How does Lily’s determination to find the owl reflect her belief in magic?",
    "5. In what way does the setting of the hidden garden contribute to the magical atmosphere of the story?",
    "6. What can be inferred about Jack's character based on his interactions with Lily?",
    "7. How does the conversation between Jack and Lily reveal their personal struggles?",
    "8. Why do you think the owl only appears when someone truly believes in magic?",
    "9. What is the symbolic meaning of the owl’s presence in the garden?",
    "10. How does the garden serve as a place of healing for both Lily and Jack?",
    "11. In what ways do the themes of friendship and kindness manifest throughout the story?",
    "12. Why did Lily choose to wish for Jack's sister to get better instead of something for herself?",
    "13. What message does the owl convey about true friendship and selflessness?",
    "14. How does the magical realism in the story enhance its emotional impact?",
    "15. What role does the ancient oak tree play in the garden, both physically and symbolically?",
    "16. Why do you think Jack visits the garden to think, and how does it provide solace for him?",
    "17. How does the story suggest that belief in magic is tied to hope and kindness?",
    "18. What is the significance of the garden being hidden and unknown to most of the townsfolk?",
    "19. How does Lily’s and Jack’s exploration of the garden bring them closer together?",
    "20. What do you think the author is trying to convey about the power of kindness and belief?"
]
questions5= [
    "1. Why did Rakesh stop looking at the cherry tree frequently after planting it, and what does this indicate about human behavior and expectations?",
    "2. How does the gradual growth of the cherry tree symbolize Rakesh's journey of patience and growth?",
    "3. What role does Grandfather play in shaping Rakesh’s understanding of nature and life?",
    "4. The cherry tree faces multiple obstacles, such as being eaten by a goat and cut down by a woman. How do these events reflect the unpredictability of life?",
    "5. Why does Grandfather tell Rakesh that 'nothing is lucky if you put it away,' and how does this philosophy relate to real-life experiences?",
    "6. What does the cherry tree symbolize for Rakesh and Grandfather beyond its physical growth?",
    "7. How does the seasonal cycle of the cherry tree mirror Rakesh's personal growth over time?",
    "8. What is the significance of Rakesh forgetting about the cherry seed after planting it, only to rediscover it later?",
    "9. How does Grandfather’s advice to plant the cherry seed instead of simply keeping it demonstrate a deeper life lesson?",
    "10. Why does Rakesh describe the pebbles around the tree as providing ‘privacy’? What might this reveal about his understanding of growth?",
    "11. The cherry tree attracts various creatures, like the praying mantis and the caterpillar. What do these encounters suggest about the interconnectedness of life?",
    "12. When Rakesh touches the cherry tree at the end and wonders if this is what it feels like to be God, what might he be implying about creation and nurturing?",
    "13. Why does Grandfather’s storytelling about transformation and unusual events contrast with the news Rakesh reads from the newspaper?",
    "14. What is the significance of Grandfather’s increasing grumpiness when the newspaper doesn’t arrive due to the snow?",
    "15. How do the stories told by Grandfather (about people turning into animals, ghosts in trees, etc.) influence Rakesh’s perception of the world?",
    "16. Why does Rakesh keep checking the cherry tree out of the corner of his eye, and how does this subtle action reflect human nature?",
    "17. What is the role of nature’s unpredictability, such as the snow and monsoon rains, in the life cycle of the cherry tree?",
    "18. How does the cherry tree serve as a metaphor for perseverance in the face of adversity?",
    "19. Why does Rakesh compare the birds pecking at the blossoms to a form of resilience in the tree’s continual blooming?",
    "20. What is the deeper significance of Grandfather and Rakesh’s shared silence beneath the tree as they watch the stars?"
]


### Check out the answers

In [None]:
import random


# Example query
query = random.choice(questions5)  # Replace with your question; randint(0, len(...)) could index past the end
answer = answer_query(query, graph)

print("Question:", query, "\n\n")
print("Answer:", answer)
