# ChishikAI Researcher
*Chishiki* means *knowledge* in Japanese.

### Why make this app?
Lately, much of the hype has centered on AI agents as knowledge assistants that can do **Deep Research**. They promise a revolution in knowledge curation, synthesis, and discovery. However, one thing I notice about the offerings from both Google and OpenAI is that their Deep Research modes all return walls of text: impressive and extensive research reports, but still walls of text.

To me, a graph is one of the most (if not the most) powerful and intuitive ways to represent knowledge. That is why I wanted to build my own version of Deep Research on the foundation of knowledge graph technologies. With ChishikAI, instead of only looking at a body of knowledge, you can also reach inside that body and feel its internal organs and vessels viscerally (sorry for the flowery word choices, but they seem right :).

##### Please walk with me through this notebook and let me show you what I have! Please read through it once before making any changes, because I want to show some illustrative examples (or just make a copy and do whatever).

### First of all, all the imports

In [172]:
import networkx as nx
import nx_arangodb as nxadb

from arango import ArangoClient
import re

# LangGraph and LangChain for orchestration of AI agents
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_community.graphs import ArangoGraph
from langchain_community.chains.graph_qa.arangodb import ArangoGraphQAChain
from langchain_core.tools import tool

# LightRAG for knowledge graph construction and basic querying
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed
from pprint import pprint

from IPython.display import display, Markdown

# asyncio, plus nest_asyncio to allow nested event loops inside Jupyter
import asyncio
import nest_asyncio

nest_asyncio.apply()

from pydantic import BaseModel, Field
from typing import List

from web_research_helpers import search_duckduckgo, fetch_and_save_content, build_lightrag
from graph_visualization_helpers import visualize_graph

%load_ext autoreload
%autoreload 2


The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


### Connecting to ArangoDB for Persistence and AQL Querying

In [2]:
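import os

# The password is read from an environment variable (ARANGO_PASSWORD is an assumed name) rather than stored in the notebook
db = ArangoClient(hosts="https://421d5d59c944.arangodb.cloud:8529").db(username="root", password=os.environ["ARANGO_PASSWORD"], verify=True)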
print(db)

<StandardDatabase _system>


### Several Small Helper Functions

In [3]:
def get_graph_schema(graph, db):
    """
    Generates a schema dictionary for the given ArangoDB graph.
    
    Args:
        graph: The graph object obtained from db.graph('your_graph_name')
        db: The database object used to access collections
    
    Returns:
        A dictionary containing the graph and collection schemas.
    """
    schema = {
        'Graph Schema': [],  # You can add graph-level metadata if needed
        'Collection Schema': []
    }
    
    # Process vertex (document) collections
    for vertex_coll_name in graph.vertex_collections():
        coll = db.collection(vertex_coll_name)
        # Get a sample document (if one exists)
        sample_cursor = coll.find({}, limit=1)
        example_doc = next(sample_cursor, None)
    
        if example_doc:
            properties = [{'name': key, 'type': 'str'} for key in example_doc.keys()]
        else:
            properties = [
                {'name': '_key', 'type': 'str'},
                {'name': '_id', 'type': 'str'},
                {'name': '_rev', 'type': 'str'}
            ]
            example_doc = {'_key': None, '_id': None, '_rev': None}
    
        schema['Collection Schema'].append({
            'collection_name': vertex_coll_name,
            'collection_type': 'document',
            'document_properties': properties,
            'example_document': example_doc
        })
    
    # Process edge collections (using the 'edge_collection' key)
    for edge_def in graph.edge_definitions():
        edge_coll_name = edge_def['edge_collection']
        coll = db.collection(edge_coll_name)
        sample_cursor = coll.find({}, limit=1)
        example_edge = next(sample_cursor, None)
    
        if example_edge:
            properties = [{'name': key, 'type': 'str'} for key in example_edge.keys()]
        else:
            properties = [
                {'name': '_key', 'type': 'str'},
                {'name': '_id', 'type': 'str'},
                {'name': '_from', 'type': 'str'},
                {'name': '_to', 'type': 'str'},
                {'name': '_rev', 'type': 'str'}
            ]
            example_edge = {'_key': None, '_id': None, '_from': None, '_to': None, '_rev': None}
    
        schema['Collection Schema'].append({
            'collection_name': edge_coll_name,
            'collection_type': 'edge',
            'edge_properties': properties,
            'example_edge': example_edge
        })
    
    return schema
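
The helper above labels every property as `'str'`, which keeps the schema simple. If more accurate type hints are wanted, the types could instead be read off the sample document. Here is a minimal sketch of that variation; `infer_properties` and the sample document are hypothetical and are not part of the notebook or of any library.

```python
def infer_properties(example_doc):
    """Describe each field of a sample document by its Python type name."""
    return [
        {"name": key, "type": type(value).__name__}
        for key, value in example_doc.items()
    ]


# Hypothetical sample document, for illustration only
sample = {"_key": "42", "weight": 0.5, "keywords": ["qubit", "chip"]}
properties = infer_properties(sample)
print(properties)
```

The type names come from a single sample document, so a field that holds different types in different documents would still be described by whatever the sample happens to contain.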


In [4]:
def nx_graph_info(G):
    """
    Generates a summary of the given NetworkX graph.

    Args:
        G: A NetworkX graph object.

    Returns:
        A string summarizing the number of nodes, edges, graph attributes, 
        and the first few nodes and edges of the graph.
    """
    info_str = (
        f"Graph with {G.number_of_nodes()} nodes and {G.number_of_edges()} edges\n"
        f"Graph attributes: {G.graph}\n"
        f"First few nodes: {list(G.nodes(data=True))[:1]}\n"
        f"First few edges: {list(G.edges(data=True))[:1]}"
    )
    return info_str

In [5]:
def pretty_markdown(text):
    """
    Displays the given text as formatted Markdown.

    Args:
        text: A string containing Markdown-formatted text to be displayed.

    Returns:
        None: This function outputs the Markdown to the display.
    """
    return display(Markdown(text))

# RESEARCH AND BUILD AND LOAD RESEARCH GRAPH
So instead of picking a public dataset, I wanted to build an app that lets users construct their own knowledge graph on a topic they want to research. I built this feature on top of a nice library called [LightRAG](https://github.com/HKUDS/LightRAG). LightRAG constructs knowledge graphs from a library of documents and queries them using a combination of vector and graph searches. However, this querying feature cannot handle queries that require complex graph analytics, which is why I want to upgrade it with NetworkX and AQL agents. Later on in this notebook, I will show comparison examples.
For now, let's choose a research topic. I put in **majorana 1 quantum chip** as an example. Microsoft claims this new chip is a leap in quantum computing technology (a field I know nearly nothing about).

##### Do web research and add the results to the LightRAG graph
You can put your own research topic inside the research_topic variable.

In [6]:
research_topic = "majorana 1 quantum chip" # for putting in natural language queries
research_topic_underscore = research_topic.replace(" ", "_") # for better directory naming
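
The build log further down shows each page saved under `ddgo_pages_md/majorana_1_quantum_chip/` with a name made of an index and the percent-encoded URL. The sketch below shows one way such a path could be derived. It is a guess reconstructed from those log lines, not the actual code in `web_research_helpers`; `page_path` and the 100-character cut-off are assumptions.

```python
from pathlib import Path
from urllib.parse import quote

MAX_NAME_LENGTH = 100  # assumption: inferred from the truncated names in the build log


def page_path(topic, index, url, root="ddgo_pages_md"):
    """Build a Markdown file path for one fetched page (hypothetical helper)."""
    topic_dir = topic.replace(" ", "_")
    # safe="" percent-encodes ":" and "/" so the URL fits in a single path component
    encoded_url = quote(url, safe="")[:MAX_NAME_LENGTH]
    return Path(root) / topic_dir / f"{index}_{encoded_url}.md"


path = page_path(
    "majorana 1 quantum chip",
    6,
    "https://www.geeky-gadgets.com/microsoft-majorana-1-quantum-chip-breakthrough/",
)
print(path.as_posix())
```

Cutting the encoded URL at a fixed length keeps file names short, but it can split a percent-escape in half and can make two long URLs collide, which is why the index prefix matters.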

##### Now we build the knowledge graph. I wrote several helper functions for this and imported them above. Here is what happens in the background:
- The research topic is sent to the DuckDuckGo search engine.
- A number of web pages are returned, parsed into Markdown, and then processed into the knowledge graph by LightRAG.
- This takes some time, depending on how many web pages you are indexing.
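
The steps above can be sketched as a small search, fetch, and insert loop. This is only an illustration of the flow, not the real `build_lightrag`: `index_topic` and the `fake_*` stand-ins are hypothetical, and the search engine, page fetcher, and LightRAG insert are passed in as plain callables so the sketch runs without network access.

```python
from typing import Callable, List, Tuple


def index_topic(
    topic: str,
    search: Callable[[str, int], List[str]],
    fetch_markdown: Callable[[str], str],
    insert: Callable[[str], None],
    max_results: int = 20,
) -> Tuple[int, List[Tuple[str, str]]]:
    """Search for a topic, fetch each hit, and index the pages that load.

    Returns the number of indexed pages and a list of (url, error) pairs.
    """
    failures: List[Tuple[str, str]] = []
    indexed = 0
    for url in search(topic, max_results):
        try:
            markdown = fetch_markdown(url)
        except Exception as exc:  # e.g. HTTP 403 or a dropped connection
            failures.append((url, str(exc)))
            continue  # one bad page should not abort the whole build
        insert(markdown)
        indexed += 1
    return indexed, failures


# Stand-ins for the real search engine, page fetcher, and LightRAG insert
def fake_search(topic: str, max_results: int) -> List[str]:
    return [f"https://example.com/{i}" for i in range(max_results)]


def fake_fetch(url: str) -> str:
    if url.endswith("/1"):
        raise ConnectionError("Remote end closed connection")
    return f"# Page fetched from {url}"


store: List[str] = []
indexed, failures = index_topic(
    "majorana 1 quantum chip", fake_search, fake_fetch, store.append, max_results=3
)
print(indexed, failures)
```

Collecting failures instead of raising mirrors what the log of the real run shows: a couple of URLs fail to download, and the build continues with the pages that did load.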

In [None]:
print("Building RAG system...")
rag, _ = build_lightrag(research_topic, max_results=20)
if rag:
    print("\nRAG system built successfully, querying...")
 
else:
    print("Failed to build RAG system")

Building RAG system...
Starting build_lightrag for query: majorana 1 quantum chip
Starting DuckDuckGo search for: majorana 1 quantum chip


INFO:primp:response: https://lite.duckduckgo.com/lite/ 200 18850
INFO:primp:response: https://lite.duckduckgo.com/lite/ 200 24073


Found 20 results
Starting content fetch for 20 results
Processing URL 1: https://azure.microsoft.com/en-us/blog/quantum/2025/02/19/microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits/
Processing URL 2: https://news.microsoft.com/source/features/innovation/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsofts-majorana-1-chip-carves-new-path-for-quantum-computing
--- Logging error ---
Traceback (most recent call last):
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 97, in walk
    self.analyse_element(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 127, in analyse_element
    self.handle_list(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 237, in handle_list
    parent=self.parents[self.level], name="list", label=GroupLabel.LIST
           ~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: -2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/asa/.local/share/uv/python/cpytho

Successfully processed https://news.microsoft.com/source/features/innovation/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/
Processing URL 3: https://www.technowize.com/microsoft-majorana-1-chip-quantum-computing-breakthrough/
Error processing https://www.technowize.com/microsoft-majorana-1-chip-quantum-computing-breakthrough/: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Processing URL 4: https://www.pcmag.com/news/microsoft-majorana-1-chip-quantum-computing-is-years-not-decades-away


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-majorana-1-chip-quantum-computing-is-years-not-decades-away
INFO:docling.document_converter:Finished converting document microsoft-majorana-1-chip-quantum-computing-is-years-not-decades-away in 0.12 sec.


Successfully processed https://www.pcmag.com/news/microsoft-majorana-1-chip-quantum-computing-is-years-not-decades-away
Processing URL 5: https://www.sciencealert.com/microsoft-claims-a-major-quantum-breakthrough-but-what-does-it-do


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-claims-a-major-quantum-breakthrough-but-what-does-it-do
INFO:docling.document_converter:Finished converting document microsoft-claims-a-major-quantum-breakthrough-but-what-does-it-do in 0.10 sec.


Successfully processed https://www.sciencealert.com/microsoft-claims-a-major-quantum-breakthrough-but-what-does-it-do
Processing URL 6: https://www.geeky-gadgets.com/microsoft-majorana-1-quantum-chip-breakthrough/


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-majorana-1-quantum-chip-breakthrough
INFO:docling.document_converter:Finished converting document microsoft-majorana-1-quantum-chip-breakthrough in 0.09 sec.


Successfully processed https://www.geeky-gadgets.com/microsoft-majorana-1-quantum-chip-breakthrough/
Processing URL 7: https://www.livescience.com/technology/computing/quantum-processor-that-uses-entirely-new-state-of-matter-could-set-us-on-the-path-to-quantum-supremacy


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document quantum-processor-that-uses-entirely-new-state-of-matter-could-set-us-on-the-path-to-quantum-supremacy
INFO:docling.document_converter:Finished converting document quantum-processor-that-uses-entirely-new-state-of-matter-could-set-us-on-the-path-to-quantum-supremacy in 0.11 sec.


Successfully processed https://www.livescience.com/technology/computing/quantum-processor-that-uses-entirely-new-state-of-matter-could-set-us-on-the-path-to-quantum-supremacy
Processing URL 8: https://computercity.com/hardware/processors/microsoft-majorana-1-chip


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-majorana-1-chip
INFO:docling.document_converter:Finished converting document microsoft-majorana-1-chip in 0.31 sec.


Successfully processed https://computercity.com/hardware/processors/microsoft-majorana-1-chip
Processing URL 9: https://www.infoq.com/news/2025/02/microsoft-majorana-quantum-chip/


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-majorana-quantum-chip
INFO:docling.document_converter:Finished converting document microsoft-majorana-quantum-chip in 0.63 sec.


Successfully processed https://www.infoq.com/news/2025/02/microsoft-majorana-quantum-chip/
Processing URL 10: https://www.businessinsider.com/satya-nadella-microsoft-new-majorana-chip-quantum-breakthrough-state-matter-2025-2?op=1


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document satya-nadella-microsoft-new-majorana-chip-quantum-breakthrough-state-matter-2025-2
INFO:docling.document_converter:Finished converting document satya-nadella-microsoft-new-majorana-chip-quantum-breakthrough-state-matter-2025-2 in 0.13 sec.


Successfully processed https://www.businessinsider.com/satya-nadella-microsoft-new-majorana-chip-quantum-breakthrough-state-matter-2025-2?op=1
Processing URL 11: https://www.nature.com/articles/d41586-025-00527-z


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document d41586-025-00527-z
INFO:docling.document_converter:Finished converting document d41586-025-00527-z in 0.89 sec.
INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document massive-microsoft-quantum-computer-breakthrough-uses-new-state-of-matter
INFO:docling.document_converter:Finished converting document massive-microsoft-quantum-computer-breakthrough-uses-new-state-of-matter in 0.10 sec.


Successfully processed https://www.nature.com/articles/d41586-025-00527-z
Processing URL 12: https://www.forbes.com/sites/johnkoetsier/2025/02/19/massive-microsoft-quantum-computer-breakthrough-uses-new-state-of-matter/
Successfully processed https://www.forbes.com/sites/johnkoetsier/2025/02/19/massive-microsoft-quantum-computer-breakthrough-uses-new-state-of-matter/
Processing URL 13: https://www.cnbc.com/2025/02/19/microsoft-reveals-its-first-quantum-computing-chip-the-majorana-1.html


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-reveals-its-first-quantum-computing-chip-the-majorana-1.html
INFO:docling.document_converter:Finished converting document microsoft-reveals-its-first-quantum-computing-chip-the-majorana-1.html in 0.14 sec.


Successfully processed https://www.cnbc.com/2025/02/19/microsoft-reveals-its-first-quantum-computing-chip-the-majorana-1.html
Processing URL 14: https://cybernews.com/ai-news/new-microsoft-majorana-1-quantum-chip-is-a-breakthrough-in-quantum-computing/
Error processing https://cybernews.com/ai-news/new-microsoft-majorana-1-quantum-chip-is-a-breakthrough-in-quantum-computing/: 403 Client Error: Forbidden for url: https://cybernews.com/ai-news/new-microsoft-majorana-1-quantum-chip-is-a-breakthrough-in-quantum-computing/
Processing URL 15: https://www.popsci.com/technology/majorana-1-microsoft/


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document majorana-1-microsoft
INFO:docling.document_converter:Finished converting document majorana-1-microsoft in 0.11 sec.


Successfully processed https://www.popsci.com/technology/majorana-1-microsoft/
Processing URL 16: https://thequantuminsider.com/2025/02/19/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsofts-majorana-1-chip-carves-new-path-for-quantum-computing
INFO:docling.document_converter:Finished converting document microsofts-majorana-1-chip-carves-new-path-for-quantum-computing in 0.61 sec.


Successfully processed https://thequantuminsider.com/2025/02/19/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/
Processing URL 17: https://www.nextbigfuture.com/2025/02/microsoft-majorana-1-chip-has-8-qubits-right-now-with-a-roadmap-to-1-million-raw-qubits.html


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-majorana-1-chip-has-8-qubits-right-now-with-a-roadmap-to-1-million-raw-qubits.html
INFO:docling.document_converter:Finished converting document microsoft-majorana-1-chip-has-8-qubits-right-now-with-a-roadmap-to-1-million-raw-qubits.html in 0.20 sec.


Successfully processed https://www.nextbigfuture.com/2025/02/microsoft-majorana-1-chip-has-8-qubits-right-now-with-a-roadmap-to-1-million-raw-qubits.html
Processing URL 18: https://quantum.microsoft.com/en-us/solutions/microsoft-quantum-hardware


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-quantum-hardware
--- Logging error ---
Traceback (most recent call last):
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 97, in walk
    self.analyse_element(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 127, in analyse_element
    self.handle_list(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 237, in handle_list
    parent=self.parents[self.level], name="list", label=GroupLabel.LIST
           ~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: -2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/asa/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3

Successfully processed https://quantum.microsoft.com/en-us/solutions/microsoft-quantum-hardware
Processing URL 19: https://www.theverge.com/news/614205/microsoft-quantum-computing-majorana-1-processor
Successfully processed https://www.theverge.com/news/614205/microsoft-quantum-computing-majorana-1-processor
Processing URL 20: https://www.electronicspecifier.com/products/quantum/majorana-1-microsoft-s-new-quantum-chip


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document majorana-1-microsoft-s-new-quantum-chip
INFO:docling.document_converter:Finished converting document majorana-1-microsoft-s-new-quantum-chip in 1.40 sec.


Successfully processed https://www.electronicspecifier.com/products/quantum/majorana-1-microsoft-s-new-quantum-chip


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits
--- Logging error ---
Traceback (most recent call last):
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 97, in walk
    self.analyse_element(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 127, in analyse_element
    self.handle_list(element, idx, doc)
  File "/home/asa/arangodb_hackathon/arangodb/lib/python3.11/site-packages/docling/backend/html_backend.py", line 237, in handle_list
    parent=self.parents[self.level], name="list", label=GroupLabel.LIST
           ~~~~~~~~~~~~^^^^^^^^^^^^
KeyError: -2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/asa/

Successfully processed https://azure.microsoft.com/en-us/blog/quantum/2025/02/19/microsoft-unveils-majorana-1-the-worlds-first-quantum-processor-powered-by-topological-qubits/
Content fetch completed in 19.19 seconds. Successfully processed 18 files.
Initializing LightRAG...
LightRAG initialized successfully
Processing 18 documents...
Processing file 1/18: ddgo_pages_md/majorana_1_quantum_chip/1_https%3A%2F%2Fazure.microsoft.com%2Fen-us%2Fblog%2Fquantum%2F2025%2F02%2F19%2Fmicrosoft-unveils-majo.md
File 1: Read content, length: 18875


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 4 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.09batch/s]
INFO:lightrag:Inserting 90 vectors to entities
Generating embeddings: 100%|██████████| 3/3 [00:01<00:00,  2.30batch/s]
INFO:lightrag:Inserting 30 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:02<00:00,  2.04s/batch]
INFO:lightrag:Writing graph with 91 nodes, 30 edges
Processing batch 1: 100%|██████████| 1/1 [01:08<00:00, 68.06s/it]
INFO:lightrag:Processing 1 new unique documents


File 1: Inserted successfully
Processing file 2/18: ddgo_pages_md/majorana_1_quantum_chip/2_https%3A%2F%2Fnews.microsoft.com%2Fsource%2Ffeatures%2Finnovation%2Fmicrosofts-majorana-1-chip-carve.md
File 2: Read content, length: 17197


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 4 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.66batch/s]
INFO:lightrag:Inserting 62 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:01<00:00,  1.49batch/s]
INFO:lightrag:Inserting 19 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.92batch/s]
INFO:lightrag:Writing graph with 129 nodes, 44 edges
Processing batch 1: 100%|██████████| 1/1 [00:40<00:00, 40.59s/it]
INFO:lightrag:Processing 1 new unique documents


File 2: Inserted successfully
Processing file 3/18: ddgo_pages_md/majorana_1_quantum_chip/4_https%3A%2F%2Fwww.pcmag.com%2Fnews%2Fmicrosoft-majorana-1-chip-quantum-computing-is-years-not-decade.md
File 3: Read content, length: 12855


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.57s/batch]
INFO:lightrag:Inserting 92 vectors to entities
Generating embeddings: 100%|██████████| 3/3 [00:02<00:00,  1.01batch/s]
INFO:lightrag:Inserting 12 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:02<00:00,  2.75s/batch]
INFO:lightrag:Writing graph with 214 nodes, 53 edges
Processing batch 1: 100%|██████████| 1/1 [01:39<00:00, 99.93s/it]
INFO:lightrag:Processing 1 new unique documents


File 3: Inserted successfully
Processing file 4/18: ddgo_pages_md/majorana_1_quantum_chip/5_https%3A%2F%2Fwww.sciencealert.com%2Fmicrosoft-claims-a-major-quantum-breakthrough-but-what-does-it-.md
File 4: Read content, length: 5732


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.21batch/s]
INFO:lightrag:Inserting 18 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.19batch/s]
INFO:lightrag:Inserting 12 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.05batch/s]
INFO:lightrag:Writing graph with 225 nodes, 61 edges
Processing batch 1: 100%|██████████| 1/1 [00:45<00:00, 45.14s/it]
INFO:lightrag:Processing 1 new unique documents


File 4: Inserted successfully
Processing file 5/18: ddgo_pages_md/majorana_1_quantum_chip/6_https%3A%2F%2Fwww.geeky-gadgets.com%2Fmicrosoft-majorana-1-quantum-chip-breakthrough%2F.md
File 5: Read content, length: 12324


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  3.02batch/s]
INFO:lightrag:Inserting 40 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:00<00:00,  2.22batch/s]
INFO:lightrag:Inserting 17 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.14s/batch]
INFO:lightrag:Writing graph with 260 nodes, 78 edges
Processing batch 1: 100%|██████████| 1/1 [00:39<00:00, 39.29s/it]
INFO:lightrag:Processing 1 new unique documents


File 5: Inserted successfully
Processing file 6/18: ddgo_pages_md/majorana_1_quantum_chip/7_https%3A%2F%2Fwww.livescience.com%2Ftechnology%2Fcomputing%2Fquantum-processor-that-uses-entirely-ne.md
File 6: Read content, length: 12441


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.21s/batch]
INFO:lightrag:Inserting 44 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:01<00:00,  1.72batch/s]
INFO:lightrag:Inserting 15 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.54batch/s]
INFO:lightrag:Writing graph with 292 nodes, 90 edges
Processing batch 1: 100%|██████████| 1/1 [00:36<00:00, 36.36s/it]
INFO:lightrag:Processing 1 new unique documents


File 6: Inserted successfully
Processing file 7/18: ddgo_pages_md/majorana_1_quantum_chip/8_https%3A%2F%2Fcomputercity.com%2Fhardware%2Fprocessors%2Fmicrosoft-majorana-1-chip.md
File 7: Read content, length: 8983


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.28batch/s]
INFO:lightrag:Inserting 30 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.39s/batch]
INFO:lightrag:Inserting 13 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.18batch/s]
INFO:lightrag:Writing graph with 311 nodes, 96 edges
Processing batch 1: 100%|██████████| 1/1 [01:05<00:00, 65.55s/it]
INFO:lightrag:Processing 1 new unique documents


File 7: Inserted successfully
Processing file 8/18: ddgo_pages_md/majorana_1_quantum_chip/9_https%3A%2F%2Fwww.infoq.com%2Fnews%2F2025%2F02%2Fmicrosoft-majorana-quantum-chip%2F.md
File 8: Read content, length: 10785


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.83batch/s]
INFO:lightrag:Inserting 35 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:01<00:00,  1.86batch/s]
INFO:lightrag:Inserting 20 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.66batch/s]
INFO:lightrag:Writing graph with 335 nodes, 112 edges
Processing batch 1: 100%|██████████| 1/1 [00:48<00:00, 48.62s/it]
INFO:lightrag:Processing 1 new unique documents


File 8: Inserted successfully
Processing file 9/18: ddgo_pages_md/majorana_1_quantum_chip/10_https%3A%2F%2Fwww.businessinsider.com%2Fsatya-nadella-microsoft-new-majorana-chip-quantum-breakthrou.md
File 9: Read content, length: 7376


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  3.82batch/s]
INFO:lightrag:Inserting 20 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.27batch/s]
INFO:lightrag:Inserting 13 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.09s/batch]
INFO:lightrag:Writing graph with 344 nodes, 122 edges
Processing batch 1: 100%|██████████| 1/1 [00:28<00:00, 28.08s/it]
INFO:lightrag:Processing 1 new unique documents


File 9: Inserted successfully
Processing file 10/18: ddgo_pages_md/majorana_1_quantum_chip/11_https%3A%2F%2Fwww.nature.com%2Farticles%2Fd41586-025-00527-z.md
File 10: Read content, length: 9645


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.23batch/s]
INFO:lightrag:Inserting 39 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:00<00:00,  2.08batch/s]
INFO:lightrag:Inserting 8 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.29batch/s]
INFO:lightrag:Writing graph with 374 nodes, 127 edges
Processing batch 1: 100%|██████████| 1/1 [00:29<00:00, 29.30s/it]
INFO:lightrag:Processing 1 new unique documents


File 10: Inserted successfully
Processing file 11/18: ddgo_pages_md/majorana_1_quantum_chip/12_https%3A%2F%2Fwww.forbes.com%2Fsites%2Fjohnkoetsier%2F2025%2F02%2F19%2Fmassive-microsoft-quantum-com.md
File 11: Read content, length: 7645


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.76batch/s]
INFO:lightrag:Inserting 23 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.34batch/s]
INFO:lightrag:Inserting 15 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.08batch/s]
INFO:lightrag:Writing graph with 386 nodes, 139 edges
Processing batch 1: 100%|██████████| 1/1 [00:35<00:00, 35.45s/it]
INFO:lightrag:Processing 1 new unique documents


File 11: Inserted successfully
Processing file 12/18: ddgo_pages_md/majorana_1_quantum_chip/13_https%3A%2F%2Fwww.cnbc.com%2F2025%2F02%2F19%2Fmicrosoft-reveals-its-first-quantum-computing-chip-the.md
File 12: Read content, length: 10122


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.79batch/s]
INFO:lightrag:Inserting 35 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:01<00:00,  1.83batch/s]
INFO:lightrag:Inserting 8 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.07batch/s]
INFO:lightrag:Writing graph with 411 nodes, 144 edges
Processing batch 1: 100%|██████████| 1/1 [00:27<00:00, 27.42s/it]
INFO:lightrag:Processing 1 new unique documents


File 12: Inserted successfully
Processing file 13/18: ddgo_pages_md/majorana_1_quantum_chip/15_https%3A%2F%2Fwww.popsci.com%2Ftechnology%2Fmajorana-1-microsoft%2F.md
File 13: Read content, length: 9963


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.07batch/s]
INFO:lightrag:Inserting 30 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.01batch/s]
INFO:lightrag:Inserting 6 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.31batch/s]
INFO:lightrag:Writing graph with 426 nodes, 147 edges
Processing batch 1: 100%|██████████| 1/1 [00:27<00:00, 27.04s/it]
INFO:lightrag:Processing 1 new unique documents


File 13: Inserted successfully
Processing file 14/18: ddgo_pages_md/majorana_1_quantum_chip/16_https%3A%2F%2Fthequantuminsider.com%2F2025%2F02%2F19%2Fmicrosofts-majorana-1-chip-carves-new-path-fo.md
File 14: Read content, length: 14053


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]
INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.53batch/s]
INFO:lightrag:Inserting 43 vectors to entities
Generating embeddings: 100%|██████████| 2/2 [00:00<00:00,  2.37batch/s]
INFO:lightrag:Inserting 14 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.75batch/s]
INFO:lightrag:Writing graph with 450 nodes, 153 edges
Processing batch 1: 100%|██████████| 1/1 [00:37<00:00, 37.70s/it]
INFO:lightrag:Processing 1 new unique documents


File 14: Inserted successfully
Processing file 15/18: ddgo_pages_md/majorana_1_quantum_chip/17_https%3A%2F%2Fwww.nextbigfuture.com%2F2025%2F02%2Fmicrosoft-majorana-1-chip-has-8-qubits-right-now-w.md
File 15: Read content, length: 21606


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 5 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.27s/batch]

INFO:lightrag:Inserting 66 vectors to entities
Generating embeddings: 100%|██████████| 3/3 [00:01<00:00,  2.82batch/s]
INFO:lightrag:Inserting 32 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.70batch/s]
INFO:lightrag:Writing graph with 494 nodes, 179 edges
Processing batch 1: 100%|██████████| 1/1 [00:48<00:00, 48.98s/it]
INFO:lightrag:Processing 1 new unique documents


File 15: Inserted successfully
Processing file 16/18: ddgo_pages_md/majorana_1_quantum_chip/18_https%3A%2F%2Fquantum.microsoft.com%2Fen-us%2Fsolutions%2Fmicrosoft-quantum-hardware.md
File 16: Read content, length: 4056


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 1 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  2.68batch/s]


INFO:lightrag:Inserting 11 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.73batch/s]
INFO:lightrag:Inserting 0 vectors to relationships
INFO:lightrag:Writing graph with 499 nodes, 179 edges
Processing batch 1: 100%|██████████| 1/1 [00:16<00:00, 16.21s/it]
INFO:lightrag:Processing 1 new unique documents


File 16: Inserted successfully
Processing file 17/18: ddgo_pages_md/majorana_1_quantum_chip/19_https%3A%2F%2Fwww.theverge.com%2Fnews%2F614205%2Fmicrosoft-quantum-computing-majorana-1-processor.md
File 17: Read content, length: 6550


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.81batch/s]

INFO:lightrag:Inserting 32 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.19batch/s]
INFO:lightrag:Inserting 13 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.04s/batch]
INFO:lightrag:Writing graph with 516 nodes, 188 edges
Processing batch 1: 100%|██████████| 1/1 [00:25<00:00, 25.77s/it]
INFO:lightrag:Processing 1 new unique documents


File 17: Inserted successfully
Processing file 18/18: ddgo_pages_md/majorana_1_quantum_chip/20_https%3A%2F%2Fwww.electronicspecifier.com%2Fproducts%2Fquantum%2Fmajorana-1-microsoft-s-new-quantum-.md
File 18: Read content, length: 20218


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 3 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.58batch/s]

INFO:lightrag:Inserting 71 vectors to entities
Generating embeddings: 100%|██████████| 3/3 [00:01<00:00,  2.56batch/s]
INFO:lightrag:Inserting 21 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.07s/batch]
INFO:lightrag:Writing graph with 567 nodes, 200 edges
Processing batch 1: 100%|██████████| 1/1 [01:15<00:00, 75.97s/it]

File 18: Inserted successfully
build_lightrag completed in 817.43 seconds

RAG system built successfully, querying...





In [170]:
# reinitialize the lightrag object from working_dir to show that it has been persisted
# working_dir is unique to each research topic 
lightrag = LightRAG(working_dir=f'lightrag/{research_topic_underscore}', 
                   embedding_func=openai_embed,
                   llm_model_func=gpt_4o_mini_complete)

INFO:lightrag:Logger initialized for working directory: lightrag/majorana_1_quantum_chip
INFO:lightrag:Load KV json_doc_status_storage with 0 data
INFO:lightrag:Load KV llm_response_cache with 2 data
INFO:lightrag:Load KV full_docs with 18 data
INFO:lightrag:Load KV text_chunks with 49 data
INFO:lightrag:Loaded graph from lightrag/majorana_1_quantum_chip/graph_chunk_entity_relation.graphml with 567 nodes, 200 edges
INFO:nano-vectordb:Load (564, 1536) data
INFO:nano-vectordb:Init {'embedding_dim': 1536, 'metric': 'cosine', 'storage_file': 'lightrag/majorana_1_quantum_chip/vdb_entities.json'} 564 data
INFO:nano-vectordb:Load (200, 1536) data
INFO:nano-vectordb:Init {'embedding_dim': 1536, 'metric': 'cosine', 'storage_file': 'lightrag/majorana_1_quantum_chip/vdb_relationships.json'} 200 data
INFO:nano-vectordb:Load (49, 1536) data
INFO:nano-vectordb:Init {'embedding_dim': 1536, 'metric': 'cosine', 'storage_file': 'lightrag/majorana_1_quantum_chip/vdb_chunks.json'} 49 data
INFO:lightrag:Lo

Asking LightRAG an overview question for a quick demo:

In [None]:
lightrag_answer_one = lightrag.query('Give me a concise summary of the main ideas in this research topic?', param=QueryParam(only_need_context=False))

In [171]:
pretty_markdown(lightrag_answer_one)

## Overview of Microsoft's Quantum Computing Research

Microsoft's quantum computing research revolves around the development of the **Majorana 1 chip**, which represents a significant advancement in the field of quantum technology. This chip utilizes **topological qubits**, derived from **Majorana particles**, to enhance the performance and reliability of quantum computations.

### Key Components

1. **Majorana 1 Chip**: 
   - Microsoft's first Quantum Processing Unit (QPU) designed to support up to one million qubits.
   - Aims for scalability and stability, making it pivotal for practical quantum computing.

2. **Topological Qubits**:
   - Leveraging the unique properties of Majorana particles to create qubits that are less susceptible to errors.
   - Allows for improved error correction and noise reduction, enhancing the stability of quantum operations.

3. **Topological Core Architecture**:
   - The underlying architecture enabling the construction of the Majorana 1 chip, designed to facilitate high qubit counts efficiently.

### Research and Applications

- Microsoft emphasizes the potential of the Majorana 1 chip in diverse fields such as drug discovery, artificial intelligence optimization, climate science initiatives, and financial modeling.
- Collaborative efforts with the **Defense Advanced Research Projects Agency (DARPA)** through the US2QC program are geared toward validating and enhancing quantum computing technologies.

### Challenges and Critiques

Despite the promising advancements, experts remain cautious. There are concerns over the independent validation of Microsoft's claims regarding Majorana particles and the actual performance of the Majorana 1 chip. The scientific community is closely monitoring further experimental validations to confirm the chip’s efficacy and capabilities.

### Conclusion

Microsoft's pursuit in quantum computing, particularly through the Majorana 1 chip and topological qubits, signifies a bold initiative towards realizing scalable, practical quantum technologies. The road ahead includes overcoming substantial technical challenges while maintaining rigorous scientific inquiry and validation.

##### and load onto ArangoDB

In [280]:
# before loading, need to convert the graph object to networkx format
G = nx.read_graphml(f"lightrag/{research_topic_underscore}/graph_chunk_entity_relation.graphml")

In [281]:
# this graph is undirected by default
if G.is_directed():
    print("The graph is directed.")
else:
    print("The graph is undirected.")


The graph is undirected.


In [None]:
# add a new attribute called original_id to each node; it stores the LightRAG graph id so that it survives on the new ArangoDB graph
for node in G.nodes():
    G.nodes[node]['original_id'] = node

# then create G_adb and upload it to ArangoDB
G_adb = nxadb.Graph(
    name=f"{research_topic_underscore}",
    db=db,
    incoming_graph_data=G,
    overwrite_graph=True
)
print(G_adb)

[23:18:20 -0600] [INFO]: Graph 'majorana_1_quantum_chip' created.
INFO:nx_arangodb:Graph 'majorana_1_quantum_chip' created.
[2025/03/07 23:18:21 -0600] [1814688] [INFO] - adbnx_adapter: Instantiated ADBNX_Adapter with database '_system'
INFO:adbnx_adapter:Instantiated ADBNX_Adapter with database '_system'


Output()

Output()

[2025/03/07 23:18:21 -0600] [1814688] [INFO] - adbnx_adapter: Created ArangoDB 'majorana_1_quantum_chip' Graph
INFO:adbnx_adapter:Created ArangoDB 'majorana_1_quantum_chip' Graph


Graph named 'majorana_1_quantum_chip' with 567 nodes and 200 edges


In [7]:
# get the schema of the graph on ArangoDB for reference later
graph = db.graph(f"{research_topic_underscore}")
schema = get_graph_schema(graph, db)
print(schema)

{'Graph Schema': [], 'Collection Schema': [{'collection_name': 'majorana_1_quantum_chip_node', 'collection_type': 'document', 'document_properties': [{'name': '_key', 'type': 'str'}, {'name': '_id', 'type': 'str'}, {'name': '_rev', 'type': 'str'}, {'name': 'entity_type', 'type': 'str'}, {'name': 'description', 'type': 'str'}, {'name': 'source_id', 'type': 'str'}, {'name': 'original_id', 'type': 'str'}], 'example_document': {'_key': '0', '_id': 'majorana_1_quantum_chip_node/0', '_rev': '_jVOuoqi---', 'entity_type': '"CATEGORY"', 'description': '"QEC involves error correction techniques that protect quantum information by correcting errors arising from decoherence and quantum noise."', 'source_id': 'chunk-9fa29975ce7f5d34f81b33f8bc44fac3', 'original_id': '"QUANTUM ERROR CORRECTION (QEC)"'}}, {'collection_name': 'majorana_1_quantum_chip_node_to_majorana_1_quantum_chip_node', 'collection_type': 'edge', 'edge_properties': [{'name': '_key', 'type': 'str'}, {'name': '_id', 'type': 'str'}, {'n

In [152]:
graph.name

'majorana_1_quantum_chip'

In [203]:
# reload the graph from ArangoDB to show that it has been persisted
G_adb = nxadb.Graph(name=f"{research_topic_underscore}", db=db)

print(G_adb)

[11:38:06 -0500] [INFO]: Graph 'majorana_1_quantum_chip' exists.
INFO:nx_arangodb:Graph 'majorana_1_quantum_chip' exists.
[11:38:06 -0500] [INFO]: Default node type set to 'majorana_1_quantum_chip_node'
INFO:nx_arangodb:Default node type set to 'majorana_1_quantum_chip_node'


Graph named 'majorana_1_quantum_chip' with 567 nodes and 200 edges


Converting to an undirected graph before further querying is a choice tailored to this use case: LightRAG graphs are undirected by default but are converted to directed graphs on Arango. Keeping the graph directed affects some kinds of queries, such as counting isolated nodes. A directed graph imported from Arango would show 0 isolated nodes when there are in fact many.

In [212]:
G_adb_undirected = G_adb.to_undirected()
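
To make the isolated-node point concrete, here is a minimal standard-library sketch of what the undirected view counts. It does not go through nx_arangodb, and the node and edge names are made up for illustration. A node's degree is the number of edges touching it in either direction, and a node is isolated when no edge touches it at all.

```python
from collections import Counter

def undirected_degrees(nodes, edges):
    """Count the edges touching each node, ignoring edge direction."""
    degree = Counter({node: 0 for node in nodes})
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree

def isolated_nodes(nodes, edges):
    """Return the nodes that no edge touches, in input order."""
    degree = undirected_degrees(nodes, edges)
    return [node for node in nodes if degree[node] == 0]

# Hypothetical toy data: "d" and "e" have no edges at all
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("c", "a")]

print(isolated_nodes(nodes, edges))
print(undirected_degrees(nodes, edges).most_common(1))
```

Note that the isolated nodes can only be found by starting from the node list. Counting from the edge list alone never sees them.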

##### Now for the fun part, we can visualize the graph. I made a helper function to do this based on pyvis.

In [282]:
net = visualize_graph(
    G_adb_undirected,
    output_path=f"visualization_output/{research_topic_underscore}.html"
)

# Multi-agent Graph Query System
The core of ChishikAI is a multi-agent system that can plan and dynamically call tools to create and execute both NetworkX and AQL queries. The starter template provided was very instructive. I have made several improvements to both the **text_to_nx_algorithm_to_text** and **text_to_aql_to_text** functions. In a sense they are now more 'agentic'. The workflows for both functions follow similar patterns:
- the natural language query is given to a code agent
- the code agent generates draft code and sends it to a review agent
- the review agent reviews and decides: 
    - APPROVE sends the code to be executed (auto-approval after MAX_REVIEWS reviews), 
    - REJECT sends the code back to the code agent with comments for updates
- code passing review is executed: 
    - on success, the results are sent to a synthesizer agent
    - on error, the code and the error are analyzed by an error agent, and the error analysis is sent back to the code agent (after MAX_RETRIES retries, the function exits and returns a message about what happened)

The review agent and the error analyzer agent are added as safety and performance checks, to make sure that the code:
- does not fail loudly (it runs without error)
- does not fail silently (it runs but gives the wrong results or runs very inefficiently); this is why the review agent exists.

The code for both agentic functions, text_to_nx_algorithm_to_text_agentic and text_to_aql_to_text_agentic, is shown below along with a fairly complex example query for each.
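
Before the full implementations, the control flow described above can be sketched in a few lines. This is a simplified illustration, not the actual tool code: `run_workflow` and the three stub functions are hypothetical stand-ins for the LLM-backed agents, and the stubs return canned strings so the loop runs without any API calls.

```python
from typing import Callable, Optional

MAX_RETRIES = 2
MAX_REVIEWS = 2

def run_workflow(
    query: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    review: Callable[[str, str], str],
    execute: Callable[[str], object],
) -> str:
    """Generate -> review -> execute, with bounded reviews and retries."""
    error: Optional[str] = None
    for _attempt in range(MAX_RETRIES + 1):
        feedback: Optional[str] = None
        code = generate(query, feedback, error)
        for review_round in range(MAX_REVIEWS):
            verdict = review(query, code).strip()
            if verdict.startswith("APPROVE"):
                break
            feedback = verdict.replace("REJECT:", "", 1).strip()
            # Regenerate only if another review round is left;
            # otherwise the last draft is executed anyway
            if review_round < MAX_REVIEWS - 1:
                code = generate(query, feedback, error)
        try:
            return f"Answer: {execute(code)}"
        except Exception as exc:
            # The error text is fed into the next generation attempt
            error = str(exc)
    return f"Failed after {MAX_RETRIES + 1} attempts. Last error: {error}"

# Hypothetical stubs standing in for the code, review, and execution steps
def stub_generate(query, feedback, error):
    if error:
        return "1 + 1"   # fixed draft after an execution error
    if feedback:
        return "1 / 0"   # passes review but fails at run time
    return "1 +"         # first draft, rejected by the reviewer

def stub_review(query, code):
    return "REJECT: incomplete expression" if code == "1 +" else "APPROVE"

def stub_execute(code):
    return eval(code)    # toy executor for this sketch only

result = run_workflow("add one and one", stub_generate, stub_review, stub_execute)
print(result)
```

In this run the first draft is rejected, the second passes review but raises at execution, and the retry succeeds, so both feedback paths are exercised once.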

In [286]:
from typing import TypedDict, Optional, Union, Dict, Any, List, Tuple
import textwrap

# Update the state to include review information
class TextToNxState(TypedDict, total=False):
    query: str  # The original query about graph algorithms
    code: Optional[str]  # Generated NetworkX code
    review_feedback: Optional[str]  # Feedback from review step
    execution_result: Optional[Union[str, Dict[str, Any], List, Any]]  # Result of code execution
    error: Optional[str]  # Any error message
    final_answer: Optional[str]  # Final response to user
    retry_count: int  # Number of code generation retries
    review_count: int  # Number of review attempts

@tool
def text_to_nx_algorithm_to_text_agentic(query: str) -> str:
    """Translates natural language graph algorithm queries into NetworkX code, executes it, and returns human-readable answers.
    
    This function implements a robust workflow to:
    1. Generate optimized NetworkX code from natural language questions about graph algorithms
    2. Review the code for correctness, efficiency, and proper use of NetworkX
    3. Execute the validated code against a NetworkX graph (G_adb_undirected)
    4. Format the results into a natural language response
    
    The function includes a comprehensive error handling system with up to 2 retries for failed executions
    and 1 review cycle (MAX_REVIEWS) to check code quality before execution.
    
    Args:
        query (str): A natural language question about graph algorithms to apply to the NetworkX graph
        
    Returns:
        str: A natural language response answering the original query based on the NetworkX analysis,
             or an error message if the process failed after maximum retry attempts
    """
    code_llm = ChatOpenAI(model_name="o1-mini")
    write_llm = ChatOpenAI(model_name="gpt-4o-mini")
    MAX_RETRIES = 2
    MAX_REVIEWS = 1
    
    def generate_code(state: TextToNxState) -> TextToNxState:
        """Generate NetworkX code to answer the natural language query"""
        print("1) Generating NetworkX code")
        
        # Include review feedback and error context in the prompt
        review_context = f"\nPrevious code was rejected because: {state['review_feedback']}" if state.get("review_feedback") else ""
        error_context = f"\nPrevious attempt failed with error: {state['error']}" if state.get("error") else ""
        
        code_generation = code_llm.invoke(f"""
        You are a NetworkX expert who excels at crafting correct and efficient NetworkX code.
        I have a NetworkX Graph called `G` that will be passed as a parameter to your function.
        The graph has the following schema: {nx_graph_info(G_adb_undirected)}
        
        Write a Python function called `analyze_graph` that takes a NetworkX graph as input and answers this query: {state['query']}.
        {review_context}
        {error_context}
        
        Guidelines:
        - Function signature must be exactly: `def analyze_graph(G):`
        - Return a value that directly answers the query (not just raw graph data)
        - Include all necessary imports inside the function
        - Ensure all variables are properly scoped within the function
        - Follow NetworkX best practices for efficiency
        - Make sure you only use the parameter G and not any external variables
        - Follow the provided schema exactly; ensure that any collections or fields mentioned match the schema. If you do not see another exact match, use the closest match.
        - Only provide the code, no explanations or markdown.
        - When you pass back a node, always pass back its id along with any other requested information.
        """).content
        
        # Clean up the code (the language tag after the opening fence is optional)
        code_match = re.search(r'```(?:python)?[ \t]*\n(.*?)\n```', code_generation, re.DOTALL)
        if code_match:
            nx_code = code_match.group(1).strip()
        else:
            nx_code = code_generation.strip()
        
        print('-'*10)
        print(nx_code)
        print('-'*10)
        
        return {**state, "code": nx_code, "review_feedback": None}
    
    def review_code(state: TextToNxState) -> Tuple[bool, TextToNxState]:
        """Review the generated NetworkX code for correctness and efficiency"""
        print("\n2) Reviewing NetworkX code")
        
        review_result = code_llm.invoke(f"""
        You are an expert NetworkX code reviewer. 
        I have a NetworkX Graph with the following schema: {nx_graph_info(G_adb_undirected)}
        I have this question about the graph: {state['query']}
        
        Review the following code and determine if it:
        1. Uses NetworkX efficiently and correctly
        2. Will properly answer the original question
        3. Has a correct function signature `def analyze_graph(G):`
        4. Has all variables properly scoped within the function
        5. Has no obvious bugs or performance issues
        6. Only uses the parameter G and not any external variables
        
        Do not worry about imports placed inside the function.
        
        Code to review:
        ```python
        {state['code']}
        ```

        ALWAYS STRUCTURE your response with this EXACT FORMAT:
        - "APPROVE" if the code is good to execute
        - "REJECT: <specific reason>" if the code needs to be regenerated
        
        Be strict but fair in your review. Focus on correctness and efficiency.
        """).content
        
        print(f"Review result: {review_result}")
        print('-'*10)
        
        # Tolerate leading whitespace or quotes around the verdict
        if review_result.strip().strip('"').startswith("APPROVE"):
            return True, {**state, "review_feedback": None}
        else:
            feedback = review_result.replace("REJECT:", "").strip()
            return False, {**state, "review_feedback": feedback}
    
    def create_function(code_string: str, globals_dict: Dict[str, Any]) -> Any:
        """Create a function dynamically from a code string using a function factory pattern"""
        # Use one namespace for both globals and locals, so that module-level
        # imports and helper functions in the generated code stay visible
        # inside analyze_graph when it is called later
        namespace = dict(globals_dict)
        
        try:
            # Execute the code in the given context
            # NOTE: this runs LLM-generated code without any sandboxing
            exec(code_string, namespace)
            
            # Check if the analyze_graph function was created
            if "analyze_graph" not in namespace:
                raise ValueError("No analyze_graph function was defined in the code")
                
            # Return the function
            return namespace["analyze_graph"]
        except Exception as e:
            raise ValueError(f"Error creating function: {str(e)}") from e
    
    def execute_code(state: TextToNxState) -> TextToNxState:
        """Execute the generated NetworkX code using a function factory"""
        print("\n3) Executing NetworkX code")
        
        # Prepare the global namespace with required imports and objects
        globals_dict = {"nx": nx}
        
        try:
            # Create the function
            analyze_function = create_function(state["code"], globals_dict)
            
            # Execute the function with the graph
            result = analyze_function(G_adb_undirected)
            
            print('-'*10)
            print(f"Result: {result}")
            print('-'*10)
            
            return {**state, "execution_result": result, "error": None}
                
        except Exception as e:
            error_msg = f"EXECUTION ERROR: {str(e)}"
            print(error_msg)
            return {**state, "error": error_msg}
    
    def format_answer(state: TextToNxState) -> TextToNxState:
        """Format the final answer in natural language"""
        print("4) Formulating final answer")
        
        if state.get("error"):
            return state
            
        nx_to_text = write_llm.invoke(f"""
            I have a NetworkX Graph.
            
            Original query: {state['query']}
            
            I ran the following NetworkX analysis function:
            ```python
            {state['code']}
            ```
            
            Result: {state['execution_result']}
            
            Based on the query and results, generate a short and concise response that directly answers the query.
            
            Your response:
        """).content

        return {**state, "final_answer": nx_to_text}
    
    def should_retry(state: TextToNxState) -> str:
        """Determine if we should retry code generation"""
        if state.get("error") and state["retry_count"] < MAX_RETRIES:
            return "retry"
        return "complete"
    
    # Initialize state
    state = {
        "query": query,
        "code": None,
        "review_feedback": None,
        "execution_result": None,
        "error": None,
        "final_answer": None,
        "retry_count": 0,
        "review_count": 0
    }
    
    # Execute workflow
    while True:
        # Generate NetworkX code
        state = generate_code(state)
        
        # Review loop
        while state["review_count"] < MAX_REVIEWS:
            approved, new_state = review_code(state)
            state = new_state
            
            if approved:
                break
                
            state["review_count"] += 1
            if state["review_count"] >= MAX_REVIEWS:
                print(f"\nMax reviews ({MAX_REVIEWS}) reached. Proceeding with last code.")
                break
                
            # Regenerate code with review feedback
            state = generate_code(state)
        
        # Execute code
        state = execute_code(state)
        
        # Format answer if no error, otherwise check for retry
        if not state.get("error"):
            state = format_answer(state)
            break
        
        # Check if we should retry
        action = should_retry(state)
        if action == "complete":
            break
            
        # Reset review count and increment retry counter
        state["review_count"] = 0
        state["retry_count"] += 1
        print(f"\nRetrying... (Attempt {state['retry_count'] + 1}/{MAX_RETRIES + 1})")
    
    # Return final answer or error message
    if state.get("final_answer"):
        return state["final_answer"]
    else:
        return f"Failed to analyze query after {state['retry_count'] + 1} attempts. Last error: {state['error']}"
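
A side note on the function-factory step, which relies on `exec`. The sketch below is independent of the tool above and uses a made-up `analyze` function. When `exec` receives separate globals and locals dictionaries, names defined at the top level of the executed code (including imports) land in the locals dictionary, while a function defined in that code looks up names in the globals dictionary when it is later called. This is presumably why the prompt asks for imports inside the function. Passing a single namespace avoids the problem.

```python
code = """
import math

def analyze(x):
    return math.sqrt(x)
"""

# Separate globals and locals: `math` and `analyze` are stored in locals_dict,
# but analyze looks up `math` in globals_dict when it is called.
globals_dict, locals_dict = {}, {}
exec(code, globals_dict, locals_dict)
try:
    locals_dict["analyze"](9.0)
    separate_ok = True
except NameError:
    separate_ok = False

# Single namespace: the import and the function share one dictionary.
namespace = {}
exec(code, namespace)
shared_result = namespace["analyze"](9.0)

print(separate_ok, shared_result)
```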

In [178]:
# example usage
test_nx_tool_question = '''What is the most popular node in the graph and why? Give me the statistics that support your answer.
            Also are there isolated nodes in the graph? If so how many are there?'''
test_nx_tool_answer = text_to_nx_algorithm_to_text_agentic(test_nx_tool_question)
print("\nFinal Answer:", test_nx_tool_answer)

1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx

    # Calculate degrees
    degrees = G.degree()
    # Find the node with the highest degree
    most_popular_node, max_degree = max(degrees, key=lambda x: x[1])
    
    # Find isolated nodes
    isolated_nodes = list(nx.isolates(G))
    num_isolated = len(isolated_nodes)
    
    # Prepare the result string
    result = (
        f"The most popular node is '{most_popular_node}' with a degree of {max_degree}. "
        f"This indicates it has the highest number of connections in the graph. "
        f"There {'are' if num_isolated else 'are no'} isolated nodes in the graph."
    )
    if num_isolated:
        result += f" The number of isolated nodes is {num_isolated}."
    
    return result
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: The most popular node is 'majorana_1_quantum_chip_node/55' with a degree of 51. This indic

In [194]:
from typing import TypedDict, Optional, Union, Dict, Any, List, Tuple

# Update the state to include review information
class TextToAQLState(TypedDict, total=False):
    query: str  # The original query about graph
    aql_query: Optional[str]  # Generated AQL query
    review_feedback: Optional[str]  # Feedback from review step
    execution_result: Optional[Union[str, Dict[str, Any], List, Any]]  # Result of query execution
    error: Optional[str]  # Any error message
    final_answer: Optional[str]  # Final response to user
    retry_count: int  # Number of query generation retries
    review_count: int  # Number of review attempts

@tool
def text_to_aql_to_text_agentic(query: str) -> str:
    """Converts natural language questions about graph data into AQL queries, executes them, and returns natural language answers.
    
    This function implements a robust workflow to:
    1. Generate an optimized AQL query from the natural language question
    2. Review the query for correctness and efficiency
    3. Execute the validated query against an ArangoDB graph database
    4. Format the results into a natural language response
    
    The workflow includes error handling, retry logic (up to 2 retries), and query review (up to 2 reviews)
    to ensure reliable query generation and execution.
    
    Args:
        query (str): A natural language question about data stored in the ArangoDB graph
        
    Returns:
        str: A natural language response answering the original question based on the database results,
             or an error message if the process failed"""
    code_llm = ChatOpenAI(model_name="o1-mini")
    write_llm = ChatOpenAI(model_name="gpt-4o-mini")
    MAX_RETRIES = 2
    MAX_REVIEWS = 2
    
    def generate_aql(state: TextToAQLState) -> TextToAQLState:
        """Generate AQL query to answer the natural language query"""
        print("1) Generating AQL query")
        
        # Include review feedback in the prompt if available
        review_context = f"\nPrevious query was rejected because: {state['review_feedback']}" if state.get("review_feedback") else ""
        error_context = f"\nPrevious attempt failed with error: {state['error']}" if state.get("error") else ""
        
        aql_generation = code_llm.invoke(f"""
        You are a seasoned ArangoDB developer specializing in crafting correct and efficient AQL queries.
        I have an ArangoDB Graph named {graph.name} with the following schema:  {get_graph_schema(graph, db)}
        Generate a precise AQL query to answer this question: {state['query']}.
        {review_context}
        {error_context}
        
        Please adhere to these guidelines:
        - Produce a query that is syntactically correct and optimized for performance
        - Follow the provided schema exactly; ensure that any collections or fields mentioned match the schema. If you do not see another exact match, use the closest match.
        - Use best practices for clarity and efficiency
        - Do not include any markdown formatting or explanations
        
        """).content
        
        # Clean up the query: take the body of a fenced block (language tag optional)
        query_match = re.search(r'```(?:aql)?[ \t]*\n(.*?)\n```', aql_generation, re.DOTALL)
        if query_match:
            aql_query = query_match.group(1).strip()
        else:
            query_lines = []
            in_query = False
            for line in aql_generation.split('\n'):
                line = line.strip()
                if re.match(r'^(FOR|RETURN|LET|COLLECT|SORT|LIMIT)\b', line):
                    in_query = True
                if in_query:
                    query_lines.append(line)
            aql_query = '\n'.join(query_lines)
        
        print('-'*10)
        print(aql_query)
        print('-'*10)
        
        return {**state, "aql_query": aql_query, "review_feedback": None}
    
    def review_query(state: TextToAQLState) -> Tuple[bool, TextToAQLState]:
        """Review the generated AQL query for correctness and efficiency"""
        print("\n2) Reviewing AQL query")
        
        review_result = code_llm.invoke(f"""
        You are an expert AQL query reviewer. Your task is to review the following AQL query and determine if it:
        1. Uses the correct collections and attributes from the schema
        2. Is syntactically correct
        3. Follows ArangoDB best practices
        4. Will correctly answer the original question
        I have an ArangoDB Graph named {graph.name} with the following schema:  {get_graph_schema(graph, db)}
        Original question: {state['query']}
        
        
        AQL Query to review:
        ```aql
        {state['aql_query']}
        ```

        Respond with:
        - "APPROVE" if the query is good to execute
        - "REJECT: <specific reason>" if the query needs to be regenerated
        
        Be strict but fair in your review. Focus on correctness and efficiency.
        """).content
        
        print(f"Review result: {review_result}")
        print('-'*10)
        
        # Tolerate leading whitespace or quotes around the verdict
        if review_result.strip().strip('"').startswith("APPROVE"):
            return True, {**state, "review_feedback": None}
        else:
            feedback = review_result.replace("REJECT:", "").strip()
            return False, {**state, "review_feedback": feedback}
    
    def execute_aql(state: TextToAQLState) -> TextToAQLState:
        """Execute the generated AQL query"""
        print("\n3) Executing AQL query")
        
        try:
            cursor = G_adb.query(state["aql_query"])
            results = list(cursor)
            
            print('-'*10)
            print(f"Results: {results}")
            print('-'*10)
            
            return {**state, "execution_result": results, "error": None}
            
        except Exception as e:
            error_msg = f"AQL ERROR: {str(e)}"
            print(error_msg)
            return {**state, "error": error_msg}
    
    def format_answer(state: TextToAQLState) -> TextToAQLState:
        """Format the final answer in natural language"""
        print("4) Formulating final answer")
        
        if state.get("error"):
            return state
            
        aql_to_text = write_llm.invoke(f"""
            I have an ArangoDB Graph named {graph.name} with the following schema: {get_graph_schema(graph, db)}

            I have the following graph query: {state['query']}.

            I have executed the following AQL query to help me answer my query:
            ---
            {state['aql_query']}
            ---

            The query returned the following results: {state['execution_result']}.

            Based on my original Query and the results, generate a fully detailed response to
            answer my query. DO NOT MAKE UP ANY NUMBERS OR INFORMATION IF NOT PROVIDED IN THE RESULTS.
            
            Your response:
        """).content

        return {**state, "final_answer": aql_to_text}
    
    def should_retry(state: TextToAQLState) -> str:
        """Determine if we should retry query generation"""
        if state.get("error") and state["retry_count"] < MAX_RETRIES:
            return "retry"
        return "complete"
    
    # Initialize state
    state = {
        "query": query,
        "aql_query": None,
        "review_feedback": None,
        "execution_result": None,
        "error": None,
        "final_answer": None,
        "retry_count": 0,
        "review_count": 0
    }
    
    # Execute workflow
    while True:
        # Generate AQL
        state = generate_aql(state)
        
        # Review loop
        while state["review_count"] < MAX_REVIEWS:
            approved, new_state = review_query(state)
            state = new_state
            
            if approved:
                break
                
            state["review_count"] += 1
            if state["review_count"] >= MAX_REVIEWS:
                print(f"\nMax reviews ({MAX_REVIEWS}) reached. Proceeding with last query.")
                break
                
            # Regenerate query with review feedback
            state = generate_aql(state)
        
        # Execute AQL
        state = execute_aql(state)
        
        # Format answer if no error, otherwise check for retry
        if not state.get("error"):
            state = format_answer(state)
            break
        
        # Check if we should retry
        action = should_retry(state)
        if action == "complete":
            break
            
        # Reset review count and increment retry counter
        state["review_count"] = 0
        state["retry_count"] += 1
        print(f"\nRetrying... (Attempt {state['retry_count'] + 1}/{MAX_RETRIES + 1})")
    
    # Return final answer or error message
    if state.get("final_answer"):
        return state["final_answer"]
    else:
        return f"Failed to analyze query after {state['retry_count']} attempts. Last error: {state['error']}"
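
The control flow of the workflow above (generate, review up to `MAX_REVIEWS` times, execute, retry up to `MAX_RETRIES` times) can be exercised without an LLM or a database. The sketch below is a simplified stand-in, not the function above: `run_workflow` and the three stub callables are hypothetical names, the constant values are assumed, and passing the last error back into the generator is an assumption about how a retry might be informed.

```python
MAX_REVIEWS = 1  # assumed values for this sketch only
MAX_RETRIES = 2


def run_workflow(generate, review, execute):
    """Generate -> review -> execute, retrying when execution raises."""
    feedback = None
    error = None
    for attempt in range(MAX_RETRIES + 1):
        candidate = generate(feedback)
        reviews = 0
        while reviews < MAX_REVIEWS:
            approved, feedback = review(candidate)
            if approved:
                break
            reviews += 1
            if reviews >= MAX_REVIEWS:
                break  # out of reviews: proceed with the last candidate
            candidate = generate(feedback)
        try:
            result = execute(candidate)
        except Exception as exc:  # broad catch, mirroring execute_aql
            error = f"AQL ERROR: {exc}"
            feedback = error  # assumption: the error informs the next attempt
            continue
        return {"result": result, "attempts": attempt + 1, "error": None}
    return {"result": None, "attempts": MAX_RETRIES + 1, "error": error}


# Stubs: the first generated "query" fails at execution time, the second succeeds.
candidates = iter(["RETURN bad", "RETURN 1"])


def fake_generate(feedback):
    return next(candidates)


def fake_review(candidate):
    return True, "APPROVE"


def fake_execute(candidate):
    if "bad" in candidate:
        raise ValueError("syntax error")
    return [1]


outcome = run_workflow(fake_generate, fake_review, fake_execute)
print(outcome)
```

Swapping the stubs for the real LLM and database calls keeps the loop structure unchanged, which makes the retry behavior easy to check in isolation.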

In [18]:
# example usage
test_aql_tool_question = '''What is the most popular node in the graph and why? Give me the statistics that support your answer.
            Also are there isolated nodes in the graph? If so how many are there?'''
test_aql_tool_answer = text_to_aql_to_text_agentic(test_aql_tool_question)
print("\nFinal Answer:", test_aql_tool_answer)

1) Generating AQL query
----------
LET degrees = (
  FOR e IN majorana_1_quantum_chip_node_to_majorana_1_quantum_chip_node
    FOR nodeId IN [e._from, e._to]
      COLLECT id = nodeId WITH COUNT INTO degree
      RETURN { id, degree }
)

LET most_popular = (
  FOR d IN degrees
    SORT d.degree DESC
    LIMIT 1
    LET node = DOCUMENT(d.id)
    RETURN { key: node._key, id: node._id, degree: d.degree }
)[0]

LET totalNodes = LENGTH(FOR n IN majorana_1_quantum_chip_node RETURN 1)

LET connectedNodes = LENGTH(degrees)

LET isolated_count = totalNodes - connectedNodes

RETURN { most_popular, isolated_nodes: isolated_count }
----------

2) Reviewing AQL query
Review result: APPROVE

**Review Summary:**

1. **Syntactically Correct:**  
   The AQL query is free from syntax errors and follows the correct structure for ArangoDB queries.

2. **Follows ArangoDB Best Practices:**  
   - **Use of Collections and Attributes:**  
     The query correctly references the `majorana_1_quantum_chip_node` 
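
To make the generated AQL query above easier to follow, here is a rough pure-Python sketch of the same counting logic on a toy edge list. The function name `degree_stats` and the toy data are hypothetical. Like the query, it assumes every edge endpoint exists in the node collection, so that isolated nodes can be counted as total nodes minus nodes that appear on at least one edge.

```python
from collections import Counter


def degree_stats(node_ids, edges):
    """Return (most_connected_node, its_degree, isolated_count)."""
    degree = Counter()
    for source, target in edges:
        # Each edge adds one to both endpoints, like iterating [e._from, e._to].
        degree[source] += 1
        degree[target] += 1
    top_node, top_degree = degree.most_common(1)[0] if degree else (None, 0)
    # Nodes that never appear as an endpoint are isolated.
    isolated = len(set(node_ids)) - len(degree)
    return top_node, top_degree, isolated


nodes = ["n/1", "n/2", "n/3", "n/4", "n/5"]
edges = [("n/1", "n/2"), ("n/1", "n/3"), ("n/2", "n/3"), ("n/1", "n/4")]
stats = degree_stats(nodes, edges)
print(stats)
```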

### Once we have the two tools, we can bind them to a ReAct agent and build a system that handles graph analytics tasks for ChishikAI. This system is in charge of any query that requires graph analytics. It has 3 components:
1. A planner agent that breaks the query down into a step-by-step plan and suggests a tool (nx or aql) for each step.
2. A ReAct agent that executes each step and relays the results of previous steps to the next one.
3. A synthesizer agent that synthesizes the answer to the query from the ReAct agent's results.

One thing I want to note is that I first tested a single ReAct agent doing all of this. However, I think the cognitive load on that one agent is too heavy, which leads to poor performance, especially on complex and hybrid queries. So I added the planner and synthesizer agents to reduce this load and increase the system's capacity. A form of test-time scaling, I guess.

Another nice thing about this system is that I can peek into its plan and see what steps it takes and why. I have put 3 example queries below with their plans and answers. Please note how the system dynamically uses both the nx and aql tools.

In [325]:
tools = [text_to_aql_to_text_agentic, text_to_nx_algorithm_to_text_agentic]

# using Pydantic models for structuring the output of the planner agent
class QueryPlanStep(BaseModel):
    """Model for a single step in the query plan"""
    step_number: int = Field(description="The sequence number of this step")
    description: str = Field(description="Description of what this step should do")
    tool_suggestion: str = Field(description="Either 'aql' or 'networkx'")
    reasoning: str = Field(description="Explanation for why this tool was chosen")

class QueryPlan(BaseModel):
    """Model for the complete query plan"""
    steps: List[QueryPlanStep] = Field(description="Sequential steps to execute the query")

# the actual query graph function
def query_graph(query):
    """
    Execute a natural language query against a knowledge graph using a multi-step planning and execution approach.

    Args:
        query (str): Natural language query about the knowledge graph

    Returns:
        tuple: (QueryPlan, str) - Contains the execution plan and final summarized results
    """
    llm = ChatOpenAI(model_name="gpt-4o-mini")
    
    # Configure LLM for structured output
    planner_llm = llm.with_structured_output(QueryPlan)
    
    # Get the structured plan
    plan = planner_llm.invoke(f'''You are a graph query planner. Given a query about a knowledge graph, break it down into a set of sequential steps.
                                If there are steps that can be grouped together into one, do so. It is more efficient to do so.
                                When there are multiple solutions or options for a step, choose the most appropriate one YOURSELF.
                                 Consider that:
                                 - NetworkX is good first choice for most cases of graph analytics, like popularity and centrality measures, and path finding
                                 - AQL is good for large-scale traversals and complex pattern matching
                                 
                                 Query to plan: {query}
                              ''')
    # Create the execution agent with tools
    app = create_react_agent(llm, tools)
    
    # Execute each step in the plan
    results = []
    for step in plan.steps:
        step_prompt = f'''You are an intelligent graph query assistant executing a specific step in a larger plan.
                         Current step: {step.description}
                         Suggested tool: {step.tool_suggestion}
                         
                         Execute this step using the appropriate tool. If the suggested tool fails, try the alternative.
                         Previous steps results (if any): {results}
                         '''
        
        step_result = app.invoke({"messages": [{"role": "user", "content": step_prompt}]})
        results.append(step_result["messages"][-1].content)
    
    # Combine and summarize results; the original query is included so the synthesizer knows what to answer
    summary_prompt = f"Summarize the following step results into a coherent response to the query: {query}. Step results: {results}. ONLY USE THE INFORMATION PROVIDED IN THE RESULTS TO ANSWER THE QUERY. DO NOT MAKE UP ANY INFORMATION."
    final_summary = llm.invoke(summary_prompt).content
    
    return plan, final_summary
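
To see how results are relayed from step to step without calling any model, the loop in `query_graph` can be dry-run with stand-in callables. This is only a sketch: `Step`, `dry_run`, `fake_execute`, and the toy plan are hypothetical names that replace the Pydantic models, the ReAct agent, and the synthesizer LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    description: str
    tool_suggestion: str  # 'aql' or 'networkx'


def dry_run(
    steps: List[Step],
    execute_step: Callable[[Step, List[str]], str],
    synthesize: Callable[[List[str]], str],
) -> Tuple[List[str], str]:
    results: List[str] = []
    for step in steps:
        # Each step sees a copy of every earlier result, as in the step prompt.
        results.append(execute_step(step, list(results)))
    return results, synthesize(results)


plan = [
    Step("Find the most popular node", "networkx"),
    Step("Fetch that node's attributes", "aql"),
]


def fake_execute(step, previous):
    return f"[{step.tool_suggestion}] {step.description} (saw {len(previous)} earlier result(s))"


results, summary = dry_run(plan, fake_execute, lambda r: " | ".join(r))
print(summary)
```

Because `results` grows with every step, later prompts get longer; that is worth keeping in mind for plans with many steps.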

In [241]:
test_query_graph_question_one = '''What is the most popular node in the graph? Give me all its attributes.'''
test_query_graph_plan_one, test_query_graph_answer_one = query_graph(test_query_graph_question_one)

1) Generating NetworkX code
----------
def analyze_graph(G):
    return "degree centrality"
----------

2) Reviewing NetworkX code
Review result: REJECT: The function does not analyze the graph to determine the centrality measure and merely returns a static string.
----------

Max reviews (1) reached. Proceeding with last code.

3) Executing NetworkX code
----------
Result: degree centrality
----------
4) Formulating final answer
1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    centrality = nx.degree_centrality(G)
    most_popular_node = max(centrality, key=centrality.get)
    return most_popular_node
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: majorana_1_quantum_chip_node/55
----------
4) Formulating final answer
1) Generating AQL query
----------
RETURN DOCUMENT('majorana_1_quantum_chip_node/55')
----------

2) Reviewing AQL query
Review result: APPROVE
----------

3)

Here you can read the plan for the first question.

In [243]:
for step in test_query_graph_plan_one.steps:
    print(step.description)
    print(step.tool_suggestion)
    print(step.reasoning)
    print('-'*10)

Identify the centrality measure that defines 'popularity' in the graph.
networkx
Using NetworkX to calculate centrality measures helps identify the most popular node based on established metrics like degree centrality or betweenness centrality.
----------
Calculate the centrality scores for all nodes to find the most popular node.
networkx
NetworkX provides a straightforward way to compute centrality scores efficiently across the graph.
----------
Retrieve the attributes of the most popular node identified in the previous step.
aql
Using AQL for this step is appropriate as it allows for direct querying of node attributes in a structured manner once the node is known.
----------


Here you can see the answer to the first question.

In [244]:
pretty_markdown(test_query_graph_answer_one)

The graph's degree centrality identifies 'popularity,' with the most popular node being `majorana_1_quantum_chip_node/55`, which represents **Microsoft**, classified as an **"ORGANIZATION."** 

Here are the detailed attributes associated with this node:

- **Key**: `55`
- **ID**: `majorana_1_quantum_chip_node/55`
- **Revision**: `_jVOuoqi--1`
- **Entity Type**: "ORGANIZATION"

**Description**: Microsoft is a leading multinational technology corporation known for its significant contributions to the tech sector, particularly in software, hardware, and cloud services, with products such as Windows and Microsoft Office. Recently, the company has focused heavily on innovation in quantum computing, notably through the development of the Majorana 1 chip, which features 8 qubits. Microsoft is pioneering a new architecture for quantum computers using topological qubits, with the goal of creating a fault-tolerant quantum computer capable of transforming various industries.

- **Original ID**: "MICROSOFT"

In summary, this node encapsulates Microsoft's technological achievements and its ongoing initiatives in quantum computing, especially concerning the Majorana 1 chip's development.

The second question is more complex.

In [247]:
test_query_graph_question_two = '''What is the third most popular node in the graph? What is the shortest path on the graph connecting the most popular node and the third most popular node?'''
test_query_graph_plan_two, test_query_graph_answer_two = query_graph(test_query_graph_question_two)

1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    degrees = G.degree()
    sorted_nodes = sorted(degrees, key=lambda x: x[1], reverse=True)
    if len(sorted_nodes) < 3:
        return None
    third_node = sorted_nodes[2]
    return {'node': third_node[0], 'degree': third_node[1]}
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: {'node': 'majorana_1_quantum_chip_node/53', 'degree': 16}
----------
4) Formulating final answer
1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    max_node = max(G.degree, key=lambda x: x[1])[0]
    return max_node
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: majorana_1_quantum_chip_node/55
----------
4) Formulating final answer
1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    source = 'majorana

Here are the plan and the answer for the second question.

In [248]:
for step in test_query_graph_plan_two.steps:
    print(step.description)
    print(step.tool_suggestion)
    print(step.reasoning)
    print('-'*10)
pretty_markdown(test_query_graph_answer_two)

Identify all nodes and their popularity metrics to find the third most popular node in the graph.
networkx
NetworkX provides effective analytics capabilities, including centrality measures and node ranking, which is suitable for determining the popularity of nodes.
----------
Once the third most popular node is identified, locate the most popular node in the graph.
networkx
This continues to leverage NetworkX's capabilities for efficient node handling and analytics.
----------
Compute the shortest path between the most popular node and the third most popular node.
networkx
NetworkX is ideal for pathfinding tasks, allowing for the quick and efficient calculation of the shortest path between nodes.
----------


In the graph, the most popular node is `'majorana_1_quantum_chip_node/55'`, which has the highest degree centrality. The third most popular node is `'majorana_1_quantum_chip_node/53'`, with a degree of 16. The shortest path connecting these two nodes is direct: `['majorana_1_quantum_chip_node/55', 'majorana_1_quantum_chip_node/53']`.
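
On an unweighted graph, a shortest path like the one reported above can be found with breadth-first search. The sketch below shows the idea on a toy adjacency dictionary. It is not the code the agent generated (that output is truncated above), and `bfs_shortest_path` and the toy graph are hypothetical.

```python
from collections import deque


def bfs_shortest_path(adjacency, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


adjacency = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
path = bfs_shortest_path(adjacency, "a", "e")
direct = bfs_shortest_path(adjacency, "a", "b")
print(path, direct)
```

When two nodes share an edge, as nodes 55 and 53 do in the answer above, the path is simply the pair of endpoints.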

The third question:

In [292]:
test_query_graph_question_three = '''Give me the node ids of ALL nodes that are 2 hops away from the second most popular node. Among those nodes, how many have an ORIGINAL ID that starts with the letter "L"?'''
test_query_graph_plan_three, test_query_graph_answer_three = query_graph(test_query_graph_question_three)

1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    centrality = nx.degree_centrality(G)
    sorted_nodes = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
    return sorted_nodes[1][0] if len(sorted_nodes) >= 2 else None
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: majorana_1_quantum_chip_node/54
----------
4) Formulating final answer
1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    source = 'majorana_1_quantum_chip_node/54'
    paths = nx.single_source_shortest_path_length(G, source, cutoff=2)
    two_hop_nodes = [node for node, length in paths.items() if length == 2]
    return two_hop_nodes
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: ['majorana_1_quantum_chip_node/225', 'majorana_1_quantum_chip_node/229', 'majorana_1_quantum_chip_node/

The system came up with a hybrid plan for this question too.

In [294]:
for step in test_query_graph_plan_three.steps:
    print(step.description)
    print(step.tool_suggestion)
    print(step.reasoning)
    print('-'*10)

Identify the second most popular node in the graph based on a defined popularity metric (e.g., degree).
networkx
NetworkX is suitable for analyzing node attributes such as popularity, which can be measured through degree centrality.
----------
Perform a breadth-first search (BFS) from the second most popular node to find all nodes that are 2 hops away.
networkx
NetworkX's BFS is optimal for finding nodes at a specific distance in a graph.
----------
Filter the nodes found in step 2 to extract those whose original ID starts with the letter 'L'.
aql
AQL is better suited for filtering nodes based on string conditions after the graph structure has been traversed.
----------
Count the number of nodes that meet the criteria from step 3.
aql
Counting can be efficiently handled in AQL after filtering the relevant node IDs.
----------


In [295]:
pretty_markdown(test_query_graph_answer_three)

The node `majorana_1_quantum_chip_node/54` is the second most popular in the graph, based on degree centrality. It has 59 nodes that are 2 hops away from it. The specific nodes at this distance include (but are not limited to) `majorana_1_quantum_chip_node/225`, `majorana_1_quantum_chip_node/229`, and `majorana_1_quantum_chip_node/127`, among others.

Additionally, a query was executed to filter nodes whose original ID starts with the letter 'L', but it returned an empty array, indicating there are no nodes in the `majorana_1_quantum_chip_node` collection that match this criterion. This suggests there are either no such IDs in the existing nodes or that they do not meet the filtering condition.
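
The two steps in this plan (collect nodes exactly 2 hops away, then filter by the first letter of the original ID) can be sketched in plain Python. One possible cause of the empty filter result, not verified in this notebook, is that stored IDs carry surrounding quotes or differ in case, so that a plain prefix match on `L` fails. The normalization below is an assumption to illustrate that idea, and `nodes_at_distance`, `starts_with_letter`, and the toy data are hypothetical.

```python
def nodes_at_distance(adjacency, source, hops):
    """Nodes whose shortest-path distance from source is exactly `hops`."""
    seen = {source}
    frontier = {source}
    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.add(neighbor)
        frontier = next_frontier
    return frontier


def starts_with_letter(original_id, letter):
    # Assumption: IDs may carry surrounding quotes or mixed case, so normalize first.
    cleaned = original_id.strip().strip('"').upper()
    return cleaned.startswith(letter.upper())


adjacency = {
    "hub": ["x", "y"],
    "x": ["hub", "p"],
    "y": ["hub", "q", "x"],
    "p": ["x"],
    "q": ["y"],
}
original_ids = {"hub": "HUB", "x": "LAB", "y": "YIELD", "p": '"LATTICE"', "q": "QUBIT"}

two_hop = nodes_at_distance(adjacency, "hub", 2)
matches = sorted(n for n in two_hop if starts_with_letter(original_ids[n], "L"))
print(sorted(two_hop), matches)
```

Note that `x` also has an ID starting with "L" but is excluded because it is only 1 hop away.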

# ChishikAI System
For now, I have designed ChishikAI to handle 3 kinds of queries. Each kind is routed to one of the 3 routes below:
- **Route 1**: General semantic questions that LightRAG can answer directly, e.g. 'Give me an overview of the main ideas'
- **Route 2**: Complex questions that require graph analytics, e.g. 'Tell me about the most important topic in this graph'
- **Route 3**: Expansion, e.g. 'Expand the graph with more info about IBM'

### Route 1: Direct LightRAG Queries

In [299]:
lightrag_good_answer = lightrag.query('''Tell me one interesting fact about IBM''', param=QueryParam(only_need_context=False))

INFO:lightrag:Using global mode for query processing
INFO:lightrag:Query: IBM, Technology, History, top_k: 60, cosine: 0.2
INFO:lightrag:Global query uses 44 entites, 60 relations, 3 text units


This is the answer. Notice that it answers from its knowledge base and the fact is very relevant to the topic of quantum computing.

In [300]:
pretty_markdown(lightrag_good_answer)

One interesting fact about IBM is that it has developed the first-ever 1,000-qubit quantum chip, called the Condor processor, marking a significant milestone in the company's efforts to achieve commercial viability in large-scale quantum computing. This advancement positions IBM as a major player in the race to build powerful quantum computing systems, alongside competitors like Microsoft and Google.

This is an example of a question LightRAG cannot handle well. It incorrectly identified Google as the second most popular node.

In [301]:
lightrag_bad_answer = lightrag.query('''Tell me about the second most popular node in this knowledge graph?''', param=QueryParam(only_need_context=False))
pretty_markdown(lightrag_bad_answer)

Based on the information from the Knowledge Base, the second most popular node appears to be **Google**. Below is an overview of this entity and its significance:

### Google Overview
**Type:** Organization  
**Description:** Google is a leading technology company, renowned for its innovations in various fields, including quantum computing. It is recognized for its quantum chip, Willow, which demonstrates capabilities that exceed the processing power of current supercomputers. Google has achieved significant milestones in quantum technology, including claiming quantum supremacy in 2019.

### Quantum Computing Efforts
Google is actively developing quantum computing technologies, competing with other organizations like Microsoft. The company introduced the Willow quantum chip, aiming to enhance computational accuracy and speed in tasks traditionally dominated by classical computers.

### Major Achievements
- **Quantum Supremacy:** In 2019, Google claimed to have achieved quantum supremacy, marking a pivotal moment in the field of quantum computing.
- **Willow Quantum Chip:** This innovative chip is part of Google's strategy to lead the quantum computing frontier, aimed at performing complex computations rapidly.

Google's efforts in the quantum computing space position it as a critical player in the race for advanced computing capabilities, making it a direct competitor to Microsoft and other companies in the sector.

### Route 2: Queries With Graph Analytics
This route has several processing steps:
1. Graph analytics, handled by the **query_graph** function above.
2. Enhancing the semantic context of the relevant nodes returned by **query_graph**, using LightRAG.
3. Synthesizing the answer.

This is a set of functions that help enhance the semantic context of the nodes retrieved by **query_graph**.

In [276]:
from pydantic import BaseModel, Field
class NodeIDs(BaseModel):
    """Model for storing extracted node IDs."""
    node_ids: List[str] = Field(description="List of Arango graph node IDs found in the text. Has the format <graph_name>/<node_id>")

def get_node_attributes(graph, node_id):
    '''Return the original_id of the node given the node id'''
    return graph.nodes[node_id].get('original_id', None)

def get_nodes_info(query):
    """
    Retrieve detailed information about nodes identified in a query.

    Args:
        query (str): Query to identify relevant nodes in the graph

    Returns:
        dict: Dictionary mapping node IDs to their original IDs and associated information
    """
    
    llm = ChatOpenAI(model_name="gpt-4o-mini")
    llm = llm.with_structured_output(NodeIDs)
    node_ids = llm.invoke(query).node_ids
    
    
    nodes_info = {} 
    for node_id in node_ids:
        node_original_id = get_node_attributes(G_adb_undirected, node_id)
        node_info = lightrag.query(f'''Give me concise information about this node {node_original_id}''')
        nodes_info[f'{node_id}'] = {f'{node_original_id}': node_info}
    
    
    return nodes_info
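
`get_nodes_info` relies on an LLM with structured output to pull node IDs out of free text. As a deterministic cross-check, the same IDs can often be extracted with a regular expression. This is a sketch under an assumption: IDs look like `<collection>/<numeric key>` and the collection name ends in `_node`, as in the outputs above. `NODE_ID_PATTERN` and `extract_node_ids` are hypothetical names.

```python
import re

# Assumption: IDs look like '<collection>/<numeric key>' with a collection name ending in '_node'.
NODE_ID_PATTERN = re.compile(r"\b[A-Za-z0-9_]+_node/\d+\b")


def extract_node_ids(text):
    """Return unique node IDs in order of first appearance."""
    seen = {}
    for match in NODE_ID_PATTERN.findall(text):
        seen.setdefault(match, None)
    return list(seen)


sample = (
    "The most popular node is `majorana_1_quantum_chip_node/55`; the third is "
    "'majorana_1_quantum_chip_node/53'. Path: ['majorana_1_quantum_chip_node/55', "
    "'majorana_1_quantum_chip_node/53']."
)
node_ids = extract_node_ids(sample)
print(node_ids)
```

Comparing this list with the LLM's `node_ids` is a cheap way to spot IDs the model missed or invented.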

This is the main function for Route 2.

In [344]:
def query_graph_need_graph_analytics(query):
    """
    Execute a graph query with enhanced analytics and node information synthesis.

    Args:
        query (str): User query requiring graph analytics and node information

    Returns:
        str: Synthesized answer combining graph analysis and detailed node information
    """
    llm = ChatOpenAI(model_name="gpt-4o-mini")
    
    # run graph analytics to find relevant nodes and graph statistics to the ChishikAI query
    _, query_graph_answer = query_graph(query)
    
    # enhance the nodes 
    graph_query_relevant_nodes_context = get_nodes_info(query_graph_answer)
    
    prompt = f'''You are an expert about the topic {research_topic_underscore}. 
                        You are given a query: {query} to answer. 
                        This is some helpful info from the knowledge graph: {query_graph_answer}.
                        More info on the nodes mentioned above is in the following list: {graph_query_relevant_nodes_context}.
                        Synthesize a detailed and comprehensive answer to the query based on the info provided.
                        If you do not have enough information to answer the query, just say so.
                        '''
    answer = llm.invoke(prompt)
    
    
    return answer.content

##### Now let's test this function on the query that LightRAG alone could not answer

In [None]:
query_graph_better_answer_than_lightrag = query_graph_need_graph_analytics('''Tell me about the second most popular node in this knowledge graph?''')


1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    popularity = [{'id': node, 'connections': degree} for node, degree in G.degree()]
    return popularity
----------

2) Reviewing NetworkX code
Review result: APPROVE
----------

3) Executing NetworkX code
----------
Result: [{'id': 'majorana_1_quantum_chip_node/141', 'connections': 1}, {'id': 'majorana_1_quantum_chip_node/562', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/371', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/394', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/314', 'connections': 1}, {'id': 'majorana_1_quantum_chip_node/384', 'connections': 2}, {'id': 'majorana_1_quantum_chip_node/530', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/391', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/424', 'connections': 0}, {'id': 'majorana_1_quantum_chip_node/452', 'connections': 1}, {'id': 'majorana_1_quantum_chip_node/533', 'connections': 1}, {'

In [335]:
pretty_markdown(query_graph_better_answer_than_lightrag)

The second most popular node in the knowledge graph is `majorana_1_quantum_chip_node/54`, which has 20 connections. This node represents Majorana 1, Microsoft's first Quantum Processing Unit (QPU) powered by a topological core, aimed at advancing fault-tolerant quantum computing. 

Key features of Majorana 1 include:
- **Topological Core Architecture**: It utilizes a unique architecture designed for scalable and stable qubits, addressing significant error correction challenges.
- **Prototype Functionality**: Currently functioning as an eight-qubit prototype, it incorporates advanced features to enhance stability and reduce error rates.
- **New State of Matter**: The chip introduces a "topological state" crucial for its operation.

Majorana 1 is also geared towards solving complex problems in various fields such as drug discovery, AI optimization, climate science, and financial modeling. It represents a significant milestone in Microsoft's ongoing commitment to quantum computing.

Route 2 gave us the right node along with its info.

### Route 3: Expansion
What makes ChishikAI 'deep' is its ability to expand and grow the knowledge graph. This route does that when the user asks for more info about a topic or a set of topics.




In [328]:
# expansion functions

def expand_graph_on_topic(expand_topic, max_results):
    """
    Search for a topic, process web pages, and add them to the existing LightRAG instance.
    
    Args:
        expand_topic (str): The topic to search for
        max_results (int): Maximum number of search results to process
    """
    print(f"Starting search and processing for: {expand_topic}")
    
    # Step 1: Search DuckDuckGo
    results = search_duckduckgo(expand_topic, max_results=max_results)
    if not results: 
        print("No search results found")
        return
    
    # Step 2: Fetch and save content
    query_folder, processed_files = fetch_and_save_content(
        results, 
        research_topic_underscore,
        max_workers=2
    )
    
    if not processed_files:
        print("No files were successfully processed")
        return
    
    # Step 3: Process and insert documents into existing lightrag instance
    print(f"Processing {len(processed_files)} documents...")
    for i, md_file in enumerate(processed_files, 1):
        print(f"Processing file {i}/{len(processed_files)}: {md_file}")
        with open(md_file, 'r', encoding='utf-8') as f:
            content = f.read()
            # Insert content into existing lightrag instance
            lightrag.insert(content)
            print(f"File {i}: Inserted successfully")
    
    print(f"Search and processing completed for: {expand_topic}")

class ExpandTopics(BaseModel):
    topics: List[str] = Field(description='''List of topics that the user shows interest in in the query and wants to get more info on from web research to expand the knowledge graph.
                              Extract the topic keywords exactly as they are in the query. Do not add any additional topics on your own.
                              ''')

def extract_topic_and_expand_graph(query, max_results=1):
    '''Use this tool to extract the topics the user wants to expand the graph on.'''
    llm = ChatOpenAI(model_name="gpt-4o-mini")
    llm = llm.with_structured_output(ExpandTopics)
    expand_topics = llm.invoke(query)
    print(expand_topics.topics)
    for topic in expand_topics.topics:
        expand_graph_on_topic(topic, max_results=max_results)
    return 'Successfully expanded the graph on the topics: ' + ', '.join(expand_topics.topics) + '!'


In [329]:
test_expand_graph = extract_topic_and_expand_graph('''I am interested in learning more about Google's quantum chip.''', max_results=2)

["Google's quantum chip"]
Starting search and processing for: Google's quantum chip
Starting DuckDuckGo search for: Google's quantum chip


INFO:primp:response: https://lite.duckduckgo.com/lite/ 200 17984


Found 2 results
Starting content fetch for 2 results
Processing URL 1: https://blog.google/technology/research/google-willow-quantum-chip/
Processing URL 2: https://en.wikipedia.org/wiki/Willow_processor


INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document Willow_processor
INFO:docling.document_converter:Finished converting document Willow_processor in 0.17 sec.
INFO:docling.document_converter:Going to convert document batch...
INFO:docling.pipeline.base_pipeline:Processing document google-willow-quantum-chip
INFO:docling.document_converter:Finished converting document google-willow-quantum-chip in 0.23 sec.
INFO:lightrag:Processing 1 new unique documents


Successfully processed https://en.wikipedia.org/wiki/Willow_processor
Successfully processed https://blog.google/technology/research/google-willow-quantum-chip/
Content fetch completed in 0.41 seconds. Successfully processed 2 files.
Processing 2 documents...
Processing file 1/2: ddgo_pages_md/majorana_1_quantum_chip/1_https%3A%2F%2Fblog.google%2Ftechnology%2Fresearch%2Fgoogle-willow-quantum-chip%2F.md


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 4 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.21batch/s]

INFO:lightrag:Inserting 98 vectors to entities
Generating embeddings: 100%|██████████| 4/4 [00:01<00:00,  2.95batch/s]
INFO:lightrag:Inserting 34 vectors to relationships
Generating embeddings: 100%|██████████| 2/2 [00:03<00:00,  1.60s/batch]
INFO:lightrag:Writing graph with 649 nodes, 233 edges
Processing batch 1: 100%|██████████| 1/1 [00:35<00:00, 35.08s/it]
INFO:lightrag:Processing 1 new unique documents


File 1: Inserted successfully
Processing file 2/2: ddgo_pages_md/majorana_1_quantum_chip/2_https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FWillow_processor.md


Processing batch 1:   0%|          | 0/1 [00:00<?, ?it/s]INFO:lightrag:Inserting 2 vectors to chunks
Generating embeddings: 100%|██████████| 1/1 [00:03<00:00,  3.06s/batch]

INFO:lightrag:Inserting 31 vectors to entities
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.14s/batch]
INFO:lightrag:Inserting 27 vectors to relationships
Generating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.23batch/s]
INFO:lightrag:Writing graph with 669 nodes, 259 edges
Processing batch 1: 100%|██████████| 1/1 [00:31<00:00, 31.60s/it]

File 2: Inserted successfully
Search and processing completed for: Google's quantum chip





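##### The graph size has increased after expansion: `G_adb_undirected` still reports 567 nodes and 200 edges, while the updated LightRAG GraphML file has 669 nodes and 259 edges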

In [338]:
print(G_adb_undirected.number_of_nodes())
print(G_adb_undirected.number_of_edges())

567
200


In [336]:
G = nx.read_graphml('lightrag/majorana_1_quantum_chip/graph_chunk_entity_relation.graphml')
print(G.number_of_nodes())
print(G.number_of_edges())


669
259


### THE ROUTER
I use an LLM with structured output to decide the route based on the user's query.

In [330]:
from pydantic import BaseModel, Field
from typing import List, Literal

class LLMRouterOutput(BaseModel):
    """Model for storing the output of the LLM router."""
    route: Literal['query_graph_with_graph_analytics', 'lightrag', 'expansion', 'verify with user'] = Field(
        description="The chosen route based on the query"
    )

def llm_router(query: str) -> LLMRouterOutput:
    """
    Routes the query to one of the processing paths based on its content.
    
    Parameters:
    query (str): The input query to be routed.
    
    Returns:
    LLMRouterOutput: The output containing the chosen route.
    """
    llm = ChatOpenAI(model_name="gpt-4o-mini")
    # Constrain the model's reply to the LLMRouterOutput schema
    llm = llm.with_structured_output(LLMRouterOutput)
    # The option names in the prompt must match the Literal values in LLMRouterOutput exactly
    route = llm.invoke(f'''You are given a query: "{query}"
                       
                       You must choose exactly ONE of the following routing options:
                       - 'query_graph_with_graph_analytics': For queries that need graph analytics to answer (about nodes, edges, relationships)
                       - 'lightrag': For information queries that don't need graph analytics
                       - 'expansion': For queries about expanding the knowledge graph
                       - 'verify with user': If the intent is unclear
                       
                       Return only one of these exact options.''')
    
    return route

We can run a few tests on the router to see if it works as expected.

In [331]:
llm_router('what is the most popular node in the graph?')

LLMRouterOutput(route='query_graph_with_graph_analytics')

In [332]:
llm_router('tell me about majorana particles?')

LLMRouterOutput(route='lightrag')

In [333]:
llm_router('expand the graph on the topic of quantum computing')

LLMRouterOutput(route='expansion')
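
The three manual checks above can also be collected into a small table-driven test. This is only a sketch: `keyword_router` is a hypothetical offline stand-in so the example runs without an API key, and `check_routes` is an illustrative helper, not something the notebook defines.

```python
def keyword_router(query: str) -> str:
    """Hypothetical offline stand-in for llm_router; NOT the notebook's LLM router."""
    q = query.lower()
    if "expand" in q:
        return 'expansion'
    if any(word in q for word in ("node", "edge", "graph")):
        return 'query_graph_with_graph_analytics'
    if q.strip():
        return 'lightrag'
    return 'verify with user'

def check_routes(router, cases):
    """Return the (query, expected, actual) triples where the router disagreed."""
    failures = []
    for query, expected in cases:
        actual = router(query)
        if actual != expected:
            failures.append((query, expected, actual))
    return failures

# The same three queries used in the cells above, with the routes they returned
cases = [
    ('what is the most popular node in the graph?', 'query_graph_with_graph_analytics'),
    ('tell me about majorana particles?', 'lightrag'),
    ('expand the graph on the topic of quantum computing', 'expansion'),
]
failures = check_routes(keyword_router, cases)
print(failures)
```

To exercise the real router instead of the stand-in, pass `lambda q: llm_router(q).route` as the first argument. Since LLM output can vary between runs, an occasional mismatch is worth inspecting rather than treating as a hard failure.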

### Finally, the function that does it all, **chishikai_answer**

In [345]:
def chishikai_answer(query):
    """Route the query with llm_router and dispatch it to the matching handler."""
    router_output = llm_router(query)
    if router_output.route == 'query_graph_with_graph_analytics':
        # Questions about nodes, edges, or relationships
        return query_graph_need_graph_analytics(query)
    elif router_output.route == 'lightrag':
        # Informational questions that do not need graph analytics
        return lightrag.query(query)
    elif router_output.route == 'expansion':
        # Requests to grow the knowledge graph on a topic
        return extract_topic_and_expand_graph(query)
    elif router_output.route == 'verify with user':
        return 'I do not know how to answer that. Please state your intent more clearly.'

test_chishikai_answer = chishikai_answer('Tell me about the most popular node in the graph?')

1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    return "Degree centrality is the most suitable popularity measure for a social network graph because it directly quantifies the number of connections each node has, effectively indicating popularity."
----------

2) Reviewing NetworkX code
Review result: REJECT: The function does not analyze the graph using NetworkX and merely returns a static string without utilizing the provided graph parameter G.
----------

Max reviews (1) reached. Proceeding with last code.

3) Executing NetworkX code
----------
Result: Degree centrality is the most suitable popularity measure for a social network graph because it directly quantifies the number of connections each node has, effectively indicating popularity.
----------
4) Formulating final answer
1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    centrality = nx.degree_centrality(G)
    return [{'id': node, 'degree_cen

In [346]:
pretty_markdown(test_chishikai_answer)

The most popular node in the graph related to the Majorana 1 quantum chip is identified as `'majorana_1_quantum_chip_node/55'`, which represents Microsoft. This node is notable for its high degree centrality score of 0.0901, indicating a significant number of connections compared to other nodes in the graph, many of which have a degree centrality of 0.0, reflecting no connections and lower popularity.

### Overview of Microsoft (Node ID: `'majorana_1_quantum_chip_node/55'`)
- **Entity Type**: Organization
- **Description**: Microsoft is a leading multinational technology corporation, particularly known for its contributions to software, hardware, and cloud services. The company has made substantial advancements in quantum computing, especially highlighted by its work on the Majorana 1 chip, which utilizes topological qubits aimed at creating fault-tolerant quantum computers.

### Outgoing Connections
Microsoft's node is connected to various partners and initiatives:
1. **Collaboration with Atom Computing (Weight: 40)**: Focuses on advancements in quantum computing technologies.
2. **Partnership with Quantinuum (Weight: 40)**: Joint efforts on hybrid chemistry simulations and enhancing quantum capabilities.
3. **Development of Majorana particles (Weight: 9)**: Critical for their approach to stable quantum computing.
4. **Comparison of quantum computing timelines with Nvidia (Weight: 6)**: This highlights their forward-looking approach and optimism in the field.
5. **Development of Majorana 1 Chip (Weight: 28)**: Marks significant progress and technological advancements in quantum computing.

### Incoming Connections
Microsoft is also linked to several significant programs and collaborations:
1. **Participation in DARPA's US2QC Program (Weight: 8)**: This initiative helps in advancing Microsoft's quantum computing roadmap.
2. **Involvement in QBI (Weight: 7)**: Collaborative efforts to evaluate quantum computing technologies.
3. **Advancement to the final phase of DARPA's quantum computing project (Weight: 14)**: Demonstrates a strong support from governmental entities for Microsoft's innovative endeavors.

### Strategic Role in Quantum Computing
This detailed overview emphasizes Microsoft's influential role in the quantum computing landscape, showcasing its strategic partnerships and commitment to innovation. Its engagement with various organizations and platforms, like Azure Quantum, positions it as a leader in advancing quantum technologies and integrating these solutions with traditional computing paradigms.

Overall, Microsoft's extensive collaborations and focused research underscore its status as the most popular node in the graph related to quantum computing, particularly through its development of the Majorana 1 chip.
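
As a design note on `chishikai_answer`: the `if`/`elif` chain could alternatively be written as a lookup table from route name to handler, which makes adding a route a one-line change. The sketch below is an assumption-laden illustration, not the notebook's implementation; `make_dispatcher` and the stub router and handlers are hypothetical.

```python
def make_dispatcher(router, handlers, fallback):
    """Build an answer function that looks up the handler for the chosen route."""
    def answer(query):
        route = router(query)
        handler = handlers.get(route)
        if handler is None:
            # Unknown or 'verify with user' routes fall back to a fixed message
            return fallback
        return handler(query)
    return answer

# Stub router and handlers standing in for the notebook's real functions
def stub_router(query):
    return 'expansion' if 'expand' in query.lower() else 'lightrag'

handlers = {
    'lightrag': lambda q: f"[lightrag] {q}",
    'expansion': lambda q: f"[expansion] {q}",
}
fallback = 'I do not know how to answer that. Please state your intent more clearly.'

answer = make_dispatcher(stub_router, handlers, fallback)
print(answer('expand the graph on quantum computing'))
print(answer('tell me about majorana particles'))
```

With the real components, `router` would be something like `lambda q: llm_router(q).route` and the dictionary values would be the notebook's own handler functions.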

### GRADIO UI

In [347]:
import gradio as gr

# Create a wrapper function that handles the chat history parameter
def chat_wrapper(message, history):
    # We only pass the user's message to the original function
    return chishikai_answer(message)

# Create a full chat interface with title and subtitle
with gr.Blocks(theme="default") as demo:
    gr.Markdown("# ChishikAI")
    gr.Markdown("### Your Deep Graph Researcher")
    
    chat_interface = gr.ChatInterface(
        fn=chat_wrapper,
    )

demo.launch(share=False)



* Running on local URL:  http://127.0.0.1:7866

To create a public link, set `share=True` in `launch()`.
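
Because the chat function calls an LLM pipeline that can fail partway through, it may help to catch errors in the wrapper so the chat window shows a readable message instead of a stack trace. This is a minimal sketch under that assumption; `make_safe_chat_wrapper` and the stub answer function are hypothetical, and the `(message, history)` signature simply mirrors the `chat_wrapper` above.

```python
def make_safe_chat_wrapper(answer_fn):
    """Wrap an answer function so empty input and exceptions become chat replies."""
    def chat_wrapper(message, history):
        # history is accepted for the chat interface but not used
        if not message or not message.strip():
            return "Please enter a question."
        try:
            return str(answer_fn(message))
        except Exception as exc:
            return f"Something went wrong while answering: {exc}"
    return chat_wrapper

# Stub standing in for chishikai_answer so the sketch runs on its own
def stub_answer(message):
    if "boom" in message:
        raise ValueError("simulated failure")
    return f"Answer to: {message}"

safe_wrapper = make_safe_chat_wrapper(stub_answer)
print(safe_wrapper("What is Majorana 1?", []))
print(safe_wrapper("boom", []))
```

In the notebook, `make_safe_chat_wrapper(chishikai_answer)` could then be passed as `fn` to the chat interface.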




1) Generating NetworkX code
----------
def analyze_graph(G):
    import networkx as nx
    connected_components = list(nx.connected_components(G))
    number_of_connected_components = len(connected_components)
    connected_components_sizes = [len(component) for component in connected_components]
    return {
        'number_of_connected_components': number_of_connected_components,
        'connected_components_sizes': connected_components_sizes
    }
----------

2) Reviewing NetworkX code
Review result: REJECT: The code uses `nx.connected_components`, which only works for undirected graphs. If `G` is a directed graph, you should use `nx.strongly_connected_components` or `nx.weakly_connected_components` instead.
----------

Max reviews (1) reached. Proceeding with last code.

3) Executing NetworkX code
----------
Result: {'number_of_connected_components': 395, 'connected_components_sizes': [3, 1, 1, 1, 2, 120, 1, 1, 1, 3, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2



----------
Result: {'majorana_1_quantum_chip_node/141': {'degree_centrality': 0.0017667844522968198, 'closeness_centrality': 0.002355712603062426, 'betweenness_centrality': 0.0}, 'majorana_1_quantum_chip_node/562': {'degree_centrality': 0.0, 'closeness_centrality': 0.0, 'betweenness_centrality': 0.0}, 'majorana_1_quantum_chip_node/371': {'degree_centrality': 0.0, 'closeness_centrality': 0.0, 'betweenness_centrality': 0.0}, 'majorana_1_quantum_chip_node/394': {'degree_centrality': 0.0, 'closeness_centrality': 0.0, 'betweenness_centrality': 0.0}, 'majorana_1_quantum_chip_node/314': {'degree_centrality': 0.0017667844522968198, 'closeness_centrality': 0.0017667844522968198, 'betweenness_centrality': 0.0}, 'majorana_1_quantum_chip_node/384': {'degree_centrality': 0.0035335689045936395, 'closeness_centrality': 0.0799342959392181, 'betweenness_centrality': 0.0007379843021983177}, 'majorana_1_quantum_chip_node/530': {'degree_centrality': 0.0, 'closeness_centrality': 0.0, 'betweenness_centralit

INFO:lightrag:Using global mode for query processing
INFO:lightrag:Query: Cloud computing, Azure, Information technology, top_k: 60, cosine: 0.2
INFO:lightrag:Global query uses 38 entites, 60 relations, 3 text units
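
The reviewer's objection in the connected-components log above reflects real NetworkX behavior: `nx.connected_components` is not implemented for directed graphs. One way to make generated code robust to either graph type is to branch on `G.is_directed()`. The sketch below is illustrative only; `component_sizes` is a hypothetical helper, and treating weak connectivity as the directed counterpart is an assumption about what the query intends.

```python
import networkx as nx

def component_sizes(G):
    """Return component sizes, largest first, for directed or undirected graphs."""
    if G.is_directed():
        # Weak connectivity ignores edge direction, matching the undirected notion
        components = nx.weakly_connected_components(G)
    else:
        components = nx.connected_components(G)
    return sorted((len(c) for c in components), reverse=True)

# Toy directed graph: one chain of three nodes, one pair, one isolated node
G = nx.DiGraph([("a", "b"), ("b", "c"), ("d", "e")])
G.add_node("f")

sizes_directed = component_sizes(G)
sizes_undirected = component_sizes(G.to_undirected())
print(sizes_directed, sizes_undirected)
```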
