<h2>Dynamic Query Engine Selection</h2>

In [1]:
import os

from llama_index.core import VectorStoreIndex, ServiceContext
from llama_index.core import KnowledgeGraphIndex, SimpleDirectoryReader
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
from llama_index.core.graph_stores import SimpleGraphStore
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector
from llama_index.core.indices.composability import ComposableGraph
from llama_index.core.indices.keyword_table import GPTSimpleKeywordTableIndex


from dotenv import load_dotenv
from pathlib import Path
import requests

<h3>Setting up the Environment</h3>


In [2]:
# Load variables from the .env file into environment variables
dotenv_path = '.env'
load_dotenv(dotenv_path)


# Set the default LLM and chunk size for LlamaIndex
llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.llm = llm
Settings.chunk_size = 512
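
Both the OpenAI client and the Nebula graph store read their connection details from environment variables, so it can help to fail fast when the `.env` file is incomplete. The sketch below is a minimal check. The variable names `OPENAI_API_KEY`, `NEBULA_USER`, `NEBULA_PASSWORD`, and `NEBULA_ADDRESS` are assumptions based on common LlamaIndex setups and are not stated in this notebook, and `missing_env_vars` is a hypothetical helper.

```python
import os

# Assumed variable names; adjust them to match your own .env file.
REQUIRED_VARS = ["OPENAI_API_KEY", "NEBULA_USER", "NEBULA_PASSWORD", "NEBULA_ADDRESS"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Report anything that load_dotenv() did not provide.
missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.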

<h3>Setting up the Nebula Graph Store</h3>

In [3]:

space_name = "paul_graham_essay"
edge_types, rel_prop_names = ["relationship"], ["relationship"]
tags = ["entity"]

graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)

<h3>Getting Data from Wikipedia into Text Files</h3>

In [4]:
wiki_titles = ["Hyderabad", "Chennai", "Mumbai", "Delhi"]

for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            # 'exintro': True,
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]

    # Create the output directory if needed, then save the article text
    data_path = Path("data1")
    data_path.mkdir(exist_ok=True)

    with open(data_path / f"{title}.txt", "w", encoding="utf-8") as fp:
        fp.write(wiki_text)
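
The download loop assumes every title resolves to a page with an `extract` field. The MediaWiki API reports an unknown title as a page entry marked `missing` with no extract, which would raise a `KeyError` at `page["extract"]`. A minimal sketch of a more defensive parser follows. `extract_page_text` is a hypothetical helper, and the sample payloads are hand-written for illustration rather than real API responses.

```python
def extract_page_text(payload):
    """Return the plain-text extract from a MediaWiki query payload, or None."""
    pages = payload.get("query", {}).get("pages", {})
    if not pages:
        return None
    page = next(iter(pages.values()))
    # Missing titles come back without an "extract" field.
    return page.get("extract")


# Hand-written sample payloads (not real API output).
found = {"query": {"pages": {"123": {"title": "Chennai", "extract": "Chennai is a city."}}}}
absent = {"query": {"pages": {"-1": {"title": "Nowhere", "missing": ""}}}}

print(extract_page_text(found))
print(extract_page_text(absent))
```

A caller can then skip writing a file, or log the title, whenever the helper returns `None`.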

In [5]:
# Load all wiki documents
city_docs = {}
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(
        input_files=[f"data1/{wiki_title}.txt"]
    ).load_data()

<h3>Loading Documents from the Local Directory</h3>

In [6]:
documents = SimpleDirectoryReader("data1").load_data()

<h3>Creating the Knowledge Graph Index</h3>

In [7]:
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# NOTE: can take a while!
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
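
A knowledge graph index stores facts as (subject, predicate, object) triplets, and `max_triplets_per_chunk=2` limits how many are extracted from each text chunk. The toy sketch below only illustrates that data shape and mimics the effect of the per-chunk cap. The triplets are hand-written for illustration, `cap_triplets` and `build_adjacency` are hypothetical helpers, and the real extraction is performed by the LLM inside `KnowledgeGraphIndex`.

```python
from collections import defaultdict


def cap_triplets(triplets_by_chunk, max_per_chunk):
    """Keep at most max_per_chunk triplets for every chunk id."""
    return {
        chunk: triplets[:max_per_chunk]
        for chunk, triplets in triplets_by_chunk.items()
    }


def build_adjacency(triplets_by_chunk):
    """Group (predicate, object) pairs under each subject, like a tiny graph store."""
    adjacency = defaultdict(list)
    for triplets in triplets_by_chunk.values():
        for subject, predicate, obj in triplets:
            adjacency[subject].append((predicate, obj))
    return dict(adjacency)


# Hand-written triplets for illustration only.
raw = {
    "chunk-0": [
        ("Chennai", "is capital of", "Tamil Nadu"),
        ("Chennai", "is located on", "Coromandel Coast"),
        ("Chennai", "was formerly called", "Madras"),
    ],
    "chunk-1": [("Mumbai", "is capital of", "Maharashtra")],
}

capped = cap_triplets(raw, max_per_chunk=2)
adjacency = build_adjacency(capped)
print(adjacency)
```

With a cap of 2, the third Chennai triplet is dropped, which hints at why a low cap keeps indexing cheap but can leave the graph sparse.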

<h3>Querying with Graph Query Engine</h3>

In [8]:
graph_query_engine = index.as_query_engine(
    include_text=False, response_mode="tree_summarize"
)
response = graph_query_engine.query(
    "What is History of Chennai?",
)
print(response)

Chennai has historical connections with Mumbai, including various aspects such as economic ties, cultural exchanges, and administrative relationships. The city has been a hub for diverse populations, literacy rates, art deco buildings, and notable landmarks. Additionally, Chennai has had significant interactions with Mumbai in terms of governance, transportation, and economic activities.


In [9]:


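# NOTE: ServiceContext is deprecated in recent LlamaIndex releases in favor of
# the global Settings object used earlier in this notebook.
llm_gpt3 = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context = ServiceContext.from_defaults(llm=llm_gpt3, chunk_size=1024)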

# Build city document index
vector_indices = {}
for wiki_title in wiki_titles:
    storage_context = StorageContext.from_defaults()
    # build vector index
    vector_indices[wiki_title] = VectorStoreIndex.from_documents(
        city_docs[wiki_title],
        service_context=service_context,
        storage_context=storage_context,
    )
    # set id for vector index
    vector_indices[wiki_title].index_struct.index_id = wiki_title
    # persist to disk
    storage_context.persist(persist_dir=f"./storage/{wiki_title}")


response = (
    vector_indices["Hyderabad"]
    .as_query_engine()
    .query("What are the Attractions in Hyderabad?")
)
print(str(response))


Attractions in Hyderabad include the Charminar, Golconda Fort, Qutb Shahi tombs, Chowmahalla Palace, Falaknuma Palace, Purani Haveli, King Kothi Palace, Bella Vista Palace, Paigah Palace, Asman Garh Palace, Basheer Bagh Palace, Errum Manzil, Spanish Mosque, Salar Jung Museum, Telangana State Archaeology Museum, Nizam Museum, City Museum, and Birla Science Museum.


<h3>Defining Index Summaries</h3>

In [10]:
index_summaries = {}
for wiki_title in wiki_titles:
    # set summary text for city
    index_summaries[wiki_title] = (
        f"This content contains Wikipedia articles about {wiki_title}. "
        f"Use this index if you need to look up specific facts about {wiki_title}.\n"
        "Do not use this index if you want to analyze multiple cities."
    )

In [11]:
index_summaries

{'Hyderabad': 'This content contains Wikipedia articles about Hyderabad. Use this index if you need to look up specific facts about Hyderabad.\nDo not use this index if you want to analyze multiple cities.',
 'Chennai': 'This content contains Wikipedia articles about Chennai. Use this index if you need to look up specific facts about Chennai.\nDo not use this index if you want to analyze multiple cities.',
 'Mumbai': 'This content contains Wikipedia articles about Mumbai. Use this index if you need to look up specific facts about Mumbai.\nDo not use this index if you want to analyze multiple cities.',
 'Delhi': 'This content contains Wikipedia articles about Delhi. Use this index if you need to look up specific facts about Delhi.\nDo not use this index if you want to analyze multiple cities.'}

<h3>Generating Root Index from Composable Graph</h3>

In [12]:


graph = ComposableGraph.from_indices(
    GPTSimpleKeywordTableIndex,
    [index for _, index in vector_indices.items()],
    [summary for _, summary in index_summaries.items()],
    max_keywords_per_chunk=50,
)

# get root index
root_index = graph.get_index(graph.index_struct.index_id)
root_index.set_index_id("compare_contrast")

root_summary = (
    "This index contains Wikipedia articles about multiple cities. "
    "Use this index if you want to compare multiple cities. "
)

<h3>Creating Router Query Engine</h3>

In [13]:
from llama_index.core.tools.query_engine import QueryEngineTool

query_engine_tools = []

# add one tool per city vector index
for wiki_title in wiki_titles:
    # use a distinct name so the knowledge graph `index` is not shadowed
    city_index = vector_indices[wiki_title]
    summary = index_summaries[wiki_title]

    query_engine = city_index.as_query_engine(service_context=service_context)
    vector_tool = QueryEngineTool.from_defaults(
        query_engine, description=summary
    )
    query_engine_tools.append(vector_tool)


# add the knowledge graph query engine as the tool for multi-city questions
graph_description = (
    "This tool contains Wikipedia articles about multiple cities. "
    "Use this tool if you want to compare multiple cities. "
)
graph_tool = QueryEngineTool.from_defaults(
    graph_query_engine, description=graph_description
)
query_engine_tools.append(graph_tool)

In [14]:


router_query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context),
    query_engine_tools=query_engine_tools,
)
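
A router query engine passes the query and each tool's description to a selector, which picks one tool to answer. `LLMSingleSelector` makes that choice with an LLM. The sketch below swaps in a simple word-overlap score so that it runs offline. `Tool`, `tokenize`, `select_tool`, and the scoring rule are assumptions made for illustration, not the library's implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class Tool:
    name: str
    description: str


def tokenize(text):
    """Lowercase a string and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_tool(query, tools):
    """Pick the tool whose description shares the most words with the query.

    Ties go to the earliest tool in the list.
    """
    query_words = tokenize(query)
    return max(tools, key=lambda tool: len(query_words & tokenize(tool.description)))


tools = [
    Tool("chennai", "Use this index to look up specific facts about Chennai."),
    Tool("mumbai", "Use this index to look up specific facts about Mumbai."),
    Tool("graph", "Use this tool if you want to compare multiple cities."),
]

print(select_tool("List some places in Chennai", tools).name)
print(select_tool("Compare the cities Chennai and Mumbai", tools).name)
```

The toy selector shows why the tool descriptions matter: the wording of each summary is the only signal the selector has, so descriptions that clearly separate single-city lookups from comparisons make routing more reliable.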

<h3>Querying with Router Query Engine</h3>

In [15]:
# ask a compare/contrast question
response = router_query_engine.query(
    "Compare and contrast the cities Chennai and Mumbai.",
)
print(str(response))

Chennai and Mumbai are both major cities in India with significant characteristics. Chennai is the capital of Tamil Nadu and is known for its diverse population, while Mumbai is the capital city of Maharashtra and serves as a major information technology hub. Chennai is governed by the Greater Chennai Corporation, whereas Mumbai is home to multiple chess grandmasters. Chennai has a tropical wet and dry climate, while Mumbai experiences a high internet usage rate. Both cities have extensive road networks and are connected by various relationships, showcasing their importance in different aspects such as governance, culture, and infrastructure.


In [17]:
response = router_query_engine.query("List some of Places in Chennai.")
print(str(response))

Tholkappia Poonga, Semmoli Poonga, Madras Crocodile Bank, Arignar Anna Zoological Park, Guindy National Park, Marina Beach, Elliot's Beach, M.A. Chidambaram Stadium, Chemplast Cricket Ground, Jawaharlal Nehru Stadium, Mayor Radhakrishnan Stadium, Velachery Aquatic Complex, SDAT Tennis Stadium, Madras Boat Club, Royal Madras Yacht Club, Guindy Race Course, Cosmopolitan Club, Gymkhana Club, Madras Motor Race Track.
