# **Query**
---

To query the index, I will use **DRIFT search**, a method developed by Microsoft Research. DRIFT (Dynamic Reasoning and Inference with Flexible Traversal) builds on Microsoft’s GraphRAG technique and combines characteristics of global and local search to generate detailed responses while balancing computational cost against answer quality.
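To make the idea concrete, here is a toy sketch of the control flow as I understand it: a broad first pass proposes follow-up questions, and a few rounds of narrower searches answer the highest-ranked ones and queue further follow-ups. This is not GraphRAG's implementation. `Action`, `primer`, `local_search`, and the scoring are hypothetical stand-ins, and the parameter names only echo the idea of a follow-up count and a depth limit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """One follow-up question plus whatever answer a search produced for it."""
    query: str
    score: float = 0.0
    answer: Optional[str] = None
    follow_ups: List["Action"] = field(default_factory=list)


def primer(query: str) -> List[Action]:
    # Hypothetical stand-in for the broad (global-style) step:
    # it proposes ranked follow-up questions for the user query.
    return [Action(f"{query} / aspect {i}", score=1.0 / (i + 1)) for i in range(5)]


def local_search(action: Action) -> None:
    # Hypothetical stand-in for a narrow (local-style) search:
    # it answers one question and suggests one deeper follow-up.
    action.answer = f"answer to: {action.query}"
    action.follow_ups = [Action(f"{action.query} / detail", score=action.score / 2)]


def drift_like_search(query: str, k_followups: int = 3, n_depth: int = 3) -> List[Action]:
    """Run at most n_depth rounds, answering the top k_followups actions per round."""
    pending = primer(query)
    answered: List[Action] = []
    for _ in range(n_depth):
        if not pending:
            break
        pending.sort(key=lambda a: a.score, reverse=True)
        batch, pending = pending[:k_followups], pending[k_followups:]
        for action in batch:
            local_search(action)
            answered.append(action)
            pending.extend(action.follow_ups)
    return answered
```

In this toy version the number of narrow searches is capped at `k_followups * n_depth`, which is the kind of cost/quality trade-off the method is built around.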

In [2]:
import os 
from pathlib import Path

import pandas as pd 

from graphrag.config.enums import ModelType
from graphrag.config.models.drift_search_config import DRIFTSearchConfig
from graphrag.config.models.language_model_config import LanguageModelConfig
from graphrag.config.models.vector_store_schema_config import VectorStoreSchemaConfig
from graphrag.language_model.manager import ModelManager
from graphrag.query.indexer_adapters import (
    read_indexer_entities,
    read_indexer_relationships,
    read_indexer_report_embeddings,
    read_indexer_reports,
    read_indexer_text_units
)
from graphrag.query.structured_search.drift_search.drift_context import DRIFTSearchContextBuilder
from graphrag.query.structured_search.drift_search.search import DRIFTSearch
from graphrag.tokenizer.get_tokenizer import get_tokenizer
from graphrag.vector_stores.lancedb import LanceDBVectorStore

In [4]:
INPUT_DIR = r"rag-system\output"
LANCEDB_PATH = r"rag-system\output\lancedb"
COMMUNITY_REPORT_TABLE = "community_reports"
COMMUNITY_TABLE = "communities"
ENTITY_TABLE = "entities"
RELATIONSHIP_TABLE = "relationships"
COVARIATE_TABLE = "covariates"
TEXT_UNIT_TABLE = "text_units"
COMMUNITY_LEVEL = 2
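The paths above use Windows-style backslashes, while later cells join them with `/`. A minimal sketch of a more portable alternative, assuming the same `rag-system/output` layout, is to build the paths with `pathlib`. The helper name `table_path` is my own and is not part of GraphRAG.

```python
from pathlib import Path


def table_path(input_dir: Path, table: str) -> Path:
    """Return the parquet file path for one output table."""
    return Path(input_dir) / f"{table}.parquet"


# Building the directory from parts avoids hard-coding "\" or "/".
output_dir = Path("rag-system") / "output"
entity_path = table_path(output_dir, "entities")
lancedb_path = output_dir / "lancedb"
```

`pd.read_parquet` accepts a `Path` directly; where an API expects a plain string, `str(lancedb_path)` converts it.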

In [5]:
# read the entities and communities tables (the entities table carries the degree column)
entity_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_TABLE}.parquet")
community_df = pd.read_parquet(f"{INPUT_DIR}/{COMMUNITY_TABLE}.parquet")

print (f"Entity df columns: {entity_df.columns}")

Entity df columns: Index(['id', 'human_readable_id', 'title', 'type', 'description',
       'text_unit_ids', 'frequency', 'degree', 'x', 'y'],
      dtype='object')
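Since the later cells depend on this schema, a quick sanity check can catch a mismatched index early. The sketch below is an optional addition: the helper `missing_columns` is hypothetical, and the expected list is simply copied from the printed output above.

```python
from typing import Iterable, List

# Column names copied from the printed output above.
EXPECTED_ENTITY_COLUMNS = [
    "id", "human_readable_id", "title", "type", "description",
    "text_unit_ids", "frequency", "degree", "x", "y",
]


def missing_columns(actual: Iterable[str], expected: Iterable[str]) -> List[str]:
    """Return the expected column names that are absent, in their original order."""
    actual_set = set(actual)
    return [name for name in expected if name not in actual_set]
```

In the notebook this would be called as `missing_columns(entity_df.columns, EXPECTED_ENTITY_COLUMNS)`; an empty list means every expected column is present.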


In [6]:
entities = read_indexer_entities(entity_df, community_df, COMMUNITY_LEVEL)

# connect to the LanceDB store that holds the entity description embeddings
description_embedding_store = LanceDBVectorStore(
    vector_store_schema_config=VectorStoreSchemaConfig(
        index_name = "default-entity-description"
    ),
)

description_embedding_store.connect(db_uri = LANCEDB_PATH)

In [7]:
full_content_embedding_store = LanceDBVectorStore(
    vector_store_schema_config=VectorStoreSchemaConfig(
        index_name = "default-community-full_content"
    ),
)

full_content_embedding_store.connect(db_uri = LANCEDB_PATH)

In [8]:
print (f"Number of Entities loaded: {len(entity_df)}")
entity_df.head()

Number of Entities loaded: 51


| | id | human_readable_id | title | type | description | text_unit_ids | frequency | degree | x | y |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4cd3d037-9d90-4f99-92f2-43a7c22a0265 | 0 | JAMES NAISMITH | PERSON | James Naismith is the inventor of basketball, ... | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... | 1 | 4 | 0.0 | 0.0 |
| 1 | a1ece026-d01b-4eb7-b541-034bfa6697c3 | 1 | BASKETBALL HALL OF FAME | ORGANIZATION | The Naismith Memorial Basketball Hall of Fame,... | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... | 2 | 4 | 0.0 | 0.0 |
| 2 | dfea875d-67c2-49b5-9695-4fa77998ea6b | 2 | SPRINGFIELD | GEO | Springfield is a city located in Massachusetts... | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... | 2 | 1 | 0.0 | 0.0 |
| 3 | 1fcd6087-9231-4f40-b3aa-f7bb0c762e80 | 3 | NEW JERSEY | GEO | New Jersey is a state that participated in the... | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... | 1 | 1 | 0.0 | 0.0 |
| 4 | 5523cbe5-3adb-49bf-8710-797b073ec029 | 4 | NEW YORK | GEO | New York is a state that participated in the f... | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... | 1 | 1 | 0.0 | 0.0 |


In [9]:
relationship_df = pd.read_parquet(f"{INPUT_DIR}/{RELATIONSHIP_TABLE}.parquet")
relationships = read_indexer_relationships(relationship_df)

print (f"Number of Relationships loaded: {len(relationship_df)}")
relationship_df.head()

Number of Relationships loaded: 52


| | id | human_readable_id | source | target | description | weight | combined_degree | text_unit_ids |
|---|---|---|---|---|---|---|---|---|
| 0 | da61fbb9-edbb-44f8-9b27-0b8fb3a93e8e | 0 | JAMES NAISMITH | BASKETBALL HALL OF FAME | James Naismith's contributions to basketball a... | 9.0 | 8 | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... |
| 1 | 601506ec-3574-4b21-b939-5a71935a2c29 | 1 | JAMES NAISMITH | SPRINGFIELD | James Naismith invented basketball in Springfi... | 8.0 | 5 | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... |
| 2 | 2106225d-fd3a-4c80-b292-71ba13b39741 | 2 | JAMES NAISMITH | BASKETBALL | Basketball was invented by James Naismith, mak... | 1.0 | 13 | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... |
| 3 | 91d826fd-c015-4dc3-a895-4b6e1c7f315d | 3 | JAMES NAISMITH | FIRST BASKETBALL GAME | James Naismith was the inventor of the game th... | 9.0 | 5 | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... |
| 4 | 4c3f8764-8d88-49ab-a0c0-41acfca05d43 | 4 | NEW JERSEY | NEW YORK | New Jersey and New York played against each ot... | 7.0 | 2 | [0532a59ebcdea44655709f423ec25a4053ae4a0bed2ca... |


In [10]:
text_unit_df = pd.read_parquet(f"{INPUT_DIR}/{TEXT_UNIT_TABLE}.parquet")
text_units = read_indexer_text_units(text_unit_df)

print (f"Number of Text Units loaded: {len(text_unit_df)}")
text_unit_df.head()

Number of Text Units loaded: 3


| | id | human_readable_id | text | n_tokens | document_ids | entity_ids | relationship_ids | covariate_ids |
|---|---|---|---|---|---|---|---|---|
| 0 | 0532a59ebcdea44655709f423ec25a4053ae4a0bed2cab... | 0 | ## By Farah Farooqi T H E S T O R Y O F ... | 1200 | [601dbeb0fec14afcee160ab4fcb385de8bb1a217a9652... | [4cd3d037-9d90-4f99-92f2-43a7c22a0265, a1ece02... | [da61fbb9-edbb-44f8-9b27-0b8fb3a93e8e, 601506e... | [ff681478-7656-451c-88fb-ee1c2890de69, 6d1bd11... |
| 1 | e4c0006537c4f6ef6c0efaad2e13e2918d2d335f528ce6... | 1 | to many other countries. More people wanted t... | 1200 | [601dbeb0fec14afcee160ab4fcb385de8bb1a217a9652... | [a1ece026-d01b-4eb7-b541-034bfa6697c3, dfea875... | [05634eed-bcfa-425c-8079-8cd2b8e6c566, d8792ca... | [3697137b-a96c-43d9-a933-34534e2d52ad, 1d5ce67... |
| 2 | ac484bacf78ef3dc8937a7ed3c8f8cbca5d117ccf5c6c6... | 2 | , such as basketball , baseball , and tennis ... | 400 | [601dbeb0fec14afcee160ab4fcb385de8bb1a217a9652... | [7f579d20-be3e-4dfb-95cf-c83e1d8af6fe, 0881991... | [31202f8c-fa81-43f3-93e7-127b8ccaeadf, 458194a... | [abf9fa22-d2a6-409e-8fa7-3face3003629, 66cd625... |


In [14]:
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("GRAPHRAG_API_KEY")
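`os.getenv` returns `None` when the variable is missing, and that only surfaces later as an authentication error. A small fail-fast check, sketched below, makes the problem visible right away. `require_env` is a hypothetical helper of my own, not part of `dotenv` or GraphRAG.

```python
import os
from typing import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the named environment variable, or raise if it is missing or blank."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; check your .env file."
        )
    return value
```

In this notebook it would replace the plain lookup, for example `OPENAI_API_KEY = require_env("GRAPHRAG_API_KEY")` after `load_dotenv()`.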

In [15]:
chat_config = LanguageModelConfig(
    type = ModelType.Chat,
    api_key= OPENAI_API_KEY,
    model_provider = "openai",
    model = "gpt-4o-mini",
    max_retries=20
)

chat_model = ModelManager().get_or_create_chat_model(
    name = "local_search",
    model_type=ModelType.Chat,
    config=chat_config
)

tokenizer = get_tokenizer(chat_config)

embedding_config = LanguageModelConfig(
    type = ModelType.Embedding,
    api_key= OPENAI_API_KEY,
    model_provider = "openai",
    model = "text-embedding-3-small",
    max_retries=20
)

text_embedder = ModelManager().get_or_create_embedding_model(
    name = "local_search_embedding",
    model_type=ModelType.Embedding,
    config=embedding_config
)

In [16]:
def read_community_reports (
        input_dir: str,
        community_report_table: str = COMMUNITY_REPORT_TABLE
):
    input_path = f"{input_dir}/{community_report_table}.parquet"
    return pd.read_parquet(input_path)

report_df = read_community_reports(INPUT_DIR)
reports = read_indexer_reports(
    report_df,
    community_df,
    COMMUNITY_LEVEL,
    content_embedding_col="full_content_embedding"
)
read_indexer_report_embeddings(reports, full_content_embedding_store)

In [17]:
drift_parameters = DRIFTSearchConfig(
    reduce_temperature= 0,
    data_max_tokens = 12000,
    primer_folds=1,
    drift_k_followups=3,
    n_depth=3,
)

context_builder = DRIFTSearchContextBuilder(
    model=chat_model,
    text_embedder=text_embedder,
    entities=entities,
    relationships=relationships,
    reports=reports,
    entity_text_embeddings=description_embedding_store,
    text_units=text_units,
    tokenizer=tokenizer,
    config=drift_parameters
)

search = DRIFTSearch(
    model=chat_model,
    context_builder=context_builder,
    tokenizer=tokenizer,
)

In [18]:
response = await search.search (
    query= "Who is James A. Naismith?"
)

  0%|          | 0/3 [00:00<?, ?it/s]        Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
  0%|          | 0/3 [00:00<?, ?it/s]        Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
No answer found for query: What were the specific 13 original rules of basketball?
No follow-up actions found for response: {}
No follow-up actions for action: What were the specific 13 original rules of basketball?
  0%|          | 0/3 [00:00<?, ?it/s]Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
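The repeated "Reached token limit - reverting to previous context state" lines come from the context-building step. My reading, which is an assumption since I have not traced the library code, is that the builder keeps appending records to the prompt context until the next one would exceed the token budget, then falls back to the last state that fit. A toy sketch of that pattern, with a crude whitespace tokenizer standing in for the real one:

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def build_context(records: List[str], max_tokens: int) -> Tuple[str, bool]:
    """Greedily pack records into a context string without exceeding max_tokens.

    Returns the context and a flag that is True when packing stopped early
    because the next record would not fit.
    """
    context = ""
    for record in records:
        candidate = f"{context}\n{record}" if context else record
        if count_tokens(candidate) > max_tokens:
            # Keep the previous context state; the candidate is discarded.
            return context, True
        context = candidate
    return context, False
```

Under this reading the message reports that the context was truncated to fit the budget, not that the search failed.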
                                             

In [19]:
response.response

"## Overview of James A. Naismith\n\nJames A. Naismith is best known as the inventor of basketball, a sport he created in December 1891 while working as a physical education instructor at the International YMCA Training School in Springfield, Massachusetts. His invention aimed to provide a new indoor game to keep students active during the winter months. The original game utilized a soccer ball and two peach baskets, and it quickly evolved into the structured sport we recognize today.\n\n### Contributions to Basketball\n\nNaismith developed the original 13 rules of basketball, which laid the groundwork for the game's structure and gameplay. Key aspects of these rules included:\n\n1. **Objective**: Score points by throwing the ball into the opposing team's basket.\n2. **Teams**: Each game was played with two teams, typically consisting of nine players each.\n3. **Game Duration**: The game was played in two halves, totaling 30 minutes.\n4. **Ball Movement**: Players were required to pass

In [20]:
response = await search.search (
    query= "Explain about Naismith Memorial Basketball Hall of Fame!"
)

response.response

  0%|          | 0/3 [00:00<?, ?it/s]Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
  0%|          | 0/3 [00:00<?, ?it/s]        Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
  0%|          | 0/3 [00:00<?, ?it/s]        Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
Reached token limit - reverting to previous context state
                                             

"# Naismith Memorial Basketball Hall of Fame\n\nThe Naismith Memorial Basketball Hall of Fame, located in Springfield, Massachusetts, is a prestigious institution dedicated to honoring the contributions of players, coaches, referees, and other significant figures in the sport of basketball. Named after James A. Naismith, the inventor of basketball, the Hall of Fame serves multiple purposes, including celebrating basketball's rich history, educating the public, and engaging with the community.\n\n## Historical Significance\n\nEstablished in 1959, the Hall of Fame was created to formally recognize the achievements of basketball's key figures and preserve their legacies for future generations. The first class of inductees included legends such as George Mikan and Adolph Rupp, setting a standard for future recognitions [Data: Sources (1)].\n\n## Notable Features and Exhibits\n\nThe Hall of Fame features a variety of exhibits that celebrate the history and evolution of basketball, including