## Setup
The `settings.yaml` file configures the library, including which models it uses. In this case I've configured it to use gpt-4o.

Please refer to the [CLI docs](https://microsoft.github.io/graphrag/cli/#init) for more detailed information on how to generate the `settings.yaml` file.

#### Load `settings.yaml` configuration

In [3]:
import yaml
from graphrag.config.create_graphrag_config import create_graphrag_config

# Load the raw settings; the config object below is built from them.
with open("settings.yaml") as f:
    settings = yaml.safe_load(f)

config = create_graphrag_config(
    values=settings, root_dir="."
)

## Workflow and Document Processing
This step pre-processes the documents and builds (or populates) the graph that will back the LLM.

In [4]:

from graphrag.index.run.run_workflows import run_workflows
from graphrag.index.typing import PipelineRunResult
import graphrag.api as api

workflows = [
    "create_base_text_units",
    "create_final_documents",
    "extract_graph",
    "compute_communities",
    "create_final_entities",
    "create_final_relationships",
    "create_final_nodes",
    "create_final_communities",
    "create_final_text_units",
    "create_final_community_reports",
    "generate_text_embeddings",
]

In [None]:
# This step takes several minutes while the workflow steps execute.

outputs: list[PipelineRunResult] = []

async for output in run_workflows(
    workflows,
    config,
    cache=None,
    callbacks=[],
    logger=None,
    is_update_run=None,
):
    outputs.append(output)


  _edge_swap_numba = nb.jit(_edge_swap, nopython=False)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  input.loc[:, NODE_DETAILS] = input.loc[
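
The `async for` loop above runs at the top level because notebooks allow top-level `await`. In a plain Python script, the same pattern would have to live inside a coroutine. Here is a minimal sketch of that shape. It uses a hypothetical stand-in generator, `fake_run_workflows`, rather than the real `run_workflows`, so it only illustrates how results are collected and does not run any indexing.

```python
import asyncio
from collections.abc import AsyncIterator


async def fake_run_workflows(names: list[str]) -> AsyncIterator[str]:
    """Hypothetical stand-in for run_workflows: yields one result per workflow name."""
    for name in names:
        await asyncio.sleep(0)  # placeholder for the real, slow pipeline step
        yield f"{name}: done"


async def collect(names: list[str]) -> list[str]:
    """Drain the async generator into a list, as the notebook cell does."""
    results: list[str] = []
    async for item in fake_run_workflows(names):
        results.append(item)
    return results


if __name__ == "__main__":
    # Outside a notebook, asyncio.run provides the event loop.
    print(asyncio.run(collect(["create_base_text_units", "extract_graph"])))
```

Swapping the stand-in for the real call would presumably keep the same structure, with the arguments shown in the cell above.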


In [7]:

# outputs holds one PipelineRunResult per workflow in the indexing pipeline that was run
for workflow_result in outputs:
    print(workflow_result)
    status = f"error\n{workflow_result.errors}" if workflow_result.errors else "success"
    print(f"Workflow Name: {workflow_result.workflow}\tStatus: {status}")

PipelineRunResult(workflow='create_base_text_units', result=                                                   id  \
0   336671e337e5f4539069473e8f8691b3ed696331aabe67...   
1   2160a0c64179a7920c578f3400ad64f77c22927e6ab8c7...   
2   d798befe565a9ed5b6b536fd8a95a1d396867b232ec308...   
3   cc6a8a52ea673776c03f32442c2a05f75b59d30a0bf4c0...   
4   1c129c3dd67b1761adbdb4186b2de1036b2e4ff3683e4d...   
5   fdd19e6236e61193504953904d1221bc393a60fe728ffa...   
6   a998e6a1b2d1e74ba419f937061024540104d2716c1917...   
7   3292473b26f94c7aff219219ee64dcba1585532bac0857...   
8   25ae520bd79457caa7d277e1ba3731e3f498fc62f02935...   
9   d38581de899a32c16a744f6a867412cb91e528f9383372...   
10  b644ae78a58c60ff6b7a6b959c84a3f5f7d8b97123992d...   
11  f995cd9f704ad3fe03a64029c2dfa6beb97262269f6a4c...   
12  d1537788200767168593eb8e9d4f4c4b7006aba28fb1cc...   
13  5c5adb5118a758e4a0a70d2d702eb73cddadb7d95e2efa...   
14  3c6bd4bf5311e797262e6b100e39817183f99836bd9760...   
15  88479779a69573e42cf7992d
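
Printing every result in full is noisy, since each one embeds a DataFrame. One option is to collect only the failures. The sketch below uses a hypothetical stand-in class, `FakeResult`, that exposes only the two attributes the loop above reads (`workflow` and `errors`). It assumes errors are a list of strings, which may not match the real `PipelineRunResult`, and the sample error message is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing only the attributes used in the loop above."""

    workflow: str
    errors: list[str] = field(default_factory=list)


def failed_workflows(results: list[FakeResult]) -> dict[str, list[str]]:
    """Map each failed workflow name to its errors; an empty dict means all succeeded."""
    return {r.workflow: r.errors for r in results if r.errors}


# Illustrative data only: one clean run and one with a made-up error.
sample = [
    FakeResult("create_base_text_units"),
    FakeResult("extract_graph", errors=["illustrative error message"]),
]
print(failed_workflows(sample))
```

Because the function only reads `workflow` and `errors`, it should work unchanged on any object that carries those two attributes.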

## Query an index

To query an index, several index files must first be read into memory and passed to the query API. 
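
Before loading, it can help to confirm that the indexing run actually wrote the files the query step needs. The sketch below checks for the four parquet files read in the next cell. The helper name `missing_outputs` is hypothetical, and the `output` directory is assumed from the paths used in this notebook.

```python
from pathlib import Path

# File names taken from the loading cell in this notebook.
EXPECTED = [
    "create_final_nodes.parquet",
    "create_final_entities.parquet",
    "create_final_communities.parquet",
    "create_final_community_reports.parquet",
]


def missing_outputs(output_dir: str, names: list[str]) -> list[str]:
    """Return the expected file names that are not present in output_dir."""
    base = Path(output_dir)
    return [name for name in names if not (base / name).is_file()]


missing = missing_outputs("output", EXPECTED)
if missing:
    print("Missing index files:", missing)
else:
    print("All expected index files are present.")
```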

In [9]:
import pandas as pd

final_nodes = pd.read_parquet("output/create_final_nodes.parquet")
final_entities = pd.read_parquet(
    "output/create_final_entities.parquet"
)
final_communities = pd.read_parquet(
    "output/create_final_communities.parquet"
)
final_community_reports = pd.read_parquet(
    "output/create_final_community_reports.parquet"
)

response, context = await api.global_search(
    config=config,
    nodes=final_nodes,
    entities=final_entities,
    communities=final_communities,
    community_reports=final_community_reports,
    community_level=2,
    dynamic_community_selection=False,
    response_type="Multiple Paragraphs",
    query="Who is Scrooge and what are his main relationships?",
)

creating llm client with {'api_key': 'REDACTED,len=164', 'type': "openai_chat", 'encoding_model': 'cl100k_base', 'model': 'gpt-4o', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'n': 1, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'request_timeout': 180.0, 'api_base': None, 'api_version': None, 'organization': None, 'proxy': None, 'audience': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 50000, 'requests_per_minute': 1000, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25, 'responses': None}


The response object is the official response from graphrag, while the context object holds metadata about the querying process used to obtain that response.

In [10]:
print(response)

### Ebenezer Scrooge: Character Overview

Ebenezer Scrooge is initially portrayed as a miserly and cold-hearted individual, particularly during the Christmas season. His wealth and uncharitable demeanor lead to his isolation from others, as he is known for his lack of generosity and warmth [Data: Reports (9)].

### Transformation and Supernatural Encounters


### Key Relationships

- **Fred, Scrooge's Nephew**: Fred embodies the spirit of Christmas with his positive outlook and actions. He consistently invites Scrooge to join the Christmas celebrations, demonstrating kindness and a willingness to connect with his uncle despite Scrooge's initial coldness [Data: Reports (7)].

- **Bob Cratchit, Scrooge's Clerk**: Bob Cratchit works for Scrooge, who is initially depicted as a miserly employer. However, the story highlights a transformation in Scrooge's character, leading to a positive change in Bob's circumstances, including a raise and support for his family [Data: Reports (11)].

These 

Digging into the context a bit more gives users very granular information, such as which data sources (down to the level of text chunks) were ultimately retrieved and used as part of the context sent to the LLM.
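
The full context can be long, so a quick overview of how many records sit under each key is a useful first look. This is a minimal sketch. The helper name `summarize_context` is hypothetical, and the sample dictionary only mirrors the shape of the printed output in this notebook (lists under `claims`, `entities`, `relationships`, and `reports`). It assumes every value supports `len`.

```python
def summarize_context(context: dict) -> dict[str, int]:
    """Count the records stored under each key of the context dictionary."""
    return {key: len(value) for key, value in context.items()}


# Sample shaped like the printed context in this notebook, heavily shortened.
sample_context = {
    "claims": [],
    "entities": [],
    "relationships": [],
    "reports": [{"content": "# Ebenezer Scrooge and His Transformative Journey"}],
}
print(summarize_context(sample_context))
```

Calling `summarize_context(context)` on the real object would show at a glance which parts of the index contributed to the answer.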

In [11]:
from pprint import pprint

pprint(context)  # noqa: T203

{'claims': [],
 'entities': [],
 'relationships': [],
 'reports': [{'content': '# Ebenezer Scrooge and His Transformative Journey\n'
                         '\n'
                         'The community centers around Ebenezer Scrooge, a '
                         'once miserly and solitary figure, whose life is '
                         'profoundly transformed through supernatural '
                         'encounters. Key entities include his deceased '
                         'partner Marley, his clerk Bob Cratchit, and various '
                         'spirits that guide him through reflections on his '
                         'past, present, and potential future. The narrative '
                         'unfolds in London, with significant events occurring '
                         "on Christmas Eve, leading to Scrooge's redemption "
                         'and newfound generosity.\n'
                         '\n'
                         "## Scrooge's Initial Character a