# 2 - Querying (20 mins)

## What You Will Learn

In this exercise you will learn how to query data. The exercise uses a different dataset from the first exercise.

In the course of this exercise you will:
  
  - Inspect the press releases and pre-extracted chunks that will be used to build the new dataset
  - Build the new dataset
  - Learn about the GraphRAG Toolkit's multi-tenancy feature
  - Query the data using traditional vector similarity search
  - Query the data using the GraphRAG Toolkit
  - Visualise the results
  - Inspect the underlying results generated by the GraphRAG Toolkit
  - Learn about the role of entity network contexts in querying the data
  - Visualise the entity networks used when querying the data

## Query Process

There are two parts to querying: _retrieve_ and _generate_:

#### Retrieve

  1. Create an embedding for the user question
  2. This embedding is used to conduct a _top k similarity search_ for chunk ids
  3. The chunk ids form the _entry points_ into the graph for a set of graph traversals
  
#### Generate

  4. The graph traversal results are submitted together with the user question to an LLM, which generates a response
  
Some workloads will only want the structured results of the Retrieve operation (steps 1-3), while others will want the natural language response returned by the Generate operation (steps 1-3 plus step 4). The GraphRAG Toolkit supports both options.   
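The four steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the GraphRAG Toolkit's implementation: the three-dimensional vectors, the `CHUNK_EMBEDDINGS` and `CHUNK_STATEMENTS` dictionaries, and the `llm` callable are all made up for the example, and step 1 (embedding the question) is assumed to have happened already.

```python
import math

# Hypothetical toy data: chunk id -> embedding (real embeddings are much longer)
CHUNK_EMBEDDINGS = {
    "c1": [0.9, 0.1, 0.0],
    "c2": [0.7, 0.3, 0.1],
    "c3": [0.0, 0.2, 0.9],
}

# Hypothetical stand-in for the graph: chunk id -> statements reachable from it
CHUNK_STATEMENTS = {
    "c1": ["Example Corp sells Widgets"],
    "c2": ["UK demand for Widgets is high"],
    "c3": ["The Turquoise Canal is blocked"],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunk_ids(query_embedding, k=2):
    """Step 2: rank chunk ids by similarity to the query and keep the top k."""
    ranked = sorted(
        CHUNK_EMBEDDINGS,
        key=lambda cid: cosine(query_embedding, CHUNK_EMBEDDINGS[cid]),
        reverse=True,
    )
    return ranked[:k]


def retrieve(query_embedding, k=2):
    """Steps 2-3: use the top-k chunk ids as entry points and collect statements."""
    results = []
    for chunk_id in top_k_chunk_ids(query_embedding, k):
        results.extend(CHUNK_STATEMENTS[chunk_id])
    return results


def generate(question, results, llm):
    """Step 4: submit the results and the question to an LLM (any callable here)."""
    prompt = "Context:\n" + "\n".join(results) + f"\n\nQuestion: {question}"
    return llm(prompt)
```

A workload that only needs structured results would stop at `retrieve(...)`, while one that needs a natural language answer would pass those results on to `generate(...)`.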


![Querying](./images/query.png)

***

## Build a New Dataset

First, you're going to load another dataset. This dataset has already been pre-extracted from a set of press releases: all you need do is build the graph and vector stores from the pre-extracted chunks.

### 🔍 2.1 Inspect the press releases
  
Take a moment to inspect one of the press releases. Pick one from the list below:

In [None]:
%pycat source-data/ecorp-md/Revolutionizing Personal Computing.md

In [None]:
%pycat source-data/ecorp-md/Countdown to Christmas.md

In [None]:
%pycat source-data/ecorp-md/AnyCompany Logistics Slashes Shipping Times to UK with Turquoise Canal Shortcut.md

In [None]:
%pycat source-data/ecorp-md/Turquoise Canal Blocked by Landslides.md

The press releases tell a story. In summary:

  - Example Corp sells Widgets
  - There is a huge Christmas demand for Widgets in the UK
  - Example Corp has partnered with AnyCompany Logistics
  - AnyCompany Logistics is cutting shipping times by using the Turquoise Canal
  - The Turquoise Canal is blocked, causing delays
  
![Corpus](./images/corpus.png)

### 🔍 2.2 Inspect the pre-extracted chunks
  
Also take a look at one of the pre-extracted chunks:

In [None]:
%pycat source-data/ecorp/aws::69d4badc:968c/aws::69d4badc:968c:7281c2ee.json

This pre-extracted chunk includes a vector embedding. Normally, embeddings would be calculated _during_ the Build stage, but for the purposes of this workshop, to avoid frequent calls to Bedrock, we've precalculated the embeddings.
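Because an embedding is a long list of numbers, it tends to swamp the rest of the chunk when printed. The helper below is a small sketch that shortens long lists before display. The `chunk` dictionary is a hypothetical stand-in; the field names in the real chunk files may differ.

```python
import json


def summarise(value, max_items=3):
    """Return a copy of a JSON value with long lists shortened for display."""
    if isinstance(value, dict):
        return {key: summarise(item, max_items) for key, item in value.items()}
    if isinstance(value, list):
        shortened = [summarise(item, max_items) for item in value[:max_items]]
        if len(value) > max_items:
            # Replace the tail of the list with a note saying how much was cut
            shortened.append(f"... {len(value) - max_items} more")
        return shortened
    return value


# Hypothetical chunk: these field names are assumptions, not the real schema
chunk = {
    "id": "chunk-1",
    "text": "Example Corp sells Widgets.",
    "embedding": [0.01] * 1024,
}

print(json.dumps(summarise(chunk), indent=2))
```

To apply this to a real chunk, load the file with `json.load()` and pass the resulting dictionary to `summarise()`.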

### 🎯 2.3 Build the press release lexical graph

Build the dataset using the cell below:

In [None]:
%reload_ext dotenv
%dotenv

import os

from graphrag_toolkit.lexical_graph import LexicalGraphIndex
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory
from graphrag_toolkit.lexical_graph.indexing.load import FileBasedDocs

# Read the pre-extracted chunks from the source-data/ecorp directory
docs = FileBasedDocs(
    docs_directory='source-data',
    collection_id='ecorp'
)

with (
    GraphStoreFactory.for_graph_store(os.environ['GRAPH_STORE']) as graph_store,
    VectorStoreFactory.for_vector_store(os.environ['VECTOR_STORE'], index_names=['chunk']) as vector_store
):
    graph_index = LexicalGraphIndex(
        graph_store, 
        vector_store,
        tenant_id='ecorp' # tenant id - loads the data into a tenant-specific lexical graph
    )

    graph_index.build(
        docs, 
        show_progress=True
    )

print('Build complete')

***

## Multi-Tenancy

Notice that in the code above, the `LexicalGraphIndex` was initialised with `tenant_id='ecorp'`. This loads the press release data into a _separate_ lexical graph (named `ecorp`).

Multi-tenancy is a feature in the GraphRAG Toolkit that allows hosting multiple separate lexical graphs within the same underlying graph and vector stores. This capability allows you to manage and query different sets of data within a shared infrastructure.

Multi-tenancy can be particularly useful in the following scenarios:

  - Creating separate lexical graphs for different collections of documents
  - Managing individual user data
  - Handling different domains within the same infrastructure
  - Running multiple dev and test workloads

If you later run the optional notebook, **03 - Agentic Use Cases**, you will see how this ability to host multiple graphs in the same database is used to create different domain-specific tools for use by an AI agent.
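The idea of tenant isolation can be illustrated with a toy in-memory store. This is a conceptual sketch only and says nothing about how the GraphRAG Toolkit implements multi-tenancy internally: the `ToyTenantStore` class, its methods, and the `'default'` tenant name are all invented for the example.

```python
class ToyTenantStore:
    """Toy store that keeps each tenant's chunks in its own namespace."""

    def __init__(self):
        self._chunks = {}  # tenant id -> list of chunk texts

    def add(self, text, tenant_id="default"):
        """Store a chunk under the given tenant."""
        self._chunks.setdefault(tenant_id, []).append(text)

    def search(self, keyword, tenant_id="default"):
        """Search only the requested tenant's chunks."""
        chunks = self._chunks.get(tenant_id, [])
        return [text for text in chunks if keyword.lower() in text.lower()]


store = ToyTenantStore()
store.add("Example Corp sells Widgets", tenant_id="ecorp")
store.add("Notes from the first exercise")

print(store.search("widgets", tenant_id="ecorp"))  # ['Example Corp sells Widgets']
print(store.search("widgets"))                     # []
```

Both tenants share one store, but a query only ever sees the data of the tenant it names. This is why the query cells later in this exercise pass `tenant_id='ecorp'`.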

***

## Query the Data


### 🎯 2.4 Query the data using vector RAG

Before you explore the graph-enabled search capabilities of the GraphRAG Toolkit, you'll query the press release data using pure vector search alone.

Imagine you are an analyst tasked with predicting the fortunes of Example Corp. What kinds of questions might you ask? How about: 

  - ***What are the sales prospects for Example Corp in the UK?***
  
The pure vector-based approach in the cell below uses similarity search to find relevant chunks that can help answer the question. Given your reading of the press releases (or the summaries above), what kind of answer might you expect vector search to produce? Run the cell below to find out.

<div class="alert alert-success">
This is not a full-blown vector RAG solution. Production RAG solutions typically incorporate multiple techniques, including vector search, semantic search, and reranking. The goal here is simply to show what vector search alone can achieve, and then, later, what we can add using the graph. As you'll see, vector search remains a powerful and important tool for building RAG solutions.
</div>

<div class="alert alert-danger">
⏳ <b style="color:black;">Wait</b>
    
The refresh interval for indexes in OpenSearch Serverless vector search collections is approximately <a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-overview.html#serverless-limitations" target="_blank">60 seconds</a>. This means that newly inserted vectors won't be visible for up to a minute, which can negatively impact search results.

If you run a query below and get a response that indicates that the search results are empty, wait a few seconds and then re-run the query.

This is not a limitation of the GraphRAG Toolkit; rather, it is a feature of the OpenSearch Serverless vector store. Other vector stores (e.g. Amazon Aurora Postgres, Amazon Neptune Analytics) have different characteristics.
</div>

In [None]:
%reload_ext dotenv
%dotenv
%run './misc/vector_query.py'

vector_response = vector_query(
    "What are the sales prospects for Example Corp in the UK?", 
    tenant_id='ecorp', # we need to query the 'ecorp' tenant index
    streaming=True
)

vector_response.print_response_stream()

### Commentary

The response is detailed, but it is not complete. It is overly _optimistic_.

![Vector Search](./images/vector-search.png)

To answer your analyst question fully and effectively, the system must retrieve not only information that is semantically similar to the company and location named in the question (Example Corp, and the UK), but also structurally relevant, potentially _semantically dissimilar_ information regarding Example Corp's supply chain dependencies and any recent events impacting this supply chain. 

Some of this additional relevant information is missing (specifically, information about the blockage in the Turquoise Canal), and as a result, the response lacks the nuance we need to make an informed decision about Example Corp's fortunes.

### 🎯 2.5 Inspect the chunks returned by vector search

To see the content that was used to generate the response, run the cell below:

In [None]:
for i, n in enumerate(vector_response.source_nodes):
    print(f'Chunk {i+1}:\n\n{n.text}\n\n------------------\n')

### 🎯 2.6 Query the data using graph-enhanced search

Run the cell below to use the GraphRAG Toolkit's graph-enhanced search:

In [None]:
%reload_ext dotenv
%dotenv

import os

from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

with (
    GraphStoreFactory.for_graph_store(os.environ['GRAPH_STORE']) as graph_store,
    VectorStoreFactory.for_vector_store(os.environ['VECTOR_STORE']) as vector_store
):

    query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
        graph_store, 
        vector_store,
        streaming=True,
        tenant_id='ecorp', # we need to query the 'ecorp' tenant index
        no_cache=True
    )

    response = query_engine.query("What are the sales prospects for Example Corp in the UK?")

response.print_response_stream()

### Commentary

The answer here is more nuanced. Importantly, it identifies that the blockage in the Turquoise Canal poses a potential problem.

![Graph Search](./images/graph-search.png)

### 🎯 2.7 Visualise the results

You can view the results by running the visualisation below:

In [None]:
NB_CLASSIC = True

from graphrag_toolkit.lexical_graph.visualisation import GraphNotebookVisualisation

v = GraphNotebookVisualisation(nb_classic=NB_CLASSIC)
v.display_results(response)

### 🎯 2.8 Inspect the search results

Besides visualising the results, you can also programmatically access the structured results generated by the query engine.

When you call the `query()` method, the engine retrieves a set of search results from the graph and vector stores, and then passes these search results to an LLM together with a prompt to generate a natural language response to your question.

#### Show the context passed to the LLM

To see the results passed in the context window to the LLM, run the following cell:

In [None]:
for n in response.source_nodes:
    print(n.text)

Notice how these results are structured: sets of statements, grouped by topic and source.

#### Show the underlying results

The results passed to the LLM contain only the information necessary to generate a natural language response. However, during the retrieval process, the GraphRAG Toolkit creates a far more detailed set of search results, with individually scored statements annotated with the names of the retriever strategies that found them.

To see this more detailed breakdown of the results, run the following cell:

In [None]:
import json
for n in response.source_nodes:
    print(json.dumps(n.metadata, indent=2))

***

## Entity Network Contexts

Why is the graph-enhanced search more effective at answering our question? As mentioned above, to answer the question effectively, the system must retrieve two types of content:

  - Content that is semantically similar to the company and location named in the question (Example Corp, and the UK)
  - Structurally relevant, potentially _dissimilar_ content (regarding, for example, Example Corp's supply chain dependencies and any recent events impacting this supply chain)
  
To access this additional, structurally relevant information, the GraphRAG Toolkit uses _entity networks_. Entity networks are one- or two-hop networks that surround important entities and keywords extracted from the question. These entity networks act as 'fingerprints' for content that is structurally relevant, but potentially _dissimilar_ to the question:

![Entity Network](./images/entity-network.png)
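To make the notion of a one- or two-hop network concrete, here is a minimal sketch that collects every entity within two hops of a starting entity using a breadth-first search. The `EDGES` list is a hypothetical toy graph loosely based on the press release story, and the function names are invented for illustration; the toolkit's own extraction and scoring of entity networks is more involved.

```python
from collections import deque

# Hypothetical toy entity graph (undirected edges), for illustration only
EDGES = [
    ("Example Corp", "Widgets"),
    ("Example Corp", "AnyCompany Logistics"),
    ("AnyCompany Logistics", "Turquoise Canal"),
    ("Turquoise Canal", "Landslides"),
    ("Widgets", "UK"),
]


def build_adjacency(edges):
    """Turn a list of undirected edges into an adjacency mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def entity_network(adjacency, start, max_hops=2):
    """Return {entity: hop distance} for all entities within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(adjacency.get(current, ())):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return distances


network = entity_network(build_adjacency(EDGES), "Example Corp")
print(network)
```

In this toy graph, the two-hop network around Example Corp reaches the Turquoise Canal, even though the canal shares no wording with a question about UK sales prospects.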

### How entity networks are used in querying

Entity network contexts are used in several places in the querying process:

#### Dissimilarity searches

Entity networks are used to seed similarity searches for potentially relevant content that is _semantically dissimilar_ to the question. The results of these (dis)similarity searches then form the starting points for graph traversals.

![Dissimilarity Search](./images/dissimilarity-search.png)

#### Enrich the prompt with additional context

Entity networks are also used to guide the LLM to focus on relevant search results. The textual representations of the entity networks are added as additional context to the prompt used to generate a response. 

![Guide LLM](./images/guide-llm.png)
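As a rough sketch of what enriching a prompt could look like, the functions below render an entity network as text and place it ahead of the search results. The prompt layout and both function names are assumptions made for this example; they are not the prompt template the GraphRAG Toolkit actually uses.

```python
def format_entity_network(root, related):
    """Render an entity and its related entities as an indented text outline."""
    lines = [root]
    lines.extend(f"  -> {name}" for name in related)
    return "\n".join(lines)


def build_prompt(question, search_results, entity_networks):
    """Assemble a prompt with entity context, search results, and the question."""
    # Hypothetical layout, for illustration only
    sections = [
        "Entity context:",
        "\n\n".join(entity_networks),
        "Search results:",
        "\n".join(f"- {result}" for result in search_results),
        f"Question: {question}",
    ]
    return "\n\n".join(sections)


context = format_entity_network(
    "Example Corp", ["AnyCompany Logistics", "Turquoise Canal"]
)
prompt = build_prompt(
    "What are the sales prospects for Example Corp in the UK?",
    ["The Turquoise Canal is blocked, causing delays"],
    [context],
)
print(prompt)
```

Placing the entity context before the search results gives the LLM a compact map of which entities matter before it reads the detailed statements.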
  

### 🎯 2.9 Visualise the entity networks used in the query

You can view the entity networks used in the query above by running the visualisation from the cell below:

In [None]:
v.display_entity_contexts(response)

## Next Exercise (Optional)

If you have time, go to <a href="../../../nbclassic/notebooks/graphrag-toolkit/3-Agentic-Use-Cases.ipynb"><b>Exercise 3 - Agentic Use Cases</b></a> to continue the workshop exercises.