# Traversal-Based Querying

## Setup

If you haven't already, install the toolkit and dependencies using the [Setup](./00-Setup.ipynb) notebook.
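
The query cells in this notebook read `GRAPH_STORE` and `VECTOR_STORE` from the environment, which `%dotenv` loads from your `.env` file. If either is missing, the cells fail with a `KeyError`. The following is a minimal sketch of a pre-flight check; `missing_settings` and `REQUIRED_VARS` are hypothetical names introduced here for illustration and are not part of the toolkit.

```python
import os

# Names read via os.environ[...] by the query cells in this notebook.
REQUIRED_VARS = ("GRAPH_STORE", "VECTOR_STORE")

def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check easy to try out.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_settings()
if missing:
    print(f"Missing settings: {', '.join(missing)}. Check your .env file.")
else:
    print("Graph store and vector store settings found.")
```

Run the check after `%dotenv` so that values from the `.env` file have already been loaded.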

### TraversalBasedRetriever

See [TraversalBasedRetriever](https://github.com/awslabs/graphrag-toolkit/blob/main/docs/lexical-graph/querying.md#traversalbasedretriever).

In [None]:
%reload_ext dotenv
%dotenv

import os

from graphrag_toolkit.lexical_graph import set_logging_config
from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory

set_logging_config('INFO')

with (
    GraphStoreFactory.for_graph_store(os.environ['GRAPH_STORE']) as graph_store,
    VectorStoreFactory.for_vector_store(os.environ['VECTOR_STORE']) as vector_store
):

    query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
        graph_store, 
        vector_store,
        streaming=True
    )

    response = query_engine.query("What are the differences between Neptune Database and Neptune Analytics?")

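# print_response_stream() writes the streamed answer to stdout and returns None,
# so call it on its own instead of interpolating its return value into the f-string.
response.print_response_stream()

print(f"""

retrieve_ms: {int(response.metadata['retrieve_ms'])}
answer_ms  : {int(response.metadata['answer_ms'])}
total_ms   : {int(response.metadata['total_ms'])}
""")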

#### Show the context passed to the LLM

In [None]:
for n in response.source_nodes:
    print(n.text)

#### Show the underlying results

In [None]:
import json
for n in response.source_nodes:
    print(json.dumps(n.metadata, indent=2))

#### Visualisation

In [None]:
from graphrag_toolkit.lexical_graph.visualisation import GraphNotebookVisualisation

v = GraphNotebookVisualisation(nb_classic=True)

`display_results()` shows the topics, statements, facts and entities used when generating the response. If `include_sources=True`, the visualisation will also show the sources and chunks. Note that this is not a visualisation of any of the queries used by the retrievers; rather, it is a visualisation of the _results_ produced by the retrievers.

In [None]:
v.display_results(response, include_sources=True)

`display_entity_contexts()` shows the network of entities used to generate starting points for retrievals and aid reranking of results.

In [None]:
v.display_entity_contexts(response)

`display_schema()` shows the underlying inferred schema for the entity relations in the lexical graph. By default, `display_schema()` shows the schema for the default tenant; you can show the schema for a different tenant by supplying a `tenant_id` string parameter.

In [None]:
v.display_schema()

#### Metadata filtering

In [None]:
%reload_ext dotenv
%dotenv

import os

from graphrag_toolkit.lexical_graph import set_logging_config
from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory
from graphrag_toolkit.lexical_graph.metadata import FilterConfig

from llama_index.core.vector_stores.types import FilterOperator, MetadataFilter

set_logging_config('INFO')

with (
    GraphStoreFactory.for_graph_store(os.environ['GRAPH_STORE']) as graph_store,
    VectorStoreFactory.for_vector_store(os.environ['VECTOR_STORE']) as vector_store
):

    query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
        graph_store, 
        vector_store,
        filter_config = FilterConfig(
            MetadataFilter(
                key='url',
                value='https://docs.aws.amazon.com/neptune/latest/userguide/intro.html',
                operator=FilterOperator.EQ
            )
        )
    )

    response = query_engine.query("What are the differences between Neptune Database and Neptune Analytics?")

print(f"""{response.response}

retrieve_ms: {int(response.metadata['retrieve_ms'])}
answer_ms  : {int(response.metadata['answer_ms'])}
total_ms   : {int(response.metadata['total_ms'])}
""")

In [None]:
for n in response.source_nodes:
    print(n.text)
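
The filter above uses `FilterOperator.EQ` on the `url` key, so only content whose metadata carries exactly that URL should contribute to the answer. As a rough illustration of the equality semantics only, and not of how the toolkit applies filters internally, the following sketch runs the same comparison over plain dictionaries. The `matches_eq` helper and the sample records are hypothetical.

```python
def matches_eq(metadata, key, value):
    """Return True when `metadata` has `key` set to exactly `value`."""
    return metadata.get(key) == value

# Hypothetical source metadata records, for illustration only.
sources = [
    {"url": "https://docs.aws.amazon.com/neptune/latest/userguide/intro.html"},
    {"url": "https://example.com/some-other-page.html"},
    {"title": "A record with no url key"},
]

target = "https://docs.aws.amazon.com/neptune/latest/userguide/intro.html"
kept = [m for m in sources if matches_eq(m, "url", target)]

for m in kept:
    print(m["url"])
```

Records that lack the key, or whose value differs in any way (for example, a trailing slash), do not match an equality comparison like this one.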

#### Set subretriever

In the example below, the `TraversalBasedRetriever` is configured with a `ChunkBasedSearch` subretriever. You can also try `EntityBasedSearch` or `EntityContextSearch` in its place.

In [None]:
%reload_ext dotenv
%dotenv

import os

from graphrag_toolkit.lexical_graph import LexicalGraphQueryEngine
from graphrag_toolkit.lexical_graph.storage import GraphStoreFactory
from graphrag_toolkit.lexical_graph.storage import VectorStoreFactory
from graphrag_toolkit.lexical_graph.retrieval.retrievers import ChunkBasedSearch
from graphrag_toolkit.lexical_graph.retrieval.retrievers import EntityBasedSearch
from graphrag_toolkit.lexical_graph.retrieval.retrievers import EntityContextSearch

with (
    GraphStoreFactory.for_graph_store(os.environ['GRAPH_STORE']) as graph_store,
    VectorStoreFactory.for_vector_store(os.environ['VECTOR_STORE']) as vector_store
):

    query_engine = LexicalGraphQueryEngine.for_traversal_based_search(
        graph_store, 
        vector_store,
        retrievers=[ChunkBasedSearch]
    )

    response = query_engine.query("What are the differences between Neptune Database and Neptune Analytics?")

print(f"""{response.response}

retrieve_ms: {int(response.metadata['retrieve_ms'])}
answer_ms  : {int(response.metadata['answer_ms'])}
total_ms   : {int(response.metadata['total_ms'])}
""")