# Visualizing the knowledge graph with `yfiles-jupyter-graphs`

This notebook is a partial copy of [local_search.ipynb](../../local_search.ipynb) that shows how to use `yfiles-jupyter-graphs` to add interactive graph visualizations of the parquet files and how to visualize the result context of `graphrag` queries (see the end of this notebook).

In [None]:
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License.

In [21]:
import os

import pandas as pd
import tiktoken

from graphrag.query.context_builder.entity_extraction import EntityVectorStoreKey
from graphrag.query.indexer_adapters import (
    read_indexer_entities,
    read_indexer_relationships,
    read_indexer_reports,
    read_indexer_text_units,
)
from graphrag.query.input.loaders.dfs import (
    store_entity_semantic_embeddings,
)
from graphrag.query.llm.oai.chat_openai import ChatOpenAI
from graphrag.query.llm.oai.embedding import OpenAIEmbedding
from graphrag.query.llm.oai.typing import OpenaiApiType
from graphrag.query.structured_search.local_search.mixed_context import (
    LocalSearchMixedContext,
)
from graphrag.query.structured_search.local_search.search import LocalSearch
from graphrag.vector_stores.lancedb import LanceDBVectorStore

from dotenv import load_dotenv
load_dotenv()

True

## Local Search Example

The local search method generates answers by combining relevant data from the AI-extracted knowledge graph with text chunks of the raw documents. This method is suitable for questions that require an understanding of specific entities mentioned in the documents (e.g., What are the healing properties of chamomile?).

### Load text units and graph data tables as context for local search

- In this test we first load indexing outputs from parquet files to dataframes, then convert these dataframes into collections of data objects aligning with the knowledge model.

### Load tables to dataframes

In [5]:
INPUT_DIR = "output/20240824-194226/artifacts"
LANCEDB_URI = f"{INPUT_DIR}/lancedb"

COMMUNITY_REPORT_TABLE = "create_final_community_reports"
ENTITY_TABLE = "create_final_nodes"
ENTITY_EMBEDDING_TABLE = "create_final_entities"
RELATIONSHIP_TABLE = "create_final_relationships"
COVARIATE_TABLE = "create_final_covariates"
TEXT_UNIT_TABLE = "create_final_text_units"
COMMUNITY_LEVEL = 2

#### Read entities

In [6]:
# read nodes table to get community and degree data
entity_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_TABLE}.parquet")
entity_embedding_df = pd.read_parquet(f"{INPUT_DIR}/{ENTITY_EMBEDDING_TABLE}.parquet")

#### Read relationships

In [7]:
relationship_df = pd.read_parquet(f"{INPUT_DIR}/{RELATIONSHIP_TABLE}.parquet")
relationships = read_indexer_relationships(relationship_df)

The input data requires an `id` attribute for each node and `start`/`end` properties for each relationship that correspond to the node ids. Additional attributes can be added in the `properties` of each node/relationship dict:

In [8]:
# %pip install yfiles_jupyter_graphs --quiet
from yfiles_jupyter_graphs import GraphWidget


# converts the entities dataframe to a list of dicts for yfiles-jupyter-graphs
def convert_entities_to_dicts(df):
    """Convert the entities dataframe to a list of dicts for yfiles-jupyter-graphs."""
    nodes_dict = {}
    for _, row in df.iterrows():
        # Create a dictionary for each row and collect unique nodes
        node_id = row["title"]
        if node_id not in nodes_dict:
            nodes_dict[node_id] = {
                "id": node_id,
                "properties": row.to_dict(),
            }
    return list(nodes_dict.values())


# converts the relationships dataframe to a list of dicts for yfiles-jupyter-graphs
def convert_relationships_to_dicts(df):
    """Convert the relationships dataframe to a list of dicts for yfiles-jupyter-graphs."""
    relationships = []
    for _, row in df.iterrows():
        # Create a dictionary for each row
        relationships.append({
            "start": row["source"],
            "end": row["target"],
            "properties": row.to_dict(),
        })
    return relationships


w = GraphWidget()
w.directed = True
w.nodes = convert_entities_to_dicts(entity_df)
w.edges = convert_relationships_to_dicts(relationship_df)
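The input requirements described above can be checked before the data is handed to the widget. The following is a minimal sketch, not part of `yfiles-jupyter-graphs`: the helper name `find_dangling_edges` and the sample data are hypothetical, and it only assumes the node/edge dict shape used in this notebook (`id` on nodes, `start`/`end` on edges).

```python
def find_dangling_edges(nodes, edges):
    """Return the edges whose start or end does not match any node id.

    Hypothetical helper for illustration; it is not part of yfiles-jupyter-graphs.
    """
    node_ids = {node["id"] for node in nodes}
    return [
        edge
        for edge in edges
        if edge["start"] not in node_ids or edge["end"] not in node_ids
    ]


# made-up sample data in the same shape as the converted dataframes
sample_nodes = [
    {"id": "SOKOBAN", "properties": {"title": "SOKOBAN"}},
    {"id": "CRATE", "properties": {"title": "CRATE"}},
]
sample_edges = [
    {"start": "SOKOBAN", "end": "CRATE", "properties": {"weight": 1.0}},
    {"start": "SOKOBAN", "end": "GOAL", "properties": {"weight": 1.0}},
]

dangling = find_dangling_edges(sample_nodes, sample_edges)
print([(edge["start"], edge["end"]) for edge in dangling])
```

With the notebook's data, `find_dangling_edges(w.nodes, w.edges)` would list relationships that point at entities missing from the nodes table.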

## Configure data-driven visualization

The additional properties can be used to configure the visualization for different use cases.

In [9]:
# show title on the node
w.node_label_mapping = "title"


# map community to a color
def community_to_color(community):
    """Map a community to a color."""
    colors = [
        "crimson",
        "darkorange",
        "indigo",
        "cornflowerblue",
        "cyan",
        "teal",
        "green",
    ]
    return (
        colors[int(community) % len(colors)] if community is not None else "lightgray"
    )


def edge_to_source_community(edge):
    """Get the community of the source node of an edge."""
    source_node = next(
        (entry for entry in w.nodes if entry["properties"]["title"] == edge["start"]),
        None,
    )
    # an edge may reference a node that is missing from the nodes list
    if source_node is None:
        return None
    return source_node["properties"]["community"]


w.node_color_mapping = lambda node: community_to_color(node["properties"]["community"])
w.edge_color_mapping = lambda edge: community_to_color(edge_to_source_community(edge))
# map size data to a reasonable factor
w.node_scale_factor_mapping = lambda node: 0.5 + node["properties"]["size"] * 1.5 / 20
# use weight for edge thickness
w.edge_thickness_factor_mapping = "weight"
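Community values read from parquet through pandas can be `NaN` rather than `None` when an entity has no community, in which case `int(community)` would raise. The following is a minimal sketch of a more defensive mapping, assuming the same palette as in the cell above; the name `safe_community_to_color` is hypothetical.

```python
import math

# same palette as in the notebook cell, repeated so the sketch is self-contained
PALETTE = [
    "crimson",
    "darkorange",
    "indigo",
    "cornflowerblue",
    "cyan",
    "teal",
    "green",
]


def safe_community_to_color(community, fallback="lightgray"):
    """Map a community id to a palette color, using a fallback for missing values."""
    if community is None:
        return fallback
    try:
        value = float(community)
    except (TypeError, ValueError):
        # not convertible to a number, e.g. an arbitrary string
        return fallback
    if math.isnan(value) or math.isinf(value):
        return fallback
    return PALETTE[int(value) % len(PALETTE)]


print(safe_community_to_color(0))
print(safe_community_to_color("9"))
print(safe_community_to_color(float("nan")))
```

If this variant is preferred, it could be used in place of `community_to_color` in the two color mappings.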

## Automatic layouts

The widget provides different automatic layouts that serve different purposes: `Circular`, `Hierarchic`, `Organic (interactive or static)`, `Orthogonal`, `Radial`, `Tree`, `Geo-spatial`.

For the knowledge graph, this sample uses the `Circular` layout, though `Hierarchic` or `Organic` are also suitable choices.

In [10]:
# Use the circular layout for this visualization. For larger graphs, the default organic layout is often preferable.
w.circular_layout()

## Display the graph

In [11]:
display(w)

GraphWidget(layout=Layout(height='800px', width='100%'))

# Visualizing the result context of `graphrag` queries

The result context of `graphrag` queries allows inspecting the context graph of the request. This data can similarly be visualized as a graph with `yfiles-jupyter-graphs`.

## Making the request

The following cell recreates the sample queries from [local_search.ipynb](../../local_search.ipynb).

In [23]:
# setup (see also ../../local_search.ipynb)
entities = read_indexer_entities(entity_df, entity_embedding_df, COMMUNITY_LEVEL)

description_embedding_store = LanceDBVectorStore(
    collection_name="entity_description_embeddings",
)
description_embedding_store.connect(db_uri=LANCEDB_URI)
entity_description_embeddings = store_entity_semantic_embeddings(
    entities=entities, vectorstore=description_embedding_store
)

report_df = pd.read_parquet(f"{INPUT_DIR}/{COMMUNITY_REPORT_TABLE}.parquet")
reports = read_indexer_reports(report_df, entity_df, COMMUNITY_LEVEL)
text_unit_df = pd.read_parquet(f"{INPUT_DIR}/{TEXT_UNIT_TABLE}.parquet")
text_units = read_indexer_text_units(text_unit_df)

api_key = os.getenv("GRAPHRAG_API_KEY")
llm_model = os.getenv("GRAPHRAG_LLM_MODEL")
embedding_model = os.getenv("GRAPHRAG_EMBEDDING_MODEL")

llm = ChatOpenAI(
    api_key=api_key,
    model=llm_model,
    api_type=OpenaiApiType.OpenAI,  # OpenaiApiType.OpenAI or OpenaiApiType.AzureOpenAI
    max_retries=20,
)

token_encoder = tiktoken.get_encoding("cl100k_base")

text_embedder = OpenAIEmbedding(
    api_key=api_key,
    api_base=None,
    api_type=OpenaiApiType.OpenAI,
    model=embedding_model,
    deployment_name=embedding_model,
    max_retries=20,
)

context_builder = LocalSearchMixedContext(
    community_reports=reports,
    text_units=text_units,
    entities=entities,
    relationships=relationships,
    entity_text_embeddings=description_embedding_store,
    embedding_vectorstore_key=EntityVectorStoreKey.ID,  # if the vectorstore uses entity title as ids, set this to EntityVectorStoreKey.TITLE
    text_embedder=text_embedder,
    token_encoder=token_encoder,
)

local_context_params = {
    "text_unit_prop": 0.5,
    "community_prop": 0.1,
    "conversation_history_max_turns": 5,
    "conversation_history_user_turns_only": True,
    "top_k_mapped_entities": 10,
    "top_k_relationships": 10,
    "include_entity_rank": True,
    "include_relationship_weight": True,
    "include_community_rank": False,
    "return_candidate_context": False,
    "embedding_vectorstore_key": EntityVectorStoreKey.ID,  # set this to EntityVectorStoreKey.TITLE if the vectorstore uses entity title as ids
    "max_tokens": 12_000,  # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 5000)
}

llm_params = {
    "max_tokens": 2_000,  # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 1000-1500)
    "temperature": 0.0,
}

search_engine = LocalSearch(
    llm=llm,
    context_builder=context_builder,
    token_encoder=token_encoder,
    llm_params=llm_params,
    context_builder_params=local_context_params,
    response_type="multiple paragraphs",  # free form text describing the response type and format, can be anything, e.g. prioritized list, single paragraph, multiple paragraphs, multiple-page report
)
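The proportions in `local_context_params` divide the context window given by `max_tokens`. The sketch below illustrates the arithmetic under the assumption that the share not claimed by `text_unit_prop` and `community_prop` is left for the entity and relationship tables. The helper `split_token_budget` is hypothetical and not a `graphrag` function, so check the library source for the exact behavior of your version.

```python
def split_token_budget(max_tokens, text_unit_prop, community_prop):
    """Split a context token budget by proportion.

    Hypothetical helper; assumes the remaining share is used for
    entities and relationships.
    """
    if text_unit_prop + community_prop > 1:
        msg = "text_unit_prop and community_prop must not sum to more than 1"
        raise ValueError(msg)
    local_prop = 1 - community_prop - text_unit_prop
    return {
        "community": int(max_tokens * community_prop),
        "text_units": int(max_tokens * text_unit_prop),
        "entities_and_relationships": int(max_tokens * local_prop),
    }


# values taken from local_context_params above
budget = split_token_budget(12_000, text_unit_prop=0.5, community_prop=0.1)
print(budget)
```

Under this assumption, raising `text_unit_prop` gives more room to raw text chunks at the expense of the graph tables, which also shrinks the context graph available for visualization.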

## Run local search on sample queries

In [24]:
query = """
Domain:
(define (domain sokoban)
	(:requirements :strips)
	(:predicates (sokoban ?x)   								;sokoban is at location x
				 (crate ?x)     								;crate is at location x
				 (leftOf ?x ?y) 								;location x is to the left of locaiton y
				 (below ?x ?y)  								;location x is below location y
				 (at ?x ?y)     								;object x is at location y
				 (clear ?x))									;x is a location

	(:action moveLeft
		:parameters (?sokoban ?x ?y)
		:precondition (and (sokoban ?sokoban)
						   (at ?sokoban ?x)
						   (leftOf ?y ?x)   					;location y is to the left of location x
						   (clear ?y))      					;and y is empty/clear, so move left to y
		:effect (and (at ?sokoban ?y) (clear ?x)
				(not (at ?sokoban ?x)) (not (clear ?y))))

	(:action moveRight
		:parameters (?sokoban ?x ?y)
		:precondition (and (sokoban ?sokoban)
							(at ?sokoban ?x)
							(leftOf ?x ?y)    					;location x is to the left of y
							(clear ?y))       					;and y is clear, so move right to y
		:effect (and (at ?sokoban ?y) (clear ?x)
				(not (at ?sokoban ?x)) (not (clear ?y))))

	(:action moveUp
		:parameters (?sokoban ?x ?y)
		:precondition (and (sokoban ?sokoban)
						  (at ?sokoban ?x)
						  (below ?x ?y)      					;location x is below location y
						  (clear ?y))        					;and y is clear, so move up to y
		:effect (and (at ?sokoban ?y) (clear ?x)
				(not (at ?sokoban ?x)) (not (clear ?y))))

	(:action moveDown
		:parameters (?sokoban ?x ?y)
		:precondition (and (sokoban ?sokoban)
						  (at ?sokoban ?x)
						  (below ?y ?x)      					;location y is below location x
						  (clear ?y))        					;and y is clear, so move down to y
		:effect (and (at ?sokoban ?y) (clear ?x)
				(not (at ?sokoban ?x)) (not (clear ?y))))

	(:action pushLeft
		:parameters (?sokoban ?x ?y ?z ?crate)
		:precondition (and (sokoban ?sokoban)
							(crate ?crate)
							(leftOf ?y ?x)  					;location y is left of x
							(leftOf ?z ?y)    					;z (destination for block) is left of where the block currently is
							(at ?sokoban ?x)   					;sokoban player is at x
							(at ?crate ?y)     					;crate is at y							    					
							(clear ?z))        					;and location z is clear, so push crate left to z
		:effect (and (at ?sokoban ?y) (at ?crate ?z) 
				(clear ?x) 
				(not (at ?sokoban ?x)) 
				(not (at ?crate ?y)) 
				(not (clear ?z)) 
				(not (clear ?y))))
			   
	(:action pushRight
		:parameters (?sokoban ?x ?y ?z ?crate)
		:precondition (and (sokoban ?sokoban)
							(crate ?crate)
							(leftOf ?x ?y)						;x is left of y
							(leftOf ?y ?z)						;y is left of z
							(at ?sokoban ?x)					;sokoban is at x
							(at ?crate ?y)						;crate is at y
							(clear ?z))							;z is clear, so push crate right to z
		:effect (and (at ?sokoban ?y) (at ?crate ?z) 
				(clear ?x)
				(not (at ?sokoban ?x))
				(not (at ?crate ?y))
				(not (clear ?z))
				(not (clear ?y))))

	(:action pushUp
		:parameters (?sokoban ?x ?y ?z ?crate)
		:precondition (and (sokoban ?sokoban)
							(crate ?crate)
							(below ?x ?y)						;x is below y
							(below ?y ?z)						;y is below z
							(at ?sokoban ?x)					;sokoban is at x
							(at ?crate ?y)						;crate is at y
							(clear ?z))							;z is clear, so push crate up to z
		:effect (and (at ?sokoban ?y) (at ?crate ?z)
				(clear ?x)
				(not (at ?sokoban ?x))
				(not (at ?crate ?y))
				(not (clear ?y))
				(not (clear ?z))))

	(:action pushDown
		:parameters (?sokoban ?x ?y ?z ?crate)
		:precondition (and (sokoban ?sokoban)
							(crate ?crate)
							(below ?y ?x)						;y is below x
							(below ?z ?y)						;z is below y
							(at ?sokoban ?x)					;sokoban is at x
							(at ?crate ?y)						;crate is at y
							(clear ?z))							;z is clear, so push crate down to z
		:effect (and (at ?sokoban ?y) (at ?crate ?z)
				(clear ?x)
				(not (at ?sokoban ?x))
				(not (at ?crate ?y))
				(not (clear ?y))
				(not (clear ?z))))
)

Example problems:
(define (problem s1)
	(:domain sokoban)
	(:objects sokoban, crate2, l1, l2, l5, l6, l9, l10, l11, l12, l13, l14, l15, l16, l17, l18)
	(:init (sokoban sokoban) 
		   (crate crate2)

		   ;;horizontal relationships
		   (leftOf l1 l2) 
		   (leftOf l5 l6) 
		   (leftOf l9 l10) (leftOf l10 l11) (leftOf l11 l12) 
 		   (leftOf l13 l14) (leftOf l14 l15) (leftOf l15 l16)
 		   (leftOf l17 l18)

 		   ;;vertical relationships
 		   (below l5 l1) (below l6 l2)
 		   (below l9 l5) (below l10 l6)
 		   (below l13 l9) (below l14 l10) (below l15 l11) (below l16 l12)
 		   (below l17 l13) (below l18 l14)

 		   ;;initialize sokoban and crate
		   (at sokoban l10)
 		   (at crate2 l15) 

 		   ;;clear spaces
		   (clear l1) 
		   (clear l2) 
		   (clear l5) 
		   (clear l6) 
		   (clear l9)
		   (clear l11)
		   (clear l12) 
		   (clear l13) 
		   (clear l14)
		   (clear l16) 
		   (clear l17)   				
		   (clear l18))

	(:goal (and (at crate2 l2)))
)
```

```
(define (problem s2)
	(:domain sokoban)
	(:objects sokoban1, sokoban2, crate1, crate2, l1, l2, l5, l6, l9, l10, l11, l12, l13, l14, l15, l16, l17, l18)
	(:init (sokoban sokoban1) 
		   (sokoban sokoban2)
		   (crate crate1)	
		   (crate crate2)
		   
		   ;;horizontal relationships
		   (leftOf l1 l2) 
		   (leftOf l5 l6) 
		   (leftOf l9 l10) (leftOf l10 l11) (leftOf l11 l12) 
 		   (leftOf l13 l14) (leftOf l14 l15) (leftOf l15 l16)
 		   (leftOf l17 l18)

 		   ;;vertical relationships
 		   (below l5 l1) (below l6 l2)
 		   (below l9 l5) (below l10 l6)
 		   (below l13 l9) (below l14 l10) (below l15 l11) (below l16 l12)
 		   (below l17 l13) (below l18 l14)

 		   ;;initialize sokoban and crate
		   (at sokoban1 l10)
		   (at sokoban2 l16)
		   (at crate1 l9)
 		   (at crate2 l15) 

 		   ;;clear spaces
		   (clear l1) 
		   (clear l2) 
		   (clear l5) 
		   (clear l6) 
		   (clear l11)
		   (clear l12) 
		   (clear l13) 
		   (clear l14)
		   (clear l17)   				
		   (clear l18))

	(:goal (and (at crate1 l9) (at crate2 l2)))
)

There is a simple strategy for solving all problems in this domain without using search. Implement the strategy as a Python function.

The code should be of the form

def get_plan(objects, init, goal):
# Your code here
return plan

where
- `objects` is a set of objects (string names)
- `init` is a set of ground atoms represented as tuples of predicate
names and arguments (e.g., ('predicate-foo', 'object-bar', ...))
- `goal` is also a set of ground atoms represented in the same way
- `plan` is a list of actions, where each action is a ground operator
represented as a string (e.g., '(operator-baz object-qux ...)')

"""

In [26]:
result = await search_engine.asearch(query)
print(result.response)

To implement a simple strategy for solving Sokoban problems without using search, we can create a function that generates a plan based on the initial state and the goal state. The function will analyze the positions of the sokoban and crates, and determine the necessary actions to achieve the goal.

Here’s a Python function that follows this strategy:

```python
def get_plan(objects, init, goal):
    plan = []
    
    # Extract positions of sokoban and crates from the init state
    sokoban = next(obj for obj in objects if obj.startswith('sokoban'))
    crate_positions = {obj: None for obj in objects if obj.startswith('crate')}
    
    for atom in init:
        if atom[0] == 'at':
            if atom[1] == sokoban:
                sokoban_position = atom[2]
            elif atom[1] in crate_positions:
                crate_positions[atom[1]] = atom[2]
    
    # Define a function to check if a location is clear
    def is_clear(location):
        return ('clear', location) in init
  

In [31]:
result

SearchResult(response="To implement a simple strategy for solving Sokoban problems without using search, we can create a function that generates a plan based on the initial state and the goal state. The function will analyze the positions of the sokoban and crates, and determine the necessary actions to achieve the goal.\n\nHere’s a Python function that follows this strategy:\n\n```python\ndef get_plan(objects, init, goal):\n    plan = []\n    \n    # Extract positions of sokoban and crates from the init state\n    sokoban = next(obj for obj in objects if obj.startswith('sokoban'))\n    crate_positions = {obj: None for obj in objects if obj.startswith('crate')}\n    \n    for atom in init:\n        if atom[0] == 'at':\n            if atom[1] == sokoban:\n                sokoban_position = atom[2]\n            elif atom[1] in crate_positions:\n                crate_positions[atom[1]] = atom[2]\n    \n    # Define a function to check if a location is clear\n    def is_clear(location):\n 

### Result saved in exp/exp1.txt

## Inspecting the context data used to generate the response

In [27]:
result.context_data["entities"].head()

|   | id | entity | description | number of relationships | in_context |
|---|----|--------|-------------|-------------------------|------------|
| 0 | 0 | SOKOBAN | The "SOKOBAN" is an object that serves as the ... | 9 | True |
| 1 | 1 | CRATE | The "CRATE" is an object within the sokoban ga... | 5 | True |
| 2 | 38 | GOAL | The goal is a specific condition that must be ... | 2 | True |
| 3 | 35 | CRATE1 | The crate1 is an object that can be pushed by ... | 3 | True |
| 4 | 34 | SOKOBAN2 | The sokoban2 is an object representing the pla... | 11 | True |


In [28]:
result.context_data["relationships"].head()

|   | id | source | target | description | weight | rank | links | in_context |
|---|----|--------|--------|-------------|--------|------|-------|------------|
| 0 | 65 | SOKOBAN2 | CRATE2 | The sokoban2 is the character that interacts w... | 1.0 | 15 | 2 | True |
| 1 | 0 | SOKOBAN | CRATE | The sokoban is the character that interacts wi... | 1.0 | 14 | 1 | True |
| 2 | 5 | SOKOBAN | PUSHLEFT | The action 'pushLeft' involves the sokoban as ... | 1.0 | 14 | 4 | True |
| 3 | 6 | SOKOBAN | PUSHRIGHT | The action 'pushRight' involves the sokoban as... | 1.0 | 14 | 4 | True |
| 4 | 64 | SOKOBAN2 | CRATE1 | The sokoban2 is the character that interacts w... | 1.0 | 14 | 1 | True |
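One way to read these rows is to count how often each entity appears as an endpoint of a context relationship. The sketch below does this with the five `source`/`target` pairs shown above, hard-coded so that it runs on its own. With a live result, the pairs could instead be taken from the `source` and `target` columns of `result.context_data["relationships"]`. The helper name `count_endpoints` is hypothetical.

```python
from collections import Counter

# source/target pairs copied from the relationships preview above
context_edges = [
    ("SOKOBAN2", "CRATE2"),
    ("SOKOBAN", "CRATE"),
    ("SOKOBAN", "PUSHLEFT"),
    ("SOKOBAN", "PUSHRIGHT"),
    ("SOKOBAN2", "CRATE1"),
]


def count_endpoints(edges):
    """Count how many times each entity occurs as a source or target."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts


endpoint_counts = count_endpoints(context_edges)
for entity, count in endpoint_counts.most_common(3):
    print(entity, count)
```

These counts only cover the previewed rows, so they will generally be lower than the `number of relationships` column of the entities table.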


## Visualizing the result context as a graph

In [29]:
"""
Helper function to visualize the result context with `yfiles-jupyter-graphs`.

The dataframes are converted into supported nodes and relationships lists and then passed to yfiles-jupyter-graphs.
Additionally, some values are mapped to visualization properties.
"""


def show_graph(result):
    """Visualize the result context with yfiles-jupyter-graphs."""
    from yfiles_jupyter_graphs import GraphWidget

    if (
        "entities" not in result.context_data
        or "relationships" not in result.context_data
    ):
        msg = "The passed results do not contain 'entities' or 'relationships'"
        raise ValueError(msg)

    # converts the entities dataframe to a list of dicts for yfiles-jupyter-graphs
    def convert_entities_to_dicts(df):
        """Convert the entities dataframe to a list of dicts for yfiles-jupyter-graphs."""
        nodes_dict = {}
        for _, row in df.iterrows():
            # Create a dictionary for each row and collect unique nodes
            node_id = row["entity"]
            if node_id not in nodes_dict:
                nodes_dict[node_id] = {
                    "id": node_id,
                    "properties": row.to_dict(),
                }
        return list(nodes_dict.values())

    # converts the relationships dataframe to a list of dicts for yfiles-jupyter-graphs
    def convert_relationships_to_dicts(df):
        """Convert the relationships dataframe to a list of dicts for yfiles-jupyter-graphs."""
        relationships = []
        for _, row in df.iterrows():
            # Create a dictionary for each row
            relationships.append({
                "start": row["source"],
                "end": row["target"],
                "properties": row.to_dict(),
            })
        return relationships

    w = GraphWidget()
    # use the converted data to visualize the graph
    w.nodes = convert_entities_to_dicts(result.context_data["entities"])
    w.edges = convert_relationships_to_dicts(result.context_data["relationships"])
    w.directed = True
    # show title on the node
    w.node_label_mapping = "entity"
    # use weight for edge thickness
    w.edge_thickness_factor_mapping = "weight"
    display(w)


show_graph(result)

GraphWidget(layout=Layout(height='700px', width='100%'))