<a href="https://colab.research.google.com/github/wenqiglantz/hands-on-llamaindex/blob/main/05_llama_packs_neo4j.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Llama Pack - Neo4j Query Engine

This Llama Pack creates a Neo4j knowledge graph query engine and executes its `query` function. The pack can build several types of query engines for Neo4j knowledge graphs:

* Knowledge graph vector-based entity retrieval (default if no query engine type option is provided)
* Knowledge graph keyword-based entity retrieval
* Knowledge graph hybrid entity retrieval
* Raw vector index retrieval
* Custom combo query engine (vector similarity + KG entity retrieval)
* KnowledgeGraphQueryEngine
* KnowledgeGraphRAGRetriever

In this notebook, we load the Wikipedia page on the Paleolithic diet into a Neo4j knowledge graph (KG) and run queries against it.

In [1]:
%pip install -q llama-index-readers-wikipedia
%pip install -q llama-index-llms-openai
%pip install -q llama-index-packs-neo4j-query-engine

In [2]:
!pip install -q neo4j llama-index wikipedia

  Preparing metadata (setup.py) ... done
  Building wheel for wikipedia (setup.py) ... done


In [3]:
import logging
import os
import sys
from google.colab import userdata

# set OpenAI API key in environment variable
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))
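
The cell above only works inside Colab, because `google.colab.userdata` is not available elsewhere. As a minimal sketch for running the notebook in other environments, the helper below tries an existing environment variable first, then the Colab secret store, then an interactive prompt. The function name `resolve_openai_api_key` is our own invention and not part of any library.

```python
import os
from getpass import getpass


def resolve_openai_api_key() -> str:
    """Return an OpenAI API key from the first source that provides one."""
    # 1. An already-exported environment variable takes precedence.
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    # 2. Inside Colab, fall back to the notebook's secret store.
    try:
        from google.colab import userdata  # only importable inside Colab

        key = userdata.get("OPENAI_API_KEY")
        if key:
            return key
    except Exception:
        # Not running in Colab, or the secret is missing or not accessible.
        pass
    # 3. Otherwise, ask the user without echoing the key.
    return getpass("OpenAI API key: ")
```

To use it in place of the direct lookup, assign the result: `os.environ["OPENAI_API_KEY"] = resolve_openai_api_key()`.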

## Setup Data

Load the Wikipedia page on the Paleolithic diet.

In [4]:
from llama_index.readers.wikipedia import WikipediaReader

# Fetch the page by its exact title; auto_suggest=False disables Wikipedia's title guessing.
documents = WikipediaReader().load_data(pages=['Paleolithic diet'], auto_suggest=False)
print(f'Loaded {len(documents)} documents')

Loaded 1 documents


## Download and Initialize Pack

In [5]:
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
Neo4jQueryEnginePack = download_llama_pack(
  "Neo4jQueryEnginePack", "./neo4j_pack"
)

The next code cell assumes your Neo4j credentials are stored in `credentials.json` at the project root; it loads the JSON and extracts the connection details. `credentials.json` has the following format (replace the `#` placeholders with your actual password and URL):
```
{
    "username": "neo4j",
    "password": "##############",
    "url": "neo4j+s://#######.databases.neo4j.io",
    "database": "neo4j"
}
```

If you don't have a Neo4j database provisioned, you can create a free instance on Neo4j Aura through the Neo4j website.

In [7]:
import json

# get Neo4j credentials (assume it's stored in credentials.json)
with open('credentials.json') as f:
  neo4j_connection_params = json.load(f)
  username = neo4j_connection_params['username']
  password = neo4j_connection_params['password']
  url = neo4j_connection_params['url']
  database = neo4j_connection_params['database']
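
A missing field or an unreachable instance otherwise only surfaces later, when the pack is constructed. The sketch below is one way to fail early: it validates the four fields from `credentials.json` and, optionally, runs a trivial query through the official `neo4j` Python driver installed earlier. The helper names `validate_neo4j_credentials`, `load_neo4j_credentials`, and `check_neo4j_connection` are hypothetical and not part of the pack.

```python
import json

REQUIRED_KEYS = ("username", "password", "url", "database")


def validate_neo4j_credentials(params: dict) -> dict:
    """Return params unchanged, or raise if a required field is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not params.get(key)]
    if missing:
        raise ValueError(f"Missing Neo4j credential fields: {', '.join(missing)}")
    return params


def load_neo4j_credentials(path: str = "credentials.json") -> dict:
    """Read the JSON credentials file and validate its fields."""
    with open(path) as f:
        return validate_neo4j_credentials(json.load(f))


def check_neo4j_connection(params: dict) -> bool:
    """Return True if a trivial query round-trips; raises on connection failure."""
    # Imported lazily so the validation helpers work without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(
        params["url"], auth=(params["username"], params["password"])
    ) as driver:
        driver.verify_connectivity()  # raises if the instance is unreachable
        with driver.session(database=params["database"]) as session:
            return session.run("RETURN 1 AS ok").single()["ok"] == 1
```

With a real `credentials.json` in place, `check_neo4j_connection(load_neo4j_credentials())` should return `True` before you move on to building the pack.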

The next code cell shows how `Neo4jQueryEnginePack` is constructed. You can pass a `query_engine_type` from `Neo4jQueryEngineType`; this example uses the KG keyword query engine. If `query_engine_type` is not provided, the pack defaults to KG vector-based entity retrieval.

`Neo4jQueryEngineType` is an enum that holds the available query engine types, listed below. You can pass any of them when constructing `Neo4jQueryEnginePack`. Note that the enum as shown has no member for the default KG vector-based entity retrieval; that engine is selected by omitting `query_engine_type`.
```
class Neo4jQueryEngineType(str, Enum):
    """Neo4j query engine type"""

    KG_KEYWORD = "keyword"
    KG_HYBRID = "hybrid"
    RAW_VECTOR = "vector"
    RAW_VECTOR_KG_COMBO = "vector_kg"
    KG_QE = "KnowledgeGraphQueryEngine"
    KG_RAG_RETRIEVER = "KnowledgeGraphRAGRetriever"
```
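
Because the enum subclasses `str`, each member can be looked up by its string value, which is handy when the engine type comes from a config file or a command-line flag. The sketch below re-declares the enum locally so it runs on its own; in the notebook you would import it from the pack instead. `parse_engine_type` is a hypothetical helper, and returning `None` is our own convention for "leave `query_engine_type` unset".

```python
from enum import Enum


class Neo4jQueryEngineType(str, Enum):
    """Local copy of the enum shown above, so this sketch is self-contained."""

    KG_KEYWORD = "keyword"
    KG_HYBRID = "hybrid"
    RAW_VECTOR = "vector"
    RAW_VECTOR_KG_COMBO = "vector_kg"
    KG_QE = "KnowledgeGraphQueryEngine"
    KG_RAG_RETRIEVER = "KnowledgeGraphRAGRetriever"


def parse_engine_type(name):
    """Map a config string to an enum member; None means no explicit choice."""
    if name is None:
        return None
    try:
        return Neo4jQueryEngineType(name)  # lookup by value, e.g. "hybrid"
    except ValueError:
        valid = ", ".join(member.value for member in Neo4jQueryEngineType)
        raise ValueError(
            f"Unknown engine type {name!r}; expected one of: {valid}"
        ) from None
```

For example, `parse_engine_type("hybrid")` yields `Neo4jQueryEngineType.KG_HYBRID`, while a misspelled value raises an error that lists the valid strings.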

In [8]:
from llama_index.packs.neo4j_query_engine.base import Neo4jQueryEngineType

# create the pack
neo4j_pack = Neo4jQueryEnginePack(
  username = username,
  password = password,
  url = url,
  database = database,
  docs = documents,
  query_engine_type = Neo4jQueryEngineType.KG_KEYWORD
)

loaded nodes with 8 nodes


## Run Pack

In [9]:
from IPython.display import Markdown, display

response = neo4j_pack.run("Tell me about the benefits of paleo diet.")
display(Markdown(f"<b>{response}</b>"))

<b>The paleo diet is a popular eating plan that focuses on consuming foods that were available to our ancestors during the Paleolithic era. Advocates of the paleo diet claim that it offers several benefits. One of the main benefits is weight loss, as the diet encourages the consumption of whole, unprocessed foods that are low in calories and high in nutrients. Additionally, the paleo diet promotes the intake of lean proteins, which can help increase satiety and reduce cravings. It also emphasizes the consumption of fruits and vegetables, which are rich in vitamins, minerals, and antioxidants. Some people also report improved digestion, increased energy levels, and better blood sugar control when following the paleo diet. However, it is important to note that individual results may vary, and it is always recommended to consult with a healthcare professional before making any significant changes to your diet.</b>

Let's try the KG hybrid query engine next. You can try any other query engine the same way by replacing `query_engine_type` with another member of the `Neo4jQueryEngineType` enum.

In [10]:
neo4j_pack = Neo4jQueryEnginePack(
  username = username,
  password = password,
  url = url,
  database = database,
  docs = documents,
  query_engine_type = Neo4jQueryEngineType.KG_HYBRID
)

response = neo4j_pack.run("Tell me about the benefits of paleo diet.")
display(Markdown(f"<b>{response}</b>"))

loaded nodes with 8 nodes


<b>The paleo diet is believed to have some potential benefits for improving health. Advocates of the diet claim that following it may lead to improvements in body composition and metabolism compared to the typical Western diet or other recommended diets. Some evidence suggests that the paleo diet may help with weight loss, possibly due to increased satiety from the foods typically consumed. However, it is important to note that any weight loss achieved from the paleo diet is likely due to overall decreased caloric intake rather than any special feature of the diet itself. Additionally, the paleo diet encourages the consumption of whole, unprocessed foods, which aligns with mainstream advice about diet and can result in reduced intake of processed foods, sugar, and salt. It also shares similarities with traditional ethnic diets, such as the Mediterranean diet, which have been found to be more healthful than the Western diet. However, it is worth mentioning that following the paleo diet can lead to nutritional deficiencies, such as inadequate intake of calcium and vitamin D, and may increase the risk of ingesting toxins from high fish consumption. It is always recommended to consult with a healthcare professional or registered dietitian before making any significant changes to your diet.</b>
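
The two runs above repeat the same construct-then-query steps, so comparing more engine types can be folded into a loop. The sketch below is one way to do that, assuming only that a pack exposes `run(question)` as used in this notebook. `compare_engine_types` and its `build_pack` factory argument are hypothetical names of ours. Each pack construction printed `loaded nodes with 8 nodes` above, so building one pack per engine type likely repeats the loading work and the associated LLM calls.

```python
from typing import Any, Callable, Dict, Iterable


def compare_engine_types(
    build_pack: Callable[[Any], Any],
    engine_types: Iterable[Any],
    question: str,
) -> Dict[str, str]:
    """Run one question through a freshly built pack per engine type."""
    answers = {}
    for engine_type in engine_types:
        pack = build_pack(engine_type)  # one pack construction per type
        # Key by the enum's string value when given an enum member.
        label = str(getattr(engine_type, "value", engine_type))
        answers[label] = str(pack.run(question))
    return answers


# In this notebook (assumes the variables defined in earlier cells):
# answers = compare_engine_types(
#     lambda t: Neo4jQueryEnginePack(
#         username=username, password=password, url=url,
#         database=database, docs=documents, query_engine_type=t,
#     ),
#     [Neo4jQueryEngineType.KG_KEYWORD, Neo4jQueryEngineType.KG_HYBRID],
#     "Tell me about the benefits of paleo diet.",
# )
```

Passing the constructor through a factory keeps the loop independent of the connection details and makes it easy to try it against a stub before spending API calls.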

## Comparison of the Knowledge Graph Query Strategies

The table below summarizes the seven query strategies and their pros and cons, based on experiments with NebulaGraph and LlamaIndex described in the blog post [7 Query Strategies for Navigating Knowledge Graphs with LlamaIndex](https://betterprogramming.pub/7-query-strategies-for-navigating-knowledge-graphs-with-llamaindex-ed551863d416?sk=55c94ad72e75aa52ac6cc21d8145b37d).

![Knowledge Graph query strategies comparison](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*0UsLpj7v2GO67U-99YJBfg.png)