<img width="8%" alt="LlamaIndex.png" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/LlamaIndex.png" style="border-radius: 15%">

# LlamaIndex - Integrate with Neo4j
<a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=LlamaIndex+-+Integrate+with+Neo4j:+Error+short+description">Bug report</a>

**Tags:** #llamaindex #neo4j #integration #graphstore #request #response

**Author:** [Mahimai Raja J](http://github.com/mahimairaja)

**Last update:** 2023-10-22 (Created: 2023-10-08)

**Description:** This notebook demonstrates how to use LlamaIndex with Neo4j as a graph store to run a request and response (question answering) flow over custom data. It is useful for organizations that want to quickly integrate Neo4j with LlamaIndex.

**References:**
- [LlamaIndex Documentation](https://docs.llamaindex.ai/en/stable/examples/index_structs/knowledge_graph/Neo4jKGIndexDemo.html)
- [Neo4j Documentation](https://neo4j.com/docs/)
- [LlamaIndex Data Connectors](https://docs.llamaindex.ai/en/stable/core_modules/data_modules/connector/modules.html)
- [Neo4j Sandbox](https://sandbox.neo4j.com/)
- [AuraDS instance](https://console.neo4j.io/?product=aura-ds)
- [Paged Attention - VLLM](https://arxiv.org/abs/2309.06180)


## Steps to create a Neo4j instance

1. Create a Neo4j Sandbox instance [here](https://sandbox.neo4j.com/) or a Neo4j AuraDS instance [here](https://console.neo4j.io/?product=aura-ds).

2. Log in to the Neo4j instance and create a new graph.

3. Copy the `neo4j_url`, `neo4j_username`, `neo4j_password` and `neo4j_graph` values from the Neo4j instance and use them in the code below.
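
Before pasting the copied values into the notebook, it can help to sanity-check the connection URL. The sketch below is a minimal offline check that uses only the standard library. The helper name `check_neo4j_url` is hypothetical, and the default port of 7687 is assumed when the URL gives none. It only parses the string and never contacts the server.

```python
from typing import Tuple
from urllib.parse import urlparse

# URI schemes accepted by the official Neo4j drivers.
NEO4J_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


def check_neo4j_url(url: str) -> Tuple[str, int]:
    """Return (host, port) if the string looks like a Neo4j connection URL.

    Hypothetical helper for illustration: it parses the URL only and
    does not open a network connection.
    """
    parsed = urlparse(url)
    if parsed.scheme not in NEO4J_SCHEMES:
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("Missing host in Neo4j URL")
    # Assume the default Bolt port when the URL does not specify one.
    return parsed.hostname, parsed.port or 7687


print(check_neo4j_url("bolt://3.220.232.45:7687"))
```

A URL copied from the Neo4j Browser (for example one that starts with `http://`) is rejected here, which catches a common copy-paste mistake early.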

## Input

### Import libraries

In [None]:
try:
    import llama_index
except ModuleNotFoundError:
    !pip install -q llama-index
    import llama_index
try:
    import neo4j
except ModuleNotFoundError:
    !pip install -q neo4j
    import neo4j
try:
    import pypdf
except ModuleNotFoundError:
    !pip install -q pypdf
    import pypdf


import os
import logging
import sys

from llama_index import (
    KnowledgeGraphIndex,
    ServiceContext,
    SimpleDirectoryReader,
)
from llama_index.graph_stores import Neo4jGraphStore
from llama_index.llms import OpenAI
from llama_index.storage.storage_context import StorageContext
from IPython.display import Markdown, display

logging.basicConfig(stream=sys.stdout, level=logging.INFO)

### Setup variables
- `neo4j_url`: URL of the Neo4j instance
- `neo4j_username`: Username of the Neo4j instance
- `neo4j_password`: Password of the Neo4j instance
- `neo4j_graph`: Name of the graph (database) in the Neo4j instance
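
As an alternative to hardcoding these values, they can be read from environment variables so that credentials stay out of the notebook. This is a minimal sketch under stated assumptions: the variable names `NEO4J_URL`, `NEO4J_USERNAME`, `NEO4J_PASSWORD` and `NEO4J_GRAPH`, the helper `load_neo4j_settings`, and the fallback values are all chosen for illustration and are not read by LlamaIndex or Neo4j on their own.

```python
import os
from typing import Dict, Mapping


def load_neo4j_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the four connection settings from a mapping such as os.environ.

    Hypothetical helper: the environment variable names and the fallback
    values below are assumptions made for this sketch.
    """
    password = env.get("NEO4J_PASSWORD")
    if not password:
        # Fail early instead of sending an empty password to the server.
        raise KeyError("NEO4J_PASSWORD is not set")
    return {
        "neo4j_url": env.get("NEO4J_URL", "bolt://localhost:7687"),
        "neo4j_username": env.get("NEO4J_USERNAME", "neo4j"),
        "neo4j_password": password,
        "neo4j_graph": env.get("NEO4J_GRAPH", "neo4j"),
    }


# Usage with an explicit mapping, so the example runs without real secrets.
settings = load_neo4j_settings({"NEO4J_PASSWORD": "example-password"})
print(settings["neo4j_url"])
```

Passing the mapping as an argument keeps the helper easy to test; calling `load_neo4j_settings()` with no argument reads the real environment.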

In [None]:
# Example Sandbox values; replace them with the values from your own instance
neo4j_url = "bolt://3.220.232.45:7687"
neo4j_username = "neo4j"
neo4j_password = "papers-food-agents"
neo4j_graph = "neo4j"

# Add your api key here
os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"

In [None]:
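# Folder that holds the documents to index
input_file_path = os.path.join(os.getcwd(), "data")

# -p avoids an error if the folder already exists when the cell is re-run
!mkdir -p data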
!wget -O data/vllm.pdf --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" https://arxiv.org/pdf/2309.06180.pdf

## Model

In [None]:
llm = OpenAI(temperature=0, model="gpt-3.5-turbo")

service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)

documents = SimpleDirectoryReader(
    input_file_path
).load_data()

### Create Neo4j instance

Neo4j models data as a graph: nodes represent entities, and relationships represent the connections between them. Both nodes and relationships can store properties, which allows rich and flexible data representation. Neo4j's graph database engine provides efficient traversal and querying, which makes it a powerful tool for working with highly interconnected data.


In [None]:
# Create Neo4j instance

neo4j_instance = Neo4jGraphStore(
    username=neo4j_username,
    password=neo4j_password,
    url=neo4j_url,
    database=neo4j_graph,
)

### Create LlamaIndex instance

A knowledge graph in Neo4j is a structured, interconnected data model that represents real-world information as nodes and relationships. It enables the storage and retrieval of complex relationships and semantic connections between data entities. Neo4j's graph database technology is well-suited for building, querying, and navigating knowledge graphs, making it an ideal choice for applications involving data with intricate interdependencies.
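
One common way to represent such a knowledge graph is as a list of (subject, relation, object) triplets. The sketch below shows that idea in plain Python, assuming hand-written example triplets rather than anything extracted by the notebook's LLM; the helpers `index_by_subject` and `facts_about` are hypothetical and are not part of LlamaIndex or Neo4j.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, relation, object)

# Hand-written example triplets, not output produced by this notebook.
triplets: List[Triplet] = [
    ("vLLM", "uses", "PagedAttention"),
    ("PagedAttention", "manages", "KV cache"),
    ("vLLM", "is", "LLM serving system"),
]


def index_by_subject(items: List[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (relation, object) pairs under their subject for quick lookup."""
    rel_map = defaultdict(list)
    for subject, relation, obj in items:
        rel_map[subject].append((relation, obj))
    return dict(rel_map)


def facts_about(entity: str, items: List[Triplet]) -> List[str]:
    """Render every triplet whose subject is `entity` as a short sentence."""
    pairs = index_by_subject(items).get(entity, [])
    return [f"{entity} {relation} {obj}" for relation, obj in pairs]


print(facts_about("vLLM", triplets))
# ['vLLM uses PagedAttention', 'vLLM is LLM serving system']
```

Looking up facts by entity like this is the kind of retrieval a graph store makes cheap; the rendered sentences could then serve as context for answering a question about that entity.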

In [None]:
storage_context = StorageContext.from_defaults(graph_store=neo4j_instance)

# NOTE: can take a while!
index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
    service_context=service_context,
)

In [None]:
input_query = input("Enter the query to perform RAG\n")

In [None]:
llamaindex_instance = index.as_query_engine(include_text=True, response_mode="tree_summarize")

response = llamaindex_instance.query(input_query)

## Output

### Display result

In [None]:
# Display result
display(Markdown(f"<b>{response}</b>"))