<a href="https://colab.research.google.com/github/tomasonjo/blogs/blob/master/llm/Neo4j_vector_index_%26_Langchain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install neo4j langchain==0.0.270 wikipedia openai tiktoken



In [2]:
from langchain.graphs import Neo4jGraph

NEO4J_URI="neo4j+s://1234.databases.neo4j.io"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="-"

graph = Neo4jGraph(
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD
)

In [3]:
graph.query("""
CALL db.index.vector.createNodeIndex(
  'wikipedia', // index name
  'Chunk',     // node label
  'embedding', // node property
   1536,       // vector size
   'cosine'    // similarity metric
)
""")

[]

In [4]:
graph.query("""
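// Four candidate values; only the last one is a list of 1536 floats,
// i.e. the only value that matches the index's configured vector size.
WITH [1, [1,2,3], ["2","5"], [x in range(0, 1535) | toFloat(x)]] AS exampleValues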
UNWIND range(0, size(exampleValues) - 1) as index
CREATE (:Chunk {embedding: exampleValues[index], index: index})
""")

[]

In [5]:
graph.query("""
CALL db.index.vector.queryNodes('wikipedia', 3, [x in range(0,1535) | toFloat(x) / 2])
YIELD node, score
RETURN node.index AS index, score
""")

[{'index': 3, 'score': 1.0}]
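The score of 1.0 above is what cosine similarity would predict: the query vector is the stored vector divided by 2, and cosine similarity ignores magnitude. The sketch below checks this in plain Python. It assumes that Neo4j reports cosine scores rescaled to the range [0, 1] as `(1 + cos) / 2`; that is my reading of the Neo4j documentation, not something this notebook demonstrates, and the helper names are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalized_score(a, b):
    # Assumption: the index rescales cosine from [-1, 1] to [0, 1].
    return (1 + cosine_similarity(a, b)) / 2

# Same vectors as the Cypher above: range(0, 1535) is inclusive in Cypher,
# so it yields 1536 elements.
stored = [float(x) for x in range(1536)]
query = [x / 2 for x in stored]

score = normalized_score(stored, query)
print(score)
```

Scaling the query by any positive factor leaves the score unchanged, which is why the halved vector still counts as a perfect match.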

In [6]:
graph.query("""
MATCH (n) DETACH DELETE n
""")

[]

In [7]:
import wikipedia

bg3 = wikipedia.page(pageid=60979422)

In [8]:
print(bg3.content)

Baldur's Gate 3 is a role-playing video game developed and published by Larian Studios. It is the third main game in the Baldur's Gate series, which is based on the Dungeons & Dragons tabletop role-playing system. A partial version of the game was released in early access format for macOS, Windows, and the Stadia streaming service, on 6 October 2020. The game remained in early access until its full release on Windows on 3 August 2023. macOS and PlayStation 5 versions are scheduled for release on 6 September 2023. The Stadia version was cancelled following Stadia's closure.
Baldur's Gate 3 was acclaimed by critics, who praised the gameplay, narrative, amount of content, and player choice.


== Gameplay ==
Baldur's Gate 3 is a role-playing video game that offers both a single-player and cooperative multiplayer element. Players can create one or more characters and form a party along with a number of pre-generated characters to explore the game's story. Optionally, players are able to tak

In [9]:
import os
from langchain.embeddings import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = "API_KEY"

embeddings = OpenAIEmbeddings()

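# Split the article on blank lines, drop very short fragments,
# and embed each remaining chunk.
chunks = [
    {'text': el, 'embedding': embeddings.embed_query(el)}
    for el in bg3.content.split("\n\n")
    if len(el) > 50
]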
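To see what the blank-line split and the length filter keep without calling the embedding API, the sketch below applies the same two steps to a made-up string. `split_into_chunks` and the sample text are hypothetical and exist only for illustration.

```python
def split_into_chunks(text, min_length=50):
    """Split on blank lines and keep parts longer than min_length characters."""
    return [part for part in text.split("\n\n") if len(part) > min_length]

# Made-up text shaped like the Wikipedia content: a section heading is
# preceded by extra newlines, so it stays attached to the section body.
sample = (
    "Intro paragraph that is comfortably longer than fifty characters in total.\n\n"
    "\n== Gameplay ==\nA section body that is also longer than fifty characters overall.\n\n"
    "Too short."
)

sample_chunks = split_into_chunks(sample)
for chunk in sample_chunks:
    print(repr(chunk))
```

The second chunk keeps its leading newline and heading, which matches the shape of the retrieved context printed later in the notebook.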

In [10]:
graph.query("""
UNWIND $data AS row
CREATE (c:Chunk {text: row.text})
WITH c, row
CALL db.create.setVectorProperty(c, 'embedding', row.embedding)
YIELD node
RETURN distinct 'done'
""", {'data': chunks})

[{"'done'": 'done'}]

In [11]:
from langchain.chains.base import Chain
from langchain.chains.llm import LLMChain
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering.stuff_prompt import CHAT_PROMPT
from langchain.callbacks.manager import CallbackManagerForChainRun

from typing import Any, Dict, List
from pydantic import Field

vector_search = """
WITH $embedding AS e
CALL db.index.vector.queryNodes('wikipedia', 3, e) YIELD node, score
RETURN node.text AS result
ORDER BY score DESC
LIMIT 3
"""

class Neo4jVectorChain(Chain):
    """Chain for question-answering against a neo4j vector index."""

    graph: Neo4jGraph = Field(exclude=True)
    input_key: str = "query"  #: :meta private:
    output_key: str = "result"  #: :meta private:
    embeddings: OpenAIEmbeddings = OpenAIEmbeddings()
    qa_chain: LLMChain = LLMChain(llm=ChatOpenAI(temperature=0), prompt=CHAT_PROMPT)

    @property
    def input_keys(self) -> List[str]:
        """Return the input keys.
        :meta private:
        """
        return [self.input_key]

    @property
    def output_keys(self) -> List[str]:
        """Return the output keys.
        :meta private:
        """
        _output_keys = [self.output_key]
        return _output_keys

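    def _call(
        self,
        inputs: Dict[str, str],
        run_manager: CallbackManagerForChainRun = None,
    ) -> Dict[str, Any]:
        """Embed a question, retrieve similar chunks, and answer from them."""
        # Fall back to a no-op manager so the logging calls below are safe
        # when the chain is invoked without callbacks.
        run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        question = inputs[self.input_key]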
        embedding = self.embeddings.embed_query(question)
        run_manager.on_text(
            "Vector search embeddings:", end="\n", verbose=self.verbose
        )
        run_manager.on_text(
            embedding[:5], color="green", end="\n", verbose=self.verbose
        )

        context = self.graph.query(
            vector_search, {'embedding': embedding})
        context = [el['result'] for el in context]
        run_manager.on_text(
            "Retrieved context:", end="\n", verbose=self.verbose
        )
        run_manager.on_text(
            context, color="green", end="\n", verbose=self.verbose
        )

        result = self.qa_chain(
            {"question": question, "context": context},
        )
        final_result = result[self.qa_chain.output_key]
        return {self.output_key: final_result}

In [12]:
vector_qa = Neo4jVectorChain(graph=graph, embeddings=embeddings, verbose=True)

In [13]:
vector_qa.run("What is the gameplay of Baldur's Gate 3 like?")



> Entering new Neo4jVectorChain chain...
Vector search embeddings:
[0.001163442909793317, -0.03362896087993143, -0.008195404370601279, -0.002520517105697522, 0.0013487993944175268]
Retrieved context:
["\n== Gameplay ==\nBaldur's Gate 3 is a role-playing video game that offers both a single-player and cooperative multiplayer element. Players can create one or more characters and form a party along with a number of pre-generated characters to explore the game's story. Optionally, players are able to take one of their characters and team up online with other players to form a party. Like previous games in the Baldur's Gate series, Baldur's Gate 3 has turn-based combat, similar to Larian's earlier games Divinity: Original Sin and Divinity: Original Sin II; all combat is based on the D&D 5th Edition rules.", "Baldur's Gate 3 is a role-playing video game developed and published by Larian Studios. It is the third main game in the Baldur's Gate series, w

"The gameplay of Baldur's Gate 3 is a role-playing video game that offers both a single-player and cooperative multiplayer element. Players can create one or more characters and form a party along with pre-generated characters to explore the game's story. The game features turn-based combat, similar to Larian's earlier games Divinity: Original Sin and Divinity: Original Sin II, and all combat is based on the D&D 5th Edition rules."
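Conceptually, the retrieval step inside `_call` ranks the stored chunks by similarity to the question embedding and keeps the best few. The sketch below does that exhaustively in plain Python over toy 3-dimensional vectors. It illustrates the idea only: Neo4j's vector index uses an approximate nearest-neighbour search, not a full scan, and `top_k` and the toy rows are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, rows, k=3):
    """Return the texts of the k rows most similar to the query vector."""
    scored = [(cosine_similarity(query, row["embedding"]), row["text"]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy stand-ins for (text, embedding) chunks; real embeddings have 1536 dimensions.
toy_rows = [
    {"text": "gameplay", "embedding": [1.0, 0.0, 0.0]},
    {"text": "plot", "embedding": [0.0, 1.0, 0.0]},
    {"text": "development", "embedding": [0.0, 0.0, 1.0]},
    {"text": "reception", "embedding": [0.7, 0.7, 0.0]},
]

results = top_k([0.9, 0.1, 0.0], toy_rows, k=2)
print(results)
```

The returned texts play the role of `context` in the chain: they are passed to the question-answering prompt together with the original question.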