# Demo - Lord of the Rings to Graph

* Read a page from a PDF document
* Prompt an LLM to extract interesting characters, locations, events and relationships
* Get the output as JSON
* Use GraphCypherQAChain to translate the JSON to Cypher and store the found nodes and relations in Neo4j

### Sources:
* https://learn.deeplearning.ai/langchain
* https://python.langchain.com/docs/get_started/quickstart

### Requirements

In [2]:
!pip install pypdf langchain openai langchain-openai neo4j graphdatascience python-dotenv langchainhub langchain-community --quiet

In [1]:
%load_ext watermark
%watermark -p langchain,langchainhub,langchain_community,pypdf

langchain          : 0.1.5
langchainhub       : 0.1.14
langchain_community: 0.0.17
pypdf              : 4.0.1



### Imports

In [36]:
import os
from graphdatascience import GraphDataScience
from dotenv import load_dotenv, find_dotenv, dotenv_values
from pathlib import Path
import neo4j

from langchain_openai import ChatOpenAI

from langchain.agents import AgentExecutor, create_react_agent
from langchain.chains import LLMChain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.output_parsers.json import SimpleJsonOutputParser
from langchain.prompts import PromptTemplate
from langchain.tools import Tool

from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain

from langchain import hub
from langchain_community.document_loaders import PyPDFLoader

### Settings

In [6]:
project_path = Path(os.getcwd()).parent
data_path = project_path / "data"
model_path = project_path / "models"
output_path = project_path / "output"

llm_model = "gpt-4"

# load env settings
load_dotenv("../.env.lotr")

neo4j_url = os.getenv('NEO4J_URL')
neo4j_database = os.getenv('NEO4J_DATABASE')
neo4j_user = os.getenv('NEO4J_USER')
neo4j_pass = os.getenv('NEO4J_PASS')
openai_api_key = os.getenv('OPENAI_API_KEY')
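
`os.getenv` returns `None` for a variable that is not set, so a typo in `.env.lotr` only surfaces later as a connection or authentication error. A minimal sketch of an early check, assuming the five variable names used above are all required (`missing_settings` is a hypothetical helper, not part of `python-dotenv`):

```python
import os

# Names taken from the settings cell above; treating all of them as required is an assumption.
REQUIRED_SETTINGS = ["NEO4J_URL", "NEO4J_DATABASE", "NEO4J_USER", "NEO4J_PASS", "OPENAI_API_KEY"]


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Running this right after `load_dotenv` makes a missing or empty value visible before any request is sent.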

### 1. Read PDF

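* Comparison of PDF extraction libraries that informed the choice of extractor: https://medium.com/social-impact-analytics/comparing-4-methods-for-pdf-text-extraction-in-python-fd34531034f
* The code below loads the file with `PyPDFLoader`, which is backed by `pypdf`.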

In [7]:
filename = "JRR Tolkien - Lord of the Rings Collection.pdf"

In [8]:
%%time
# Wall time: 4.18 s

loader = PyPDFLoader(str(data_path / filename))
pages = loader.load_and_split()

CPU times: user 4.15 s, sys: 48.8 ms, total: 4.2 s
Wall time: 4.21 s


In [10]:
print(pages[1].page_content)

Chapter I  
An Unexpected Party  
 
 
In a hole in the ground there lived a hobbit. Not a nasty, dirty, wet hole, filled with 
the ends of worms and an oozy smell, nor yet a dry, bare, sandy hole with nothing 
in it to sit down on or to eat: it was a hobbit -hole, and that means comfort.  
It had a p erfectly round door like a porthole, painted green, with a shiny 
yellow brass knob in the exact middle. The door opened on to a tube -shaped hall 
like a tunnel: a very comfortable tunnel without smoke, with panelled walls, and 
floors tiled and carpeted, pro vided with polished chairs, and lots and lots of pegs 
for hats and coats  - the hobbit was fond of visitors. The tunnel wound on and on, 
going fairly but not quite straight into the side of the hill - The Hill, as all the 
people for many miles round called it - and many little round doors opened out of 
it, first on one side and then on another. No going upstairs for the hobbit: 
bedrooms, bathrooms, cellars, pantries (lots of the
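
The extracted text shows typical PDF artifacts, such as `hobbit -hole`, `tube -shaped` and doubled spaces. A minimal sketch of a heuristic cleanup that could run before prompting follows. `tidy_page_text` is a hypothetical helper, and it cannot repair words that were split by a plain space, such as `p erfectly`.

```python
import re


def tidy_page_text(text: str) -> str:
    """Heuristically repair spacing artifacts from PDF extraction."""
    # Rejoin compounds split as "hobbit -hole" (space before the hyphen, none after).
    text = re.sub(r"(\w) -(\w)", r"\1-\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim trailing whitespace on every line.
    return "\n".join(line.rstrip() for line in text.splitlines())
```

A dash used as punctuation, as in `coats  - the hobbit`, keeps its surrounding spaces because the pattern only matches when no space follows the hyphen.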

### 2. Extract nodes and relations from text 

#### 2.1 Response as text

In [74]:
llm = ChatOpenAI(openai_api_key=openai_api_key, temperature=0)
print("llm.temperature:", llm.temperature)

prompt = PromptTemplate(template="""Extract the following interesting elements from the text below:
- Characters
- Events
- Locations
- Objects

<context>
{context}
</context>

Also, link these entities in the following way:
- character - RELATES_TO (`how`) - character
- character - INTERACTS_WITH (`how`) - character
- character - INVOLVED_IN - event
- event - LOCATED_AT - location
- object - RELEVANT_FOR (relevance_score between 0 and 1, `why`) - event

If the attribute `how` is not relevant, omit it.

* Please estimate the relevance score yourself.
* For RELATES_TO, provide max 2 words of how the characters relate or interact (e.g. has_father, has_mother, has_son, has_daughter, has_uncle, has_aunt, has_friend, has_colleague, etc.).
    * For the entity links, add the keyword 'how' to indicate the type of relationship, e.g. "Bilbo Baggins RELATES_TO (how:has_mother) Belladonna Took"
* For INTERACTS_WITH, provide a short summary of why it is relevant and add it as an attribute 'why'.

* For RELEVANT_FOR, provide a short summary of why the object is relevant.
    * For the entity links, add the keyword 'why' to give the reason, e.g. Staff - RELEVANT_FOR (relevance_score: 0.8, why: "Gandalf's staff is highly relevant to him as it is part of his iconic appearance and magical abilities.") - Gandalf


Finally, provide a summary for the text.

{output}
""", input_variables=["context", "output"])

llm.temperature: 0.0


In [75]:
%%time

document_chain = create_stuff_documents_chain(llm, prompt)

response = document_chain.invoke({
    "context": [pages[1]], "output": ""
})
print(response)

Characters:
- Hobbit (unnamed)
- Baggins (unnamed)
- Big People (mentioned)

Events:
- An Unexpected Party (mentioned)

Locations:
- A hole in the ground
- The Hill
- Garden
- Meadows
- River

Objects:
- Round door
- Porthole
- Green paint
- Shiny yellow brass knob
- Tube-shaped hall
- Tunnel
- Panelled walls
- Tiled and carpeted floors
- Polished chairs
- Pegs for hats and coats
- Bedrooms
- Bathrooms
- Cellars
- Pantries
- Wardrobes
- Kitchens
- Dining-rooms
- Windows
- Deep-set round windows

Character - RELATES_TO (how: has_friend) - Baggins
Character - INTERACTS_WITH (how: interacts_with) - Big People
Character - INVOLVED_IN - An Unexpected Party
An Unexpected Party - LOCATED_AT - The Hill
Object - RELEVANT_FOR (relevance_score: 0.7, why: The round door is relevant as it represents the entrance to the hobbit's comfortable home.)

Summary:
The text introduces a hobbit who lives in a comfortable hole in the ground. The hobbit's name is Baggins, and they are considered respectable an
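
A free-text answer like the one above is easy to read but awkward to load into a graph, which is why a JSON response is more practical. Models sometimes wrap JSON in a Markdown code fence, so a tolerant parser helps. This is a minimal sketch using only the standard library. `parse_llm_json` is a hypothetical helper and not the implementation of any LangChain parser.

```python
import json
import re

# Markdown code-fence marker, built from single backticks to keep this example readable.
FENCE = "`" * 3


def parse_llm_json(reply: str):
    """Parse a model reply as JSON, unwrapping one Markdown code fence if present."""
    text = reply.strip()
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match:
        text = match.group(1)
    # Raises json.JSONDecodeError if the remaining text is not valid JSON.
    return json.loads(text)
```

Letting `json.JSONDecodeError` propagate keeps invalid replies visible instead of silently storing partial data.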

#### 2.2 Response as JSON

In [81]:
%%time

document_chain = create_stuff_documents_chain(llm, prompt, output_parser=SimpleJsonOutputParser())

response = document_chain.invoke({
    "context": [pages[1]], "output": "Very important is that you output the response as valid JSON!"
})
response

CPU times: user 32.8 ms, sys: 6.87 ms, total: 39.7 ms
Wall time: 50.5 s


{'characters': ['hobbit', 'Baggins', 'mother', 'Dwarves', 'Big People'],
 'events': ['An Unexpected Party'],
 'locations': ['The Hill', 'garden', 'meadows', 'river'],
 'objects': ['door',
  'knob',
  'tunnel',
  'walls',
  'floors',
  'chairs',
  'pegs',
  'windows',
  'rooms',
  'wardrobes',
  'kitchens',
  'dining-rooms',
  'bedrooms',
  'bathrooms',
  'cellars',
  'pantries',
  'bright colours',
  'magic'],
 'links': [{'entity1': 'hobbit',
   'relation': 'RELATES_TO',
   'entity2': 'Baggins',
   'how': 'has_surname'},
  {'entity1': 'hobbit',
   'relation': 'INTERACTS_WITH',
   'entity2': 'Dwarves',
   'why': 'The hobbit interacts with the Dwarves during the unexpected party.'},
  {'entity1': 'Baggins',
   'relation': 'INVOLVED_IN',
   'entity2': 'An Unexpected Party'},
  {'entity1': 'An Unexpected Party',
   'relation': 'LOCATED_AT',
   'entity2': 'The Hill'},
  {'entity1': 'door',
   'relation': 'RELEVANT_FOR',
   'entity2': 'An Unexpected Party',
   'why': 'The door is the entranc
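
Before writing to the graph, it can help to check that every link refers to an entity the model actually listed. The sketch below assumes the key names seen in the output above (`characters`, `events`, `locations`, `objects`, `links`, `entity1`, `entity2`). The model chose these names itself, so they may differ between runs. `unknown_link_entities` is a hypothetical helper.

```python
def unknown_link_entities(response: dict) -> list:
    """Return (field, name) pairs for link endpoints missing from the entity lists."""
    known = set()
    for key in ("characters", "events", "locations", "objects"):
        known.update(response.get(key, []))

    problems = []
    for link in response.get("links", []):
        for field in ("entity1", "entity2"):
            name = link.get(field)
            if name not in known:
                problems.append((field, name))
    return problems
```

An empty result means all link endpoints are accounted for. Anything else points at entities the model invented in the links only.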

### 3. Match results with graph

#### 3.1 Connect to Neo4j

In [82]:
graph = Neo4jGraph(
    url=neo4j_url,
    username=neo4j_user,
    password=neo4j_pass
)

# Double-check that the database is correct: Neo4jGraph can take the database name from the NEO4J_DATABASE environment variable, overriding the `database` argument
graph._database

'neo4j'

In [83]:
# Deletes all nodes and edges from the graph!
delete_all = False

if delete_all:
    query = "MATCH (n) DETACH DELETE n"
    graph.query(query)

In [84]:
CYPHER_GENERATION_TEMPLATE = """
You are an expert Neo4j Developer translating json into valid Cypher. Use a semicolon ';' after each clause.

Make sure that nodes and relations are merged for the following json:
{question}
"""

cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["question"],
)

cypher_chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True
)

cypher_chain.invoke({"query": str(response)})



> Entering new GraphCypherQAChain chain...
Generated Cypher:
MERGE (hobbit:Character {name: 'hobbit'})
MERGE (Baggins:Character {name: 'Baggins'})
MERGE (mother:Character {name: 'mother'})
MERGE (Dwarves:Character {name: 'Dwarves'})
MERGE (BigPeople:Character {name: 'Big People'})
MERGE (AnUnexpectedParty:Event {name: 'An Unexpected Party'})
MERGE (TheHill:Location {name: 'The Hill'})
MERGE (garden:Location {name: 'garden'})
MERGE (meadows:Location {name: 'meadows'})
MERGE (river:Location {name: 'river'})
MERGE (door:Object {name: 'door'})
MERGE (knob:Object {name: 'knob'})
MERGE (tunnel:Object {name: 'tunnel'})
MERGE (walls:Object {name: 'walls'})
MERGE (floors:Object {name: 'floors'})
MERGE (chairs:Object {name: 'chairs'})
MERGE (pegs:Object {name: 'pegs'})
MERGE (windows:Object {name: 'windows'})
MERGE (rooms:Object {name: 'rooms'})
MERGE (wardrobes:Object {name: 'wardrobes'})
MERGE (kitchens:Object {name: 'kitchens'})
MERGE (diningRooms:Object {name: 'dining-

{'query': '{\'characters\': [\'hobbit\', \'Baggins\', \'mother\', \'Dwarves\', \'Big People\'], \'events\': [\'An Unexpected Party\'], \'locations\': [\'The Hill\', \'garden\', \'meadows\', \'river\'], \'objects\': [\'door\', \'knob\', \'tunnel\', \'walls\', \'floors\', \'chairs\', \'pegs\', \'windows\', \'rooms\', \'wardrobes\', \'kitchens\', \'dining-rooms\', \'bedrooms\', \'bathrooms\', \'cellars\', \'pantries\', \'bright colours\', \'magic\'], \'links\': [{\'entity1\': \'hobbit\', \'relation\': \'RELATES_TO\', \'entity2\': \'Baggins\', \'how\': \'has_surname\'}, {\'entity1\': \'hobbit\', \'relation\': \'INTERACTS_WITH\', \'entity2\': \'Dwarves\', \'why\': \'The hobbit interacts with the Dwarves during the unexpected party.\'}, {\'entity1\': \'Baggins\', \'relation\': \'INVOLVED_IN\', \'entity2\': \'An Unexpected Party\'}, {\'entity1\': \'An Unexpected Party\', \'relation\': \'LOCATED_AT\', \'entity2\': \'The Hill\'}, {\'entity1\': \'door\', \'relation\': \'RELEVANT_FOR\', \'entity2
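
Because the JSON is already structured, the same `MERGE` statements can also be built without a second LLM call. The sketch below is an alternative to the `GraphCypherQAChain` step above, not the method used in this notebook. It assumes the key names from the JSON output shown earlier, and `json_to_cypher` is a hypothetical helper.

```python
import re

# Mapping from JSON list names to node labels, mirroring the generated Cypher above.
LABELS = {
    "characters": "Character",
    "events": "Event",
    "locations": "Location",
    "objects": "Object",
}


def json_to_cypher(response: dict) -> list:
    """Return (query, params) pairs that MERGE nodes and relationships."""
    statements = []
    for key, label in LABELS.items():
        for name in response.get(key, []):
            statements.append((f"MERGE (n:{label} {{name: $name}})", {"name": name}))

    for link in response.get("links", []):
        rel = link["relation"]
        # Relationship types cannot be passed as parameters, so validate before interpolating.
        if not re.fullmatch(r"[A-Z_]+", rel):
            continue
        # Every remaining key (for example 'how' or 'why') becomes a relationship property.
        props = {k: v for k, v in link.items() if k not in ("entity1", "entity2", "relation")}
        query = (
            "MATCH (a {name: $source}), (b {name: $target}) "
            f"MERGE (a)-[r:{rel}]->(b) SET r += $props"
        )
        params = {"source": link["entity1"], "target": link["entity2"], "props": props}
        statements.append((query, params))
    return statements
```

Each pair could then be executed with `graph.query(query, params)`. Matching endpoints by `name` alone is a simplification: two entities with the same name but different labels would both match.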