# Knowledge Graphs



# What is a Knowledge Graph?

Simple definition: A **Knowledge Graph** stores what we know about a particular domain as a graph.

Detailed definition: A **Knowledge Graph** is a structured representation of knowledge that captures the relationships between entities. It is typically stored in a graph database, which organizes data as a network of interconnected nodes (entities) and edges (relationships).

## Key Components

1. **Entities**: These are the nodes in the graph, representing real-world objects or concepts, such as people, places, products, or ideas.
2. **Relationships**: The edges connecting the nodes, defining how entities are related to one another. For example, a person "works at" a company or a book "is written by" an author.
3. **Attributes**: Properties or characteristics of entities that provide additional context, such as the name, date of birth, or location.
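
The three components map naturally onto a small data structure. The sketch below is a plain-Python illustration only, not how Neo4j stores data internally; the class names `Entity` and `Relationship`, the helper `neighbors`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A node: a label (type), a name, and attributes as key/value pairs."""
    label: str
    name: str
    attributes: tuple = ()  # e.g. (("birth_year", 1995),); a tuple keeps the entity hashable


@dataclass(frozen=True)
class Relationship:
    """A directed, typed edge between two entities."""
    source: Entity
    type: str
    target: Entity


# Hypothetical sample data, for illustration only.
mary = Entity("Person", "Mary", (("birth_year", 1995),))
acme = Entity("Company", "Acme")
edges = [Relationship(mary, "WORKS_AT", acme)]


def neighbors(entity, relationships):
    """Return (relationship type, target name) pairs for edges leaving `entity`."""
    return [(r.type, r.target.name) for r in relationships if r.source == entity]


print(neighbors(mary, edges))  # [('WORKS_AT', 'Acme')]
```

Keeping relationships as first-class objects, rather than as foreign keys buried in a table, is what makes questions like "who is connected to whom, and how" cheap to ask.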


## Applications

- **Recommendation Systems**: Suggest products or content based on user preferences and behavior.
- **Social Network Analysis**:
   - **Relationship Mapping**: Visualize and analyze relationships between individuals, groups, or organizations. Identify key influencers and social clusters.
   - **Community Detection**: Discover communities or groups within a network based on connectivity patterns and interactions.
   - **Sentiment Analysis**: Analyze the sentiment of interactions between entities to understand public opinion and emotional trends.
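
As a rough illustration of community detection, the sketch below groups people into connected components, which is the simplest possible notion of a community; production systems usually rely on modularity-based algorithms such as Louvain instead. The function name `connected_components` and the friendship data are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into components, treating every edge as undirected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical friendship edges, for illustration only.
friendships = [("Ann", "Bob"), ("Bob", "Cara"), ("Dev", "Eli")]
print(connected_components(friendships))
# [['Ann', 'Bob', 'Cara'], ['Dev', 'Eli']]
```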


## Benefits

- **Enhanced Data Accessibility**: Provides a clearer, more intuitive way to access and interact with data.
- **Improved Insights**: By connecting diverse data points, it reveals insights that might not be apparent from isolated datasets.
- **Flexibility and Scalability**: Adapts to new information and relationships, making it scalable and versatile.

In summary, a knowledge graph is a powerful tool for organizing, integrating, and leveraging data in a way that reflects real-world relationships and enhances data-driven decision-making.



In [None]:
!pip install --upgrade --quiet langchain langchain_community langchain_experimental langchain_openai langgraph neo4j openai tiktoken langchain-nvidia-ai-endpoints


### **Set up Project Variables and Import Database**

In [None]:
from langchain_community.graphs import Neo4jGraph
from langchain_openai import ChatOpenAI
from google.colab import userdata
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import os

os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
os.environ["NEO4J_URI"] = "bolt://44.203.231.196:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "brass-barge-cells"
os.environ["NVIDIA_API_KEY"] = userdata.get('NVIDIA_API_KEY')


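# Llama 3.1 70B served through NVIDIA's API (not Ollama, despite the variable name)
llmOllama = ChatNVIDIA(model="meta/llama-3.1-70b-instruct")
# Neo4jGraph reads NEO4J_URI, NEO4J_USERNAME and NEO4J_PASSWORD from the environment
graph = Neo4jGraph()
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo")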

In [None]:
llmOllama.invoke("What is Neo4j?").content

'Neo4j is a graph database management system that stores data in the form of nodes and relationships rather than tables and columns like traditional relational databases. This allows for efficient querying and manipulation of complex, interconnected data.\n\nHere are some key features and concepts of Neo4j:\n\n**Key Features:**\n\n1. **Graph data model**: Neo4j stores data as a graph, consisting of nodes (entities) and relationships (edges) between them.\n2. **NoSQL**: Neo4j is a NoSQL database, which means it does not use a fixed schema or rigid data structure like relational databases.\n3. **Scalability**: Neo4j is designed to handle large amounts of data and scale horizontally to support growing datasets.\n4. **ACID compliance**: Neo4j ensures atomicity, consistency, isolation, and durability (ACID) for transactions, ensuring data integrity.\n\n**Use cases:**\n\n1. **Social networks**: Analyzing relationships between users, like friendships or followers.\n2. **Recommendation engines

In [None]:
# Test LLM

response = llm.invoke("What is Neo4j?")
print(response.content)

Neo4j is a popular open-source graph database management system that is designed to store, manage, and query data in the form of graphs. Unlike traditional relational databases that use tables and rows to represent data, Neo4j uses nodes, relationships, and properties to model complex data structures and their interconnections.

### Key Features of Neo4j:

1. **Graph Model**: Data is represented as nodes (entities) and relationships (connections between entities), making it intuitive for representing real-world scenarios like social networks, organizational structures, and more.

2. **Cypher Query Language**: Neo4j uses Cypher, a powerful and expressive query language specifically designed for working with graph data. Cypher allows users to easily create, read, update, and delete data in the graph.

3. **ACID Compliance**: Neo4j ensures data integrity and reliability through ACID (Atomicity, Consistency, Isolation, Durability) compliance, making it suitable for applications that requir

In [None]:
# Test Graph

graph.get_schema

'Node properties:\nPerson {name: STRING, age: INTEGER}\nCity {name: STRING}\nProfession {name: STRING}\nProject {name: STRING}\nWorkPlaces {name: STRING}\nMeeting {name: STRING}\nEvent {name: STRING}\nRelationship properties:\n\nThe relationships:\n(:Person)-[:FRIENDS_WITH]->(:Person)\n(:Person)-[:PARTNER]->(:Person)\n(:Person)-[:REGULAR_AT]->(:WorkPlaces)\n(:Person)-[:HAS_CHILD]->(:Person)\n(:Person)-[:WORKS_AT]->(:WorkPlaces)\n(:Person)-[:HAS_PROFESSION]->(:Profession)\n(:Person)-[:LEADS]->(:WorkPlaces)\n(:Person)-[:OWNS]->(:WorkPlaces)\n(:Person)-[:COLLABORATES_ON]->(:Project)\n(:Person)-[:ORGANIZES]->(:Project)\n(:Person)-[:NEIGHBOR_OF]->(:Person)\n(:Person)-[:HOSTS]->(:Meeting)\n(:Person)-[:HAS_PARENT]->(:Person)\n(:Person)-[:MARRIED_TO]->(:Person)\n(:Person)-[:HAS_EVENT]->(:Event)\n(:Person)-[:RESIDES_IN]->(:City)\n(:Person)-[:ENJOYS]->(:WorkPlaces)\n(:Person)-[:ENJOYS]->(:Meeting)'

## **Create Knowledge Graphs**

### **Two methods:**


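1. Custom method: prompt an LLM to write the Cypher that builds the graph

2. LangChain Graph Transformers: let `LLMGraphTransformer` extract the nodes and relationships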

In [None]:
# Method 1: custom method

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import SystemMessage
from langchain_core.output_parsers import StrOutputParser


prompt = ChatPromptTemplate.from_messages([
    SystemMessage(content="""
    You are a helpful assistant that creates knowledge graphs by generating Cypher queries.\n

    Task:
     *  Identify Entities, Relationships and Property Keys from Context.\n
     *  Generate Cypher Query to Create Knowledge Graph from the Entities Relationships and Property Keys discovered.\n
     *  Extract as many Entities and Relationships as possible.\n
     *  Always extract a person's Profession as an Entity.\n
     *  Be creative.
     *  Understand hidden relationships from the network.
     Note: Read the Context twice and carefully before generating Cypher Query.\n
     Note: Do not return anything other than the Cypher Query.\n
     Note: Do not include any explanations or apologies in your responses.\n


     Note: Do not hallucinate.\n

     Entities include Person, Place, Product, WorkPlaces, Companies, City, Country, Animals, tags such as a person's Profession, and more.\n

     Few Shot Prompts:
      Example Context:

       Mary was born in 1995. She is Friends with Jane and John. Jane is 2 years older than Mary.
       Mary has a dog named Max, who is 3 years old. She is also married to John. Mary is from the USA and is a Software Engineer.

      Answer:
        MERGE (Mary:Person {name: "Mary", birth_year: 1995})
        MERGE (Jane:Person {name: "Jane", birth_year: 1993})
        MERGE (John:Person {name: "John"})
        MERGE (Mary)-[:FRIENDS_WITH]->(Jane)
        MERGE (Mary)-[:FRIENDS_WITH]->(John)
        MERGE (Jane)-[:FRIENDS_WITH]->(Mary)
        MERGE (John)-[:FRIENDS_WITH]->(Mary)
        MERGE (Mary)-[:HAS_DOG]->(Max:Dog {name: "Max", age: 3})
        MERGE (Mary)-[:MARRIED_TO]->(John)
        MERGE (Mary)-[:HAS_PROFESSION]->(SoftwareEngineer:Profession {name: "Software Engineer"})
        MERGE (Mary)-[:FROM]->(USA:Country {name: "USA"})


    """),
    ("human", "Context:{text}"),
])

chain = prompt | llm | StrOutputParser()

In [None]:
# Read the story that will be turned into a knowledge graph
with open('FictionalStory.txt', 'r', encoding='utf-8') as file:
    content = file.read()

# Print the content
print(content)


### The Enchanted Network of NexCity

---

In the heart of the bustling metropolis of **NexCity**, where skyscrapers kissed the clouds and neon lights painted the night, lived Emma Thompson. Her apartment was a marvel of modern design, perched high above the city streets, a shimmering beacon of her success as a software developer. Little did she know, her life was about to become a thrilling tapestry of connections, adventures, and surprises.

#### **Chapter 1: The Birthday Bash**

One sunny Saturday, Emma received an invitation from her childhood friend, Alex Martin. Alex, now a thriving entrepreneur, lived in a charming suburban house with his partner, Jessica, and their two playful children, Lily and Max. The occasion? Max’s fifth birthday party, and it promised to be a grand affair.

Emma arrived at Alex's home to find the backyard transformed into a carnival. Colorful balloons floated in the breeze, and laughter echoed as children darted around a bouncy castle. Emma marveled at th

In [None]:
result = chain.invoke({"text": content})
print(result)


MERGE (Emma:Person {name: "Emma Thompson"})
MERGE (NexCity:City {name: "NexCity"})
MERGE (Emma)-[:RESIDES_IN]->(NexCity)
MERGE (Emma)-[:HAS_PROFESSION]->(SoftwareDeveloper:Profession {name: "Software Developer"})

MERGE (Alex:Person {name: "Alex Martin"})
MERGE (Jessica:Person {name: "Jessica"})
MERGE (Lily:Person {name: "Lily"})
MERGE (Max:Person {name: "Max", age: 5})
MERGE (Alex)-[:FRIENDS_WITH]->(Emma)
MERGE (Alex)-[:PARTNER]->(Jessica)
MERGE (Jessica)-[:HAS_CHILD]->(Lily)
MERGE (Jessica)-[:HAS_CHILD]->(Max)

MERGE (Tom:Person {name: "Tom Wilson"})
MERGE (Sarah:Person {name: "Sarah"})
MERGE (Innovatech:WorkPlaces {name: "Innovatech"})
MERGE (Tom)-[:WORKS_AT]->(Innovatech)
MERGE (Tom)-[:FRIENDS_WITH]->(Emma)
MERGE (Sarah)-[:HAS_PROFESSION]->(Artist:Profession {name: "Artist"})

MERGE (Maria:Person {name: "Maria Lopez"})
MERGE (University:WorkPlaces {name: "University"})
MERGE (Maria)-[:FRIENDS_WITH]->(Emma)
MERGE (Maria)-[:WORKS_AT]->(University)
MERGE (Maria)-[:LEADS]->(Environmen

In [None]:
def create_knowledge_graph(cypher: str):
    """Run the generated Cypher against Neo4j to create the nodes and relationships."""
    graph.query(cypher)

In [None]:
create_knowledge_graph(result)
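
Chat models sometimes wrap generated code in Markdown fences even when told to return only the query, and a fenced string sent to `graph.query` would be rejected as invalid Cypher. Below is a minimal guard, assuming the fence is the only wrapper around the query; `strip_code_fences` is a hypothetical helper, not part of LangChain.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches an optional language tag after the opening fence, e.g. "cypher".
_FENCED = re.compile(
    r"^\s*" + FENCE + r"[a-zA-Z]*\s*\n(.*?)\n?\s*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fences(text: str) -> str:
    """Return the body of a fenced block, or the stripped text if unfenced."""
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text.strip()


fenced = FENCE + 'cypher\nMERGE (a:Person {name: "Emma Thompson"})\n' + FENCE
print(strip_code_fences(fenced))
```

With such a helper in place, the call could become `create_knowledge_graph(strip_code_fences(result))`.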

In [None]:
# Method 2: LangChain LLMGraphTransformer

from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm_transformer = LLMGraphTransformer(llm=llm)
documents = [Document(page_content=content)]
graph_documents = llm_transformer.convert_to_graph_documents(documents)
print(f"Nodes:{graph_documents[0].nodes}")
print(f"Relationships:{graph_documents[0].relationships}")

Nodes:[Node(id='Emma Thompson', type='Person'), Node(id='Nexcity', type='City'), Node(id='Alex Martin', type='Person'), Node(id='Jessica', type='Person'), Node(id='Lily', type='Person'), Node(id='Max', type='Person'), Node(id='Tom Wilson', type='Person'), Node(id='Sarah', type='Person'), Node(id='Maria Lopez', type='Person'), Node(id='Jake Anderson', type='Person'), Node(id='Raj Patel', type='Person'), Node(id='Nina', type='Person'), Node(id='Leo Martinez', type='Person'), Node(id='Maya Chen', type='Person'), Node(id='Helen Carter', type='Person'), Node(id='Chloe Edwards', type='Person'), Node(id='Liam', type='Person')]
Relationships:[Relationship(source=Node(id='Emma Thompson', type='Person'), target=Node(id='Nexcity', type='City'), type='RESIDES'), Relationship(source=Node(id='Alex Martin', type='Person'), target=Node(id='Nexcity', type='City'), type='RESIDES'), Relationship(source=Node(id='Jessica', type='Person'), target=Node(id='Alex Martin', type='Person'), type='SPOUSE'), Relati

In [None]:
graph.add_graph_documents(graph_documents)

### **Querying the Graph**

In [None]:
from langchain.chains import GraphCypherQAChain

graphchain = GraphCypherQAChain.from_llm(
    llm, graph=graph, verbose=True, return_intermediate_steps=True
)

In [None]:
results = graphchain.invoke({"query": "Show me the friends of someone who is a Software Developer"})
print(results["result"])



> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (s:Person)-[:FRIEND]->(f:Person)
WHERE s.profession = 'Software Developer'
RETURN f
Full Context:
[]

> Finished chain.
I don't know the answer.
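
The empty context means the generated Cypher matched nothing. Judging by the schema printed earlier, a likely cause is that the query filters on a `profession` property of `Person`, whereas that schema lists only `name` and `age` on `Person` and models professions as separate `Profession` nodes reached through `HAS_PROFESSION`; friendships likewise appear there as `FRIENDS_WITH` rather than `FRIEND`. One rough way to catch this kind of mismatch before running a query is to compare the relationship types it uses with those in the schema. The sketch below uses a simple regular expression rather than a real Cypher parser (it only sees the first type in a pattern such as `[:A|B]`), and `relationship_types` and `unknown_relationship_types` are hypothetical helper names.

```python
import re

# Matches the type name in patterns such as -[:FRIENDS_WITH]-> or -[r:WORKS_AT]->.
_REL_TYPE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def relationship_types(cypher: str) -> set:
    """Collect relationship type names that appear inside square brackets."""
    return set(_REL_TYPE.findall(cypher))


def unknown_relationship_types(cypher: str, schema: str) -> set:
    """Return types used in `cypher` that the schema text never mentions."""
    return relationship_types(cypher) - relationship_types(schema)


# Excerpt of the schema string shown earlier in this notebook.
schema = (
    "(:Person)-[:FRIENDS_WITH]->(:Person)\n"
    "(:Person)-[:HAS_PROFESSION]->(:Profession)"
)
generated = "MATCH (s:Person)-[:FRIEND]->(f:Person) RETURN f"
print(unknown_relationship_types(generated, schema))  # {'FRIEND'}
```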


In [None]:
os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"]=userdata.get('LANGCHAIN_API_KEY')
os.environ["LANGCHAIN_PROJECT"]="pr-shadowy-latex-56"

### **Prompting**

In [None]:



graph_prompt = ChatPromptTemplate.from_messages([
    ("system",
     """
    You are a Cypher expert generating Cypher queries to query a Neo4j database from user questions.\n

    Task:
      *  Use the schema provided to generate Cypher queries.\n
      *  Strictly follow the schema while generating Cypher queries.\n
      *  Do not return any explanations or apologies in your responses.\n
      *  Do not return anything other than Cypher queries.\n
      *  Use full names for Cypher variables.\n

    """),
    ("human", "Schema: {schema}"),
    ("human", "The question is: {query}"),
])

qa_prompt = ChatPromptTemplate.from_messages([
    ("system","""
    You are a Chat Assistant helping users with their queries.
    Task: You are to generate answers to the Question asked using the result from context provided. Be concise and provide all relevant information from context.

    Instructions:
    - Read the question and the context again before answering.
    - Use only the provided context to answer the question.
    - If no context is provided, explain to the user you could not find that information.
    - Do not fall back to your pre-trained knowledge.
    - Do not mention context in your answer.
    - Be as helpful as you can.
    - Do not hallucinate.
    - Make sure to provide answer from context before saying you don't know.



    Note: The context comes from an authoritative, factual source; do not doubt it.
    Note: Do not fall back to your pre-trained knowledge.
    Note: Provide the answer and do not mention that it comes from the context.

    * Be friendly in your tone.
    """),

    ("human","Question:{question}"),
    ("human","Context:{context}"),
])

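second_graphQa = GraphCypherQAChain.from_llm(
    llm=llmOllama,  # generates the Cypher, since no separate cypher_llm is given
    qa_llm=llm,     # writes the final answer from the query results
    graph=graph,
    verbose=True,
    cypher_prompt=graph_prompt,
    qa_prompt=qa_prompt,
    return_intermediate_steps=True,
)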

In [None]:
result = second_graphQa.invoke({"query": "Which companies are there, and who works there?"})
print(result["result"])



> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (p:Person)-[:WORKS_AT]->(c:WorkPlaces) RETURN c.name AS Companies, collect(p.name) AS Employees
Full Context:
[{'Companies': 'Innovatech', 'Employees': ['Tom Wilson', 'Raj Patel']}, {'Companies': 'University', 'Employees': ['Maria Lopez']}]

> Finished chain.
Companies: Innovatech, University
Employees at Innovatech: Tom Wilson, Raj Patel
Employee at University: Maria Lopez
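
Because the chain was built with `return_intermediate_steps=True`, the returned dictionary should also carry the generated Cypher and the raw query context, which helps with debugging when `verbose` is off. The sketch below assumes the result holds a list under `"intermediate_steps"` with a `{"query": ...}` entry followed by a `{"context": ...}` entry; that layout is an assumption worth checking against the installed LangChain version. `summarize_steps` is a hypothetical helper, and the sample dictionary only stands in for a real chain result.

```python
def summarize_steps(chain_result: dict) -> dict:
    """Merge the per-step dictionaries into one {'query': ..., 'context': ...} dict."""
    merged = {}
    for step in chain_result.get("intermediate_steps", []):
        merged.update(step)
    return {"query": merged.get("query"), "context": merged.get("context")}


# Stand-in for the dictionary returned by second_graphQa.invoke(...); assumed layout.
sample_result = {
    "result": "(answer text)",
    "intermediate_steps": [
        {"query": "MATCH (p:Person)-[:WORKS_AT]->(c:WorkPlaces) RETURN c.name"},
        {"context": [{"c.name": "Innovatech"}]},
    ],
}
steps = summarize_steps(sample_result)
print(steps["query"])
print(steps["context"])
```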
