## What is a Knowledge Graph

A knowledge graph is a way to store and organize information in a form that highlights the relationships between different pieces of data. It consists mainly of two components: nodes and relationships.

### Nodes

Nodes represent entities or concepts in a knowledge graph. An entity can be anything with a distinct existence, such as a person, place, object, or idea. For example, in a knowledge graph about a book series, nodes could represent characters, locations, events, or items mentioned in the books.

### Relationships

Relationships connect the nodes and describe how they are related to each other. These connections give the graph its meaning by providing context about how entities relate. For instance, in a knowledge graph about a book series, relationships could capture friendships between characters, locations of events, or ownership of items.

Each relationship in a knowledge graph has a direction (it points from one node to another) and a type. For example, a relationship might be typed "FRIEND_OF" to indicate a friendship between two characters or "LOCATED_IN" to show that an event takes place in a specific location.

Together, nodes and relationships in a knowledge graph create a network of interconnected information that can be queried and analyzed to uncover insights, relationships, and patterns that might not be evident from the data when viewed in isolation. This structure is particularly useful for complex domains where understanding the connections between pieces of information is crucial.
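
To make the idea concrete, here is a minimal in-memory sketch of nodes and directed, typed relationships. It is only an illustration, not how a graph database stores data; the class names `Node` and `Relationship`, the helper `outgoing`, and the sample entities are all assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # kind of entity, e.g. "Person" or "School"
    name: str


@dataclass(frozen=True)
class Relationship:
    start: Node    # the relationship points from `start` ...
    rel_type: str  # ... with a type such as "FRIEND_OF" ...
    end: Node      # ... to `end`


# Sample entities (illustrative data only)
harry = Node("Person", "Harry Potter")
ron = Node("Person", "Ron Weasley")
hogwarts = Node("School", "Hogwarts")

relationships = [
    Relationship(harry, "FRIEND_OF", ron),
    Relationship(harry, "STUDIES_AT", hogwarts),
    Relationship(ron, "STUDIES_AT", hogwarts),
]


def outgoing(node, rel_type):
    """Return nodes reached from `node` by following `rel_type` in its stored direction."""
    return [r.end for r in relationships
            if r.start == node and r.rel_type == rel_type]


print([n.name for n in outgoing(harry, "FRIEND_OF")])  # ['Ron Weasley']
print([n.name for n in outgoing(ron, "FRIEND_OF")])    # [] because the edge points from Harry to Ron
```

Because each relationship is stored with a direction, following `FRIEND_OF` from Ron finds nothing here, even though Harry's relationship points at him.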

## Why we need a knowledge graph

A knowledge graph is a database that uses a graph-structured data model to store interlinked descriptions of entities (objects, events, situations, or concepts) along with their properties and relationships. It represents knowledge in a form that both machines and people can read, which enables complex queries and inferences about the data. Knowledge graphs are used in applications such as semantic search, recommendation systems, and artificial intelligence, where connected, contextual data supports better decisions and insights. Google's Knowledge Graph, for instance, helps improve search results by understanding the context and relationships between different entities and facts.

Imagine you have a giant wall in your room where you stick notes about everything you know, like your friends, your favorite games, and places you love. Now, for each note, you draw lines to other notes that are connected in some way, like linking your friend's note to the ice cream shop you both love or to the game you play together.

A knowledge graph is like this giant wall of notes, but it's for a computer. Instead of notes, there are bits of information about all sorts of things in the world, and the lines are like invisible strings that tie related bits together. This helps the computer understand how everything is connected, just like you understand how your friends, games, and favorite places are related. So, when you ask the computer a question, it looks at this giant wall of information to give you a really good answer!

## Knowledge graph vs relational database

Let's consider a simple example involving a library system to illustrate how a knowledge graph might offer advantages over a traditional relational database in certain scenarios.

### Scenario: Library System

Imagine you're managing a library system that includes information about books, authors, genres, and readers. The system is used to track which books are checked out by which readers and to provide book recommendations based on reader preferences and reading history.

### Relational Database Approach

In a relational database, you might have tables for Books, Authors, Genres, and Readers. Relationships between these entities would be managed through foreign keys and join tables. For example, a Books table might have foreign keys to Authors and Genres, and a separate join table might be needed to manage the many-to-many relationships between Readers and the Books they've checked out.

To recommend a book to a reader based on their reading history and preferred genres, you would need to perform multiple joins. For instance, you'd join Readers to Books to find out what they've previously read, then Books to Genres to filter books by the reader's preferred genres, and perhaps again to Authors if the reader prefers books by certain authors.
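
The chain of joins described above can be sketched with Python's built-in `sqlite3` module. The schema, table names, and sample rows are invented for illustration, and authors are left out to keep the example short.

```python
import sqlite3

# In-memory database with a hypothetical, simplified library schema
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genres    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books     (id INTEGER PRIMARY KEY, title TEXT,
                        genre_id INTEGER REFERENCES genres(id));
CREATE TABLE readers   (id INTEGER PRIMARY KEY, name TEXT);
-- Join table for the many-to-many relationship between readers and books
CREATE TABLE checkouts (reader_id INTEGER REFERENCES readers(id),
                        book_id   INTEGER REFERENCES books(id));

INSERT INTO genres    VALUES (1, 'Fantasy'), (2, 'Mystery');
INSERT INTO books     VALUES (1, 'Book A', 1), (2, 'Book B', 1), (3, 'Book C', 2);
INSERT INTO readers   VALUES (1, 'Alice');
INSERT INTO checkouts VALUES (1, 1);
""")

# Reader -> checkouts -> books already read -> other books in the same genre
recommend_sql = """
SELECT DISTINCT candidate.title, genres.name
FROM readers
JOIN checkouts          ON checkouts.reader_id = readers.id
JOIN books AS read_book ON read_book.id = checkouts.book_id
JOIN books AS candidate ON candidate.genre_id = read_book.genre_id
JOIN genres             ON genres.id = candidate.genre_id
WHERE readers.name = ?
  AND candidate.id NOT IN (
      SELECT book_id FROM checkouts WHERE reader_id = readers.id
  )
ORDER BY candidate.title
"""
recommendations = conn.execute(recommend_sql, ("Alice",)).fetchall()
print(recommendations)  # [('Book B', 'Fantasy')]
```

Even with only genres involved, the query needs four joins and a subquery; adding preferred authors would add more.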

### Knowledge Graph Approach

In a knowledge graph, entities (Books, Authors, Genres, Readers) and their relationships ("written by," "belongs to genre," "checked out by") are directly represented as nodes and edges. This direct representation makes it easier to query complex relationships.

For the same recommendation task, the knowledge graph allows you to more intuitively follow the connections between nodes. You can start at a Reader node, follow edges to the Books they've checked out, continue to the Genres of those books, and then find other Books within those Genres that are written by favored Authors. This path through the graph can be more direct and semantically clear compared to constructing a complex SQL query with multiple joins.
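
The traversal described above can be sketched in plain Python over a list of edges. This is a toy model rather than a graph database; the relationship type names, the helper functions, and the sample data are assumptions made for this example, and author preferences are omitted.

```python
# Each edge is a (start, relationship type, end) triple; data is illustrative only
edges = [
    ("Alice", "CHECKED_OUT", "Book A"),
    ("Book A", "BELONGS_TO_GENRE", "Fantasy"),
    ("Book B", "BELONGS_TO_GENRE", "Fantasy"),
    ("Book C", "BELONGS_TO_GENRE", "Mystery"),
]


def follow(start, rel_type):
    """Nodes reached by following `rel_type` edges out of `start`."""
    return {end for s, t, end in edges if s == start and t == rel_type}


def follow_back(end, rel_type):
    """Nodes that have a `rel_type` edge pointing at `end`."""
    return {s for s, t, e in edges if e == end and t == rel_type}


def recommend(reader):
    # Reader -> books they checked out
    read = follow(reader, "CHECKED_OUT")
    # Those books -> their genres
    genres = {g for book in read for g in follow(book, "BELONGS_TO_GENRE")}
    # Genres -> every book in them, minus what was already read
    candidates = {b for g in genres for b in follow_back(g, "BELONGS_TO_GENRE")}
    return sorted(candidates - read)


print(recommend("Alice"))  # ['Book B']
```

Each step in `recommend` corresponds to one hop along an edge, which is what makes the graph version read like the description of the task.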

## Create account 

https://neo4j.com/cloud/platform/aura-graph-database/

Sign up for a free AuraDB account. AuraDB is Neo4j's fully managed graph database service.

Save the connection details shown in the console (URI, username, and password); the code below reads them from environment variables.

### Install libraries

In [1]:
# !pip install neo4j

### Set up Neo4j related variables

In [1]:
import os

neo4j_uri = os.getenv("neo4j_uri")
neo4j_username = os.getenv("neo4j_username")
neo4j_password = os.getenv("neo4j_password")
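
`os.getenv` returns `None` for a variable that is not set, so a missing value would only surface later as a connection error. A minimal sketch of an up-front check follows. `load_neo4j_settings` and `REQUIRED_VARS` are hypothetical names that are not part of the neo4j library, and the values in the usage lines are placeholders.

```python
import os

# Environment variable names used in this notebook
REQUIRED_VARS = ("neo4j_uri", "neo4j_username", "neo4j_password")


def load_neo4j_settings(env=None):
    """Return the connection settings, or raise if any variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Usage with placeholder values; call load_neo4j_settings() to read the real environment
settings = load_neo4j_settings({
    "neo4j_uri": "neo4j+s://example.invalid",
    "neo4j_username": "neo4j",
    "neo4j_password": "placeholder",
})
print(sorted(settings))
```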

In [3]:
from neo4j import GraphDatabase

neo4j_auth = (neo4j_username, neo4j_password)

### Create Nodes and relationships

In [4]:

driver = GraphDatabase.driver(neo4j_uri, auth=neo4j_auth)

def create_graph(tx):
    # Create nodes; MERGE only creates a node if no matching one exists
    tx.run("MERGE (h:Person {name: 'Harry Potter'}) "
           "MERGE (r:Person {name: 'Ron Weasley'}) "
           "MERGE (hg:Person {name: 'Hermoine Granger'}) "
           "MERGE (hw:School {name: 'Hogwarts'})")

    # Create relationships between nodes matched by name
    tx.run("MATCH (h:Person {name: 'Harry Potter'}), (hw:School {name: 'Hogwarts'}) "
           "MERGE (h)-[:STUDIES_AT]->(hw)")
    tx.run("MATCH (h:Person {name: 'Harry Potter'}), (r:Person {name: 'Ron Weasley'}) "
           "MERGE (h)-[:FRIEND_OF]->(r)")
    tx.run("MATCH (r:Person {name: 'Ron Weasley'}), (hw:School {name: 'Hogwarts'}) "
           "MERGE (r)-[:STUDIES_AT]->(hw)")
    tx.run("MATCH (hg:Person {name: 'Hermoine Granger'}), (r:Person {name: 'Ron Weasley'}) "
           "MERGE (hg)-[:FRIEND_OF]->(r)")

# Run the function in a managed write transaction.
# execute_write replaces write_transaction, which is deprecated in the 5.x driver.
with driver.session() as session:
    session.execute_write(create_graph)

# The driver is left open here because the query cells below reuse it.



### Query the graph database 

In [5]:
records, summary, keys = driver.execute_query(
    "MATCH (p:Person ) RETURN p.name AS name",
    database_="neo4j",
)

for person in records:
    print(person)

print("The query `{query}` returned {records_count} records in {time} ms.".format(
    query=summary.query, records_count=len(records),
    time=summary.result_available_after,
))



<Record name='Harry Potter'>
<Record name='Ron Weasley'>
<Record name='Hermoine Granger'>
The query `MATCH (p:Person ) RETURN p.name AS name` returned 3 records in 29 ms.
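
When a value such as a name comes from user input, it is usually safer to pass it as a query parameter than to paste it into the Cypher string. The sketch below assumes a recent 5.x driver, where keyword arguments to `execute_query` that do not end in an underscore are sent as query parameters. The function name `find_people_at` is hypothetical.

```python
def find_people_at(driver, school_name, database="neo4j"):
    """Return the names of people who study at the given school.

    `driver` is expected to be an open neo4j driver, such as the one created above.
    """
    records, summary, keys = driver.execute_query(
        "MATCH (p:Person)-[:STUDIES_AT]->(:School {name: $school}) "
        "RETURN p.name AS name ORDER BY name",
        school=school_name,   # bound to $school in the query
        database_=database,   # trailing underscore marks a driver setting
    )
    return [record["name"] for record in records]
```

With the graph created above, `find_people_at(driver, "Hogwarts")` would be expected to return the students connected to Hogwarts by a `STUDIES_AT` relationship.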


### Advanced querying: finding potential friends

This kind of query is useful in recommendation systems such as those used by Facebook and Netflix.

In [7]:
query = """
    MATCH (student:Person)-[:STUDIES_AT]->(:School {name: 'Hogwarts'}),
          (Hermoine:Person {name: 'Hermoine Granger'})
    WHERE NOT (Hermoine)-[:FRIEND_OF]-(student) AND Hermoine <> student
    RETURN student.name AS PotentialFriend
    """
records, summary, keys = driver.execute_query(
    query,
    database_="neo4j",
)

# Loop through results and do something with them
for person in records:
    print(person)

# Summary information
print("The query `{query}` returned {records_count} records in {time} ms.".format(
    query=summary.query, records_count=len(records),
    time=summary.result_available_after,
))



<Record PotentialFriend='Harry Potter'>
The query `
    MATCH (student:Person)-[:STUDIES_AT]->(:School {name: 'Hogwarts'}),
          (Hermoine:Person {name: 'Hermoine Granger'})
    WHERE NOT (Hermoine)-[:FRIEND_OF]-(student) AND Hermoine <> student
    RETURN student.name AS PotentialFriend
    ` returned 1 records in 182 ms.
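
To see why only Harry is returned, the same filter can be reproduced in plain Python over the relationships created earlier. This is a sketch of the query's logic, not of how Neo4j evaluates it. The function `potential_friends` is a hypothetical helper, and names are spelled exactly as they are stored in the graph.

```python
# The relationships created earlier, as (start, end) pairs
studies_at = {("Harry Potter", "Hogwarts"), ("Ron Weasley", "Hogwarts")}
friend_of = {("Harry Potter", "Ron Weasley"), ("Hermoine Granger", "Ron Weasley")}


def potential_friends(person, school):
    # Everyone with a STUDIES_AT relationship to the school
    students = {s for s, sch in studies_at if sch == school}
    # The pattern -[:FRIEND_OF]- has no arrow, so it matches either direction
    friends = ({b for a, b in friend_of if a == person}
               | {a for a, b in friend_of if b == person})
    # Exclude existing friends and the person themselves
    return sorted(students - friends - {person})


print(potential_friends("Hermoine Granger", "Hogwarts"))  # ['Harry Potter']
```

Ron is excluded because a `FRIEND_OF` relationship already connects him to Hermoine, which leaves Harry as the only student she is not yet friends with.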


## Visualize from the console 

https://workspace-preview.neo4j.io/workspace/query

Query 1 - Find School and students

MATCH p=()-[:STUDIES_AT]->() RETURN p LIMIT 25;

![school_nodes](school_nodes.png)

Query 2 - Find person and friend

MATCH p=()-[:FRIEND_OF]->() RETURN p LIMIT 25;

![person_nodes](person_nodes.png)

Query 3 - All nodes and relationships

MATCH (n)
OPTIONAL MATCH (n)-[r]->(m)
RETURN n, r, m

![all_nodes](all_nodes.png)

### To delete all nodes - Proceed with caution 

In [8]:
driver = GraphDatabase.driver(neo4j_uri, auth=neo4j_auth)

def delete_graph(tx):
    # DETACH DELETE removes every node together with its relationships
    tx.run("MATCH (n) DETACH DELETE n")

# Run the deletion in a managed write transaction.
# execute_write replaces write_transaction, which is deprecated in the 5.x driver.
with driver.session() as session:
    session.execute_write(delete_graph)

driver.close()  # Close the driver connection when done

