<a href="https://colab.research.google.com/github/SwapnaKasula/GenerativeAI/blob/master/Knowledge_Graph/KnowledgeGraph_DevelopingTheKGSchema.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hands-On: Knowledge Graph Schema Design


## Introduction

In this session we will expand the Movie graph schema with new elements, based on a set of competency questions.

## Competency questions


* Q1: Which movies are comedies?
* Q2: Are there any science fiction movies that are also horror movies?

## Question 1: What information do we need to add to the schema?


Answer:

The competency questions introduce three new terms that characterize a movie's genre: "comedies", "science fiction" and "horror". Thus, the KG schema must allow the representation of movie genres.

## Question 2: What KG elements can we define to model this information?

Hint 1: The elements we have at our disposal are Classes, Individuals, Relations and Attributes. Try not to think in terms of Neo4j-specific elements yet.

Hint 2: There might be more than one way to model this information.



Answer:

We have 3 main options to model the genre information:

* Option 1: Consider "Comedy", "Science Fiction" and "Horror" as string values of an attribute "genre" of movies.

* Option 2: Model "Comedy", "Science Fiction" and "Horror" as subclasses of the class "Movie", and make each movie an instance of its corresponding subclasses.

* Option 3: Model "Comedy", "Science Fiction" and "Horror" as entities of the class "Genre" and define a relation "HAS_GENRE" that links movie entities with genre entities.

Which one to choose?

* With option 1 we limit our flexibility to define further knowledge about genres (e.g. what characteristics they have, how they are related to each other).

* Option 2 can be risky if we are not careful. By making "Comedy" a subclass of "Movie" we imply that every entity that is an instance of "Comedy" is also an instance of "Movie". That is only correct if we call the subclass "ComedyMovie" instead of "Comedy", but then genres can only be used for movies. Moreover, with option 2 the knowledge about genres becomes implicit: a question like "What are all the movie genres?" can only be answered if every subclass of "Movie" is a genre.

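* Option 3 is the cleanest and most flexible option here: genres become explicit entities that can carry their own attributes and relations, and they can be listed directly.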
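
To make the trade-off concrete, here is a minimal, database-agnostic sketch of Option 1 versus Option 3 in plain Python. The names `Genre`, `Movie` and `movies_with_genres`, the second movie title and the "Western" genre are illustrative assumptions, not part of the Movie graph.

```python
from dataclasses import dataclass, field

# Option 1: genres are just string values of a "genre" attribute.
# A genre that no movie uses yet cannot be represented at all.
movies_option1 = [
    {"title": "The Birdcage", "genre": ["Comedy"]},
    {"title": "Hypothetical Space Horror", "genre": ["Science Fiction", "Horror"]},
]
option1_genre_names = sorted({g for m in movies_option1 for g in m["genre"]})


# Option 3: genres are entities of a class "Genre"; movies point to them
# through a HAS_GENRE relation (modelled here as a set of Genre objects).
@dataclass(frozen=True)
class Genre:
    name: str


@dataclass
class Movie:
    title: str
    has_genre: set = field(default_factory=set)


comedy = Genre("Comedy")
scifi = Genre("Science Fiction")
horror = Genre("Horror")
western = Genre("Western")  # illustrative genre with no movies yet
genres = {comedy, scifi, horror, western}

movies = [
    Movie("The Birdcage", {comedy}),
    Movie("Hypothetical Space Horror", {scifi, horror}),
]


def movies_with_genres(movies, *required):
    """Return the titles of the movies linked to every genre in `required`."""
    return sorted(m.title for m in movies if set(required) <= m.has_genre)


q1_answer = movies_with_genres(movies, comedy)         # Q1: comedies
q2_answer = movies_with_genres(movies, scifi, horror)  # Q2: sci-fi AND horror
all_genre_names = sorted(g.name for g in genres)       # "What are all the movie genres?"
```

In this sketch, Option 3 can list "Western" as a genre even though no movie has it, whereas Option 1 only knows the genre strings that happen to appear on some movie.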

## Question 3: What information do we need to specify for each element, based on question 2?





Answer:

Assuming we have selected option 3, we need to define the semantics of the class "Genre" and the relation "HAS_GENRE". For that we need to answer three questions:

* What characteristics and criteria does an entity need to satisfy in order to be a "genre"? E.g. is "anime" a genre or an art form?

* What characteristics and criteria make any two genres different? E.g. Is "thriller" the same as "horror"?

* What characteristics and criteria does a movie need to satisfy in order to belong to a particular genre? E.g., Should we consider a movie that contains jokes but does not make us laugh a comedy?

## Question 4: How can we represent the new elements in Neo4j?

### Let's represent the new elements

In [1]:
# Represent the "Genre" class as a label; "Comedy" is a node carrying that label:

create_comedy_genre_entity_query = """CREATE (g:Genre {name:"Comedy"});"""


# Link movies to their genres via the relation "HAS_GENRE"

link_movie_to_genre_query = """MATCH (m:Movie {title:"The Birdcage"}), (g:Genre {name:"Comedy"}) CREATE (m)-[:HAS_GENRE]->(g);"""
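
With genres stored as `Genre` nodes, the competency questions translate into simple pattern matches. The queries below are a sketch that assumes the `HAS_GENRE` relation and the `name` and `title` properties used above; the Python variable names are illustrative.

```python
# Q1: Which movies are comedies?
comedy_movies_query = """MATCH (m:Movie)-[:HAS_GENRE]->(:Genre {name:"Comedy"}) RETURN m.title AS title;"""

# Q2: Are there any science fiction movies that are also horror movies?
# The same movie node m must be linked to both genre nodes.
scifi_horror_movies_query = """MATCH (:Genre {name:"Science Fiction"})<-[:HAS_GENRE]-(m:Movie)-[:HAS_GENRE]->(:Genre {name:"Horror"}) RETURN m.title AS title;"""

# Extra question that this representation supports: What are all the movie genres?
all_genres_query = """MATCH (g:Genre) RETURN g.name AS name ORDER BY g.name;"""
```

If the Q2 query returns no rows, the answer to the competency question is "no".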


### Question 4.1: Is there an alternative representation?



Answer: Yes. We can represent the "Genre" class as a node instead of a label.

In [None]:
# Represent "Genre" class as a node:

create_genre_node_query = """CREATE (g:Class {name:"Genre"});"""

create_comedy_node_query = """CREATE (c:Entity {name:"Comedy"});"""

make_comedy_instance_of_genre_query = """MATCH (c:Entity {name:"Comedy"}), (g:Class {name:"Genre"}) CREATE (c)-[:INSTANCE_OF]->(g);"""

# Link movies to their genres via the relation "HAS_GENRE"

link_movie_to_genre_query = """MATCH (m:Movie {title:"The Birdcage"}), (g:Entity {name:"Comedy"}) CREATE (m)-[:HAS_GENRE]->(g);"""
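
Under this alternative representation, being a genre is expressed by an `INSTANCE_OF` relation rather than by a label, so queries have to traverse one more hop. The sketch below shows Q1 and a "list all genres" query in that style, plus a small hypothetical helper, `run_queries`, that executes a list of queries through any driver object exposing the `session()` and `run()` interface of the official `neo4j` Python driver. The connection URI and credentials in the commented usage are placeholders.

```python
# Q1 when "Genre" is a node: follow INSTANCE_OF to confirm the target is a genre.
comedy_movies_via_class_node_query = """MATCH (m:Movie)-[:HAS_GENRE]->(e:Entity {name:"Comedy"})-[:INSTANCE_OF]->(:Class {name:"Genre"}) RETURN m.title AS title;"""

# All genres are the entities that are instances of the "Genre" class node.
all_genres_via_class_node_query = """MATCH (e:Entity)-[:INSTANCE_OF]->(:Class {name:"Genre"}) RETURN e.name AS name;"""


def run_queries(driver, queries):
    """Run each Cypher query in turn and return its rows as a list of dicts."""
    results = []
    with driver.session() as session:
        for query in queries:
            results.append([record.data() for record in session.run(query)])
    return results


# Usage with the official driver (assumes `pip install neo4j` and a running database):
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>"))
#   rows = run_queries(driver, [comedy_movies_via_class_node_query,
#                               all_genres_via_class_node_query])
#   driver.close()
```

The extra hop is the cost of this representation; in exchange, classes such as "Genre" become data that can themselves be queried and described.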



