## What Is Graph Theory?

In mathematics and computer science, graph theory is the study of graphs: structures that model the relationships between vertices through the edges that connect them. It has applications in computer science, information technology, the biosciences, mathematics, and linguistics, to name a few.

A graph is a collection of vertices and edges or, put another way, a set of nodes and the relationships that connect them. Graphs represent entities as nodes and the ways in which those entities relate to the world as relationships.

![ANN](https://image.slidesharecdn.com/cdlconference-whatisagraphdatabasev4-171117144441/95/lju-lazarevic-3-638.jpg?cb=1510930570)

## The Labeled Property Graph Model

A labeled property graph has the following characteristics:

&nbsp; &nbsp; 1. It contains nodes and relationships.

&nbsp; &nbsp; 2. Nodes contain properties (key-value pairs).

&nbsp; &nbsp; 3. Nodes can be labeled with one or more labels.

&nbsp; &nbsp; 4. Relationships are named and directed, and always have a start and end node.

&nbsp; &nbsp; 5. Relationships can also contain properties.

![ANN](https://s3.amazonaws.com/dev.assets.neo4j.com/wp-content/uploads/property_graph_model.png)

## Graph Databases

A graph database management system (henceforth, a graph database) is an online database management system with Create, Read, Update, and Delete (CRUD) methods that expose a graph data model. Graph databases are generally built for use with transactional (OLTP) systems. Accordingly, they are normally optimized for transactional performance, and engineered with transactional integrity and operational availability in mind.

There are two properties of graph databases we should consider when investigating graph database technologies:

&nbsp; &nbsp; 1. The underlying storage

Some graph databases use native graph storage that is optimized and designed for storing and managing graphs. Not all graph database technologies use native graph storage, however. Some serialize the graph data into a relational database, an object-oriented database, or some other general-purpose data store.

&nbsp; &nbsp; 2. The processing engine

Some definitions require that a graph database use index-free adjacency, meaning that connected nodes physically “point” to each other in the database. Here we take a slightly broader view: any database that from the user’s perspective behaves like a graph database (i.e., exposes a graph data model through CRUD operations) qualifies as a graph database. We do acknowledge, however, the significant performance advantages of index-free adjacency, and therefore use the term native graph processing to describe graph databases that leverage index-free adjacency.
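
To make index-free adjacency concrete, here is a minimal Python sketch. It is only an in-memory analogy, not a description of how any particular graph database stores data, and the `Node` class and `names_within_two_hops` function are hypothetical names chosen for illustration. Each node keeps direct references to its neighbours, so a traversal hops from object to object without consulting a global lookup table.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct object references, no lookup table

    def connect(self, other):
        """Add a directed link from this node to `other`."""
        self.neighbours.append(other)


def names_within_two_hops(start):
    """Follow references from node to node; no global index is consulted."""
    found = []
    for first in start.neighbours:
        found.append(first.name)
        for second in first.neighbours:
            found.append(second.name)
    return found


# Hypothetical sample data.
alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)
```

The cost of each hop depends only on the number of neighbours of the current node, not on the total size of the graph, which is the intuition behind the performance advantage mentioned above.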

## Labels in the Graph

Often we want to categorize the nodes in our networks according to the roles they play. Some nodes, for example, might represent users, whereas others represent orders or products. In Neo4j, we use labels to represent the roles a node plays in the
graph. Because a node can fulfill several different roles in a graph, Neo4j allows us to add more than one label to a node.

Using labels in this way, we can group nodes. We can ask the database, for example, to find all the nodes labeled User. (Labels also provide a hook for declaratively indexing nodes, as we shall see later.) We use labels extensively in the examples in the rest of this book. Where a node represents a user, we’ve added a User label; where it represents an order we’ve added an Order label, and so on. 
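
As a rough illustration of grouping by label, the following Python sketch keeps a mapping from each label to the nodes that carry it. The names `label_index`, `add_node`, and `nodes_with_label` are hypothetical, and the structure is only an in-memory stand-in, not Neo4j's actual index.

```python
from collections import defaultdict

# label -> set of node ids; a hypothetical in-memory stand-in for a label index
label_index = defaultdict(set)


def add_node(node_id, labels):
    """Register a node under every label it carries."""
    for label in labels:
        label_index[label].add(node_id)


def nodes_with_label(label):
    """Return the ids of all nodes carrying the label, sorted for stable output."""
    return sorted(label_index.get(label, set()))


add_node(1, ["User"])
add_node(2, ["User", "Admin"])  # one node playing two roles
add_node(3, ["Order"])
```

Asking for `nodes_with_label("User")` then returns both user nodes, including the one that is also an `Admin`.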

## The Labeled Property Graph Model

&nbsp; &nbsp; • A labeled property graph is made up of nodes, relationships, properties, and labels.

&nbsp; &nbsp; • Nodes contain properties. Think of nodes as documents that store properties in the form of arbitrary key-value pairs. In Neo4j, the keys are strings and the values are the Java string and primitive data types, plus arrays of these types.

&nbsp; &nbsp; • Nodes can be tagged with one or more labels. Labels group nodes together, and indicate the roles they play within the dataset.

&nbsp; &nbsp; • Relationships connect nodes and structure the graph. A relationship always has a direction, a single name, and a start node and an end node—there are no dangling relationships. Together, a relationship’s direction and name add semantic clarity to the structuring of nodes.

&nbsp; &nbsp; • Like nodes, relationships can also have properties. The ability to add properties to relationships is particularly useful for providing additional metadata for graph algorithms, adding additional semantics to relationships (including quality and weight), and for constraining queries at runtime.
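
The model described above can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the class names `Node` and `Relationship` and the sample data are hypothetical, and property values are not restricted to the types Neo4j allows.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A node: zero or more labels plus arbitrary key-value properties."""
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A relationship: a single name, a direction (start -> end), and properties."""
    type: str
    start: Node  # required, so a relationship can never dangle
    end: Node
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data in the spirit of the movie examples.
tom = Node(labels={"Person", "Actor"}, properties={"name": "Tom Hanks", "born": 1956})
movie = Node(labels={"Movie"}, properties={"title": "Cast Away"})
acted_in = Relationship("ACTED_IN", start=tom, end=movie,
                        properties={"roles": ["Chuck Noland"]})
```

Because `start` and `end` are required fields, constructing a `Relationship` without both nodes raises an error, which mirrors the "no dangling relationships" rule.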

## Neo4j Features

&nbsp; &nbsp; • It provides Neo4j CQL (Cypher Query Language), an easy, SQL-like query language.

&nbsp; &nbsp; • It follows the property graph data model.

&nbsp; &nbsp; • It supports indexes by using Apache Lucene.

&nbsp; &nbsp; • It supports UNIQUE constraints.

&nbsp; &nbsp; • It includes a UI for executing CQL commands: the Neo4j Data Browser.

&nbsp; &nbsp; • It supports full ACID (Atomicity, Consistency, Isolation, and Durability) rules.

&nbsp; &nbsp; • It uses native graph storage with a native GPE (Graph Processing Engine).

&nbsp; &nbsp; • It supports exporting query data to JSON and XLS formats.

&nbsp; &nbsp; • It provides a REST API that can be accessed from any programming language or framework, such as Java, Spring, or Scala.

&nbsp; &nbsp; • It provides JavaScript access for use from UI MVC frameworks such as Node.js.

&nbsp; &nbsp; • It supports two kinds of Java API for developing Java applications: the Cypher API and the native Java API.

## Neo4j Advantages

&nbsp; &nbsp; • It is very easy to represent connected data.

&nbsp; &nbsp; • Retrieving, traversing, and navigating highly connected data is easy and fast.

&nbsp; &nbsp; • It represents semi-structured data very easily.

&nbsp; &nbsp; • Neo4j CQL commands are human-readable and very easy to learn.

&nbsp; &nbsp; • It uses a simple and powerful data model.

&nbsp; &nbsp; • It does NOT require complex joins to retrieve connected or related data, because a node's adjacent nodes and relationship details can be retrieved without joins or indexes.

## Nodes for Things, Relationships for Structure

Though not applicable in every situation, these general guidelines will help us choose when to use nodes, and when to use relationships:

&nbsp; &nbsp; • Use nodes to represent entities—that is, the things in our domain that are of interest to us, and which can be labeled and grouped.

&nbsp; &nbsp; • Use relationships both to express the connections between entities and to establish semantic context for each entity, thereby structuring the domain.

&nbsp; &nbsp; • Use relationship direction to further clarify relationship semantics. Many relationships are asymmetrical, which is why relationships in a property graph are always directed. For bidirectional relationships, we should make our queries ignore direction, rather than using two relationships.

&nbsp; &nbsp; • Use node properties to represent entity attributes, plus any necessary entity metadata, such as timestamps, version numbers, etc.

&nbsp; &nbsp; • Use relationship properties to express the strength, weight, or quality of a relationship, plus any necessary relationship metadata, such as timestamps, version numbers, etc.
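
The guideline about bidirectional relationships can be illustrated with a small Python sketch. The relationship data and the `neighbours` function are hypothetical. Each relationship is stored once, with a direction, and the query decides whether to respect or ignore that direction.

```python
# Each relationship is stored exactly once as (start, type, end).
# The names and data here are hypothetical.
relationships = [
    ("Alice", "FRIEND_OF", "Bob"),
    ("Carol", "FRIEND_OF", "Alice"),
    ("Alice", "MANAGES", "Dave"),
]


def neighbours(person, rel_type, directed=True):
    """Return the nodes linked to `person` by `rel_type`.

    With directed=True only outgoing relationships count; with
    directed=False the stored direction is ignored at query time.
    """
    result = []
    for start, kind, end in relationships:
        if kind != rel_type:
            continue
        if start == person:
            result.append(end)
        elif not directed and end == person:
            result.append(start)
    return result
```

Friendship is symmetrical, so it is queried with `directed=False`, while `MANAGES` is asymmetrical and keeps its direction. No second, mirrored relationship has to be stored in either case.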

## Cypher

Cypher is a declarative, SQL-inspired language for describing patterns in graphs visually, using an ASCII-art syntax.

It allows us to state what we want to select, insert, update or delete from our graph data without requiring us to describe exactly how to do it.

![ANN](https://s3.amazonaws.com/dev.assets.neo4j.com/wp-content/uploads/cypher_pattern_simple.png)

## Cypher Examples on Movie Database

#### // Viewing first 5 nodes

MATCH (z) RETURN z LIMIT 5

#### // () represents a node; (anyone:Person) binds a node labeled Person to the variable anyone

MATCH (anyone:Person) RETURN anyone LIMIT 5

#### // (node1)--(node2) represents a relationship between two nodes

#### // To bind or filter the relationship, insert "[]" into the "--" operator: (node1)-[rel]-(node2)

#### // To specify the direction of the relationship, add ">" to the "-" operator: (node1)-[rel]->(node2)

#### // To filter on relationship type, specify it as [rel:TYPE]

#### // To filter nodes by label, specify it as (node1:LABEL)

#### // To match any one of several relationship types, separate them with "|": [rel:TYPE1|TYPE2]

MATCH (node1:Person)-[rel:ACTED_IN|DIRECTED]->(node2:Movie)

RETURN node1,rel,node2 

LIMIT 10
    
#### // Titles of movies whose director also acted in them

MATCH (movie:Movie)

MATCH (director:Person)-[:DIRECTED]->(movie)

MATCH (director)-[:ACTED_IN]->(movie)

RETURN movie.title, director.name

LIMIT 10
       
#### // Nested contact list (contacts of contacts)

// To exclude rows in which the same person appears twice, we add <> comparisons to the WHERE clause

MATCH (node1:Person)-[:HAS_CONTACT]->(node2:Person)

MATCH (node2)-[:HAS_CONTACT]->(node3:Person)

WHERE node1 <> node3 AND node1 <> node2 AND node2 <> node3

RETURN node1.name,node2.name,node3.name

LIMIT 1
    
#### // All contacts, along with any movie each contact directed (null if none)

MATCH (person1:Person)-[:HAS_CONTACT]->(person2:Person)

OPTIONAL MATCH (person2)-[:DIRECTED]->(movie)

RETURN person1.name, person2.name, movie.title

LIMIT 100
    
#### // All movie titles; the director.name column is filled only when the director also acted in the movie, and is null otherwise

#### // OPTIONAL MATCH works like a left outer join

MATCH (movie:Movie)

OPTIONAL MATCH (director:Person)-[:DIRECTED]->(movie)<-[:ACTED_IN]-(director)

RETURN movie.title, director.name

LIMIT 100
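
The left-outer-join behaviour can be imitated in plain Python. This is only a sketch of the semantics: the `optional_match` function and the sample data are hypothetical. Every row from the first step is kept, and the optional part is filled with `None` (the counterpart of Cypher's `null`) when nothing matches.

```python
# Hypothetical sample data: all movies, plus a mapping from a movie to the
# people who both directed and acted in it.
movies = ["Unforgiven", "Top Gun", "The Matrix"]
acted_and_directed = {"Unforgiven": ["Clint Eastwood"]}


def optional_match(movie_titles, matches):
    """Keep every movie; use None where the optional pattern finds nothing."""
    rows = []
    for title in movie_titles:
        people = matches.get(title) or [None]
        for person in people:
            rows.append((title, person))
    return rows


rows = optional_match(movies, acted_and_directed)
```

A plain `MATCH` would correspond to dropping the `or [None]` fallback, so that movies without a matching person disappear from the result.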
    
#### // Filtering Nodes - 1: Filtering in Match

MATCH (person:Person{name:'Tom Hanks',born:1956})

RETURN person

LIMIT 1
                      
#### // Filtering Nodes - 2: Where Clause

MATCH (person:Person)

WHERE person.name = 'Tom Hanks' and person.born = 1956

RETURN person

LIMIT 1
    
#### // Filtering Nodes - 3: Comparison Operators

MATCH (person:Person)

WHERE person.born >= 1956 and person.born <= 1986

RETURN person.name

LIMIT 5
    
#### // Filtering Nodes - 4: Boolean Operators

MATCH (person:Person)

WHERE (person.born in [1957,1958] or person.born >= 1986) and NOT person.born in [1957]

RETURN person.name
    
#### // Filtering Nodes - 5: Using nodes in where clause

#### // Keeping only the people related to the movie who did not direct it

MATCH (person:Person)-->(movie:Movie)

WHERE movie.title = 'Unforgiven' and NOT (person)-[:DIRECTED]->(movie)

RETURN person, movie
                                                                
#### // Filtering Nodes - 6: Regular Expressions

#### // 'The.*' matches titles starting with "The"

#### // '.*The' matches titles ending with "The"

#### // (?i) makes the match case-insensitive

MATCH (movie:Movie)

WHERE movie.title =~ '(?i).*The.*'

RETURN movie.title
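
One detail worth keeping in mind is that `=~` has to match the entire string, which is why the pattern above is wrapped in `.*` on both sides. The Python sketch below imitates this with `re.fullmatch`. The helper name `cypher_regex_match` and the sample titles are hypothetical, and Python's regex dialect differs from Java's in some details, so treat it as an approximation.

```python
import re

# Hypothetical sample titles.
titles = ["The Matrix", "Top Gun", "Joe Versus the Volcano", "Apollo 13"]


def cypher_regex_match(pattern, text):
    """Imitate `=~`: the pattern must match the whole string, not a part of it."""
    return re.fullmatch(pattern, text) is not None


starts_with_the = [t for t in titles if cypher_regex_match(r"The.*", t)]
ends_with_13 = [t for t in titles if cypher_regex_match(r".*13", t)]
contains_the_any_case = [t for t in titles if cypher_regex_match(r"(?i).*the.*", t)]
```

A bare pattern such as `The` matches nothing in this list, because no title consists of exactly those three letters.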
    
#### // Filtering Nodes - 7: Order By

MATCH (actor:Person)-[role:ACTED_IN]->(movie:Movie)

WHERE movie.title = 'Top Gun'

RETURN actor.name as Name, role.earnings as Earnings

ORDER BY role.earnings DESC

LIMIT 3    

#### // String Functions - 1: toString

RETURN toString(10),toString("String")

#### // String Functions - 2: Trim

RETURN trim("  ABC  ")

#### // String Functions - 3: Replace

RETURN replace("Hello","l","r")

#### // String Functions - 4: toUpper

RETURN toUpper("abc")

#### // Math Functions - 1: Floor

RETURN floor(2.232323)

#### // Math Functions - 2: Ceil

RETURN ceil(2.232323232)

#### // Aggregation Functions

MATCH (person:Person{name: 'Tom Hanks'})-[role:ACTED_IN]->(movie:Movie)

RETURN count(person) as Acted_Count, person.name, sum(role.earnings) as Total_Earnt, avg(role.earnings) as Avg_Earnt,
min(role.earnings) as Min_Earnt, max(role.earnings) as Max_Earnt
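
In the query above, `person.name` is the only non-aggregated expression in the `RETURN` clause, so it acts as the grouping key. The following Python sketch shows the same idea; the `summarize` function and the earnings figures are hypothetical.

```python
from statistics import mean

# Hypothetical ACTED_IN rows: (person name, earnings for one role).
roles = [
    ("Tom Hanks", 5_000_000),
    ("Tom Hanks", 8_000_000),
    ("Tom Hanks", 2_000_000),
]


def summarize(rows):
    """Group rows by person name, then aggregate the earnings of each group."""
    groups = {}
    for name, earnings in rows:
        groups.setdefault(name, []).append(earnings)
    return {
        name: {
            "count": len(values),
            "sum": sum(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in groups.items()
    }
```

Calling `summarize(roles)` produces one entry per distinct name, just as the Cypher query returns one row per distinct `person.name`.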

#### // Find all of Tom Hanks's contacts who were born in 1960 or later and have earned over 10m from a single movie

#### // Order the results by highest earnings first. Label the columns 'ContactName' and 'Born'

MATCH (person1:Person)-[:HAS_CONTACT]->(person2:Person)

MATCH (person2:Person)-[role:ACTED_IN]->(movie:Movie)

WHERE person1.name = 'Tom Hanks' and person2.born >= 1960 and role.earnings > 10000000

RETURN person2.name as ContactName,person2.born as Born,role.earnings

ORDER BY role.earnings DESC

#### // Find the actor with the highest average earnings. Round the average and display the actor's name in uppercase

MATCH (person:Person)-[role:ACTED_IN]->(movie:Movie)

RETURN round(avg(role.earnings)) AS Avg_Earnt, toUpper(person.name) AS Name

ORDER BY Avg_Earnt DESC

LIMIT 1

#### // Creating Node, Label and Property

CREATE (cat:Cat:Animal{sound:"Meow", eats:"Birds"})

RETURN cat

#### // Creating a node and adding a relationship (here, from the node to itself)

CREATE (cat:Cat{name:"Sutlac"})-[:GROOMS{period: 'Daily'}]->(cat)

#### // Adding the movie The Hateful Eight to the database with its director, Quentin Tarantino

// First check whether the movie already exists

// MATCH (movie:Movie)

// WHERE movie.title =~ '(?i)the hateful eight'

// RETURN movie

CREATE (movie:Movie{title: 'The Hateful Eight'}),
(quentin:Person{name: 'Quentin Tarantino'}),
(quentin)-[:DIRECTED]->(movie)
RETURN quentin,movie

#### // Adding Zoe Bell as Quentin Tarantino's contact, who acted in the movie and earned $1m

// First check whether the person already exists

// MATCH (person:Person)

// WHERE person.name =~ '(?i).*zoe bell.*'

// RETURN person

// MATCH comes before CREATE so that everything runs as a single statement

MATCH (quentin:Person{name: 'Quentin Tarantino'}), (movie:Movie{title:'The Hateful Eight'})

CREATE (zoe_bell:Person{name: 'Zoe Bell', born: 1978}),
(quentin)-[:HAS_CONTACT]->(zoe_bell), (zoe_bell)-[:ACTED_IN{earnings:1000000}]->(movie)

RETURN zoe_bell,quentin,movie

#### // Deleting nodes: a node can only be deleted once its relationships are deleted.

#### // This deletes every relationship and every node that has at least one relationship; nodes without relationships are not matched and remain

MATCH (node)-[rel]-()

DELETE node,rel

#### // To delete all nodes together with their relationships, use DETACH DELETE

MATCH (node)

DETACH DELETE node

#### // Remove Tom Hanks as a contact from all actors. We need to delete just the relationship from others to Tom Hanks

// This only deletes the relationship. So Tom Hanks data is not deleted.

MATCH (tom:Person{name:'Tom Hanks'}), (other)-[rel:HAS_CONTACT]->(tom)

DELETE rel

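#### // Removing the movie The Da Vinci Code. Its actors and directors stay.

#### // Solution 1: delete the movie together with its incoming relationships (matches nothing if the movie has no incoming relationships)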

MATCH (movie:Movie{title:'The Da Vinci Code'}), (node)-[rel]->(movie)

DELETE movie,rel

#### // Solution 2

MATCH (movie:Movie{title:'The Da Vinci Code'})

DETACH DELETE movie
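
The difference between the two solutions can be modelled with a tiny in-memory graph. The `Graph` class below is a hypothetical sketch, not Neo4j's implementation: a plain delete refuses to remove a node that still has relationships, whereas a detach delete removes those relationships first.

```python
class Graph:
    """A tiny in-memory graph; a hypothetical model, not Neo4j's storage."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()  # (start, type, end) tuples

    def delete(self, node):
        """Plain delete: refuse while any relationship still touches the node."""
        if any(node in (start, end) for start, _, end in self.relationships):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        """Remove the node's relationships first, then the node itself."""
        self.relationships = {
            rel for rel in self.relationships if node not in (rel[0], rel[2])
        }
        self.nodes.discard(node)


# Hypothetical sample data.
g = Graph()
g.nodes |= {"Tom Hanks", "Ron Howard", "The Da Vinci Code"}
g.relationships |= {
    ("Tom Hanks", "ACTED_IN", "The Da Vinci Code"),
    ("Ron Howard", "DIRECTED", "The Da Vinci Code"),
}
```

Calling `g.delete("The Da Vinci Code")` raises an error here, while `g.detach_delete("The Da Vinci Code")` removes the movie and both relationships and leaves the two people in place.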

#### // Adding properties to existing nodes

MATCH (tom:Person{name:'Tom Hanks'})

SET tom.sex = 'male'

RETURN tom

#### // Removing property from a node

MATCH (tom:Person{name:'Tom Hanks'})

REMOVE tom.sex

RETURN tom

#### // Removing label from a node

MATCH (tom:Person{name:'Tom Hanks'})

REMOVE tom:Handsome

RETURN tom

#### // Setting a computed property, using WITH to aggregate (similar to GROUP BY)

MATCH (tom:Person{name: 'Tom Hanks'})-[:HAS_CONTACT]->(contact)

WITH tom, count(contact) AS num_of_contacts

SET tom.num_of_contacts = num_of_contacts

RETURN tom

#### // MERGE searches for the given pattern. If it does not exist, MERGE creates it; otherwise it matches the existing data and creates nothing.

MERGE (movie:Movie{title: 'Pride and Prejudice and Zombies'})

MERGE (lily:Person{name: 'Lily James'})

MERGE (lily)-[role:ACTED_IN]->(movie)

SET lily.born = 1989

SET role.earnings = 900000, role.roles = ['Elizabeth']

RETURN lily, movie
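
The "match it or create it" behaviour resembles a get-or-create helper. The Python sketch below is a loose analogy with hypothetical names (`nodes`, `merge_node`); it ignores relationships and concurrency entirely.

```python
# A hypothetical in-memory store keyed by (label, identifying property value).
nodes = {}


def merge_node(label, key, value):
    """Return the existing node if one matches; otherwise create it first."""
    identity = (label, value)
    created = identity not in nodes
    if created:
        nodes[identity] = {"label": label, key: value}
    return nodes[identity], created


lily, created_first = merge_node("Person", "name", "Lily James")
lily["born"] = 1989  # like SET after MERGE: applied whether or not the node is new
same_lily, created_second = merge_node("Person", "name", "Lily James")
```

The second call finds the node created by the first, so running the same merge twice does not produce a duplicate.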

#### // Nth-degree relationships: *n after the relationship type matches paths of exactly n hops (here n = 1)

MATCH(keanu:Person{name: 'Keanu Reeves'})-[rel:HAS_CONTACT*1]->(contact)

RETURN keanu, rel, contact

#### // Shortest path

MATCH(keanu:Person{name: 'Keanu Reeves'})

MATCH(tom:Person{name: 'Tom Cruise'})

MATCH path = shortestPath((keanu)-[:HAS_CONTACT*..20]->(tom))

RETURN length(path)
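
For intuition about what a shortest-path query computes, here is a breadth-first search in Python over a directed contact graph, with a hop limit that plays the role of `*..20`. Breadth-first search is a standard way to find shortest paths in an unweighted graph; we make no claim that Neo4j implements `shortestPath` this way. The contact data and the function name `shortest_path_length` are hypothetical.

```python
from collections import deque

# Hypothetical directed HAS_CONTACT edges: person -> list of contacts.
contacts = {
    "Keanu Reeves": ["Carrie-Anne Moss", "Hugo Weaving"],
    "Carrie-Anne Moss": ["Val Kilmer"],
    "Hugo Weaving": [],
    "Val Kilmer": ["Tom Cruise"],
    "Tom Cruise": [],
}


def shortest_path_length(graph, source, target, max_hops=20):
    """Breadth-first search; returns the hop count, or None if the target
    cannot be reached within max_hops."""
    queue = deque([(source, 0)])
    visited = {source}
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for contact in graph.get(person, []):
            if contact not in visited:
                visited.add(contact)
                queue.append((contact, hops + 1))
    return None
```

Because the edges are directed, searching from Tom Cruise back to Keanu Reeves finds nothing in this sample graph, just as the arrow in the Cypher pattern restricts the direction of traversal.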

#### // All shortest paths

MATCH(keanu:Person{name: 'Keanu Reeves'})

MATCH(tom:Person{name: 'Tom Cruise'})

MATCH path = allShortestPaths((keanu)-[:HAS_CONTACT*..20]->(tom))

RETURN length(path), path

#### // Shortest path between an actor in The Matrix and an actor in Top Gun

MATCH(person1:Person)-[:ACTED_IN]->(movie1:Movie{title: 'The Matrix'})

MATCH(person2:Person)-[:ACTED_IN]->(movie2:Movie{title: 'Top Gun'})

MATCH path = shortestPath((person1)-[:HAS_CONTACT*..20]->(person2))

RETURN length(path), path

LIMIT 1