# Graph Types and Structures

In [None]:
import neo4j

import pandas as pd

from IPython.display import display

# Lab: Neo4j - Shapes - Random Network, Small-World Network, Scale-Free Network

## Neo4j is a NoSQL graph database; it scales out well and was the first, and is still the biggest, player in the graph database market

Community Edition, which we will be using, is quite limited compared to the paid Enterprise Edition; for example, it supports only 1 database (graph) at a time.

You don't have to know everything about Neo4j to use it. It's a huge product, but if you learn the basics of Neo4j and learn how to run the major graph algorithms, you can be very productive with the product in a couple of weeks!

Most Neo4j users simply create graph databases and run simple queries to return nodes based on labels, attributes, relationships, types, etc.

Neo4j has a special advanced Data Science module that we will be using. This is the module that has all of the advanced graph algorithms mentioned in the asynch for both weeks. (Most users of Neo4j can use it for years and never know these advanced algorithms exist!)

The goal of these 2 weeks is for you to learn: the basics of a graph database, how to run simple queries, how to run the Data Science graph algorithms, and what real world problems you can solve with the Data Science graph algorithms.

## Nodes (vertices) can have: labels for classification and properties (attributes as key / value pairs)

## Relationships (edges) can have: type, direction, and properties (attributes as key / value pairs); 

One seemingly strange thing about Neo4j: every relationship has to have a direction. However, the default behavior is to ignore direction unless we explicitly put direction in our queries, and most canned algorithms ignore direction by default. When a relationship is 2-way, the Neo4j documentation and examples recommend just storing it in an arbitrary direction and using directionless queries and algorithms.

## Web server interface at https://xxxx:7473

#### Update - since the videos were filmed, neo4j requires a longer, more complex password, so the newest password is here:

**Username: neo4j**

**Password: ucb_mids_w205**

The above web server provides an interactive GUI which can output graphs visually in addition to table-like output. The nodes in the graphs can be moved around with the mouse to make the graphs more readable.


#### Basics:

```:server connect``` - connect to the server, username is "neo4j", password is "ucb_mids_w205"


```:server status``` - shows the username and server you are logged into


```:clear``` - clears off old cells


```show databases``` - note that Community Edition only has 1 application database that we can use, named neo4j; we cannot create or use other databases, so we have to wipe out the neo4j database for each new graph


## Python in Jupyter Notebooks for Functional Programming

We will be using Python in Jupyter Notebooks to programmatically interface with Neo4j.  The responses we receive will be in table format, similar to responses we received from SQL.  Just like we have functional programming of Python calling SQL, we can also have functional programming of Python calling Neo4j.


## Cypher is the query language, analogous to SQL for a relational database; Cypher is open source and, just as SQL is used for multiple relational databases, Cypher can be used for other graph databases


```()``` node


```[]``` relationships


```-> <-``` directions, every relationship must have 1 and only 1 direction


```(p:Person)``` p is a variable, Person is a node label


```(:Person)``` no variable, Person is a node label


```(p:Person {name: 'John', birth_year: 1970})``` name is a property of the node with value 'John', and birth_year is a property with value 1970


```(p1:Person {name: 'John'})-[r:IS_FRIEND_OF]->(p2:Person {name: 'Mary'})``` r is a variable, IS_FRIEND_OF is a relationship type


```(p1:Person {name: 'John'})-[:IS_FRIEND_OF {since: 1983}]->(p2:Person {name: 'Mary'})``` since is a property of the relationship

```match``` matches a pattern of nodes and/or relationships

```return``` which properties of nodes and/or relationships to return

```order by``` sorting just like SQL

```limit```  limiting the rows returned just like SQL

```collect``` a form of a pivot to turn rows into a list

```unwind``` a form of an unpivot to turn a list into rows

```create``` creates nodes and/or relationships

```delete``` deletes nodes and/or relationships
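
As a rough analogy only (plain Python, not Cypher), `collect` behaves like grouping rows into one list per key, and `unwind` like flattening those lists back into rows. The sample rows and the helper names `collect_rows` and `unwind_rows` are made up for this sketch and are not part of Neo4j.

```python
from collections import defaultdict

def collect_rows(rows):
    """Group (key, value) rows into {key: [values]}, similar in spirit to collect."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)
    return dict(grouped)

def unwind_rows(grouped):
    """Flatten {key: [values]} back into (key, value) rows, similar in spirit to unwind."""
    return [(key, value) for key, values in grouped.items() for value in values]

# Hypothetical rows: (person, friend)
rows = [("John", "Mary"), ("John", "Mark"), ("Mary", "Larry")]
grouped = collect_rows(rows)
print(grouped)
print(unwind_rows(grouped))
```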

## Connect, login, create driver, create session; with community edition, we can only use 1 database, the "neo4j" database

In [2]:
driver = neo4j.GraphDatabase.driver(uri="neo4j://neo4j:7687", auth=("neo4j","ucb_mids_w205"))

In [3]:
session = driver.session(database="neo4j")

## my_neo4j_wipe_out_database() - since community edition can only have 1 database "neo4j", this function will wipe out all the nodes and relationships

In [4]:
def my_neo4j_wipe_out_database():
    "wipe out database by deleting all nodes and relationships"
    
    query = "match (node)-[relationship]->() delete node, relationship"
    session.run(query)
    
    query = "match (node) delete node"
    session.run(query)
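
Cypher also has a `detach delete` clause, which removes a node together with any relationships attached to it. A possible single-query variant of the wipe-out function is sketched below; the name `wipe_out_database_detach` and the explicit `session` parameter are choices made for this sketch, not part of the lab code.

```python
def wipe_out_database_detach(session):
    """Delete every node and its relationships with a single Cypher statement."""
    # detach delete removes a node's relationships before removing the node itself
    result = session.run("match (n) detach delete n")
    # consume() waits for the statement to finish and discards any remaining records
    result.consume()
```

With the session created earlier, it would be called as `wipe_out_database_detach(session)`.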

## my_neo4j_run_query_pandas() will run a Cypher query and put the results in a Pandas dataframe; easy to see how you can use Python to manipulate the returned data

In [5]:
def my_neo4j_run_query_pandas(query, **kwargs):
    "run a query and return the results in a pandas dataframe"
    
    result = session.run(query, **kwargs)
    
    df = pd.DataFrame([r.values() for r in result], columns=result.keys())
    
    return df

## my_neo4j_nodes_relationships() will print the nodes (assumes a name property) and relationships

In [6]:
def my_neo4j_nodes_relationships():
    "print all the nodes and relationships"
   
    print("-------------------------")
    print("  Nodes:")
    print("-------------------------")
    
    query = """
        match (n) 
        return n.name as node_name, labels(n) as labels
        order by n.name
    """
    
    df = my_neo4j_run_query_pandas(query)
    
    number_nodes = df.shape[0]
    
    display(df)
    
    print("-------------------------")
    print("  Relationships:")
    print("-------------------------")
    
    query = """
        match (n1)-[r]->(n2) 
        return n1.name as node_name_1, labels(n1) as node_1_labels, 
            type(r) as relationship_type, n2.name as node_name_2, labels(n2) as node_2_labels
        order by node_name_1, node_name_2
    """
    
    df = my_neo4j_run_query_pandas(query)
    
    number_relationships = df.shape[0]
    
    display(df)
    
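    # density is undefined for fewer than 2 nodes, so guard against division by zero
    if number_nodes > 1:
        density = (2 * number_relationships) / (number_nodes * (number_nodes - 1))
    else:
        density = 0.0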
    
    print("-------------------------")
    print("  Density:", f'{density:.1f}')
    print("-------------------------")
    

## Simple graph of several people and who is a friend of whom

In [7]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (john:Person {name: 'John', born: 1970}),
  (mary:Person {name: 'Mary', born: 1972}),
  (mark:Person {name: 'Mark', born: 1968}),
  (linda:Person {name: 'Linda', born: 1967}),
  (larry:Person {name: 'Larry', born: 1965}),
  (john)-[:IS_FRIEND_OF {since: 1983}]->(mary),
  (john)-[:IS_FRIEND_OF {since: 1984}]->(mark),
  (john)-[:IS_FRIEND_OF {since: 1982}]->(linda),
  (mary)-[:IS_FRIEND_OF {since: 1981}]->(larry),
  (mark)-[:IS_FRIEND_OF {since: 1983}]->(larry),
  (linda)-[:IS_FRIEND_OF {since: 1984}]->(larry)

"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7faa34351280>

In [8]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,John,[Person]
1,Larry,[Person]
2,Linda,[Person]
3,Mark,[Person]
4,Mary,[Person]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,John,[Person],IS_FRIEND_OF,Linda,[Person]
1,John,[Person],IS_FRIEND_OF,Mark,[Person]
2,John,[Person],IS_FRIEND_OF,Mary,[Person]
3,Linda,[Person],IS_FRIEND_OF,Larry,[Person]
4,Mark,[Person],IS_FRIEND_OF,Larry,[Person]
5,Mary,[Person],IS_FRIEND_OF,Larry,[Person]


-------------------------
  Density: 0.6
-------------------------


## In the Neo4j GUI, run the following query with graph output and rearrange the nodes with your mouse if necessary:

```match (n) return n```

## Random Network - flat, no patterns, all nodes have the same probability of being attached to each other

In [9]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (g:Generic {name: 'G'}),
  (h:Generic {name: 'H'}),
  (i:Generic {name: 'I'}),
  (a)-[:IS_CONNECTED_TO]->(e),
  (a)-[:IS_CONNECTED_TO]->(g),
  (b)-[:IS_CONNECTED_TO]->(g),
  (c)-[:IS_CONNECTED_TO]->(f),
  (c)-[:IS_CONNECTED_TO]->(g),
  (d)-[:IS_CONNECTED_TO]->(h),
  (e)-[:IS_CONNECTED_TO]->(h),
  (f)-[:IS_CONNECTED_TO]->(g),
  (f)-[:IS_CONNECTED_TO]->(i)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a5d790>

In [10]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]
6,G,[Generic]
7,H,[Generic]
8,I,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,E,[Generic]
1,A,[Generic],IS_CONNECTED_TO,G,[Generic]
2,B,[Generic],IS_CONNECTED_TO,G,[Generic]
3,C,[Generic],IS_CONNECTED_TO,F,[Generic]
4,C,[Generic],IS_CONNECTED_TO,G,[Generic]
5,D,[Generic],IS_CONNECTED_TO,H,[Generic]
6,E,[Generic],IS_CONNECTED_TO,H,[Generic]
7,F,[Generic],IS_CONNECTED_TO,G,[Generic]
8,F,[Generic],IS_CONNECTED_TO,I,[Generic]


-------------------------
  Density: 0.2
-------------------------


## Small-World Network - high degree of local clustering, short average path lengths, hub and spoke, no node more than a few relationships away from any other node

In [11]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (g:Generic {name: 'G'}),
  (h:Generic {name: 'H'}),
  (i:Generic {name: 'I'}),
  (j:Generic {name: 'J'}),
  (k:Generic {name: 'K'}),
  (l:Generic {name: 'L'}),
  (m:Generic {name: 'M'}),
  (n:Generic {name: 'N'}),
  (o:Generic {name: 'O'}),
  (p:Generic {name: 'P'}),
  (a)-[:IS_CONNECTED_TO]->(c),
  (a)-[:IS_CONNECTED_TO]->(o),
  (a)-[:IS_CONNECTED_TO]->(i),
  (b)-[:IS_CONNECTED_TO]->(d),
  (b)-[:IS_CONNECTED_TO]->(n),
  (b)-[:IS_CONNECTED_TO]->(o),
  (b)-[:IS_CONNECTED_TO]->(p),
  (c)-[:IS_CONNECTED_TO]->(e),
  (d)-[:IS_CONNECTED_TO]->(i),
  (d)-[:IS_CONNECTED_TO]->(p),
  (e)-[:IS_CONNECTED_TO]->(g),
  (e)-[:IS_CONNECTED_TO]->(h),
  (e)-[:IS_CONNECTED_TO]->(n),
  (f)-[:IS_CONNECTED_TO]->(h),
  (g)-[:IS_CONNECTED_TO]->(i),
  (g)-[:IS_CONNECTED_TO]->(n),
  (h)-[:IS_CONNECTED_TO]->(j),
  (i)-[:IS_CONNECTED_TO]->(k),
  (j)-[:IS_CONNECTED_TO]->(o),
  (k)-[:IS_CONNECTED_TO]->(m),
  (l)-[:IS_CONNECTED_TO]->(n),
  (m)-[:IS_CONNECTED_TO]->(o)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7faa34351790>

In [12]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]
6,G,[Generic]
7,H,[Generic]
8,I,[Generic]
9,J,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,C,[Generic]
1,A,[Generic],IS_CONNECTED_TO,I,[Generic]
2,A,[Generic],IS_CONNECTED_TO,O,[Generic]
3,B,[Generic],IS_CONNECTED_TO,D,[Generic]
4,B,[Generic],IS_CONNECTED_TO,N,[Generic]
5,B,[Generic],IS_CONNECTED_TO,O,[Generic]
6,B,[Generic],IS_CONNECTED_TO,P,[Generic]
7,C,[Generic],IS_CONNECTED_TO,E,[Generic]
8,D,[Generic],IS_CONNECTED_TO,I,[Generic]
9,D,[Generic],IS_CONNECTED_TO,P,[Generic]


-------------------------
  Density: 0.2
-------------------------


## Scale-Free Network - hub and spoke in multiple scales, power-law distribution (a relative change in one quantity results in a proportional relative change in another quantity)

In [13]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (g:Generic {name: 'G'}),
  (h:Generic {name: 'H'}),
  (i:Generic {name: 'I'}),
  (j:Generic {name: 'J'}),
  (k:Generic {name: 'K'}),
  (l:Generic {name: 'L'}),
  (m:Generic {name: 'M'}),
  (n:Generic {name: 'N'}),
  (o:Generic {name: 'O'}),
  (p:Generic {name: 'P'}),
  (q:Generic {name: 'Q'}),
  (a)-[:IS_CONNECTED_TO]->(b),
  (a)-[:IS_CONNECTED_TO]->(c),
  (a)-[:IS_CONNECTED_TO]->(d),
  (b)-[:IS_CONNECTED_TO]->(n),
  (b)-[:IS_CONNECTED_TO]->(o),
  (b)-[:IS_CONNECTED_TO]->(p),
  (b)-[:IS_CONNECTED_TO]->(q),
  (c)-[:IS_CONNECTED_TO]->(j),
  (c)-[:IS_CONNECTED_TO]->(k),
  (c)-[:IS_CONNECTED_TO]->(l),
  (c)-[:IS_CONNECTED_TO]->(m),
  (d)-[:IS_CONNECTED_TO]->(e),
  (d)-[:IS_CONNECTED_TO]->(f),
  (d)-[:IS_CONNECTED_TO]->(g),
  (d)-[:IS_CONNECTED_TO]->(h),
  (d)-[:IS_CONNECTED_TO]->(i)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a340a0>

In [14]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]
6,G,[Generic]
7,H,[Generic]
8,I,[Generic]
9,J,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,B,[Generic]
1,A,[Generic],IS_CONNECTED_TO,C,[Generic]
2,A,[Generic],IS_CONNECTED_TO,D,[Generic]
3,B,[Generic],IS_CONNECTED_TO,N,[Generic]
4,B,[Generic],IS_CONNECTED_TO,O,[Generic]
5,B,[Generic],IS_CONNECTED_TO,P,[Generic]
6,B,[Generic],IS_CONNECTED_TO,Q,[Generic]
7,C,[Generic],IS_CONNECTED_TO,J,[Generic]
8,C,[Generic],IS_CONNECTED_TO,K,[Generic]
9,C,[Generic],IS_CONNECTED_TO,L,[Generic]


-------------------------
  Density: 0.1
-------------------------


# Lab: Neo4j - Connected, Disconnected, Weighted, Unweighted

## Connected - path between any 2 nodes (regardless of the distance), all graphs we have seen up until now have been connected

## Disconnected - some nodes may not be connected to other nodes, "islands" - group of connected nodes disconnected from the main graph

In [15]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (g:Generic {name: 'G'}),
  (h:Generic {name: 'H'}),
  (i:Generic {name: 'I'}),
  (a)-[:IS_CONNECTED_TO]->(b),
  (a)-[:IS_CONNECTED_TO]->(c),
  (d)-[:IS_CONNECTED_TO]->(e),
  (f)-[:IS_CONNECTED_TO]->(g),
  (f)-[:IS_CONNECTED_TO]->(h),
  (f)-[:IS_CONNECTED_TO]->(i),
  (g)-[:IS_CONNECTED_TO]->(h),
  (h)-[:IS_CONNECTED_TO]->(i)

"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a349d0>

In [16]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]
6,G,[Generic]
7,H,[Generic]
8,I,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,B,[Generic]
1,A,[Generic],IS_CONNECTED_TO,C,[Generic]
2,D,[Generic],IS_CONNECTED_TO,E,[Generic]
3,F,[Generic],IS_CONNECTED_TO,G,[Generic]
4,F,[Generic],IS_CONNECTED_TO,H,[Generic]
5,F,[Generic],IS_CONNECTED_TO,I,[Generic]
6,G,[Generic],IS_CONNECTED_TO,H,[Generic]
7,H,[Generic],IS_CONNECTED_TO,I,[Generic]


-------------------------
  Density: 0.2
-------------------------


## Unweighted - no numeric value placed on the relationships; all the graphs we have seen up until now have been unweighted; some algorithms require weights, some algorithms do not consider weights, and some algorithms default an unweighted relationship to a weight of 1

## Weighted - numeric values placed on a relationship

In [17]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A', weight: 5}),
  (b:Generic {name: 'B', weight: 7}),
  (c:Generic {name: 'C', weight: 8}),
  (d:Generic {name: 'D', weight: 3}),
  (e:Generic {name: 'E', weight: 4}),
  (a)-[:IS_CONNECTED_TO {weight: 110}]->(b),
  (a)-[:IS_CONNECTED_TO {weight: 120}]->(c),
  (b)-[:IS_CONNECTED_TO {weight: 123}]->(c),
  (c)-[:IS_CONNECTED_TO {weight: 127}]->(d),
  (d)-[:IS_CONNECTED_TO {weight: 117}]->(e)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a34c10>

In [18]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,B,[Generic]
1,A,[Generic],IS_CONNECTED_TO,C,[Generic]
2,B,[Generic],IS_CONNECTED_TO,C,[Generic]
3,C,[Generic],IS_CONNECTED_TO,D,[Generic]
4,D,[Generic],IS_CONNECTED_TO,E,[Generic]


-------------------------
  Density: 0.5
-------------------------


# Lab: Neo4j - Cyclic, Acyclic, Trees

## Cyclic  - has cycles, path from a node back to itself, some algorithms can get stuck in cycles

In [21]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (a)-[:IS_CONNECTED_TO]->(c),
  (c)-[:IS_CONNECTED_TO]->(d),
  (d)-[:IS_CONNECTED_TO]->(a),
  (d)-[:IS_CONNECTED_TO]->(e),
  (c)-[:IS_CONNECTED_TO]->(b),
  (b)-[:IS_CONNECTED_TO]->(f),
  (f)-[:IS_CONNECTED_TO]->(c)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a34400>

In [22]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,C,[Generic]
1,B,[Generic],IS_CONNECTED_TO,F,[Generic]
2,C,[Generic],IS_CONNECTED_TO,B,[Generic]
3,C,[Generic],IS_CONNECTED_TO,D,[Generic]
4,D,[Generic],IS_CONNECTED_TO,A,[Generic]
5,D,[Generic],IS_CONNECTED_TO,E,[Generic]
6,F,[Generic],IS_CONNECTED_TO,C,[Generic]


-------------------------
  Density: 0.5
-------------------------


## Acyclic - no cycles, no node has a path back to itself; many algorithms require an acyclic graph

In [24]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (g:Generic {name: 'G'}),
  (a)-[:IS_CONNECTED_TO]->(c),
  (a)-[:IS_CONNECTED_TO]->(d),
  (a)-[:IS_CONNECTED_TO]->(f),
  (c)-[:IS_CONNECTED_TO]->(b),
  (c)-[:IS_CONNECTED_TO]->(g),
  (d)-[:IS_CONNECTED_TO]->(e)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a47af0>

In [25]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]
6,G,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,C,[Generic]
1,A,[Generic],IS_CONNECTED_TO,D,[Generic]
2,A,[Generic],IS_CONNECTED_TO,F,[Generic]
3,C,[Generic],IS_CONNECTED_TO,B,[Generic]
4,C,[Generic],IS_CONNECTED_TO,G,[Generic]
5,D,[Generic],IS_CONNECTED_TO,E,[Generic]


-------------------------
  Density: 0.3
-------------------------


## Trees - connected acyclic graphs; spanning tree - all the nodes of a graph, with just enough relationships removed to eliminate cycles while keeping the graph connected; we will take our cyclic graph example and remove relationships until we have a spanning tree

In [26]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a1:Generic {name: 'A1'}),
  (b1:Generic {name: 'B1'}),
  (c1:Generic {name: 'C1'}),
  (d1:Generic {name: 'D1'}),
  (e1:Generic {name: 'E1'}),
  (f1:Generic {name: 'F1'}),
  (a1)-[:IS_CONNECTED_TO]->(c1),
  (c1)-[:IS_CONNECTED_TO]->(d1),
  (d1)-[:IS_CONNECTED_TO]->(a1),
  (d1)-[:IS_CONNECTED_TO]->(e1),
  (c1)-[:IS_CONNECTED_TO]->(b1),
  (b1)-[:IS_CONNECTED_TO]->(f1),
  (f1)-[:IS_CONNECTED_TO]->(c1),
  (a2:Generic {name: 'A2'}),
  (b2:Generic {name: 'B2'}),
  (c2:Generic {name: 'C2'}),
  (d2:Generic {name: 'D2'}),
  (e2:Generic {name: 'E2'}),
  (f2:Generic {name: 'F2'}),
  (a2)-[:IS_CONNECTED_TO]->(c2),
  (c2)-[:IS_CONNECTED_TO]->(b2),
  (b2)-[:IS_CONNECTED_TO]->(f2),
  (c2)-[:IS_CONNECTED_TO]->(d2),
  (d2)-[:IS_CONNECTED_TO]->(e2)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8aba820>

In [27]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A1,[Generic]
1,A2,[Generic]
2,B1,[Generic]
3,B2,[Generic]
4,C1,[Generic]
5,C2,[Generic]
6,D1,[Generic]
7,D2,[Generic]
8,E1,[Generic]
9,E2,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A1,[Generic],IS_CONNECTED_TO,C1,[Generic]
1,A2,[Generic],IS_CONNECTED_TO,C2,[Generic]
2,B1,[Generic],IS_CONNECTED_TO,F1,[Generic]
3,B2,[Generic],IS_CONNECTED_TO,F2,[Generic]
4,C1,[Generic],IS_CONNECTED_TO,B1,[Generic]
5,C1,[Generic],IS_CONNECTED_TO,D1,[Generic]
6,C2,[Generic],IS_CONNECTED_TO,B2,[Generic]
7,C2,[Generic],IS_CONNECTED_TO,D2,[Generic]
8,D1,[Generic],IS_CONNECTED_TO,A1,[Generic]
9,D1,[Generic],IS_CONNECTED_TO,E1,[Generic]


-------------------------
  Density: 0.2
-------------------------


# Lab: Neo4j - Density Calculations, Sparse Graphs, Dense Graphs

## Density Calculations:

* Maximum Density (the number of relationships in a complete graph) = (nodes * (nodes - 1)) / 2


* Actual Density = (2 * relationships) / (nodes * (nodes - 1) )
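
The two formulas can be sketched as plain Python functions. The names `max_relationships` and `density` are chosen for this example, and returning 0.0 for graphs with fewer than 2 nodes is an assumption made to avoid dividing by zero.

```python
def max_relationships(nodes):
    """Number of relationships in a complete undirected graph with this many nodes."""
    return nodes * (nodes - 1) // 2

def density(nodes, relationships):
    """Actual density: relationships present divided by relationships possible."""
    if nodes < 2:
        return 0.0  # assumption: treat graphs with fewer than 2 nodes as density 0
    return (2 * relationships) / (nodes * (nodes - 1))

# The friends graph from the start of the lab: 5 people, 6 IS_FRIEND_OF relationships
print(max_relationships(5))  # 10
print(density(5, 6))         # 0.6
```

Six relationships out of a possible ten gives the density of 0.6 reported for that graph.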

## Sparse Graph - low density

In [28]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (a)-[:IS_CONNECTED_TO]->(b),
  (b)-[:IS_CONNECTED_TO]->(c),
  (c)-[:IS_CONNECTED_TO]->(d),
  (d)-[:IS_CONNECTED_TO]->(e),
  (e)-[:IS_CONNECTED_TO]->(f)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7faa34351820>

In [29]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,B,[Generic]
1,B,[Generic],IS_CONNECTED_TO,C,[Generic]
2,C,[Generic],IS_CONNECTED_TO,D,[Generic]
3,D,[Generic],IS_CONNECTED_TO,E,[Generic]
4,E,[Generic],IS_CONNECTED_TO,F,[Generic]


-------------------------
  Density: 0.3
-------------------------


## Dense Graph - high density

In [30]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (a:Generic {name: 'A'}),
  (b:Generic {name: 'B'}),
  (c:Generic {name: 'C'}),
  (d:Generic {name: 'D'}),
  (e:Generic {name: 'E'}),
  (f:Generic {name: 'F'}),
  (a)-[:IS_CONNECTED_TO]->(b),
  (a)-[:IS_CONNECTED_TO]->(c),
  (a)-[:IS_CONNECTED_TO]->(d),
  (a)-[:IS_CONNECTED_TO]->(f),
  (b)-[:IS_CONNECTED_TO]->(c),
  (b)-[:IS_CONNECTED_TO]->(d),
  (b)-[:IS_CONNECTED_TO]->(e),
  (b)-[:IS_CONNECTED_TO]->(f),
  (c)-[:IS_CONNECTED_TO]->(d),
  (c)-[:IS_CONNECTED_TO]->(e),
  (c)-[:IS_CONNECTED_TO]->(f),
  (d)-[:IS_CONNECTED_TO]->(e),
  (d)-[:IS_CONNECTED_TO]->(f),
  (e)-[:IS_CONNECTED_TO]->(f)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a47790>

In [31]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,A,[Generic]
1,B,[Generic]
2,C,[Generic]
3,D,[Generic]
4,E,[Generic]
5,F,[Generic]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,A,[Generic],IS_CONNECTED_TO,B,[Generic]
1,A,[Generic],IS_CONNECTED_TO,C,[Generic]
2,A,[Generic],IS_CONNECTED_TO,D,[Generic]
3,A,[Generic],IS_CONNECTED_TO,F,[Generic]
4,B,[Generic],IS_CONNECTED_TO,C,[Generic]
5,B,[Generic],IS_CONNECTED_TO,D,[Generic]
6,B,[Generic],IS_CONNECTED_TO,E,[Generic]
7,B,[Generic],IS_CONNECTED_TO,F,[Generic]
8,C,[Generic],IS_CONNECTED_TO,D,[Generic]
9,C,[Generic],IS_CONNECTED_TO,E,[Generic]


-------------------------
  Density: 0.9
-------------------------


# Lab: Neo4j - Monopartite, Bipartite, k-Partite Graphs

## Monopartite - 1 node label, 1 relationship type; all graphs we have seen so far are monopartite

## Bipartite - two sets, nodes from one set only connect to nodes in the other set

In [32]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (john:Person {name: 'John'}),
  (mary:Person {name: 'Mary'}),
  (linda:Person {name: 'Linda'}),
  (a:Club {name: 'Club A'}),
  (b:Club {name: 'Club B'}),
  (c:Club {name: 'Club C'}),
  (mary)-[:IS_MEMBER_OF]->(a),
  (mary)-[:IS_MEMBER_OF]->(b),
  (mary)-[:IS_MEMBER_OF]->(c),
  (linda)-[:IS_MEMBER_OF]->(a),
  (john)-[:IS_MEMBER_OF]->(b),
  (john)-[:IS_MEMBER_OF]->(c)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a17790>

In [33]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,Club A,[Club]
1,Club B,[Club]
2,Club C,[Club]
3,John,[Person]
4,Linda,[Person]
5,Mary,[Person]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,John,[Person],IS_MEMBER_OF,Club B,[Club]
1,John,[Person],IS_MEMBER_OF,Club C,[Club]
2,Linda,[Person],IS_MEMBER_OF,Club A,[Club]
3,Mary,[Person],IS_MEMBER_OF,Club A,[Club]
4,Mary,[Person],IS_MEMBER_OF,Club B,[Club]
5,Mary,[Person],IS_MEMBER_OF,Club C,[Club]


-------------------------
  Density: 0.4
-------------------------


## k-Partite - k sets, nodes from one set only connect to nodes in another set, most real world graphs have a high k value; in this example k=4: Person, Club, Course, Day

In [34]:
my_neo4j_wipe_out_database()

query = """

CREATE
  (john:Person {name: 'John'}),
  (mary:Person {name: 'Mary'}),
  (linda:Person {name: 'Linda'}),
  (a:Club {name: 'Club A'}),
  (b:Club {name: 'Club B'}),
  (c:Club {name: 'Club C'}),
  (ds:Course {name: 'Data Structures'}),
  (as:Course {name: 'Assembler'}),
  (st:Course {name: 'Statictics'}),
  (lt:Course {name: 'Laplace Transform'}),
  (tu:Day {name: 'Tuesday'}),
  (th:Day {name: 'Thursday'}),
  (mary)-[:IS_MEMBER_OF]->(a),
  (mary)-[:IS_MEMBER_OF]->(b),
  (mary)-[:IS_MEMBER_OF]->(c),
  (linda)-[:IS_MEMBER_OF]->(a),
  (john)-[:IS_MEMBER_OF]->(b),
  (john)-[:IS_MEMBER_OF]->(c),
  (mary)-[:IS_TAKING]->(st),
  (mary)-[:IS_TAKING]->(lt),
  (linda)-[:IS_TAKING]->(ds),
  (linda)-[:IS_TAKING]->(st),
  (linda)-[:IS_TAKING]->(lt),
  (john)-[:IS_TAKING]->(as),
  (ds)-[:TAUGHT_ON]->(tu),
  (st)-[:TAUGHT_ON]->(tu),
  (as)-[:TAUGHT_ON]->(th),
  (lt)-[:TAUGHT_ON]->(th)
  
"""

session.run(query)

<neo4j._sync.work.result.Result at 0x7fa9f8a34520>

In [35]:
my_neo4j_nodes_relationships()

-------------------------
  Nodes:
-------------------------


Unnamed: 0,node_name,labels
0,Assembler,[Course]
1,Club A,[Club]
2,Club B,[Club]
3,Club C,[Club]
4,Data Structures,[Course]
5,John,[Person]
6,Laplace Transform,[Course]
7,Linda,[Person]
8,Mary,[Person]
9,Statictics,[Course]


-------------------------
  Relationships:
-------------------------


Unnamed: 0,node_name_1,node_1_labels,relationship_type,node_name_2,node_2_labels
0,Assembler,[Course],TAUGHT_ON,Thursday,[Day]
1,Data Structures,[Course],TAUGHT_ON,Tuesday,[Day]
2,John,[Person],IS_TAKING,Assembler,[Course]
3,John,[Person],IS_MEMBER_OF,Club B,[Club]
4,John,[Person],IS_MEMBER_OF,Club C,[Club]
5,Laplace Transform,[Course],TAUGHT_ON,Thursday,[Day]
6,Linda,[Person],IS_MEMBER_OF,Club A,[Club]
7,Linda,[Person],IS_TAKING,Data Structures,[Course]
8,Linda,[Person],IS_TAKING,Laplace Transform,[Course]
9,Linda,[Person],IS_TAKING,Statictics,[Course]


-------------------------
  Density: 0.2
-------------------------
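
The k-partite property can be checked against the labels: count the distinct labels and confirm that no relationship joins two nodes with the same label. The sketch below copies the nodes and relationships created above into plain Python (node names spelled as in the lab data); `partite_report` is a name chosen for this example.

```python
def partite_report(labels, edges):
    """Return (k, violations): distinct label count and same-label relationships."""
    k = len(set(labels.values()))
    violations = [(a, b) for a, b in edges if labels[a] == labels[b]]
    return k, violations

labels = {
    "John": "Person", "Mary": "Person", "Linda": "Person",
    "Club A": "Club", "Club B": "Club", "Club C": "Club",
    "Data Structures": "Course", "Assembler": "Course",
    "Statictics": "Course", "Laplace Transform": "Course",
    "Tuesday": "Day", "Thursday": "Day",
}
edges = [
    ("Mary", "Club A"), ("Mary", "Club B"), ("Mary", "Club C"),
    ("Linda", "Club A"), ("John", "Club B"), ("John", "Club C"),
    ("Mary", "Statictics"), ("Mary", "Laplace Transform"),
    ("Linda", "Data Structures"), ("Linda", "Statictics"),
    ("Linda", "Laplace Transform"), ("John", "Assembler"),
    ("Data Structures", "Tuesday"), ("Statictics", "Tuesday"),
    ("Assembler", "Thursday"), ("Laplace Transform", "Thursday"),
]
k, violations = partite_report(labels, edges)
print("k =", k, "| same-label relationships:", violations)
```

An empty violations list means every relationship crosses from one set to another, as the definition requires.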
