# Exercise 00 - Getting Started

In [1]:
%%capture
%pip install -U graphdatascience pandas ipywidgets

In [2]:
# Update this if you're not running locally with the provided Docker instances.
USE_TLS = False
NEO4J_HOST = "neo4j.arrow"
NEO4J_URI = f"neo4j{'+s' if USE_TLS else ''}://{NEO4J_HOST}:7687"
NEO4J_AUTH = ("neo4j", "password")
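
The URI scheme switches between `neo4j` and `neo4j+s` depending on `USE_TLS`. As a minimal sketch of the same logic, the helper below (`build_neo4j_uri` is a hypothetical name, not part of the `graphdatascience` client) builds the URI for either setting:

```python
def build_neo4j_uri(host: str, use_tls: bool = False, port: int = 7687) -> str:
    """Build a Neo4j connection URI (hypothetical helper for illustration)."""
    # "neo4j+s" requests an encrypted connection; plain "neo4j" does not.
    scheme = "neo4j+s" if use_tls else "neo4j"
    return f"{scheme}://{host}:{port}"


plain_uri = build_neo4j_uri("neo4j.arrow")
tls_uri = build_neo4j_uri("neo4j.arrow", use_tls=True)
print(plain_uri)
print(tls_uri)
```

With `use_tls=False` the result matches the `NEO4J_URI` value used in this notebook.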

In [3]:
import pandas as pd
from time import time

from graphdatascience import GraphDataScience

gds = GraphDataScience(NEO4J_URI, auth=NEO4J_AUTH)

gds.version()

'2.2.2'

In [4]:
info = gds.debug.sysInfo()

edition = info[info.key == "gdsEdition"].value.item()
# Uncomment to fail fast when GDS is not running a licensed edition:
# assert edition == "Licensed"

print(f"Your copy of GDS is '{edition}'.")
info

Your copy of GDS is 'Licensed'.


Unnamed: 0,key,value
0,gdsVersion,2.2.2
1,gdsEdition,Licensed
2,gdsLicenseExpirationTime,2022-10-27T07:00:00.000000000+00:00
3,neo4jVersion,5.1.0
4,minimumRequiredJavaVersion,11
...,...,...
93,server.memory.pagecache.size,1073741824
94,server.memory.off_heap.max_size,2147483648
95,dbms.memory.transaction.total.max,0
96,db.memory.transaction.total.max,0
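
The lookup in the cell above relies on `.item()`, which raises a `ValueError` unless exactly one row matches the key. A minimal sketch of a more forgiving lookup follows, assuming the `key`/`value` column layout shown in the output; `sys_info_value` is a hypothetical helper and `sample_info` is a small stand-in for the real `gds.debug.sysInfo()` result:

```python
import pandas as pd

# Stand-in for gds.debug.sysInfo(); the rows are copied from the output above.
sample_info = pd.DataFrame(
    {
        "key": ["gdsVersion", "gdsEdition", "neo4jVersion"],
        "value": ["2.2.2", "Licensed", "5.1.0"],
    }
)


def sys_info_value(info: pd.DataFrame, key: str, default=None):
    """Return the value stored under `key`, or `default` if the key is absent."""
    matches = info.loc[info["key"] == key, "value"]
    return matches.iloc[0] if len(matches) else default


print(sys_info_value(sample_info, "gdsEdition"))
print(sys_info_value(sample_info, "noSuchKey", default="unknown"))
```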


## Setting the Database

Setting the target database is required before performing **graph projections**. Call `set_database` on the client with the database name.

In [5]:
gds.set_database("neo4j")

## Projecting with DataFrames

Pandas DataFrames are a _de facto_ standard across analytics platforms and packages. Many systems implement part of the DataFrame API; here we use DataFrames directly from the Pandas library.

An easy way to create a DataFrame is from a Python dictionary that maps column names to lists of values.

In [6]:
# Create our Nodes DataFrame
nodes = pd.DataFrame(
    {
        "nodeId": [0, 1, 2, 3, 60_000_000],
        "labels": ["A", "B", "C", "A", "X"],
        "prop1": [42, 1337, 8, 0, 0],
        "otherProperty": [0.1, 0.2, 0.3, 0.4, 0.0]
    }
)

nodes

Unnamed: 0,nodeId,labels,prop1,otherProperty
0,0,A,42,0.1
1,1,B,1337,0.2
2,2,C,8,0.3
3,3,A,0,0.4
4,60000000,X,0,0.0


In [7]:
# Create our Relationships DataFrame
relationships = pd.DataFrame(
    {
        "sourceNodeId": [0, 1, 2, 60_000_000],
        "targetNodeId": [1, 2, 3, 60_000_000],
        "relationshipType": ["REL", "REL", "REL", "REL"],
        "weight": [0.0, 0.0, 0.1, 42.0]
    }
)

relationships

Unnamed: 0,sourceNodeId,targetNodeId,relationshipType,weight
0,0,1,REL,0.0
1,1,2,REL,0.0
2,2,3,REL,0.1
3,60000000,60000000,REL,42.0
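
Every `sourceNodeId` and `targetNodeId` above refers to a `nodeId` in the nodes DataFrame. As a sketch of a sanity check you might run before projecting, the hypothetical helper `missing_endpoints` below lists relationship endpoints that have no matching node. It assumes the column names used in this notebook and recreates only the columns it needs:

```python
import pandas as pd

nodes = pd.DataFrame({"nodeId": [0, 1, 2, 3, 60_000_000]})
relationships = pd.DataFrame(
    {
        "sourceNodeId": [0, 1, 2, 60_000_000],
        "targetNodeId": [1, 2, 3, 60_000_000],
    }
)


def missing_endpoints(nodes: pd.DataFrame, relationships: pd.DataFrame) -> list:
    """Return relationship endpoint ids that do not appear in nodes["nodeId"]."""
    known = set(nodes["nodeId"])
    endpoints = set(relationships["sourceNodeId"]) | set(relationships["targetNodeId"])
    return sorted(int(node_id) for node_id in endpoints - known)


# An empty list means every relationship endpoint has a matching node.
print(missing_endpoints(nodes, relationships))
```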


## Project our Graph

GDS's **Arrow Flight** service makes it possible to create a graph projection directly from DataFrames: the client streams the node and relationship data to the server, which builds the in-memory graph.

In [8]:
G = gds.alpha.graph.construct(
    f"my-graph-{time()}",  # Graph name; the timestamp keeps it unique across runs
    nodes,                 # One or more DataFrames containing node data
    relationships,         # One or more DataFrames containing relationship data
)
G.memory_usage()

'9985 KiB'

The Neo4j logs should show entries like the following (the timestamps and graph name will differ from run to run):

```
2022-09-08 15:47:28.464+0000 INFO  [system/00000000] Received action CREATE_GRAPH with configuration CreateGraphAction{name=my-graph, databaseName=neo4j, concurrency=8}
2022-09-08 15:47:28.484+0000 INFO  [system/00000000] Put stream started
2022-09-08 15:47:28.498+0000 INFO  [system/00000000] Put command: PutCommand{name=my-graph, entityType=node}
2022-09-08 15:47:28.515+0000 INFO  [system/00000000] Received action NODE_LOAD_DONE with configuration NodeLoadDoneAction{name=my-graph}
2022-09-08 15:47:28.536+0000 INFO  [system/00000000] Put stream started
2022-09-08 15:47:28.537+0000 INFO  [system/00000000] Put command: PutCommand{name=my-graph, entityType=relationship}
2022-09-08 15:47:28.554+0000 INFO  [system/00000000] Received action RELATIONSHIP_LOAD_DONE with configuration RelationshipLoadDoneAction{name=my-graph}
```
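
The comments in the `construct` call note that node and relationship data can each be supplied as one or more DataFrames. A minimal sketch of preparing several node DataFrames, one per label, is shown below. It assumes `construct` accepts a list of DataFrames in place of a single one; the call itself is left commented out because it needs a running GDS server, and the graph name in it is only a placeholder.

```python
import pandas as pd

nodes = pd.DataFrame(
    {
        "nodeId": [0, 1, 2, 3, 60_000_000],
        "labels": ["A", "B", "C", "A", "X"],
        "prop1": [42, 1337, 8, 0, 0],
    }
)

# Split the node data into one DataFrame per label, ordered by label.
node_frames = [
    frame.reset_index(drop=True)
    for _, frame in nodes.groupby("labels", sort=True)
]

for frame in node_frames:
    print(frame["labels"].iloc[0], len(frame))

# Assumption: a list of DataFrames can be passed where a single one is expected.
# G = gds.alpha.graph.construct("my-graph", node_frames, relationships)
```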

## Refresher: Using the Graph Object

- Running an algorithm
- Dropping the graph

In [9]:
# Quick refresher: to run an algorithm, pass the Graph object to it.

gds.wcc.stream(G)

Unnamed: 0,nodeId,componentId
0,0,0
1,1,0
2,2,0
3,3,0
4,60000000,4
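
In this result, nodes 0 through 3 share component 0 because they are chained together by `REL` relationships, while node 60000000 only has a relationship to itself and so forms its own component. As a sketch of how the streamed result could be summarized with pandas, the block below counts nodes per component, using a DataFrame that copies the output above in place of a live `gds.wcc.stream(G)` call:

```python
import pandas as pd

# Copy of the WCC output above, standing in for gds.wcc.stream(G).
wcc = pd.DataFrame(
    {
        "nodeId": [0, 1, 2, 3, 60_000_000],
        "componentId": [0, 0, 0, 0, 4],
    }
)

# Count how many nodes fall into each weakly connected component.
component_sizes = wcc.groupby("componentId")["nodeId"].count().rename("size")
print(component_sizes)
```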


In [10]:
# Let's drop the graph.

G.drop()

graphName                                  my-graph-1666720297.3877385
database                                                         neo4j
memoryUsage                                                           
sizeInBytes                                                         -1
nodeCount                                                            5
relationshipCount                                                    4
configuration                                                       {}
density                                                            0.2
creationTime                       2022-10-25T17:51:37.670123550+00:00
modificationTime                   2022-10-25T17:51:37.669751808+00:00
schema               {'graphProperties': {}, 'relationships': {'REL...
Name: 0, dtype: object
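
Projections stay in server memory until they are dropped, so it is easy to leak one when a cell fails partway through. One possible pattern, sketched here with a hypothetical `dropping` context manager and a `FakeGraph` stand-in so that it runs without a server, is to guarantee the `drop()` call with `try`/`finally`:

```python
from contextlib import contextmanager


@contextmanager
def dropping(graph):
    """Yield `graph`, then call graph.drop() even if the body raises."""
    try:
        yield graph
    finally:
        graph.drop()


class FakeGraph:
    """Stand-in for a GDS Graph object; it only records whether drop() ran."""

    def __init__(self):
        self.dropped = False

    def drop(self):
        self.dropped = True


with dropping(FakeGraph()) as graph:
    print("working with the graph; dropped =", graph.dropped)

print("after the block; dropped =", graph.dropped)
```

With a real projection, the object returned by `gds.alpha.graph.construct` would take the place of `FakeGraph()`.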

# ~~ _fin_ ~~