# Visualize Graph Data Model Using Diagram-as-Code: Property Graph

This notebook shows how to introspect the property graph data in your Neptune database and draw a diagram of its data model, using a diagram-as-code approach. We use a combination of the Neptune summary API and openCypher queries to discover the graph schema, then use PlantUML to draw it.

### Minimum Requirements
- Neptune 1.2.x or higher
- The summary API must be available. See the requirements at https://docs.aws.amazon.com/neptune/latest/userguide/neptune-graph-summary.html

There is a companion notebook for RDF.


TODO add sample data with:
- Lots of node labels - E.g., CREATE (n:Customer<N>-Device {`~id`: 'Device_15', ID: 15, Value: 'XYZ123456'})
- Multiple labels, so lots of Customer<N> labels: CREATE (n:Device:Customer1 {`~id`: 'Device_15', ID: 15, Value: 'XYZ123456'})
- Would I ever have LOTS of edge labels? - try one for fun connectedTo-<N>
    


## Setup PlantUML
We will use PlantUML to render the diagram.

In [None]:
%pip install iplantuml

**Restart the kernel before continuing.**

In [None]:
import iplantuml

## Setup Discovery
Read the Neptune connection settings from the notebook environment and pass them to the discovery module.

In [None]:
import json
import os
import lpg_discovery

def get_neptune_env(var):
    """Read an environment variable defined in ~/.bashrc on the notebook instance."""
    # The f-string expands to e.g. `echo $GRAPH_NOTEBOOK_HOST`, which the shell then resolves
    return os.popen(f"source ~/.bashrc ; echo ${var}").read().split("\n")[0]

# Grab Neptune cluster host/port from notebook instance environment variables
GRAPH_NOTEBOOK_HOST = get_neptune_env("GRAPH_NOTEBOOK_HOST")
GRAPH_NOTEBOOK_PORT = get_neptune_env("GRAPH_NOTEBOOK_PORT")
GRAPH_NOTEBOOK_AUTH_MODE = get_neptune_env("GRAPH_NOTEBOOK_AUTH_MODE")
AWS_REGION = get_neptune_env("AWS_REGION")
USE_IAM_AUTH = GRAPH_NOTEBOOK_AUTH_MODE != 'DEFAULT'

lpg_discovery.set_neptune_env(GRAPH_NOTEBOOK_HOST, GRAPH_NOTEBOOK_PORT, AWS_REGION, USE_IAM_AUTH)


## Load some data if you like
If not, skip this section and we'll work with the data already in your database.

In [None]:
%seed --model property_graph --dataset airports

In [None]:
%seed --model property_graph --dataset fraud_graph

In [None]:
%seed --model property_graph --dataset knowledge-graph

### Movies

In [None]:
# AWS_REGION was read from the notebook environment in the setup cell above

# Dynamically build the S3 path based on the region
s3datapath = "s3://ee-assets-prod-" + AWS_REGION + \
    "/modules/f3f89ef4607743429fb01ae23d983197/v1/workshop/data-v2/imdb-pg/"

# Use the Neptune Workbench's %load magic to start a bulk load of the movies data from S3
%load -s {s3datapath} -f csv -p OVERSUBSCRIBE --store-to result1 --run

In [None]:
%load_status {result1['payload']['loadId']} --details --errors

## Use Summary API to Get Schema

Start with the summary API to get a basic schema from the graph statistics, then dig a bit deeper with some queries.

Here's a reference for this approach: https://github.com/aws/amazon-neptune-for-graphql/blob/main/src/NeptuneSchema.js

In [None]:
%summary --detail --store-to pgsummary propertygraph
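
For orientation, the sketch below shows how node labels, edge labels, and property names could be pulled out of a detailed summary response. The `sample_summary` dictionary is a hand-written stand-in that assumes the `payload.graphSummary` layout of the summary API response, and `summarize_labels` is a hypothetical helper. The object that `--store-to` saves in `pgsummary` may be wrapped differently, so inspect it before relying on these keys.

```python
# Hand-written stand-in for a detailed property graph summary response.
# The values are illustrative only, not output from a real cluster.
sample_summary = {
    "status": "200 OK",
    "payload": {
        "graphSummary": {
            "numNodes": 3,
            "numEdges": 2,
            "nodeLabels": ["country", "airport"],
            "edgeLabels": ["route", "contains"],
            "nodeProperties": [{"code": 3}, {"desc": 3}],
            "edgeProperties": [{"dist": 1}],
            "nodeStructures": [
                {
                    "count": 2,
                    "nodeProperties": ["code", "desc"],
                    "distinctOutgoingEdgeLabels": ["route"],
                }
            ],
            "edgeStructures": [{"count": 1, "edgeProperties": ["dist"]}],
        }
    },
}


def summarize_labels(summary):
    """Return sorted labels and property names from a summary dictionary.

    Hypothetical helper: accepts the full response, the payload,
    or the graphSummary section itself.
    """
    graph = summary.get("payload", summary).get("graphSummary", summary)
    # nodeProperties and edgeProperties are lists of {property_name: count} entries
    node_props = {name for entry in graph.get("nodeProperties", []) for name in entry}
    edge_props = {name for entry in graph.get("edgeProperties", []) for name in entry}
    return {
        "node_labels": sorted(graph.get("nodeLabels", [])),
        "edge_labels": sorted(graph.get("edgeLabels", [])),
        "node_properties": sorted(node_props),
        "edge_properties": sorted(edge_props),
    }


basics = summarize_labels(sample_summary)
print(basics)
```

The labels found this way are the natural starting point for the per-label queries in the next step.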

## Run discovery
Run openCypher introspection queries.
The result is a list of the introspected node types, their properties, and their relationships.

In [None]:
observation=lpg_discovery.discover(pgsummary)

observation
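
As a rough illustration of what label-by-label introspection can look like, the sketch below builds openCypher query strings for a list of labels. The helper names and the query shapes are assumptions made for this example. They are not taken from `lpg_discovery`, whose actual queries may differ.

```python
def quote_label(label):
    """Wrap a label in backticks so names with hyphens or spaces stay valid."""
    if "`" in label:
        # Reject rather than guess at an escaping rule for embedded backticks
        raise ValueError(f"label contains a backtick: {label!r}")
    return f"`{label}`"


def node_property_query(label, sample_size=100):
    """Build a query that samples nodes with a label and lists their property keys."""
    return (
        f"MATCH (n:{quote_label(label)}) "
        f"WITH n LIMIT {int(sample_size)} "
        "UNWIND keys(n) AS key "
        "RETURN DISTINCT key"
    )


def edge_endpoint_query(edge_label, sample_size=100):
    """Build a query that samples edges with a label and lists the endpoint labels."""
    return (
        f"MATCH (a)-[r:{quote_label(edge_label)}]->(b) "
        f"WITH a, b LIMIT {int(sample_size)} "
        "RETURN DISTINCT labels(a) AS from_labels, labels(b) AS to_labels"
    )


# Illustrative labels; in practice these would come from the summary
queries = [node_property_query(label) for label in ["airport", "country"]]
queries.append(edge_endpoint_query("route"))
for query in queries:
    print(query)
```

Sampling with `LIMIT` keeps each query cheap on large graphs, at the cost of possibly missing properties that appear only on rare nodes.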


## Build PlantUML spec
Map the observed schema to a UML class diagram in PlantUML form.

In [None]:
plantspec = lpg_discovery.to_plant_uml(observation)
print(plantspec)
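
To make the mapping concrete, here is a minimal sketch that turns a small schema description into PlantUML class-diagram text. The `sample_observation` structure and the `sketch_plant_uml` function are hypothetical and exist only for this illustration. The real shape of `observation` and the output of `lpg_discovery.to_plant_uml` may differ.

```python
# Hypothetical schema description: node labels with typed properties,
# plus edges given as (label, from, to) records.
sample_observation = {
    "nodes": {
        "airport": {"code": "String", "elev": "Integer"},
        "country": {"code": "String"},
    },
    "edges": [
        {"label": "route", "from": "airport", "to": "airport"},
        {"label": "contains", "from": "country", "to": "airport"},
    ],
}


def sketch_plant_uml(obs):
    """Render node labels as classes and edges as labeled associations."""
    lines = ["@startuml"]
    for label, props in sorted(obs["nodes"].items()):
        lines.append(f"class {label} {{")
        for name, dtype in sorted(props.items()):
            lines.append(f"  {name} : {dtype}")
        lines.append("}")
    for edge in obs["edges"]:
        source, target, name = edge["from"], edge["to"], edge["label"]
        lines.append(f"{source} --> {target} : {name}")
    lines.append("@enduml")
    return "\n".join(lines)


uml_text = sketch_plant_uml(sample_observation)
print(uml_text)
```

This sketch assumes labels are plain identifiers. Labels containing hyphens or spaces would need quoting or aliasing in PlantUML.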

## Render from the Spec
Draw the diagram with PlantUML.

In [None]:
# Invoke the plantuml cell magic programmatically so the spec can come from a variable
ipython = get_ipython()
ipython.run_cell_magic("plantuml", "-n lpg_all", plantspec)

## Too much? Whittle it down

Keep just the airport-related classes.

In [None]:
class_filter = ["continent", "country", "airport", "version"]
plantspec = lpg_discovery.to_plant_uml(observation, class_filter)
print(plantspec)
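
One way to picture the filtering step is to keep only the listed classes and the relationships whose endpoints both survive. The sketch below uses a hypothetical schema dictionary and helper name. It is not the logic inside `lpg_discovery.to_plant_uml`, which may treat the filter differently.

```python
def filter_schema(schema, class_filter):
    """Keep the listed node labels and the edges that connect two kept labels."""
    keep = set(class_filter)
    nodes = {label: props for label, props in schema["nodes"].items() if label in keep}
    edges = [
        edge for edge in schema["edges"]
        if edge["from"] in keep and edge["to"] in keep
    ]
    return {"nodes": nodes, "edges": edges}


# Hypothetical mixed schema with airport and movie data side by side
sample_schema = {
    "nodes": {
        "airport": {"code": "String"},
        "country": {"code": "String"},
        "Movie": {"title": "String"},
        "Person": {"name": "String"},
    },
    "edges": [
        {"label": "contains", "from": "country", "to": "airport"},
        {"label": "route", "from": "airport", "to": "airport"},
        {"label": "actedIn", "from": "Person", "to": "Movie"},
    ],
}

filtered = filter_schema(sample_schema, ["continent", "country", "airport", "version"])
print(sorted(filtered["nodes"]))
print([edge["label"] for edge in filtered["edges"]])
```

Labels in the filter that do not exist in the schema, such as `continent` in this sample, are simply ignored.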

In [None]:
ipython = get_ipython()
ipython.run_cell_magic("plantuml", "-n lpg_airport", plantspec)