<a href="https://colab.research.google.com/github/ESIPFed/wildlife-with-neo4j/blob/main/Use_cases_v2_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Run this only on cold-start or after factory reset runtime
# Module dependency installation block

!pip install neo4j
!pip install "graphistry[all]"

In [None]:
# Constants declaration section

# Constants for Neo4j in the cloud
DATABASE_AURA_UID = "neo4j"
DATABASE_AURA_PWD = "Qmr2Yi06ACQxz7RxiJxyM261tYDlwn8Khamnn1Bic5M"
DATABASE_AURA_URI = "//bde80852.databases.neo4j.io"
DATABASE_AURA_CONNECTION_SCHEME = "neo4j+s:"
DATABASE_AURA_CONNECTION_SCHEME_AND_URI = DATABASE_AURA_CONNECTION_SCHEME + DATABASE_AURA_URI

# Constants for Graphistry
GRAPHISTRY_API_NUMBER = 3
GRAPHISTRY_PROTOCOL = "https"
GRAPHISTRY_SERVER = "hub.graphistry.com"
GRAPHISTRY_UID = "brian"
GRAPHISTRY_PWD = "bigbirdtweet"

# Neo4j required modules and connect to Neo4j
from neo4j import GraphDatabase
neo4jSessionDriver = GraphDatabase.driver(DATABASE_AURA_CONNECTION_SCHEME_AND_URI, auth=(DATABASE_AURA_UID, DATABASE_AURA_PWD))

# Graphistry required modules and register Graphistry API key
import graphistry
graphistry.register(api=GRAPHISTRY_API_NUMBER, protocol=GRAPHISTRY_PROTOCOL, server=GRAPHISTRY_SERVER, username=GRAPHISTRY_UID, password=GRAPHISTRY_PWD)

# Pandas DataFrame required for working with Graphistry
from pandas import DataFrame


#################### FUNCTION DECLARATIONS ####################

# Helper functions used by Method #2 below

def makeSrcDstList(inputDF):
  # Returns a list of lists that can be used to draw a graph of connected nodes.  This "list of lists" is a "list of edges":
  # each edge is a two-element list holding a source node and a destination node, so the returned list looks like:
  # [[source_node_1, destination_node_1], [source_node_2, destination_node_2], [source_node_3, destination_node_3], ...]
  # inputDF must be a DataFrame in which each column corresponds to a variable returned by the Cypher query.
  srcDstList = []

  # Iterate through the inputDF table.  For each row, register an edge linking the cell in column N to the cell in column N+1.
  # inputDF.shape returns a tuple "(<number of rows>, <number of columns>)".
  for row in range(inputDF.shape[0]):
    for column in range(inputDF.shape[1]-1):
      candidateSrcDst = [inputDF.iloc[row, column], inputDF.iloc[row, column+1]]
      # Add the edge only if it has not been registered already
      if candidateSrcDst not in srcDstList:
        srcDstList.append(candidateSrcDst)
  return srcDstList



def makeGraphistryPlotterObjectFromQueryString(queryString):
  # Function takes a query string and returns a graphistry plotter object ready for visualization
  
  # Issue the query to Neo4j and load the results into a DataFrame.
  # resultDF has one row per record and one column per variable returned by the Cypher query.
  # The "with" block closes the session once the results have been read.
  with neo4jSessionDriver.session() as session:
    queryResults = session.run(queryString)
    resultDF = DataFrame(queryResults.data())

  display(resultDF)

  # Convert the result DataFrame into a list of graph edges that can be used to create a graphistry Plotter object
  graphEdgesDF = DataFrame(makeSrcDstList(resultDF), columns = ['Source', 'Destination'])
  graphistryPlotterObject = graphistry.bind(source="Source", destination="Destination").edges(graphEdgesDF)
  
  return graphistryPlotterObject
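
To make the column N to column N+1 pairing in makeSrcDstList concrete, here is a minimal sketch of the same idea on plain Python rows, with no DataFrame involved. The function name `make_src_dst_list` and the sample rows are hypothetical and are not part of the notebook. The sketch tracks seen edges in a set instead of searching the list, which assumes every cell value is hashable.

```python
def make_src_dst_list(rows):
    """Pair each cell with its right-hand neighbour, dropping duplicate edges."""
    edges = []
    seen = set()
    for row in rows:
        # zip(row, row[1:]) yields (column N, column N+1) for every N
        for source, destination in zip(row, row[1:]):
            if (source, destination) not in seen:
                seen.add((source, destination))
                edges.append([source, destination])
    return edges


# Hypothetical rows shaped like a Cypher result: bird, priority, plan, organization
sample_rows = [
    ("Bird A", "Priority 1", "Plan X", "Org Z"),
    ("Bird A", "Priority 2", "Plan X", "Org Z"),
]

sample_edges = make_src_dst_list(sample_rows)
for edge in sample_edges:
    print(edge)
```

The second row only contributes two new edges, because "Plan X" to "Org Z" was already registered by the first row. This is why shared plans and organizations show up as single nodes in the plot.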

Part 1: What organizations have conservation priorities for Species X?

In [None]:
# CHANGE THE SPECIES NAME BELOW ACCORDINGLY
speciesName = 'Piping Plover'

# DO NOT TOUCH THE REST
query_string = '''
match (bird:Bird {name:"TARGET_1_STRING"})-[:HAS_PRIORITY]->(priority:Priority)-[:HAS_PLAN]->(plan:Plan)-[:HAS_ORGANIZATION]->(organization:Organization)
return bird.hasCommonName, priority.name, plan.name, organization.name
'''

query_string = query_string.replace("TARGET_1_STRING", speciesName)

with neo4jSessionDriver.session() as session:
    queryResults = session.run(query_string)
    resultDF = DataFrame(queryResults.data())

# resultDF is a DataFrame whose rows and columns correspond to those returned by the Cypher query
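
The cell above substitutes the species name into the query text with `str.replace`. A possible alternative, sketched here under assumptions, is to let the driver bind the value as a Cypher parameter, so that quotes in a species name cannot break the query. The names `SPECIES_QUERY` and `fetch_species_rows` are hypothetical; the `session` argument is expected to be a Neo4j driver session such as the one opened from `neo4jSessionDriver`.

```python
# Same query as above, with a $speciesName parameter instead of a placeholder string
SPECIES_QUERY = '''
MATCH (bird:Bird {name: $speciesName})-[:HAS_PRIORITY]->(priority:Priority)-[:HAS_PLAN]->(plan:Plan)-[:HAS_ORGANIZATION]->(organization:Organization)
RETURN bird.hasCommonName, priority.name, plan.name, organization.name
'''


def fetch_species_rows(session, species_name):
    """Run the query with a bound parameter and return a list of row dictionaries."""
    # Keyword arguments to session.run are passed to the server as query parameters
    result = session.run(SPECIES_QUERY, speciesName=species_name)
    return result.data()
```

Inside a `with neo4jSessionDriver.session() as session:` block, `DataFrame(fetch_species_rows(session, speciesName))` should give the same table as the cell above.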

In [None]:
resultDF

| | bird.hasCommonName | priority.name | plan.name | organization.name |
|---|---|---|---|---|
| 0 | Piping Plover | NCWAP 2015 Surveys Priority 361 | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |
| 1 | Piping Plover | NCWAP 2015 Monitoring Priority 370 | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |
| 2 | Piping Plover | NCWAP 2015 Conservation Programs And Partnersh... | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |


### **METHOD #1: HYPERGRAPH API, DON'T LIKE THIS METHOD, NOT RECOMMENDED**



In [None]:
# Method 1: produces a nuisance "row result" entity for every row, which cannot be removed because of the way the hypergraph API does its job

# CHANGE THE SPECIES NAME BELOW ACCORDINGLY
speciesName = 'Piping Plover'

# DO NOT TOUCH THE REST
query_string = '''
match (bird:Bird {name:"TARGET_1_STRING"})-[:HAS_PRIORITY]->(priority:Priority)-[:HAS_PLAN]->(plan:Plan)-[:HAS_ORGANIZATION]->(organization:Organization)
return bird.hasCommonName, priority.name, plan.name, organization.name
'''

query_string = query_string.replace("TARGET_1_STRING", speciesName)

# The "with" block closes the session once the results have been read
with neo4jSessionDriver.session() as session:
    queryResults = session.run(query_string)
    resultDF = DataFrame(queryResults.data())

viz = graphistry.hypergraph(resultDF, direct=False)
viz['graph'].plot()

# links 12
# events 3
# attrib entities 6
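
The three counts printed above can be reproduced by hand, which also shows where the nuisance "row result" entities come from. The sketch below assumes the indirect hypergraph (`direct=False`) creates one event node per row, one entity node per distinct (column, value) pair, and one link from each event to each of its cell values. The function name `hypergraph_counts` is hypothetical, and the rows are copied from the result table as displayed, truncation included.

```python
def hypergraph_counts(rows):
    """Count the nodes and links an indirect hypergraph would hold for these rows."""
    events = len(rows)  # one "row result" node per row
    # One entity node per distinct (column, value) pair
    entities = {(column, value) for row in rows for column, value in row.items()}
    # Each event node links to every cell value in its row
    links = sum(len(row) for row in rows)
    return {"links": links, "events": events, "attrib entities": len(entities)}


PLAN = "North Carolina Wildlife Action Plan 2015"
ORGANIZATION = "North Carolina Wildlife Resources Commission"
# Priority names as displayed in the result table (the third is truncated there)
PRIORITIES = [
    "NCWAP 2015 Surveys Priority 361",
    "NCWAP 2015 Monitoring Priority 370",
    "NCWAP 2015 Conservation Programs And Partnersh...",
]

sample_rows = [
    {
        "bird.hasCommonName": "Piping Plover",
        "priority.name": priority,
        "plan.name": PLAN,
        "organization.name": ORGANIZATION,
    }
    for priority in PRIORITIES
]

counts = hypergraph_counts(sample_rows)
print(counts)  # {'links': 12, 'events': 3, 'attrib entities': 6}
```

The 3 events are the extra per-row nodes: the bird is never linked to its priorities directly, only through a row node, which is what makes the Method 1 plot harder to read than the edge list built in Method 2.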


### **METHOD #2: GOING WITH THIS METHOD FOR NOW, RESULTS IN A RELATIVELY USER-FRIENDLY VISUALIZATION**






Part 1: What organizations have conservation priorities for Species X? (Part 2, also using Method #2, continues below)

(Insert text written by a wildlife refuge manager who is concerned about managing Species X)

In [None]:
# CHANGE THE SPECIES NAME BELOW ACCORDINGLY
speciesName = 'Piping Plover'

# DO NOT TOUCH THE REST
query_string = '''
match (bird:Bird {name:"TARGET_1_STRING"})-[:HAS_PRIORITY]->(priority:Priority)-[:HAS_PLAN]->(plan:Plan)-[:HAS_ORGANIZATION]->(organization:Organization)
return bird.hasCommonName, priority.name, plan.name, organization.name
'''

query_string = query_string.replace("TARGET_1_STRING", speciesName)

viz = makeGraphistryPlotterObjectFromQueryString(query_string)
viz.plot()


| | bird.hasCommonName | priority.name | plan.name | organization.name |
|---|---|---|---|---|
| 0 | Piping Plover | NCWAP 2015 Surveys Priority 361 | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |
| 1 | Piping Plover | NCWAP 2015 Monitoring Priority 370 | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |
| 2 | Piping Plover | NCWAP 2015 Conservation Programs And Partnersh... | North Carolina Wildlife Action Plan 2015 | North Carolina Wildlife Resources Commission |


Part 2: How are SGCN species being managed through conservation plans?

In [None]:
query_string = '''
MATCH (s:Species:NcSgcn)-[:HAS_PLAN]-(p:Plan) 
RETURN s.name as speciesname, p.name as planname
'''

viz = makeGraphistryPlotterObjectFromQueryString(query_string)
viz.plot()

| | speciesname | planname |
|---|---|---|
| 0 | American Oystercatcher | South Atlantic Migratory Bird Initiative Imple... |
| 1 | American Oystercatcher | Conservation Plan For The American Oystercatch... |
| 2 | American Oystercatcher | Atlantic Flyway Shorebird Business Strategy 2013 |
| 3 | Piping Plover | South Atlantic Migratory Bird Initiative Imple... |
| 4 | Piping Plover | Conservation Plan For The American Oystercatch... |
| ... | ... | ... |
| 151 | Prothonotary Warbler | South Atlantic Migratory Bird Initiative Imple... |
| 152 | Purple Sandpiper | Atlantic Flyway Shorebird Business Strategy 2013 |
| 153 | Red-cockaded Woodpecker | South Atlantic Migratory Bird Initiative Imple... |
| 154 | Red-cockaded Woodpecker | Partners In Flight Landbird Conservation Plan ... |


### **METHOD #3:  CLEANEST APPROACH, BUT RESULTS IN A VISUALIZATION THAT IS NOT USER FRIENDLY... REQUIRES FURTHER INVESTIGATION INTO THE THOROUGHLY UNFRIENDLY GRAPHISTRY API DOCUMENTATION.**






In [None]:
# Method 3: Most elegant method.  Uses a direct call to graphistry's cypher method and then plots the results.  
# Problem is: don't know how to make graphistry display the node name instead of the node ID, and the rest of the graph is not intuitive.

query_string = '''
match (bird:Bird {name:"TARGET_1_STRING"})-[edgepriority:HAS_PRIORITY]->(priority:Priority)-[edgeplan:HAS_PLAN]->(plan:Plan)-[edgeorganization:HAS_ORGANIZATION]->(organization:Organization)
return bird, edgepriority, priority, edgeplan, plan, edgeorganization, organization
'''

query_string = query_string.replace("TARGET_1_STRING", speciesName)

# Register Neo4j connection in Graphistry 
GRAPHISTRY_NEO4j_CREDENTIALS = {'uri': DATABASE_AURA_CONNECTION_SCHEME_AND_URI, 'auth': (DATABASE_AURA_UID, DATABASE_AURA_PWD)}
graphistry.register(bolt=GRAPHISTRY_NEO4j_CREDENTIALS)

# The Cypher method returns a graphistry plotter object, and binds source, destination, and node.
results = graphistry.cypher(query_string)

results.plot()
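
One thing worth trying for the node ID problem noted in the comments above: Graphistry plotter objects accept a `point_title` binding that names the node column to show as each node's title. The sketch below is untested against this database. It assumes the Neo4j `name` property comes through as a node column called `name`, and the helper name `title_nodes_by_property` is hypothetical.

```python
def title_nodes_by_property(plotter, property_name="name"):
    """Return a new plotter that titles each node with the given node column."""
    # point_title tells Graphistry which node column to display as the node title
    return plotter.bind(point_title=property_name)
```

Usage would be `title_nodes_by_property(graphistry.cypher(query_string)).plot()`. If the titles still show IDs, the column name assumption is probably wrong for this data and the node table would need inspecting.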