<p>
  <a href="https://colab.research.google.com/github/neo4j-partners/hands-on-lab-neo4j-and-vertex-ai/blob/main/Lab%204%20-%20Exploring%20Data/exploring_cypher.ipynb" target="_blank">
    <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
  </a>
</p>

First, install the Neo4j Python driver.

In [None]:
!pip install --quiet --upgrade neo4j

You'll need to enter the credentials from your Neo4j instance below.

The default database name, DB_NAME, is neo4j.

In [None]:
# Edit this variable!
DB_URL = "neo4j://34.148.114.80:7687"

# You can leave these defaults
DB_USER = "neo4j"
DB_PASS = "foo123"
DB_NAME = "neo4j"
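
A mistyped URL is a common reason for a connection to fail later on, so a quick sanity check before creating the driver can help. The following is only a sketch: `check_db_url` is a hypothetical helper, not part of the neo4j package, and it looks only at the URI scheme and host.

```python
from urllib.parse import urlsplit

# URI schemes accepted by the Neo4j Python driver.
KNOWN_SCHEMES = {"neo4j", "neo4j+s", "neo4j+ssc", "bolt", "bolt+s", "bolt+ssc"}


def check_db_url(url):
    """Hypothetical helper: return a list of problems found in a connection URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in KNOWN_SCHEMES:
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    return problems


# The URL from the cell above passes both checks, so this prints an empty list.
print(check_db_url("neo4j://34.148.114.80:7687"))
```

An empty list means the URL at least has a recognized scheme and a host. It does not prove that the server is reachable.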

In [None]:
import pandas as pd
from neo4j import GraphDatabase

driver = GraphDatabase.driver(DB_URL, auth=(DB_USER, DB_PASS))

Now that the driver is set up, let's try running a few queries. Earlier in the labs, we ran a query on the S&P 500 ETF, SPY. Let's try it again on our new, indexed data set.

In [None]:
with driver.session(database=DB_NAME) as session:
  # execute_read supersedes the deprecated read_transaction (driver 5.0 and later)
  result = session.execute_read(
    lambda tx: tx.run(
      """
        MATCH (n:Company{cusip:"78462F103"}) RETURN n
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)

One result! Looks like our indexing worked. We can try other Cypher queries as well, but let's try something new.

Neo4j has a library of procedures, analogous to stored procedures in the RDBMS world, called Awesome Procedures on Cypher (APOC).

Let's try running Cypher through the APOC interface.

In [None]:
# node labels
with driver.session(database=DB_NAME) as session:
  # execute_read supersedes the deprecated read_transaction (driver 5.0 and later)
  result = session.execute_read(
    lambda tx: tx.run(
      """
        CALL db.labels() YIELD label
        CALL apoc.cypher.run('MATCH (:`'+label+'`) RETURN count(*) as freq', {})
        YIELD value
        RETURN label, value.freq AS freq
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)

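Note that this yielded a usable dataframe rather than the embedded JSON blob we got from the first query. The difference is that this query returns scalar values (label and freq), while the first one returned a whole node.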
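
If you do return a whole node, its properties arrive nested under the return variable. The sketch below shows one way to lift them into top-level columns before building a dataframe. `flatten_records` is a hypothetical helper, and the `companyName` property in the sample is made up for illustration; only `cusip` comes from the query above.

```python
def flatten_records(records, key="n"):
    """Hypothetical helper: lift the properties nested under `key` to the top level."""
    flat = []
    for record in records:
        # Keep every other returned column as-is.
        row = {k: v for k, v in record.items() if k != key}
        # Merge the node's properties into the row (skip if the value is missing).
        row.update(record.get(key) or {})
        flat.append(row)
    return flat


# Made-up sample shaped like the output of `.data()` for `RETURN n`.
sample = [{"n": {"cusip": "78462F103", "companyName": "EXAMPLE CO"}}]
print(flatten_records(sample))
```

In the notebook you could then call `pd.DataFrame(flatten_records(result))`. Another option is to return the properties explicitly in Cypher, for example `RETURN n.cusip AS cusip`.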

In [None]:
# relationship types
with driver.session(database=DB_NAME) as session:
  # execute_read supersedes the deprecated read_transaction (driver 5.0 and later)
  result = session.execute_read(
    lambda tx: tx.run(
      """
        CALL db.relationshipTypes() YIELD relationshipType AS type
        CALL apoc.cypher.run('MATCH ()-[:`'+type+'`]->() RETURN count(*) as freq', {})
        YIELD value
        RETURN type AS relationshipType, value.freq AS freq
        ORDER BY freq DESC
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)
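
The APOC calls above assemble their inner query by concatenating each label or relationship type between backticks, so that names containing spaces or punctuation still parse. The same string construction can be sketched in plain Python. Both functions below are hypothetical helpers written for illustration; they double any backtick inside the name, which is how Cypher escapes a backtick within a quoted identifier.

```python
def escape_identifier(name):
    """Hypothetical helper: quote a label or type name for use in Cypher."""
    # A backtick inside a backtick-quoted identifier is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


def count_query_for_label(label):
    """Build the per-label counting query used with apoc.cypher.run above."""
    return f"MATCH (:{escape_identifier(label)}) RETURN count(*) AS freq"


def count_query_for_type(rel_type):
    """Build the per-relationship-type counting query."""
    return f"MATCH ()-[:{escape_identifier(rel_type)}]->() RETURN count(*) AS freq"


print(count_query_for_label("Company"))
```

Because the names here come from `db.labels()` and `db.relationshipTypes()` rather than from user input, the escaping is mostly a safeguard against unusual names.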

If you like, you can try creating your own queries as well.