<p>
  <a href="https://colab.research.google.com/github/neo4j-partners/hands-on-lab-neo4j-and-vertex-ai/blob/main/Lab%204%20-%20Exploring%20Data/exploring_cypher.ipynb" target="_blank">
    <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
  </a>
</p>

First, install the Neo4j Python driver.

In [1]:
!pip install --quiet --upgrade neo4j

You'll need to enter the credentials from your Neo4j instance below.

The default DB_NAME is always neo4j.

In [2]:
# Edit these variables!
DB_URL = "neo4j+s://d1901d9c.databases.neo4j.io:7687"
DB_PASS = "e4QzO_Ipfii1F6wB7MNMmUm4UxF2S1KAKqS-qWe9DK0"

# You can leave these defaults
DB_USER = "neo4j"
DB_NAME = "neo4j"
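
Hard-coding a password makes it easy to leak when the notebook is shared. As a minimal sketch, the same settings could be read from environment variables instead. The variable names `NEO4J_URI`, `NEO4J_USERNAME`, `NEO4J_PASSWORD`, and `NEO4J_DATABASE` and the helper `load_credentials` are assumptions for illustration; the lab itself does not define them.

```python
import os


def load_credentials(env=os.environ):
    """Read connection settings from a mapping (by default the process environment).

    The variable names are hypothetical; use whatever your setup provides.
    """
    missing = [key for key in ("NEO4J_URI", "NEO4J_PASSWORD") if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {
        "url": env["NEO4J_URI"],
        "password": env["NEO4J_PASSWORD"],
        # Fall back to the same defaults the lab uses.
        "user": env.get("NEO4J_USERNAME", "neo4j"),
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }
```

With this in place, `creds = load_credentials()` could supply `DB_URL`, `DB_PASS`, `DB_USER`, and `DB_NAME` without the secret ever appearing in the notebook.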

In [3]:
import pandas as pd
from neo4j import GraphDatabase

driver = GraphDatabase.driver(DB_URL, auth=(DB_USER, DB_PASS))

Now that the driver is set up (it opens connections lazily, when the first query runs), let's try running a few queries. Earlier in the labs, we ran a query on the S&P 500 ETF, SPY. Let's try it again on our new, indexed data set.

In [4]:
with driver.session(database=DB_NAME) as session:
  result = session.read_transaction(
    lambda tx: tx.run(
      """
        MATCH (n:Company{cusip:"78462F103"}) RETURN n
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)

   n
0  {'cusip': '78462F103', 'nameOfIssuer': 'SPDR S...


One result! Looks like our indexing worked. We could try other Cypher queries as well, but first let's try something new.

Neo4j has a library of procedures, analogous to stored procedures in the RDBMS world, called Awesome Procedures on Cypher (APOC).

Let's try running Cypher through the APOC interface.

In [5]:
# node labels
with driver.session(database=DB_NAME) as session:
  result = session.read_transaction(
    lambda tx: tx.run(
      """
        CALL db.labels() YIELD label
        CALL apoc.cypher.run('MATCH (:`'+label+'`) RETURN count(*) as freq', {})
        YIELD value
        RETURN label, value.freq AS freq
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)

     label    freq
0  Manager    3906
1  Company    7342
2  Holding  446922


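Note that this yielded a usable dataframe rather than the embedded JSON blob we got from the first query. The difference comes from what the query returns: here each column is a scalar (`label`, `freq`), whereas `RETURN n` hands back a whole node, which arrives as a single dictionary of properties.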
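
A node returned whole arrives as a dictionary of its properties, which pandas keeps in a single column. Below is a minimal sketch of one way to spread those properties into separate columns. `flatten_records` is a hypothetical helper, and the sample record is made up to mimic the shape that `.data()` returns for `RETURN n`.

```python
def flatten_records(records):
    """Spread dict-valued columns (nodes) into top-level columns named '<column>.<property>'."""
    flat = []
    for record in records:
        row = {}
        for key, value in record.items():
            if isinstance(value, dict):
                # A node: promote each property to its own column.
                for prop, prop_value in value.items():
                    row[f"{key}.{prop}"] = prop_value
            else:
                # A scalar column: keep it unchanged.
                row[key] = value
        flat.append(row)
    return flat


# Made-up record in the shape .data() produces for `RETURN n`.
sample = [{"n": {"cusip": "000000000", "nameOfIssuer": "EXAMPLE CO"}}]
print(flatten_records(sample))
```

Passing the flattened list to `pd.DataFrame` gives one column per property. The alternative is to ask for the properties directly in Cypher, for example `RETURN n.cusip AS cusip, n.nameOfIssuer AS nameOfIssuer`.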

In [6]:
# relationship types
with driver.session(database=DB_NAME) as session:
  result = session.read_transaction(
    lambda tx: tx.run(
      """
        CALL db.relationshipTypes() YIELD relationshipType AS type
        CALL apoc.cypher.run('MATCH ()-[:`'+type+'`]->() RETURN count(*) as freq', {})
        YIELD value
        RETURN type AS relationshipType, value.freq AS freq
        ORDER BY freq DESC
      """
    ).data()
  )
df = pd.DataFrame(result)
display(df)

  relationshipType    freq
0             OWNS  446922
1           PARTOF  446922

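
Both APOC cells build one Cypher string per label or relationship type and hand it to `apoc.cypher.run`. The sketch below reproduces that string building in Python to show what APOC receives for each row. The helper names are hypothetical, and the backtick doubling is an extra safeguard that the queries above do not apply.

```python
def quote_identifier(name):
    """Wrap a label or relationship type in backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def label_count_query(label):
    """Cypher that counts the nodes carrying the given label."""
    return "MATCH (:" + quote_identifier(label) + ") RETURN count(*) AS freq"


def relationship_count_query(rel_type):
    """Cypher that counts the relationships of the given type."""
    return "MATCH ()-[:" + quote_identifier(rel_type) + "]->() RETURN count(*) AS freq"


print(label_count_query("Company"))
print(relationship_count_query("OWNS"))
```

The backticks matter because labels and relationship types cannot be passed as query parameters, so they have to be written into the query text itself.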

If you like, you can try creating your own queries as well.
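
For your own queries, a small wrapper can remove the repeated session boilerplate. This is a sketch, not part of the lab: `run_read` is a hypothetical helper, and it assumes version 5 or later of the `neo4j` driver, where `execute_read` replaces the deprecated `read_transaction`. It also passes values as query parameters instead of pasting them into the Cypher text.

```python
def run_read(driver, query, parameters=None, database="neo4j"):
    """Run a read-only Cypher query and return the rows as a list of dicts."""

    def work(tx):
        # Consume the result inside the transaction; it cannot be read afterwards.
        return tx.run(query, parameters or {}).data()

    with driver.session(database=database) as session:
        return session.execute_read(work)


# Example use in this notebook (needs the live `driver` created earlier):
# rows = run_read(
#     driver,
#     "MATCH (n:Company {cusip: $cusip}) RETURN n.nameOfIssuer AS name",
#     {"cusip": "78462F103"},
# )
# display(pd.DataFrame(rows))
```

Returning plain rows keeps the helper independent of pandas; wrap the result in `pd.DataFrame` when you want a table.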