<table align="left">

  <td>
    <a href="https://colab.research.google.com/github/neo4j-partners/apevue-knowledge-graph/blob/master/load.ipynb" target="_blank">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/neo4j-partners/apevue-knowledge-graph/blob/master/load.ipynb" target="_blank">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/load.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">Open in Vertex AI Workbench
    </a>
  </td>
</table>

# Load
In this notebook, you will learn how to load data from [ApeVue](https://apevue.com/) into Neo4j AuraDS. This is private equity data from the ApeVue 50, an index of private firms. The dataset includes information about subindices (sectors), investors, and returns during H1 2022.

## Connect to Neo4j
We assume you've already deployed a Neo4j AuraDS instance on GCP. If you haven't, you can deploy one from the [Google Cloud Marketplace listing](https://console.cloud.google.com/marketplace/product/endpoints/prod.n4gcp.neo4j.io).

Neo4j has two Python APIs we can use to connect. The Graph Database API is the standard Neo4j way to interface with the database. The Graph Data Science API is a simplified wrapper that hides transaction semantics. We're going to use the second one. To do so, we first need to install its Python client package with this command.

In [None]:
%pip install graphdatascience

Next, you need the connection string and credentials from your AuraDS deployment. Fill in the variables below.

In [None]:
# Edit these variables!
DB_URL = "neo4j+s://XXXXX.databases.neo4j.io"
DB_PASS = "<your-password>"

# You can leave this default
DB_USER = "neo4j"
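
Before connecting, it can help to confirm that the placeholders above were actually replaced. The following is a minimal sketch and not part of the original notebook: `check_connection_settings` is a hypothetical helper, and the rule that the URL must start with `neo4j+s://` is an assumption taken from the template value above.

```python
def check_connection_settings(url, user, password):
    """Return a list of problems; an empty list means the settings look filled in."""
    problems = []
    # Assumption: AuraDS connection strings use the encrypted neo4j+s:// scheme,
    # as in the template value above.
    if not url.startswith("neo4j+s://"):
        problems.append("DB_URL should start with neo4j+s://")
    if "XXXXX" in url:
        problems.append("DB_URL still contains the XXXXX placeholder")
    if not user:
        problems.append("DB_USER is empty")
    if not password or password == "<your-password>":
        problems.append("DB_PASS still contains the placeholder password")
    return problems


# The unedited template values trigger two warnings.
template_problems = check_connection_settings(
    "neo4j+s://XXXXX.databases.neo4j.io", "neo4j", "<your-password>"
)
for problem in template_problems:
    print(problem)
```

In the notebook itself, the call would be `check_connection_settings(DB_URL, DB_USER, DB_PASS)`, and an empty list suggests the values are ready to use.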

In [None]:
from graphdatascience import GraphDataScience

gds = GraphDataScience(DB_URL, auth=(DB_USER, DB_PASS))

## Load Data into Neo4j
Now that we've got our connection object, let's load the dataset into Neo4j.

Let's start by defining node key constraints, so that every company, investor, and sector has a unique, non-null name.

In [None]:
result = gds.run_cypher(
    "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Company) REQUIRE (p.name) IS NODE KEY;"
)
display(result)

result = gds.run_cypher(
    "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Investor) REQUIRE (p.name) IS NODE KEY;"
)
display(result)

result = gds.run_cypher(
    "CREATE CONSTRAINT IF NOT EXISTS FOR (p:Sector) REQUIRE (p.name) IS NODE KEY;"
)
display(result)
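
The three statements above differ only in the node label, so they could also be generated in a loop. This is a sketch rather than the notebook's own approach: `node_key_constraint` is a hypothetical helper, and it uses string formatting because the label is part of the schema statement itself, so only trusted label names should be passed in.

```python
def node_key_constraint(label, prop="name"):
    """Build the Cypher statement that makes `prop` a node key for `label`."""
    return (
        "CREATE CONSTRAINT IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE (n.{prop}) IS NODE KEY;"
    )


LABELS = ["Company", "Investor", "Sector"]
statements = [node_key_constraint(label) for label in LABELS]

for statement in statements:
    print(statement)
    # With a live connection, each statement could be executed with:
    # gds.run_cypher(statement)
```

Because each statement includes `IF NOT EXISTS`, rerunning the loop should not fail when a constraint is already in place.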

Next, let's load the nodes.

In [None]:
result = gds.run_cypher(
    """
        LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/data/sectors.csv" AS row
        MERGE (x:Sector {name:row.Sector})
    """
)
display(result)

In [None]:
result = gds.run_cypher(
    """
        LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/data/investors.csv" AS row
        MERGE (x:Investor {name:row.Investor})
    """
)
display(result)

In [None]:
result = gds.run_cypher(
    """
        LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/data/returns.csv" AS row
        MERGE (x:Company {name:row.Company})
        ON CREATE SET
            x.Return=toFloat(row.Return),
            x.OpenInterest=toInteger(row.OpenInterest),
            x.Depth=toInteger(row.Depth)
    """
)
display(result)
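
The `toFloat` and `toInteger` calls in the query above are needed because `LOAD CSV` reads every field as a string. Python's `csv` module behaves the same way, so the idea can be sketched locally. The rows below are made up for illustration and are not values from the ApeVue dataset; only the column names are taken from the query above.

```python
import csv
import io

# Made-up rows for illustration only; these are not values from the ApeVue dataset.
SAMPLE_CSV = """Company,Return,OpenInterest,Depth
ExampleCo,0.125,40,3
SampleCorp,-0.05,12,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Every field arrives as a string, just as it does in LOAD CSV.
raw_types = {key: type(value).__name__ for key, value in rows[0].items()}
print(raw_types)

# Explicit conversion, analogous to toFloat() and toInteger() in the Cypher query.
companies = [
    {
        "name": row["Company"],
        "Return": float(row["Return"]),
        "OpenInterest": int(row["OpenInterest"]),
        "Depth": int(row["Depth"]),
    }
    for row in rows
]
print(companies[0])
```

Note that the query sets these properties under `ON CREATE SET`, so rerunning the cell leaves the properties of existing `Company` nodes unchanged.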

Now let's create relationships between those nodes.

In [None]:
result = gds.run_cypher(
    """
        LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/data/sectors.csv" AS row
        MATCH (s:Sector {name:row.Sector})
        MATCH (c:Company {name:row.Company})
        MERGE (s)-[r:CONTAINS]->(c)
    """
)
display(result)

In [None]:
result = gds.run_cypher(
    """
        LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-partners/apevue-knowledge-graph/main/data/investors.csv" AS row
        MATCH (c:Company {name:row.Company})
        MATCH (i:Investor {name:row.Investor})
        MERGE (i)-[r:OWNS]->(c)
    """
)
display(result)
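
The relationship queries above use `MATCH`, so a CSV row whose company, sector, or investor is missing from the graph is skipped without an error. A few count queries can confirm that the load did what you expected. This is a sketch under stated assumptions: `CHECKS`, `run_checks`, and `echo_runner` are hypothetical names, and the stand-in runner only echoes each query so the example works without a database.

```python
# Cypher sanity checks keyed by a short description.
CHECKS = {
    "companies": "MATCH (c:Company) RETURN count(c) AS n",
    "investors": "MATCH (i:Investor) RETURN count(i) AS n",
    "sectors": "MATCH (s:Sector) RETURN count(s) AS n",
    "contains": "MATCH (:Sector)-[r:CONTAINS]->(:Company) RETURN count(r) AS n",
    "owns": "MATCH (:Investor)-[r:OWNS]->(:Company) RETURN count(r) AS n",
    "companies_without_sector": (
        "MATCH (c:Company) WHERE NOT (:Sector)-[:CONTAINS]->(c) "
        "RETURN count(c) AS n"
    ),
}


def run_checks(run_cypher, checks=CHECKS):
    """Run each query through `run_cypher` and collect the results by name."""
    return {name: run_cypher(query) for name, query in checks.items()}


def echo_runner(query):
    """Stand-in for a database call: returns the query text unchanged."""
    return query


preview = run_checks(echo_runner)
for name, query in preview.items():
    print(f"{name}: {query}")

# With a live connection, pass the real method instead:
# results = run_checks(gds.run_cypher)
```

A nonzero count for `companies_without_sector` would suggest that some rows in `sectors.csv` did not match a loaded company.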

## Conclusion
In this notebook, you loaded ApeVue data into Neo4j. That created a knowledge graph that you can explore with tools like Neo4j Browser and Neo4j Bloom.