# How to connect Neo4j to Hopsworks
In this notebook we will:
* import data into Neo4j
* use Neo4j's Graph Data Science library to calculate node2vec node embeddings and store them on the nodes in the graph database
* read these embeddings into a dataframe
* create feature groups in a Hopsworks feature store

## Step 1: Importing the data into Neo4j

First we do a few imports and set the connection parameters.

In [1]:
from neo4j import GraphDatabase
from graphdatascience import GraphDataScience

URI = "bolt://localhost:7687"
AUTH = ("neo4j", "changeme")
DATABASE = "neo4j2"

Then we create a few indexes in Neo4j.

In [2]:
with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.execute_query("create text index party_id_index for (p:Party) on (p.partyId)", database_=DATABASE)
    driver.execute_query("create text index party_type_index for (p:Party) on (p.partyType)", database_=DATABASE)
    driver.execute_query("create text index transaction_id_index for ()-[r:TRANSACTION]-() ON r.tran_id", database_=DATABASE)
    driver.execute_query("create range index transaction_timestamp_index for ()-[r:TRANSACTION]-() ON r.tran_timestamp", database_=DATABASE)
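Re-running the cell above against a database that already holds these indexes would fail, because the index names are taken. The following is a minimal sketch of an idempotent variant, assuming a Neo4j version that supports the `IF NOT EXISTS` clause. The helper names `index_statements` and `create_indexes` are hypothetical and not part of the original notebook; `driver` is expected to be an open driver created as shown above.

```python
# Index definitions copied from the cell above: (kind, name, pattern, property).
INDEXES = [
    ("text", "party_id_index", "(p:Party)", "p.partyId"),
    ("text", "party_type_index", "(p:Party)", "p.partyType"),
    ("text", "transaction_id_index", "()-[r:TRANSACTION]-()", "r.tran_id"),
    ("range", "transaction_timestamp_index", "()-[r:TRANSACTION]-()", "r.tran_timestamp"),
]


def index_statements(indexes=INDEXES):
    """Build one CREATE ... INDEX ... IF NOT EXISTS statement per definition."""
    return [
        f"CREATE {kind.upper()} INDEX {name} IF NOT EXISTS FOR {pattern} ON ({prop})"
        for kind, name, pattern, prop in indexes
    ]


def create_indexes(driver, database):
    """Run every statement; IF NOT EXISTS makes repeated calls harmless."""
    for statement in index_statements():
        driver.execute_query(statement, database_=database)


# Hypothetical usage, with the names defined earlier in this notebook:
# with GraphDatabase.driver(URI, auth=AUTH) as driver:
#     create_indexes(driver, DATABASE)
```

Keeping the definitions in a list also makes it easy to print the generated statements before sending them to the server.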

Next we import the first .csv file, which holds the (:Party) nodes. This finishes quickly, as there are only about 7,300 nodes.

In [3]:
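# The driver from the previous cell was closed when its `with` block exited, so open a new one.
with GraphDatabase.driver(URI, auth=AUTH) as driver:
    with driver.session(database=DATABASE) as session:
        result = session.run("""
            load csv with headers from "https://repo.hops.works/master/hopsworks-tutorials/data/aml/party.csv" as parties
            create (p:Party)
            set p = parties
        """)
        # Consume the result while the session is still open to read the update counters.
        print(result.consume().counters)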

{'_contains_updates': True, 'labels_added': 7347, 'nodes_created': 7347, 'properties_set': 14694}
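The counters can be sanity-checked with a little arithmetic. The sketch below copies the values from the output above and derives the number of properties set per node. That the two properties are `partyId` and `partyType` is an assumption based on the indexes created earlier, not something the output itself states.

```python
# Counter values copied from the cell output above.
counters = {
    "_contains_updates": True,
    "labels_added": 7347,
    "nodes_created": 7347,
    "properties_set": 14694,
}

# Every created node should carry exactly one label (:Party).
labels_per_node = counters["labels_added"] / counters["nodes_created"]

# `set p = parties` copies every CSV column onto the node, so this ratio
# is the number of columns per CSV row.
properties_per_node = counters["properties_set"] / counters["nodes_created"]

print(labels_per_node, properties_per_node)  # 1.0 2.0
```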


Next we import the relationships. There are approximately 430k [:TRANSACTION] relationships, and importing them takes a few minutes.

In [4]:
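# Open a fresh driver; the one from the earlier `with` block is already closed.
with GraphDatabase.driver(URI, auth=AUTH) as driver:
    with driver.session(database=DATABASE) as session:
        # CALL { ... } IN TRANSACTIONS commits every 2500 rows. It needs an
        # auto-commit transaction, which session.run provides.
        result = session.run("""
            LOAD CSV WITH HEADERS FROM "https://repo.hops.works/master/hopsworks-tutorials/data/aml/transactions.csv" AS Transaction
            MATCH (startNode:Party)
            WHERE startNode.partyId = Transaction.src
            CALL {
                WITH Transaction, startNode
                MATCH (endNode:Party)
                WHERE endNode.partyId = Transaction.dst
                CREATE (startNode)-[rel:TRANSACTION {tran_id: Transaction.tran_id, tx_type: Transaction.tx_type, base_amt: Transaction.base_amt, tran_timestamp: datetime(Transaction.tran_timestamp)}]->(endNode)
            } IN TRANSACTIONS OF 2500 ROWS
        """)
        # Consume the result while the session is still open to read the update counters.
        print(result.consume().counters)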

This completes the data import into Neo4j.
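To confirm that the import produced the expected graph, the node and relationship counts can be read back from the database. This is a minimal sketch rather than part of the original notebook: the helper name `graph_counts` and the `COUNT_QUERIES` dictionary are hypothetical, and `driver` is assumed to be an open driver created with `GraphDatabase.driver(URI, auth=AUTH)` as above.

```python
# One counting query per element type imported in this step.
COUNT_QUERIES = {
    "parties": "MATCH (p:Party) RETURN count(p) AS n",
    "transactions": "MATCH ()-[r:TRANSACTION]->() RETURN count(r) AS n",
}


def graph_counts(driver, database):
    """Return a dict mapping each key of COUNT_QUERIES to its count."""
    counts = {}
    for name, query in COUNT_QUERIES.items():
        # execute_query returns (records, summary, keys); only records are needed.
        records, _summary, _keys = driver.execute_query(query, database_=database)
        counts[name] = records[0]["n"]
    return counts


# Hypothetical usage, with the names defined earlier in this notebook:
# with GraphDatabase.driver(URI, auth=AUTH) as driver:
#     print(graph_counts(driver, DATABASE))
```

The party count should match the `nodes_created` counter printed earlier, and the transaction count should be close to the roughly 430k relationships mentioned above.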