# Neo4j Python API Practice (with CSV Files on the Internet)
### Overview:

Teresa will write her own Python code in a Jupyter Notebook, in which...
- She will access a local Neo4j database
- She will reference a folder full of .cypher files
- She will run Cypher queries to load/create the entire database from CSV files
- Once all other steps succeed, she will attempt to connect to a cloud DB and run queries there.
  
### Motivation:
- Establish Python code that can be used in an automated pipeline.
- Progress toward a "touch-less" graph database and data management system.

## Step 0:

If you are attempting to run this code:

**Local Database**: you will first need to start your own local DBMS using the [Neo4j Desktop App](https://neo4j.com/download/).

The default Bolt port is 7687. If your DBMS uses a different port, change the URI in the code to match your settings.

**Cloud Database**: Make sure you have a *practice* cloud database currently running. Do not run this practice against a real database.

## Step 1: Load the required packages, as usual:

In [1]:
from neo4j import GraphDatabase
import os

## Step 2: Connect to a database:

### Establish the DRIVER

#### Local Option

In [2]:
# URI examples: "neo4j://localhost", "neo4j+s://xxx.databases.neo4j.io"
URI = "neo4j://localhost:7687" # Specify URI of already running database
AUTH = ("neo4j", "thisispractice") # Enter username and password (this should probably be a requested input in the final product)

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.verify_connectivity()

#### Cloud Option

In [None]:
# URI examples: "neo4j://localhost", "neo4j+s://xxx.databases.neo4j.io"
URI = "" # Specify URI of already running database
AUTH = ("neo4j", "") # Enter username and password (this should probably be a requested input in the final product)

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    driver.verify_connectivity()
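
The comments in the cells above note that the username and password should eventually be requested rather than hard-coded. A minimal sketch of one way to do that, assuming the credentials live in environment variables named `NEO4J_USERNAME` and `NEO4J_PASSWORD` (names chosen here for illustration, not part of this notebook), with an interactive `getpass` prompt as a fallback:

```python
import os
from getpass import getpass

def load_auth(env=None):
    """Build the (username, password) tuple without hard-coding the password.

    NEO4J_USERNAME and NEO4J_PASSWORD are hypothetical variable names
    picked for this sketch.
    """
    env = os.environ if env is None else env
    user = env.get("NEO4J_USERNAME", "neo4j")
    password = env.get("NEO4J_PASSWORD")
    if password is None:
        # Prompt without echoing the password to the notebook output
        password = getpass(f"Password for {user}: ")
    return (user, password)
```

The result can be assigned to `AUTH` in place of the literal tuple, for example `AUTH = load_auth()`.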

## Step 3: Specify the directory containing the cypher files to run

In [3]:
directory = "" # Path to the folder containing the .cypher files (must be filled in before running)

## Step 4: Navigate through folders and execute cypher code(s) in each one

In [4]:
# Define parameters for use in cypher codes (they cannot be hard coded into the queries)
params = {
    "users_link" :  "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_UsersF.csv",
    "OTC_ingredients_link" : "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_OTC_Ingredients.csv",
    "mix_makepre_link" : "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_op_MixMakePre.csv",
    "heat_link" : "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_op_Heat.csv",
    "rest_link" : "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_op_Rest.csv",
    "mix_makesamp_link" : "https://raw.githubusercontent.com/lteresah/AMLS-GraphDatabase/main/Stage2/CSVs_Version2/Neo4j_op_MixMakeSamp.csv",
    "idsToSkip": []
}
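
Cypher refers to a parameter with the `$name` syntax, so each key in `params` is only useful if some query contains a matching placeholder, and a placeholder without a key makes the query fail. The sketch below is one possible pre-flight check. The helper `missing_params` and the example query text are hypothetical (the actual .cypher files are not shown in this notebook), and the regular expression is a simplification that would also match a `$` inside a string literal.

```python
import re

# Matches $name placeholders (simplified: does not skip string literals)
PLACEHOLDER = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")

def missing_params(query, params):
    """Return placeholder names used in `query` that have no key in `params`."""
    return set(PLACEHOLDER.findall(query)) - set(params)

# Hypothetical query text, written only to illustrate the check
example_query = """
LOAD CSV WITH HEADERS FROM $users_link AS row
WITH row WHERE NOT row.id IN $idsToSkip
MERGE (u:User {id: row.id})
"""

print(missing_params(example_query, {"users_link": "https://example.com/users.csv"}))
```

Running a check like this over every file before opening a session would catch a misspelled parameter name early, instead of halfway through a load.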

### Mini Function Repository

In [6]:
# Function that separates .cypher files from subdirectories within the current working directory and sorts both lists alphabetically
# The order is important: node constraints must be created first, then nodes, then relationships, so names must sort in that order
def separate_cyphdir():
    dirobs = os.listdir()
    cyphfiles = []
    dirs = []
    for ob in dirobs:
        if os.path.isfile(ob) and ob.endswith(".cypher"):
            cyphfiles.append(ob)
        if os.path.isdir(ob):
            dirs.append(ob)
    cyphfiles = sorted(cyphfiles)
    dirs = sorted(dirs)
    print(f"Found cypher files: {cyphfiles}")
    print(f"Found directories: {dirs}")
    return cyphfiles, dirs

# Function that executes every .cypher file in the current directory (not including subdirectories)
def execute_cyphs(cyphfiles, params):
    with GraphDatabase.driver(URI, auth=AUTH) as driver:
        with driver.session(database="neo4j") as session:
            for file in cyphfiles:
                # Read the whole file; the context manager closes it afterwards
                with open(file) as f:
                    query = f.read()
                # consume() waits for the query to finish, so any error surfaces here
                session.run(query, params).consume()
                print(f"Executed {file}")
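
`separate_cyphdir` relies on `sorted()`, which compares names character by character. That works for prefixes such as `01_`, `02_`, but a name starting with `10_` sorts before one starting with `2_`. If unpadded numeric prefixes are ever used, a natural-sort key is one option. The sketch below is an assumption about how the files might be named, not something this notebook requires, and the file names are made up for illustration.

```python
import re

def natural_key(name):
    """Split a name into text and number chunks so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

# Made-up file names for illustration
names = ["10_rels.cypher", "2_nodes.cypher", "1_constraints.cypher"]

plain = sorted(names)                      # lexicographic order
natural = sorted(names, key=natural_key)   # numeric-aware order

print(plain)
print(natural)
```

Zero-padding the prefixes (`01_`, `02_`, ..., `10_`) avoids the problem without any extra code.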


#### Plans for a recursive function
For a specified directory:
1. Enter the directory 
2. List elements in directory
3. Separate Cyph Files and Directories
4. Run every cyph file in Directory
5. For each subdirectory in the directory:
   - Repeat steps 1-5
6. Return to directory above 
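
The traversal order in the plan above can also be sketched without changing the working directory. This is an alternative illustration, not the function used in this notebook: the hypothetical generator `iter_cypher_files` yields a folder's own .cypher files first (sorted), then descends into each subfolder (sorted). The demo builds a throwaway folder tree with made-up names to show the resulting order.

```python
import tempfile
from pathlib import Path

def iter_cypher_files(root):
    """Yield .cypher files under root: own files first, then each subfolder."""
    entries = sorted(Path(root).iterdir(), key=lambda p: p.name)
    for path in entries:
        if path.is_file() and path.name.endswith(".cypher"):
            yield path
    for path in entries:
        if path.is_dir():
            yield from iter_cypher_files(path)

# Demo on a temporary folder tree (all names are made up)
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "2_nodes").mkdir()
    (base / "1_constraints.cypher").write_text("// placeholder")
    (base / "2_nodes" / "b.cypher").write_text("// placeholder")
    (base / "2_nodes" / "a.cypher").write_text("// placeholder")
    (base / "notes.txt").write_text("ignored")
    order = [p.relative_to(base).as_posix() for p in iter_cypher_files(base)]

print(order)
```

Because the generator only produces paths, the same list could be previewed before anything is sent to the database.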


In [7]:
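# Function which runs all cypher files within the directory (including subdirectories)
# count tracks the recursion depth (0 for the top-level call)
def execute_cyphs_alldir(directory, params, count=0):
    # Remember where we started so we can return here, even if `directory` is a nested path
    startdir = os.getcwd()
    os.chdir(directory)
    print(f"Entered {os.getcwd()}")
    try:
        cyphs, dirs = separate_cyphdir()
        execute_cyphs(cyphs, params)
        for subdir in dirs:
            execute_cyphs_alldir(subdir, params, count + 1)
    finally:
        # Always restore the original working directory, even if a query fails
        os.chdir(startdir)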

#### Run the recursive function to execute all cypher files in the specified directory

In [None]:
execute_cyphs_alldir(directory, params)
# No driver.close() is needed: every driver above is opened in a `with` block, which closes it automatically