In [None]:
import pandas as pd
import time

import subprocess

# Preparing Nodes to import
In the first step, we open the tables that have been exported by the DataGeneration.ipynb script.

In [None]:
# Note:
# Replace "... Path to Neo4j directory ..." with the actual path to your Neo4j installation directory where
#     the \import folder is located.
# Example: "C:\\Programs\\neo4j-community-2025.x.x\\import\\customers.csv"

customers = pd.read_csv("... Path to Neo4j directory ...\\import\\customers.csv")
terminals = pd.read_csv("... Path to Neo4j directory ...\\import\\terminals.csv")
transactions = pd.read_csv("... Path to Neo4j directory ...\\import\\transactions.csv")

As we saw in DataGeneration.ipynb, we exported the tables with an index column. When the files are read back, pandas automatically names that column "Unnamed: 0". Here, we rename it as follows:
- ":ID(customer_ref)" for the Customer table.
- ":ID(terminal_ref)" for the Terminal table.
- ":ID(transaction_ref)" for the Transaction table.

These columns will play a crucial part in forming the relationship tables. Before we get to those, we first save our edited tables.

In [None]:
customers = customers.rename(columns={"Unnamed: 0": ":ID(customer_ref)"})
terminals = terminals.rename(columns={"Unnamed: 0": ":ID(terminal_ref)"})
transactions = transactions.rename(columns={"Unnamed: 0": ":ID(transaction_ref)"})

customers.to_csv("... Path to Neo4j directory ...\\import\\customers.csv", index=False)
terminals.to_csv("... Path to Neo4j directory ...\\import\\terminals.csv", index=False)
transactions.to_csv("... Path to Neo4j directory ...\\import\\transactions.csv", index=False)
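
As a quick sanity check of the renaming step, the following minimal sketch reproduces it on a tiny in-memory table instead of the real export. The sample rows and their `CUSTOMER_ID` values are invented for illustration. The table is round-tripped through CSV text to show where "Unnamed: 0" comes from and what the header looks like after renaming.

```python
import io
import pandas as pd

# Hypothetical two-row stand-in for the exported customers table.
sample = pd.DataFrame({"CUSTOMER_ID": [10, 11]})

# Writing with the default index=True emits an unnamed first column ...
csv_text = sample.to_csv()

# ... which pandas labels "Unnamed: 0" when the text is read back.
reloaded = pd.read_csv(io.StringIO(csv_text))
print(list(reloaded.columns))

# Rename it to the ID header, as done for the real tables.
renamed = reloaded.rename(columns={"Unnamed: 0": ":ID(customer_ref)"})
header_line = renamed.to_csv(index=False).splitlines()[0]
print(header_line)
```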

# Creating Relationships
To form the relationship tables, we use the three tables we have.

### [:MADE]
This relationship is between a customer node and a transaction node. Therefore, we copy the "CUSTOMER_ID" and ":ID(transaction_ref)" columns from the Transactions table and store them in a new dataframe. Then, we name these two columns ":START_ID(customer_ref)" and ":END_ID(transaction_ref)", respectively. We also add a third column named ":TYPE" and set its values to "MADE", which will be the name of this relationship.

In [None]:
made = transactions[["CUSTOMER_ID", ":ID(transaction_ref)"]].copy()
made.columns = [":START_ID(customer_ref)", ":END_ID(transaction_ref)"]
made[":TYPE"] = "MADE"

### [:OCCURRED_AT]
This relationship, in contrast, is between a transaction and a terminal. We follow the same procedure as for MADE. Again from the Transactions table, we copy the ":ID(transaction_ref)" and "TERMINAL_ID" columns and save them in a dataframe. We rename the columns to ":START_ID(transaction_ref)" and ":END_ID(terminal_ref)", add a new column named ":TYPE", and set its values to "OCCURRED_AT".

In [None]:
occured_at = transactions[[":ID(transaction_ref)", "TERMINAL_ID"]].copy()
occured_at.columns = [":START_ID(transaction_ref)", ":END_ID(terminal_ref)"]
occured_at[":TYPE"] = "OCCURRED_AT"

Then, we export both as CSV files.

In [None]:
made.to_csv("... Path to Neo4j directory ...\\import\\made.csv", index=False)
occured_at.to_csv("... Path to Neo4j directory ...\\import\\occurred_at.csv", index=False)
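
Since both relationship tables follow the same recipe, the steps could also be wrapped in a small helper. The function name `build_relationship_table` and the sample rows below are hypothetical and not part of the original notebook. This is only a sketch of the pattern, shown on invented data.

```python
import pandas as pd

def build_relationship_table(df, start_col, end_col, start_group, end_group, rel_type):
    """Copy two columns of df and label them with import-style headers."""
    rel = df[[start_col, end_col]].copy()
    rel.columns = [f":START_ID({start_group})", f":END_ID({end_group})"]
    rel[":TYPE"] = rel_type
    return rel

# Invented sample standing in for the real transactions table.
transactions_sample = pd.DataFrame({
    ":ID(transaction_ref)": [0, 1],
    "CUSTOMER_ID": [10, 11],
    "TERMINAL_ID": [5, 7],
})

made_sample = build_relationship_table(
    transactions_sample, "CUSTOMER_ID", ":ID(transaction_ref)",
    "customer_ref", "transaction_ref", "MADE",
)
print(made_sample)
```

The same call with "TERMINAL_ID" as the end column and the matching ID groups would produce the OCCURRED_AT table.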

# Importing
All the adjustments that we made to the tables were necessary in order to use Neo4j Admin Import. Admin looks at the tables and searches for columns whose names have the form ":ID(...)", ":START_ID(...)" and ":END_ID(...)". Using these columns, it connects the tables together in order to form the relationships. In other words, the columns added to the node tables, and the column names and :TYPE columns of the relationship tables, are part of its syntax.

The ":ID(...)" column acts like a key for its table, while ":START_ID(...)" and ":END_ID(...)" act like pointers to those keys, depending on the name written in the parentheses. This way, Admin can connect a customer in the Customers table to a transaction in the Transactions table under the name MADE. The same goes for the OCCURRED_AT relationship.

Note that these adjustments are only for Admin. The nodes created from these tables will not have a property named ":ID(...)". Admin only uses those columns to form the relationships.
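
Because ":START_ID(...)" and ":END_ID(...)" only make sense if they point at existing ":ID(...)" values, it can help to check the references before running the import. The sketch below is one possible check, not something the original notebook does. The helper name `find_dangling_ids` and the miniature tables are assumptions made for illustration.

```python
import pandas as pd

def find_dangling_ids(nodes, id_col, rels, ref_col):
    """Return the values in rels[ref_col] that have no match in nodes[id_col]."""
    known = set(nodes[id_col])
    return sorted(set(rels[ref_col]) - known)

# Invented miniature tables; the real ones come from the CSV files.
customers_sample = pd.DataFrame({":ID(customer_ref)": [0, 1, 2]})
made_sample = pd.DataFrame({
    ":START_ID(customer_ref)": [0, 2, 9],   # 9 does not exist as a customer
    ":END_ID(transaction_ref)": [100, 101, 102],
    ":TYPE": "MADE",
})

dangling = find_dangling_ids(
    customers_sample, ":ID(customer_ref)",
    made_sample, ":START_ID(customer_ref)",
)
print(dangling)
```

An empty list would mean that every start reference in the sample relationship table points at a known customer key.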

In order to import the nodes and relationships into the Neo4j Community server database, we need to execute a command in CMD. The command is as follows:
```
... Path to Neo4j directory ...\bin\neo4j-admin.bat database import full ^
    --overwrite-destination ^
    --nodes=Customer=import\customers.csv ^
    --nodes=Terminal=import\terminals.csv ^
    --nodes=Transaction=import\transactions.csv ^
    --relationships=MADE=import\made.csv ^
    --relationships=OCCURRED_AT=import\occurred_at.csv
```
For this project, we give Admin six arguments. The first is the overwrite permission: if the database already exists, it is overwritten with the new data. The other arguments are the CSV tables. Each one states the kind of data (node or relationship) and its label or type, followed by the location of the CSV file in the \import folder.  
However, the goal of this project is to do everything from the script instead of opening CMD manually and typing the command. Therefore, the following script is used.

In [None]:
path = "... Path to Neo4j directory ...\\bin\\neo4j-admin.bat"

command = [
    path, "database", "import", "full", 
    "--overwrite-destination",
    "--nodes=Customer=import\\customers.csv",
    "--nodes=Terminal=import\\terminals.csv",
    "--nodes=Transaction=import\\transactions.csv",
    "--relationships=MADE=import\\made.csv",
    "--relationships=OCCURRED_AT=import\\occurred_at.csv"
]

result = subprocess.run(command, capture_output=True, text=True)

print("STDOUT: ", result.stdout)
print("STDERR: ", result.stderr)

The command is a list of strings, one for each space-separated parameter of the command used in CMD. The run() function of the subprocess module then launches the command for us, so no CMD window has to be opened manually.  
capture_output is set to True so that we receive the output of the command. The text parameter makes sure that the output is returned as a string rather than bytes.  
The results are STDOUT (the regular output of the command) and STDERR (error messages, if any occur).  
This script lets us import the data into the database fully automatically, simply by running this cell.  
If the CSV files are in the correct shape, the nodes and relationships will be created in the database without any errors.  
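
Note that subprocess.run() does not raise an exception when the command fails unless check=True is passed, so printing STDOUT and STDERR alone does not tell the script whether the import succeeded. A minimal sketch of one way to handle this is to inspect the returncode attribute of the result. The helper name `run_and_report` is hypothetical, and the commands below are harmless stand-ins that call the Python interpreter, so the sketch runs without Neo4j installed. The optional cwd argument is shown on the assumption that the relative import\... paths should be resolved from the Neo4j directory.

```python
import subprocess
import sys

def run_and_report(command, cwd=None):
    """Run command, print its output streams, and return True on exit code 0."""
    result = subprocess.run(command, capture_output=True, text=True, cwd=cwd)
    print("STDOUT: ", result.stdout)
    print("STDERR: ", result.stderr)
    return result.returncode == 0

# Stand-in commands: one that succeeds and one that exits with code 2.
ok = run_and_report([sys.executable, "-c", "print('imported')"])
failed = run_and_report([sys.executable, "-c", "import sys; sys.exit(2)"])
print(ok, failed)
```

With the real import, the same helper could be called with the command list defined above, and the returned flag used to decide whether to continue.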

>IMPORTANT. Importing with Admin also acts like a reset, so it is suited to the initial load only. Every time we import with Admin, it erases all existing data in the database and replaces it with the new data.