# Tidying up the Neo4j Database

While we were building the Neo4j database we took some shortcuts to make it easier to loop over the data in batches and store it. A good example: whenever a registration or an address was missing its "Country", we connected that node to a Country node with code = "UNKNOWN". These placeholder nodes and relationships carry no information and can potentially slow queries down, so we are going to tidy some of them up.

In [2]:
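# We need the connection driver imported.
# Note: neo4j.v1 is the import path of the 1.x Python driver;
# newer driver releases use `from neo4j import GraphDatabase` instead.
from neo4j.v1 import GraphDatabase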

In [3]:
user = "myusername"
password = "mypassword"
connection_path = "bolt://10.0.0.1:7687"
driver = GraphDatabase.driver(connection_path, auth=(user, password))

Let's find our example Country node whose code is "UNKNOWN".

In [5]:
with driver.session() as session:
    result = session.run("MATCH (c:Country {code: 'UNKNOWN'}) RETURN c;")
    print(result.data())                     

[{'c': <Node id=100249 labels={'Country'} properties={'code': 'UNKNOWN'}>}]
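
Before deleting anything, it can be useful to know how many relationships hang off the placeholder node, since all of them will disappear with it. The following is a minimal sketch, not part of the original notebook: the helper name `count_attached_relationships` is hypothetical, and it assumes a Neo4j version that accepts the `$code` parameter syntax.

```python
def count_attached_relationships(session, code):
    """Return how many relationships touch Country nodes with the given code.

    `session` is assumed to be a Neo4j driver session, i.e. an object whose
    `run` method accepts a Cypher string plus keyword parameters.
    """
    query = (
        "MATCH (c:Country {code: $code}) "
        "OPTIONAL MATCH (c)-[r]-() "
        "RETURN count(r) AS rel_count"
    )
    # Pass the code as a query parameter instead of formatting it into the string.
    record = session.run(query, code=code).single()
    return record["rel_count"]
```

With the `driver` created above, this could be called as `count_attached_relationships(session, "UNKNOWN")` inside a `with driver.session() as session:` block.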


Now we can change the query slightly to DETACH DELETE the node. This removes the node and ALL relationships connected to it!

In [6]:
with driver.session() as session:
    result = session.run("MATCH (c:Country {code: 'UNKNOWN'}) DETACH DELETE c;")
    print(result.data())                     

[]


In some cases the Country code may also be blank, which is just as uninformative as "UNKNOWN", so we will remove those nodes too.

In [None]:
with driver.session() as session:
    result = session.run("MATCH (c:Country {code: ''}) DETACH DELETE c;")
    print(result.data())         
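
Because a DETACH DELETE query returns no rows, `result.data()` prints an empty list and says nothing about what was removed. As a rough sketch (the helper name `delete_placeholder_countries` is hypothetical, and it assumes the driver's result object exposes `consume()` with a `counters` summary), both placeholder codes could be handled by one parameterized query that also reports the deletion counts:

```python
def delete_placeholder_countries(session, codes=("UNKNOWN", "")):
    """Detach-delete Country nodes for each placeholder code.

    Returns a dict mapping each code to a tuple of
    (nodes_deleted, relationships_deleted) taken from the query summary.
    """
    query = "MATCH (c:Country {code: $code}) DETACH DELETE c"
    deleted = {}
    for code in codes:
        # consume() discards any rows and hands back the result summary.
        summary = session.run(query, code=code).consume()
        counters = summary.counters
        deleted[code] = (counters.nodes_deleted, counters.relationships_deleted)
    return deleted
```

Called inside a `with driver.session() as session:` block, the returned dict shows at a glance whether each placeholder actually existed and how many relationships went with it.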

In [8]:
print("Done")

Done
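
As a final sanity check, one could confirm that no placeholder Country nodes are left. This is only a sketch under the same assumptions as the rest of the notebook (a `Country` label with a `code` property); the helper name `remaining_placeholder_countries` is hypothetical.

```python
def remaining_placeholder_countries(session, codes=("UNKNOWN", "")):
    """Count Country nodes whose code is still one of the placeholder values."""
    query = (
        "MATCH (c:Country) "
        "WHERE c.code IN $codes "
        "RETURN count(c) AS remaining"
    )
    # Cypher expects a list for IN, so convert the tuple before sending it.
    record = session.run(query, codes=list(codes)).single()
    return record["remaining"]
```

A return value of 0 means the tidy-up worked. Once finished, calling `driver.close()` releases the connection to the database.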
