# About

This notebook, and the series after it, focuses on interacting with a Neo4j database from Python. The downside is that it's hard to visualize some of the queries being run. It's highly encouraged to switch between the notebook and your local Neo4j Desktop database.

To download Neo4j Desktop, follow the instructions [here.](https://neo4j.com/download/)

# Cypher Keywords

## `CREATE`

`CREATE` -> Used to create nodes and relationships in a graph.
* **Creating Nodes**
    * **WARNING:** Although this section is focused on using `CREATE`, it does have the ability to generate duplicate nodes. For example, `CREATE (n:Person {id: 1}) RETURN n;` will generate a new 'Person' node with property: '{id: 1}' as many times as the query is executed. However, swapping `CREATE` for `MERGE` will prevent duplicates from appearing in the graph.
        * It is better practice to use `MERGE` over `CREATE`
    * Create a node with a single label by using: `MERGE (n:Person) RETURN n`. Specify multiple labels like `MERGE (n:Person:Animal:Professional) RETURN n`
    * Create a node with properties inside curly brackets like so: `MERGE (n:Person {name: "Henry", age: 17}) RETURN n`

* **Creating Relationships**
    * This is best illustrated through an example. Let's say Sally goes to the store to buy some pickles. To represent this graphically, first create two nodes for the entities Sally (person) and pickles (food). The action or relationship between the two is the act of purchasing. So, in Cypher we would write either the `CREATE` form or the `MERGE` form:
        > `CREATE (p:Person {name: "Sally"}), (f:Food {item: "Pickles"}), (p)-[:PURCHASES]->(f) RETURN p, f;`

        > `MERGE (p:Person {name: "Sally"})`<br>
          `MERGE (f:Food {item: "Pickles"})`<br>
          `MERGE (p)-[:PURCHASES]->(f)`<br>
          `RETURN p, f;`<br>

`RETURN` -> Instructs the graph to send back data based on what follows the `RETURN` keyword.
* `RETURN *` will return every variable that is in scope in the query.
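
To make the difference between `CREATE` and `MERGE` concrete, here is a toy, in-memory sketch in Python. It is an illustration only and not how Neo4j works internally; the `graph` list and the `create`/`merge` helpers are made up for this example. One simplification to be aware of: when several nodes match, a real `MERGE` binds all of them, while this sketch returns only the first.

```python
# Toy in-memory model of CREATE vs. MERGE semantics (illustration only,
# not how Neo4j is implemented).
graph: list[dict] = []


def create(labels: set[str], properties: dict) -> dict:
    """CREATE always adds a new node, even if an identical one exists."""
    node = {"labels": set(labels), "properties": dict(properties)}
    graph.append(node)
    return node


def merge(labels: set[str], properties: dict) -> dict:
    """MERGE reuses a node matching the whole pattern and creates one only if none matches."""
    for node in graph:
        has_labels = set(labels) <= node["labels"]
        has_properties = all(node["properties"].get(k) == v for k, v in properties.items())
        if has_labels and has_properties:
            return node
    return create(labels, properties)


for _ in range(3):
    create({"Person"}, {"id": 1})
print(len(graph))  # 3 nodes: one per CREATE

graph.clear()
for _ in range(3):
    merge({"Person"}, {"id": 1})
print(len(graph))  # 1 node: the later MERGE calls matched the first
```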

In [10]:
from neo4j import GraphDatabase, Record, ResultSummary, EagerResult
from neo4j.graph import Node, Relationship, Path
from neo4j.time import Date

import pandas as pd
pd.set_option('display.max_colwidth', 100)

import os 
import socket
from dotenv import load_dotenv 
load_dotenv()

NEO4J_URI = os.getenv("NEO4J_URI")
NEO4J_USERNAME = os.getenv("NEO4J_USERNAME")
NEO4J_PASSWORD = os.getenv("NEO4J_PASSWORD")

driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

**NOTE:**

When working from WSL2 with Neo4j Desktop installed on the Windows side, you have to set up port forwarding. To do this, open a PowerShell administrator window and do the following:
1. Run `ipconfig` and note your Windows IP address.
    * From this point forward, assume you have a Windows IP address of '123.456.78.900'.
2. Launch the Neo4j database you want to query and take note of the port number (at the time of writing, the default Bolt port is '7687').
3. Run `netsh interface portproxy set v4tov4 listenport=7687 listenaddress=123.456.78.900 connectport=7687 connectaddress=127.0.0.1`
4. To verify, run `netsh interface portproxy show v4tov4`
5. To disable the port forwarding, run `netsh interface portproxy delete v4tov4 listenport=7687 listenaddress=123.456.78.900`

If working from a Windows environment where Neo4j Desktop is installed, the default 'localhost' URI should be sufficient.
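
Before pointing the driver at the database, it can help to confirm that the Bolt port is reachable at all. The sketch below uses only Python's standard `socket` module; the `can_connect` helper is a hypothetical name for this example, and the host and port are placeholders you would swap for your own values. It only checks that a TCP connection opens, not that Neo4j accepts your credentials.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Example: replace the host with your own Windows IP address from `ipconfig`.
print(can_connect("127.0.0.1", 7687))
```

If this prints `False` from WSL2, the port forwarding rule or the Windows firewall is a likely place to look before debugging the driver itself.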

In [11]:
def process_node(node: Node) -> dict:
    """ 
    Parses the result from the query into a more consumable dictionary

    Args:
        A neo4j.graph.Node object 
    
    Returns:
        A dictionary with the following definitions for each key:
            * "elementId" -> A unique identifier of each object returned from the graph for any given transaction.
              IMPORTANT!! Note that each 'elementId' may not be the same across transactions. Meaning each time the
              graph is queried, it's possible that a new 'elementId' is generated for the same object.
            * "labels" -> The label(s) of the node as it appears in the graph.
            * "properties" -> The properties of the node as represented in the graph.
    """
    if isinstance(node, Node):
        return {
            "elementId": node.element_id,
            "labels": node.labels,
            "properties": dict(node.items())
        }
    else:
        raise ValueError(f"Input value `node` is not of type neo4j.graph.Node. Type {type(node)} passed.")


def process_relationship(relationship: Relationship) -> dict:
    """ 
    Parses the result from the query into a more consumable dictionary.

    Args:
        A neo4j.graph.Relationship object 
    
    Returns:
        A dictionary with the following definitions for each key:
            * "startNode" -> In a "source/target" diagram, the start node is the source of the relationship. The
              relationship comes from the 'startNode'.
                - "elementId" -> A unique identifier of each object returned from the graph for any given transaction.
                  IMPORTANT!! Note that each 'elementId' may not be the same across transactions. Meaning each time the
                  graph is queried, it's possible that a new 'elementId' is generated for the same object.
                - "labels" -> The label(s) of the node as it appears in the graph.
                - "properties" -> The properties of the node as represented in the graph.
            * "elementId" -> The UID for the relationship. Please see "startNode" -> "elementId".
            * "type" -> The "label" of the relationship, or its name.
            * "properties" -> The properties of the relationship as represented in the graph.
            * "endNode" -> In a "source/target" diagram, the end node is the target of the relationship. The
              relationship points towards the 'endNode'.
                - "elementId" -> The UID for the "endNode". Please see "startNode" -> "elementId".
                - "labels" -> The label(s) of the node as it appears in the graph.
                - "properties" -> The properties of the node as represented in the graph.
    """
    if isinstance(relationship, Relationship):
        return {
            "startNode": {
                "elementId": relationship.nodes[0].element_id,
                "labels": relationship.nodes[0].labels,
                "properties": dict(relationship.nodes[0].items())
                },
            "elementId": relationship.element_id,
            'type': relationship.type,
            'properties': dict(relationship.items()),
            "endNode": {
                "elementId": relationship.nodes[1].element_id,
                "labels": relationship.nodes[1].labels,
                "properties": dict(relationship.nodes[1].items())
            }
        }
    else:
        raise ValueError(f"Input value `relationship` is not of type neo4j.graph.Relationship. Type {type(relationship)} passed.")
    

def process_path(path: Path) -> dict:
    """ 
    Parses the result from the query into a more consumable dictionary.

    Args:
        A neo4j.graph.Path object 
    
    Returns:
        A dictionary with the following definitions for each key:
            * "startNodeElementId" -> A unique identifier of each object returned from the graph for any given transaction.
              IMPORTANT!! Note that each 'elementId' may not be the same across transactions. Meaning each time the
              graph is queried, it's possible that a new 'elementId' is generated for the same object.
            * "nodes" -> A list of processed nodes, please see doc string for `process_node` function. The length of the list
              will depend on the number of nodes specified in the path expression.
            * "relationships" -> A list of processed relationships, please see doc string for `process_relationship` function.
              The length of the list will depend on the number of relationships specified in the path expression.
            * "endNodeElementId" -> A UID for the "end node". Please see "startNodeElementId".
    """
    if isinstance(path, Path):
        return {
            "startNodeElementId": path.start_node.element_id,
            "nodes": [process_node(node) for node in path.nodes],
            "relationships": [process_relationship(relationship) for relationship in path.relationships],
            "endNodeElementId": path.end_node.element_id
        }
    else:
        raise ValueError(f"Input value `path` is not of type neo4j.graph.Path. Type {type(path)} passed.")

In [12]:
def annotate_results(result: EagerResult, verbose: bool = False, return_table: bool = True) -> dict | pd.DataFrame:
    """ 
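    A helper function that prints timing information about the query (plus optional query metadata)
    and returns the records in a shape similar to what you would see in the Neo4j Browser 'Table' view.

    Args:
        result -> The EagerResult object returned by Neo4j
        verbose -> Optional parameter to print out informative query metadata or not
        return_table -> If True, return a pandas DataFrame; otherwise return the processed dictionary

    Returns:
        A pandas DataFrame with one column per returned variable, or, when `return_table` is False,
        a dictionary mapping each variable name to a list of processed values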
    """
    start = result.summary.result_available_after
    finish = start + result.summary.result_consumed_after
    print(f"Started streaming {len(result.records)} records after {start} ms and completed after {finish} ms.\n")
    
    if verbose:
        print(f"Query executed against database: '{result.summary.database}': {result.summary.query}")

    # Parse the initial EagerResult
    data: list[list[list[tuple[str, Path | Relationship | Node]]]] = [record.items() for record in result.records]
    # Determine the column headers required based on the number of variables returned in the query.
    column_headers = list(set([variable[0] for record in data for variable in record]))
    # The final processed dictionary to be returned at the end
    processed_records: dict[str, list[dict]] = {key: [] for key in column_headers}

    # 'data' holds EagerResult parsed data represented as a list for each record returned by the query
    for i in range(len(data)):
        # 'var_returned' holds a single record of lists where each list is a data point for each variable returned
        for var_returned in data[i]:
            # 'data_object' iterates over a (variable name, value) pair: the variable name (column header) is at index 0, followed by the value returned by the query
            for index, data_object in enumerate(var_returned):
                # The first index will hold the variable name that will help us determine how to store this data tabularly
                if index == 0:
                    column_header = str(data_object)
                    continue
                
                # After index 0, each 'data_object' will contain actual data returned from the graph
                if isinstance(data_object, list):
                    # Paths like to return neo4j.graph objects in lists, while non paths do not. So, we need to handle this
                    data_object = data_object[0]

                if isinstance(data_object, Node):
                    processed_records[column_header].append(process_node(data_object))
                elif isinstance(data_object, Relationship):
                    processed_records[column_header].append(process_relationship(data_object))
                elif isinstance(data_object, Path):
                    processed_records[column_header].append(process_path(data_object))
                else:
                    processed_records[column_header].append(data_object)
                    # raise ValueError(f"Unexpected datatype encountered. Of type: {type(data_object)}")
                
    if return_table:
        return pd.DataFrame(processed_records)
    else:
        return processed_records

In [13]:
# We are going to create a dev database so that we can do whatever we want in it
driver.execute_query("CREATE DATABASE dev IF NOT EXISTS;")

EagerResult(records=[], summary=<neo4j._work.summary.ResultSummary object at 0x7f6dcd0e46b0>, keys=[])

What the heck is an ['EagerResult'](https://neo4j.com/docs/api/python-driver/current/api.html#neo4j.EagerResult)?
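
In short, the driver documentation linked above describes `EagerResult` as a named tuple with three fields: `records`, `summary`, and `keys`. The sketch below uses a stand-in named tuple with the same field names so it runs without a database; `FakeEagerResult` and its contents are made up for illustration, and real records are `neo4j.Record` objects rather than plain dictionaries.

```python
from collections import namedtuple

# Stand-in with the same field names as neo4j.EagerResult, so the snippet runs
# without a database. A real result comes from driver.execute_query(...).
FakeEagerResult = namedtuple("FakeEagerResult", ["records", "summary", "keys"])

result = FakeEagerResult(records=[{"n": 1}, {"n": 2}], summary=None, keys=["n"])

# Access the fields by name...
print(result.keys)          # ['n']
print(len(result.records))  # 2

# ...or unpack them positionally, in the order records, summary, keys.
records, summary, keys = result
print(keys == result.keys)  # True
```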

In [14]:
# Let's create 5 nodes 
for _ in range(5):
    result: EagerResult[list[Record], ResultSummary, list[str]] = driver.execute_query(
        """ 
        CREATE (n) RETURN n;
        """,
        database_="dev"
    )

**NOTE:**

To keep things clean, I'll only annotate the 'result' variable once. Clearly the response returned from Neo4j is a pretty complex object that holds a ton of useful information. I would encourage you to explore their documentation and modify the `annotate_results` function to provide you with information that is the most helpful to you!

In [15]:
# Let's see how many nodes are in our graph
result: EagerResult[list[Record], ResultSummary, list[str]] = driver.execute_query(
    """ 
    MATCH (n) RETURN n;
    """,
    database_="dev"
)

df = annotate_results(result, True)
df

Started streaming 51 records after 0 ms and completed after 0 ms.

Query executed against database: 'dev':  
    MATCH (n) RETURN n;
    


Unnamed: 0,n
0,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:0', 'labels': (), 'properties': {}}"
1,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:1', 'labels': (), 'properties': {}}"
2,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2', 'labels': (), 'properties': {}}"
3,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:3', 'labels': (), 'properties': {}}"
4,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:4', 'labels': (), 'properties': {}}"
5,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:5', 'labels': (), 'properties': {}}"
6,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': (), 'properties': {}}"
7,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': (), 'properties': {}}"
8,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:8', 'labels': (), 'properties': {}}"
9,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:9', 'labels': (), 'properties': {}}"


**NOTE:**

Notice how each `CREATE` call added another identical, empty node rather than reusing an existing one.

In [19]:
# Let's create a person in our graph named Sally
result = driver.execute_query(
    """ 
    CREATE (p:Person {name: "Sally"})
    RETURN p
    """,
    database_="dev"
)

df = annotate_results(result, True)

# Let's take a look at how our record is represented in python
df.at[0, 'p']

Started streaming 1 records after 1 ms and completed after 1 ms.

Query executed against database: 'dev':  
    CREATE (p:Person {name: "Sally"})
    RETURN p
    


{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:54',
 'labels': frozenset({'Person'}),
 'properties': {'name': 'Sally'}}
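
One practical detail in the output above: `labels` is a `frozenset`, which `json.dumps` cannot serialize directly. If you want to save processed records as JSON, one option is a small recursive converter like the sketch below. The `to_json_safe` name is hypothetical, and the sketch does not handle Neo4j temporal or spatial types, which would need their own conversion.

```python
import json


def to_json_safe(value):
    """Recursively convert sets, frozensets, and tuples to lists so json.dumps accepts them."""
    if isinstance(value, (set, frozenset)):
        # Sort so the output is stable; labels are strings, so they are sortable.
        return sorted(to_json_safe(item) for item in value)
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    if isinstance(value, dict):
        return {key: to_json_safe(item) for key, item in value.items()}
    return value


# Same shape as the dictionary returned by `process_node` above.
processed = {
    "elementId": "4:ff73e06d-56ad-4959-b409-fcc3d9dce978:54",
    "labels": frozenset({"Person"}),
    "properties": {"name": "Sally"},
}
print(json.dumps(to_json_safe(processed)))
```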

In [20]:
# Let's delete all records in Neo4j
result = driver.execute_query(
    """ 
    MATCH (n) DETACH DELETE n;
    """,
    database_="dev"
)

df = annotate_results(result, True)

Started streaming 0 records after 24 ms and completed after 24 ms.

Query executed against database: 'dev':  
    MATCH (n) DETACH DELETE n;
    


In [23]:
# Let's create a relationship with one direction to represent the scenario "Sally goes to the store and purchases pickles"
result = driver.execute_query(
    """ 
    CREATE (p:Person {name: "Sally"}), (f:Food {item: "Pickles"}), (p)-[r:PURCHASES]->(f) RETURN *;
    """,
    database_="dev"
)

df = annotate_results(result, True, False)
df

Started streaming 1 records after 0 ms and completed after 0 ms.

Query executed against database: 'dev':  
    CREATE (p:Person {name: "Sally"}), (f:Food {item: "Pickles"}), (p)-[r:PURCHASES]->(f) RETURN *;
    


{'f': [{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:3',
   'labels': frozenset({'Food'}),
   'properties': {'item': 'Pickles'}}],
 'r': [{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2',
    'labels': frozenset({'Person'}),
    'properties': {'name': 'Sally'}},
   'elementId': '5:ff73e06d-56ad-4959-b409-fcc3d9dce978:1152921504606846978',
   'type': 'PURCHASES',
   'properties': {},
   'endNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:3',
    'labels': frozenset({'Food'}),
    'properties': {'item': 'Pickles'}}}],
 'p': [{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2',
   'labels': frozenset({'Person'}),
   'properties': {'name': 'Sally'}}]}

In [24]:
# Add a date property to our relationship for event or transactional data
purchase_date = Date(year=2025, month=1, day=31)

result = driver.execute_query(
    """ 
    CREATE (p:Person {name: "Sally"}), (f:Food {item: "Pickles"}), (p)-[r:PURCHASES {purchased_on: $purchase_date}]->(f) RETURN *;
    """,
    database_="dev",
    purchase_date=purchase_date
)

df = annotate_results(result, True)

Started streaming 1 records after 1 ms and completed after 2 ms.

Query executed against database: 'dev':  
    CREATE (p:Person {name: "Sally"}), (f:Food {item: "Pickles"}), (p)-[r:PURCHASES {purchased_on: $purchase_date}]->(f) RETURN *;
    


**NOTE:**

Notice the additional argument in the `driver.execute_query` function. Neo4j lets you parameterize queries for a number of reasons. In this case, we are defining a variable using a Neo4j datatype so we know it will be inserted into the graph the way we expect. Parameterize queries by prefixing the variable with "$". Check out the datatypes for the Neo4j Python driver [here](https://neo4j.com/docs/python-manual/current/data-types/)!
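
Query parameters can be passed as extra keyword arguments, as in the cell above, or gathered into a dictionary with the `parameters_` argument of `execute_query`. The sketch below shows the dictionary form. The `find_people_by_name` helper is hypothetical, and `driver` is assumed to be the `neo4j` driver object created earlier in this notebook.

```python
def find_people_by_name(driver, name: str, database: str = "dev"):
    """Run a parameterized MATCH; `driver` is assumed to be a neo4j driver."""
    query = "MATCH (p:Person {name: $name}) RETURN p"
    # parameters_ takes a dict of query parameters. The trailing underscore on the
    # driver's own arguments keeps them from clashing with parameter names.
    return driver.execute_query(query, parameters_={"name": name}, database_=database)
```

With a live connection this would be called as `result = find_people_by_name(driver, "Sally")`. Because the name travels as a parameter instead of being formatted into the query string, the value is never parsed as Cypher.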

In [26]:
# Let's create a bi-directional relationship. Let's say Sally has a friend, Sarah
driver.execute_query(
    """ 
    CREATE (p:Person {name: "Sarah"});
    """,
    database_="dev"
)

result = driver.execute_query(
    """ 
    MATCH (p:Person {name: "Sally"}), (f:Person {name: "Sarah"})
    WITH p, f
    CREATE (p)-[r:HAS_FRIEND]->(f)-[rr:HAS_FRIEND]->(p)
    RETURN *;
    """,
    database_="dev"
)

df = annotate_results(result, True, True)
df

Started streaming 6 records after 0 ms and completed after 2 ms.

Query executed against database: 'dev':  
    MATCH (p:Person {name: "Sally"}), (f:Person {name: "Sarah"})
    WITH p, f
    CREATE (p)-[r:HAS_FRIEND]->(f)-[rr:HAS_FRIEND]->(p)
    RETURN *;
    


Unnamed: 0,f,r,rr,p
0,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:0', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:0', 'labels': ('Person'), 'properties': {'..."
1,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:0', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:0', 'labels': ('Person'), 'properties': {'..."
2,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2', 'labels': ('Person'), 'properties': {'..."
3,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:2', 'labels': ('Person'), 'properties': {'..."
4,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:4', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:6', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:4', 'labels': ('Person'), 'properties': {'..."
5,"{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'properties': {'...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:4', 'labels': ('Person'), 'p...","{'startNode': {'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:7', 'labels': ('Person'), 'p...","{'elementId': '4:ff73e06d-56ad-4959-b409-fcc3d9dce978:4', 'labels': ('Person'), 'properties': {'..."
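
**NOTE:**

Why six records instead of one? The element IDs in the table show three distinct `p` nodes (IDs ending in `:0`, `:2`, `:4`) and two distinct `f` nodes (IDs ending in `:6`, `:7`), presumably because earlier `CREATE` cells were run more than once. `MATCH` with two comma-separated patterns pairs every match of the first with every match of the second, so the `CREATE` ran once per pair. The arithmetic can be sketched in plain Python:

```python
from itertools import product

# Element-ID suffixes read off the table above.
sally_ids = [0, 2, 4]  # distinct 'p' nodes
sarah_ids = [6, 7]     # distinct 'f' nodes

pairs = list(product(sally_ids, sarah_ids))
print(len(pairs))  # 6
print(pairs)       # [(0, 6), (0, 7), (2, 6), (2, 7), (4, 6), (4, 7)]
```

This is the duplicate-node risk of `CREATE` from the warning at the top of the notebook showing up in practice.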


**NOTE:**

You can also use paths to create a complex pattern.

In [27]:
# Use a path to create a pattern where Mike follows Jackie who is followed by Tim
result = driver.execute_query(
    """ 
    CREATE p=(mike:Person {name:"Mike"})-[:FOLLOWS]->(jackie:Person {name:"Jackie"})<-[:FOLLOWS]-(tim:Person {name:"Tim"})
    RETURN p;
    """,
    database_="dev"
)

df = annotate_results(result, True)

Started streaming 1 records after 1 ms and completed after 3 ms.

Query executed against database: 'dev':  
    CREATE p=(mike:Person {name:"Mike"})-[:FOLLOWS]->(jackie:Person {name:"Jackie"})<-[:FOLLOWS]-(tim:Person {name:"Tim"})
    RETURN p;
    


**NOTE:**

Cypher keywords *ARE NOT* case sensitive. Let's try it below.

In [28]:
# Create a long series of node and relationships for a family tree
result = driver.execute_query(
    """ 
    create 
    (D:Person{name:'Dan'}),
    (K:Person{name:'Kate'}),
    (M:Person{name:'Mike'}),
    (L:Person{name:'Luke'}),
    (S:Person{name:'Steve'}),
    (F:Person{name:'Favour'}),
    (faith:Person{name:'Faith'}),
    (J:Person{name:'Jane'}),
    (D)-[:MARRIED_TO]->(K)-[:MARRIED]->(D),
    (D)-[:PARENT_OF]->(M)<-[:PARENT_OF]-(K),
    (D)-[:PARENT_OF]->(L)<-[:PARENT_OF]-(K),
    (D)-[:PARENT_OF]->(S)<-[:PARENT_OF]-(K),
    (F)-[:MARRIED_TO]->(S)-[:MARRIED]->(F),
    (F)-[:PARENT_OF]->(faith)<-[:PARENT_OF]-(S),
    (F)-[:PARENT_OF]->(J)<-[:PARENT_OF]-(S)
    return *
    """,
    database_="dev"
)

df = annotate_results(result, True)

Started streaming 1 records after 32 ms and completed after 33 ms.

Query executed against database: 'dev':  
    create 
    (D:Person{name:'Dan'}),
    (K:Person{name:'Kate'}),
    (M:Person{name:'Mike'}),
    (L:Person{name:'Luke'}),
    (S:Person{name:'Steve'}),
    (F:Person{name:'Favour'}),
    (faith:Person{name:'Faith'}),
    (J:Person{name:'Jane'}),
    (D)-[:MARRIED_TO]->(K)-[:MARRIED]->(D),
    (D)-[:PARENT_OF]->(M)<-[:PARENT_OF]-(K),
    (D)-[:PARENT_OF]->(L)<-[:PARENT_OF]-(K),
    (D)-[:PARENT_OF]->(S)<-[:PARENT_OF]-(K),
    (F)-[:MARRIED_TO]->(S)-[:MARRIED]->(F),
    (F)-[:PARENT_OF]->(faith)<-[:PARENT_OF]-(S),
    (F)-[:PARENT_OF]->(J)<-[:PARENT_OF]-(S)
    return *
    


## `MATCH`

* `MATCH` -> Used somewhat like SQL's `SELECT ... FROM`. It is a read-only clause that finds data in the graph by pattern.
    * To return everything from the graph we can run `MATCH (n) RETURN n;`
* To return related nodes, we call a node's label. We can write `MATCH (n:Person) RETURN n;` to return ALL 'Person' nodes.
    * To match on more than one label: `MATCH (n:Person:Doctor) RETURN n;` returns only the nodes that carry BOTH the 'Person' and 'Doctor' labels.
* We can match nodes via a pattern without specifying a direction in the relationship. For example, `MATCH (n:Person)--(d:Doctor) RETURN *` returns every person and doctor that are connected, regardless of the direction of the relationship.
* You can use backticks '\`' to introduce uncommon characters in your queries. For example ``MATCH (`THIS IS MY NODE VARIABLE`) RETURN `THIS IS MY NODE VARIABLE`;``
* You can use `MATCH` to select a node and use its properties as properties for a new node. The new node needs its own variable, because `n` is already bound by `MATCH`. For example:<br>

    `MATCH (n:Person {name:"Tom Hanks"})`<br>
    `CREATE (m:Person {name:n.name})`<br>
    `RETURN *`
* We can do the same for relationships, for example:<br>

    `MATCH (n:Person {name:"Sally"})-[r:PURCHASES]->(f:Food {item:"Pickles"})`<br>
    `CREATE (m:Person {name:"Sally", purchased_pickles_on: r.purchased_on})`<br>
    `RETURN *`
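
When a query is assembled in Python, variable names and labels cannot be passed as `$` parameters, so names with unusual characters have to be backtick-quoted in the query text itself. Below is a minimal sketch of a quoting helper. The `escape_identifier` name is hypothetical, and the rule that a backtick inside a quoted name is written as two backticks is my reading of the Cypher manual, so verify it against the version you run.

```python
def escape_identifier(name: str) -> str:
    """Wrap a Cypher identifier in backticks, doubling any backticks inside it."""
    return "`" + name.replace("`", "``") + "`"


variable = escape_identifier("THIS IS MY NODE VARIABLE")
query = f"MATCH ({variable}) RETURN {variable};"
print(query)  # MATCH (`THIS IS MY NODE VARIABLE`) RETURN `THIS IS MY NODE VARIABLE`;
```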


## `RETURN`

* At this point you are familiar with the basics of `RETURN`, but here we will look at it a little deeper.
* `RETURN` is comparable to SQL's `SELECT`, much like the `MATCH` keyword we learned about previously: `MATCH` finds the data and `RETURN` decides what is sent back.
* `RETURN` allows us to return everything, or we can return specific properties, aggregations, filtered data, etc., just like `SELECT` in SQL.
* If I have a query: `MATCH (n:Person) RETURN n;` this will return the nodes, and in Neo4j Browser we will see a picture of all the nodes. If we want the data in tabular format, we can return specific properties or use the `properties` function to return all properties of the nodes: `MATCH (n) RETURN properties(n) AS prop;`. This will not include the labels of the nodes. To add them you can do: `MATCH (n) RETURN properties(n) AS prop, labels(n) AS n_label;`
* For relationship data, assuming we have the query `MATCH (n)-[r]-(m)`, the options include but are not limited to: `type(r)` returns the name (type) of the relationship, `r` returns all data about 'r', and `r.<property_name>` returns a specific property.
* You can use the `DISTINCT` keyword after `RETURN` the same way you would use it in SQL, to return non-duplicated information.
* You can use `RETURN relationships(p)`, where `p` is a path variable, to return all relationships in that path. This is helpful with a complicated path variable.

In [30]:
# Let's query our graph and process the results to extract the labels, properties, and relationship types
result = driver.execute_query(
    """ 
    MATCH (p:Person)-[r:PURCHASES]->(f:Food)
    RETURN {
        source: {properties: properties(p), labels: labels(p)}, 
        relationship: {properties: properties(r), type: type(r)}, 
        target: {properties: properties(f), labels: labels(f)}
    } AS record
    """,
    database_="dev"
)

df = annotate_results(result, True, False)
df

Started streaming 3 records after 0 ms and completed after 1 ms.

Query executed against database: 'dev':  
    MATCH (p:Person)-[r:PURCHASES]->(f:Food)
    RETURN {
        source: {properties: properties(p), labels: labels(p)}, 
        relationship: {properties: properties(r), type: type(r)}, 
        target: {properties: properties(f), labels: labels(f)}
    } AS record
    


{'record': [{'relationship': {'properties': {}, 'type': 'PURCHASES'},
   'source': {'labels': ['Person'], 'properties': {'name': 'Sally'}},
   'target': {'labels': ['Food'], 'properties': {'item': 'Pickles'}}},
  {'relationship': {'properties': {}, 'type': 'PURCHASES'},
   'source': {'labels': ['Person'], 'properties': {'name': 'Sally'}},
   'target': {'labels': ['Food'], 'properties': {'item': 'Pickles'}}},
  {'relationship': {'properties': {'purchased_on': neo4j.time.Date(2025, 1, 31)},
    'type': 'PURCHASES'},
   'source': {'labels': ['Person'], 'properties': {'name': 'Sally'}},
   'target': {'labels': ['Food'], 'properties': {'item': 'Pickles'}}}]}

In [32]:
# Let's return all relationships in a very open ended path match pattern
result = driver.execute_query(
    """ 
    MATCH x = (p:Person {name:"Tom Hanks"})--()--()
    RETURN x
    """,
    database_="neo4j"
)

df = annotate_results(result, True, True)
df

Started streaming 59 records after 1 ms and completed after 6 ms.

Query executed against database: 'neo4j':  
    MATCH x = (p:Person {name:"Tom Hanks"})--()--()
    RETURN x
    


Unnamed: 0,x
0,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
1,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
2,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
3,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
4,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
5,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
6,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
7,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
8,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
9,"{'startNodeElementId': '4:552b0252-2f83-4c7e-a0bf-f921a4b1b7cf:71', 'nodes': [{'elementId': '4:5..."
