# Temporal Node Embedding: Time Tree Interval Approach

### Import the required libraries

First, we install and import the libraries needed to implement the Time Tree embedding.

- neo4j: The Neo4j Python driver, used to connect to the Neo4j database.
- graphdatascience: The Python client for the Neo4j Graph Data Science (GDS) library, used here for the in-memory graph projection and for the FastRP embedding algorithm.

In [None]:
%pip install neo4j
%pip install graphdatascience

In [None]:
from neo4j import GraphDatabase
import graphdatascience

### Configure Driver and Client

Next, we configure the driver and the client for the connection to the Neo4j database. The driver executes Cypher queries, and the client runs the Graph Data Science library algorithms.

- endpoint: Bolt connection URI of the Neo4j database
- username: Username
- password: Password
- database: Database into which you imported the trips

In [None]:
endpoint = "neo4j://localhost:7687"
username = "neo4j"
password = "#Bachelorarbeit"
database = "neo4j"

gds = graphdatascience.GraphDataScience(endpoint=endpoint, auth=(username, password))
gds.set_database(database)

db_driver = GraphDatabase.driver(endpoint, auth=(username,password)).session(database=database)
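
As an alternative to hard-coding the connection settings in the notebook, they could be read from environment variables. The following is a minimal sketch under that assumption: the function name `load_connection_settings` and the `NEO4J_*` variable names are hypothetical and not part of the original setup, and the defaults simply mirror the values used above.

```python
import os


def load_connection_settings(env=os.environ):
    """Return (endpoint, username, password, database) read from `env`."""
    # The NEO4J_* names are assumptions made for this sketch.
    endpoint = env.get("NEO4J_URI", "neo4j://localhost:7687")
    username = env.get("NEO4J_USERNAME", "neo4j")
    database = env.get("NEO4J_DATABASE", "neo4j")
    password = env.get("NEO4J_PASSWORD")
    if password is None:
        # Fail early instead of attempting a connection without credentials.
        raise RuntimeError("Set NEO4J_PASSWORD before running the notebook")
    return endpoint, username, password, database
```

The returned tuple could then be unpacked into `endpoint`, `username`, `password` and `database` before the client and the session are created.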

### Constants for the Time Tree

We define some constants for the Time Tree. The Time Tree is a tree structure that represents time as a hierarchy of years, months, days, hours and minutes. In this work we create the Time Tree for a single year only, so the year and the calendar can be defined explicitly. The Time Tree could be extended to multiple years, but that is not part of this work.

- Year: The year for which the Time Tree is created.
- Calendar: A dictionary that maps each month to its number of days.
- Hours: The number of hours in a day.
- Minutes: The minute marks within an hour. We use 15-minute intervals.

In [None]:
YEAR = 2017
HOURS = 24
MINUTES = [0, 15, 30, 45]

CALENDAR = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
            7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}
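
The calendar above is written out by hand for 2017, which is not a leap year. As a small sketch of how the same dictionary could be derived for any year, the standard library's `calendar.monthrange` returns the number of days of a month as its second element. The helper name `build_calendar` is hypothetical.

```python
import calendar


def build_calendar(year):
    """Map each month number (1-12) to its number of days in `year`."""
    # monthrange returns (weekday of the first day, number of days in the month).
    return {month: calendar.monthrange(year, month)[1] for month in range(1, 13)}


calendar_2017 = build_calendar(2017)
print(calendar_2017[2])  # prints 28
```

For 2017 the result equals the hand-written `CALENDAR`; for a leap year such as 2016, February would have 29 days.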

### Function for running Cypher Queries

We introduce a simple helper function for running Cypher queries. It takes a query string as its argument and returns the result records as a list of dictionaries.

In [None]:
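def run_query(query):
    # db_driver is the open session created above. It is used directly rather
    # than in a `with` block, because leaving the block would close the session
    # and later calls would then operate on a closed session.
    result = db_driver.run(query)
    return [record.data() for record in result]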

### Helper Functions and Function for creating the Time Tree
Here we define the helper functions that create the individual parts of the Time Tree, followed by the function that builds the complete Time Tree for one year.

In [None]:
def add_root_node(tx):
    return tx.run(
        "CREATE (root:Root)"
    )


def add_has_year_relationship(tx, year):
    return tx.run(
        "MATCH (r:Root) "
        "CREATE (y:Year {value: $year}) "
        "CREATE (r)-[:HAS_YEAR]->(y)",
        year=year
    )


def add_has_month(tx, year, month):
    return tx.run(
        "MATCH (y:Year {value: $year}) "
        "CREATE (m:Month {value: $month}) "
        "CREATE (y)-[:HAS_MONTH]->(m)",
        year=year, month=month
    )


def add_has_day(tx, month, day):
    return tx.run(
        "MATCH (m:Month {value: $month}) "
        "CREATE (d:Day {value: $day}) "
        "CREATE (m)-[:HAS_DAY]->(d)",
        month=month, day=day
    )

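def add_has_hour(tx, month, day, hour):
    # Match the parent day first so that an hour node is only created
    # when the day exists; each clause ends with a space before the next.
    return tx.run(
        "MATCH (m:Month {value: $month})-[:HAS_DAY]->(d:Day {value: $day}) "
        "CREATE (h:Hour {value: $hour}) "
        "CREATE (d)-[:HAS_HOUR]->(h)",
        month=month, day=day, hour=hour
    )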


def next_relationship_in_month(tx, day, month):
    # Link a day to the following day of the same month.
    return tx.run(
        "MATCH (m:Month {value: $month})-[:HAS_DAY]->(d:Day {value: $day}) "
        "WITH d, m, d.value + 1 AS nextDay "
        "MATCH (m)-[:HAS_DAY]->(n:Day {value: nextDay}) "
        "MERGE (d)-[:NEXT]->(n)",
        day=day, month=month
    )


def next_relationship_next_month(tx, day, month):
    # Link the given (last) day of a month to day 1 of the following month.
    return tx.run(
        """
        MATCH (m:Month {value: $month})-[:HAS_DAY]->(d:Day {value: $day})
        WITH d, m.value + 1 AS nextMonth
        MATCH (:Month {value: nextMonth})-[:HAS_DAY]->(k:Day {value: 1})
        MERGE (d)-[:NEXT]->(k)
        """,
        day=day, month=month
    )


def has_minute(tx, month, day, hour, minute):
    # Match the parent hour first so that a minute node is only created
    # when the hour exists; each clause ends with a space before the next.
    return tx.run(
        "MATCH (:Month {value: $month})-[:HAS_DAY]->(:Day {value: $day})"
        "-[:HAS_HOUR]->(h:Hour {value: $hour}) "
        "CREATE (m:Minute {value: $minute}) "
        "CREATE (h)-[:HAS_MINUTE]->(m)",
        hour=hour, minute=minute, month=month, day=day
    )

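def create_time_tree(year, session):
    # `session` is the open Neo4j session created above. It is used directly
    # rather than in a `with` block, which would close it on exit.
    add_root_node(session)
    add_has_year_relationship(session, year)

    for month, days in CALENDAR.items():
        add_has_month(session, year, month)
        for day in range(1, days + 1):
            add_has_day(session, month, day)
            # Hours run from 0 to 23 so that they match the `hour` component
            # of Neo4j temporal values used when connecting the trips.
            for hour in range(HOURS):
                add_has_hour(session, month, day, hour)
                for minute in MINUTES:
                    has_minute(session, month, day, hour, minute)
        # Chain consecutive days within the month.
        for day in range(1, days):
            next_relationship_in_month(session, day, month)
    # Link the last day of each month to the first day of the next month.
    for month, days in CALENDAR.items():
        if month < 12:
            next_relationship_next_month(session, days, month)

create_time_tree(YEAR, db_driver)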
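
As a sanity check for the tree construction, the expected number of nodes per level can be computed from the constants alone. The sketch below assumes 24 hours per day and four 15-minute marks per hour, as defined above. The helper name `expected_time_tree_counts` is hypothetical, and the calendar is repeated so that the block is self-contained.

```python
CALENDAR_2017 = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
                 7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}


def expected_time_tree_counts(calendar, hours=24, minutes_per_hour=4):
    """Return the expected node count per label and the number of NEXT links."""
    days = sum(calendar.values())
    return {
        "Root": 1,
        "Year": 1,
        "Month": len(calendar),
        "Day": days,
        "Hour": days * hours,
        "Minute": days * hours * minutes_per_hour,
        # Every day except the last one of the year points to its successor.
        "NEXT": days - 1,
    }


counts = expected_time_tree_counts(CALENDAR_2017)
print(counts["Day"], counts["Hour"], counts["Minute"], counts["NEXT"])
# prints: 365 8760 35040 364
```

These numbers can be compared with the counts per label in the database after the tree has been created.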

### Connection between the Time Tree and the Trips
In this section we provide queries that connect the trip nodes to the Time Tree. We first round the minute of each trip timestamp down to the nearest 15-minute mark (for example, 14:44 becomes 14:30). We then connect each trip to the matching minute node in the Time Tree, with a STARTED_AT relationship for its start time (validFrom) and an ENDED_AT relationship for its end time (validTo).

In [None]:
connection_timetree_trip_start = """CALL apoc.periodic.iterate(
  "
  MATCH (t:Trip)
  RETURN t, 
         t.validFrom.year AS year, 
         t.validFrom.month AS month, 
         t.validFrom.day AS day, 
         t.validFrom.hour AS hour, 
         CASE 
           WHEN t.validFrom.minute % 15 = 0 THEN t.validFrom.minute
           ELSE toInteger(floor(t.validFrom.minute / 15.0) * 15)
         END AS roundedMinute
  ",
  "
  MATCH (:Year {value: year})-[:HAS_MONTH]->(:Month {value: month})-[:HAS_DAY]->(:Day {value: day})-[:HAS_HOUR]->(:Hour {value: hour})-[:HAS_MINUTE]->(mi:Minute {value: roundedMinute})
  MERGE (t)-[:STARTED_AT]->(mi)
  ",
  {batchSize: 1000, parallel: false}
) YIELD batches, total
RETURN batches, total;
"""

connection_timetree_trip_end = """CALL apoc.periodic.iterate(
  "
  MATCH (t:Trip)
  RETURN t, 
         t.validTo.year AS validToYear, 
         t.validTo.month AS validToMonth, 
         t.validTo.day AS validToDay, 
         t.validTo.hour AS validToHour, 
         CASE 
           WHEN t.validTo.minute % 15 = 0 THEN t.validTo.minute
           ELSE toInteger(floor(t.validTo.minute / 15.0) * 15)
         END AS validToRoundedMinute
  ",
  "
  // Connect validTo to the Time Tree
  MATCH (:Year {value: validToYear})-[:HAS_MONTH]->(:Month {value: validToMonth})-[:HAS_DAY]->(:Day {value: validToDay})-[:HAS_HOUR]->(:Hour {value: validToHour})-[:HAS_MINUTE]->(toMinute:Minute {value: validToRoundedMinute})
  MERGE (t)-[:ENDED_AT]->(toMinute)
  ",
  {batchSize: 1000, parallel: false}
) YIELD batches, total
RETURN batches, total;"""


run_query(connection_timetree_trip_start)
run_query(connection_timetree_trip_end)
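
The rounding rule used in the two queries above can be illustrated in plain Python. This is only a sketch of the arithmetic, not part of the import itself; the function name `floor_to_interval` is hypothetical.

```python
from datetime import datetime


def floor_to_interval(timestamp, interval=15):
    """Round a timestamp down to the start of its `interval`-minute slot."""
    # Integer division drops the remainder, e.g. 44 // 15 * 15 == 30.
    floored_minute = timestamp.minute // interval * interval
    return timestamp.replace(minute=floored_minute, second=0, microsecond=0)


print(floor_to_interval(datetime(2017, 3, 5, 14, 44, 59)))
# prints: 2017-03-05 14:30:00
```

A minute that is already a multiple of 15 is returned unchanged, so both branches of the `CASE` expression in the queries yield the same value as this single floor operation.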

## Temporal Node Embedding with FastRP

In this section we will create the in-memory graph projection of the Time Tree and apply the FastRP algorithm to embed the nodes of the Time Tree.


### Create the In-Memory Graph Projection
First we create the in-memory graph projection, which FastRP needs as its input. Projected graphs can also include numerical properties from the original graph, but here we project structure only: the Time Tree together with the trip and station nodes connected to it. All relationships are projected as undirected.

In [None]:
create_memory_graph = """
MATCH (source)-[r:HAS_MONTH|HAS_DAY|NEXT|HAS_HOUR|HAS_MINUTE|STARTED_AT|ENDED_AT|HAS_START|HAS_END]->(target)
WITH gds.graph.project(
  'timeTreeInterval',
  source,
  target,
  {},
  {undirectedRelationshipTypes: ['*']}
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels
""" 

run_query(create_memory_graph)

Now we check that the in-memory graph was created and store a handle to it in the variable G.

In [None]:
G = gds.graph.get("timeTreeInterval")

### FastRP Algorithm Estimation
Before running FastRP, we estimate its memory requirements. If the estimate fits within our available resources, we can run the algorithm to embed the nodes of the Time Tree.

In [None]:
gds.fastRP.write.estimate(
    G,
    writeProperty="timeTreeIntervalEmbedding",
    randomSeed=42,
    embeddingDimension=64,
    iterationWeights=[1.0, 1.0, 1.0, 1.0, 1.0]
)

### FastRP Algorithm Execution
Now we run the FastRP algorithm to embed the nodes of the Time Tree. We will write the embedding into the property timeTreeIntervalEmbedding.

In [None]:
result = gds.fastRP.write(
    G,
    writeProperty="timeTreeIntervalEmbedding",
    randomSeed=42,
    embeddingDimension=64,
    iterationWeights=[1.0, 1.0, 1.0, 1.0, 1.0]
)
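
To give a rough intuition for what the five `iterationWeights` control, the following pure-Python sketch mimics the core idea of FastRP: each node starts with a random vector, repeatedly averages the vectors of its neighbours, and the final embedding is the weighted sum of these intermediate vectors. This is a simplified illustration and not the GDS implementation. It uses dense random vectors and omits the sparsity and degree normalization of the real algorithm. The function name `fastrp_sketch` and the toy graph are hypothetical.

```python
import math
import random


def fastrp_sketch(adjacency, dim, iteration_weights, seed=42):
    """Toy illustration of FastRP-style propagation (not the GDS implementation)."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    # Start from one random vector per node (dense +/-1 entries for simplicity).
    current = {n: [rng.choice((-1.0, 1.0)) for _ in range(dim)] for n in nodes}
    embedding = {n: [0.0] * dim for n in nodes}

    for weight in iteration_weights:
        propagated = {}
        for n in nodes:
            neighbours = adjacency[n]
            # Average the neighbours' vectors from the previous iteration.
            vec = [0.0] * dim
            for m in neighbours:
                for i in range(dim):
                    vec[i] += current[m][i] / len(neighbours)
            # L2-normalise the intermediate vector (all-zero vectors stay as-is).
            norm = math.sqrt(sum(v * v for v in vec))
            propagated[n] = [v / norm for v in vec] if norm > 0 else vec
        current = propagated
        # Add this iteration's vectors to the result, scaled by its weight.
        for n in nodes:
            for i in range(dim):
                embedding[n][i] += weight * current[n][i]
    return embedding


# Hypothetical toy graph: two leaves attached to one hub, undirected.
toy_graph = {"hub": ["a", "b"], "a": ["hub"], "b": ["hub"]}
toy_embedding = fastrp_sketch(toy_graph, dim=8, iteration_weights=[1.0] * 5)
```

In this toy graph the two leaves have the same neighbourhood and therefore receive identical vectors. With five weights, the sketch mixes information from neighbourhoods up to five hops away, in this case with equal weight per hop.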

### Dropping Graph and Closing Connection
After we have finished the embedding we can drop the graph and close the connection to the database.

In [None]:
G.drop()
db_driver.close()
gds.close()