## Prepare Project
### Import Libraries and set up database connection

In [2]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Import pandas
import pandas as pd
pd.set_option('display.width', 2000)

# Import logging and suppress warnings
import logging
logging.getLogger("neo4j").setLevel(logging.ERROR)
logging.getLogger("pd").setLevel(logging.ERROR)

# Import Path
from pathlib import Path

# Import promg
from promg import Query

In [3]:
from util.db_helper_functions import get_db_connection, get_graph_statistics
from util.transformer_functions import build_entities, build_relationships
from util.assign_types_functions import add_object_type_node, add_event_type_node

## Set up connection

In [4]:
conf_path = Path('bpic14', 'config.yaml')
db_connection = get_db_connection(conf_path)

These are the credentials that I expect to be set for the database.
db_name: neo4j
uri: bolt://localhost:7687
password: bpic2014
----------------------
If you have other credentials, please change them at: bpic14\config.yaml


# 1. Map Entities into Objects and Events
Instead of mapping _Record_ nodes first into domain-specific _Entity_ nodes and then into domain-specific _Object_ and _Event_ nodes, we will map the _Record_ nodes immediately into domain-specific _Object_ and _Event_ nodes.

Conceptually, the result is the same, but the mapping runs once instead of twice, which saves computation.

In [6]:
bpic14_incident = "BPIC14Incident.csv"
bpic14_interaction = "BPIC14Interaction.csv"
bpic14_change = "Detail_Change.csv"
bpic14_incident_activity = "Detail_Incident_Activity.csv"

### Object Nodes

Create objects directly from the records available in the input files.<br>
We take all entities in the domain model that refer to an object. Those are:

- Incident
- Interaction
- Change
- Knowledge Document
- Resource
- Configuration Item
- Service Component

For every entity, we define how it should be created, considering:
- which log to read from
- which field to use as the unique sysId
- which attributes to keep
- any constant properties that should be added to the node
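The four points above map onto the keys `log`, `sysId`, `attributes` and `constants` used in the descriptions later in this notebook. As a minimal sketch, a description could be checked before it is passed on. `check_description` is a hypothetical helper written for illustration; it is not part of `util.transformer_functions`.

```python
# Hypothetical helper for illustration; not part of util.transformer_functions.
REQUIRED_KEYS = {"log", "sysId", "attributes"}
# "id_addition" is used by the event descriptions later in this notebook.
OPTIONAL_KEYS = {"constants", "id_addition"}


def check_description(description: dict) -> list:
    """Return a list of problems found in one entity description."""
    problems = []
    keys = set(description)
    missing = REQUIRED_KEYS - keys
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    if unknown:
        problems.append(f"unknown keys: {sorted(unknown)}")
    if not isinstance(description.get("attributes", {}), dict):
        problems.append("'attributes' must map node property -> record field")
    return problems


example = {
    "log": "BPIC14Interaction.csv",      # which log to read from
    "sysId": "relatedIncident",          # field used as the unique sysId
    "attributes": {"incidentId": "relatedIncident"},  # attributes to keep
    "constants": {"derivedFromInteraction": True},    # constant properties
}
print(check_description(example))
```

An empty list means the description has the expected shape.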

#### Primary Entities
The following entities can be directly extracted from their primary logs, where their ID serves as the primary key:
 | Entity      | Primary Log Table   | Primary Key   |
 |-------------|---------------------|---------------|
 | Incident    | bpic14_incident     | incidentId    |
 | Interaction | bpic14_interaction  | interactionId |
 | Change      | bpic14_change       | changeId      |

---

#### Foreign Key References
These entities are also referenced as foreign keys in other logs:
 | Entity      | Referenced In               | Foreign Key Field   |
 |-------------|-----------------------------|---------------------|
 | Incident    | bpic14_interaction          | relatedIncident     |
 |             | bpic14_incident_activity    | incidentId          |
 | Interaction | bpic14_incident             | relatedInteraction  |
 | Change      | bpic14_incident             | relatedChange       |

---

#### Supporting Entities (Referenced Only as Foreign Keys)
The following entities are **not** extracted from a primary log but are referenced as foreign keys in other logs:
 | Entity                        | Referenced In            | Foreign Key Field   | Notes                                       |
 |-------------------------------|--------------------------|---------------------|---------------------------------------------|
 | Knowledge Document            | All logs                 | kmNumber            | No primary log; referenced across all logs. |
 | Resource                      | bpic14_incident_activity | assignmentGroup     |                                             |
 | (Affected) Configuration Item | All logs                 | ciNameAff           | For CIs affected by the primary entity.     |
 | (CausedBy) Configuration Item | bpic14_incident          | ciNameCby           |                                             |
 | (Affected) Service Component  | All logs                 | serviceComponentAff | For SCs affected by the primary entity.     |
 | (CausedBy) Service Component  | bpic14_incident          | serviceComponentCBy |                                             |

When an entity is referenced in all logs, we set `log` to `None` in its object description.
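One way to picture the effect of `log: None` is as the absence of a log filter when selecting records. The sketch below only builds a Cypher string and is not the code behind `build_entities`. It assumes that records are attached to their log via `(:Log)-[:CONTAINS]->(:Record)` and that the log node stores the file name in a property called `name`; both the helper `record_match_clause` and the `name` property are assumptions.

```python
def record_match_clause(log, sys_id):
    """Build a Cypher MATCH clause selecting the records to read from.

    Assumptions (not confirmed by the notebook): records hang off their log
    via (:Log)-[:CONTAINS]->(:Record), and the log file name is stored in a
    property called `name`.
    """
    if log is None:
        # "All logs": no restriction on the originating log.
        match = "MATCH (r:Record)"
    else:
        match = f"MATCH (:Log {{name: '{log}'}})-[:CONTAINS]->(r:Record)"
    # Only records that actually carry the identifier can yield a node.
    return f"{match} WHERE r.{sys_id} IS NOT NULL"


print(record_match_clause(None, "kmNumber"))
print(record_match_clause("BPIC14Incident.csv", "ciNameCby"))
```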

In [5]:
objects = {
    "Incident": [
        {
            "log": bpic14_incident,
            "sysId": "incidentId",
            "attributes": {
                "incidentId": "incidentId",
                "status": "status",
                "impact": "impact",
                "priority": "priority",
                "category": "category",
                "handleTimeHours": "handleTimeHours",
                "closureCode": "closureCode",
                "alertStatus": "alertStatus",
                "numReassignments": "numReassignments",
                "numRelatedInteractions": "numRelatedInteractions",
                "numRelatedIncidents": "numRelatedIncidents",
                "numRelatedChanges": "numRelatedChanges"
            },
        },
        {
            "log": bpic14_interaction,
            "sysId": "relatedIncident",
            "attributes": {
                "incidentId": "relatedIncident"
            },
            "constants": {
                "derivedFromInteraction": True
            }
        },

        {
            "log": bpic14_incident_activity,
            "sysId": "incidentId",
            "attributes": {
                "incidentId": "incidentId"
            }
        }
    ],
    "Interaction": [
        {
            "log": bpic14_interaction,
            "sysId": "interactionId",
            "attributes": {
                "interactionId": "interactionId",
                "status": "status",
                "impact": "impact",
                "priority": "priority",
                "category": "category",
                "handleTimeSecs": "handleTimeSecs",
                "closureCode": "closureCode",
                "firstCallResolution": "firstCallResolution"
            },
        },
        {
            "log": bpic14_incident,
            "sysId": "relatedInteraction",
            "attributes": {
                "interactionId": "relatedInteraction"
            },
        }
    ],
    "Change": [
        {
            "log": bpic14_change,
            "sysId": "changeId",
            "attributes": {
                "changeId": "changeId",
                "type": "changeType",
                "riskAssessment": "riskAssessment",
                "cabApprovalNeeded": "cabApprovalNeeded",
                "plannedStart": "plannedStart",
                "plannedEnd": "plannedEnd",
                "scheduledDowntimeStart": "scheduledDowntimeStart",
                "scheduledDowntimeEnd": "scheduledDowntimeEnd",
                "requestedEndDate": "requestedEndDate",
                "originatedFrom": "originatedFrom",
                "numRelatedInteractions": "numRelatedInteractions",
                "numRelatedIncidents": "numRelatedIncidents"
            },
        }, {
            "log": bpic14_incident,
            "sysId": "relatedChange",
            "attributes": {
                "changeId": "relatedChange"
            },
            "constants": {
                "derivedFromIncident": True
            }
        }
    ],
    "KnowledgeDocument": [
        {
            "log": None,
            "sysId": "kmNumber",
            "attributes": {"kmNumber": "kmNumber"}
        }
    ],
    "Resource": [
        {
            "log": bpic14_incident_activity,
            "sysId": "assignmentGroup",
            "attributes": {"assignmentGroup": "assignmentGroup"}
        }
    ],
    "ConfigurationItem": [
        {  # affected CIs
            "log": None,
            "sysId": "ciNameAff",
            "attributes": {
                "ciName": "ciNameAff",
                "ciType": "ciTypeAff",
                "ciSubtype": "ciSubtypeAff"
            },
            "constants": {
                "affected": True
            }
        },
        {  # caused by CIs
            "log": bpic14_incident,
            "sysId": "ciNameCby",
            "attributes": {
                "ciName": "ciNameCby",
                "ciType": "ciTypeCby",
                "ciSubtype": "ciSubtypeCby"
            },
            "constants": {
                "caused": True
            }

        }

    ],
    "ServiceComponent": [
        {  # affected SCs
            "log": None,
            "sysId": "serviceComponentAff",
            "attributes": {
                "scName": "serviceComponentAff"
            },
            "constants": {
                "affected": True
            }
        },
        {  # caused by SCs
            "log": bpic14_incident,
            "sysId": "serviceComponentCBy",
            "attributes": {
                "scName": "serviceComponentCBy"
            },
            "constants": {
                "caused": True
            }
        },
    ]
}
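Several descriptions can sit under one label: `Incident`, for instance, has three. Since every description names a `sysId`, it seems reasonable to assume that they are merged on that key, so that an incident seen in its primary log and again as a foreign key elsewhere ends up as a single node. How `build_entities` handles this internally is not shown in this notebook; the following is only an in-memory sketch of that assumed merge behaviour, using made-up identifiers and values.

```python
def merge_nodes(sources):
    """Merge (sysId, properties) pairs into one node per sysId."""
    nodes = {}
    for sys_id, properties in sources:
        # Comparable to a Cypher MERGE followed by SET:
        # create the node once, then add properties to it.
        nodes.setdefault(sys_id, {"sysId": sys_id}).update(properties)
    return nodes


# Made-up example values, one pair per (record, description) match.
pairs = [
    ("IM0000004", {"incidentId": "IM0000004", "status": "Closed"}),             # incident log
    ("IM0000004", {"incidentId": "IM0000004", "derivedFromInteraction": True}),  # interaction log
    ("IM0000005", {"incidentId": "IM0000005"}),                                  # incident activity log
]

nodes = merge_nodes(pairs)
print(len(nodes))
print(nodes["IM0000004"])
```

Under this assumption, the node count per label is the number of distinct `sysId` values across all of its descriptions, not the number of matching records.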

We create _Object_ nodes with their domain-specific name as label (e.g. `Change` or `ServiceComponent`).

We extract the data as specified in the description above using the imported `build_entities` function.

In [8]:
build_entities(db_connection, entities=objects)


=== INDEXES ===
Index for :Incident(sysId)
Index for :Interaction(sysId)
Index for :Change(sysId)
Index for :KnowledgeDocument(sysId)
Index for :Resource(sysId)
Index for :ConfigurationItem(sysId)
Index for :ServiceComponent(sysId)

=== Building ENTITY NODES ===
→ Incident nodes created.
→ Incident nodes created.
→ Incident nodes created.
→ Interaction nodes created.
→ Interaction nodes created.
→ Change nodes created.
→ Change nodes created.
→ KnowledgeDocument nodes created.
→ Resource nodes created.
→ ConfigurationItem nodes created.
→ ConfigurationItem nodes created.
→ ServiceComponent nodes created.
→ ServiceComponent nodes created.


In [9]:
get_graph_statistics(db_connection)


=== GRAPH STATISTICS ===

--- Node counts ---
(:Record)                      690622
(:Interaction)                 147172
(:Incident)                    47057
(:Change)                      18026
(:ConfigurationItem)           15134
(:KnowledgeDocument)           2373
(:ServiceComponent)            340
(:Resource)                    242
(:Log)                         4

--- Relationship counts ---
[:EXTRACTED_FROM]              2371470
[:CONTAINS]                    690622

--- Totals ---
Total nodes: 920970
Total relationships: 3062092
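The totals in the statistics above are the sums of the per-label counts. A quick check with the numbers copied from that output:

```python
# Counts copied from the graph statistics printed above.
node_counts = {
    "Record": 690622,
    "Interaction": 147172,
    "Incident": 47057,
    "Change": 18026,
    "ConfigurationItem": 15134,
    "KnowledgeDocument": 2373,
    "ServiceComponent": 340,
    "Resource": 242,
    "Log": 4,
}
relationship_counts = {"EXTRACTED_FROM": 2371470, "CONTAINS": 690622}

total_nodes = sum(node_counts.values())
total_relationships = sum(relationship_counts.values())
print(total_nodes, total_relationships)  # 920970 3062092
```

At this stage the number of `CONTAINS` relationships equals the number of `Record` nodes, which is consistent with each record being contained in exactly one log.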


## Object-to-Object (O2O) Relationships

As with the objects, we define and specify the following Object-to-Object (O2O) relationships:
- (Incident|Interaction)-[:USED_KM]->(KnowledgeDocument)
- (Incident)-[:RELATED_CHANGE]->(Change)
- (Interaction)-[:RELATED_INCIDENT]->(Incident)
- (Incident|Interaction|Change)-[:AFFECTED_CI]->(ConfigurationItem)
- (Incident|Interaction|Change)-[:AFFECTED_SC]->(ServiceComponent)
- (Incident)-[:CAUSED_BY_CI]->(ConfigurationItem)
- (Incident)-[:CAUSED_BY_SC]->(ServiceComponent)
- (ServiceComponent)-[:CONTAINS]->(ConfigurationItem)

In [7]:
o2o_relationships = {
    "USED_KM": [{
        "from_object": {
            "label": "Incident|Interaction"
        },
        "to_object": {
            "label": "KnowledgeDocument",
            "foreign_key": "kmNumber"
        }
    }],
    "RELATED_CHANGE": [{
        "from_object": {
            "label": "Incident"
        },
        "to_object": {
            "label": "Change",
            "foreign_key": "relatedChange"
        }
    }],
    "RELATED_INCIDENT": [
        {
            "from_object": {
                "label": "Interaction"
            },
            "to_object": {
                "label": "Incident",
                "foreign_key": "relatedIncident"
            }
        },
        {
            "from_object": {
                "label": "Interaction",
                "foreign_key": "relatedInteraction"
            },
            "to_object": {
                "label": "Incident"
            },
            "constants": {
                "primary": True
            }
        }],
    "AFFECTED_CI": [{
        "from_object": {
            "label": "Incident"
        },
        "to_object": {
            "label": "ConfigurationItem",
            "foreign_key": "ciNameAff",
        },
        "log": bpic14_incident
    },
        {
            "from_object": {
                "label": "Interaction"
            },
            "to_object": {
                "label": "ConfigurationItem",
                "foreign_key": "ciNameAff",
            },
            "log": bpic14_interaction
        },
        {
            "from_object": {
                "label": "Change"
            },
            "to_object": {
                "label": "ConfigurationItem",
                "foreign_key": "ciNameAff",
            },
            "log": bpic14_change
        }],
    "AFFECTED_SC": [{
        "from_object": {
            "label": "Incident"
        },
        "to_object": {
            "label": "ServiceComponent",
            "foreign_key": "serviceComponentAff",
        },
        "log": bpic14_incident
    },
        {
            "from_object": {
                "label": "Interaction"
            },
            "to_object": {
                "label": "ServiceComponent",
                "foreign_key": "serviceComponentAff",
            },
            "log": bpic14_interaction
        },
        {
            "from_object": {
                "label": "Change"
            },
            "to_object": {
                "label": "ServiceComponent",
                "foreign_key": "serviceComponentAff"
            },
            "log": bpic14_change
        }],
    "CAUSED_BY_CI": [{
        "from_object": {
            "label": "Incident"
        },
        "to_object": {
            "label": "ConfigurationItem",
            "foreign_key": "ciNameCby"
        },
    }],
    "CAUSED_BY_SC": [{
        "from_object": {
            "label": "Incident"
        },
        "to_object": {
            "label": "ServiceComponent",
            "foreign_key": "serviceComponentCBy"
        },
    }],
    "CONTAINS": [{
        "from_object": {
            "label": "ServiceComponent",
            "foreign_key": "serviceComponentAff"
        },
        "to_object": {
            "label": "ConfigurationItem",
            "foreign_key": "ciNameAff"
        },
    }, {
        "from_object": {
            "label": "ServiceComponent",
            "foreign_key": "serviceComponentCBy"
        },
        "to_object": {
            "label": "ConfigurationItem",
            "foreign_key": "ciNameCby"
        },
    }]
}
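To read a specification like the dictionary above at a glance, it can be rendered as graph patterns. The helper below is a small sketch written for this purpose, not part of the imported utilities. It assumes that a label such as `Incident|Interaction` stands for alternative source labels and that relationships point from `from_object` to `to_object`.

```python
def patterns(relationships):
    """Yield one '(A)-[:TYPE]->(B)' string per label combination.

    Assumptions: "A|B" lists alternative labels, and the relationship
    points from `from_object` to `to_object`.
    """
    for rel_type, specs in relationships.items():
        for spec in specs:
            target = spec["to_object"]["label"]
            for source in spec["from_object"]["label"].split("|"):
                yield f"({source})-[:{rel_type}]->({target})"


# Two entries in the same shape as the specification above.
sample = {
    "USED_KM": [{
        "from_object": {"label": "Incident|Interaction"},
        "to_object": {"label": "KnowledgeDocument", "foreign_key": "kmNumber"},
    }],
    "RELATED_CHANGE": [{
        "from_object": {"label": "Incident"},
        "to_object": {"label": "Change", "foreign_key": "relatedChange"},
    }],
}

for pattern in patterns(sample):
    print(pattern)
```

Calling `patterns(o2o_relationships)` in the notebook would list every relationship the specification is expected to produce.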

In [8]:
build_relationships(_db_connection=db_connection,
                    _relationships=o2o_relationships)


=== INDEXES ===
Index ensured for :Record(kmNumber)
Index ensured for :Record(relatedChange)
Index ensured for :Record(relatedIncident)
Index ensured for :Record(relatedInteraction)
Index ensured for :Record(ciNameAff)
Index ensured for :Record(ciNameAff)
Index ensured for :Record(ciNameAff)
Index ensured for :Record(serviceComponentAff)
Index ensured for :Record(serviceComponentAff)
Index ensured for :Record(serviceComponentAff)
Index ensured for :Record(ciNameCby)
Index ensured for :Record(serviceComponentCBy)
Index ensured for :Record(serviceComponentAff)
Index ensured for :Record(ciNameAff)
Index ensured for :Record(serviceComponentCBy)
Index ensured for :Record(ciNameCby)

=== O2O RELATIONSHIPS ===
→ (:{'label': 'Incident|Interaction'}) - [:USED_KM] -> (:{'label': 'KnowledgeDocument', 'foreign_key': 'kmNumber'}) Relationship built
→ (:{'label': 'Incident'}) - [:RELATED_CHANGE] -> (:{'label': 'Change', 'foreign_key': 'relatedChange'}) Relationship built
→ (:{'label': 'Interaction'

In [9]:
get_graph_statistics(db_connection)


=== GRAPH STATISTICS ===

--- Node counts ---
(:Record)                      690622
(:Interaction)                 147172
(:Incident)                    47057
(:Change)                      18026
(:ConfigurationItem)           15134
(:KnowledgeDocument)           2373
(:ServiceComponent)            340
(:Resource)                    242
(:Log)                         4

--- Relationship counts ---
[:EXTRACTED_FROM]              2371470
[:CONTAINS]                    705949
[:AFFECTED_CI]                 223734
[:AFFECTED_SC]                 212948
[:USED_KM]                     194437
[:RELATED_INCIDENT]            52687
[:CAUSED_BY_CI]                45499
[:CAUSED_BY_SC]                43123
[:RELATED_CHANGE]              536

--- Totals ---
Total nodes: 920970
Total relationships: 3850383


## Event Nodes

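There are four types of events: Incident Events, Incident Activity Events, Change Events and Interaction Events. In the definitions below, Incident, Change and Interaction events are derived from the timestamp columns of their logs (for example `openTime`, `resolvedTime` and `closeTime` for incidents), while Incident Activity Events are read from the incident activity log, one event per `activityNumber`.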

TODO: FZE: WHY??? How do you infer this from the raw data?

In [7]:
EVENTS = {
    "IncidentEvent": [
        {
            "log": bpic14_incident,
            "sysId": "incidentId",
            "id_addition": "_Open",
            "attributes": {
                "timestamp": "openTime"
            },
            "constants": {
                "activity": "'Open'"
            }
        }, {
            "log": bpic14_incident,
            "sysId": "incidentId",
            "id_addition": "_Resolve",
            "attributes": {
                "timestamp": "resolvedTime"
            },
            "constants": {
                "activity": "'Resolve'"
            }
        }, {
            "log": bpic14_incident,
            "sysId": "incidentId",
            "id_addition": "_Close",
            "attributes": {
                "timestamp": "closeTime"
            },
            "constants": {
                "activity": "'Close'"
            }
        }
    ],
    "ChangeEvent": [
        {
            "log": bpic14_change,
            "sysId": "changeId",
            "id_addition": "_Start",
            "attributes": {
                "timestamp": "actualStart"
            },
            "constants": {
                "activity": "'Start'"
            }
        }, {
            "log": bpic14_change,
            "sysId": "changeId",
            "id_addition": "_End",
            "attributes": {
                "timestamp": "actualEnd"
            },
            "constants": {
                "activity": "'End'"
            }
        }
    ],
    "InteractionEvent": [
        {
            "log": bpic14_interaction,
            "sysId": "interactionId",
            "id_addition": "_Open",
            "attributes": {
                "timestamp": "openTime"
            },
            "constants": {
                "activity": "'Open'"
            }
        }, {
            "log": bpic14_interaction,
            "sysId": "interactionId",
            "id_addition": "_Close",
            "attributes": {
                "timestamp": "closeTime"
            },
            "constants": {
                "activity": "'Close'"
            }
        }],
    "IncidentActivityEvent": [
        {
            "log": bpic14_incident_activity,
            "sysId": "activityNumber",
            "attributes": {
                "activity": "incidentActivityType",
                "timestamp": "dateStamp"
            }
        }
    ],
}
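Each description above turns one column of a record into one event, so a single incident record can yield up to three events. The sketch below illustrates this unfolding in plain Python. It assumes that `id_addition` is appended to the `sysId` value to keep event identifiers unique, that constants are quoted Cypher literals, and that records without the timestamp yield no event; none of this is confirmed by the notebook, and the record values are made up.

```python
def unfold_events(record, descriptions):
    """Turn one record into a list of events, one per applicable description."""
    events = []
    for d in descriptions:
        # Assumption: id_addition is a suffix that keeps event ids unique.
        event = {"sysId": f"{record[d['sysId']]}{d.get('id_addition', '')}"}
        for prop, field in d["attributes"].items():
            event[prop] = record.get(field)
        for prop, literal in d.get("constants", {}).items():
            # Constants are written as quoted literals such as "'Open'".
            event[prop] = literal.strip("'")
        if event.get("timestamp") is None:
            continue  # assumed: no timestamp, no event
        events.append(event)
    return events


incident_event_descriptions = [
    {"sysId": "incidentId", "id_addition": "_Open",
     "attributes": {"timestamp": "openTime"}, "constants": {"activity": "'Open'"}},
    {"sysId": "incidentId", "id_addition": "_Resolve",
     "attributes": {"timestamp": "resolvedTime"}, "constants": {"activity": "'Resolve'"}},
    {"sysId": "incidentId", "id_addition": "_Close",
     "attributes": {"timestamp": "closeTime"}, "constants": {"activity": "'Close'"}},
]

# Made-up record with a missing resolvedTime.
record = {"incidentId": "IM0000004", "openTime": "2012-02-05 13:32:57",
          "resolvedTime": None, "closeTime": "2013-11-04 13:51:18"}

events = unfold_events(record, incident_event_descriptions)
for event in events:
    print(event)
```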


In [11]:
build_entities(db_connection, entities=EVENTS)


=== INDEXES ===
Index for :IncidentEvent(sysId)
Index for :ChangeEvent(sysId)
Index for :InteractionEvent(sysId)
Index for :IncidentActivityEvent(sysId)

=== Building ENTITY NODES ===
→ IncidentEvent nodes created.
→ IncidentEvent nodes created.
→ IncidentEvent nodes created.
→ ChangeEvent nodes created.
→ ChangeEvent nodes created.
→ InteractionEvent nodes created.
→ InteractionEvent nodes created.
→ IncidentActivityEvent nodes created.


In [12]:
get_graph_statistics(db_connection)


=== GRAPH STATISTICS ===

--- Node counts ---
(:Record)                      690622
(:IncidentActivityEvent)       466737
(:InteractionEvent)            294008
(:Interaction)                 147172
(:IncidentEvent)               138038
(:Incident)                    47057
(:ChangeEvent)                 33381
(:Change)                      18026
(:ConfigurationItem)           15134
(:KnowledgeDocument)           2373
(:ServiceComponent)            340
(:Resource)                    242
(:Log)                         4

--- Relationship counts ---
[:EXTRACTED_FROM]              3324284
[:CONTAINS]                    705949
[:AFFECTED_CI]                 223734
[:AFFECTED_SC]                 212948
[:USED_KM]                     194437
[:RELATED_INCIDENT]            52687
[:CAUSED_BY_CI]                45499
[:CAUSED_BY_SC]                43123
[:RELATED_CHANGE]              536

--- Totals ---
Total nodes: 1853134
Total relationships: 4803197


## Event-to-Object (E2O) Relationships

TODO: FZE: add here what these relationships are and how they are built

In [13]:
e2o_relationships = {
    "CORR": [
        {
            "from_object": {
                "label": "IncidentEvent"
            },
            "to_object": {
                "label": "Incident",
                "foreign_key": "incidentId"
            }
        },
        {
            "from_object": {
                "label": "ChangeEvent"
            },
            "to_object": {
                "label": "Change",
                "foreign_key": "changeId"
            }
        },
        {
            "from_object": {
                "label": "InteractionEvent"
            },
            "to_object": {
                "label": "Interaction",
                "foreign_key": "interactionId"
            }
        },
        {
            "from_object": {
                "label": "IncidentActivityEvent"
            },
            "to_object": {
                "label": "Incident",
                "foreign_key": "incidentId"
            }
        }
    ],
    "EXECUTED_BY": [
        {
            "from_object": {
                "label": "IncidentActivityEvent"
            },
            "to_object": {
                "label": "Resource",
                "foreign_key": "assignmentGroup"
            }
        }
    ]

}
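The specification above can be read as follows: an event is linked to the object whose identifier appears in the foreign-key field of the record the event came from. The sketch below mimics this in memory for `IncidentActivityEvent`. It is an illustration under that assumption, not the logic of `build_relationships`; the helper `link` and the record values are made up.

```python
def link(records, event_key, foreign_key, rel_type):
    """Return (event sysId, relationship type, object sysId) triples."""
    return [
        (r[event_key], rel_type, r[foreign_key])
        for r in records
        if r.get(foreign_key) is not None  # no foreign key, no relationship
    ]


# Made-up incident activity records.
records = [
    {"activityNumber": "001A0000001", "incidentId": "IM0000004", "assignmentGroup": "TEAM0001"},
    {"activityNumber": "001A0000002", "incidentId": "IM0000004", "assignmentGroup": None},
]

corr = link(records, "activityNumber", "incidentId", "CORR")
executed_by = link(records, "activityNumber", "assignmentGroup", "EXECUTED_BY")
print(corr)
print(executed_by)
```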

In [14]:
build_relationships(db_connection, _relationships=e2o_relationships)


=== INDEXES ===
Index ensured for :Record(incidentId)
Index ensured for :Record(changeId)
Index ensured for :Record(interactionId)
Index ensured for :Record(incidentId)
Index ensured for :Record(assignmentGroup)

=== O2O RELATIONSHIPS ===
→ (:{'label': 'IncidentEvent'}) - [:CORR] -> (:{'label': 'Incident', 'foreign_key': 'incidentId'}) Relationship built
→ (:{'label': 'ChangeEvent'}) - [:CORR] -> (:{'label': 'Change', 'foreign_key': 'changeId'}) Relationship built
→ (:{'label': 'InteractionEvent'}) - [:CORR] -> (:{'label': 'Interaction', 'foreign_key': 'interactionId'}) Relationship built
→ (:{'label': 'IncidentActivityEvent'}) - [:CORR] -> (:{'label': 'Incident', 'foreign_key': 'incidentId'}) Relationship built
→ (:{'label': 'IncidentActivityEvent'}) - [:EXECUTED_BY] -> (:{'label': 'Resource', 'foreign_key': 'assignmentGroup'}) Relationship built


In [None]:
get_graph_statistics(db_connection)

# 2. Assign Types

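`add_object_type_node` creates an `ObjectType` node (e.g., "Incident", "Interaction") and then links every node of that label in the graph to this type node with an `IS_OF_TYPE` relationship. `add_event_type_node` creates the corresponding `EventType` node for each event label.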
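As a rough sketch of what such a typing step could look like in Cypher, the helper below builds a query string for one label. The actual query used by `add_object_type_node` is not shown in this notebook, so `object_type_query` and the parameter name `$object_type` are assumptions made for illustration.

```python
def object_type_query(label: str) -> str:
    """Cypher that creates one ObjectType node and links all `label` nodes to it."""
    # Labels cannot be passed as Cypher parameters, so the label is
    # interpolated into the query; only use trusted, known labels here.
    return (
        "MERGE (t:ObjectType {objectType: $object_type}) "
        "WITH t "
        f"MATCH (n:`{label}`) "
        "MERGE (n)-[:IS_OF_TYPE]->(t)"
    )


query = object_type_query("Incident")
print(query)
```

Using `MERGE` for both the type node and the relationship keeps the step idempotent, so rerunning the cell would not duplicate type nodes or `IS_OF_TYPE` relationships.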

In [7]:
for label in objects.keys():
    add_object_type_node(_db_connection=db_connection, _object_type=label)

-> (:ObjectType {objectType: "Incident"}) created.
-> (:ObjectType {objectType: "Interaction"}) created.
-> (:ObjectType {objectType: "Change"}) created.
-> (:ObjectType {objectType: "KnowledgeDocument"}) created.
-> (:ObjectType {objectType: "Resource"}) created.
-> (:ObjectType {objectType: "ConfigurationItem"}) created.
-> (:ObjectType {objectType: "ServiceComponent"}) created.


In [8]:
for label in EVENTS.keys():
    add_event_type_node(_db_connection=db_connection, event_type=label)

Index for :Event(sysId)
-> (:EventType {eventType: "IncidentEvent"}) created.
Index for :Event(sysId)
-> (:EventType {eventType: "ChangeEvent"}) created.
Index for :Event(sysId)
-> (:EventType {eventType: "InteractionEvent"}) created.
Index for :Event(sysId)
-> (:EventType {eventType: "IncidentActivityEvent"}) created.
