# Example Notebook for Time-Order Map for Seamless Zooming

This notebook was created by *Christoffer Rubensson*. 

The notebook can be used to test the Python modules in this project. It also contains the example graphs presented in the paper *Time-Order Map for Seamless Zooming between Process Models and Process Instances*.

## Initialization

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# add packages from 'src'
import sys
from pathlib import Path

# App directory
project_root = Path.cwd().parent  # Adjust this if necessary
sys.path.append(str(project_root))
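
If the notebook is not started from a direct subfolder of the project, the hard-coded `.parent` above may point to the wrong directory. The following is a minimal sketch of one alternative, assuming the project root is the nearest ancestor directory that contains a `src` folder. The helper name `find_project_root` and the `marker` parameter are hypothetical and not part of this project.

```python
import sys
from pathlib import Path


def find_project_root(start: Path, marker: str = "src") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper: falls back to `start` if no such directory exists.
    """
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_dir():
            return candidate
    return start


project_root = find_project_root(Path.cwd())

# Avoid adding the same entry again when the cell is re-run.
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))
```

The guard around `sys.path.append` keeps the path list free of duplicates, which matters here because `autoreload` encourages re-running cells.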

In [None]:
# Package Modules
from src.utils.data_exporting import *
from src.utils.data_importing import *
from src.utils.data_processing import *
from src.algo.global_ranking import *
from src.orchestrator import *

In [None]:
# Third-party packages
import pandas as pd
import pm4py
from pm4py.visualization.dfg import visualizer
import plotly
import plotly.express as px
import graphviz

In [None]:
# Package version information
version_pandas = pd.__version__ 
print(f"Pandas version: {version_pandas}")
version_pm4py = pm4py.__version__ 
print(f"PM4Py version: {version_pm4py}")
version_plotly = plotly.__version__ 
print(f"Plotly version: {version_plotly}")
version_graphviz = graphviz.__version__ 
print(f"Graphviz version: {version_graphviz}")

## Testing

**Data import**

In [None]:
# Evaluation
df_org = load_event_log("runningexample.xes", "evaluation_data")
df = df_org.copy()

**Data pre-processing**

*Filter Cases*

This optional cell reduces the number of cases in the log.

In [None]:
#cases = df["case:concept:name"].unique()
#casessorted = sorted(cases)
#casessorted_filtered = casessorted[0:1050]
#len(casessorted_filtered)
#df_filtered = df[df["case:concept:name"].isin(casessorted_filtered)]
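
The commented cell above keeps the events of the first 1050 cases in sorted order. The same idea can be tried on a small toy log. This is only a sketch: the toy data and the `max_cases` name are illustrative assumptions, not part of the project.

```python
import pandas as pd

# Toy event log; the column names follow the XES convention used in this notebook.
toy_log = pd.DataFrame({
    "case:concept:name": ["c3", "c1", "c2", "c1", "c3", "c2"],
    "concept:name": ["a", "a", "a", "b", "b", "b"],
})

max_cases = 2  # hypothetical limit; the commented cell above uses 1050

# Sort the unique case ids and keep the first `max_cases` of them.
kept_cases = sorted(toy_log["case:concept:name"].unique())[:max_cases]

# Keep only the events that belong to one of the retained cases.
toy_filtered = toy_log[toy_log["case:concept:name"].isin(kept_cases)]
print(toy_filtered)
```

Note that `sorted` compares case ids as strings when they are stored as strings, so `"10"` sorts before `"2"`.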

*Prepare log for D3.js*

`process_log_for_d3js` pre-processes the log data before it is passed to D3.js. It calls the helper functions shown in the cells below (marked with `--`).

In [None]:
df_d3 = process_log_for_d3js(df)

*-- Breakdown of process_log_for_d3js*

In [None]:
df_note = simplifyLog(df) # reduces the number of log attributes to the most basic ones
df_note = relativeTimestamps(df_note) # adds relative timestamps to the log
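
To illustrate what a relative timestamp can look like, here is a minimal sketch that measures each event's offset from the first event of its own case. This is an assumption for illustration only: the actual `relativeTimestamps` function in `src` may use a different reference point or data type. The toy data is made up.

```python
import pandas as pd

# Toy event log with two cases (illustrative data only).
toy_log = pd.DataFrame({
    "case:concept:name": ["c1", "c1", "c2", "c2"],
    "concept:name": ["a", "b", "a", "b"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01 08:00", "2024-01-02 08:00",
        "2024-01-05 12:00", "2024-01-05 18:00",
    ]),
})

# Earliest timestamp of each case, broadcast back to every event of that case.
case_start = toy_log.groupby("case:concept:name")["time:timestamp"].transform("min")

# Offset of each event from the start of its own case (a Timedelta column).
toy_log["time:timestamp:relative"] = toy_log["time:timestamp"] - case_start

# The same offset expressed as a float number of days.
toy_log["time:relative:days"] = toy_log["time:timestamp:relative"] / pd.Timedelta(days=1)
print(toy_log)
```

With offsets measured from the case start, cases that began on different calendar dates become directly comparable on one time axis.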

*-- Ranking methods*

In [None]:
global_ranking_method_df_relativetime(df_note, annotation=True) # ranks activities

In [None]:
df_note, ranking_map = global_ranking_of_eventdata(df_note) # add ranked activities to the log as a new attribute
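
As a rough illustration of a global activity ranking, the sketch below orders activities by their mean relative time and writes the rank back to the log. This is a swapped-in technique chosen for illustration. It is not necessarily what `global_ranking_method_df_relativetime` or `global_ranking_of_eventdata` compute, and the column names `relative_days` and `rank` are hypothetical.

```python
import pandas as pd

# Toy log: activity names with relative times in days (illustrative data only).
toy_log = pd.DataFrame({
    "concept:name": ["register", "check", "decide", "register", "decide", "check"],
    "relative_days": [0.0, 1.0, 3.0, 0.0, 4.0, 2.0],
})

# Mean relative time per activity.
mean_offset = toy_log.groupby("concept:name")["relative_days"].mean()

# Activities that occur earlier on average receive a lower rank (starting at 0).
ranking_map = {
    activity: rank
    for rank, activity in enumerate(mean_offset.sort_values().index)
}

# Add the rank to every event as a new attribute.
toy_log["rank"] = toy_log["concept:name"].map(ranking_map)
print(ranking_map)
```

A dictionary keeps the mapping reusable, so the same ranks can be applied to another log that contains the same activities.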

**Export log**

In [None]:
file_path = "data/export_data"
file_name = "data-runningexample.csv"

In [None]:
# Alternative export
#file_path = "evaluation_data/"
#file_name = "roadtrafficfine_1050.xes"

Export as *CSV*

In [None]:
#df.to_csv(Path(file_path) / file_name)  # Path inserts the separator that plain string concatenation would omit

Export as *XES*

In [None]:
#export_event_log(df, filename= file_name, foldername=file_path)

### Additional Visualizations in Paper

**Dotted Chart**

In [None]:
# Sort log
log = df_note.sort_values(by=['time:timestamp','case:concept:name'])
log['time:relative:days'] = pd.to_timedelta(log['time:timestamp:relative'], unit='ns') / pd.Timedelta(days=1)

# Create a scatter plot for a dotted chart
fig = px.scatter(
    log,
    x='time:relative:days',
    y='case:concept:name',
    #y='concept:name', 
    color='concept:name',
    opacity=0.9,
    width=750, # default: 750
    height=250, # default: 250
    category_orders={
        "case:concept:name": sorted(log["case:concept:name"].unique())[::-1]  # alphabetical sorting of cases
    }
)

# Customize layout
fig.update_layout(
    xaxis_title="Relative Time (days)",
    yaxis_title="Cases",
    legend_title="Activities",
    xaxis=dict(showgrid=True),
    yaxis=dict(showgrid=True),
    plot_bgcolor='#f2f2f2',
)

# Set the marker size and display the plot
fig.update_traces(marker=dict(size=10))
fig.show()

*Export image:*

In [None]:
# Write image
#file_path = f"{project_root}/data/"
#file_name="fig_ex-instance-running"
#file_suffix=".svg"
#fig.write_image(file_path+file_name+file_suffix)

**Performance DFG**

Note: In the paper, the graph was post-processed to change the color and size.

In [None]:
log = df_org.copy()
performance_dfg, start_activities, end_activities = pm4py.discover_performance_dfg(log)

# Create the visualization object (gviz)
gviz = visualizer.apply(performance_dfg,
                        variant=visualizer.Variants.PERFORMANCE,
                        parameters={"start_activities": start_activities,
                                    "end_activities": end_activities})
gviz

*Export image*

In [None]:
# File path (without extension)
file_path = f"{project_root}/data/"
file_name = "fig_ex-model-running"

# Export the DiGraph; graphviz appends the ".svg" extension itself
gviz.render(file_path + file_name, format='svg', cleanup=True)