In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!pip install pm4py owlready2 pandas

<h1>Experiments Leveraging Conformance Checking Techniques for Multi-Cloud SLA Compliance</h1>

The main objective of this notebook is to present the implementation accompanying the paper submitted to SAC 2023. The implementation has two steps: log pre-processing and conformance checking. As a running example, we use the following event logs, collected from an execution on Docker Swarm, and the following state machine.

In [None]:
import pandas as pd
pd.set_option('display.width',1000)

df = pd.read_csv('logs.csv')
df['Timestamp'] = pd.to_datetime(df['Timestamp'])

print(df)

             Timestamp     Source Resource Name         Event-Type     Metric Value
0  2022-10-30 00:00:03   Provider            UI     Service_Create   replicas     2
1  2022-10-30 00:00:03   Provider            UI   Container_Create          /     /
2  2022-10-30 00:00:03   Provider            UI    Container_Start          /     /
3  2022-10-30 00:00:05  Ressource            UI    Ressource_Usage  Cpu Usage   15%
4  2022-10-30 00:01:05  Ressource            UI    Ressource_Usage  Cpu Usage   15%
..                 ...        ...           ...                ...        ...   ...
73 2022-10-30 00:03:15  Ressource          Stor    Ressource_Usage  Cpu Usage   15%
74 2022-10-30 00:03:30  Ressource          Stor    Ressource_Usage  Cpu Usage   15%
75 2022-10-30 00:03:45  Provider           Stor     Service_Update   replicas     0
76 2022-10-30 00:03:45  Provider           Stor     Container_Stop          /     /
77 2022-10-30 00:03:45  Provider           Stor  Container_Destroy          

<h2>Pre-processing the collected logs</h2>
<h3>Annotation</h3>

We begin the pre-processing by annotating the collected event logs with domain knowledge formulated as an ontology in Protégé. This domain knowledge captures the correlation between event types and state-machine elements.

In [None]:
# Import owlready2 and load the ontology
from owlready2 import get_ontology
onto = get_ontology("eventLog.owl").load()

Then, we identify the high-level activities using the ontology, which returns the event logs enriched with a state-machine element (smElement) and a lifecycle step (lcStep).

In [None]:
def get_ancestor(onto, value):
    """
        Return the ancestors of an event type, to identify which state-machine
        element (e.g. State or Transition) it relates to.
        Input: onto, an ontology loaded with owlready2; value, the name of a class.
        Output: dict {0: lifecycle step, 1: state-machine element};
        both are 'N/A' when the class is not found.
    """
    # Search the ontology for a class whose IRI ends with the value
    matches = onto.search(iri = f"*{value}")
    if not matches:
        # Return the same shape as a successful lookup so callers can index it
        return {0: 'N/A', 1: 'N/A'}
    parent = matches[0].is_a[0]
    if parent.name != 'Transition':
        # The direct parent is the lifecycle step; its own parent is the state-machine element
        return {0: parent.name, 1: parent.is_a[0].name}
    # A Transition has no lifecycle step
    return {0: '', 1: parent.name}

# StateMachine Element
smElement = []
# Lifecycle Step
lcStep = []
for index, row in df.iterrows():
    anc = get_ancestor(onto, row['Event-Type'])
    smElement.append(anc[1])
    lcStep.append(anc[0])

df['smElement'] = smElement
df['lcStep'] = lcStep

print(df)

             Timestamp     Source Resource Name         Event-Type     Metric Value   smElement    lcStep
0  2022-10-30 00:00:03   Provider            UI     Service_Create   replicas     2       State     Start
1  2022-10-30 00:00:03   Provider            UI   Container_Create          /     /       State   Execute
2  2022-10-30 00:00:03   Provider            UI    Container_Start          /     /       State  Complete
3  2022-10-30 00:00:05  Ressource            UI    Ressource_Usage  Cpu Usage   15%  Transition          
4  2022-10-30 00:01:05  Ressource            UI    Ressource_Usage  Cpu Usage   15%  Transition          
..                 ...        ...           ...                ...        ...   ...         ...       ...
73 2022-10-30 00:03:15  Ressource          Stor    Ressource_Usage  Cpu Usage   15%  Transition          
74 2022-10-30 00:03:30  Ressource          Stor    Ressource_Usage  Cpu Usage   15%  Transition          
75 2022-10-30 00:03:45  Provider           Sto
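
To make the lookup concrete without the OWL file, here is a minimal sketch that replaces the ontology with a plain dictionary. The `PARENT` mapping is an assumption reconstructed from the annotations visible in the output above (only the four event types whose annotation is shown), and `annotate` is a hypothetical helper, not part of owlready2.

```python
# Hypothetical stand-in for the ontology: each class points to its direct parent.
# Only the relations visible in the printed output above are included.
PARENT = {
    "Service_Create": "Start",
    "Container_Create": "Execute",
    "Container_Start": "Complete",
    "Start": "State",
    "Execute": "State",
    "Complete": "State",
    "Ressource_Usage": "Transition",
}


def annotate(event_type):
    """Return (smElement, lcStep) for an event type, or ('N/A', 'N/A') if unknown."""
    parent = PARENT.get(event_type)
    if parent is None:
        return ("N/A", "N/A")
    if parent == "Transition":
        # Transitions carry no lifecycle step
        return ("Transition", "")
    # The parent is the lifecycle step; the grandparent is the state-machine element
    return (PARENT[parent], parent)


for event_type in ["Service_Create", "Ressource_Usage", "Unknown_Event"]:
    print(event_type, annotate(event_type))
```

Returning a tuple of the same shape for unknown event types keeps the annotation loop free of special cases.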

<h3>Abstraction</h3>
Next, we abstract the annotated logs, using defined patterns, to discover a state machine representing the "real" observed behavior.

In [None]:
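# Timestamps were already parsed when the logs were loaded, so no re-parsing is needed.
# Keep only the events annotated as State.
states = df.loc[df['smElement'] == 'State']

# Open a new potential state whenever two consecutive State events
# are more than 15 seconds apart.
gap = states['Timestamp'].diff() / pd.Timedelta(seconds=15)
# Rows that are not State events get NaN through index alignment.
df['Potential_State'] = 'State' + gap.gt(1).cumsum().add(1).astype(str)

print(df[['Timestamp', 'Event-Type', 'smElement', 'Potential_State']])

# Container for the discovered state machine
SMdisc = []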
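
The gap-based grouping can be spelled out in plain Python. The sketch below is only an illustration of what the `diff`/`cumsum` expression computes; `label_potential_states` is a hypothetical helper and the timestamps are made up for the example, not taken from `logs.csv`.

```python
from datetime import datetime, timedelta


def label_potential_states(timestamps, max_gap=timedelta(seconds=15)):
    """Label sorted timestamps 'State1', 'State2', ...; a gap above max_gap opens a new label."""
    labels = []
    current = 1
    previous = None
    for ts in timestamps:
        if previous is not None and ts - previous > max_gap:
            # The silence lasted longer than the threshold: start a new potential state
            current += 1
        labels.append(f"State{current}")
        previous = ts
    return labels


base = datetime(2022, 10, 30, 0, 0, 3)
# Illustrative offsets in seconds (assumed values, not read from the logs)
events = [base + timedelta(seconds=s) for s in (0, 0, 0, 60, 60, 222)]
print(label_potential_states(events))
```

Events that share a timestamp, or follow each other within 15 seconds, stay in the same potential state; only a strictly larger gap opens a new one.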

<h2>Checker</h2>
In this last step, we implement the checker component. We construct the search space as defined in the paper. 

In [None]:
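import pm4py
import networkx as nx

# Search space: a directed graph whose nodes are alignment moves
SS = nx.DiGraph()

# SMdef (the state machine defined in the SLA) is assumed to come from an earlier cell.
# Walk the discovered and the defined state machines in lockstep.
for eltx, elty in zip(SMdisc, SMdef):
    if eltx == elty:
        # Matching elements: a single node, interpreted here as a synchronous move
        SS.add_node((eltx, elty))
    else:
        # Mismatch: two nodes, interpreted here as a move on the log only
        # and a move on the model only ('>>' marks the skipped side)
        SS.add_node((eltx, '>>'))
        SS.add_node(('>>', elty))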

From this search space, we search for the optimal alignment using an A* algorithm.

In [None]:
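# start and end (the source and target nodes of SS) are assumed to be defined with the search space.
# Without a heuristic argument, astar_path explores the graph like Dijkstra's algorithm.
y_optimal = nx.astar_path(SS, start, end)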
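
As a self-contained illustration of the alignment search, the following sketch runs A* over two label sequences. It is a simplified stand-in, not the paper's definition of the search space: the state machines are reduced to plain lists, synchronous moves cost 0, moves on the log only or on the model only cost 1, and the heuristic (the difference between the remaining lengths) is an assumption chosen because it never overestimates the remaining cost. `astar_alignment` is a hypothetical name.

```python
import heapq


def astar_alignment(trace, model):
    """Return (cost, moves) of a cheapest alignment between two label sequences."""
    n, m = len(trace), len(model)

    def heuristic(i, j):
        # Lower bound: the remaining length difference must be covered by unit-cost moves
        return abs((n - i) - (m - j))

    # Queue entries: (estimated total cost, cost so far, position, moves so far)
    queue = [(heuristic(0, 0), 0, (0, 0), [])]
    closed = set()
    while queue:
        _, cost, (i, j), moves = heapq.heappop(queue)
        if (i, j) == (n, m):
            return cost, moves
        if (i, j) in closed:
            continue
        closed.add((i, j))
        if i < n and j < m and trace[i] == model[j]:
            # Synchronous move: both sides advance at no cost
            step = moves + [(trace[i], model[j])]
            heapq.heappush(queue, (cost + heuristic(i + 1, j + 1), cost, (i + 1, j + 1), step))
        if i < n:
            # Move on the log only: the observed element has no counterpart
            step = moves + [(trace[i], ">>")]
            heapq.heappush(queue, (cost + 1 + heuristic(i + 1, j), cost + 1, (i + 1, j), step))
        if j < m:
            # Move on the model only: the expected element was not observed
            step = moves + [(">>", model[j])]
            heapq.heappush(queue, (cost + 1 + heuristic(i, j + 1), cost + 1, (i, j + 1), step))
    raise ValueError("no alignment found")


cost, moves = astar_alignment(["Start", "Execute", "Complete"], ["Start", "Complete"])
print(cost, moves)
```

In this example the only deviation is the observed `Execute`, which the model sequence does not contain, so the cheapest alignment has a single move on the log only.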

Finally, we compute the fitness value of the identified alignment and return the report with the alignment.

In [None]:
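# y_worst and y_optimal must hold the per-move costs of the worst-case
# and optimal alignments here (costs, not node labels)
y_worst_sum = sum(y_worst)
y_optimal_sum = sum(y_optimal)
fitness_value = 1 - y_optimal_sum / y_worst_sum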
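
A small worked example of the fitness formula, assuming the two total costs are already known. The `fitness` helper and the numbers are illustrative only; treating a zero worst-case cost as a perfect fit is an assumption made here to avoid a division by zero.

```python
def fitness(optimal_cost, worst_cost):
    """Fitness in [0, 1]: 1 means the observed behavior fully matches the model."""
    if worst_cost == 0:
        # Nothing to align: treated as perfectly fitting (assumption)
        return 1.0
    return 1 - optimal_cost / worst_cost


# Illustrative costs: one deviating move against a worst case of five
print(fitness(1, 5))
```

A fitness of 1 means the optimal alignment contains no deviating moves, while a value near 0 means it is almost as costly as the worst case.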