# Visualisation of PROVenance-Models

To visualise the provenance models, we use the Python `prov` library.
All required libraries and their versions are listed in `requirements.txt`.

First, import the required libraries:

In [1]:
import glob
import pandas as pd
from prov.model import ProvDocument, ProvEntity, ProvAssociation, ProvGeneration, ProvUsage, ProvActivity
from prov.dot import prov_to_dot

Define some helper functions that are applied later to the individual provenance models, i.e. to their files.

In [2]:
def readProv(filename):
    """
    Deserialise provenance information from a turtle file given as parameter.
    Note that the file ending 'ttl' is automatically added.
    """
    with open("%s.ttl" % (filename,), 'r') as f:
        return(ProvDocument.deserialize(source=f, format='rdf'))

In [3]:
def prov2svg(prov_doc, svg_filename):
    """
    Export the provenance document (given as parameter) as SVG file under the given name.
    Note that the file ending 'svg' is automatically added.
    """
    prov_doc.plot(filename='%s.svg' % (svg_filename,))

In [4]:
def prov2dot(prov_doc, dot_filename):
    """
    Export the provenance document (given as parameter) as DOT file under the given name.
    Note that the file ending 'dot' is automatically added.
    """
    prov_to_dot(prov_doc).write('%s.dot' % (dot_filename,))

In [5]:
def prov2entities_csv(prov_doc, csv_filename):
    """
    Create a list of entities and activities from the given provenance document.
    Extract information about the 'type', 'identifier', 'prov:label', and 'da:fileType' (if available).
    Export the list as CSV file under the given name.
    Note that '_entities.csv' will be automatically added to the csv filename.
    """
    # Collect one dict per record and build the DataFrame once at the end;
    # DataFrame.append was removed in pandas 2.0.
    rows = []
    for r in prov_doc.get_records((ProvEntity, ProvActivity)):
        rows.append({
            'type': r.__class__.__name__,
            'ID': str(r.identifier),
            # Attribute values are not necessarily plain strings, so convert before joining.
            'label': "; ".join(str(v) for v in r.get_attribute('prov:label')),
            'file_type': "; ".join(str(v) for v in r.get_attribute('da:fileType')),
        })
    df = pd.DataFrame(rows, columns=['type', 'ID', 'label', 'file_type'])
    df.sort_values(by=['type', 'ID']).to_csv("%s_entities.csv" % (csv_filename,))

In [6]:
def prov2rel_csv(prov_doc, csv_filename):
    """
    Create a list of relations from the given provenance document.
    Extract information about the two involved elements including Identifier and type as well as their role (if available).
    Export the list as CSV file under the given name.
    Note that '_relations.csv' will be automatically added to the csv filename.
    """
    # Collect one dict per relation and build the DataFrame once at the end;
    # DataFrame.append was removed in pandas 2.0.
    rows = []
    for r in prov_doc.get_records((ProvAssociation, ProvGeneration, ProvUsage)):
        # The first two formal attributes are the two elements linked by the relation.
        attrs = r.formal_attributes
        rows.append({
            'type': r.__class__.__name__,
            'com1_type': str(attrs[0][0]),
            'com1_ID': str(attrs[0][1]),
            'com2_type': str(attrs[1][0]),
            'com2_ID': str(attrs[1][1]),
            'role': "; ".join([str(role) for role in r.get_attribute('prov:role')])
        })
    columns = ['type', 'com1_type', 'com1_ID', 'com2_type', 'com2_ID', 'role']
    df = pd.DataFrame(rows, columns=columns)
    df.sort_values(by=['type', 'com1_ID', 'com2_ID']).to_csv("%s_relations.csv" % (csv_filename,))
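
To make the layout of `_relations.csv` concrete, the sketch below builds a single row from a stand-in for `formal_attributes`. It assumes, as `prov2rel_csv` does, that this attribute is a sequence of (name, value) pairs whose first two entries are the two related elements. The helper `relation_row` and the `ex:` identifiers are hypothetical, and plain strings take the place of `prov` objects.

```python
def relation_row(record_type, attrs, roles=()):
    """Flatten one relation into the columns written to the relations CSV."""
    (com1_type, com1_id), (com2_type, com2_id) = attrs[0], attrs[1]
    return {
        "type": record_type,
        "com1_type": str(com1_type),
        "com1_ID": str(com1_id),
        "com2_type": str(com2_type),
        "com2_ID": str(com2_id),
        # Several roles end up in one cell, separated by "; ".
        "role": "; ".join(str(role) for role in roles),
    }


# Hypothetical usage relation: activity ex:simulate used entity ex:geometry.
example_attrs = (
    ("prov:activity", "ex:simulate"),
    ("prov:entity", "ex:geometry"),
    ("prov:time", None),
)
row = relation_row("ProvUsage", example_attrs, roles=["ex:input"])
print(row)
```

A relation without any role yields an empty `role` cell, which is why the column is described as optional in the docstring.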

Now that the helper functions are defined, search for provenance models in the folders `model-based` and `model-based/pattern`.
Unless a file is marked as outdated by `_old` in its path, parse it and export it as SVG and DOT files using the helper functions.

In [8]:
# Compute PROV models
files = glob.glob("model-based/*.ttl")
files += glob.glob("model-based/pattern/*.ttl")
for file in files:
    if '_old' in file:
        continue
    n = file.replace(".ttl", "")
    print("Computing '%s'..." % (n,), end='')
    pdoc = readProv(n)
    prov2svg(pdoc, n)
    prov2dot(pdoc, n)
    #prov2entities_csv(pdoc, n)
    #prov2rel_csv(pdoc, n)
    print("OK")

Computing 'model-based/simulation_models'...OK
Computing 'model-based/geometry'...OK
Computing 'model-based/context_models'...OK
Computing 'model-based/pattern/extract-information'...OK
Computing 'model-based/pattern/parameterisation'...OK
Computing 'model-based/pattern/composition'...OK
Computing 'model-based/pattern/refinement'...OK
Computing 'model-based/pattern/generation'...OK
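
The discovery step can also be written with `pathlib`, shown here as a minimal sketch rather than a replacement for the cell above. The function name `find_models` and its parameters are hypothetical. Unlike `file.replace(".ttl", "")`, `with_suffix("")` strips only the trailing extension, and the `_old` check looks only at the path relative to the search root.

```python
from pathlib import Path


def find_models(root, subfolders=("", "pattern"), skip_marker="_old"):
    """Return the paths of all current '.ttl' models below root, without extension."""
    root = Path(root)
    stems = []
    for sub in subfolders:
        # sorted() makes the processing order independent of the file system.
        for path in sorted((root / sub).glob("*.ttl")):
            # Skip models marked as outdated anywhere in their relative path.
            if skip_marker in path.relative_to(root).as_posix():
                continue
            stems.append(path.with_suffix(""))
    return stems


# Possible use with the helpers defined earlier in this notebook:
# for stem in find_models("model-based"):
#     pdoc = readProv(str(stem))
#     prov2svg(pdoc, str(stem))
#     prov2dot(pdoc, str(stem))
```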
