Reporting provenance using JSON

Hassaan edited this page Jun 1, 2021 · 6 revisions

This reporter allows provenance to be described in JSON using W3C PROV concepts. The information must be stored in a file that contains valid JSON and conforms to the data model described below.


Data model

The file must contain a JSON array of JSON objects. Each provenance vertex and edge must be described in a separate JSON object. Vertices and edges have different key-value pairs, as described below. In both cases, the JSON object can optionally contain an annotations key. If this key is present, its value must be a JSON object of key-value pairs that describe domain-specific provenance details.

Provenance element  Key          Value                                     Required
Vertex              type         one of: Agent, Activity, Entity           Yes
                    id           unique identifier                         Yes
                    annotations  JSON object with key-value pairs          No
Edge                type         one of: ActedOnBehalfOf, WasInformedBy,   Yes
                                 WasDerivedFrom, WasAssociatedWith,
                                 WasAttributedTo, Used, WasGeneratedBy
                    from         unique identifier of source vertex        Yes
                    to           unique identifier of destination vertex   Yes
                    annotations  JSON object with key-value pairs          No
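
The rules of the data model can be checked mechanically before a file is handed to the reporter. The following is a minimal sketch in Python, not part of SPADE: the names validate_provenance and validate_file are hypothetical, and the duplicate-id check is an assumption based on the requirement that vertex identifiers be unique.

```python
import json

VERTEX_TYPES = {"Agent", "Activity", "Entity"}
EDGE_TYPES = {
    "ActedOnBehalfOf", "WasInformedBy", "WasDerivedFrom",
    "WasAssociatedWith", "WasAttributedTo", "Used", "WasGeneratedBy",
}


def validate_provenance(elements):
    """Return a list of problems found; an empty list means none were found."""
    if not isinstance(elements, list):
        return ["top-level value must be a JSON array"]
    problems = []
    seen_ids = []  # a list, so identifiers of any JSON type can be compared
    for index, element in enumerate(elements):
        if not isinstance(element, dict):
            problems.append(f"element {index}: not a JSON object")
            continue
        kind = element.get("type")
        # The type decides which keys are required.
        if isinstance(kind, str) and kind in VERTEX_TYPES:
            required = ["id"]
        elif isinstance(kind, str) and kind in EDGE_TYPES:
            required = ["from", "to"]
        else:
            problems.append(f"element {index}: missing or unknown type {kind!r}")
            continue
        for key in required:
            if key not in element:
                problems.append(f"element {index}: missing required key {key!r}")
        # annotations is optional, but must be a JSON object when present.
        if not isinstance(element.get("annotations", {}), dict):
            problems.append(f"element {index}: annotations must be a JSON object")
        # Assumption: vertex identifiers must not repeat within one file.
        if "id" in required and "id" in element:
            if element["id"] in seen_ids:
                problems.append(f"element {index}: duplicate id {element['id']!r}")
            seen_ids.append(element["id"])
    return problems


def validate_file(path):
    """Parse the file at path and validate it; json.load raises on invalid JSON."""
    with open(path, encoding="utf-8") as handle:
        return validate_provenance(json.load(handle))
```

Calling validate_file on a path returns the list of problems, so an empty list suggests the file matches the table above. Syntax errors such as trailing commas surface earlier, as a json.JSONDecodeError raised while parsing.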

Sample valid JSON provenance

In the example below, the first JSON object represents an Activity vertex with unique identifier 1. The second JSON object represents an Entity vertex with unique identifier 2. Finally, the third JSON object represents a Used edge from vertex 1 to vertex 2, indicating that the activity used the entity.

[
    {
        "type": "Activity",
        "id": 1,
        "annotations": {
            "program": "firefox",
            "pid": "1234"
        }
    },
    {
        "type": "Entity",
        "id": 2,
        "annotations": {
            "filename": "index.html",
            "owner": "user"
        }
    },
    {
        "type": "Used",
        "from": 1,
        "to": 2,
        "annotations": {
            "time": "0420"
        }
    }
]
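
Hand-written JSON is prone to small syntax slips, so it can be safer to generate the file with a JSON library. The sketch below, in Python, builds the same three objects as the sample and serializes them. The helper names vertex, edge and to_json are hypothetical and not part of SPADE; the output path is the one used in the configuration example on this page.

```python
import json


def vertex(kind, identifier, **annotations):
    """Build a vertex object; the annotations key is added only when needed."""
    element = {"type": kind, "id": identifier}
    if annotations:
        element["annotations"] = annotations
    return element


def edge(kind, source, destination, **annotations):
    """Build an edge object from the source vertex id to the destination vertex id."""
    element = {"type": kind, "from": source, "to": destination}
    if annotations:
        element["annotations"] = annotations
    return element


def to_json(elements):
    """Serialize the list of vertices and edges as a JSON array."""
    return json.dumps(elements, indent=4)


elements = [
    vertex("Activity", 1, program="firefox", pid="1234"),
    vertex("Entity", 2, filename="index.html", owner="user"),
    edge("Used", 1, 2, time="0420"),
]

if __name__ == "__main__":
    # Write the array to the path that the reporter will be given.
    with open("/tmp/provenance.json", "w", encoding="utf-8") as handle:
        handle.write(to_json(elements))
```

Because the serializer produces the text, the output is always syntactically valid JSON; whether it also satisfies the data model still depends on the type names and identifiers passed in.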

Configuring JSON reporting

The JSON reporter takes a single argument, input, which is the path of the JSON file in the filesystem. The reporter must be added in the SPADE controller, after the SPADE server has been started:

-> add reporter JSON input=/tmp/provenance.json
Adding reporter JSON... done

The reporter can be deactivated using the following command in the SPADE controller:

-> remove reporter JSON
Shutting down reporter JSON... done