Provenance Privacy
To preserve privacy while sharing provenance metadata with other hosts in a network, SPADE provides two techniques: sanitization and encryption. The details for both are in the following sections.
Privacy preservation through sanitization performs an irreversible transformation on the response graph. Under sanitization, an annotation, or part of one, is removed from the graph without leaving a trace. Which annotations of vertices and edges are sanitized depends on the sanitization level and on the scheme defined for each level and annotation. Sanitization is available through the Sanitization transformer. There are three defined levels of sanitization: low, medium, and high. For a given level, provenance is sanitized individually for that level as well as for all the levels below it. For example, if the level is high, provenance is sanitized for the low, medium, and high levels individually.
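The cumulative behavior of the levels can be sketched in a few lines of Python. This is only an illustration of the rule described above, not SPADE's implementation: the names `levels_to_apply` and `sanitize`, and the dictionary layout of the rules, are hypothetical.

```python
from typing import Dict, List

# Levels ordered from least to most restrictive.
LEVELS = ["low", "medium", "high"]


def levels_to_apply(level: str) -> List[str]:
    """Return the requested level together with every level below it."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return LEVELS[: LEVELS.index(level) + 1]


def sanitize(annotations: Dict[str, str],
             rules: Dict[str, List[str]],
             level: str) -> Dict[str, str]:
    """Drop every annotation listed for the requested level and the levels below.

    `rules` maps a level name to the annotation keys removed at that level
    (a hypothetical structure chosen for this sketch).
    """
    result = dict(annotations)  # work on a copy; the input stays unchanged
    for lvl in levels_to_apply(level):
        for key in rules.get(lvl, []):
            result.pop(key, None)  # removal leaves no trace of the annotation
    return result


vertex = {"name": "bash", "cwd": "/tmp", "pid": "42"}
rules = {"low": ["cwd"], "high": ["name"]}
print(sanitize(vertex, rules, "medium"))
print(sanitize(vertex, rules, "high"))
```

With these example rules, requesting medium removes only `cwd` (the low rule), while requesting high also removes `name`.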
To use the transformer, execute the following on the control client:
add transformer Sanitization [sanitizationLevel={low,medium,high}]
The settings of the sanitization process can be defined in the config file spade.transformer.Sanitization.config, whose structure is as follows:
'level' indicates the level of sanitization to perform on the response graph. It can be low, medium, or high. Each sanitization level is stated on its own line, followed by its details. The details include a comma-separated list of annotations to sanitize for each type of vertex and edge. Sanitization can also be applied to annotations regardless of the type of vertex or edge.
The various strategies for sanitizing composite annotations as well as the config file format are illustrated at the end of the encryption section below.
Privacy preservation through encryption performs a reversible transformation on the response graph. Data is encrypted using an attribute-based encryption (ABE) policy. In this policy, attributes serve as the credentials of a host, and a policy is defined over the encrypted data. We take the attributes to be the levels of encryption or decryption to perform on the data. In a provenance graph, annotations of vertices and edges can be encrypted depending on the desired level of encryption and the encryption scheme for each annotation.
There are three defined levels of encryption: low, medium, and high. Each of these levels has an associated private key for encryption/decryption, and all levels share a common public key. The public key and the appropriate private keys have to be shared out-of-band with the other host so that it can decrypt the data shared with it. For a given level of encryption, provenance is encrypted individually for that level as well as for all the levels below it. For example, if the encryption level is high, provenance is encrypted for the low, medium, and high levels individually.
Attribute-based encryption is available through the ABE transformer. SPADE uses the OpenABE implementation, which is available under the AGPL 3.0 license. OpenABE can be downloaded and installed for your system from the OpenABE GitHub repository. After installing OpenABE, complete the following steps:
- Set up OpenABE and generate the master key pair for the Ciphertext-Policy (CP) ABE algorithm.
- Generate the private keys for each given set of attributes. A set of attributes corresponds to a level of encryption in our scheme.
- Share the master public key and the private key(s) with the party you want to communicate with.
The details for each step can be found in the first 6 pages of OpenABE CLI Documentation.
To use the transformer, execute the following on the control client:
add transformer ABE [encryptionLevel={low,medium,high}]
The settings of the encryption process can be defined in the config file spade.transformer.ABE.config, whose structure is as follows:
'level' indicates the level of encryption to perform on the response graph. It can be low, medium, or high. Each encryption level is stated on its own line, followed by its details. The details include a comma-separated list of annotations to encrypt for each type of vertex and edge. Encryption can also be applied to annotations regardless of the type of vertex or edge.
The following are the strategies for encrypting composite annotations. The same strategies are used for sanitization by the Sanitization transformer described above.
- remote address (xxx.xxx.xxx.xxx)
  - low: the first octet is encrypted.
  - medium: the second octet is encrypted.
  - high: the third octet is encrypted.
- path (w/x/y/z/...)
  - low: the path after the third level is encrypted.
  - medium: the path after the second level is encrypted.
  - high: the path after the first level is encrypted.
- time (yyyy-MM-dd HH:mm:ss)
  - low: minute and second are encrypted.
  - medium: hour is encrypted.
  - high: day is encrypted.
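To make the composite strategies concrete, here is a minimal Python sketch of which part of each annotation is affected at each level. It rests on several assumptions: a placeholder string stands in for the real encryption (or removal, under sanitization), the transformations of lower levels are also applied at a higher level (following the cumulative rule stated earlier), and all function names are hypothetical rather than part of SPADE.

```python
LEVELS = ["low", "medium", "high"]
MASK = "<enc>"  # placeholder standing in for an encrypted or removed part


def _applied(level):
    """Requested level plus all levels below it (assumed cumulative)."""
    return LEVELS[: LEVELS.index(level) + 1]


def mask_remote_address(addr, level):
    """low -> first octet, medium -> second octet, high -> third octet."""
    octets = addr.split(".")
    for i in range(len(_applied(level))):
        octets[i] = MASK
    return ".".join(octets)


def mask_path(path, level):
    """Keep the first 3 (low), 2 (medium) or 1 (high) path levels."""
    keep = {"low": 3, "medium": 2, "high": 1}[level]
    prefix = "/" if path.startswith("/") else ""
    parts = path.lstrip("/").split("/")
    if len(parts) <= keep:
        return path  # nothing lies beyond the kept levels
    return prefix + "/".join(parts[:keep] + [MASK])


def mask_time(ts, level):
    """low -> minute and second, medium -> hour, high -> day."""
    date, clock = ts.split(" ")
    year, month, day = date.split("-")
    hour, minute, second = clock.split(":")
    applied = _applied(level)
    if "low" in applied:
        minute = second = MASK
    if "medium" in applied:
        hour = MASK
    if "high" in applied:
        day = MASK
    return f"{year}-{month}-{day} {hour}:{minute}:{second}"


print(mask_remote_address("10.20.30.40", "medium"))
print(mask_path("w/x/y/z/file", "low"))
print(mask_time("2020-01-15 13:45:10", "medium"))
```

In the real transformers each affected part would be encrypted under the key for its own level, so a host holding only the low key could recover only the low-level part.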
Below is a sample config file:
level=low
low
Process=cwd
Artifact=remote address,path
Edge=time
medium
Process=command line
Artifact=remote address,path
Edge=time,size
high
Process=name
Artifact=remote address,path
Edge=time,operation
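The layout of the sample file can be read with a small parser. The following Python sketch assumes only what the sample shows: a `level=` line, level names on their own lines, and `Type=annotation,annotation` lines beneath each level. The function name `parse_config` and the returned structure are hypothetical, and SPADE's own parser may treat edge cases differently.

```python
SAMPLE = """\
level=low
low
Process=cwd
Artifact=remote address,path
Edge=time
medium
Process=command line
Artifact=remote address,path
Edge=time,size
high
Process=name
Artifact=remote address,path
Edge=time,operation
"""


def parse_config(text):
    """Return (active_level, rules), where rules[level][type] is a list of annotations."""
    active_level = None
    rules = {}
    current = None  # level whose details are currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "level":
                active_level = value
            elif current is None:
                raise ValueError(f"detail line before any level name: {line!r}")
            else:
                # Annotation names may contain spaces, e.g. "remote address".
                rules[current][key] = [a.strip() for a in value.split(",") if a.strip()]
        else:
            current = line  # a bare line starts a new level block
            rules.setdefault(current, {})
    return active_level, rules


active_level, rules = parse_config(SAMPLE)
print(active_level)
print(rules["medium"]["Edge"])
```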
This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.