Skip to content

Storing provenance in a relational database

Ashish Gehani edited this page Aug 12, 2020 · 22 revisions

An SQL storage can be added to SPADE using the controller, allowing provenance records to be sent to a relational database. The configuration is loaded from cfg/spade.storage.<database>.config. An entry from the file can be overridden by passing <key>=<value> as an argument when adding the storage in the controller. If the username or password are not defined, the string literal null should be used.

H2

SPADE includes the H2 embedded SQL library. The following command can be used in the SPADE controller to store provenance in an H2 SQL database:

-> add storage H2 database=/tmp/sql/spade.sql_db databasePassword=null
Adding storage H2... done

This will create /tmp/spade.sql_db in the filesystem if it does not exist, or use the existing database if it has previously been created. The username for connecting to this database will be loaded from the config file, and no password will be used.

The following command will cause SPADE to stop sending provenance records to the H2 database:

-> remove storage H2
Shutting down storage H2... done

PostgreSQL

To use SPADE with PostgreSQL, the database must be installed, started, and configured. These steps can be effected by running bin/installPostgres. When this is done, a database spadedb is created by user spade.

To store provenance in a PostgreSQL database spade_pg, modify bin/installPostgres so it creates one with this name. This can then be specified in the SPADE controller with:

-> add storage PostgreSQL database=spade_pg
Adding storage PostgreSQL... done

This will create the database spade_pg if it does not exist, or use the existing database if it had previously been created.

Configuration File

Note that the username and password are not provided when adding the storage in the SPADE controller. They are automatically loaded from the configuration file cfg/spade.storage.PostgreSQL.config. This is the default version of the file:

databaseDriver=org.postgresql.Driver
databaseURLPrefix = jdbc:postgresql://localhost:5432/
database = spadedb
databaseUsername = spade
databasePassword = 12345
reportingEnabled = false
bulkUpload = true
setPrimaryKey = false
buildSecondaryIndexes = false
globalTxSize = 1000

bulkUpload is used to accelerate insertion into the database, but requires PostgresSQL to run on the same host as SPADE and the PostgresSQL user specified above to have SQL COPY permission. If either of those are not true, bulkUpload = false should be used.

If bulkUpload = false is used, globalTxSize insertions are batched together into a single transaction. Using globalTxSize = 1 ensures that each provenance element is visible to queries as soon as it enters the storage (but imposes higher overhead from the increased number of transactions).

The following command will cause SPADE to stop sending provenance records to the PostgreSQL database:

-> remove storage PostgreSQL
Shutting down storage PostgreSQL... done

Quickstep

To use SPADE with Quickstep, the database server must be installed and running. This can be effected with bin/installQuickstep. After Quickstep is installed, the database can be started in the background with:

./bin/startQuickstep /tmp/quickstep-database &

The command above creates the directory /tmp/quickstep-database if it does not exist and starts the server on port 3000, by default. If the directory /tmp/quickstep-database already exists, it must a directory that was previously created by Quickstep.

To store provenance in a Quickstep database, use this in the SPADE controller:

-> add storage Quickstep
Adding storage Quickstep... done

This will connect the SPADE Kernel to the Quickstep database specified in the configuration file cfg/spade.storage.Quickstep.config. In particular, the database needs to be running on the host serverIP at port serverPort specified.

The following command will cause SPADE to stop sending provenance records to the Quickstep database:

-> remove storage Quickstep
Shutting down storage Quickstep... done
Clone this wiki locally