Collecting system wide provenance on Linux

Ashish Gehani edited this page Feb 8, 2018 · 27 revisions

The Audit reporter collects provenance from across the operating system using the Linux kernel's audit event stream of system calls. (Note: Activity of the user that SPADE runs as is excluded.)

This reporter is built automatically when SPADE's top-level make command is issued.


Before this reporter can be used, the commands below must be run. They only need to be executed once after SPADE is compiled. (Note: They allow a normal user to configure and access the audit stream.)

The first two commands allow users to configure the audit rules and packet filtering needed to generate the provenance graph. The next two commands grant users access to the audit stream:

sudo chmod ug+s `which auditctl`
sudo chmod ug+s `which iptables`
sudo chown root lib/spadeAuditBridge
sudo chmod ug+s lib/spadeAuditBridge
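
To confirm that the permission changes took effect, the setuid and setgid bits can be inspected afterwards. The following is a minimal sketch; the helper name missing_privilege_bits is hypothetical and not part of SPADE. It relies only on the standard -u and -g file tests of the shell.

```shell
# Hypothetical helper (not part of SPADE): print which of the setuid/setgid
# bits are still missing on a file. Prints nothing when both bits are set.
missing_privilege_bits() {
    file="$1"
    missing=""
    [ -u "$file" ] || missing="setuid"
    [ -g "$file" ] || missing="${missing:+$missing }setgid"
    [ -z "$missing" ] || echo "$missing"
}

# Example use, run from SPADE's top-level directory:
#   for f in "$(command -v auditctl)" "$(command -v iptables)" lib/spadeAuditBridge; do
#       echo "$f: $(missing_privilege_bits "$f")"
#   done
```

An empty result after the colon means both bits are present on that file.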

To let the above utility access the audit stream, edit the file /etc/audisp/plugins.d/af_unix.conf and activate the plugin by changing the line that says

active = no

to

active = yes
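
The same edit can be scripted. This is a sketch under assumptions: the helper name enable_audisp_plugin is hypothetical, and the file is assumed to contain a single "active = no" line, possibly with different spacing. A backup copy with the suffix .bak is kept next to the file.

```shell
# Hypothetical helper (not part of SPADE): switch "active = no" to
# "active = yes" in an audisp plugin configuration file.
enable_audisp_plugin() {
    conf="$1"
    # Tolerate flexible spacing around "="; all other lines are left untouched.
    # The original file is preserved as "$conf.bak".
    sed -i.bak -E 's/^[[:space:]]*active[[:space:]]*=[[:space:]]*no[[:space:]]*$/active = yes/' "$conf"
}

# Example use (needs root privileges for the real file):
#   enable_audisp_plugin /etc/audisp/plugins.d/af_unix.conf
```

Trying the function on a copy of the file first is a cheap way to check the result before touching the real configuration.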

Restart auditd to activate the dispatcher (audispd):

sudo service auditd restart

Real-time collection

The Audit reporter can be started using SPADE's controller:

-> add reporter Audit
Adding reporter Audit... done

The reporter will transform records from the Linux audit dispatcher into an Open Provenance Model representation. The details of the key-value annotations are available here.

Configuring I/O reporting

Filesystem reads and writes, as well as network sends and receives, can generate significant log overhead. In many contexts, knowing that a process opened a file or made a network connection suffices for understanding the provenance of data.

By default, this reporter only tracks when files are opened for reading or writing, and when network connections are made or accepted. To report all filesystem reads and writes, the argument fileIO=true should be provided when starting the reporter with the SPADE controller. Similarly, to report all network sends and receives, the argument netIO=true should be used:

-> add reporter Audit fileIO=true netIO=true
Adding reporter Audit... done

Saving the audit records

For debugging purposes, the Linux Audit records that have been processed can be stored in a file using the outputLog argument. For example, the records can be stored in the file /tmp/audit.log by using this command to start the reporter in the SPADE controller:

-> add reporter Audit outputLog=/tmp/audit.log
Adding reporter Audit... done

Using a saved log

Instead of collecting Linux Audit records from the running system, a previously saved log can be used by specifying it with the inputLog argument. The hardware architecture of the machine on which the log was generated must be provided with the argument arch.

Currently, the architecture can only be set to 32 or 64, to indicate IA-32 or x86-64, respectively. For example, to read records from the file /tmp/audit.log from an x86-64 machine, this command can be used to start the reporter in the SPADE controller:

-> add reporter Audit inputLog=/tmp/audit.log arch=64
Adding reporter Audit... done
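
If the machine that produced the log is still available, the value for arch can be derived from the output of uname -m on that machine. The sketch below is illustrative only: the helper name spade_arch_for is hypothetical, and the list of IA-32 machine names is an assumption about what uname -m typically reports on Linux.

```shell
# Hypothetical helper (not part of SPADE): map a `uname -m` machine name
# to the value expected by the reporter's arch argument.
spade_arch_for() {
    case "$1" in
        x86_64)              echo 64 ;;   # x86-64
        i386|i486|i586|i686) echo 32 ;;   # IA-32
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example use, on the machine where the log was generated:
#   spade_arch_for "$(uname -m)"
```

Any other machine name is rejected with a non-zero exit status, since only 32 and 64 are accepted by the reporter.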

Logs must be sorted by event identifier. By default, the reporter sorts the log automatically during preprocessing. If the log is already sorted, sorting can be disabled by setting the sortLog argument to false:

-> add reporter Audit inputLog=/tmp/audit.log arch=64 sortLog=false
Adding reporter Audit... done
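
Before passing sortLog=false, it may be worth checking that the log really is in order. The sketch below assumes that the event identifier is the serial number that follows the colon in each record's msg=audit(timestamp:serial) field, and that sorted means numerically non-decreasing; whether this matches SPADE's own sorting criterion exactly is an assumption. The helper name is_sorted_by_event_id is hypothetical.

```shell
# Hypothetical helper (not part of SPADE): succeed if the audit records in a
# file appear in non-decreasing order of event identifier.
is_sorted_by_event_id() {
    # Extract the serial number from "msg=audit(<timestamp>:<serial>)" on each
    # record, then let sort verify the order without rewriting anything.
    sed -n 's/.*msg=audit([0-9.]*:\([0-9][0-9]*\)).*/\1/p' "$1" | sort -n -c 2>/dev/null
}

# Example use:
#   if is_sorted_by_event_id /tmp/audit.log; then echo "already sorted"; fi
```

Several records that share one event identifier are accepted, since sort -c only rejects a strictly decreasing step.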

The end of audit log processing is reported in SPADE's log, which is stored in log/SPADE_<date>-<time>.log.
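
Since the log file name contains the start date and time, the most recent one has to be located before it can be inspected. A small sketch, assuming the command is run from SPADE's top-level directory and that log file names contain no whitespace; the helper name latest_spade_log is hypothetical.

```shell
# Hypothetical helper (not part of SPADE): print the most recently modified
# SPADE log in the given directory (default: ./log).
latest_spade_log() {
    # ls -t lists the newest file first; head keeps only that entry.
    ls -t "${1:-log}"/SPADE_*.log 2>/dev/null | head -n 1
}

# Example use: follow the newest log while the audit log is being processed.
#   tail -f "$(latest_spade_log)"
```

The exact wording of the completion message is not stated here, so the sketch only locates the file and leaves reading it to the user.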
