auditmatica

About

Audit Archivematica user activities via nginx access logs.

Overview

auditmatica is intended to facilitate auditing of user activities in Archivematica and the Archivematica Storage Service. It uses nginx access logs as its data source, and outputs either logs in Common Event Format (CEF) or a JSON overview of user activities.

Usage

auditmatica has two subcommands, write-cef and overview.

Usage: auditmatica [OPTIONS] COMMAND [ARGS]...

  Auditmatica: Archivematica auditing package

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  overview   Print overview of user activities from nginx access log.
  write-cef  Write Common Event Format (CEF) log from nginx access log.

write-cef

To write CEF events, use the write-cef subcommand. E.g.:

auditmatica write-cef /path/to/nginx/access.log

or

cat /var/log/nginx/access.log | auditmatica write-cef

CEF is a widely used standard for network and security analysis. CEF events can be sent to applications for review, monitoring, and visualization via a file or over syslog. Each CEF event written by auditmatica includes an event name, a signature (unique identifier), and a severity level (0-10). These are determined from details in the nginx access log such as the URL, HTTP method, and HTTP status code.

A sample CEF event written by auditmatica looks like the following:

CEF:0|Artefactual Systems, Inc.|Archivematica|hosted|16|AIP downloaded from Archival Storage|3|cs1Label=requestClientApplication cs1="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" rt=Jan 13 2021 20:01:33 requestMethod=GET request=/archival-storage/download/aip/8fa54cfc-f5c5-4673-b44e-fc514496bad7/ src=172.19.0.1 suser=test msg=UUID:8fa54cfc-f5c5-4673-b44e-fc514496bad7

In addition to the required CEF fields, each CEF event produced by auditmatica also has the following extensions:

| Field | Value | Mandatory? |
| --- | --- | --- |
| cs1Label | requestClientApplication | True |
| cs1 | User agent string | True |
| rt | Request time | True |
| requestMethod | Request HTTP method | True |
| request | Request URL | True |
| src | Requester IP address | True |
| suser | Authenticated username | False - present only if usernames are configured in nginx log and there is a username associated with the event's log line |
| msg | UUID:<uuid from request URL> | False - present only if there is a UUID associated with the event |
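For readers who want to consume these events downstream, the following is a minimal sketch of splitting a CEF line into its seven header fields and its extensions. It is not part of auditmatica: the names `parse_cef`, `HEADER_FIELDS`, and `KEY_RE` are hypothetical. It assumes that header values contain no escaped pipes and that extension values never contain a space followed by `key=`.

```python
import re

# Hypothetical names for the seven pipe-delimited CEF header fields.
HEADER_FIELDS = [
    "version", "vendor", "product", "device_version",
    "signature", "name", "severity",
]

# An extension key is a word followed by "=", at the start or after whitespace.
KEY_RE = re.compile(r"(?:^|\s)(\w+)=")


def parse_cef(line):
    """Split a CEF line into (header dict, extensions dict).

    Simplification: escaped pipes in header values are not handled.
    """
    parts = line.strip().split("|", 7)
    if len(parts) != 8 or not parts[0].startswith("CEF:"):
        raise ValueError("not a CEF event")
    header = dict(zip(HEADER_FIELDS, parts[:7]))
    header["version"] = header["version"][len("CEF:"):]

    ext_text = parts[7]
    extensions = {}
    matches = list(KEY_RE.finditer(ext_text))
    for i, match in enumerate(matches):
        # A value runs until the next key (or the end of the line).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(ext_text)
        extensions[match.group(1)] = ext_text[match.end():end].strip()
    return header, extensions


SAMPLE = (
    "CEF:0|Artefactual Systems, Inc.|Archivematica|hosted|16|"
    "AIP downloaded from Archival Storage|3|"
    "cs1Label=requestClientApplication "
    'cs1="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" '
    "rt=Jan 13 2021 20:01:33 requestMethod=GET "
    "request=/archival-storage/download/aip/8fa54cfc-f5c5-4673-b44e-fc514496bad7/ "
    "src=172.19.0.1 suser=test msg=UUID:8fa54cfc-f5c5-4673-b44e-fc514496bad7"
)

header, extensions = parse_cef(SAMPLE)
```

Because `suser` and `msg` are optional, code reading the extensions should use `extensions.get("suser")` rather than indexing directly.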

For a comprehensive list of Archivematica CEF events, see IMPLEMENTATION.md.

For more details on CEF, see the CEF specification.
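To make the mapping from a log line to an event more concrete, here is a rough sketch of a rule table keyed on HTTP method and URL path. It is an illustration only, not auditmatica's implementation: `Event`, `RULES`, and `classify` are hypothetical names, the single rule is taken from the sample event above, and the HTTP status code is ignored for brevity. Returning `None` for unmatched lines is a simplification that resembles the behaviour of `--suppress`.

```python
import re
from typing import NamedTuple, Optional

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"


class Event(NamedTuple):
    signature: int
    name: str
    severity: int
    uuid: Optional[str]


# Hypothetical rule table: (method, path pattern, signature, name, severity).
# Only the event shown in the sample above is included.
RULES = [
    (
        "GET",
        re.compile(rf"^/archival-storage/download/aip/(?P<uuid>{UUID})/$"),
        16,
        "AIP downloaded from Archival Storage",
        3,
    ),
]


def classify(method, path):
    """Return the first matching Event, or None if no rule applies."""
    for rule_method, pattern, signature, name, severity in RULES:
        match = pattern.match(path)
        if method == rule_method and match:
            return Event(signature, name, severity, match.group("uuid"))
    return None


event = classify(
    "GET",
    "/archival-storage/download/aip/8fa54cfc-f5c5-4673-b44e-fc514496bad7/",
)
```

The captured UUID is what would end up in the `msg` extension as `UUID:<uuid>`.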

This command accepts several optional arguments:

Usage: auditmatica write-cef [OPTIONS] [LOG]

  Write Common Event Format (CEF) log from nginx access log.

Options:
  -o, --output PATH       Filepath for output CEF file (default=None, print to
                          stdout)

  -s, --syslog            Write CEF events to syslog instead of file
  --syslog-address TEXT   Address for syslog connection (default='/dev/log')
  --syslog-facility TEXT  Facility for syslog messages (default='USER')
  --syslog-port INTEGER   Port for remote syslog connections
  --ss-base-url TEXT      Override the Storage Service URL to scan for
                          (default='http://127.0.0.1:62081')

  --suppress              Suppress log lines that do not map to a specific
                          event instead of reverting to default event

  -v, --verbose           Enable verbose error message reporting
  --help                  Show this message and exit.


Storage Service

auditmatica looks for Storage Service events in the nginx access log by checking each URL to determine if it begins with the expected base URL of the Storage Service. By default, this is http://127.0.0.1:62081.

To override the Storage Service URL to scan for, use --ss-base-url. E.g.:

auditmatica write-cef --ss-base-url http://archivematica.example.com:8000 /path/to/nginx/access.log
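The prefix check described above can be sketched in a few lines. This is an assumption about the general technique rather than a copy of auditmatica's code. The function name `is_storage_service_url` is hypothetical, and the requirement that the base URL be followed by a `/` (so that a port such as `620810` does not match) is an added safeguard, not something the README states.

```python
DEFAULT_SS_BASE_URL = "http://127.0.0.1:62081"


def is_storage_service_url(url, ss_base_url=DEFAULT_SS_BASE_URL):
    """Return True if url is the base URL itself or a path beneath it."""
    base = ss_base_url.rstrip("/")
    # Require a path separator after the base so that
    # "http://127.0.0.1:620810/..." is not treated as a match.
    return url == base or url.startswith(base + "/")
```

Passing `--ss-base-url` corresponds to supplying a different `ss_base_url` argument here.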

Output

By default, auditmatica writes CEF events to stdout and some end-user facing messages to stderr.

To write CEF events to a file, use the -o/--output option to specify a filepath for the output file. E.g.:

auditmatica write-cef /path/to/nginx/access.log --output my-output-file.log

To write CEF events to syslog, use the -s/--syslog option. By default, this writes syslog messages to /dev/log using the USER facility. The --syslog-address, --syslog-port, and --syslog-facility options can be used to customize the syslog connection. E.g.:

auditmatica write-cef -s \
  --syslog-address localhost \
  --syslog-port 514 \
  --syslog-facility local0 \
  /path/to/nginx/access.log

--syslog-port is only used if an address other than /dev/log is also passed with --syslog-address.

Valid --syslog-facility values are the default USER and local0-local7, which are reserved by syslog for local use.
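As a rough illustration of how these three options relate to one another, the sketch below uses Python's standard `logging.handlers.SysLogHandler`. It is not taken from auditmatica's source. The helper names `resolve_address`, `facility_code`, and `make_cef_logger` are hypothetical, and treating any address that starts with `/` as a local socket path is an assumption.

```python
import logging
import logging.handlers
from logging.handlers import SysLogHandler


def resolve_address(address="/dev/log", port=None):
    """Return a socket path for local syslog, or a (host, port) tuple."""
    if address.startswith("/"):
        # Local Unix socket: the port is irrelevant and ignored.
        return address
    if port is None:
        port = logging.handlers.SYSLOG_UDP_PORT  # 514
    return (address, port)


def facility_code(name="USER"):
    """Translate a facility name such as 'local0' into its numeric code."""
    return SysLogHandler.facility_names[name.lower()]


def make_cef_logger(address="/dev/log", port=None, facility="USER"):
    """Build a logger that forwards each message to syslog unchanged."""
    handler = SysLogHandler(
        address=resolve_address(address, port),
        facility=facility_code(facility),
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger = logging.getLogger("cef-sketch")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With this sketch, `make_cef_logger("localhost", 514, "local0").info(cef_line)` would send one CEF event over UDP, mirroring the command-line example above.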

overview

To generate a high-level JSON overview of Archivematica user activities, use the overview subcommand. E.g.:

cat access.log | auditmatica overview

Usage: auditmatica overview [OPTIONS] [LOG]

  Write JSON overview of user activities from nginx access log.

Options:
  --ss-base-url TEXT  Override the Storage Service URL to scan for
                      (default='http://127.0.0.1:62081')
  --help              Show this message and exit.

Storage Service

auditmatica looks for Storage Service events in the nginx access log by checking each URL to determine if it begins with the expected base URL of the Storage Service. By default, this is http://127.0.0.1:62081.

To override the Storage Service URL to scan for, use --ss-base-url. E.g.:

auditmatica overview --ss-base-url http://archivematica.example.com:8000 /path/to/nginx/access.log

Install

Install auditmatica package

auditmatica requires Python 3.6+.

Via PyPI

pip install auditmatica

Manually

Download this repo:

git clone https://github.com/artefactual-labs/auditmatica.git

Change into the cloned directory and install:

cd auditmatica/
pip install .

Usernames

Including authenticated usernames in auditmatica's outputs requires some additional setup:

  1. Enable auditing middleware in Archivematica and the Storage Service via environment variables:

# Archivematica 1.13+
ARCHIVEMATICA_DASHBOARD_DASHBOARD_AUDIT_LOG_MIDDLEWARE: "true"

# Storage Service 0.18+
SS_AUDIT_LOG_MIDDLEWARE: "true"

  2. Restart the Archivematica and Storage Service services:

sudo service archivematica-dashboard restart
sudo service archivematica-storage-service restart

  3. Add the following configuration to the http block of the nginx.conf configuration file (likely /etc/nginx/nginx.conf, though this may vary):

log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" user=$upstream_http_x_username';
access_log "/var/log/nginx/access.log" main;

This format must be exact, as it maps directly to a pattern auditmatica uses to parse nginx access log lines with usernames. If an access_log is already specified, replace it or name it something other than main. The access_log path (by default /var/log/nginx/access.log) can be changed as needed.

  4. Optionally, add proxy_hide_header x-username; to the nginx server blocks to prevent the authenticated username from being sent back with each response to the client device.

  5. Restart the nginx service:

sudo service nginx restart

If the above is configured correctly, the resulting nginx access log lines should look as follows:

172.10.5.1 - - [13/Jan/2021:19:53:10 +0000] "GET /backlog/download/2e28b8a9-351c-4da7-92d2-837ac04cd2d9/ HTTP/1.1" 200 19436360 "http://127.0.0.1:62080/backlog/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" "-" user=test
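To check that a log line follows the required format, a regular expression along the following lines can be used. This is a sketch written against the `log_format` shown above, not the actual pattern from auditmatica's source. `LOG_RE` and `parse_line` are hypothetical names, and the expression assumes quoted fields contain no embedded double quotes.

```python
import re
from datetime import datetime

# One named group per variable in the log_format directive above.
LOG_RE = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" '
    r'"(?P<http_x_forwarded_for>[^"]*)" user=(?P<username>\S*)'
)


def parse_line(line):
    """Return a dict of fields for a matching line, or None otherwise."""
    match = LOG_RE.fullmatch(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # nginx's $time_local looks like 13/Jan/2021:19:53:10 +0000.
    fields["time_local"] = datetime.strptime(
        fields["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    fields["status"] = int(fields["status"])
    return fields


SAMPLE_LINE = (
    '172.10.5.1 - - [13/Jan/2021:19:53:10 +0000] '
    '"GET /backlog/download/2e28b8a9-351c-4da7-92d2-837ac04cd2d9/ HTTP/1.1" '
    '200 19436360 "http://127.0.0.1:62080/backlog/" '
    '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0" '
    '"-" user=test'
)

fields = parse_line(SAMPLE_LINE)
```

A line written with nginx's default combined format has no trailing `user=` field, so `parse_line` returns `None` for it. That makes this a quick way to spot a misconfigured `log_format`.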

Development

Installation

For development, it may be useful to install auditmatica with pip install -e ., which will apply changes made to the source code immediately.

Testing

To run all tests with tox:

tox

Or run tests directly with pytest:

pip install -r requirements/test.txt
pytest

Publishing to PyPI

This repository contains a Makefile with commands to aid in building packages and publishing to PyPI.

To check that the package is valid:

make package-check

To upload the package to PyPI (this requires PyPI credentials and being listed as a collaborator on the auditmatica project):

make package-upload

To clean up package distribution files:

make clean
