Logging

The preferred log format for eLife applications.

<timestamp> - <log level> - <process name> - <program section> - <message>

Why be so formal about this? These log files are eventually parsed by other programs and the extracted bits are used in calculations, stored, displayed, etc.

Many applications with many different logging formats means more work for me. KISS

Examples:

1420811950.9472651 - DEBUG - elife-api - somemodule.py - a message goes here
1420811950.9472651 - INFO - elife-bot - somemodule.py - a message goes here
1420811950.9472651 - WARN - SimpleScraper.js - func_name - a message goes here
1420811950.9472651 - ERROR - foo - somemodule.py - a message goes here
1420811950.9472651 - CRITICAL - bar - somemodule.py - a message goes here
1420811950 - INFO - baz - somemodule.py - a message goes here

timestamp: the time in seconds since the epoch as a floating point or integer number.

log level: one of DEBUG, INFO, WARN, ERROR or CRITICAL.

process name: name of the running program/process.

program section: where within the application this log entry originated.

message: a free form message; may include hyphens; avoid line breaks.

Structure

If your application and preferred logging library can handle it easily, please use JSON with the following keys:

{"timestamp": ..., "log_level": ..., "process": ..., "section": ..., "message": ..., "context": {...}}

With __context__ being a simple map of additional key+values. An empty context can be either an empty list or an empty map (because of the nature of PHP's keyed arrays).

Examples:

{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error"}

{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": []}

{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": {}}

{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": {"key": "val"}}

Python

Use this formatter:

%(created)f - %(levelname)s - %(processName)s - %(name)s - %(message)s

The included example can be run with:

cd python && ./main.py

Python docs for the logging module are here: https://docs.python.org/2/library/logging.html#logrecord-attributes

PHP+Monolog

The below custom formatter coerces some of the naming of the original record to familiar values and converts the date to a Unix timestamp (sans microseconds):

function el_format_record(array $record) {
    return array(
        "timestamp" => date_timestamp_get($record['datetime']),
        "log_level" => $record['level_name'],
        "process" => $record['channel'],
        "section" => __FILE__,
        "message" => $record['message'],
        "context" => $record['context'],
    );
}

class eLifeJsonFormatter extends Monolog\Formatter\JsonFormatter {
    public function format(array $record) {
        return parent::format(el_format_record($record));
    }

    public function formatBatch(array $records) {
        $records = array_map('el_format_record', $records);
        return parent::formatBatch($records);
    }
}

See the php example for usage.

Javascript

...

notes

I use Unix time %(created)f rather than the YMD format because Unix time is used natively by Riemann (our realtime monitor).

Logging in production

Storage is cheap, logging is easy, so why wouldn't we log everything we possibly can and keep it around forever?

data must be transformed into information, otherwise it's worthless

All other reasons derive from this one. If logging data is not used then it's just noise.

retention

INFO is the default level for all applications in production.

DEBUG log statements are discarded by monitoring applications and will not be stored.

All application log messages sent to the monitoring server will be kept for 6 months. This is entirely arbitrary and handled by the logrotate application.

All audit log messages sent to the monitoring server will be kept indefinitely, or, as long as some business rule tells us. This rule will be encapsulated entirely within code.

audit logs

Auditing is a type of logging.

Auditing within an application exists because there is a business requirement to do so. A regular log message might be 'user "joe" logged in', but becomes part of the audit trail when there is a business requirement to track a user's activities through a system.

If a formal audit log is required, it must go to it's own log file.

S3 bucket access

Logging for an S3 bucket should only be enabled if:

a person explicitly wants it monitored
the bucket is acting as a server and is the only point of access to that data

All S3 access logs should go to the logging bucket elife-log-data with the name of the originating bucket as the prefix.

For example, if the S3 bucket "elife-annual-2012" require access logging, the 'target bucket' will be elife-log-data and the 'target prefix' will be elife-annual-2012/.

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
php		php
python		python
.gitignore		.gitignore
README.md		README.md
s3-logging-example.png		s3-logging-example.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

php

php

python

python

.gitignore

.gitignore

README.md

README.md

s3-logging-example.png

s3-logging-example.png

Repository files navigation

Logging

Structure

Python

PHP+Monolog

Javascript

notes

Logging in production

retention

audit logs

S3 bucket access

About

Releases

Packages

Languages

elifesciences/logging-example

Folders and files

Latest commit

History

Repository files navigation

Logging

Structure

Python

PHP+Monolog

Javascript

notes

Logging in production

retention

audit logs

S3 bucket access

About

Resources

Stars

Watchers

Forks

Languages