The preferred log format for eLife applications.
<timestamp> - <log level> - <process name> - <program section> - <message>
Why be so formal about this? These log files are eventually parsed by other programs and the extracted bits are used in calculations, stored, displayed, etc.
Many applications with many different logging formats means more work for me. KISS
Examples:
1420811950.9472651 - DEBUG - elife-api - somemodule.py - a message goes here
1420811950.9472651 - INFO - elife-bot - somemodule.py - a message goes here
1420811950.9472651 - WARN - SimpleScraper.js - func_name - a message goes here
1420811950.9472651 - ERROR - foo - somemodule.py - a message goes here
1420811950.9472651 - CRITICAL - bar - somemodule.py - a message goes here
1420811950 - INFO - baz - somemodule.py - a message goes here
timestamp: the time in seconds since the epoch as a floating point or integer number.
log level: one of DEBUG, INFO, WARN, ERROR or CRITICAL.
process name: name of the running program/process.
program section: where within the application this log entry originated.
message: a free form message; may include hyphens; avoid line breaks.
If your application and preferred logging library can handle it easily, please use JSON with the following keys:
{"timestamp": ..., "log_level": ..., "process": ..., "section": ..., "message": ..., "context": {...}}
With __context__
being a simple map of additional key+values. An empty context can be either an empty list or an empty map (because of the nature of PHP's keyed arrays).
Examples:
{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error"}
{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": []}
{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": {}}
{"timestamp":1439482969,"log_level":"ERROR","process":"example-app","section":"\/srv\/ingestor\/web\/logging.php","message":"this is an error", "context": {"key": "val"}}
Use this formatter:
%(created)f - %(levelname)s - %(processName)s - %(name)s - %(message)s
The included example can be run with:
cd python && ./main.py
Python docs for the logging
module are here:
https://docs.python.org/2/library/logging.html#logrecord-attributes
The below custom formatter coerces some of the naming of the original record to familiar values and converts the date to a Unix timestamp (sans microseconds):
function el_format_record(array $record) {
return array(
"timestamp" => date_timestamp_get($record['datetime']),
"log_level" => $record['level_name'],
"process" => $record['channel'],
"section" => __FILE__,
"message" => $record['message'],
"context" => $record['context'],
);
}
class eLifeJsonFormatter extends Monolog\Formatter\JsonFormatter {
public function format(array $record) {
return parent::format(el_format_record($record));
}
public function formatBatch(array $records) {
$records = array_map('el_format_record', $records);
return parent::formatBatch($records);
}
}
See the php example for usage.
...
I use Unix time %(created)f
rather than the YMD
format because Unix time
is used natively by Riemann (our realtime monitor).
Storage is cheap, logging is easy, so why wouldn't we log everything we possibly can and keep it around forever?
- data must be transformed into information, otherwise it's worthless
All other reasons derive from this one. If logging data is not used then it's just noise.
INFO is the default level for all applications in production.
DEBUG log statements are discarded by monitoring applications and will not be stored.
All application log messages sent to the monitoring server will be kept for
6 months. This is entirely arbitrary and handled by the logrotate
application.
All audit log messages sent to the monitoring server will be kept indefinitely, or, as long as some business rule tells us. This rule will be encapsulated entirely within code.
Auditing is a type of logging.
Auditing within an application exists because there is a business requirement to do so. A regular log message might be 'user "joe" logged in', but becomes part of the audit trail when there is a business requirement to track a user's activities through a system.
If a formal audit log is required, it must go to it's own log file.
Logging for an S3 bucket should only be enabled if:
- a person explicitly wants it monitored
- the bucket is acting as a server and is the only point of access to that data
All S3 access logs should go to the logging bucket elife-log-data
with the
name of the originating bucket as the prefix.
For example, if the S3 bucket "elife-annual-2012" require access logging, the
'target bucket' will be elife-log-data
and the 'target prefix' will be
elife-annual-2012/
.