Kool Logfmt Parser
klp is a lightweight command-line interface (CLI) tool for analyzing and understanding structured logs in various formats, designed for software developers who debug and troubleshoot.
It supports logfmt, JSON Lines, TSV and some other formats.
klp is a single-file Python script with no dependencies apart from Python 3.7+ and its standard library.
- Focus on the Essentials: Instantly view the crucial fields: timestamp, log level, and message.
- Selective Key Display: Choose to include or exclude specific keys in your output.
- Powerful Filtering: Filter logs by log level or specific time ranges, e.g. to focus on new events. Employ Python expressions to specify complex filters.
- Advanced Grep Capabilities: Search whole lines or specific keys. Show context lines to see what happened right before or after an interesting event. Use built-in regexes for URLs, IPs, file paths or common errors.
- Overview Stats: Get a bird's eye view of your logs with a statistics page summarizing event count, keys, log levels, and time span. Or with a map of log levels.
- Customizable Output: Expand newlines, print each key on a separate line, or specify your own output templates, using Python f-strings or expressions.
- Enhanced Readability: Enjoy colored and pretty-printed output for easier reading.
- Supports JSON Lines and TSV as alternative input/output formats (and some others)
klp is designed not for large-scale log analysis, but as a lightweight, efficient tool for developers. It's your companion for uncovering the mysteries in your logs, streamlining your development workflow, and accelerating troubleshooting.
Install using pip:
$ pip install klp-logviewer
Or copy klp.py
to a folder on your PATH and make it executable:
$ install -m 755 klp.py ~/.local/bin/klp
Or just run it using python3, without any installation.
Here are some basic examples to get you started with klp:
# View logs with timestamp, log level, and message:
$ klp -c mylog.logfmt
# Specify the input format (default is logfmt):
$ klp -f jsonl applog.jsonl
$ klp -f csv mydata.csv
# Filter logs by log level:
$ klp -l error,warning mylog.logfmt
# Get an overview of your log data:
$ klp -S mylog.logfmt
# View only new log entries (useful with streaming logs):
$ tail -f mylog.logfmt | klp -n
klp can be used as a filter, reading from stdin:
$ tail -f log.txt | klp
Or you can read from a file:
$ klp log.txt
klp can directly process compressed log files:
$ klp app.log.gz
$ klp logs.zip
Options can be given before or after the filename.
See klp --help
for the list of available options.
klp supports a wide range of input formats:
- logfmt: Key-value pairs (default)
- jsonl: JSON Lines (shortcut: -j)
- json: JSON (only for complete files, not for streaming)
- tsv: Tab separated values
- psv: Pipe separated values
- clf: NCSA Common Log Format
- combined: Combined Log Format of Apache httpd
- unix: common format of Unix server logs (timestamp, hostname, service, optional pid, message)
- line: lines of text (trailing whitespace removed)
- sqlite: SQLite database
- data: parse everything as one string
- tsNm: timestamp (consisting of N components) and message
- tsNlm: timestamp (consisting of N components), log level and message
Use the --input-format or -f option to specify the input format. For example:
$ klp -f jsonl input.log
klp supports parsing simple space-separated log formats via the tsNm
(timestamp + message) and tsNlm
(timestamp + level + message) format specifiers,
where N indicates how many space-separated timestamp components to expect.
Here are some examples:
# Log format: <timestamp> <message>
# Example: 2024-02-05T20:18:10.538Z Connection established to database
$ klp -f ts1m server.log
# Log format: <date> <time> <level> <message>
# Example: 2024-02-05 20:18:10 INFO Starting background worker
$ klp -f ts2lm worker.log
# Log format: <date> <time> <timezone> <message>
# Example: 2024-02-05 20:18:10 UTC Processing batch job #1234
$ klp -f ts3m batch.log
Similar formats exist up to ts5m/ts5lm for logs with more timestamp components.
The timestamp parts are joined and parsed using klp's standard timestamp parsing logic.
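For illustration, splitting such a line could look like the following sketch (an assumption about the general approach, not klp's actual implementation):

```python
def parse_tsn(line, n, has_level=False):
    # Join the first n whitespace-separated tokens into the timestamp;
    # for tsNlm formats, the next token is the log level, the rest is the message.
    tokens = line.split()
    event = {"timestamp": " ".join(tokens[:n])}
    rest = tokens[n:]
    if has_level:
        event["level"] = rest[0]
        rest = rest[1:]
    event["message"] = " ".join(rest)
    return event

# ts2lm: <date> <time> <level> <message>
event = parse_tsn("2024-02-05 20:18:10 INFO Starting background worker", 2, has_level=True)
```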
More complicated formats can often be parsed using the line format and creating or transforming events using Python code (--input-exec). See the Advanced input transformations using Python code section below.
klp parses each line of the input file (or stdin stream) into an event. If a line cannot be parsed, it is ignored silently. An event can have several fields (key/value pairs). The events in a file or stream don't have to have the same fields.
klp tries to automatically detect timestamps, log levels and log messages by looking for the following keys:
- Timestamp keys: timestamp, time, ts, t, at
- Log level keys: log_level, loglevel, level, lvl, severity
- Message keys: message, msg
klp has special features to select events by timestamp or log level (see below).
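A sketch of how this kind of detection can work (hypothetical illustration, not klp's source):

```python
TS_KEYS = ("timestamp", "time", "ts", "t", "at")
LEVEL_KEYS = ("log_level", "loglevel", "level", "lvl", "severity")
MSG_KEYS = ("message", "msg")

def first_present(event, candidates):
    # Return the first candidate key that exists in the event, or None.
    return next((k for k in candidates if k in event), None)

event = {"ts": "2024-02-05T20:18:10Z", "lvl": "info", "msg": "started"}
```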
By default, klp prints every event that was recognized and shows both the key and the value of each field.
Colors are used for syntax highlighting (use --no-color to turn this off or --theme to select different colors).
Often the time stamp, log level and message are the most important fields.
Use --common/-c to print them, without showing the names of the keys (only the values):
$ klp -c mylog.txt
2023-01-17T07:55:48.233Z DEBUG preparing query s190578: select * from applications
2023-01-17T07:55:48.234Z DEBUG executing statement s190578 with parameters: []
2023-01-17T07:55:48.656Z DEBUG preparing query s190579: select * from oauthproviderconfig where id = 0
2023-01-17T07:55:48.657Z DEBUG executing statement s190579 with parameters: []
2023-01-17T07:55:48.662Z DEBUG preparing query s190580: select * from oauthproviderconfig where id = 0
2023-01-17T07:55:48.663Z DEBUG executing statement s190580 with parameters: []
Use --stats-only/-S to get some basic statistics about the number of events in the log file, the timespan, the names of the keys and the log levels:
$ klp -S mylog.logfmt
Events shown: 8043 (100% of 8043 lines seen)
Time span shown: 2024-02-05T19:58:59.874Z to 2024-02-05T20:19:20.575Z (0:20:20.701000)
Keys seen: timestamp,log_level,source,target,thread_id,message
Log levels seen: DEBUG,TRACE,INFO (keys: log_level)
Or use --stats
/-s
to print those stats (to stderr), in addition to regular output.
Use --levelmap
/-M
to print only the first character of the log level of each event.
The timestamp is for the first event shown in that line:
$ klp -M mylog.logfmt
2024-02-05T20:18:10.538Z TTTTDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDTTTDIT
2024-02-05T20:18:30.546Z ITTTTTTTTTTTTTTTDDTTITITTTTTTTTTTTTTTITITTTTTTTTTTTTTTTTITITTTTTTTTTTTTTTTTITITT
2024-02-05T20:18:34.098Z TTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTT
2024-02-05T20:18:45.555Z TTTDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDTITITTT
2024-02-05T20:19:02.666Z TTTTDTTTTTTTTTITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTTDT
2024-02-05T20:19:15.571Z TTDITITTTTTTTTTTTTTTTDTTTDITITTTTTTTTTTTTTTT
Visualize patterns for any key:
$ klp --keymap status_code app.log
To select the fields yourself, use --keys/-k and a comma-separated list of keys to print:
$ klp -k timestamp,message mylog.txt
timestamp="2022-09-05T06:25:27.465Z" message="poll_read: waiting on response"
timestamp="2022-09-05T06:25:27.465Z" message="polled new request"
timestamp="2022-09-05T06:25:27.465Z" message="poll_write: waiting on request"
timestamp="2022-09-05T06:25:27.465Z" message="poll_flush: flushed"
timestamp="2022-09-05T06:25:27.466Z" message="poll_read: waiting on response"
Or specify the fields you want to exclude, using --keys-not/-K. In this case, all the other fields will be printed.
Use --plain/-p to leave out the keys and only print the values:
$ cat mylog.txt | klp -p -k timestamp,message
2022-09-05T06:25:27.465Z poll_read: waiting on response
2022-09-05T06:25:27.465Z polled new request
2022-09-05T06:25:27.465Z poll_write: waiting on request
2022-09-05T06:25:27.465Z poll_flush: flushed
2022-09-05T06:25:27.466Z poll_read: waiting on response
klp can add some additional, synthetic fields to the event. For performance reasons this is only done if you explicitly list these keys using --keys/-k.
- _klp_timedelta: time span between this event and the event before (hours:minutes:seconds.microseconds)
- _klp_ts: timestamp showing when the event was shown by klp
Use --loglevels
/-l
to restrict the output to specific log levels.
You can give a comma-separated list of levels.
$ cat mylog.txt | klp -l warning,error,fatal
Exclude log levels with --not-loglevel/-L.
This is useful to suppress trace and debug output:
$ cat mylog.txt | klp -L trace,debug
klp tries to parse timestamps automatically.
The following keys are treated as possibly holding timestamps: timestamp, ts, time, at. You can specify your own key with --ts-key KEY.
klp includes a list of common time formats that will be tried for parsing.
Use --ts-format to specify your own format, using Python strptime() format codes (e.g. %a %d %b %Y, %I:%M%p to parse Mon 31 Jan 2022, 11:59PM).
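The same format string works directly with Python's datetime.strptime(), which is where these format codes come from:

```python
from datetime import datetime

# The --ts-format example above, applied with strptime():
dt = datetime.strptime("Mon 31 Jan 2022, 11:59PM", "%a %d %b %Y, %I:%M%p")
```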
There are multiple ways to specify the time range that you're interested in:
- absolute time: --from ISODATETIME and --to ISODATETIME
- relative from now: --since INTERVAL and --until INTERVAL
- timespan, computed from program start or first event shown: --duration INTERVAL, --timespan INTERVAL
Timespans can be specified with these units (for example: --since 1w2h3.2s):
- w = weeks
- d = days
- h = hours
- m = minutes
- s = seconds
- ms = milliseconds
- us = microseconds
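A sketch of how such interval strings can be turned into a timedelta (an illustration, not klp's actual parser):

```python
import re
from datetime import timedelta

UNITS = {"w": "weeks", "d": "days", "h": "hours", "m": "minutes",
         "s": "seconds", "ms": "milliseconds", "us": "microseconds"}

def parse_interval(spec):
    # "1w2h3.2s" -> timedelta(weeks=1, hours=2, seconds=3.2)
    kwargs = {}
    for num, unit in re.findall(r"(\d+(?:\.\d+)?)(us|ms|[wdhms])", spec):
        kwargs[UNITS[unit]] = kwargs.get(UNITS[unit], 0) + float(num)
    return timedelta(**kwargs)
```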
Sometimes you want to skip old events and only want to see new events, e.g. when using tail -f
or the --follow
option of kubectl
.
Use the --new/-n flag, which is equivalent to --since 0s:
$ kubectl logs mypod --follow | klp -n
By default, klp preserves timestamps as they appear in the logs. Use the following options to control how timestamps are displayed:
- --localtime/-z: Convert timestamps to your local timezone
- --utc/-Z: Convert timestamps to UTC
# Show timestamps in local timezone
$ klp --localtime -c mylog.logfmt
2015-03-26T06:27:38.000+01:00 debug Started observing beach
# Show timestamps in UTC
$ klp --utc -c mylog.logfmt
2015-03-26T05:27:38.000Z debug Started observing beach
Note that timezone conversion can slightly impact performance when processing large log files. Only use these options when timezone conversion is specifically needed.
These options cannot be used together: choose either local time or UTC. They work with all input formats and can be combined with other timestamp-related features like --since, --until, and --timespan.
Use --mark-gaps
to visually separate events that are far apart in time:
$ klp --mark-gaps 1h app.log
This will insert a visual separator between events that are more than 1 hour apart.
The --fuse
option allows you to condense events that occur close together in time:
$ klp --fuse 5s app.log
This will show only the first and last events for each group of events that occur within 5 seconds of each other.
Use --grep
/-g
to limit the processing to lines that match a given regular expression.
When this flag is given multiple times, any of those regexes matching
will allow the line to be processed (logical OR).
If you need a logical AND, use an appropriate regex or pipe the output to another instance of klp.
To specify lines that should NOT be processed, use --grep-not/-G/-v.
Search is case-sensitive by default.
Use --ignore-case
/-i
for case-insensitive matches.
If you want case-insensitive matching only for specific regexes, prepend (?i)
to them.
By default, --grep
searches on the whole line.
To limit the search to a specific key, prepend that key and a tilde to the regex (key~REGEX
).
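Conceptually, the key~REGEX syntax could be handled like this (a hypothetical sketch, not klp's implementation):

```python
import re

def grep_event(event, pattern):
    # "key~REGEX" restricts the search to one field; otherwise the whole
    # serialized line is searched.
    if "~" in pattern:
        key, regex = pattern.split("~", 1)
        return key in event and re.search(regex, str(event[key])) is not None
    line = " ".join(f"{k}={v}" for k, v in event.items())
    return re.search(pattern, line) is not None

event = {"msg": "GET /healthz", "status": "200"}
```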
klp has several built-in regexes to match URLs, email addresses, common errors, path names, FQDNs or IPv4 addresses.
Use --grep-builtin
/-r
to use them for matching lines or --grep-builtin-not
/-R
for ignoring them.
As with the original Unix grep, klp can print context lines (-B, -A, -C).
Events before the matching line are visually marked with /
, lines after with \
.
You can define start and stop conditions to process specific blocks of logs:
- --start-after REGEX: Start processing after a line matching REGEX
- --start-with REGEX: Start processing when a line matches REGEX
- --stop-before REGEX: Stop processing when a line matches REGEX
- --stop-with REGEX: Stop processing after a line matches REGEX
For example, to process logs between two specific events:
$ klp --start-with "Session started" --stop-before "Session ended" app.log
Use --num-blocks
to limit the number of start/stop blocks processed.
Use --where EXPR
to only process lines where the given Python expression is True.
All fields of the event are available by their key name.
$ cat httplog.txt | klp --where "len(message) > 10 and int(response[:3]) > 499"
The whole event dict can be accessed as the underscore _.
By default, errors are ignored, because some keys might not be available in all lines.
Use --debug
to print exceptions.
This can be combined with other filters, such as --grep
and grep context lines.
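In essence, evaluating such a filter only needs something like the following sketch (a simplified illustration; klp's real implementation may differ):

```python
def where(event, expr, debug=False):
    # Evaluate the expression with the event's fields as local names;
    # the whole event is also reachable as "_". Errors are swallowed by
    # default, since not every line carries every key.
    try:
        return bool(eval(expr, {}, dict(event, _=event)))
    except Exception:
        if debug:
            raise
        return False

event = {"message": "GET /index.html", "response": "503 Service Unavailable"}
```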
Use --max-events
/-m
to limit the output to the given number of events.
This is useful to avoid being flooded with lots and lots of output.
Use --skip
to avoid parsing the first lines.
Customize your output further with these options:
- --header: Add text before the first event
- --footer: Add text after the last event
- --output-event-sep: Specify the separator between events
- --output-sep: Specify the separator between fields
Experiment with --indent
, --no-indent
, --expand
, --no-wrap
, and --each-key
to change the formatting of the output.
For advanced output formatting, use --output-template
(with Python f-strings) or --output-eval
which allows Python code:
$ klp --output-template "{timestamp} - {level}: {message}" app.log
$ klp --output-eval "{ts} {level.upper()} {'#'*len(msg)}" app.log
Several Python modules can be used in these expressions.
Use --help-python
to show the list.
$ klp --help-python
The following additional functions are available:
- extract_json(): return the first JSON object or array in a given string as a string
- extract_email(): return the first email address in a given string
- extract_fqdn(): return the first FQDN in a given string
- extract_ipv4(): return the first IPv4 address in a given string
- extract_regex(): return the first part of the string matching a regex
- extract_url(): return the first URL in a given string
- format_datetime(): format a given string according to ISO 8601 (with millisecond precision), guessing the datetime format
- guess_datetime(): convert a string into a Python datetime object
- pprint_json(): pretty-print JSON data
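As an illustration, an extract_ipv4()-style helper can be as simple as this naive sketch (it does not validate octet ranges, and klp's actual version may be stricter):

```python
import re

def extract_ipv4(text):
    # Return the first dotted-quad-looking substring, or None.
    m = re.search(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text)
    return m.group(0) if m else None
```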
klp supports the following output formats:
- default: Colored and formatted logfmt
- logfmt: Plain logfmt
- jsonl: JSON Lines (shortcut: -J)
- json: JSON
- tsv: Tab separated values
- psv: Pipe separated values
- sqlite: SQLite database (use -o to specify the filename)
Select an output format with --output-format or -F:
$ klp -F jsonl app.log > output.jsonl
$ klp -F sqlite -o app.db app.log
The JSONL and TSV output formats are useful for further processing with tools like jq or awk.
Use --output-file
/-o
to direct klp's output to a file instead of stdout.
# Write formatted output to a file
$ klp app.log -o processed.log
# Convert to JSON for later processing
$ klp app.log -F json -o events.json
# Create a SQLite database from logs
$ klp app.log -k time,level,msg -F sqlite -o metrics.db
# Write events to file but show stats on screen
$ klp app.log -s -F jsonl -o events.jsonl
Events shown: 8043 (100% of 8043 lines seen)
Time span shown: 2024-02-05T19:58:59Z to 2024-02-05T20:19:20Z
Note that using --output-file disables color output by default, unless explicitly requested with --color.
Use --input-exec
/-I
to specify Python code that transforms the event after it has been parsed.
This allows you to introduce new fields based on existing ones, or even develop an ad-hoc parser for an unknown format.
All functions available for output formatting (see above) are available here as well:
$ klp examples/mylog.logfmt -I "msg_len=len(msg); del msg"
$ klp examples/alertmanager.logfmt -I "path=(extract_path(file) or None)" -k path
$ klp -f line BGL_2k.log -I "ts=guess_datetime(line.split()[4]); msg=' '.join(line.split()[5:])" -c
When using --input-exec
/-I
, there are three special underscore variables available for more complex transformations:
- _: Contains the current event dictionary (useful for keys that aren't valid Python identifiers)
- __: Merge the contents of this dictionary into the current event
- ___: Generate multiple output events from a single input event
# Using _ to access keys with special characters
$ klp examples/qryn.jsonl -j -I "method=_['req.method'].lower()"
# Using __ to add or update fields
$ klp app.log -I "__={'new_field': 'value', 'updated_field': msg.upper()}"
# Using ___ to generate multiple events
$ klp app.log -I "___=[{'split': word} for word in msg.split()]"
The __
approach is useful when you want to modify the event in-place or add new fields:
# Add length fields for all string values
$ klp app.log -I "__={k+'_len': len(v) for k,v in _.items() if isinstance(v, str)}"
The ___
list is useful when you need to split one event into multiple events:
# Split a comma-separated list into separate events
$ klp app.log -I "___=[{'item': item.strip()} for item in msg.split(',')]"
# Create events for each key-value pair
$ klp app.log -I "___=[{'key': k, 'value': v} for k,v in _.items()]"
# Create events for headers in a Markdown file
$ klp README.md -f line -I "___=[{'header': line, 'len': len(line) } if line.startswith('#') else None]"
When using these special variables:
- _ provides read-only access to the current event
- __ merges its dictionary contents with the current event
- ___ must be a list of dictionaries, each becoming a separate output event
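One way to picture the surrounding machinery (a hypothetical sketch, not klp's source):

```python
def apply_transform(event, code):
    # Run user code with the event's fields as locals, then honor the
    # special variables: merge "__", expand "___", drop None entries.
    ns = dict(event, _=event, __={}, ___=None)
    exec(code, {}, ns)
    if ns["___"] is not None:
        return [e for e in ns["___"] if e is not None]
    merged = {k: v for k, v in ns.items() if k not in ("_", "__", "___")}
    merged.update(ns["__"])
    return [merged]
```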
Several helper functions are available:
parse_kv(text, sep=None, kvsep="="): Parse key-value pairs from a string and merge them into the current event
- text: Input string containing key-value pairs
- sep: Separator between different pairs (whitespace by default)
- kvsep: Separator between keys and values (defaults to "=")
- Commonly used with __ to add the parsed fields to the current event
# Parse query parameters into separate fields
$ echo 'time=2024-02-08T15:04:05Z msg="GET /search" query="user=alice&role=admin"' | \
klp -I '__=parse_kv(query, sep="&")'
time="2024-02-08T15:04:05Z" msg="GET /search" query="user=alice&role=admin" user="alice" role="admin"
# Parse metrics with custom separator
$ echo 'time=2024-02-08T15:04:05Z msg="System stats" metrics="cpu:95.2 mem:87.5 disk:45.8"' | \
klp -I '__=parse_kv(metrics, kvsep=":")'
time="2024-02-08T15:04:05Z" msg="System stats" metrics="cpu:95.2 mem:87.5 disk:45.8" cpu="95.2" mem="87.5" disk="45.8"
# Parse semicolon-separated configuration
$ echo 'time=2024-02-08T15:04:05Z msg="Session created" config="db=postgres;port=5432"' | \
klp -I '__=parse_kv(config, sep=";")'
time="2024-02-08T15:04:05Z" msg="Session created" config="db=postgres;port=5432" db="postgres" port="5432"
Particularly useful for:
- Breaking URL query strings into fields
- Parsing metrics or stats with custom key-value separators
- Extracting configuration parameters
- Converting any key-value formatted substring into top-level fields
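In spirit, parse_kv() behaves like this sketch (an approximation, not the exact implementation):

```python
def parse_kv(text, sep=None, kvsep="="):
    # Split on `sep` (any whitespace if None), then split each pair once
    # on `kvsep`; pairs without the separator are skipped.
    pairs = text.split(sep) if sep is not None else text.split()
    return {k: v for k, v in
            (p.split(kvsep, 1) for p in pairs if kvsep in p)}
```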
sh(command, **kwargs): Execute a shell command and return its output
- command: The shell command to execute
- Returns the command's stdout as a string (stripped of trailing whitespace)
- Raises an exception if the command fails, so that the event is ignored (disable with check=False)
- Can be customized with subprocess.run keyword arguments
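A plausible sketch of sh() in terms of subprocess.run (klp's actual version may differ in details):

```python
import subprocess

def sh(command, **kwargs):
    # Run the command through the shell and return stdout with trailing
    # whitespace stripped. check=True by default, so a non-zero exit
    # status raises CalledProcessError (pass check=False to ignore).
    kwargs.setdefault("check", True)
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, **kwargs)
    return result.stdout.strip()
```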
# Add git commit info to each event
$ echo 'time=2024-02-08T15:04:05Z level=info msg="Deployment started"' | \
klp -I '__={"git_rev": sh("git rev-parse HEAD")[:7]}'
time=2024-02-08T15:04:05Z level=info msg="Deployment started" git_rev=62efbb3
# Add system load information
$ echo 'time=2024-02-08T15:04:05Z level=info msg="Health check"' | \
klp -I '__={"load": sh("uptime").split("load average:")[-1].strip()}'
time=2024-02-08T15:04:05Z level=info msg="Health check" load="13:14 up 2:30, 1 user, load averages: 2.39 1.96 1.90"
# Look up hostname for IP addresses, ignore errors
$ echo 'time=2024-02-08T15:04:05Z level=info msg="Connection from 8.8.8.8"' | klp -I '__={"hostname": sh(f"host {msg.split()[-1]}", check=False)}' --debug
time=2024-02-08T15:04:05Z level=info msg="Connection from 8.8.8.8" hostname="8.8.8.8.in-addr.arpa domain name pointer dns.google."
Note: Use the sh() function very carefully, and only with trusted input. Untrusted input can lead to command injection vulnerabilities.
These tools aren't specialized for log files, but are useful to post-process or view klp's output, particularly the JSONL or TSV formats:
- jq: a lightweight and flexible command-line JSON processor
- jaq: a jq clone focused on correctness, speed, and simplicity
- yq: yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
- Miller: like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
- qsv: CSVs sliced, diced & analyzed
- dasel: Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats
- jc: CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries
- jtbl: CLI tool to convert JSON and JSON Lines to terminal, CSV, HTTP, and markdown tables
- ov: Feature-rich terminal-based text viewer. It is a so-called terminal pager.
- csvlens: Command line csv viewer
- Visidata: A terminal spreadsheet multitool for discovering and arranging data
- Benthos: high performance and resilient stream processor, able to connect various sources and sinks in a range of brokering patterns and perform hydration, enrichments, transformations and filters on payloads
The TSV format can be processed by spreadsheet software and common Unix tools:
- cut: uses tab separator by default
- AWK: -F$'\t'
- sort: -t$'\t'
- column: -s$'\t' -N timestamp,log_level,message -J (can convert to JSON)
- Vim: :set list :set listchars=tab:>-
- Pandas: pd.read_table()
- SQLite:
  sqlite> .mode tabs
  sqlite> .import data.tsv data
  sqlite> select date(timestamp) as day, count(*) from data group by day;
If you don't like klp, maybe try one of these other tools:
- hutils: the original logfmt tools (Ruby)
- lfq: command-line logfmt processor (Go). Quite similar to klp. Faster, but fewer features
- angle-grinder: Slice and dice logs on the command line (Rust). Flexible query syntax
- pq: Parse and Query log files as time series
- Kelpie: small, fast, and magical command-line data processor similar to pxi, jq, mlr, and awk
- lnav: very advanced log file viewer. Files-only (no streaming), lots of formats
- fblog: small command-line JSON Log viewer (Rust). Lua filters.
- json-log-viewer: powerful terminal based viewer for JSON logs using ncurses (JavaScript). TUI
- Toolong: A terminal application to view, tail, merge, and search log files, plus JSONL (Python). TUI
- sumoshell: collection of utilities to improve analyzing log files (Go). Successor is angle-grinder.
- l'oGGo: A powerful terminal app for structured log streaming (Go). TUI
- hl: A log viewer that translates JSON logs into human-readable representation (Rust)
- logdissect: CLI utility and Python module for analyzing log files and other data (Python). Multiple input formats
- tailspin: A log file highlighter (Rust)
- GoAccess: real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser
- Klogg: multi-platform GUI application that helps browse and search through long and complex log files
- frontail: streaming logs to the browser
- LogScreen: Loglines can be messy, read it better on a browser: command | npx logscreen
- Logdy: Web based real-time log viewer. Stream ANY content to a web UI with autogenerated filters. Parse any format with TypeScript
- OpenObserve: cloud-native observability platform built specifically for logs, metrics, traces, analytics, RUM, designed to work at petabyte scale
- Graylog: Free and open log management
- Grafana Loki: horizontally-scalable, highly-available, multi-tenant log aggregation system