26 changes: 25 additions & 1 deletion Lsof.8
@@ -924,7 +924,10 @@ is mutually exclusive with
.B \-j
and
.BR \-t .
Warnings and errors are sent to stderr; stdout is always valid JSON.
Warnings and errors are sent to stderr; stdout is always valid JSON
(see
.B "Character encoding note"
below).
.TP \w'names'u+4
.B \-j
selects JSON Lines output mode. Each open file produces one JSON
@@ -940,6 +943,27 @@ is mutually exclusive with
.B \-J
and
.BR \-t .
.IP
.B "Character encoding note:"
RFC\ 8259 requires JSON text exchanged between systems to be encoded as UTF\-8.
However, file names on Unix\-like systems are arbitrary byte sequences
and may contain bytes that are not valid UTF\-8.
When such bytes appear,
.B lsof
passes them through to the output unchanged.
This means the output is not strictly conformant JSON, but the
original file name can be recovered.
This is consistent with the behaviour of
.BR lsfd (1),
.BR ip (8)
.RB ( \-j ),
and other Linux utilities that produce JSON output.
Consumers that require strict RFC\ 8259 conformance should
filter or re\-encode such values (e.g.\& using
.BR iconv (1)
or Python's
.B surrogateescape
error handler).
.TP \w'names'u+4
.BI \-i " [i]"
selects the listing of files any of whose Internet address
17 changes: 17 additions & 0 deletions docs/options.md
@@ -76,6 +76,23 @@ Lsof has these options to control its output format:
- -F produce output that can be parsed by a subsequent
program.

- -J produce nested JSON output. Instead of tabular or
field output, lsof emits a single JSON object with a
`processes` array. Field selection follows -F rules.
Mutually exclusive with -j and -t.

- -j produce JSON Lines output. Each open file produces
one JSON object per line (denormalized with process
fields). Suitable for streaming pipelines and log
ingestion tools. Mutually exclusive with -J and -t.

**Note:** Unix file names are arbitrary byte sequences and may
contain bytes that are not valid UTF-8. When this occurs, lsof
passes the raw bytes through unchanged, producing output that is
not strictly conformant with RFC 8259. This matches the behavior
of `lsfd(1)`, `ip -j`, `systemctl --output=json`, and other Linux
tools.

- -g print process group (PGID) IDs.

- -l list UID numbers instead of login names.
25 changes: 24 additions & 1 deletion docs/tutorial.md
@@ -602,7 +602,30 @@ homogeneous across Unix dialects. Thus, if you write a script
to post-process field output for AIX, it probably will work for
HP-UX, Solaris, and Ultrix as well.

Support for other formats e.g. JSON is planned.
### JSON Output

Lsof supports two JSON output modes:

- **`-J`** (nested JSON) — produces a single JSON object containing a
`processes` array, where each process has a `files` array of open-file
entries. Suitable for tools that consume a complete document (e.g.
`python3 -m json.tool`, `jq`).

- **`-j`** (JSON Lines) — produces one JSON object per line, combining
process and file fields in a single denormalized record. Suitable for
streaming pipelines, log ingestion (Splunk, Datadog, Elastic Stack),
and line-oriented tools.

Both modes reuse the `-F` field-selection mechanism. For example,
`lsof -J -Fpcfn` limits output to PID, command, fd, and name fields.
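
The relationship between the two modes can be sketched in Python. This is a minimal illustration, not lsof's own code: it assumes only the `processes` and `files` array names described above, and the keys `pid`, `command`, `fd`, and `name` in the sample document are hypothetical placeholders, since the actual key names are not listed in this section.

```python
import json
from typing import Iterator


def flatten_processes(doc: dict) -> Iterator[dict]:
    """Turn a nested -J style document into one flat record per open file."""
    for proc in doc.get("processes", []):
        # Every process-level field except the nested "files" array.
        proc_fields = {k: v for k, v in proc.items() if k != "files"}
        for entry in proc.get("files", []):
            # Merge process fields with the file entry, as a JSON Lines
            # record would carry them together.
            yield {**proc_fields, **entry}


# Hypothetical sample document; the key names are assumptions.
sample = json.loads(
    '{"processes": [{"pid": 1, "command": "init", "files": ['
    '{"fd": "cwd", "name": "/"}, {"fd": "0", "name": "/dev/null"}]}]}'
)

records = list(flatten_processes(sample))
for record in records:
    print(json.dumps(record))
```

A consumer that already handles one record per line could apply the same per-record logic to a nested document after flattening it this way.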

**Encoding caveat:** RFC 8259 requires JSON exchanged between systems to
be encoded as UTF-8, but Unix file names are arbitrary byte sequences.
When a file name contains bytes that are not valid UTF-8, lsof passes them
through unchanged. The output is then not strictly valid JSON, but the
original file name is preserved and can be recovered. This is the same
approach taken by `lsfd`, `ip -j`, and other Linux tools that produce
JSON. If your consumer requires strict UTF-8, re-encode the output first,
for example with `iconv` or by decoding it with Python's `surrogateescape`
error handler.
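
As a rough sketch of the `surrogateescape` route, the Python below decodes one raw output line, parses it, and then either recovers the original bytes or produces a strict UTF-8 string. The `name` key and the sample line are hypothetical, and the helper names are made up for illustration.

```python
import json


def parse_record(raw_line: bytes) -> dict:
    """Parse one line of JSON output that may contain non-UTF-8 bytes."""
    # surrogateescape maps each undecodable byte 0xNN to the lone
    # surrogate U+DCNN instead of raising UnicodeDecodeError.
    text = raw_line.decode("utf-8", errors="surrogateescape")
    return json.loads(text)


def original_bytes(value: str) -> bytes:
    """Recover the exact byte sequence of a parsed string value."""
    return value.encode("utf-8", errors="surrogateescape")


def strict_text(value: str) -> str:
    """Return valid Unicode text, replacing each bad byte with U+FFFD."""
    return original_bytes(value).decode("utf-8", errors="replace")


# Hypothetical record: 0xE9 is Latin-1 "e acute", which is not valid UTF-8 here.
raw = b'{"name": "/tmp/caf\xe9.txt"}'
record = parse_record(raw)
print(original_bytes(record["name"]))
print(json.dumps(strict_text(record["name"])))
```

Keeping the surrogate-escaped string lets a script still open the file by its real name, while the replaced form is lossy and suited only to display or strict downstream consumers.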

## The Lsof Exit Code and Shell Scripts

10 changes: 10 additions & 0 deletions src/print.c
@@ -99,6 +99,16 @@ static int human_readable_size(SZOFFTYPE sz, int print, int col);
* JSON output helpers
*/

/*
* json_puts_escaped() - write a C string as a JSON string value (without
* the surrounding quotes).
*
* Control characters (< 0x20) are escaped as \uXXXX. Bytes >= 0x80 are
* passed through unchanged. This means non-UTF-8 file names produce
* output that is not strictly RFC 8259 conformant, but preserves the
* original byte sequence. This is the same trade-off made by lsfd(1),
* ip(8) -j, and other Linux JSON-producing tools. See issue #354.
*/
static void json_puts_escaped(const char *s) {
const unsigned char *p = (const unsigned char *)s;
while (*p) {