[DOC] Document how strings that are not valid ASCII or UTF-8 are encoded in the JSON output #354

@kurok

I think we will document this. JSON-like output helps automate a lot of things and avoids "grep" parsing.

FYI, as currently written, strings in the output are encoded as JSON strings, and JSON more or less mandates UTF-8, so this doesn't work properly for file names that are encoded in a character encoding other than ASCII or UTF-8.

From testing, it looks like it replaces each sequence of one or more bytes that can't be decoded as UTF-8 with a single character.
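
To illustrate why that replacement loses information, here is a minimal Python sketch. It only emulates the behaviour described above and is not lsof's actual code. The use of U+FFFD as the replacement character, the helper name `lossy_decode`, and the sample file names are all assumptions made for the example.

```python
import re

def lossy_decode(raw: bytes) -> str:
    """Emulate the observed behaviour: each run of undecodable bytes becomes one character."""
    # Assumption: U+FFFD stands in for whatever character the tool really emits.
    text = raw.decode("utf-8", errors="replace")
    # Python emits one U+FFFD per invalid sequence, so collapse adjacent ones.
    return re.sub("\ufffd+", "\ufffd", text)

name_a = b"caf\xe9.txt"  # hypothetical file name, ISO-8859-1 encoded
name_b = b"caf\xe8.txt"  # a different file name, also ISO-8859-1 encoded

# Two distinct file names become indistinguishable after the replacement.
print(lossy_decode(name_a) == lossy_decode(name_b))  # True
```

Once the bytes have been replaced, a consumer of the output has no way to recover which of the two files was meant.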

There's no good way to address that. That's a shortcoming of the JSON format.

lsfd and many other tools choose to dump those bytes as-is. That means the output is not proper JSON per the RFC, but the information can be extracted reliably, provided you have JSON processing utilities that can cope with that.
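
As a rough sketch of what "coping with that" can look like, Python's `surrogateescape` error handler lets the standard `json` module round-trip such bytes. The sample line and the `name` key below are hypothetical and not taken from the real output of lsfd or lsof.

```python
import json

# Hypothetical output line: valid JSON apart from one raw ISO-8859-1 byte (0xE9).
raw_output = b'{"name": "caf\xe9.txt"}'

# Undecodable bytes are smuggled through as lone surrogates instead of being dropped.
text = raw_output.decode("utf-8", errors="surrogateescape")
record = json.loads(text)

# Encoding with the same handler restores the original bytes exactly.
name_bytes = record["name"].encode("utf-8", errors="surrogateescape")
print(name_bytes)  # b'caf\xe9.txt'
```

The recovered `name_bytes` can then be passed to file APIs that accept byte paths, such as `os.stat`.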

I'm not saying lsof should or should not do the same, but either way, it would be good to document it.

See https://unix.stackexchange.com/questions/757832/how-to-process-json-with-strings-containing-invalid-utf-8 for more background on that.

Originally posted by @stephane-chazelas in #353 (comment)
